Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free
From: Jens Axboe
To: Linus Torvalds
CC: Kent Overstreet, Tejun Heo, Marc MERLIN, Michal Hocko, Vlastimil Babka, linux-mm, LKML, Joonsoo Kim, Greg Kroah-Hartman
Date: Thu, 1 Dec 2016 11:46:00 -0700
References: <20161128072315.GC14788@dhcp22.suse.cz>
 <20161129155537.f6qgnfmnoljwnx6j@merlins.org>
 <20161129160751.GC9796@dhcp22.suse.cz>
 <20161129163406.treuewaqgt4fy4kh@merlins.org>
 <20161129174019.fywddwo5h4pyix7r@merlins.org>
 <20161130174713.lhvqgophhiupzwrm@merlins.org>
 <20161130203011.GB15989@htj.duckdns.org>
 <20161201135014.jrr65ptxczplmdkn@kmo-pixel>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/01/2016 11:37 AM, Linus Torvalds wrote:
> On Thu, Dec 1, 2016 at 10:30 AM, Jens Axboe wrote:
>>
>> It's two different kinds of throttling. The vm absolutely should
>> throttle at dirty time, to avoid having insane amounts of memory dirty.
>> On the block layer side, throttling is about avoiding the device queues
>> being too long. It's very similar to the buffer bloating on the
>> networking side. The block layer throttling is not a fix for the vm
>> allowing too much memory to be dirty and causing issues, it's about
>> keeping the device response latencies in check.
>
> Sure. But if we really do just end up blocking in the block layer (in
> situations where we didn't used to), that may be a bad thing. It might
> be better to feed that information back to the VM instead,
> particularly for writes, where the VM layer already tries to ratelimit
> the writes.

It's not a new blocking point, it's the same blocking point that we
always end up in, if we run out of requests. The problem with bcache and
other stacked drivers is that they don't have a request pool, so they
never really need to block there.

> And frankly, it's almost purely writes that matter. There just aren't
> a lot of ways to get that many parallel reads in real life.

Exactly, it's almost exclusively a buffered write problem, as I wrote in
the initial reply. Most other things tend to throttle nicely on their
own.

> I haven't looked at your patches, so maybe you already do this.

It's currently not fed back, but that would be pretty trivial to do. The
mechanism we have for that (queue congestion) is a bit of a mess,
though, so it would need to be revamped a bit.

-- 
Jens Axboe
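
For readers following the thread, the two ideas above (blocking when the request pool runs dry, and feeding a "congested" signal back to whoever is generating writes) can be sketched as a toy model. This is an illustration only, not kernel code: the class name `RequestPool`, its methods, and the on/off thresholds are all hypothetical, and the hysteresis between the two thresholds is an assumption made for the example.

```python
import threading


class RequestPool:
    """Toy model of a fixed-size request pool (hypothetical, not a kernel API).

    Submitters block in get_request() once every request is in flight.
    A 'congested' flag is raised and cleared at two assumed thresholds so
    that an upper layer could slow down before it actually has to block.
    """

    def __init__(self, nr_requests, congestion_on, congestion_off):
        self.nr_requests = nr_requests
        self.free = nr_requests
        self.congestion_on = congestion_on    # in-flight count that sets the flag
        self.congestion_off = congestion_off  # in-flight count that clears it
        self.congested = False
        self.cond = threading.Condition()

    def in_flight(self):
        return self.nr_requests - self.free

    def get_request(self, blocking=True):
        """Take one request; wait if the pool is empty.

        Returns True on success, or False if blocking is disabled and the
        pool is empty.
        """
        with self.cond:
            while self.free == 0:
                if not blocking:
                    return False
                self.cond.wait()  # the blocking point: out of requests
            self.free -= 1
            if self.in_flight() >= self.congestion_on:
                self.congested = True
            return True

    def put_request(self):
        """Return one request on completion and wake a waiting submitter."""
        with self.cond:
            self.free += 1
            if self.in_flight() <= self.congestion_off:
                self.congested = False
            self.cond.notify()
```

In this model a writer could check `pool.congested` before dirtying more data and back off early, instead of discovering the problem only when `get_request()` blocks. A stacked driver with no pool of its own corresponds to a submitter that never calls `get_request()` at all, so nothing ever slows it down at this point.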