Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of RAM that should be free
From: Jens Axboe
To: Linus Torvalds, Kent Overstreet
CC: Tejun Heo, Marc MERLIN, Michal Hocko, Vlastimil Babka, linux-mm, LKML, Joonsoo Kim, Greg Kroah-Hartman
Date: Thu, 1 Dec 2016 11:30:22 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/01/2016 11:16 AM, Linus Torvalds wrote:
> On Thu, Dec 1, 2016 at 5:50 AM, Kent Overstreet wrote:
>>
>> That said, I'm not sure how I feel about Jens's exact approach... it seems to me
>> that this can really just live within the writeback code, I don't know why it
>> should involve the block layer at all. plus, if I understand correctly his code
>> has the effect of blocking in generic_make_request() to throttle, which means
>> due to the way the writeback code is structured we'll be blocking with page
>> locks held.
>
> Yeah, I do *not* believe that throttling at the block layer is at all
> the right thing to do.
>
> I do think that the block layer needs to throttle, but it needs to be
> seen as a "last resort" kind of thing, where the block layer just
> needs to limit how much it will have pending. But it should be seen as
> a failure mode, not as a write balancing issue.
>
> Because the real throttling absolutely needs to happen when things are
> marked dirty, because no block layer throttling will ever fix the
> situation where you just have too much memory dirtied that you cannot
> free because it will take a minute to write out.
>
> So throttling at a VM level is sane. Throttling at a block layer level is not.

It's two different kinds of throttling. The VM absolutely should
throttle at dirty time, to avoid having insane amounts of memory dirty.
On the block layer side, throttling is about keeping the device queues
from getting too long. It's very similar to bufferbloat on the
networking side. The block layer throttling is not a fix for the VM
allowing too much memory to be dirty and causing issues, it's about
keeping the device response latencies in check.

-- 
Jens Axboe
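
To make the distinction between the two kinds of throttling concrete, here is
a toy model of it. This is only a sketch and not kernel code: the function
simulate(), its parameters, and the per-tick accounting are all made up for
illustration, and "unwritten" simply lumps together pages that are dirty or
sitting in the device queue.

```python
def simulate(ticks, dirty_rate, dirty_limit, queue_limit, service_rate):
    """Toy model of a buffered writer; returns (max_unwritten, max_queue).

    All names here are hypothetical. Units are "pages per tick".
    dirty_limit: cap on not-yet-written pages, applied when pages are
                 dirtied (VM-level throttle). None disables it.
    queue_limit: cap on pages queued at the device (block-level
                 throttle). None disables it.
    """
    dirty = 0    # dirtied, not yet submitted to the device
    queued = 0   # submitted to the device, not yet completed
    max_unwritten = 0
    max_queue = 0
    for _ in range(ticks):
        # The writer dirties pages. With a dirty limit it is held back
        # once the amount of unwritten memory reaches the limit.
        want = dirty_rate
        if dirty_limit is not None:
            want = min(want, max(0, dirty_limit - (dirty + queued)))
        dirty += want

        # Writeback submits dirty pages. With a queue limit, only as many
        # as fit in the device queue are submitted.
        if queue_limit is None:
            room = dirty
        else:
            room = min(dirty, max(0, queue_limit - queued))
        dirty -= room
        queued += room

        max_unwritten = max(max_unwritten, dirty + queued)
        max_queue = max(max_queue, queued)

        # The device completes up to service_rate pages this tick.
        queued -= min(queued, service_rate)
    return max_unwritten, max_queue


if __name__ == "__main__":
    # Writer produces 10 pages/tick, device retires 2 pages/tick.
    print("queue limit only:", simulate(100, 10, None, 8, 2))
    print("dirty limit only:", simulate(100, 10, 50, None, 2))
    print("both limits:     ", simulate(100, 10, 50, 8, 2))
```

In this model the queue limit alone keeps the device queue short, but
unwritten memory keeps growing for as long as the writer outruns the device.
The dirty limit alone bounds memory, but lets the device queue grow all the
way up to that bound. Only with both in place are memory and queue length
(and with it, request latency) bounded at the same time.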