Date: Tue, 17 Aug 2021 12:14:08 -0700
From: Andrew Morton
To: Johannes Weiner
Cc: Leon Yang, Chris Down, Roman Gushchin, Michal Hocko, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
Message-Id: <20210817121408.47be5d9a11baf5bba44da9a1@linux-foundation.org>
In-Reply-To: <20210817180506.220056-1-hannes@cmpxchg.org>
References: <20210817180506.220056-1-hannes@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 17 Aug 2021 14:05:06 -0400 Johannes Weiner wrote:

> We've noticed occasional OOM killing when memory.low settings are in
> effect for cgroups. This is unexpected and undesirable as memory.low
> is supposed to express non-OOMing memory priorities between cgroups.
>
> The reason for this is proportional memory.low reclaim. When cgroups
> are below their memory.low threshold, reclaim passes them over in the
> first round, and then retries if it couldn't find pages anywhere else.
> But when cgroups are slightly above their memory.low setting, page scan
> force is scaled down and diminished in proportion to the overage, to
> the point where it can cause reclaim to fail as well - only in that
> case we currently don't retry, and instead trigger OOM.
>
> To fix this, hook proportional reclaim into the same retry logic we
> have in place for when cgroups are skipped entirely. This way if
> reclaim fails and some cgroups were scanned with diminished pressure,
> we'll try another full-force cycle before giving up and OOMing.

Which kernel version(s) do you think need this?
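
The failure mode in the quoted changelog can be illustrated with a toy model. This is only a sketch and not kernel code: the names `Cgroup`, `reclaim_cycle`, `try_to_free` and the flag `retry_on_diminished` are hypothetical, sizes are in pages, and the model assumes every scanned page is reclaimed and that scan pressure on a cgroup above memory.low equals its overage.

```python
from dataclasses import dataclass


@dataclass
class Cgroup:
    usage: int  # pages currently charged to the cgroup
    low: int    # memory.low protection, in pages


def reclaim_cycle(cgroups, full_force):
    """One pass over all cgroups.

    Returns (pages_reclaimable, skipped, diminished), where the two flags
    record whether memory.low caused any cgroup to be passed over or
    scanned with reduced pressure during this pass.
    """
    reclaimable = 0
    skipped = False
    diminished = False
    for cg in cgroups:
        if full_force or cg.low == 0:
            # No protection applies: the whole cgroup is eligible.
            reclaimable += cg.usage
        elif cg.usage <= cg.low:
            # At or below memory.low: passed over in the first round.
            skipped = True
        else:
            # Above memory.low: pressure scaled down to the overage.
            reclaimable += cg.usage - cg.low
            diminished = True
    return reclaimable, skipped, diminished


def try_to_free(cgroups, needed, retry_on_diminished):
    """Return "ok" if `needed` pages can be found, else "oom".

    retry_on_diminished=False models the behaviour before the patch
    (retry only when a cgroup was skipped entirely); True models the
    behaviour the patch describes.
    """
    found, skipped, diminished = reclaim_cycle(cgroups, full_force=False)
    if found >= needed:
        return "ok"
    if skipped or (retry_on_diminished and diminished):
        # Second, full-force cycle that ignores memory.low.
        found, _, _ = reclaim_cycle(cgroups, full_force=True)
        if found >= needed:
            return "ok"
    return "oom"


# A cgroup 10 pages above its memory.low, with 32 pages needed.
barely_over = [Cgroup(usage=1010, low=1000)]
print(try_to_free(barely_over, 32, retry_on_diminished=False))
print(try_to_free(barely_over, 32, retry_on_diminished=True))
```

In this model a cgroup sitting just above memory.low is worse off than one below it under the old rule: the one below triggers the retry, while the one slightly above yields too few pages and never does.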