Date: Wed, 18 Aug 2021 10:16:59 -0400
From: Johannes Weiner
To: Shakeel Butt
Cc: Andrew Morton, Leon Yang, Chris Down, Roman Gushchin, Michal Hocko, Linux MM, Cgroups, LKML, Kernel Team
Subject: Re: [PATCH] mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim
References: <20210817180506.220056-1-hannes@cmpxchg.org>

On Tue, Aug 17, 2021 at 12:10:16PM -0700, Shakeel Butt wrote:
> On Tue, Aug 17, 2021 at 11:03 AM Johannes Weiner wrote:
> >
> > We've noticed occasional OOM killing when memory.low settings are in
> > effect for cgroups. This is unexpected and undesirable as memory.low
> > is supposed to express non-OOMing memory priorities between cgroups.
> >
> > The reason for this is proportional memory.low reclaim. When cgroups
> > are below their memory.low threshold, reclaim passes them over in the
> > first round, and then retries if it couldn't find pages anywhere else.
> > But when cgroups are slighly above their memory.low setting, page scan
>
> *slightly
>
> > force is scaled down and diminished in proportion to the overage, to
> > the point where it can cause reclaim to fail as well - only in that
> > case we currently don't retry, and instead trigger OOM.
> >
> > To fix this, hook proportional reclaim into the same retry logic we
> > have in place for when cgroups are skipped entirely. This way if
> > reclaim fails and some cgroups were scanned with dimished pressure,
>
> *diminished

Oops. Andrew, would you mind folding these into the checkpatch fixlet?

> > we'll try another full-force cycle before giving up and OOMing.
> >
> > Reported-by: Leon Yang
> > Signed-off-by: Johannes Weiner
>
> Should this be considered for stable?

Yes, I think so after all. Please see my reply to Roman.

> Reviewed-by: Shakeel Butt

Thanks Shakeel!
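
The retry behaviour described in the quoted changelog can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: every name in it (toy_cgroup, toy_scan_control, toy_scan_target, toy_try_to_free), the divide-by-64 base scan size, and the "reclaimed >= needed" success test are made up for the example. The retry_on_scaled flag stands in for the proposed fix, so the same model can show the behaviour before and after.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names are kernel APIs. */
struct toy_cgroup {
	unsigned long usage;	/* pages currently charged */
	unsigned long low;	/* memory.low protection, in pages */
};

struct toy_scan_control {
	bool low_reclaim;	/* retry pass: ignore memory.low entirely */
	bool low_skipped;	/* memory.low held back scanning this pass */
	bool retry_on_scaled;	/* assumption: models the proposed fix */
};

/* Number of pages one pass would scan in a single cgroup. */
static unsigned long toy_scan_target(const struct toy_cgroup *cg,
				     struct toy_scan_control *sc)
{
	unsigned long scan = cg->usage / 64;	/* arbitrary base pressure */

	if (sc->low_reclaim || cg->low == 0)
		return scan;			/* full force */

	if (cg->usage <= cg->low) {
		/* Below memory.low: passed over, but remembered for a retry. */
		sc->low_skipped = true;
		return 0;
	}

	/* Above memory.low: pressure shrinks as usage approaches low. */
	if (sc->retry_on_scaled)
		sc->low_skipped = true;
	return scan - scan * cg->low / cg->usage;
}

/* Returns true if enough pages were found, false where the model "OOMs". */
static bool toy_try_to_free(const struct toy_cgroup *cgs, size_t n,
			    unsigned long needed, bool retry_on_scaled)
{
	struct toy_scan_control sc = {
		.low_reclaim = false,
		.low_skipped = false,
		.retry_on_scaled = retry_on_scaled,
	};

	assert(cgs != NULL || n == 0);

	for (;;) {
		unsigned long reclaimed = 0;

		for (size_t i = 0; i < n; i++)
			reclaimed += toy_scan_target(&cgs[i], &sc);

		if (reclaimed >= needed)
			return true;

		/* Give up unless protection held something back this pass. */
		if (!sc.low_skipped || sc.low_reclaim)
			return false;

		/* One more cycle, this time ignoring memory.low. */
		sc.low_reclaim = true;
		sc.low_skipped = false;
	}
}
```

In this model, a single cgroup with usage = 1000 and low = 990 gets a scaled-down scan target of just 1 page on the first pass. If 8 pages are needed, toy_try_to_free() gives up without the retry flag and succeeds with it, because the second pass scans at full force.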