From: Yafang Shao
Date: Fri, 29 May 2020 09:50:33 +0800
Subject: Re: mm: mkfs.ext4 invoked oom-killer on i386 - pagecache_get_page
To: Chris Down
Cc: Naresh Kamboju, Michal Hocko, Anders Roxell,
 "Linux F2FS DEV, Mailing List", linux-ext4, linux-block, Andrew Morton,
 open list, Linux-Next Mailing List, linux-mm, Arnd Bergmann,
 Andreas Dilger, Jaegeuk Kim, "Theodore Ts'o", Chao Yu, Hugh Dickins,
 Andrea Arcangeli, Matthew Wilcox, Chao Yu, lkft-triage@lists.linaro.org,
 Johannes Weiner, Roman Gushchin, Cgroups
X-Mailing-List: linux-ext4@vger.kernel.org

On Fri, May 29, 2020 at 12:41 AM Chris Down wrote:
>
> Naresh Kamboju writes:
> >On Thu, 28 May 2020 at 20:33, Michal Hocko wrote:
> >>
> >> On Fri 22-05-20 02:23:09, Naresh Kamboju wrote:
> >> > My apology !
> >> > As per the test results history this problem started happening from
> >> > Bad : next-20200430 (still reproducible on next-20200519)
> >> > Good : next-20200429
> >> >
> >> > The git tree / tag used for testing is from linux next-20200430 tag and reverted
> >> > following three patches and oom-killer problem fixed.
> >> >
> >> > Revert "mm, memcg: avoid stale protection values when cgroup is above
> >> > protection"
> >> > Revert "mm, memcg: decouple e{low,min} state mutations from protection checks"
> >> > Revert "mm-memcg-decouple-elowmin-state-mutations-from-protection-checks-fix"
> >>
> >> The discussion has fragmented and I got lost TBH.
> >> In http://lkml.kernel.org/r/CA+G9fYuDWGZx50UpD+WcsDeHX9vi3hpksvBAWbMgRZadb0Pkww@mail.gmail.com
> >> you have said that none of the added tracing output has triggered. Does
> >> this still hold? Because I still have a hard time to understand how
> >> those three patches could have the observed effects.
> >
> >On the other email thread [1] this issue is concluded.
> >
> >Yafang wrote on May 22 2020,
> >
> >Regarding the root cause, my guess is it makes a similar mistake that
> >I tried to fix in the previous patch that the direct reclaimer read a
> >stale protection value. But I don't think it is worth to add another
> >fix. The best way is to revert this commit.
>
> This isn't a conclusion, just a guess (and one I think is unlikely). For this
> to reliably happen, it implies that the same race happens the same way each
> time.

Hi Chris,

If you look at this patch [1] carefully, you will find that it introduces
the same issue that I tried to fix in another patch [2]. Sadder still,
these two patches are in the same patchset. Although this issue isn't
related to the one Naresh found, we have to ask ourselves why we keep
making the same mistake.

One possible answer is that we always forget the lifecycle of
memory.emin before we read it.
memory.emin doesn't have the same lifecycle as the memcg; it really has
the same lifecycle as the reclaimer. IOW, once a reclaimer begins, the
protection value should be set to 0; after we traverse the memcg tree we
calculate a protection value for this reclaimer; finally, it disappears
when the reclaimer stops. That is why I strongly suggested earlier that
we add a new protection member to scan_control.

[1]. https://lore.kernel.org/linux-mm/20200505084127.12923-3-laoar.shao@gmail.com/
[2]. https://lore.kernel.org/linux-mm/20200505084127.12923-2-laoar.shao@gmail.com/

--
Thanks
Yafang
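
To make the lifecycle argument above concrete, here is a toy model in
Python. It is only a sketch under loud assumptions, not kernel code: the
classes Memcg and ScanControl, the `protection` member, and the
simplified rule "the reclaim root gets no protection, every other memcg
gets min(memory.min, usage)" are all hypothetical stand-ins for the much
more involved hierarchical calculation. It contrasts a protection value
stored on the memcg, which outlives the reclaimer that computed it, with
one stored in the reclaimer's own context.

```python
from dataclasses import dataclass, field


@dataclass
class Memcg:
    """Toy cgroup: only the fields needed for the illustration."""
    name: str
    min: int        # configured memory.min, in pages
    usage: int      # current usage, in pages
    emin: int = 0   # effective protection stored on the memcg (shared)


@dataclass
class ScanControl:
    """Toy reclaim context; `protection` is a hypothetical per-reclaimer member."""
    root: Memcg
    protection: dict = field(default_factory=dict)


def effective_min(memcg, root):
    # Simplified rule (an assumption): the reclaim root itself gets no
    # protection, any other memcg is protected up to min(memory.min, usage).
    return 0 if memcg is root else min(memcg.min, memcg.usage)


def reclaim_shared(root, memcgs):
    """Protection lives on the memcg and outlives the reclaimer."""
    seen = {}
    for memcg in memcgs:
        if memcg is not root:          # early exit for the root skips the update
            memcg.emin = effective_min(memcg, root)
        seen[memcg.name] = memcg.emin  # the root may read a leftover value here
    return seen


def reclaim_per_reclaimer(root, memcgs):
    """Protection lives in the scan context and dies with it."""
    sc = ScanControl(root=root)        # starts with no protection recorded
    for memcg in memcgs:
        sc.protection[memcg.name] = effective_min(memcg, root)
    return sc.protection


parent = Memcg("parent", min=0, usage=500)
child = Memcg("child", min=100, usage=300)

# Reclaim rooted at the parent protects the child: child.emin becomes 100.
first = reclaim_shared(parent, [parent, child])
# Reclaim rooted at the child should see 0, but reads the leftover 100.
stale = reclaim_shared(child, [child])
# With per-reclaimer storage the same sequence yields 0 for the child.
fresh = reclaim_per_reclaimer(child, [child])
```

In this model the second shared-storage reclaim reads a value that an
earlier reclaimer left behind, while the per-reclaimer variant starts
from nothing and discards its values when it finishes, which is the
lifecycle described above.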