Received: by 2002:a25:1506:0:0:0:0:0 with SMTP id 6csp1169700ybv; Thu, 13 Feb 2020 17:18:10 -0800 (PST) X-Google-Smtp-Source: APXvYqwV7b0pDm3/duB3NPRKC6ChlR7dCPWYdlSlenlgMWGPrvxkbEsQc9OsVIh0WxfOjxY4uCJ8 X-Received: by 2002:aca:1a05:: with SMTP id a5mr280650oia.97.1581643089927; Thu, 13 Feb 2020 17:18:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1581643089; cv=none; d=google.com; s=arc-20160816; b=GQIz477bT+BD1of1839K+8y+TrwAz5PmSZvfuUPJQVf4hGqdtVkZoeEFCLRZoku+Am x9EwgrBrMPgTh73lHzqr7d6lcFvmYrcXIKIBNVeOMLWTgVeuXDhF7DJUb3OVz1stq20z BQmif8Khaxzy/l8YC8sS3UorJ2w/7BoQ7N9pMQ0TuuYM37f0DLoAWtXVy2uL7/z9upzi +fyJyPMbvIN0tSYcNLTzqVvIuMb/rK/nERASiLW3Q0DgqYxKaGJjQahKQeAcmxnZ+gDF bySFn2YqxbscDnFyXpmCcuBKE96J7pnwBJUB3F4IMgraDXL8loLAmlxsX/UeQrNAgs/j wuBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=X1rUbbk5eSi/RPUfUTIVpYH2GwBV8H1jUAjzu+pVHyM=; b=p28DIt04rxDVdXKKdmYJ1AnHwLHdhA1EQXMUGQi31Ia0+y3c9rT/HkvzaTO0yZ7BZs PlJK7xayrndwpXgdQu+yQmXl5q2XsqYSEkiClvfw5UPeM0ZA+b1v5IlqRahYZnjiEObh GtAW0qbZHLJMe5rj1JxV1VLc8cByyxdU3Fc5QVBU7MRFKDn7OX45McdZpBMDwTWWuM+W nifkrqhACgR9BElnDGKhcNbFCF2x3g+LkGHU8mZG63Vc6W0QZR1oCo748xWPgkKmaPvq 4a4OzUcl+Pr4Kv790q0hnYS6NyN58ks3wyrJL9GaLQZxy5PcIeqXWFxgCsb66Q0oyfhK dGwg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=gbZ6Fcpl; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id s20si1969946otp.4.2020.02.13.17.17.57; Thu, 13 Feb 2020 17:18:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=gbZ6Fcpl; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728175AbgBNBRs (ORCPT + 99 others); Thu, 13 Feb 2020 20:17:48 -0500 Received: from mail-il1-f195.google.com ([209.85.166.195]:45845 "EHLO mail-il1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727604AbgBNBRr (ORCPT ); Thu, 13 Feb 2020 20:17:47 -0500 Received: by mail-il1-f195.google.com with SMTP id p8so6668108iln.12; Thu, 13 Feb 2020 17:17:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=X1rUbbk5eSi/RPUfUTIVpYH2GwBV8H1jUAjzu+pVHyM=; b=gbZ6FcplDBHiAEulkuso5ll5gdmYYNALxEiNfJPbgUQJzqp5Hr73oBkRdE8+jbq8Hr rYTgJ9Zd5ZGkMLSr3gGEbKqllTFSCHBTo0JsdLa1MjrztRe38LWfLxBltiEBfcllz6qF xx1PzIeoOoIZhzR5DVoYmZp5qiW+Nh0bS8XB1BN3BDLkcJk3utWVtkrcQt3eC5GAOlm+ pGCYWW6e0BBV1rGBknKtohqKld6cGV+9CZ13olsdim8mYe5/zay51ZNAE0cig9IXQTuM y9xYCUNMj9FNuopMkInsN/7fCmhkvj5DhKpJ6m/nq30JcU6CIRvrSV0NIWQEw8qVipKK 69Mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=X1rUbbk5eSi/RPUfUTIVpYH2GwBV8H1jUAjzu+pVHyM=; b=VhyGrbv8QkfS+PljojJGsNCHmKKy/mzq2945YSACInu69AlVzbMgaNSsMiZY3exbR/ kH1RWaD8zdy7KHxM8eJu1x+FFvTQokN3wCXNmeSZeDZoYfO27QRuzl5Lc5w7CrFD5GXQ nsonwy65o7luL3geXSdK3LcocHk2TQtcFaZhlBNvewPFZO58SdMjKt3t4+yALQlff7Zg 8d4pBvVQP853lZ6NCirCOu4tcoxR3jrKYZA7jESCgXXWhpeBmUYxodrLRHNuw4Iyb1c1 ifFN2VJjFpKVFsWEdjpFduq07XwekYT47KyVxexDoKrGLs1Bbsea+xpHk6a790A7jI+j Juaw== X-Gm-Message-State: APjAAAV+ILiJOjlBinqweml9AwYTsD+hmxj4dSxmQrYDNO9WtZ4zY+9Z uxhN5PkdjQvxSTC+3iU1geWQgbta6sFMHdKv2h0= X-Received: by 2002:a92:8656:: with SMTP id g83mr750858ild.9.1581643066684; Thu, 13 Feb 2020 17:17:46 -0800 (PST) MIME-Version: 1.0 References: <20191107205334.158354-1-hannes@cmpxchg.org> <20191107205334.158354-3-hannes@cmpxchg.org> <20200212102817.GA18107@js1304-desktop> <20200212181834.GD180867@cmpxchg.org> In-Reply-To: <20200212181834.GD180867@cmpxchg.org> From: Joonsoo Kim Date: Fri, 14 Feb 2020 10:17:35 +0900 Message-ID: Subject: Re: [PATCH 2/3] mm: vmscan: detect file thrashing at the reclaim root To: Johannes Weiner Cc: Andrew Morton , Andrey Ryabinin , Suren Baghdasaryan , Shakeel Butt , Rik van Riel , Michal Hocko , Linux Memory Management List , cgroups@vger.kernel.org, LKML , kernel-team@fb.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 2020=EB=85=84 2=EC=9B=94 13=EC=9D=BC (=EB=AA=A9) =EC=98=A4=EC=A0=84 3:18, J= ohannes Weiner =EB=8B=98=EC=9D=B4 =EC=9E=91=EC=84=B1: > > On Wed, Feb 12, 2020 at 07:28:19PM +0900, Joonsoo Kim wrote: > > Hello, Johannes. > > > > When I tested my patchset on v5.5, I found that my patchset doesn't > > work as intended. I tracked down the issue and this patch would be the > > reason of unintended work. I don't fully understand the patchset so I > > could be wrong. Please let me ask some questions. > > > > On Thu, Nov 07, 2019 at 12:53:33PM -0800, Johannes Weiner wrote: > > ...snip... > > > -static void snapshot_refaults(struct mem_cgroup *root_memcg, pg_data= _t *pgdat) > > > +static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_da= ta_t *pgdat) > > > { > > > - struct mem_cgroup *memcg; > > > - > > > - memcg =3D mem_cgroup_iter(root_memcg, NULL, NULL); > > > - do { > > > - unsigned long refaults; > > > - struct lruvec *lruvec; > > > + struct lruvec *target_lruvec; > > > + unsigned long refaults; > > > > > > - lruvec =3D mem_cgroup_lruvec(memcg, pgdat); > > > - refaults =3D lruvec_page_state_local(lruvec, WORKINGSET_A= CTIVATE); > > > - lruvec->refaults =3D refaults; > > > - } while ((memcg =3D mem_cgroup_iter(root_memcg, memcg, NULL))); > > > + target_lruvec =3D mem_cgroup_lruvec(target_memcg, pgdat); > > > + refaults =3D lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE= ); > > > + target_lruvec->refaults =3D refaults; > > > > Is it correct to just snapshot the refault for the target memcg? I > > think that we need to snapshot the refault for all the child memcgs > > since we have traversed all the child memcgs with the refault count > > that is aggregration of all the child memcgs. If next reclaim happens > > from the child memcg, workingset transition that is already considered > > could be considered again. > > Good catch, you're right! We have to update all cgroups in the tree, > like we used to. However, we need to use lruvec_page_state() instead > of _local, because we do recursive comparisons in shrink_node()! So > it's not a clean revert of that hunk. > > Does this patch here fix the problem you are seeing? I found that my problem comes from my mistake. Sorry for bothering you! Anyway, following hunk looks correct to me. Acked-by: Joonsoo Kim > diff --git a/mm/vmscan.c b/mm/vmscan.c > index c82e9831003f..e7431518db13 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -2993,12 +2993,17 @@ static void shrink_zones(struct zonelist *zonelis= t, struct scan_control *sc) > > static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_data_t= *pgdat) > { > - struct lruvec *target_lruvec; > - unsigned long refaults; > + struct mem_cgroup *memcg; > > - target_lruvec =3D mem_cgroup_lruvec(target_memcg, pgdat); > - refaults =3D lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE= ); > - target_lruvec->refaults =3D refaults; > + memcg =3D mem_cgroup_iter(target_memcg, NULL, NULL); > + do { > + unsigned long refaults; > + struct lruvec *lruvec; > + > + lruvec =3D mem_cgroup_lruvec(memcg, pgdat); > + refaults =3D lruvec_page_state(lruvec, WORKINGSET_ACTIVAT= E); > + lruvec->refaults =3D refaults; > + } while ((memcg =3D mem_cgroup_iter(target_memcg, memcg, NULL))); > } > > /* > > > > @@ -277,12 +305,12 @@ void workingset_refault(struct page *page, void= *shadow) > > > * would be better if the root_mem_cgroup existed in all > > > * configurations instead. > > > */ > > > - memcg =3D mem_cgroup_from_id(memcgid); > > > - if (!mem_cgroup_disabled() && !memcg) > > > + eviction_memcg =3D mem_cgroup_from_id(memcgid); > > > + if (!mem_cgroup_disabled() && !eviction_memcg) > > > goto out; > > > - lruvec =3D mem_cgroup_lruvec(memcg, pgdat); > > > - refault =3D atomic_long_read(&lruvec->inactive_age); > > > - active_file =3D lruvec_lru_size(lruvec, LRU_ACTIVE_FILE, MAX_NR_Z= ONES); > > > + eviction_lruvec =3D mem_cgroup_lruvec(eviction_memcg, pgdat); > > > + refault =3D atomic_long_read(&eviction_lruvec->inactive_age); > > > + active_file =3D lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE= ); > > > > Do we need to use the aggregation LRU count of all the child memcgs? > > AFAIU, refault here is the aggregation counter of all the related > > memcgs. Without using the aggregation count for LRU, active_file could > > be so small than the refault distance and refault cannot happen > > correctly. > > lruvec_page_state() *is* aggregated for all child memcgs (as opposed > to lruvec_page_state_local()), so that comparison looks correct to me. Thanks for informing this. I have checked lruvec_page_state() but not mod_lruvec_state() so cannot find that counter is the aggregated value. Thanks.