Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp3342143rdb; Wed, 13 Sep 2023 09:09:24 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFutS54742GqgW6ojX08FIRNsR7PN9oMKxjpKNIh6zD9XPfv4/uwNYJhi0DiHiE4QUwC/Vx X-Received: by 2002:a17:90a:2e84:b0:25e:a8ab:9157 with SMTP id r4-20020a17090a2e8400b0025ea8ab9157mr2466478pjd.22.1694621363679; Wed, 13 Sep 2023 09:09:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694621363; cv=none; d=google.com; s=arc-20160816; b=D1BDO6OCxVQTNTuegOV1TNr9epmAlWKigvlbMpdjG8f1cKIKQnVGLBbueOHtTH2ZAb vXfmGB79i/+1XYVr4sogWrZYmrzS6wUKJfqtEPGklnsS7qFIPhhskSFvtbxnk0Gp1LaX eIpD7YCAPiXpCu+kpAyO4vlyJkYrEGuZR4DrCcdEp1ClpnYFD8Kv4xnpBoboQCz7zczw G58cw5HfYS+LiyqqMGR2HpXZ/Fn4MJ5jtKHC+SQMkvf5Rdp9HltFSnKubyY3iZDXY5ch wPFi1HdMINhLWING340b4rt1M5gSMazx/ns1qUd3Ph8HhSPhOozOoVKGbfb8XM0R8sxG AnwA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=U1o+xAaY0eE6mqHpp9tQ0uQLU9Huay2534Teze/R2PM=; fh=Rhg6ZBXQA4fZersjDe2tU1HtZn2/8IOlJnDyGvWM6wk=; b=1KeUsT6YSD09Ro5OD1hT9VydzmyTy47ob5491dNvqfnKhYwl6CbDZtT9PNIpA8jnyi 9v0ddU4SAOrP1VbWtH8HEGhnbgjyLwzshm87lPaYPCdxlqezqlEP/xA1LujLqr5wksDi WdVVLWJtpT2ilmPF4Mv6cDoGiXVNW8TfKOkjGFdCA6WzVsHpmlxogNkvXuIBh76J0eJ0 Imr/+1EdzvpGsjTySerX66fNmFkmb8ZTqilVh4KVaGZZiis7jjmADJmI5FzGlwiD8tZy J3+NjTceJTo8I32XRYsckYP52i8MOFZXSaa/cjQpXtvIQFOwDeU7mqx7UFQtLF50o/4D m0Cw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=mP4K6tx6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Return-Path: Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id d5-20020a17090a6f0500b0026825823da0si1799354pjk.59.2023.09.13.09.09.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 09:09:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@cmpxchg-org.20230601.gappssmtp.com header.s=20230601 header.b=mP4K6tx6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=cmpxchg.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 69F1982E332C; Wed, 13 Sep 2023 08:52:57 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230020AbjIMPw1 (ORCPT + 99 others); Wed, 13 Sep 2023 11:52:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230014AbjIMPwX (ORCPT ); Wed, 13 Sep 2023 11:52:23 -0400 Received: from mail-vs1-xe30.google.com (mail-vs1-xe30.google.com [IPv6:2607:f8b0:4864:20::e30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 893A7E3 for ; Wed, 13 Sep 2023 08:52:19 -0700 (PDT) Received: by mail-vs1-xe30.google.com with SMTP id ada2fe7eead31-450f8f1368cso6075137.1 for ; Wed, 13 Sep 2023 08:52:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20230601.gappssmtp.com; s=20230601; t=1694620338; x=1695225138; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=U1o+xAaY0eE6mqHpp9tQ0uQLU9Huay2534Teze/R2PM=; b=mP4K6tx6m8jDbVnwufSbH5lSLnoi99dPlXkX6G4rzn4sw8GKJ9rg1N/JznD4FvT3/k iPAQC7xNhgiQ8llAAATBxuJEqt2G+VxBxCHSEH+A/59YfeDqHbTrhcWrZQ1XOzCIGzWi kWVollAxCUs6hXYe8Gd9CjGjvkq0Qh2kZ3pgwSXC7xVqif8WFEap/s5PooeDjVIdD5vD kSEgWCIg38DPi0AOOjm2q1PEnE0vrEqFkTA38Ycsgf0kHf02SqBcjNAypSXD78RKDboq 3zR6rYtxoh2ln9OVJ3F0pKKfLeWK2CnmqgSgFATMK84FABbb7sJx4NeL96heWUToqJ8n k0rw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694620338; x=1695225138; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=U1o+xAaY0eE6mqHpp9tQ0uQLU9Huay2534Teze/R2PM=; b=qEQSmTJwQQzjOH6qkKO6RL32WM+ExKO8Iuvb8S6dD2AWycuMWJ/SG0sbI4u19IKs8M a7YvXv7i2xO7t84cbfVHRVEeKEVNgVVQ4Ft50OPumm8EqQIuUUWI47sV9ejWxqz3vyyO wi7hwSc4UpKrqGf75DM0P7EgwSLa7Mj9reidZFGSMEMxbqP66fYvS2Vc3sT/jSpAya35 N/6KIyp0LBruQvcP9Zxa7TxhFF1or52RCkTa9L/uQ/rjAwChmUipoVYD7vfmvrugQ8CD IicHEGHUEVz0+zuo8dhNFST383kDNE70Ggx/teWYDzS1/vPWbGCk69x+w+6sM24frq3k SKDQ== X-Gm-Message-State: AOJu0YyH4L5rUad/8NEdyg6IvpqAVrqKyeMfSYSXFasgpoGdKPfJ1d7m F7G0R4htHTXH1OSulItN3pl07qnQAI4DF7FWfUQ= X-Received: by 2002:a67:ea43:0:b0:44e:9cba:5f2 with SMTP id r3-20020a67ea43000000b0044e9cba05f2mr2088907vso.31.1694620338540; Wed, 13 Sep 2023 08:52:18 -0700 (PDT) Received: from localhost (2603-7000-0c01-2716-3012-16a2-6bc2-2937.res6.spectrum.com. [2603:7000:c01:2716:3012:16a2:6bc2:2937]) by smtp.gmail.com with ESMTPSA id r1-20020a0cb281000000b0063cf4d0d558sm4513021qve.25.2023.09.13.08.52.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Sep 2023 08:52:18 -0700 (PDT) Date: Wed, 13 Sep 2023 11:52:17 -0400 From: Johannes Weiner To: Vern Hao Cc: mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com, akpm@linux-foundation.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xin Hao Subject: Re: [PATCH v2] mm: memcg: add THP swap out info for anonymous reclaim Message-ID: <20230913155217.GC45543@cmpxchg.org> References: <20230912021727.61601-1-vernhao@tencent.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230912021727.61601-1-vernhao@tencent.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Wed, 13 Sep 2023 08:52:57 -0700 (PDT) X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email On Tue, Sep 12, 2023 at 10:17:25AM +0800, Vern Hao wrote: > From: Xin Hao > > At present, we support per-memcg reclaim strategy, however we do not > know the number of transparent huge pages being reclaimed, as we know > the transparent huge pages need to be splited before reclaim them, and > they will bring some performance bottleneck effect. for example, when > two memcg (A & B) are doing reclaim for anonymous pages at same time, > and 'A' memcg is reclaiming a large number of transparent huge pages, we > can better analyze that the performance bottleneck will be caused by 'A' > memcg. therefore, in order to better analyze such problems, there add > THP swap out info for per-memcg. > > Signed-off-by: Xin Hao > --- > v1 -> v2 > - Do some fix as Johannes Weiner suggestion. > v1: > https://lore.kernel.org/linux-mm/20230911160824.GB103342@cmpxchg.org/T/ > > mm/memcontrol.c | 2 ++ > mm/page_io.c | 4 +++- > mm/vmscan.c | 1 + > 3 files changed, 6 insertions(+), 1 deletion(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index ecc07b47e813..32d50db9ea0d 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -752,6 +752,8 @@ static const unsigned int memcg_vm_event_stat[] = { > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > THP_FAULT_ALLOC, > THP_COLLAPSE_ALLOC, > + THP_SWPOUT, > + THP_SWPOUT_FALLBACK, Can you please add documentation to Documentation/admin-guide/cgroup-v2.rst? Sorry, I missed this in v1. > @@ -208,8 +208,10 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) > static inline void count_swpout_vm_event(struct folio *folio) > { > #ifdef CONFIG_TRANSPARENT_HUGEPAGE > - if (unlikely(folio_test_pmd_mappable(folio))) > + if (unlikely(folio_test_pmd_mappable(folio))) { > + count_memcg_folio_events(folio, THP_SWPOUT, 1); > count_vm_event(THP_SWPOUT); > + } > #endif > count_vm_events(PSWPOUT, folio_nr_pages(folio)); > } Looking through the callers, they seem mostly fine except this one: static void sio_write_complete(struct kiocb *iocb, long ret) { struct swap_iocb *sio = container_of(iocb, struct swap_iocb, iocb); struct page *page = sio->bvec[0].bv_page; int p; if (ret != sio->len) { /* * In the case of swap-over-nfs, this can be a * temporary failure if the system has limited * memory for allocating transmit buffers. * Mark the page dirty and avoid * folio_rotate_reclaimable but rate-limit the * messages but do not flag PageError like * the normal direct-to-bio case as it could * be temporary. */ pr_err_ratelimited("Write error %ld on dio swapfile (%llu)\n", ret, page_file_offset(page)); for (p = 0; p < sio->pages; p++) { page = sio->bvec[p].bv_page; set_page_dirty(page); ClearPageReclaim(page); } } else { for (p = 0; p < sio->pages; p++) count_swpout_vm_event(page_folio(sio->bvec[p].bv_page)); This is called at the end of IO where the page isn't locked anymore. Since it's not locked, page->memcg is not stable and might get freed (charge moving is deprecated but still possible). The fix is simple, though. Every other IO path bumps THP_SWPOUT before starting the IO while the page is still locked. We don't really care if we get SWPOUT events even for failed IOs. So we can just adjust this caller to fit the others, and count while still locked: diff --git a/mm/page_io.c b/mm/page_io.c index fe4c21af23f2..7925e19aeedd 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -278,9 +278,6 @@ static void sio_write_complete(struct kiocb *iocb, long ret) set_page_dirty(page); ClearPageReclaim(page); } - } else { - for (p = 0; p < sio->pages; p++) - count_swpout_vm_event(page_folio(sio->bvec[p].bv_page)); } for (p = 0; p < sio->pages; p++) @@ -296,6 +293,7 @@ static void swap_writepage_fs(struct page *page, struct writeback_control *wbc) struct file *swap_file = sis->swap_file; loff_t pos = page_file_offset(page); + count_swpout_vm_event(page_folio(sio->bvec[p].bv_page)); set_page_writeback(page); unlock_page(page); if (wbc->swap_plug)