From: Barry Song <21cnbao@gmail.com>
Date: Fri, 18 Mar 2022 14:15:48 +1300
Subject: Re: [PATCH v9 03/14] mm/vmscan.c: refactor shrink_node()
To: Yu Zhao
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar,
 Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
 Vlastimil Babka, Will Deacon, Ying Huang, LAK, Linux Doc Mailing List,
 LKML, Linux-MM, Kernel Page Reclaim v2, x86, Brian Geffon,
 Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
 Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
 Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain
References: <20220309021230.721028-1-yuzhao@google.com>
 <20220309021230.721028-4-yuzhao@google.com>
In-Reply-To: <20220309021230.721028-4-yuzhao@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao wrote:
>
> This patch refactors shrink_node() to improve readability for the
> upcoming changes to mm/vmscan.c.
>
> Signed-off-by: Yu Zhao
> Acked-by: Brian Geffon
> Acked-by: Jan Alexander Steffens (heftig)
> Acked-by: Oleksandr Natalenko
> Acked-by: Steven Barrett
> Acked-by: Suleiman Souhlal
> Tested-by: Daniel Byrne
> Tested-by: Donald Carr
> Tested-by: Holger Hoffstätte
> Tested-by: Konstantin Kharlamov
> Tested-by: Shuang Zhai
> Tested-by: Sofia Trinh
> Tested-by: Vaibhav Jain

Reviewed-by: Barry Song

This seems like a nice refactoring, since we are going to skip the whole
function for lru_gen later:

static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
{
        unsigned long file;
        struct lruvec *target_lruvec;

        if (lru_gen_enabled())
                return;
        ...
}

> ---
>  mm/vmscan.c | 198 +++++++++++++++++++++++++++-------------------------
>  1 file changed, 104 insertions(+), 94 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 59b14e0d696c..8e744cdf802f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2718,6 +2718,109 @@ enum scan_balance {
>          SCAN_FILE,
>  };
>
> +static void prepare_scan_count(pg_data_t *pgdat, struct scan_control *sc)
> +{
> +        unsigned long file;
> +        struct lruvec *target_lruvec;
> +
> +        target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
> +
> +        /*
> +         * Flush the memory cgroup stats, so that we read accurate per-memcg
> +         * lruvec stats for heuristics.
> +         */
> +        mem_cgroup_flush_stats();
> +
> +        /*
> +         * Determine the scan balance between anon and file LRUs.
> +         */
> +        spin_lock_irq(&target_lruvec->lru_lock);
> +        sc->anon_cost = target_lruvec->anon_cost;
> +        sc->file_cost = target_lruvec->file_cost;
> +        spin_unlock_irq(&target_lruvec->lru_lock);
> +
> +        /*
> +         * Target desirable inactive:active list ratios for the anon
> +         * and file LRU lists.
> +         */
> +        if (!sc->force_deactivate) {
> +                unsigned long refaults;
> +
> +                refaults = lruvec_page_state(target_lruvec,
> +                                WORKINGSET_ACTIVATE_ANON);
> +                if (refaults != target_lruvec->refaults[0] ||
> +                        inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
> +                        sc->may_deactivate |= DEACTIVATE_ANON;
> +                else
> +                        sc->may_deactivate &= ~DEACTIVATE_ANON;
> +
> +                /*
> +                 * When refaults are being observed, it means a new
> +                 * workingset is being established. Deactivate to get
> +                 * rid of any stale active pages quickly.
> +                 */
> +                refaults = lruvec_page_state(target_lruvec,
> +                                WORKINGSET_ACTIVATE_FILE);
> +                if (refaults != target_lruvec->refaults[1] ||
> +                    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
> +                        sc->may_deactivate |= DEACTIVATE_FILE;
> +                else
> +                        sc->may_deactivate &= ~DEACTIVATE_FILE;
> +        } else
> +                sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
> +
> +        /*
> +         * If we have plenty of inactive file pages that aren't
> +         * thrashing, try to reclaim those first before touching
> +         * anonymous pages.
> +         */
> +        file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> +        if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> +                sc->cache_trim_mode = 1;
> +        else
> +                sc->cache_trim_mode = 0;
> +
> +        /*
> +         * Prevent the reclaimer from falling into the cache trap: as
> +         * cache pages start out inactive, every cache fault will tip
> +         * the scan balance towards the file LRU. And as the file LRU
> +         * shrinks, so does the window for rotation from references.
> +         * This means we have a runaway feedback loop where a tiny
> +         * thrashing file LRU becomes infinitely more attractive than
> +         * anon pages. Try to detect this based on file LRU size.
> +         */
> +        if (!cgroup_reclaim(sc)) {
> +                unsigned long total_high_wmark = 0;
> +                unsigned long free, anon;
> +                int z;
> +
> +                free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
> +                file = node_page_state(pgdat, NR_ACTIVE_FILE) +
> +                           node_page_state(pgdat, NR_INACTIVE_FILE);
> +
> +                for (z = 0; z < MAX_NR_ZONES; z++) {
> +                        struct zone *zone = &pgdat->node_zones[z];
> +
> +                        if (!managed_zone(zone))
> +                                continue;
> +
> +                        total_high_wmark += high_wmark_pages(zone);
> +                }
> +
> +                /*
> +                 * Consider anon: if that's low too, this isn't a
> +                 * runaway file reclaim problem, but rather just
> +                 * extreme pressure. Reclaim as per usual then.
> +                 */
> +                anon = node_page_state(pgdat, NR_INACTIVE_ANON);
> +
> +                sc->file_is_tiny =
> +                        file + free <= total_high_wmark &&
> +                        !(sc->may_deactivate & DEACTIVATE_ANON) &&
> +                        anon >> sc->priority;
> +        }
> +}
> +
>  /*
>   * Determine how aggressively the anon and file LRU lists should be
>   * scanned. The relative value of each set of LRU lists is determined
> @@ -3188,109 +3291,16 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
>          unsigned long nr_reclaimed, nr_scanned;
>          struct lruvec *target_lruvec;
>          bool reclaimable = false;
> -        unsigned long file;
>
>          target_lruvec = mem_cgroup_lruvec(sc->target_mem_cgroup, pgdat);
>
>  again:
> -        /*
> -         * Flush the memory cgroup stats, so that we read accurate per-memcg
> -         * lruvec stats for heuristics.
> -         */
> -        mem_cgroup_flush_stats();
> -
>          memset(&sc->nr, 0, sizeof(sc->nr));
>
>          nr_reclaimed = sc->nr_reclaimed;
>          nr_scanned = sc->nr_scanned;
>
> -        /*
> -         * Determine the scan balance between anon and file LRUs.
> -         */
> -        spin_lock_irq(&target_lruvec->lru_lock);
> -        sc->anon_cost = target_lruvec->anon_cost;
> -        sc->file_cost = target_lruvec->file_cost;
> -        spin_unlock_irq(&target_lruvec->lru_lock);
> -
> -        /*
> -         * Target desirable inactive:active list ratios for the anon
> -         * and file LRU lists.
> -         */
> -        if (!sc->force_deactivate) {
> -                unsigned long refaults;
> -
> -                refaults = lruvec_page_state(target_lruvec,
> -                                WORKINGSET_ACTIVATE_ANON);
> -                if (refaults != target_lruvec->refaults[0] ||
> -                        inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
> -                        sc->may_deactivate |= DEACTIVATE_ANON;
> -                else
> -                        sc->may_deactivate &= ~DEACTIVATE_ANON;
> -
> -                /*
> -                 * When refaults are being observed, it means a new
> -                 * workingset is being established. Deactivate to get
> -                 * rid of any stale active pages quickly.
> -                 */
> -                refaults = lruvec_page_state(target_lruvec,
> -                                WORKINGSET_ACTIVATE_FILE);
> -                if (refaults != target_lruvec->refaults[1] ||
> -                    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
> -                        sc->may_deactivate |= DEACTIVATE_FILE;
> -                else
> -                        sc->may_deactivate &= ~DEACTIVATE_FILE;
> -        } else
> -                sc->may_deactivate = DEACTIVATE_ANON | DEACTIVATE_FILE;
> -
> -        /*
> -         * If we have plenty of inactive file pages that aren't
> -         * thrashing, try to reclaim those first before touching
> -         * anonymous pages.
> -         */
> -        file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> -        if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> -                sc->cache_trim_mode = 1;
> -        else
> -                sc->cache_trim_mode = 0;
> -
> -        /*
> -         * Prevent the reclaimer from falling into the cache trap: as
> -         * cache pages start out inactive, every cache fault will tip
> -         * the scan balance towards the file LRU. And as the file LRU
> -         * shrinks, so does the window for rotation from references.
> -         * This means we have a runaway feedback loop where a tiny
> -         * thrashing file LRU becomes infinitely more attractive than
> -         * anon pages. Try to detect this based on file LRU size.
> -         */
> -        if (!cgroup_reclaim(sc)) {
> -                unsigned long total_high_wmark = 0;
> -                unsigned long free, anon;
> -                int z;
> -
> -                free = sum_zone_node_page_state(pgdat->node_id, NR_FREE_PAGES);
> -                file = node_page_state(pgdat, NR_ACTIVE_FILE) +
> -                           node_page_state(pgdat, NR_INACTIVE_FILE);
> -
> -                for (z = 0; z < MAX_NR_ZONES; z++) {
> -                        struct zone *zone = &pgdat->node_zones[z];
> -                        if (!managed_zone(zone))
> -                                continue;
> -
> -                        total_high_wmark += high_wmark_pages(zone);
> -                }
> -
> -                /*
> -                 * Consider anon: if that's low too, this isn't a
> -                 * runaway file reclaim problem, but rather just
> -                 * extreme pressure. Reclaim as per usual then.
> -                 */
> -                anon = node_page_state(pgdat, NR_INACTIVE_ANON);
> -
> -                sc->file_is_tiny =
> -                        file + free <= total_high_wmark &&
> -                        !(sc->may_deactivate & DEACTIVATE_ANON) &&
> -                        anon >> sc->priority;
> -        }
> +        prepare_scan_count(pgdat, sc);
>
>          shrink_node_memcgs(pgdat, sc);
>
> --
> 2.35.1.616.g0bdcbb4464-goog
>

Thanks
Barry
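
For readers following the thread, here is a minimal user-space sketch of the two heuristics that the quoted patch moves into prepare_scan_count(), together with the early return mentioned in the review. Everything prefixed with model_ or MODEL_ is hypothetical and exists only for this illustration: the structs stand in for a few fields of struct scan_control and for counters the kernel reads through lruvec_page_state() and node_page_state(), the lru_gen_enabled flag stands in for lru_gen_enabled(), and the flag values are chosen for the sketch. It assumes global (non-cgroup) reclaim, so the cgroup_reclaim() check is left out. It is not the kernel implementation.

```c
#include <stdbool.h>

/* Hypothetical flag values for this sketch only. */
#define MODEL_DEACTIVATE_ANON 1u
#define MODEL_DEACTIVATE_FILE 2u

/* Stand-in for the few scan_control fields the heuristics touch. */
struct model_scan_control {
        unsigned int may_deactivate;  /* MODEL_DEACTIVATE_* bits */
        int priority;                 /* larger value = lighter scan pressure */
        bool cache_trim_mode;
        bool file_is_tiny;
};

/* Stand-in for the page counters read from the lruvec and the node. */
struct model_node_stats {
        unsigned long inactive_file;    /* inactive file pages in the lruvec */
        unsigned long node_file;        /* active + inactive file pages on the node */
        unsigned long node_free;        /* free pages on the node */
        unsigned long inactive_anon;    /* inactive anon pages on the node */
        unsigned long total_high_wmark; /* sum of high watermarks of managed zones */
};

static void model_prepare_scan_count(const struct model_node_stats *st,
                                     struct model_scan_control *sc,
                                     bool lru_gen_enabled)
{
        /* The review's point: with lru_gen the whole helper can be skipped. */
        if (lru_gen_enabled)
                return;

        /* Plenty of inactive file pages that are not thrashing: trim cache first. */
        sc->cache_trim_mode = (st->inactive_file >> sc->priority) &&
                              !(sc->may_deactivate & MODEL_DEACTIVATE_FILE);

        /* File LRU too small to cover the watermarks while anon is not scarce. */
        sc->file_is_tiny = st->node_file + st->node_free <= st->total_high_wmark &&
                           !(sc->may_deactivate & MODEL_DEACTIVATE_ANON) &&
                           (st->inactive_anon >> sc->priority);
}
```

A caller would fill a model_node_stats with sample counters, set priority and may_deactivate in a model_scan_control, call model_prepare_scan_count(), and then read back cache_trim_mode and file_is_tiny. Passing true for lru_gen_enabled leaves the struct untouched, which is the reason the refactoring makes the later lru_gen change a one-line early return.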