Received: by 2002:a05:7412:37c9:b0:e2:908c:2ebd with SMTP id jz9csp684332rdb; Tue, 19 Sep 2023 07:22:26 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEKf/87Aq1QbnujqPhygNVENMjExe+gq8IF1+2u9ILqi69I6AxdM54wpSWnPY/IGdTUQj66 X-Received: by 2002:a17:902:f681:b0:1c3:9544:cf51 with SMTP id l1-20020a170902f68100b001c39544cf51mr14094619plg.1.1695133346041; Tue, 19 Sep 2023 07:22:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695133346; cv=none; d=google.com; s=arc-20160816; b=Vt2Q/yUB+/ICz5Dh2NNyOcAy1CPTqE4SD+DCU7nUbKjzDeoRfi5aqIO/ULWpzomik5 7602I7Zlf971Ac5Cen2ezqes2LTG7cs4Jf4cxCROL2V/S//Ycq4OXVAbXZn5MeUWB6Nc qScVv3fKuI5er6cAs9tITBPkApe8CBAgRQWMuaCNfaJvHNFfg0As3uoe9so08+39pui1 UcTh8O9qHW/l3t0pV6Cjq91i2HmAuPwW4w4JkGz8xFkD51tu/5MYDPPYBsnekHWp76wy ZZ9RGjo/Wua/kOo4zMrk1Ti9OtutU+71DlLYBwr/Xlp8lXobRNUg3bkN3rtvzHdHi+EH dWag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=J1h1iJ0wFVP3zHxNcBhL2Oixo7OH135dJdL3D7qr0/U=; fh=ftZax1jEQ6IWYUxHRvBO0gYGGnIdDlF8qcTjKhta5hs=; b=NfMgwnno1KHzvGEuC179eJk0R2atr8eFW0G8DshAWxPnXko9MQZ07xvaAM2YAsjKY/ CZhSNHKFJdxD5NDddpM7Iei2y2bThNNEVssGpp4bSz0c50fLB2DpqQZZK7RGd2LyL/hq o43tVZuKqD98rYpr10vQ8BseoSROfzFClvY9CMlcywpYYv6sxDheMR6x+71o6kBa6tUm WV/GKNBoH9TL/Wqaj0ObSF1GMR1u9dflbFDctqtfhTFhqJZHd67Lue8YETOmeakebMYH 1s9k3xKuFCRLxWT9K8eFoDiJc3p8vTgxy0S993dKwDF8C3dq5xm2H0/XGdHAioz15yvy bGiA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from morse.vger.email (morse.vger.email. [2620:137:e000::3:1]) by mx.google.com with ESMTPS id w21-20020a170902d71500b001bc5f13c67esi9520919ply.589.2023.09.19.07.22.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Sep 2023 07:22:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) client-ip=2620:137:e000::3:1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:1 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id 3708282B5168; Tue, 19 Sep 2023 07:08:52 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232633AbjISOIo (ORCPT + 99 others); Tue, 19 Sep 2023 10:08:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232357AbjISOIn (ORCPT ); Tue, 19 Sep 2023 10:08:43 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61D1AF2; Tue, 19 Sep 2023 07:08:37 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D1141C433C8; Tue, 19 Sep 2023 14:08:35 +0000 (UTC) Date: Tue, 19 Sep 2023 10:09:09 -0400 From: Steven Rostedt To: Jaewon Kim Cc: yuzhao@google.com, tjmercier@google.com, kaleshsingh@google.com, akpm@linux-foundation.org, vbabka@suse.cz, hannes@cmpxchg.org, linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, jaewon31.kim@gmail.com Subject: Re: [PATCH] vmscan: add trace events for lru_gen Message-ID: <20230919095927.5a964094@gandalf.local.home> In-Reply-To: <20230919025216.1878-1-jaewon31.kim@samsung.com> References: <20230919025216.1878-1-jaewon31.kim@samsung.com> X-Mailer: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Tue, 19 Sep 2023 07:08:52 -0700 (PDT) On Tue, 19 Sep 2023 11:52:16 +0900 Jaewon Kim wrote: > /* > * Now redefine the EM() and EMe() macros to map the enums to the strings > diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h > index d2123dd960d5..e8f9d0452e89 100644 > --- a/include/trace/events/vmscan.h > +++ b/include/trace/events/vmscan.h > @@ -327,6 +327,55 @@ TRACE_EVENT(mm_vmscan_lru_isolate, > __print_symbolic(__entry->lru, LRU_NAMES)) > ); > > +TRACE_EVENT(mm_vmscan_lru_gen_scan, > + TP_PROTO(int highest_zoneidx, > + int order, > + unsigned long nr_requested, > + unsigned long nr_scanned, > + unsigned long nr_skipped, > + unsigned long nr_taken, > + isolate_mode_t isolate_mode, > + int lru), This is a lot of parameter passing, can you consolidate it? (see below to where you call this) > + > + TP_ARGS(highest_zoneidx, order, nr_requested, nr_scanned, nr_skipped, nr_taken, isolate_mode, lru), > + > + TP_STRUCT__entry( > + __field(int, highest_zoneidx) > + __field(int, order) > + __field(unsigned long, nr_requested) > + __field(unsigned long, nr_scanned) > + __field(unsigned long, nr_skipped) > + __field(unsigned long, nr_taken) > + __field(unsigned int, isolate_mode) > + __field(int, lru) > + ), > + > + TP_fast_assign( > + __entry->highest_zoneidx = highest_zoneidx; > + __entry->order = order; > + __entry->nr_requested = nr_requested; > + __entry->nr_scanned = nr_scanned; > + __entry->nr_skipped = nr_skipped; > + __entry->nr_taken = nr_taken; > + __entry->isolate_mode = (__force unsigned int)isolate_mode; > + __entry->lru = lru; > + ), > + > + /* > + * classzone is previous name of the highest_zoneidx. > + * Reason not to change it is the ABI requirement of the tracepoint. > + */ > + TP_printk("isolate_mode=%d classzone=%d order=%d nr_requested=%lu nr_scanned=%lu nr_skipped=%lu nr_taken=%lu lru=%s", > + __entry->isolate_mode, > + __entry->highest_zoneidx, > + __entry->order, > + __entry->nr_requested, > + __entry->nr_scanned, > + __entry->nr_skipped, > + __entry->nr_taken, > + __print_symbolic(__entry->lru, LRU_GEN_NAMES)) > +); > + > TRACE_EVENT(mm_vmscan_write_folio, > > TP_PROTO(struct folio *folio), > @@ -437,6 +486,53 @@ TRACE_EVENT(mm_vmscan_lru_shrink_active, > show_reclaim_flags(__entry->reclaim_flags)) > ); > > +TRACE_EVENT(mm_vmscan_lru_gen_evict, > + > + TP_PROTO(int nid, unsigned long nr_reclaimed, > + struct reclaim_stat *stat, int priority, int file), > + > + TP_ARGS(nid, nr_reclaimed, stat, priority, file), > + > + TP_STRUCT__entry( > + __field(int, nid) On 64 bit architectures, this causes a 4 byte hole in the ring buffer layout. Please keep 32 bit size fields paired with other 32 bit size if possible. That is, move the above "int nid" down where it doesn't cause a long field to be 4 bytes away. > + __field(unsigned long, nr_reclaimed) > + __field(unsigned long, nr_dirty) > + __field(unsigned long, nr_writeback) > + __field(unsigned long, nr_congested) > + __field(unsigned long, nr_immediate) > + __field(unsigned int, nr_activate0) > + __field(unsigned int, nr_activate1) > + __field(unsigned long, nr_ref_keep) > + __field(unsigned long, nr_unmap_fail) __field(int, nid) here! > + __field(int, priority) > + __field(int, reclaim_flags) > + ), > + > + TP_fast_assign( > + __entry->nid = nid; > + __entry->nr_reclaimed = nr_reclaimed; > + __entry->nr_dirty = stat->nr_dirty; > + __entry->nr_writeback = stat->nr_writeback; > + __entry->nr_congested = stat->nr_congested; > + __entry->nr_immediate = stat->nr_immediate; > + __entry->nr_activate0 = stat->nr_activate[0]; > + __entry->nr_activate1 = stat->nr_activate[1]; > + __entry->nr_ref_keep = stat->nr_ref_keep; > + __entry->nr_unmap_fail = stat->nr_unmap_fail; > + __entry->priority = priority; > + __entry->reclaim_flags = trace_reclaim_flags(file); > + ), > + > + TP_printk("nid=%d nr_reclaimed=%ld nr_dirty=%ld nr_writeback=%ld nr_congested=%ld nr_immediate=%ld nr_activate_anon=%d nr_activate_file=%d nr_ref_keep=%ld nr_unmap_fail=%ld priority=%d flags=%s", > + __entry->nid, __entry->nr_reclaimed, > + __entry->nr_dirty, __entry->nr_writeback, > + __entry->nr_congested, __entry->nr_immediate, > + __entry->nr_activate0, __entry->nr_activate1, > + __entry->nr_ref_keep, __entry->nr_unmap_fail, > + __entry->priority, > + show_reclaim_flags(__entry->reclaim_flags)) > +); > + > TRACE_EVENT(mm_vmscan_node_reclaim_begin, > > TP_PROTO(int nid, int order, gfp_t gfp_flags), > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 6f13394b112e..cc10e3fb8fa2 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -5005,6 +5005,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, > int sorted = 0; > int scanned = 0; > int isolated = 0; > + int skipped = 0; > int remaining = MAX_LRU_BATCH; > struct lru_gen_folio *lrugen = &lruvec->lrugen; > struct mem_cgroup *memcg = lruvec_memcg(lruvec); > @@ -5018,7 +5019,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, > > for (i = MAX_NR_ZONES; i > 0; i--) { > LIST_HEAD(moved); > - int skipped = 0; > + int skipped_zone = 0; > int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES; > struct list_head *head = &lrugen->folios[gen][type][zone]; > > @@ -5040,16 +5041,17 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, > isolated += delta; > } else { > list_move(&folio->lru, &moved); > - skipped += delta; > + skipped_zone += delta; > } > > - if (!--remaining || max(isolated, skipped) >= MIN_LRU_BATCH) > + if (!--remaining || max(isolated, skipped_zone) >= MIN_LRU_BATCH) > break; > } > > - if (skipped) { > + if (skipped_zone) { > list_splice(&moved, head); > - __count_zid_vm_events(PGSCAN_SKIP, zone, skipped); > + __count_zid_vm_events(PGSCAN_SKIP, zone, skipped_zone); > + skipped += skipped_zone; > } > > if (!remaining || isolated >= MIN_LRU_BATCH) > @@ -5065,6 +5067,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc, > __count_memcg_events(memcg, PGREFILL, sorted); > __count_vm_events(PGSCAN_ANON + type, isolated); > > + if (scanned) BTW, you can make this branch conditional with the trace event logic, so that it isn't tested when tracing is enabled. That is, remove the "if (scanned)" test and use TRACE_EVENT_CONDITION() as I show below. > + trace_mm_vmscan_lru_gen_scan(sc->reclaim_idx, sc->order, > + MAX_LRU_BATCH, scanned, skipped, isolated, > + sc->may_unmap ? 0 : ISOLATE_UNMAPPED, type); Why not pass the sc in to the trace event, and then do the assigning there? // use CONDITION to test scanned TRACE_EVENT_CONDITION(mm_vmscan_lru_gen_scan, TP_PROTO(struct scan_control *sc, unsigned long nr_requested, unsigned long nr_scanned, unsigned long nr_skipped, unsigned long nr_taken, int lru), TP_ARGS(...) TP_CONDITION(nr_scanned) TP_fast_assign( __entry->highest_zoneidx = sc->reclaim_idx; __entry->order = sc->order; __entry->nr_requested = nr_requested; __entry->nr_scanned = nr_scanned; __entry->nr_skipped = nr_skipped; __entry->nr_taken = nr_taken; __entry->isolate_mode = (__force unsigned int)(sc->may_unmap ? 0 : ISOLATE_UNMAPPED); __entry->lru = lru; ), Lots of parameters can be expensive to pass, as it requires more copying. -- Steve > /* > * There might not be eligible folios due to reclaim_idx. Check the > * remaining to prevent livelock if it's not making progress. > @@ -5194,6 +5200,8 @@ static int evict_folios(struct lruvec *lruvec, struct scan_control *sc, int swap > retry: > reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false); > sc->nr_reclaimed += reclaimed; > + trace_mm_vmscan_lru_gen_evict(pgdat->node_id, reclaimed, &stat, > + sc->priority, type); > > list_for_each_entry_safe_reverse(folio, next, &list, lru) { > if (!folio_evictable(folio)) {