Date: Wed, 23 Feb 2022 11:40:18 -0800
Message-Id: <20220223194018.1296629-1-surenb@google.com>
Mime-Version: 1.0
X-Mailer: git-send-email 2.35.1.473.g83b2b277ed-goog
Subject: [PATCH v2 1/1] mm: count time in drain_all_pages during direct reclaim as memory pressure
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, mhocko@suse.com, pmladek@suse.com,
 peterz@infradead.org, guro@fb.com, shakeelb@google.com,
 minchan@kernel.org, timmurray@google.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com,
 surenb@google.com
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When page allocation in the direct reclaim path fails, the system makes
one attempt to shrink per-cpu page lists and free pages from high alloc
reserves. Draining per-cpu pages into the buddy allocator can be a very
slow operation because it is done using workqueues, and the task in
direct reclaim waits for all of them to finish before proceeding.
Currently this time is not accounted as psi memory stall.

While testing mobile devices under extreme memory pressure, with
allocations failing during direct reclaim, we noticed that psi events
which would be expected in such conditions were not triggered. Profiling
these cases showed that the events were missing because a large part of
the time spent in direct reclaim is not accounted as memory stall, so
psi never reached the levels at which an event is generated. Further
investigation revealed that the bulk of that unaccounted time was spent
inside the drain_all_pages call.

A typical captured case when the drain_all_pages path gets activated:

__alloc_pages_slowpath  took 44.644.613ns
    __perform_reclaim   took    751.668ns (1.7%)
    drain_all_pages     took 43.887.167ns (98.3%)

PSI in this case records the time spent in __perform_reclaim but
ignores drain_all_pages, IOW it misses 98.3% of the time spent in
__alloc_pages_slowpath.

Annotate __alloc_pages_direct_reclaim in its entirety so that delays
from handling page allocation failure in the direct reclaim path are
accounted as memory stall.

Reported-by: Tim Murray
Signed-off-by: Suren Baghdasaryan
Acked-by: Johannes Weiner
---
changes in v2:
- Added captured sample case to show the delay numbers, per Michal Hocko
- Moved annotation from __perform_reclaim into __alloc_pages_direct_reclaim,
per Minchan Kim

 mm/page_alloc.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589febc6d31..2e9fbf28938f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4595,13 +4595,12 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
 					const struct alloc_context *ac)
 {
 	unsigned int noreclaim_flag;
-	unsigned long pflags, progress;
+	unsigned long progress;
 
 	cond_resched();
 
 	/* We now go into synchronous reclaim */
 	cpuset_memory_pressure_bump();
-	psi_memstall_enter(&pflags);
 	fs_reclaim_acquire(gfp_mask);
 	noreclaim_flag = memalloc_noreclaim_save();
 
@@ -4610,7 +4609,6 @@ __perform_reclaim(gfp_t gfp_mask, unsigned int order,
 
 	memalloc_noreclaim_restore(noreclaim_flag);
 	fs_reclaim_release(gfp_mask);
-	psi_memstall_leave(&pflags);
 
 	cond_resched();
 
@@ -4624,11 +4622,13 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		unsigned long *did_some_progress)
 {
 	struct page *page = NULL;
+	unsigned long pflags;
 	bool drained = false;
 
+	psi_memstall_enter(&pflags);
 	*did_some_progress = __perform_reclaim(gfp_mask, order, ac);
 	if (unlikely(!(*did_some_progress)))
-		return NULL;
+		goto out;
 
 retry:
 	page = get_page_from_freelist(gfp_mask, order, alloc_flags, ac);
@@ -4644,7 +4644,8 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 		drained = true;
 		goto retry;
 	}
-
+out:
+	psi_memstall_leave(&pflags);
 	return page;
 }
 
-- 
2.35.1.473.g83b2b277ed-goog
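
For readers unfamiliar with the psi events mentioned in the changelog, here is a minimal userspace sketch of how such an event is typically armed, assuming the trigger interface described in Documentation/accounting/psi.rst. The helper names (psi_format_trigger, psi_register_trigger, psi_wait_event) are hypothetical and not part of this patch or of any kernel or library API, and the thresholds in the usage note are illustrative only.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a trigger string of the form
 * "<some|full> <stall_us> <window_us>".
 * Returns the string length, or -1 if it does not fit in buf.
 */
int psi_format_trigger(char *buf, size_t len, const char *kind,
		       unsigned int stall_us, unsigned int window_us)
{
	int n;

	assert(buf && kind);
	n = snprintf(buf, len, "%s %u %u", kind, stall_us, window_us);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: register the trigger on a PSI file such as
 * /proc/pressure/memory. Returns an fd to poll on, or -1 on error.
 */
int psi_register_trigger(const char *path, const char *trigger)
{
	int fd = open(path, O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;
	/* Write the string with its terminating NUL, as the psi.rst example does. */
	if (write(fd, trigger, strlen(trigger) + 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: wait for one event. Returns 1 if the stall
 * threshold was crossed, 0 on timeout, -1 on error or if the event
 * source went away (POLLERR).
 */
int psi_wait_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int n = poll(&pfd, 1, timeout_ms);

	if (n <= 0)
		return n;
	if (pfd.revents & POLLERR)
		return -1;
	return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A monitor would format a trigger such as "some 150000 1000000" (150ms of partial stall within any 1s window), register it on /proc/pressure/memory, and loop on psi_wait_event(). Stall time that is never accounted, like the drain_all_pages wait described above, cannot push the measured stall over such a threshold, which is consistent with the missing events reported here.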