From: Shakeel Butt
Date: Fri, 4 Nov 2022 16:15:41 -0700
Subject: Re: [PATCH] mm: convert mm's rss stats into percpu_counter
To: Andrew Morton
Cc: Marek Szyprowski, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20221024052841.3291983-1-shakeelb@google.com> <20221103171407.ydubp43x7tzahriq@google.com> <20221104160552.c249397512c5c7f8b293869f@linux-foundation.org>
In-Reply-To: <20221104160552.c249397512c5c7f8b293869f@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 4, 2022 at 4:05 PM Andrew Morton wrote:
>
> On Thu, 3 Nov 2022 17:14:07 +0000 Shakeel Butt wrote:
>
> > > > ...
> >
> > Thanks for the report.
> > It seems like there is a race between
> > for_each_online_cpu() in __percpu_counter_sum() and
> > percpu_counter_cpu_dead()/cpu-offlining. Normally this race is fine for
> > percpu_counter users, but check_mm() is not happy with this race. Can
> > you please try the following patch:
>
> percpu-counters supposedly avoid such races via the hotplug notifier.
> So can you please fully describe the race and let's see if it can be
> fixed at the percpu_counter level?
>

Yes, I am writing a more detailed commit message explaining the race
and why it is not really an issue for current users.

> >
> > From: Shakeel Butt
> > Date: Thu, 3 Nov 2022 06:05:13 +0000
> > Subject: [PATCH] mm: percpu_counter: use race free percpu_counter sum interface
> >
> > percpu_counter_sum can race with cpu offlining. Add a new interface
> > which does not race with it and use that for check_mm().
>
> I'll grab this version for now, as others might be seeing this issue.
>

Thanks.

> >
> > ---
> >  include/linux/percpu_counter.h | 11 +++++++++++
> >  kernel/fork.c                  |  2 +-
> >  lib/percpu_counter.c           | 24 ++++++++++++++++++------
> >  3 files changed, 30 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/percpu_counter.h b/include/linux/percpu_counter.h
> > index bde6c4c1f405..3070c1043acf 100644
> > --- a/include/linux/percpu_counter.h
> > +++ b/include/linux/percpu_counter.h
> > @@ -45,6 +45,7 @@ void percpu_counter_set(struct percpu_counter *fbc, s64 amount);
> >  void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount,
> >  			      s32 batch);
> >  s64 __percpu_counter_sum(struct percpu_counter *fbc);
> > +s64 __percpu_counter_sum_all(struct percpu_counter *fbc);
> >  int __percpu_counter_compare(struct percpu_counter *fbc, s64 rhs, s32 batch);
> >  void percpu_counter_sync(struct percpu_counter *fbc);
> >
> > @@ -85,6 +86,11 @@ static inline s64 percpu_counter_sum(struct percpu_counter *fbc)
> >  	return __percpu_counter_sum(fbc);
> >  }
> >
> > +static inline s64 percpu_counter_sum_all(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_all(fbc);
> > +}
>
> We haven't been good about documenting these interfaces. Can we please
> start now? ;)
>

Yup, will do.

> >
> > ...
> >
> > +
> > +/*
> > + * Add up all the per-cpu counts, return the result. This is a more accurate
> > + * but much slower version of percpu_counter_read_positive()
> > + */
> > +s64 __percpu_counter_sum(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_mask(fbc, cpu_online_mask);
> > +}
> >  EXPORT_SYMBOL(__percpu_counter_sum);
> >
> > +s64 __percpu_counter_sum_all(struct percpu_counter *fbc)
> > +{
> > +	return __percpu_counter_sum_mask(fbc, cpu_possible_mask);
> > +}
> > +EXPORT_SYMBOL(__percpu_counter_sum_all);
>
> Probably here is a good place to document it.
>
> Is there any point in having the
> percpu_counter_sum_all()->__percpu_counter_sum_all() inlined wrapper?
> Why not name this percpu_counter_sum_all() directly?
>

Ack.

thanks,
Shakeel
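
For readers following the thread, here is a minimal userspace sketch of the kind of window discussed above. It is a toy model, not kernel code: toy_counter, toy_sum_online(), toy_sum_all(), toy_cpu_mark_offline(), toy_cpu_dead() and toy_race_window() are hypothetical names, and NR_CPUS is an arbitrary value. It also assumes the window sits between a CPU leaving the online mask and its per-CPU delta being folded into the shared count, which the thread does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical number of possible CPUs */

/* Toy stand-in for struct percpu_counter; not the kernel layout. */
struct toy_counter {
	int64_t count;			/* shared count */
	int32_t percpu[NR_CPUS];	/* per-CPU deltas not yet folded in */
};

static bool cpu_online[NR_CPUS];	/* toy version of the online mask */

static void toy_add(struct toy_counter *c, int cpu, int32_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->percpu[cpu] += delta;
}

/* Sum over online CPUs only, in the spirit of __percpu_counter_sum(). */
static int64_t toy_sum_online(const struct toy_counter *c)
{
	int64_t ret = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			ret += c->percpu[cpu];
	return ret;
}

/* Sum over every possible CPU, in the spirit of __percpu_counter_sum_all(). */
static int64_t toy_sum_all(const struct toy_counter *c)
{
	int64_t ret = c->count;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		ret += c->percpu[cpu];
	return ret;
}

/* Assumed step 1 of offlining: the CPU disappears from the online mask. */
static void toy_cpu_mark_offline(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpu_online[cpu] = false;
}

/* Assumed step 2: the dead CPU's delta is folded into the shared count. */
static void toy_cpu_dead(struct toy_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->count += c->percpu[cpu];
	c->percpu[cpu] = 0;
}

/*
 * Take both sums inside the assumed window (after step 1, before step 2)
 * and return the online-mask sum taken once the fold has happened.
 */
int64_t toy_race_window(int64_t *online_sum, int64_t *all_sum)
{
	struct toy_counter c = { 0 };

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_online[cpu] = true;

	toy_add(&c, 0, 3);
	toy_add(&c, 1, 5);

	toy_cpu_mark_offline(1);
	*online_sum = toy_sum_online(&c);	/* skips CPU 1's delta */
	*all_sum = toy_sum_all(&c);		/* still sees CPU 1's delta */

	toy_cpu_dead(&c, 1);
	return toy_sum_online(&c);
}
```

In this model the online-mask sum taken inside the window comes out as 3 while the all-CPUs sum is 8, and the online-mask sum is back to 8 once the fold has run. A transient undercount like that is tolerable for callers that only want an approximate value, but not for a check that expects an exact total, which is presumably why check_mm() is the caller that trips.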