Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp3566166rwl; Mon, 27 Mar 2023 16:27:44 -0700 (PDT) X-Google-Smtp-Source: AKy350bgi0oAy4rzQBI/9AVFKkbSDqkssEiTYJO0eMLE1xszdEPafKTLyAC3cUeJ6FwpU7Pq5WAF X-Received: by 2002:a17:906:9f15:b0:931:c7fd:10b1 with SMTP id fy21-20020a1709069f1500b00931c7fd10b1mr13401875ejc.19.1679959664374; Mon, 27 Mar 2023 16:27:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679959664; cv=none; d=google.com; s=arc-20160816; b=Y73/aFr+EE2CJfzA0jpyklGloaXVtwdgzRKSDvbObu84xoMHH5vmd7ysTEUz+8eIfN t5Qt5Ns5wURPlgqZ1egWCQKybhZdBmrsQRugejSEVFhtINc3ZVzayy1Kx3q/x4YoN+18 cyrojbQd/9CMMTSRIbLthAhPun7pEOjbNfB01GwqG2cZZ96MgJMKAVZVLd5xEPGM5OlW HUFfoUgQr2DqsndNSs5lWEwy60LDiptAnybYLbaGD9oU7Luif8XLB6gysbTRCvl4tWaz l1YAIXzV7DVyu75Lv1MUXefxIRN09+eN6PFoBLyRV7Ef2f1uaJufubopgePCLkrB/Pys /eBA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=2sthNEXRrhXPflNLlQaZ/R6XYUJ5sFT3zFaO3Oj8iQU=; b=GxZPsJ8tN10I3Ir54KhmobguFhx0oW369fcnwVN7qOuayp8wimojfgLgn5+PsaaJ4E IvxU9voS+5BCZYl6VznecWQgWHlZUBadPSo6JH8pRCChH0eVh3Kcr8dT2/2P1rGOMzZ9 pXIKWNhN77Ht9snCoF1jXQjlB5OlEeotHl17HpLzfSUKQyOdrcUn7PneX2NNE8ZestYm m+MJLxAju3S5OudKOMhNE1M+VHRidufQtgBAJ7ySokGSbwLgG7f1x+W24RPSL5HuPdxj /XLqL6e6JK7BFDADTgYBAi2cGpmUp8VRSjf+/iVOcU/M3iZ1X4mGrbRD50WGLzIHYriF KSQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=W+Tv96+p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t3-20020a170906a10300b009339cca7780si23696058ejy.521.2023.03.27.16.27.19; Mon, 27 Mar 2023 16:27:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=W+Tv96+p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229575AbjC0XYN (ORCPT + 99 others); Mon, 27 Mar 2023 19:24:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229608AbjC0XXx (ORCPT ); Mon, 27 Mar 2023 19:23:53 -0400 Received: from mail-ed1-x534.google.com (mail-ed1-x534.google.com [IPv6:2a00:1450:4864:20::534]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 228BF92 for ; Mon, 27 Mar 2023 16:23:52 -0700 (PDT) Received: by mail-ed1-x534.google.com with SMTP id eg48so42556368edb.13 for ; Mon, 27 Mar 2023 16:23:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1679959430; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=2sthNEXRrhXPflNLlQaZ/R6XYUJ5sFT3zFaO3Oj8iQU=; b=W+Tv96+pFUlhT5JOOrNtcy47rsqxINYX5Iydl2ZhOjD2ZICVF/srMD4pjGFArnDzTC kPELiLr2qm8nbVs2H3GBDurYFGHHHR4YppJMTiQgYYWDafqzi7oz9Du7VQA6lba1jKXq dwn2vZhC36MqALm/QJpx59PUuvFUdHNBjYQvAl4nTFdcu6Cb2GfBntbxfBbl6jFLwYkS xFSElExlWZVdm5acLMkTXu9+vioAED99Ij0oJd6h+lwLDOoAZY3FueFbhm3i8L+Th0Ha nQFF7w8Ehg6MPHtETaxtVvSE4I3f3hAepMquA1Zlg0t/TDXtMd7OTTczELyQE4Wsmo2Y 0q3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679959430; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2sthNEXRrhXPflNLlQaZ/R6XYUJ5sFT3zFaO3Oj8iQU=; b=FIFoS01z5T8YcYyTPUDM7GqoILunk7PFdRQ1kjEuyvTmWc8FVU2QAUC6q/btIEJBeW gH5LXvCM17tGct1y5FlgJrrATE7kMu8JagmtujYUcgdrNr2tHK9URjj5vPTQAyQ4PmK1 CT6sMAkiBsRuOE7FFhEJPlooJbo5FrlTQnfdNTCdhvaWnGzDRNnaUjln5DFrPr9lB2HB DoQik2zW0OVdU5oBM6H9lS4MwtnulmmRXlXpxndjL0SDJfpveyYiw0AL0Odp5yZGyzkN KjOmueMO51FkzqcBFkwOjZoKRZ28vyu31+v0fgDz547CxVNB04hHnTrP1wAlbtKreMLq CU+A== X-Gm-Message-State: AAQBX9fJoQOJbzKUSzUyXgTakEXsbt1YEsLyYocrE+WBxUxY/RoX2bJd JSeNC88ngx/8V+Z4VEyL1ltegjW1xivhBICU9lI/7A== X-Received: by 2002:a50:8e0d:0:b0:4fc:473d:3308 with SMTP id 13-20020a508e0d000000b004fc473d3308mr6749360edw.8.1679959430371; Mon, 27 Mar 2023 16:23:50 -0700 (PDT) MIME-Version: 1.0 References: <20230323040037.2389095-1-yosryahmed@google.com> <20230323040037.2389095-2-yosryahmed@google.com> In-Reply-To: From: Yosry Ahmed Date: Mon, 27 Mar 2023 16:23:13 -0700 Message-ID: Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock To: Shakeel Butt Cc: Tejun Heo , Josef Bacik , Jens Axboe , Zefan Li , Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Andrew Morton , Vasily Averin , cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-15.7 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,ENV_AND_HDR_SPF_MATCH, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 24, 2023 at 9:46=E2=80=AFPM Shakeel Butt = wrote: > > On Fri, Mar 24, 2023 at 9:37=E2=80=AFPM Yosry Ahmed wrote: > > > > On Fri, Mar 24, 2023 at 9:31=E2=80=AFPM Shakeel Butt wrote: > > > > > > On Fri, Mar 24, 2023 at 7:18=E2=80=AFPM Yosry Ahmed wrote: > > > > > > > [...] > > > > Any ideas here are welcome! > > > > > > > > > > Let's move forward. It seems like we are not going to reach an > > > agreement on making cgroup_rstat_lock a non-irq lock. However there i= s > > > agreement on the memcg code of not flushing in irq context and the > > > cleanup Johannes has requested. Let's proceed with those for now. We > > > can come back to cgroup_rstat_lock later if we still see issues in > > > production. > > > > Even if we do not flush from irq context, we still flush from atomic > > contexts that will currently hold the lock with irqs disabled > > throughout the entire flush sequence. A primary purpose of this reason > > is to avoid that. > > > > We can either: > > (a) Proceed with the following approach of making cgroup_rstat_lock a > > non-irq lock. > > (b) Proceed with Tejun's suggestion of always releasing and > > reacquiring the lock at CPU boundaries, even for atomic flushes (if > > the spinlock needs a break ofc). > > (c) Something else. > > (d) keep the status quo regarding cgroup_rstat_lock > (e) decouple the discussion of cgroup_rstat_lock from the agreed > improvements. Send the patches for the agreed ones and continue > discussing cgroup_rstat_lock. Ah, I lost sight of the fact that the rest of the patch series does not strictly depend on this patch. I will respin the rest of the patch series separately. Thanks, Shakeel. Meanwhile, it would be useful to reach an agreement here to stop acquiring the cgroup_rstat_lock for a long time with irq disabled in atomic contexts. Tejun, if having the lock be non-irq is a non-starter for you, I can send a patch that instead gives up the lock and reacquires it at every CPU boundary unconditionally -- or perhaps every N CPU boundaries to avoid excessively releasing and reacquiring the lock. Something like: static void cgroup_rstat_flush_locked(struct cgroup *cgrp, bool may_sleep) { ... for_each_possible_cpu(cpu) { ... /* Always yield the at CPU boundaries to enable irqs */ spin_unlock_irq(&cgroup_rstat_lock); /* if @may_sleep, play nice and yield if necessary */ if (may_sleep) cond_resched(); spin_lock_irq(&cgroup_rstat_lock); } } If you have other ideas to avoid disabling irq's for the entire flush sequence I am also open to that. Thanks!