Received: by 2002:a05:6358:11c7:b0:104:8066:f915 with SMTP id i7csp7836028rwl; Thu, 23 Mar 2023 09:12:30 -0700 (PDT) X-Google-Smtp-Source: AK7set8KawFSy3CCErpMOsHFgZM0rN7osh74qRYLSt82n+a/AKHbBBTvFZIdNPFfUeYnODeRTNcV X-Received: by 2002:a17:902:ebcf:b0:1a1:dc27:202b with SMTP id p15-20020a170902ebcf00b001a1dc27202bmr6580318plg.20.1679587950164; Thu, 23 Mar 2023 09:12:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679587950; cv=none; d=google.com; s=arc-20160816; b=i8gd+vhJqDlq+Gv4hYv4l59+xmSW0LJR7P/kaATyeMdwZT4ZCUKi4x/TgBsfSIu6w0 c4jTXhkSsLLMPBHPwS3S6uYbyuIJ1LctXcqpQaGsI8hmSjCX5C1eDX1iPq4hCjBrHGro 0jeM/mRQJMyqcoTmRi6uwjRdO74khNYO9RLzwrm6sdEnHUKY9llW8i73SbNwQ+Ux7sDe 4XcJ3aEC2uMUg3dYpH2P3uLqGLYbH8ysFdF4BI5sZgsbSwwMSXHSF8OSWm6OKm9YMcbD Ot0nMrpGdWgu0EnxMLWcIyImqBGsbrhE85JTMzz0Q2ziCiy+GEdWHdvNl1QNuqP28dyw wo5Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=BgZMEf1PHFMinoIbPEomXcwtNzt8eUFNPvTXJsCn0P8=; b=wDmreBpAvJQtsAOJ6tNPnU/BygbbwIqKCBgB9FLFsDZthU7OYPaQXKb7YFdXkMIjo9 7HNk/rpsKdfTFZI6B2Fg1tH6UHWdCJwepb1TAieU5X51WN65zy9xGJT14o0ykP3W4ss9 uv0O/eHNhD1tZilyXgO1rMti8Os6XanNEQKkMaVv5TRPdDROI3P7PXIf0dk/zac91fgA yxzermD1hkwevYoWI0SCe0/qyZIyFByNpTFhjdQU2ncjbH0Ef5XwewuMrtb3iqOgLy5Q BoEzZXZmxaci68otYpPLajMBhkhaksUYbxBWtKUBYfUDSE2YYvINkk/Eg87CnwzKT3Re RYAA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=k5dpDN1D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j3-20020a17090276c300b0019f3bf39426si18212148plt.518.2023.03.23.09.12.14; Thu, 23 Mar 2023 09:12:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=k5dpDN1D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231499AbjCWQK0 (ORCPT + 99 others); Thu, 23 Mar 2023 12:10:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33616 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231271AbjCWQKY (ORCPT ); Thu, 23 Mar 2023 12:10:24 -0400 Received: from mail-yw1-x1133.google.com (mail-yw1-x1133.google.com [IPv6:2607:f8b0:4864:20::1133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 15A8B35ED5 for ; Thu, 23 Mar 2023 09:10:02 -0700 (PDT) Received: by mail-yw1-x1133.google.com with SMTP id 00721157ae682-544b959a971so369202177b3.3 for ; Thu, 23 Mar 2023 09:10:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; t=1679587801; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=BgZMEf1PHFMinoIbPEomXcwtNzt8eUFNPvTXJsCn0P8=; b=k5dpDN1DHcGjTlwbOvMKLMRzGlnzZn29V1PdhQ9swzOTalJXOVLnQWZ41YtSQ6iUrk 3tYVVzf3TXbBDU3Fht0rRrJX5GujygwDq4CznsLvQvc3iwlzWZ7RjrlJRpn6zm+LX6Is wg3xRL0pVLcK4/WUOQciLucr4KsYng/0RiN3fJUrGmHZTbbYLdyI+Mxa6fi3ob6cRFNq 8SpTkAvVF4xMY6kgvEFxHg+HMcX5zbBQrAgc5RP417XZQXgpltBfUeyhzDW4y/NM+4Qd L5xz9jZ+0DhO8BpAwodd5of1IQ/G/KeTFSQm8YwjDJLsC/QfhlZPdbgvRtQ21H8E1ZYK ce/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679587801; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BgZMEf1PHFMinoIbPEomXcwtNzt8eUFNPvTXJsCn0P8=; b=yofje5m3XFRgzWGJ9EXzn99q0VUV3MnjhM9bUvzdhRu9aH274fBpjuNknpfj+VsqYB +qFjulSKvog10rru9WoRBvf8yll5Y2lBHM6TpePapDGnC9b2Ru9IHmMMOZ7QTpeVcNmC 1M1F2+B1sjZIkl5JG1bfxF0EbEvf5hJHbKW38wJUBxYx/qT4N34ry3FVmevpDBijRjBq HSAI7YpFD6bBX9IKX/scq4lKz3VXrt+EIk0YaITKVwA6Wpp4RerGd8P6dMYh1nFAXT2C eQ/gx0iURtrxZAPQPTRzxmRWBgLLwdovwiiHhKWOQfC+QOm1jybu7Hj1znShh1WUqAip ol4g== X-Gm-Message-State: AAQBX9d21Jg+TPAvdNYl0TzLKiXtyha00yPj++ZpWSX0sgGFU7TTfcEI lTJxpNdXcEei6ts9OQ/JCcr8/J9Luo++Dtkk1GMxMA== X-Received: by 2002:a81:ae1c:0:b0:52e:e095:d840 with SMTP id m28-20020a81ae1c000000b0052ee095d840mr2159466ywh.0.1679587801172; Thu, 23 Mar 2023 09:10:01 -0700 (PDT) MIME-Version: 1.0 References: <20230323040037.2389095-1-yosryahmed@google.com> <20230323040037.2389095-2-yosryahmed@google.com> In-Reply-To: From: Shakeel Butt Date: Thu, 23 Mar 2023 09:09:49 -0700 Message-ID: Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock To: Yosry Ahmed Cc: Tejun Heo , Josef Bacik , Jens Axboe , Zefan Li , Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Andrew Morton , Vasily Averin , cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-15.7 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,ENV_AND_HDR_SPF_MATCH, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 23, 2023 at 8:46=E2=80=AFAM Shakeel Butt = wrote: > > On Thu, Mar 23, 2023 at 8:43=E2=80=AFAM Yosry Ahmed wrote: > > > > On Thu, Mar 23, 2023 at 8:40=E2=80=AFAM Shakeel Butt wrote: > > > > > > On Thu, Mar 23, 2023 at 6:36=E2=80=AFAM Yosry Ahmed wrote: > > > > > > > [...] > > > > > > > > > > > > > 2. Are we really calling rstat flush in irq context? > > > > > > > > > > > > I think it is possible through the charge/uncharge path: > > > > > > memcg_check_events()->mem_cgroup_threshold()->mem_cgroup_usage(= ). I > > > > > > added the protection against flushing in an interrupt context f= or > > > > > > future callers as well, as it may cause a deadlock if we don't = disable > > > > > > interrupts when acquiring cgroup_rstat_lock. > > > > > > > > > > > > > 3. The mem_cgroup_flush_stats() call in mem_cgroup_usage() is= only > > > > > > > done for root memcg. Why is mem_cgroup_threshold() interested= in root > > > > > > > memcg usage? Why not ignore root memcg in mem_cgroup_threshol= d() ? > > > > > > > > > > > > I am not sure, but the code looks like event notifications may = be set > > > > > > up on root memcg, which is why we need to check thresholds. > > > > > > > > > > This is something we should deprecate as root memcg's usage is il= l defined. > > > > > > > > Right, but I think this would be orthogonal to this patch series. > > > > > > > > > > I don't think we can make cgroup_rstat_lock a non-irq-disabling lock > > > without either breaking a link between mem_cgroup_threshold and > > > cgroup_rstat_lock or make mem_cgroup_threshold work without disabling > > > irqs. > > > > > > So, this patch can not be applied before either of those two tasks ar= e > > > done (and we may find more such scenarios). > > > > > > Could you elaborate why? > > > > My understanding is that with an in_task() check to make sure we only > > acquire cgroup_rstat_lock from non-irq context it should be fine to > > acquire cgroup_rstat_lock without disabling interrupts. > > From mem_cgroup_threshold() code path, cgroup_rstat_lock will be taken > with irq disabled while other code paths will take cgroup_rstat_lock > with irq enabled. This is a potential deadlock hazard unless > cgroup_rstat_lock is always taken with irq disabled. Oh you are making sure it is not taken in the irq context through should_skip_flush(). Hmm seems like a hack. Normally it is recommended to actually remove all such users instead of silently ignoring/bypassing the functionality. So, how about removing mem_cgroup_flush_stats() from mem_cgroup_usage(). It will break the known chain which is taking cgroup_rstat_lock with irq disabled and you can add WARN_ON_ONCE(!in_task()).