References: <20230421141723.2405942-1-peternewman@google.com> <20230421141723.2405942-4-peternewman@google.com> <38b9e6df-cccd-a745-da4a-1d1a0ec86ff3@intel.com>
From: Peter Newman
Date: Tue, 16 May 2023 16:18:40 +0200
Subject: Re: [PATCH v1 3/9] x86/resctrl: Add resctrl_mbm_flush_cpu() to collect CPUs' MBM events
To: Reinette Chatre
Cc: Fenghua Yu, Babu Moger, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, "H.
 Peter Anvin", Stephane Eranian, James Morse, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org

Hi Reinette,

On Fri, May 12, 2023 at 3:25 PM Peter Newman wrote:
> On Thu, May 11, 2023 at 11:37 PM Reinette Chatre wrote:
> > On 4/21/2023 7:17 AM, Peter Newman wrote:
> > > +
> > > +	if (evtid == QOS_L3_MBM_LOCAL_EVENT_ID) {
> > > +		counter = &state->local;
> > > +	} else {
> > > +		WARN_ON(evtid != QOS_L3_MBM_TOTAL_EVENT_ID);
> > > +		counter = &state->total;
> > > +	}
> > > +
> > > +	/*
> > > +	 * Propagate the value read from the hw_rmid assigned to the current CPU
> > > +	 * into the "soft" rmid associated with the current task or CPU.
> > > +	 */
> > > +	m = get_mbm_state(d, soft_rmid, evtid);
> > > +	if (!m)
> > > +		return;
> > > +
> > > +	if (resctrl_arch_rmid_read(r, d, hw_rmid, evtid, &val))
> > > +		return;
> > > +
> >
> > This all seems unsafe to run without protection. The code relies on
> > the rdt_domain but a CPU hotplug event could result in the domain
> > disappearing underneath this code. The accesses to the data structures
> > also appear unsafe to me. Note that resctrl_arch_rmid_read() updates
> > the architectural MBM state and this same state can be updated concurrently
> > in other code paths without appropriate locking.
>
> The domain is supposed to always be the current one, but I see that
> even a get_domain_from_cpu(smp_processor_id(), ...)
> call needs to walk
> a resource's domain list to find a matching entry, which could be
> concurrently modified when other domains are added/removed.
>
> Similarly, when soft RMIDs are enabled, it should not be possible to
> call resctrl_arch_rmid_read() outside of on the current CPU's HW RMID.
>
> I'll need to confirm whether it's safe to access the current CPU's
> rdt_domain in an atomic context. If it isn't, I assume I would have to
> arrange all of the state used during flush to be per-CPU.
>
> I expect the constraints on what data can be safely accessed where is
> going to constrain how the state is ultimately arranged, so I will
> need to settle this before I can come back to the other questions
> about mbm_state.

According to cpu_hotplug.rst, the startup callbacks are called before a
CPU is started and the teardown callbacks are called after the CPU has
become dysfunctional, so it should always be safe for a CPU to access
its own data. All I need to do here, then, is avoid walking domain lists
in resctrl_mbm_flush_cpu().

However, this also means that resctrl_{on,off}line_cpu() call
clear_closid_rmid() on a different CPU than the one being brought up or
torn down, so whichever CPU executes these will zap its own pqr_state
struct and PQR_ASSOC MSR.

-Peter
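
For illustration only, here is a rough user-space sketch of what "avoid walking domain lists" in the flush path could look like: the list walk happens once in the online callback, and the flush path reads a per-CPU cached pointer. This is not code from the patch series. Every `sketch_*` name is hypothetical, the plain array merely stands in for a real per-CPU variable, and the bitmask stands in for a cpumask.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* Hypothetical stand-in for struct rdt_domain; not the kernel's definition. */
struct sketch_domain {
	int id;
	unsigned long cpu_mask;		/* bit n set => CPU n is in this domain */
	struct sketch_domain *next;	/* shared list, modified on hotplug */
};

/*
 * Slow path: walks the shared domain list. In the kernel this walk would
 * need protection against concurrent domain add/remove.
 */
static struct sketch_domain *sketch_find_domain(struct sketch_domain *head,
						int cpu)
{
	struct sketch_domain *d;

	for (d = head; d; d = d->next)
		if (d->cpu_mask & (1UL << cpu))
			return d;
	return NULL;
}

/* Stand-in for a per-CPU variable: one slot per CPU. */
static struct sketch_domain *sketch_cpu_domain[SKETCH_NR_CPUS];

/* Assumed to run from an online callback, where the list walk is allowed. */
static void sketch_online_cpu(struct sketch_domain *head, int cpu)
{
	sketch_cpu_domain[cpu] = sketch_find_domain(head, cpu);
}

/* Assumed to run from an offline callback; drops the cached pointer. */
static void sketch_offline_cpu(int cpu)
{
	sketch_cpu_domain[cpu] = NULL;
}

/* Flush path: reads only this CPU's own slot, no list walk. */
static struct sketch_domain *sketch_flush_domain(int cpu)
{
	return sketch_cpu_domain[cpu];
}
```

The trade-off in this sketch is that the cached pointer is only as safe as the ordering between the online/offline callbacks and the flush path, which is exactly the guarantee discussed above.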