Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp891720rwb; Wed, 14 Dec 2022 04:16:42 -0800 (PST) X-Google-Smtp-Source: AA0mqf6DBrOL6gCMxgnuIKZvc/4gHLwo9cqWEa+kkJkCRqQfRMA1q1TfGla3tpxczWuzMFfgCmyx X-Received: by 2002:a05:6402:7cb:b0:470:306a:5c27 with SMTP id u11-20020a05640207cb00b00470306a5c27mr5837597edy.13.1671020201965; Wed, 14 Dec 2022 04:16:41 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671020201; cv=none; d=google.com; s=arc-20160816; b=OmPashsky0uzFzHHytQ3cb/dzb4zZ4MiK02E3wD9lLUdyTfa78P4vsz6Gil6Nt3AeS Z463YUccrqVnSi7oh1tJOiJw1MV8B9Qh/dij+na1jermHseS88htLVdPAPAgTIB1KHNO SN0RF/2nL9m52f+W0sUG31LQsVsYpAx5Xpna1Zp5BznZeJwDgHq0izeFCcnYAL+Kas1P D4OB2nYExNG25oCr7YmMPCWMSPOIwqi95UNaVQNx2phyVS1kyreqRxmrE1byuFrc5sSd TIUU/5IaRQfrlix9qie30vrrGLMSwkBhlASjr8gR3iyQePYoUdtwIMoh6cd9ZXq/dUED 1A7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=YgnN0QtEEWTyKETKPOOJ3VtCQ7cxhjYteagNX9ETV3M=; b=TWBgOKU6FDmeFOsRAezd9XfpH9sqZCGz6wGuI7LoWiZglYq9U79sL2dGRPFF5iXT+C EagNBG8fAyfVbETdPsxcINWV9oA1S4Vjow9O/x9dgrJh94DhPh3170ZFbAKMsbftmmJE WWtkQe/QWs0flLofavHOJhX0g138+779/UKS3HEuNwx1QDUEOxNCgF6HtXQg5K5up2OF gtRpeY5lJDOOIwxLk+62x40TMZUBJ/nxdZ8HQlX53YCDZQOTybsUwT0dLb65sSVVZbPe YBFcz5be/E5oYATQAtM85XVXgjo3AO3tJPp6dtVGuLOIYv82huMpXmY5npnZt6S7kiFG IoXA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=bXB0wApp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id hs36-20020a1709073ea400b007c10640d8e0si11959517ejc.723.2022.12.14.04.16.23; Wed, 14 Dec 2022 04:16:41 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=bXB0wApp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238305AbiLNLpU (ORCPT + 69 others); Wed, 14 Dec 2022 06:45:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53832 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229695AbiLNLpC (ORCPT ); Wed, 14 Dec 2022 06:45:02 -0500 Received: from mail-wm1-x34a.google.com (mail-wm1-x34a.google.com [IPv6:2a00:1450:4864:20::34a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1C7020F56 for ; Wed, 14 Dec 2022 03:45:00 -0800 (PST) Received: by mail-wm1-x34a.google.com with SMTP id i9-20020a1c3b09000000b003d21fa95c38so4326668wma.3 for ; Wed, 14 Dec 2022 03:45:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YgnN0QtEEWTyKETKPOOJ3VtCQ7cxhjYteagNX9ETV3M=; b=bXB0wAppfK3GA2HYbRuWqvPsTY3f+aGB1QepHMKJA7vImXpQpYm31VFbXfPrtLWxo3 l31asC6VSSz8FeCw53O2jOYzWz5FREHkbkwPvVGX9DQtjgefPrdxNqCIXuZFPhx/dEMx fEvcTISA6ki2F765SfusarEWqS61UtPqwk68EhbtOHx8VT/N8kZrAs1j4oyAnjgt3fnd zIuglo80o3MN2ZZqgRXAiqYiicVNTRIMIlk3ynovic8O66vjoaURSLgYsn4vKk2Ei6Ah gWVuJZd47aotqPqoe9VZAe9xhF+UQDbElM+yssZL0eMnuKZkRCLFLIRmcmKeGiHw3WRt DDHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YgnN0QtEEWTyKETKPOOJ3VtCQ7cxhjYteagNX9ETV3M=; b=hG/cTbnqJtq113ivPryJPCcV+5TIo4r/VmJ3kubwQkwa7QngeCATbvPd3SQe6G0BJL A7B/okyxTbOnaJKtLtG0Z6zyij2+BUFyIpKwQ/lzIMIJH/ZuRmh5osie6px6xA7tZpa5 TeyG86ZaabEV/5P/rgwSA3ohgAfMsI/g9zxd8+EqQAcDQRiSPwEFQTBOf9rlVf7l/oDJ THGVCe6QMnqynHjifF35gI8xoiHZJwYuxEt6Zk2ZaCPOjwsF/Mpb/IYEbYSD9+NA5M6R gJwoRm3i4OCrwsGho2rxXuW0+7fWjvJxw/OygICxLTnxMnaNGw/IIsOofLveVpb9CUds IBjw== X-Gm-Message-State: ANoB5plf0vic4DUek081y55kAcLYUs1sWYYCgxJaGISxzd2Z5k3rxbDW PV+Wu9+YuA5SnNMZm621ita+oslf8k3tNIA3Fg== X-Received: from peternewman-vh.c.googlers.com ([fda3:e722:ac3:cc00:28:9cb1:c0a8:30cc]) (user=peternewman job=sendgmr) by 2002:a1c:6a14:0:b0:3cf:878c:6555 with SMTP id f20-20020a1c6a14000000b003cf878c6555mr184510wmc.38.1671018299448; Wed, 14 Dec 2022 03:44:59 -0800 (PST) Date: Wed, 14 Dec 2022 12:44:47 +0100 In-Reply-To: <20221214114447.1935755-1-peternewman@google.com> Mime-Version: 1.0 References: <20221214114447.1935755-1-peternewman@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221214114447.1935755-2-peternewman@google.com> Subject: [PATCH v5 1/1] x86/resctrl: Fix task CLOSID/RMID update race From: Peter Newman To: reinette.chatre@intel.com, fenghua.yu@intel.com Cc: bp@alien8.de, derkling@google.com, eranian@google.com, hpa@zytor.com, james.morse@arm.com, jannh@google.com, kpsingh@google.com, linux-kernel@vger.kernel.org, mingo@redhat.com, tglx@linutronix.de, x86@kernel.org, Peter Newman Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When the user moves a running task to a new rdtgroup using the tasks file interface or by deleting its rdtgroup, the resulting change in CLOSID/RMID must be immediately propagated to the PQR_ASSOC MSR on the task(s) CPUs. x86 allows reordering loads with prior stores, so if the task starts running between a task_curr() check that the CPU hoisted before the stores in the CLOSID/RMID update then it can start running with the old CLOSID/RMID until it is switched again because __rdtgroup_move_task() failed to determine that it needs to be interrupted to obtain the new CLOSID/RMID. Refer to the diagram below: CPU 0 CPU 1 ----- ----- __rdtgroup_move_task(): curr <- t1->cpu->rq->curr __schedule(): rq->curr <- t1 resctrl_sched_in(): t1->{closid,rmid} -> {1,1} t1->{closid,rmid} <- {2,2} if (curr == t1) // false IPI(t1->cpu) A similar race impacts rdt_move_group_tasks(), which updates tasks in a deleted rdtgroup. In both cases, use smp_mb() to order the task_struct::{closid,rmid} stores before the loads in task_curr(). In particular, in the rdt_move_group_tasks() case, simply execute an smp_mb() on every iteration with a matching task. It is possible to use a single smp_mb() in rdt_move_group_tasks(), but this would require two passes and a means of remembering which task_structs were updated in the first loop. However, benchmarking results below showed too little performance impact in the simple approach to justify implementing the two-pass approach. Times below were collected using `perf stat` to measure the time to remove a group containing a 1600-task, parallel workload. CPU: Intel(R) Xeon(R) Platinum P-8136 CPU @ 2.00GHz (112 threads) # mkdir /sys/fs/resctrl/test # echo $$ > /sys/fs/resctrl/test/tasks # perf bench sched messaging -g 40 -l 100000 task-clock time ranges collected using: # perf stat rmdir /sys/fs/resctrl/test Baseline: 1.54 - 1.60 ms smp_mb() every matching task: 1.57 - 1.67 ms Signed-off-by: Peter Newman --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index e5a48f05e787..5993da21d822 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -580,8 +580,10 @@ static int __rdtgroup_move_task(struct task_struct *tsk, /* * Ensure the task's closid and rmid are written before determining if * the task is current that will decide if it will be interrupted. + * This pairs with the full barrier between the rq->curr update and + * resctrl_sched_in() during context switch. */ - barrier(); + smp_mb(); /* * By now, the task's closid and rmid are set. If the task is current @@ -2401,6 +2403,14 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to, WRITE_ONCE(t->closid, to->closid); WRITE_ONCE(t->rmid, to->mon.rmid); + /* + * Order the closid/rmid stores above before the loads + * in task_curr(). This pairs with the full barrier + * between the rq->curr update and resctrl_sched_in() + * during context switch. + */ + smp_mb(); + /* * If the task is on a CPU, set the CPU in the mask. * The detection is inaccurate as tasks might move or -- 2.39.0.rc1.256.g54fd8350bd-goog