Message-ID: <1ea13066-aa98-ead2-f50f-f62d030ce3c5@huawei.com>
Date: Thu, 17 Mar 2022 10:41:57 +0800
Subject: Re: [PATCH 4.19 01/34] cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug
To: Greg Kroah-Hartman, Michal Koutný
CC: Zhao Gongyi, Waiman Long, Tejun Heo, Juri Lelli
References: <20220228172207.090703467@linuxfoundation.org> <20220228172208.566431934@linuxfoundation.org> <20220308151232.GA21752@blackbody.suse.cz> <20220314111940.GC1035@blackbody.suse.cz>
From: Zhang Qiao
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/3/16 22:27, Greg Kroah-Hartman wrote:
> On Mon, Mar 14, 2022 at 12:19:41PM +0100, Michal Koutný wrote:
>> Hello.
>>
>> In my opinion there are two approaches:
>> a) drop this backport (given other races present),
>
> I have no problem with that, want to send a revert patch?
>
>> b) swap the locks compatible with v4.19 as this patch proposes.
>>
>> On Mon, Mar 14, 2022 at 05:11:50PM +0800, Zhang Qiao wrote:
>>> +	/*
>>> +	 * It should hold cpus lock because a cpu offline event can
>>> +	 * cause set_cpus_allowed_ptr() failed.
>>> +	 */
>>> +	cpus_read_lock();
>>
>> Maybe just a nit, the old kernels before commit c5c63b9a6a2e ("cgroup:
>> Replace deprecated CPU-hotplug functions.") v5.15-rc1~159^2~5
>> would be more consistent with get_online_cpus() here (but they're
>> equivalent functionally so the locking order is correct).
>
> A fixed up patch would also be appreciated :)
>

The fixed-up patch follows; it replaces cpus_read_lock() with
get_online_cpus().

Thanks.

--------

[PATCH] cpuset: Fix unsafe lock order between cpuset lock and cpus lock

The backport commit 4eec5fe1c680a ("cgroup/cpuset: Fix a race between
cpuset_attach() and cpu hotplug") looks suspicious: this tree predates
commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock
order") v5.4-rc1~176^2~30, so the locking order here is still cpuset lock
first, then cpus lock.

Fix it by taking the locks in the correct order, and narrow the range
covered by the cpus lock, because only set_cpus_allowed_ptr() needs its
protection.

Fixes: 4eec5fe1c680a ("cgroup/cpuset: Fix a race between cpuset_attach() and cpu hotplug")
Reported-by: Michal Koutný
Signed-off-by: Zhang Qiao
---
 kernel/cgroup/cpuset.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index d43d25acc..4e1c4232e 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1528,9 +1528,13 @@ static void cpuset_attach(struct cgroup_taskset *tset)
 	cgroup_taskset_first(tset, &css);
 	cs = css_cs(css);
 
-	cpus_read_lock();
 	mutex_lock(&cpuset_mutex);
+	/*
+	 * Hold the cpus lock because a CPU offline event can
+	 * cause set_cpus_allowed_ptr() to fail.
+	 */
+	get_online_cpus();
 
 	/* prepare for attach */
 	if (cs == &top_cpuset)
 		cpumask_copy(cpus_attach, cpu_possible_mask);
@@ -1549,6 +1553,7 @@ static void cpuset_attach(struct cgroup_taskset *tset)
 		cpuset_change_task_nodemask(task, &cpuset_attach_nodemask_to);
 		cpuset_update_task_spread_flag(cs, task);
 	}
+	put_online_cpus();
 
 	/*
 	 * Change mm for all threadgroup leaders. This is expensive and may
@@ -1584,7 +1589,6 @@ static void cpuset_attach(struct cgroup_taskset *tset)
 		wake_up(&cpuset_attach_wq);
 
 	mutex_unlock(&cpuset_mutex);
-	cpus_read_unlock();
 }
 
 /* The various types of files and directories in a cpuset file system */
-- 
2.18.0

> thanks,
>
> greg k-h
> .
>
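
For anyone following the ordering argument in the patch description, here is
a minimal userspace sketch in C of the same idea. It is not kernel code and
only an illustration under stated assumptions: the two pthread mutexes,
attach_tasks(), rebuild_domains() and the counters are hypothetical stand-ins
(a plain mutex plays the role of the cpus lock, which in the kernel is a
reader/writer style lock). The point it shows is that every path needing both
locks takes the cpuset lock first and the cpus lock second, and that the
inner lock is held only around the step that depends on the online CPUs.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
static pthread_mutex_t cpuset_lock = PTHREAD_MUTEX_INITIALIZER; /* plays cpuset_mutex */
static pthread_mutex_t cpus_lock = PTHREAD_MUTEX_INITIALIZER;   /* plays the cpus (hotplug) lock */

static int online_cpus = 4;  /* toy state guarded by cpus_lock */
static int attached_tasks;   /* toy state guarded by cpuset_lock */

/*
 * Attach path: outer cpuset lock first, inner cpus lock second.
 * The inner lock covers only the step that reads the online CPUs,
 * standing in for the set_cpus_allowed_ptr() loop in the patch.
 */
static int attach_tasks(int ntasks)
{
	int seen;

	pthread_mutex_lock(&cpuset_lock);

	pthread_mutex_lock(&cpus_lock);
	seen = online_cpus;
	pthread_mutex_unlock(&cpus_lock);

	/* The remaining work needs only the outer lock. */
	attached_tasks += ntasks;

	pthread_mutex_unlock(&cpuset_lock);
	return seen;
}

/*
 * A second path that needs both locks. It uses the same order, so the
 * two paths cannot deadlock against each other (no ABBA inversion).
 */
static void rebuild_domains(int new_online)
{
	pthread_mutex_lock(&cpuset_lock);
	pthread_mutex_lock(&cpus_lock);
	online_cpus = new_online;
	pthread_mutex_unlock(&cpus_lock);
	pthread_mutex_unlock(&cpuset_lock);
}
```

If one of these paths took cpus_lock first and cpuset_lock second, two
threads running the two paths concurrently could each hold one lock while
waiting for the other. That inversion is what the lock swap in the patch is
meant to avoid on kernels that still use the older ordering.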