Date: Tue, 12 Jul 2022 18:14:07 +0100
From: Qais Yousef
To: Tejun Heo
Cc: Xuewen Yan, rafael@kernel.org, viresh.kumar@linaro.org, mingo@redhat.com,
 peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, linux-kernel@vger.kernel.org,
 ke.wang@unisoc.com, xuewyan@foxmail.com, linux-pm@vger.kernel.org,
 Waiman Long
Subject: Re: [PATCH] sched/schedutil: Fix deadlock between cpuset and cpu
 hotplug when using schedutil
Message-ID:
 <20220712171407.ns67p7nygltydupx@wubuntu>
References: <20220705123705.764-1-xuewen.yan@unisoc.com>
 <20220711174629.uehfmqegcwn2lqzu@wubuntu>
 <20220712125702.yg4eqbaakvj56k6m@wubuntu>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/12/22 06:13, Tejun Heo wrote:
> On Tue, Jul 12, 2022 at 01:57:02PM +0100, Qais Yousef wrote:
> > Is there a lot of subsystems beside cpuset that needs the cpus_read_lock()?
> > A quick grep tells me it's the only one.
> >
> > Can't we instead use cpus_read_trylock() in cpuset_can_attach() so that we
> > either hold the lock successfully then before we go ahead and call
> > cpuset_attach(), or bail out and cancel the whole attach operation which should
> > unlock the threadgroup_rwsem() lock?
>
> But now we're failing user-initiated operations randomly. I have a hard time

True. That might appear more random than necessary. It looked neat and I
thought since hotplug operations aren't that common and users must be prepared
for failures for other reasons, it might be okay.

> seeing that as an acceptable solution. The only thing we can do, I think, is
> establishing a locking order between the two locks by either nesting

That might be enough if no other paths can exist which would hold them in
reverse order again. It would be more robust to either hold them both or wait
until we can. Then potential ordering problems can't happen again, because of
this path at least.

> threadgroup_rwsem under cpus_read_lock or disallowing thread creation during
> hotplug operations.

I think that's what Xuewen tried to do in the proposed patch. But it fixes it
for a specific user. If we go with that we'll need nuts and bolts to help warn
when other users do that.

Thanks

--
Qais Yousef