Date: Tue, 19 Apr 2022 14:34:37 +0200
From: David Hildenbrand
Organization: Red Hat
To: Joel Savitz, linux-kernel@vger.kernel.org
Cc: Thomas Gleixner, Valentin Schneider, Peter Zijlstra, Frederic Weisbecker, Mark Rutland, Yuan ZhaoXiong, Baokun Li, "Jason A. Donenfeld", YueHaibing, Randy Dunlap, David Hildenbrand
Subject: Re: [RFC PATCH] kernel/cpu: restart cpu_up when hotplug is disabled
In-Reply-To: <20220418195402.2986573-1-jsavitz@redhat.com>
References: <20220418195402.2986573-1-jsavitz@redhat.com>

On 18.04.22 21:54, Joel Savitz wrote:
> The cpu hotplug path may be utilized while hotplug is disabled for a
> brief moment leading to failures. As an example, attempts to perform
> cpu hotplug by userspace soon after boot may race with pci_device_probe
> leading to inconsistent results.

You might want to expand a bit on the situation in which we observed
that issue fairly reliably.

When restricting the number of boot cpus on the kernel cmdline, e.g.,
via "maxcpus=2", udev will find the offline cpus when enumerating all
cpus and try onlining them.
Due to the race, onlining of some cpus fails, e.g., when racing with
pci_device_probe().

Teaching udev to not online coldplugged CPUs when "maxcpus" was
specified ("policy") revealed the underlying issue that onlining a CPU
can fail with -EBUSY in corner cases when cpu hotplug is temporarily
disabled.

>
> Proposed idea:
> Call restart_syscall instead of returning -EBUSY since
> cpu_hotplug_disabled seems to only have a positive value
> for short, temporary amounts of time.
>
> Does anyone see any serious problems with this?
>
> Signed-off-by: Joel Savitz
> ---
>  kernel/cpu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 5797c2a7a93f..2992c7d1d24e 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -35,6 +35,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include
>  #define CREATE_TRACE_POINTS
> @@ -1401,7 +1402,9 @@ static int cpu_up(unsigned int cpu, enum cpuhp_state target)
>  	cpu_maps_update_begin();
>
>  	if (cpu_hotplug_disabled) {
> -		err = -EBUSY;
> +		/* avoid busy looping (5ms of sleep should be enough) */
> +		msleep(5);
> +		err = restart_syscall();

It's worth noting that we use the same approach in
lock_device_hotplug_sysfs().

It's far from perfect I would say, but we really wanted to avoid making
user space deal with retry logic. For example, while memory onlining can
fail with -EBUSY, that is not expected to happen in practice (we only
fail in very rare cases, when a memory notifier fails -- for example
when kasan fails to allocate memory).

-- 
Thanks,

David / dhildenb
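
For illustration only, here is a minimal user-space sketch of the kind
of retry logic that user space would have to carry if onlining a CPU
kept failing with -EBUSY. The helper names (retry_on_ebusy,
write_online), the retry count and the delay are assumptions made up
for this example; this is not what udev does. The file written would
be something like /sys/devices/system/cpu/cpuN/online.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical callback: try the operation once, return 0 or -errno. */
typedef int (*try_once_fn)(void *ctx);

/*
 * Call op() until it returns something other than -EBUSY, sleeping
 * delay_ms between attempts, for at most max_tries attempts.
 * Returns the last result of op(), or -EINVAL if max_tries <= 0.
 */
int retry_on_ebusy(try_once_fn op, void *ctx, int max_tries, long delay_ms)
{
	int err = -EINVAL;

	assert(op != NULL);
	for (int i = 0; i < max_tries; i++) {
		err = op(ctx);
		if (err != -EBUSY)
			break;
		/* back off briefly instead of busy looping */
		struct timespec ts = {
			.tv_sec = delay_ms / 1000,
			.tv_nsec = (delay_ms % 1000) * 1000000L,
		};
		nanosleep(&ts, NULL);
	}
	return err;
}

/* Write "1" to the file named by ctx (e.g. a cpuN/online sysfs file). */
int write_online(void *ctx)
{
	const char *path = ctx;
	int fd = open(path, O_WRONLY);
	int err;
	ssize_t n;

	if (fd < 0)
		return -errno;
	n = write(fd, "1", 1);
	/* capture errno before close() can overwrite it */
	err = (n == 1) ? 0 : (n < 0 ? -errno : -EIO);
	close(fd);
	return err;
}
```

A caller would use it roughly as retry_on_ebusy(write_online, path, 10, 5)
and still has to decide what to do when all attempts return -EBUSY. That
policy decision is what we would like to keep out of user space.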