Date: Sat, 9 Dec 2023 10:27:32 +0800
From: Ming Lei
To: Andrew Morton
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, Keith Busch,
    linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
    Yi Zhang, Guangwu Zhang, Chengming Zhou, Jens Axboe
Subject: Re: [PATCH V4 resend] lib/group_cpus.c: avoid to acquire cpu hotplug lock in group_cpus_evenly
References: <20231120083559.285174-1-ming.lei@redhat.com>
    <20231120120059.ef0614c2295b2102100cb56e@linux-foundation.org>
    <20231206151246.99bbf0f253b85f053bea9199@linux-foundation.org>
In-Reply-To: <20231206151246.99bbf0f253b85f053bea9199@linux-foundation.org>

On Wed, Dec 06, 2023 at 03:12:46PM -0800, Andrew Morton wrote:
> On Mon, 20 Nov 2023 12:00:59 -0800 Andrew Morton wrote:
>
> > On Mon, 20 Nov 2023 16:35:59 +0800 Ming Lei wrote:
> >
> > > group_cpus_evenly() could be part of storage driver's error handler,
> > > such as nvme driver, which may happen during CPU hotplug, in which
> > > storage queue has to drain its pending IOs because all CPUs associated
> > > with the queue are offline and the queue is becoming inactive. And
> > > handling IO needs error handler to provide forward progress.
> > >
> > > Then deadlock is caused:
> > >
> > > 1) inside CPU hotplug handler, CPU hotplug lock is held, and blk-mq's
> > > handler is waiting for inflight IO
> > >
> > > 2) error handler is waiting for CPU hotplug lock
> > >
> > > 3) inflight IO can't be completed in blk-mq's CPU hotplug handler because
> > > error handling can't provide forward progress.
> > >
> > > Solve the deadlock by not holding CPU hotplug lock in group_cpus_evenly(),
> > > in which two stage spreads are taken: 1) the 1st stage is over all present
> > > CPUs; 2) the end stage is over all other CPUs.
> > >
> > > Turns out the two stage spread just needs consistent 'cpu_present_mask', and
> > > remove the CPU hotplug lock by storing it into one local cache. This way
> > > doesn't change correctness, because all CPUs are still covered.
> >
> > I'm not sure what is the intended merge path for this, but I can do lib/.
> >
> > Do you think that a -stable backport is needed?  It sounds that way.
> >
> > If so, are we able to identify a suitable Fixes: target?  That would
> > predate f7b3ea8cf72f3 ("genirq/affinity: Move group_cpus_evenly() into
> > lib/").
>
> No?  I think this predates 428e211641ed8 ("genirq/affinity: Replace
> deprecated CPU-hotplug functions.") also.
>
> I'll slap a cc:stable on it and I'll let you and the -stable
> maintainers figure it out.

The issue should have been introduced by 3ee0ce2a54df ("genirq/affinity:
Use get/put_online_cpus around cpumask operations") in v4.8, but the logic
has changed a lot since then, so it may take some effort to backport to the
longterm stable trees.

The issue was reported by RH QA testing, in which CPU hotplug and nvme
error recovery are triggered at the same time. It is easy to reproduce in
the QE lab, but may be hard to trigger in a production environment.

Thanks,
Ming
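
For readers following the thread, the two-stage spread from the quoted changelog can be modeled in userspace. What follows is a minimal Python sketch of the idea, not the kernel code: the real implementation is C in lib/group_cpus.c, works on cpumasks, and is NUMA-aware, none of which is modeled here. The names `spread`, `group_cpus_evenly_sketch`, and `read_present` are hypothetical, and the round-robin placement is an assumption made only to keep the example short. The point it illustrates is that the present mask is read once into a local copy, so both stages see the same view and every possible CPU lands in exactly one group without any hotplug lock being held.

```python
def spread(cpus, groups, start):
    """Place cpus round-robin into groups, beginning at index `start`.

    Returns how many distinct groups received at least one CPU.
    (Round-robin is an illustrative assumption; the kernel spreads
    per NUMA node.)
    """
    touched = set()
    for i, cpu in enumerate(sorted(cpus)):
        idx = (start + i) % len(groups)
        groups[idx].add(cpu)
        touched.add(idx)
    return len(touched)


def group_cpus_evenly_sketch(numgrps, possible, read_present):
    """Hypothetical model of the lock-free two-stage spread.

    `read_present` stands in for reading cpu_present_mask, which may
    change at any time while CPUs are hotplugged.
    """
    possible = frozenset(possible)

    # Take ONE snapshot of the present mask and use it for both stages.
    # Reading the live mask twice could miss or duplicate a CPU.
    present = frozenset(read_present()) & possible

    groups = [set() for _ in range(numgrps)]

    # Stage 1: spread the CPUs that were present in the snapshot.
    nr_present = spread(present, groups, 0)

    # Stage 2: spread all other possible CPUs, continuing after the
    # groups used in stage 1, or wrapping to group 0 if all were used.
    start = 0 if nr_present >= numgrps else nr_present
    spread(possible - present, groups, start)

    return groups
```

Because stage 2 is computed as "possible minus the same snapshot", the two stages partition the possible CPUs by construction, which is the "all CPUs are still covered" argument in the changelog.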