Date: Wed, 30 Jan 2019 13:43:32 +0100 (CET)
From: Thomas Gleixner
To: John Garry
cc: Keith Busch, Christoph Hellwig, Marc Zyngier, "axboe@kernel.dk",
    Peter Zijlstra, Michael Ellerman, Linuxarm,
    "linux-kernel@vger.kernel.org", Hannes Reinecke
Subject: Re: Question on handling managed IRQs when hotplugging CPUs
In-Reply-To: <3fe63dab-0791-f476-69c4-9866b70e8520@huawei.com>
References: <20190129154433.GF15302@localhost.localdomain>
 <757902fc-a9ea-090b-7853-89944a0ce1b5@huawei.com>
 <20190129172059.GC17132@localhost.localdomain>
 <3fe63dab-0791-f476-69c4-9866b70e8520@huawei.com>
On Wed, 30 Jan 2019, John Garry wrote:
> On 29/01/2019 17:20, Keith Busch wrote:
> > On Tue, Jan 29, 2019 at 05:12:40PM +0000, John Garry wrote:
> > > On 29/01/2019 15:44, Keith Busch wrote:
> > > >
> > > > Hm, we used to freeze the queues with the CPUHP_BLK_MQ_PREPARE
> > > > callback, which would reap all outstanding commands before the CPU
> > > > and IRQ are taken offline. That was removed with commit 4b855ad37194f
> > > > ("blk-mq: Create hctx for each present CPU"). It sounds like we
> > > > should bring something like that back, but make it more fine-grained
> > > > for the per-cpu context.
> > >
> > > Seems reasonable. But we would need it to deal with drivers which only
> > > expose a single queue to blk-mq but use many queues internally. I
> > > think megaraid sas does this, for example.
> > >
> > > I would also be slightly concerned with commands being issued from the
> > > driver unknown to blk-mq, like SCSI TMF.
> >
> > I don't think either of those descriptions sound like good candidates
> > for using managed IRQ affinities.
>
> I wouldn't say that this behaviour is obvious to the developer. I can't
> see anything about it in Documentation/PCI/MSI-HOWTO.txt.
>
> It also seems that this policy of relying on the upper layer to
> flush+freeze queues would cause issues if managed IRQs are used by
> drivers in other subsystems. Network controllers may have multiple
> queues and unsolicited interrupts.

It doesn't matter which part manages the flush/freeze of queues, as long
as something (either common subsystem code, the upper layers or the
driver itself) does it.

So for the megaraid SAS example the blk-mq layer obviously can't do
anything because it only sees a single request queue. But the driver
could, if the hardware supports it, tell the device to stop queueing
completions on the completion queue which is associated with a particular
CPU (or set of CPUs) during offline and then wait for the in-flight stuff
to be finished. If the hardware does not allow that, then managed
interrupts can't work for it.
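
Roughly like this (just a sketch to illustrate the idea; the foo_*()
helpers are made-up driver internals, not existing kernel APIs, and a
real driver would have to pick the hotplug state carefully so the drain
runs before the managed interrupt is shut down):

  #include <linux/init.h>
  #include <linux/cpuhotplug.h>
  #include <linux/wait.h>
  #include <linux/atomic.h>

  struct foo_cq {
  	atomic_t		inflight;	/* commands not yet completed */
  	wait_queue_head_t	drain_wq;	/* woken by the completion path */
  };

  /* Assumed driver/hardware specifics, not kernel APIs */
  struct foo_cq *foo_cq_of_cpu(unsigned int cpu);
  void foo_hw_stop_cq(struct foo_cq *cq);

  static int foo_cpu_offline(unsigned int cpu)
  {
  	struct foo_cq *cq = foo_cq_of_cpu(cpu);

  	/* Tell the device to stop posting completions for this CPU */
  	foo_hw_stop_cq(cq);

  	/* Wait until everything which is in flight has completed */
  	wait_event(cq->drain_wq, !atomic_read(&cq->inflight));
  	return 0;
  }

  static int __init foo_hotplug_init(void)
  {
  	int ret;

  	/* The teardown callback runs on the outgoing CPU during offline */
  	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "foo:cq",
  				NULL, foo_cpu_offline);
  	return ret < 0 ? ret : 0;
  }

Thanks,

	tglx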