Subject: Re: Question on handling managed IRQs when hotplugging CPUs
To: John Garry, tglx@linutronix.de, Christoph Hellwig
Cc: Marc Zyngier, "axboe@kernel.dk", Keith Busch, Peter Zijlstra,
    Michael Ellerman, Linuxarm, "linux-kernel@vger.kernel.org",
    SCSI Mailing List
From: Hannes Reinecke <hare@suse.com>
Message-ID: <5bff8227-16fd-6bca-c16e-3992ef6bec5a@suse.com>
Date: Tue, 29 Jan 2019 12:54:44 +0100

On 1/29/19 12:25 PM, John Garry wrote:
> Hi,
>
> I have a question on $subject which
> I hope you can shed some light on.
>
> According to commit c5cb83bb337c25 ("genirq/cpuhotplug: Handle managed
> IRQs on CPU hotplug"), if we offline the last CPU in a managed IRQ
> affinity mask, the IRQ is shut down.
>
> The reasoning is that this IRQ is thought to be associated with a
> specific queue on an MQ device, and the CPUs in the IRQ affinity mask
> are the same CPUs associated with the queue. So, if no CPU is using the
> queue, then there is no need for the IRQ.
>
> However, how does this handle the scenario of the last CPU in the IRQ
> affinity mask being offlined while IO associated with the queue is
> still in flight?
>
> Or if we make the decision to use the queue associated with the current
> CPU, and then that CPU (being the last CPU online in the queue's IRQ
> affinity mask) goes offline and we finish the delivery with another CPU?
>
> In these cases, when the IO completes, it would not be serviced and
> would time out.
>
> I have actually tried this on my arm64 system and I see IO timeouts.
>
That actually is a very good question, and I have been wondering about
this for quite some time.

I find it a bit hard to envision a scenario where the IRQ affinity is
automatically (and, more importantly, atomically!) re-routed to one of
the other CPUs. And even if it were, chances are that there are checks
in the driver _preventing_ them from handling those requests, seeing
that they should have been handled by another CPU ...

I guess the safest bet is to implement a 'cleanup' workqueue which is
responsible for looking through all the outstanding commands (on all
hardware queues), and then completing those for which no corresponding
CPU / irqhandler can be found (a very rough sketch of the idea is
appended at the end of this mail).

But I defer to the higher authorities here; maybe I'm totally wrong and
it's already been taken care of.

But if there is no generic mechanism this really is a fit topic for
LSF/MM, as most other drivers would be affected, too.

Cheers,

Hannes
--
Dr. Hannes Reinecke            zSeries & Storage
hare@suse.com                  +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
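
P.S.: To make the 'cleanup' workqueue idea slightly more concrete, here
is a rough, completely untested sketch. struct my_hw_queue, struct
my_cmd, my_complete_cmd() and my_queue_cpu_offline() are made-up
placeholders for the driver-specific bits; only the cpumask, list and
workqueue helpers are existing kernel APIs. This only illustrates the
shape of the idea, it is not a working implementation.

#include <linux/kernel.h>
#include <linux/cpumask.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct my_cmd {
	struct list_head list;		/* entry on the queue's outstanding list */
	/* ... driver-specific command state ... */
};

struct my_hw_queue {
	spinlock_t lock;
	struct list_head outstanding;	/* commands currently in flight */
	struct cpumask affinity;	/* managed IRQ affinity of this queue */
	struct work_struct cleanup_work; /* set up with INIT_WORK() at probe time */
};

/* Driver-specific: complete (or requeue) a command that will see no IRQ. */
static void my_complete_cmd(struct my_cmd *cmd)
{
	/* ... */
}

static void my_queue_cleanup_work(struct work_struct *work)
{
	struct my_hw_queue *hwq =
		container_of(work, struct my_hw_queue, cleanup_work);
	struct my_cmd *cmd, *tmp;
	unsigned long flags;

	/* Still at least one online CPU for this queue? Then the IRQ lives. */
	if (cpumask_intersects(&hwq->affinity, cpu_online_mask))
		return;

	/* No CPU left to take the interrupt: drain the outstanding commands. */
	spin_lock_irqsave(&hwq->lock, flags);
	list_for_each_entry_safe(cmd, tmp, &hwq->outstanding, list) {
		list_del(&cmd->list);
		my_complete_cmd(cmd);
	}
	spin_unlock_irqrestore(&hwq->lock, flags);
}

/* Called from the driver's CPU hotplug (offline) callback for each queue. */
static void my_queue_cpu_offline(struct my_hw_queue *hwq)
{
	schedule_work(&hwq->cleanup_work);
}

The cpumask_intersects() check is what decides that the queue has lost
its last online CPU; a real implementation would of course also have to
fence against new submissions and against the IRQ still firing while
the list is being drained.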