Subject: Re: Question on handling managed IRQs when hotplugging CPUs
To: Thomas Gleixner
References: <5bff8227-16fd-6bca-c16e-3992ef6bec5a@suse.com>
CC: Hannes Reinecke, Christoph Hellwig, Marc Zyngier, axboe@kernel.dk,
    Keith Busch, Peter Zijlstra, Michael Ellerman, Linuxarm,
    linux-kernel@vger.kernel.org, SCSI Mailing List
From: John Garry
Date: Tue, 29 Jan 2019 17:23:52 +0000
On 29/01/2019 16:27, Thomas Gleixner wrote:
> On Tue, 29 Jan 2019, John Garry wrote:
>> On 29/01/2019 12:01, Thomas Gleixner wrote:
>>> If the last CPU which is associated to a queue (and the corresponding
>>> interrupt) goes offline, then the subsystem/driver code has to make sure
>>> that:
>>>
>>> 1) No more requests can be queued on that queue
>>>
>>> 2) All outstanding requests of that queue have been completed or
>>> redirected (don't know if that's possible at all) to some other queue.
>>
>> This may not be possible. For the HW I deal with, we have symmetrical
>> delivery and completion queues, and a command delivered on DQx will
>> always complete on CQx. Each completion queue has a dedicated IRQ.
>
> So you can stop queueing on DQx and wait for all outstanding ones to come
> in on CQx, right?

Right, and this sounds like what Keith Busch mentioned in his reply.

>>> That has to be done in that order obviously. Whether any of the
>>> subsystems/drivers actually implements this, I can't tell.
>>
>> Going back to c5cb83bb337c25, it seems to me that the change was made
>> with the idea that we can maintain the affinity for the IRQ as we're
>> shutting it down, since no interrupts should occur.
>>
>> However I don't see why we can't instead keep the IRQ up and set the
>> affinity to all online CPUs in the offline path, and restore the original
>> affinity in the online path. The reason we set the queue affinity to
>> specific CPUs is for performance, but I would not say that this matters
>> for handling residual IRQs.
>
> Oh yes it does. The problem, especially on x86, is that if you have a
> large number of queues and you take a large number of CPUs offline, then
> you run into vector space exhaustion on the remaining online CPUs.
>
> In the worst case a single CPU on x86 has only 186 vectors available for
> device interrupts. So just take a quad socket machine with 144 CPUs and
> two multiqueue devices with a queue per CPU. ---> FAIL
>
> It probably fails already with one device, because there are lots of
> other devices which have regular interrupts which cannot be shut down.

OK, understood.

Thanks,
John

> Thanks,
>
> 	tglx
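For concreteness, below is a minimal sketch of the drain sequence agreed
on above: stop accepting new commands on DQx, then wait until every
outstanding command has completed on CQx, before the last mapped CPU goes
offline. All my_hw_* names and the queue structure are hypothetical
stand-ins for driver-specific code, not a real kernel API.

	#include <linux/atomic.h>
	#include <linux/wait.h>

	struct my_hw_queue {
		atomic_t          outstanding; /* sent on DQx, not yet seen on CQx */
		bool              stopped;     /* no new submissions accepted      */
		wait_queue_head_t drain_wq;
	};

	/* Called before the last CPU mapped to this queue goes offline. */
	static void my_hw_queue_drain(struct my_hw_queue *q)
	{
		/* 1) No more requests can be queued on that queue. */
		WRITE_ONCE(q->stopped, true);

		/* 2) Wait for every outstanding command to complete on CQx. */
		wait_event(q->drain_wq, atomic_read(&q->outstanding) == 0);
	}

	/* CQx completion path: wake the drainer when the queue is empty. */
	static void my_hw_queue_complete_one(struct my_hw_queue *q)
	{
		if (atomic_dec_and_test(&q->outstanding))
			wake_up(&q->drain_wq);
	}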
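And a sketch of the alternative proposed above: keep the IRQ up, widen its
affinity to all online CPUs on the offline path, and restore the original
mask on the online path. It uses the real cpuhp_setup_state() and
irq_set_affinity_hint() APIs, but the driver state is hypothetical, and
whether the IRQ core would honour such a request for a *managed* IRQ is
exactly what is in question in this thread. It also runs straight into
tglx's objection: two devices with 144 queues each need 288 vectors, which
cannot fit into the 186 device-interrupt vectors of a lone surviving CPU.

	#include <linux/cpuhotplug.h>
	#include <linux/cpumask.h>
	#include <linux/interrupt.h>

	/* Save/restore state for one queue IRQ (hypothetical driver). */
	struct my_irq_ctx {
		unsigned int   irq;
		struct cpumask orig_mask; /* performance-tuned queue affinity */
	};

	static struct my_irq_ctx ctx;

	static int my_cpu_offline(unsigned int cpu)
	{
		/*
		 * A mapped CPU is going down: spread the IRQ over all online
		 * CPUs so residual completions are still serviced. A real
		 * driver would only do this when the *last* CPU in orig_mask
		 * goes offline.
		 */
		if (cpumask_test_cpu(cpu, &ctx.orig_mask))
			irq_set_affinity_hint(ctx.irq, cpu_online_mask);
		return 0;
	}

	static int my_cpu_online(unsigned int cpu)
	{
		/* A mapped CPU came back: restore the original affinity. */
		if (cpumask_test_cpu(cpu, &ctx.orig_mask))
			irq_set_affinity_hint(ctx.irq, &ctx.orig_mask);
		return 0;
	}

	static int my_driver_register_hotplug(void)
	{
		/* Dynamic hotplug state; callbacks run as CPUs go up/down. */
		return cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mydrv:queue",
					 my_cpu_online, my_cpu_offline);
	}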