Date: Wed, 4 Apr 2018 23:20:19 +0800
From: Ming Lei <ming.lei@redhat.com>
To: Thomas Gleixner
Cc: Jens Axboe, Christoph Hellwig, linux-kernel@vger.kernel.org,
    linux-block@vger.kernel.org, Laurence Oberman
Subject: Re: [PATCH V3 4/4] genirq/affinity: irq vector spread among online CPUs as far as possible
Message-ID: <20180404152018.GB24824@ming.t460p>
References: <20180308105358.1506-1-ming.lei@redhat.com>
    <20180308105358.1506-5-ming.lei@redhat.com>
    <20180403160001.GA25255@ming.t460p>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.9.1 (2017-09-22)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 04, 2018 at 02:45:18PM +0200, Thomas Gleixner wrote:
> On Wed, 4 Apr 2018, Thomas Gleixner wrote:
> > I'm aware how that hw-queue stuff works. But that only works if the
> > spreading algorithm makes the interrupts affine to offline/not-present CPUs
> > when the block device is initialized.
> >
> > In the example above:
> >
> > > > > irq 39, cpu list 0,4
> > > > > irq 40, cpu list 1,6
> > > > > irq 41, cpu list 2,5
> > > > > irq 42, cpu list 3,7
> >
> > and assumed that at driver init time only CPU 0-3 are online then the
> > hotplug of CPU 4-7 will not result in any interrupt delivered to CPU 4-7.
> >
> > So the extra assignment to CPU 4-7 in the affinity mask has no effect
> > whatsoever and even if the spreading result is 'perfect' it just looks
> > perfect as it is not making any difference versus the original result:
> >
> > > > > irq 39, cpu list 0
> > > > > irq 40, cpu list 1
> > > > > irq 41, cpu list 2
> > > > > irq 42, cpu list 3
>
> And looking deeper into the changes, I think that the first spreading step
> has to use cpu_present_mask and not cpu_online_mask.
>
> Assume the following scenario:
>
> Machine with 8 present CPUs is booted, the 4 last CPUs are
> unplugged. Device with 4 queues is initialized.
>
> The resulting spread is going to be exactly your example:
>
> irq 39, cpu list 0,4
> irq 40, cpu list 1,6
> irq 41, cpu list 2,5
> irq 42, cpu list 3,7
>
> Now the 4 offline CPUs are plugged in again.
> These CPUs won't ever get an
> interrupt as all interrupts stay on CPU 0-3 unless one of these CPUs is
> unplugged. Using cpu_present_mask the spread would be:
>
> irq 39, cpu list 0,1
> irq 40, cpu list 2,3
> irq 41, cpu list 4,5
> irq 42, cpu list 6,7

Given that physical CPU hotplug isn't common, this way will leave only irq 39
and irq 40 active most of the time, causing exactly the performance regression
that Kashyap reported.

> while on a machine where CPU 4-7 are NOT present, but advertised as
> possible the spread would be:
>
> irq 39, cpu list 0,4
> irq 40, cpu list 1,6
> irq 41, cpu list 2,5
> irq 42, cpu list 3,7

I think this way is still better: the performance regression can be avoided,
and each irq vector is covered by at least one online CPU, which is often
enough in practice.

As I mentioned in another email, I still don't understand why interrupts
can't be delivered to CPU 4~7 after these CPUs become present & online. In
theory, interrupts should be delivered to these CPUs, since the affinity
info has already been programmed into the interrupt controller. Or do we
still need a CPU hotplug handler in the device driver, to tell the device
about the hotplug change so that interrupts can be delivered to the newly
added CPUs?

Thanks,
Ming
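[Editorial note: the two spreading strategies under discussion can be sketched with a toy model. This is not the kernel's actual `irq_create_affinity_masks()` implementation — the real code is NUMA-aware, which is why the pairings in the quoted examples (e.g. `1,6` and `2,5`) differ from a plain contiguous split — it only illustrates why a single pass over cpu_present_mask groups vectors differently from an online-CPUs-first spread:]

```python
def spread(cpus, nvecs):
    """One toy spreading pass: split the CPU list into nvecs
    contiguous groups (the real kernel code is NUMA-aware)."""
    groups = []
    per, extra = divmod(len(cpus), nvecs)
    i = 0
    for v in range(nvecs):
        n = per + (1 if v < extra else 0)
        groups.append(cpus[i:i + n])
        i += n
    return groups

def spread_two_stage(online, possible, nvecs):
    """Toy model of the patch's approach: spread the online CPUs
    first, then spread the remaining possible CPUs on top, so every
    vector keeps at least one online CPU in its mask."""
    first = spread(online, nvecs)
    rest = [c for c in possible if c not in online]
    second = spread(rest, nvecs)
    return [a + b for a, b in zip(first, second)]

online = [0, 1, 2, 3]        # CPUs 4-7 are offline at device init time
possible = list(range(8))

# Single pass over the present mask:
print(spread(possible, 4))              # [[0, 1], [2, 3], [4, 5], [6, 7]]

# Online CPUs first, remaining possible CPUs second:
print(spread_two_stage(online, possible, 4))
                                        # [[0, 4], [1, 5], [2, 6], [3, 7]]
```

In the present-mask result, the last two vectors land only on the offline CPUs 4-7, which is the regression described above; in the two-stage result every vector has one online CPU, and the offline CPUs merely pad the masks.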