Date: Sat, 7 Sep 2019 06:19:21 +0800
From: Ming Lei
To: Long Li
Cc: Keith Busch, Daniel Lezcano, Keith Busch, Hannes Reinecke, Bart Van Assche, "linux-scsi@vger.kernel.org", Peter Zijlstra, John Garry, LKML, "linux-nvme@lists.infradead.org", Jens Axboe, Ingo Molnar, Thomas Gleixner, Christoph Hellwig, Sagi Grimberg
Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
Message-ID: <20190906221920.GA12290@ming.t460p>

On Fri, Sep 06, 2019 at 05:50:49PM +0000, Long Li wrote:
> >Subject: Re: [PATCH 1/4] softirq: implement IRQ flood detection mechanism
> >
> >On Fri, Sep 06, 2019 at 09:48:21AM +0800, Ming Lei wrote:
> >> When one IRQ flood happens on one CPU:
> >>
> >> 1) softirq handling on this CPU can't make progress
> >>
> >> 2) kernel thread bound to this CPU can't make progress
> >>
> >> For example, network may require softirq to xmit packets, or another
> >> irq thread for handling keyboards/mice or whatever, or rcu_sched may
> >> depend on that CPU for making progress, then the irq flood stalls the
> >> whole system.
> >>
> >> >
> >> > AFAIU, there are fast medium where the responses to requests are
> >> > faster than the time to process them, right?
> >>
> >> Usually medium may not be faster than CPU, now we are talking about
> >> interrupts, which can be originated from lots of devices concurrently,
> >> for example, in Long Li's test, there are 8 NVMe drives involved.
> >
> >Why are all 8 nvmes sharing the same CPU for interrupt handling?
> >Shouldn't matrix_find_best_cpu_managed() handle selecting the least used
> >CPU from the cpumask for the effective interrupt handling?
>
> The tests run on 10 NVMe disks on a system of 80 CPUs. Each NVMe disk has 32 hardware queues.

Then there are 320 NVMe MSI/X vectors in total and only 80 CPUs, so the
irq matrix can't avoid overlapping effective CPUs at all.

> It seems matrix_find_best_cpu_managed() has done its job, but we may still have CPUs that service several hardware queues mapped from other issuing CPUs.
> Another thing to consider is that there may be other managed interrupts on the system, so NVMe interrupts may not end up evenly distributed on such a system.

Another improvement could be to first try not to overlap effective CPUs
among the vectors of fast devices, while still allowing overlap between
slow vectors and fast vectors. This could help when the total number of
fast vectors is <= nr_cpu_cores.

thanks,
Ming
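
To make the arithmetic above concrete, here is a toy sketch in Python. It is
not kernel code and does not reflect how the irq matrix allocator is
implemented; the function names (min_vectors_per_cpu, spread_fast_first) and
the "least-loaded CPU" rule are assumptions made purely for illustration. It
shows the pigeonhole bound for 320 vectors on 80 CPUs, and a placement that
hands out CPUs to fast vectors first and lets slow vectors overlap with them.

```python
import math


def min_vectors_per_cpu(nr_vectors, nr_cpus):
    """Pigeonhole bound: at least one CPU must serve this many vectors."""
    return math.ceil(nr_vectors / nr_cpus)


def spread_fast_first(nr_cpus, nr_fast, nr_slow):
    """Hypothetical placement: fast vectors pick CPUs first, slow ones after.

    Each fast vector goes to the CPU currently serving the fewest fast
    vectors. Slow vectors are spread only against other slow vectors, so
    they are allowed to share a CPU with a fast vector.
    Returns per-CPU counts as (fast_load, slow_load).
    """
    fast_load = [0] * nr_cpus
    slow_load = [0] * nr_cpus
    for _ in range(nr_fast):
        cpu = min(range(nr_cpus), key=lambda c: fast_load[c])
        fast_load[cpu] += 1
    for _ in range(nr_slow):
        cpu = min(range(nr_cpus), key=lambda c: slow_load[c])
        slow_load[cpu] += 1
    return fast_load, slow_load


if __name__ == "__main__":
    # 10 disks x 32 queues = 320 vectors on 80 CPUs: overlap is unavoidable.
    print(min_vectors_per_cpu(10 * 32, 80))  # 4
    # With fast vectors <= CPUs, no CPU serves more than one fast vector.
    fast, slow = spread_fast_first(nr_cpus=80, nr_fast=64, nr_slow=100)
    print(max(fast))  # 1
```

In this model the fast vectors stay non-overlapping only while their total
does not exceed the CPU count, which matches the "<= nr_cpu_cores" condition
above; with 320 fast vectors every CPU still ends up serving four of them.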