Date: Thu, 22 Aug 2019 11:48:29 +0200 (CEST)
From: Thomas Gleixner
To: Keith Busch
cc: Ming Lei, Long Li, Jens Axboe, Sagi Grimberg, chenxiang, Peter Zijlstra, Ming Lei, John Garry, Linux Kernel Mailing List, linux-nvme, Keith Busch, Ingo Molnar, Christoph Hellwig, "longli@linuxonhyperv.com"
Subject: Re: [PATCH 0/3] fix interrupt swamp in NVMe
References: <1566281669-48212-1-git-send-email-longli@linuxonhyperv.com> <20190821094406.GA28391@ming.t460p> <20190822013356.GC28635@ming.t460p>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 21 Aug 2019, Keith Busch wrote:
> On Wed, Aug 21, 2019 at 7:34 PM Ming Lei wrote:
> > On Wed, Aug 21, 2019 at 04:27:00PM +0000, Long Li wrote:
> > > Here is the command to benchmark it:
> > >
> > > fio --bs=4k --ioengine=libaio --iodepth=128 --filename=/dev/nvme0n1:/dev/nvme1n1:/dev/nvme2n1:/dev/nvme3n1:/dev/nvme4n1:/dev/nvme5n1:/dev/nvme6n1:/dev/nvme7n1:/dev/nvme8n1:/dev/nvme9n1 --direct=1 --runtime=120 --numjobs=80 --rw=randread --name=test --group_reporting --gtod_reduce=1
> > >
> >
> > I can reproduce the issue on one machine(96 cores) with 4 NVMes(32 queues), so
> > each queue is served on 3 CPUs.
> >
> > IOPS drops > 20% when 'use_threaded_interrupts' is enabled. From fio log, CPU
> > context switch is increased a lot.
>
> Interestingly use_threaded_interrupts shows a marginal improvement on
> my machine with the same fio profile. It was only 5 NVMes, but they've
> one queue per-cpu on 112 cores.

Which is not surprising because the thread and the hard interrupt are on
the same CPU and there is just that little overhead of the context switch.

The thing is that this really depends on how the scheduler decides to place
the interrupt thread. If you have a queue for several CPUs, then depending
on the load situation allowing a multi-cpu affinity for the thread can
cause lots of task migration.

But restricting the irq thread to the CPU on which the interrupt is affine
can also starve that CPU. There is no universal rule for that.

Tracing should tell.

Thanks,

	tglx
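
As a first look before tracing, the affinity of an interrupt can be compared with the CPUs its handler thread is allowed to run on. The following is a minimal sketch and not part of the patch series. It assumes a Linux system with threaded interrupts, where /proc/irq/<N>/smp_affinity_list holds the hard interrupt affinity and the handler thread's comm starts with "irq/<N>-". The helper names (parse_cpulist, placement, irq_thread_pids, main) are made up for illustration.

```python
import os
import sys


def parse_cpulist(text):
    """Turn a kernel cpulist string such as "0-2,5" into a sorted list of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def placement(thread_cpus):
    """Name the trade-off described in the mail for a set of allowed thread CPUs."""
    if len(thread_cpus) <= 1:
        return "single CPU: no migration, but that CPU can be starved"
    return "multiple CPUs: scheduler may migrate the thread under load"


def irq_thread_pids(irq):
    """Find PIDs whose comm starts with "irq/<irq>-" (assumed IRQ thread naming)."""
    prefix = f"irq/{irq}-"
    pids = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/comm") as f:
                if f.read().startswith(prefix):
                    pids.append(int(entry))
        except OSError:
            continue  # task exited while we were scanning
    return pids


def main(irq):
    """Print the hard interrupt affinity and each handler thread's allowed CPUs."""
    with open(f"/proc/irq/{irq}/smp_affinity_list") as f:
        irq_cpus = parse_cpulist(f.read())
    print(f"irq {irq} affinity: {irq_cpus}")
    for pid in irq_thread_pids(irq):
        thread_cpus = sorted(os.sched_getaffinity(pid))
        print(f"  thread {pid}: {thread_cpus} -> {placement(thread_cpus)}")


if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1].isdigit():
    main(sys.argv[1])
```

Run it with an IRQ number taken from /proc/interrupts as the only argument. It only reports the allowed CPU sets. Whether the thread actually migrates or starves a CPU under the fio load still has to come from tracing.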