Date: Sat, 14 Dec 2019 10:59:11 +0000
Message-ID: <86eex7i35s.wl-maz@kernel.org>
From: Marc Zyngier <maz@kernel.org>
To: John Garry <john.garry@huawei.com>
Cc: Ming Lei <ming.lei@redhat.com>, <tglx@linutronix.de>,
    "chenxiang (M)" <chenxiang66@hisilicon.com>, <bigeasy@linutronix.de>,
    <linux-kernel@vger.kernel.org>, <hare@suse.com>, <hch@lst.de>,
    <axboe@kernel.dk>, <bvanassche@acm.org>, <peterz@infradead.org>,
    <mingo@redhat.com>
Subject: Re: [PATCH RFC 1/1] genirq: Make threaded handler use irq affinity for managed interrupt
In-Reply-To: <174bfdbe-0c44-298b-35e9-8466e77528df@huawei.com>
References: <1575642904-58295-1-git-send-email-john.garry@huawei.com>
    <1575642904-58295-2-git-send-email-john.garry@huawei.com>
    <20191207080335.GA6077@ming.t460p>
    <78a10958-fdc9-0576-0c39-6079b9749d39@huawei.com>
    <20191210014335.GA25022@ming.t460p>
    <28424a58-1159-c3f9-1efb-f1366993afcf@huawei.com>
    <048746c22898849d28985c0f65cf2c2a@www.loen.fr>
    <6e513d25d8b0c6b95d37a64df0c27b78@www.loen.fr>
    <06d1e2ff-9ec7-2262-25a0-4503cb204b0b@huawei.com>
    <5caa8414415ab35e74662ac0a30bb4ac@www.loen.fr>
    <2443e657-2ccd-bf85-072c-284ea0b3ce40@huawei.com>
    <214947849a681fc702d018383a3f95ac@www.loen.fr>
    <174bfdbe-0c44-298b-35e9-8466e77528df@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 13 Dec 2019 12:08:54 +0000,
John Garry wrote:
>
> Hi Marc,
>
> >> JFYI, we're still testing this and the patch itself seems to work as
> >> intended.
> >>
> >> Here's the kernel log if you just want to see how the interrupts are
> >> getting assigned:
> >> https://pastebin.com/hh3r810g
> >
> > It is a bit hard to make sense of this dump, specially on such a wide
> > machine (I want one!)
>
> So do I :) That's the newer "D06CS" board.
>
> without really knowing the topology of the system.
>
> So it's 2x socket, each socket has 2x CPU dies, and each die has 6
> clusters of 4 CPUs, which gives 96 in total.
>
> >
> >> For me, I did get a performance boost for NVMe testing, but my
> >> colleague Xiang Chen saw a drop for our storage test of interest -
> >> that's the HiSi SAS controller. We're trying to make sense of it now.
> >
> > One of the difference is that with this patch, the initial affinity
> > is picked inside the NUMA node that matches the ITS.
>
> Is that even for managed interrupts? We're testing the storage
> controller which uses managed interrupts. I should have made that
> clearer.

The ITS driver doesn't care whether an interrupt affinity is
'managed' or not. And I don't think a low-level driver should, as it
will just follow whatever interrupt affinity it is requested to
use. If a managed interrupt has some requirements, then these
requirements had better be explicit in terms of CPU affinity.

> In your case,
> > that's either node 0 or 2. But it is unclear whether which CPUs these
> > map to.
> >
> > Given that I see interrupts mapped to CPUs 0-23 on one side, and 48-71
> > on the other, it looks like half of your machine gets starved,
>
> Seems that way.
>
> So this is a mystery to me:
>
> [ 23.584192] picked CPU62 IRQ147
>
> 147: 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 ITS-MSI 94404626 Edge hisi_sas_v3_hw cq
>
>
> and
>
> [ 25.896728] picked CPU62 IRQ183
>
> 183: 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0 0
> 0 0 0 0 0 0 0
> 0 0 ITS-MSI 94437398 Edge hisi_sas_v3_hw cq
>
>
> But mpstat reports for CPU62:
>
> 12:44:58 AM CPU %usr %nice %sys %iowait %irq %soft
> %steal %guest %gnice %idle
> 12:45:00 AM 62 6.54 0.00 42.99 0.00 6.54 12.15
> 0.00 0.00 6.54 25.23
>
> I don't know what interrupts they are...

Clearly, they aren't your SAS interrupts. But the debug prints do not
mean that these are the only interrupts that are targeting CPU62.
Looking at the CPU62 column of /proc/interrupts should tell you what
fires (and my bet is on something like the timer).

> It's the "hisi_sas_v3_hw cq" interrupts which we're interested in.

Clearly, they aren't firing.

> and that
> > may be because no ITS targets the NUMA nodes they are part of.
>
> So both storage controllers (which we're interested in for this test)
> are on socket #0, node #0.
>
> It would
> > be interesting to see what happens if you manually set the affinity
> > of the interrupts outside of the NUMA node.
> >
>
> Again, managed, so I don't think it's possible.

OK, we need to get back to what the actual requirements of a 'managed'
interrupt are, because there is clearly something that hasn't made it
into the core code...

	M.

-- 
Jazz is not dead, it just smells funny.
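
For reference, the check suggested above (reading one CPU's column of /proc/interrupts to see what actually fires there) could be sketched roughly as follows. This is only an illustration: the helper name busy_irqs_on_cpu and the three-CPU SAMPLE text are made up for the example, and the sample counts are not taken from the machine discussed in this thread.

```python
def busy_irqs_on_cpu(interrupts_text, cpu):
    """Return (label, count, description) for each row with a non-zero count on `cpu`.

    `interrupts_text` is expected to look like the contents of /proc/interrupts:
    a header row of CPU names followed by one row per interrupt.
    """
    lines = interrupts_text.splitlines()
    header = lines[0].split()            # e.g. ["CPU0", "CPU1", ...]
    col = header.index(f"CPU{cpu}")      # position of the requested CPU's column
    ncpus = len(header)
    hits = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # Per-CPU rows carry one count per CPU; summary rows such as ERR have fewer.
        if len(fields) < ncpus or not all(f.isdigit() for f in fields[:ncpus]):
            continue
        count = int(fields[col])
        if count:
            hits.append((label.strip(), count, " ".join(fields[ncpus:])))
    return hits


# Hypothetical three-CPU excerpt, for illustration only.
SAMPLE = """\
           CPU0       CPU1       CPU2
  4:          0         17          0   GICv3  27 Level     arch_timer
147:          0          0          0   ITS-MSI 94404626 Edge      hisi_sas_v3_hw cq
IPI0:        12          3          9       Rescheduling interrupts
ERR:          0
"""

for label, count, desc in busy_irqs_on_cpu(SAMPLE, 1):
    print(f"{label:>6} {count:>10}  {desc}")
```

On a live system the same function can be fed open("/proc/interrupts").read() with cpu=62. Rows that stay at zero, like the "hisi_sas_v3_hw cq" ones quoted above, simply do not show up in the result.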