From: Kashyap Desai
Date: Thu, 30 Aug 2018 11:15:37 -0600
Message-ID: <51f07ed08d4369fe513bd13f59656a8b@mail.gmail.com>
Subject: RE: Affinity managed interrupts vs non-managed interrupts
To: Sumit Saxena, Ming Lei
Cc: tglx@linutronix.de, hch@lst.de, linux-kernel@vger.kernel.org, Shivasharan Srikanteshwara
In-Reply-To: <300d6fef733ca76ced581f8c6304bac6@mail.gmail.com>
References: <20180829084618.GA24765@ming.t460p> <300d6fef733ca76ced581f8c6304bac6@mail.gmail.com>

Hi Thomas, Ming, Chris et al.,

Your input will help us make the required changes in the megaraid_sas
driver; we are currently waiting for the community's response.

Is it recommended to use pci_enable_msix_range() and have the low-level
driver do the affinity setting itself, since the current APIs around
pci_alloc_irq_vectors() do not meet our requirement?
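To make the question concrete, below is a rough sketch of that
driver-managed alternative: plain pci_enable_msix_range() followed by
irq_set_affinity_hint(), spreading the first 16 vectors across CPUs of the
device's local NUMA node via cpumask_local_spread(). The function name
example_setup_msix(), the EXTRA_VECS count and the error handling are
illustrative assumptions only, not the actual megaraid_sas code:

/*
 * Rough sketch only (not the actual megaraid_sas code): example_setup_msix(),
 * EXTRA_VECS and the error handling are illustrative assumptions.
 */
#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/cpumask.h>

#define EXTRA_VECS 16	/* extra reply queues intended for interrupt coalescing */

static int example_setup_msix(struct pci_dev *pdev,
			      struct msix_entry *entries, int nvec)
{
	int node = dev_to_node(&pdev->dev);
	int i, ret;

	for (i = 0; i < nvec; i++)
		entries[i].entry = i;

	/* Plain MSI-x allocation: no kernel-managed affinity involved. */
	ret = pci_enable_msix_range(pdev, entries, nvec, nvec);
	if (ret < 0)
		return ret;

	/*
	 * Hint the first EXTRA_VECS vectors onto CPUs of the device's local
	 * NUMA node, spread across that node.  The remaining vectors would
	 * have to be spread across all CPUs by the driver in a similar way.
	 */
	for (i = 0; i < EXTRA_VECS && i < ret; i++)
		irq_set_affinity_hint(entries[i].vector,
				      cpumask_of(cpumask_local_spread(i, node)));

	return ret;
}

Note that irq_set_affinity_hint() only publishes a hint, which irqbalance
normally applies; it does not enforce the affinity the way managed
interrupts do, which is part of the trade-off being asked about.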
We want more MSI-x vectors than there are online CPUs, and using
pre_vectors we can do that, but the first 16 MSI-x vectors should be mapped
to the local NUMA node, with their effective CPUs spread across the CPUs of
that node. This is not possible using pci_alloc_irq_vectors_affinity(). Do
we need kernel API changes, or should the low-level driver manage it via
irq_set_affinity_hint()? (A sketch of the pre_vectors allocation follows
the quoted thread below.)

Kashyap

> -----Original Message-----
> From: Sumit Saxena [mailto:sumit.saxena@broadcom.com]
> Sent: Wednesday, August 29, 2018 4:46 AM
> To: Ming Lei
> Cc: tglx@linutronix.de; hch@lst.de; linux-kernel@vger.kernel.org;
> Kashyap Desai; Shivasharan Srikanteshwara
> Subject: RE: Affinity managed interrupts vs non-managed interrupts
>
> > -----Original Message-----
> > From: Ming Lei [mailto:ming.lei@redhat.com]
> > Sent: Wednesday, August 29, 2018 2:16 PM
> > To: Sumit Saxena
> > Cc: tglx@linutronix.de; hch@lst.de; linux-kernel@vger.kernel.org
> > Subject: Re: Affinity managed interrupts vs non-managed interrupts
> >
> > Hello Sumit,
>
> Hi Ming,
> Thanks for the response.
>
> > On Tue, Aug 28, 2018 at 12:04:52PM +0530, Sumit Saxena wrote:
> > > Affinity managed interrupts vs non-managed interrupts
> > >
> > > Hi Thomas,
> > >
> > > We are working on a next-generation MegaRAID product where the
> > > requirement is to allocate an additional 16 MSI-x vectors on top of
> > > the MSI-x vectors the megaraid_sas driver usually allocates. The
> > > MegaRAID adapter supports 128 MSI-x vectors.
> > >
> > > To explain the requirement and solution, consider a 2-socket system
> > > (each socket having 36 logical CPUs). The current driver allocates a
> > > total of 72 MSI-x vectors by calling pci_alloc_irq_vectors() with the
> > > PCI_IRQ_AFFINITY flag. All 72 MSI-x vectors have affinity across both
> > > NUMA nodes and the interrupts are affinity managed.
> > >
> > > If the driver calls pci_alloc_irq_vectors_affinity() with
> > > pre_vectors = 16, it can allocate 16 + 72 MSI-x vectors.
> >
> > Could you explain a bit what the specific use case for the extra 16
> > vectors is?
>
> We are trying to avoid the penalty of one interrupt per IO completion and
> decided to coalesce interrupts on these extra 16 reply queues.
> For the regular 72 reply queues we will not coalesce interrupts, because
> under a low-IO workload interrupt coalescing may add latency due to the
> smaller number of IO completions.
> In the IO submission path, the driver decides which set of reply queues
> (either the extra 16 reply queues or the regular 72 reply queues) to pick
> based on the IO workload.
>
> > > All pre_vectors (16) will be mapped to all available online CPUs but
> > > the effective affinity of each vector is CPU 0. Our requirement is to
> > > have the pre_vectors' 16 reply queues mapped to the local NUMA node,
> > > with the effective CPUs spread within the local node's CPU mask.
> > > Without changing kernel code, we can
> >
> > If all CPUs in one NUMA node are offline, can this use case work as
> > expected? It seems we have to understand what the use case is and how
> > it works.
>
> Yes, if all CPUs of the NUMA node are offlined, the IRQ-CPU affinity will
> be broken and irqbalance takes care of migrating the affected IRQs to
> online CPUs of a different NUMA node.
> When the offline CPUs are onlined again, irqbalance restores the affinity.
>
> > Thanks,
> > Ming
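For reference against the quoted thread above, here is a rough sketch of
the pci_alloc_irq_vectors_affinity() call with pre_vectors = 16 that Sumit
describes (16 extra reply queues plus one managed vector per online CPU).
The function name example_alloc_vectors() and the EXTRA_REPLY_QUEUES
constant are illustrative assumptions, not the in-tree megaraid_sas code:

/*
 * Rough sketch only (not the in-tree megaraid_sas code): the function name
 * and vector counts are illustrative assumptions.
 */
#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/cpumask.h>

#define EXTRA_REPLY_QUEUES 16

static int example_alloc_vectors(struct pci_dev *pdev)
{
	struct irq_affinity desc = { .pre_vectors = EXTRA_REPLY_QUEUES };
	int max_vecs = EXTRA_REPLY_QUEUES + num_online_cpus();

	/*
	 * The pre_vectors are excluded from the kernel's affinity spreading:
	 * they come back mapped to all online CPUs with an effective affinity
	 * of CPU 0.  The remaining vectors are affinity managed and spread
	 * across all CPUs and NUMA nodes.
	 */
	return pci_alloc_irq_vectors_affinity(pdev, EXTRA_REPLY_QUEUES + 1,
					      max_vecs,
					      PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
					      &desc);
}

Getting those first 16 vectors spread across the local NUMA node instead
would require either a kernel-side change to how pre_vectors are treated or
the driver-side irq_set_affinity_hint() approach sketched earlier.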