From: Bart Van Assche
Date: Wed, 15 Jul 2015 08:41:05 -0700
To: Thomas Gleixner, Christoph Hellwig
CC: Jens Axboe
Subject: Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity
Message-ID: <55A67F11.1030709@sandisk.com>
References: <20150715120708.GA24534@infradead.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/15/2015 05:12 AM, Thomas Gleixner wrote:
> On Wed, 15 Jul 2015, Christoph Hellwig wrote:
>> Many years ago we decided to move setting of IRQ to core affinities to
>> userspace with the irqbalance daemon.
>>
>> These days we have systems with lots of MSI-X vectors, and we have
>> hardware and subsystem support for per-CPU I/O queues in the block
>> layer, the RDMA subsystem and probably the network stack (I'm not too
>> familiar with the recent developments there). It would really help the
>> out of the box performance and experience if we could allow such
>> subsystems to bind interrupt vectors to the node that the queue is
>> configured on.
>>
>> I'd like to discuss if the rationale for moving the IRQ affinity setting
>> fully to userspace is still correct in today's world and any pitfalls
>> we'll have to learn from in irqbalanced and the old in-kernel affinity
>> code.
>
> I think setting an initial affinity is not going to create the horror
> of the old in-kernel irq balancer again. It still could be changed
> from user space and does not try to be smart by moving interrupts
> around in circles all the time.

Thanks Thomas for your feedback. But no matter whether IRQ balancing
happens in user space or in the kernel, the following issues still need
to be addressed:
* irqbalanced is not aware of the relationship between MSI-X vectors.
  If e.g. two kernel drivers each allocate 24 MSI-X vectors for the PCIe
  interfaces they control, irqbalanced could e.g. decide to associate
  all MSI-X vectors for the first PCIe interface with a first set of
  CPUs and the MSI-X vectors of the second PCIe interface with a second
  set of CPUs. This will result in suboptimal performance if these two
  PCIe interfaces are used alternately instead of simultaneously.
* With blk-mq and scsi-mq optimal performance can only be achieved if
  the relationship between MSI-X vector and NUMA node does not change
  over time. This is necessary to allow a blk-mq/scsi-mq driver to
  ensure that interrupts are processed on the same NUMA node as the node
  on which the data structures for a communication channel have been
  allocated. However, today there is no API that allows blk-mq/scsi-mq
  drivers and irqbalanced to exchange information about the relationship
  between MSI-X vector ranges and NUMA nodes.

The only approach I know of that works today to define IRQ affinity for
blk-mq/scsi-mq drivers is to disable irqbalanced and to run a custom
script that defines IRQ affinity (see e.g. the spread-mlx4-ib-interrupts
attachment of
http://thread.gmane.org/gmane.linux.kernel.device-mapper.devel/21312/focus=98409).

Bart.
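
For readers who have not written such a script, here is a minimal
sketch of the general idea of a custom affinity script: pin each MSI-X
interrupt of a device to one CPU of a chosen NUMA node, round-robin.
It is not the spread-mlx4-ib-interrupts script referenced above, whose
contents are not reproduced here. The helper names and the IRQ numbers
40-47 are hypothetical; the only kernel interfaces assumed are
/proc/irq/<irq>/smp_affinity_list and
/sys/devices/system/node/node<N>/cpulist.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel CPU list such as "0-3,8" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def node_cpus(node):
    """Read the CPUs that belong to a NUMA node from sysfs."""
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    return parse_cpulist(path.read_text())


def spread_irqs(irqs, cpus):
    """Assign each IRQ to exactly one CPU, round-robin over `cpus`."""
    if not cpus:
        raise ValueError("no CPUs to spread over")
    return {irq: cpus[i % len(cpus)] for i, irq in enumerate(irqs)}


def apply_affinity(assignment, dry_run=True):
    """Write each IRQ's CPU to /proc/irq/<irq>/smp_affinity_list.

    Writing requires root. With dry_run=True the intended writes are
    only printed.
    """
    for irq, cpu in sorted(assignment.items()):
        target = Path(f"/proc/irq/{irq}/smp_affinity_list")
        if dry_run:
            print(f"{target} <- {cpu}")
        else:
            target.write_text(f"{cpu}\n")


if __name__ == "__main__":
    # Hypothetical example: eight vectors (IRQs 40-47) spread over the
    # CPUs 0-3. On a real system the CPU list would come from
    # node_cpus(<node>) and the IRQ numbers from /proc/interrupts.
    apply_affinity(spread_irqs(range(40, 48), parse_cpulist("0-3")))
```

As described above, a static assignment like this only stays in place
if irqbalanced is not running, since the daemon may otherwise rewrite
the affinity masks.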