In a realtime environment, it is essential to keep unwanted
IRQs away from isolated CPUs in order to avoid latency
overheads. Creating MSI-X vectors based only on the number of
online CPUs can cause problems on an RT setup that has many
isolated CPUs but only a few housekeeping CPUs: an attempt to
move the IRQs from the isolated CPUs to the limited set of
housekeeping CPUs may fail due to the per-CPU vector limit,
and the IRQ threads that cannot be moved then cause latency
spikes on the isolated CPUs. Instead of sizing the vector
count from the online CPUs alone, use housekeeping_cpumask()
to derive the number of available housekeeping CPUs.
Signed-off-by: Nitesh Narayan Lal <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 5d807c8004f8..9691bececb86 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5,6 +5,7 @@
#include <linux/of_net.h>
#include <linux/pci.h>
#include <linux/bpf.h>
+#include <linux/sched/isolation.h>
/* Local includes */
#include "i40e.h"
@@ -10933,11 +10934,13 @@ static int i40e_reserve_msix_vectors(struct i40e_pf *pf, int vectors)
static int i40e_init_msix(struct i40e_pf *pf)
{
struct i40e_hw *hw = &pf->hw;
+ const struct cpumask *mask;
int cpus, extra_vectors;
int vectors_left;
int v_budget, i;
int v_actual;
int iwarp_requested = 0;
+ int hk_flags;
if (!(pf->flags & I40E_FLAG_MSIX_ENABLED))
return -ENODEV;
@@ -10968,12 +10971,15 @@ static int i40e_init_msix(struct i40e_pf *pf)
/* reserve some vectors for the main PF traffic queues. Initially we
* only reserve at most 50% of the available vectors, in the case that
- * the number of online CPUs is large. This ensures that we can enable
- * extra features as well. Once we've enabled the other features, we
- * will use any remaining vectors to reach as close as we can to the
- * number of online CPUs.
+ * the number of online (housekeeping) CPUs is large. This ensures that
+ * we can enable extra features as well. Once we've enabled the other
+ * features, we will use any remaining vectors to reach as close as we
+ * can to the number of online (housekeeping) CPUs.
*/
- cpus = num_online_cpus();
+ hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
+ mask = housekeeping_cpumask(hk_flags);
+ cpus = cpumask_weight(mask);
+
pf->num_lan_msix = min_t(int, cpus, vectors_left / 2);
vectors_left -= pf->num_lan_msix;
--
2.18.4
> -----Original Message-----
> From: Nitesh Narayan Lal <[email protected]>
> Sent: Monday, June 15, 2020 1:21 PM
> To: [email protected]; [email protected]; [email protected];
> [email protected]; Kirsher, Jeffrey T <[email protected]>; Keller,
> Jacob E <[email protected]>; [email protected]
> Subject: [Patch v1] i40e: limit the msix vectors based on housekeeping CPUs
>
> In a realtime environment, it is essential to keep unwanted
> IRQs away from isolated CPUs in order to avoid latency
> overheads. Creating MSI-X vectors based only on the number of
> online CPUs can cause problems on an RT setup that has many
> isolated CPUs but only a few housekeeping CPUs: an attempt to
> move the IRQs from the isolated CPUs to the limited set of
> housekeeping CPUs may fail due to the per-CPU vector limit,
> and the IRQ threads that cannot be moved then cause latency
> spikes on the isolated CPUs. Instead of sizing the vector
> count from the online CPUs alone, use housekeeping_cpumask()
> to derive the number of available housekeeping CPUs.
>
> Signed-off-by: Nitesh Narayan Lal <[email protected]>
> ---
Ok, so the idea is that "housekeeping" CPUs are to be used for general purpose configuration, and are thus a subset of the online CPUs. By reducing the limit to just the housekeeping CPUs, we ensure that we do not overload the system with more queues than can be handled by the general purpose CPUs?
Thanks,
Jake
On 6/15/20 4:48 PM, Keller, Jacob E wrote:
> Ok, so the idea is that "housekeeping" CPUs are to be used for general purpose configuration, and are thus a subset of the online CPUs. By reducing the limit to just the housekeeping CPUs, we ensure that we do not overload the system with more queues than can be handled by the general purpose CPUs?
Yes.
General purpose, housekeeping, and non-isolated CPUs all refer to the same set here.
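For illustration, a minimal sketch (assuming CONFIG_CPU_ISOLATION; this is not
part of the patch): with isolcpus=/nohz_full= on the kernel command line the
housekeeping mask excludes the isolated CPUs, and with no isolation configured
housekeeping_cpumask() simply falls back to cpu_possible_mask.

#include <linux/cpumask.h>
#include <linux/printk.h>
#include <linux/sched/isolation.h>

static void example_show_cpu_counts(void)
{
        const struct cpumask *hk;

        hk = housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_WQ);

        /* On an RT setup with isolated CPUs, the housekeeping count is
         * smaller than the online count; the driver sizes its vector
         * request against the former.
         */
        pr_info("online CPUs: %u, housekeeping CPUs: %u\n",
                num_online_cpus(), cpumask_weight(hk));
}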
--
Nitesh
On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
> + hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
> + mask = housekeeping_cpumask(hk_flags);
> + cpus = cpumask_weight(mask);
Code like this has no business inside a driver. Please provide a
proper core API for it instead. Also please wire up
pci_alloc_irq_vectors* to use this API as well.
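One possible shape for such a core helper, purely as a sketch of the suggestion
(the helper name, header, and flag choice below are illustrative assumptions,
not an existing kernel API):

#include <linux/cpumask.h>
#include <linux/sched/isolation.h>

/* Hypothetical core helper, e.g. in include/linux/sched/isolation.h:
 * number of CPUs available for housekeeping work, which drivers could
 * use to size their IRQ vector requests.
 */
static inline unsigned int nr_housekeeping_cpus(void)
{
        return cpumask_weight(housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_WQ));
}

A driver like i40e would then just do "cpus = nr_housekeeping_cpus();", and
pci_alloc_irq_vectors* could call the same helper to cap the number of
requested vectors.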
On 6/16/20 4:03 AM, Christoph Hellwig wrote:
> On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
>> + hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> + mask = housekeeping_cpumask(hk_flags);
>> + cpus = cpumask_weight(mask);
> Code like this has no business inside a driver. Please provide a
> proper core API for it instead.
Ok, I will think of a better way of doing this.
> Also please wire up
> pci_alloc_irq_vectors* to use this API as well.
Understood, I will include this in a separate patch.
>
--
Thanks
Nitesh
On 6/16/20 4:03 AM, Christoph Hellwig wrote:
> On Mon, Jun 15, 2020 at 04:21:25PM -0400, Nitesh Narayan Lal wrote:
>> + hk_flags = HK_FLAG_DOMAIN | HK_FLAG_WQ;
>> + mask = housekeeping_cpumask(hk_flags);
>> + cpus = cpumask_weight(mask);
> Code like this has no business inside a driver. Please provide a
> proper core API for it instead. Also please wire up
> pci_alloc_irq_vectors* to use this API as well.
>
Hi Christoph,
I have been looking into defining an nr_housekeeping_* API and using it within
pci_alloc_irq_vectors* to limit the number of vectors.
However, I am wondering about a few things:
- Some drivers, such as i40e, have so far used the number of online CPUs to
  restrict the number of vectors they create. Would it make sense to restrict
  the maximum number of vectors requested based on the number of
  online/housekeeping CPUs (though I will have to make sure that min_vecs is
  always satisfied)? A rough sketch of this option follows after this list.
  The other option would be to check the total number of vectors available
  across all online/housekeeping CPUs when limiting max_vecs; that would
  probably be more accurate?
- Another thing I am wondering about is the right way to test this change.
  Please let me know if you have any suggestions.
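A rough sketch of the first option (illustrative only; whether the clamping
lives in a wrapper like this or inside pci_alloc_irq_vectors* itself is
exactly the open question above):

#include <linux/cpumask.h>
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/sched/isolation.h>

/* Clamp the caller's max_vecs to the housekeeping CPU count, but never
 * below min_vecs, so that the minimum requirement can still be met.
 */
static int example_alloc_irq_vectors(struct pci_dev *pdev,
                                     unsigned int min_vecs,
                                     unsigned int max_vecs,
                                     unsigned int flags)
{
        unsigned int hk_cpus;

        hk_cpus = cpumask_weight(housekeeping_cpumask(HK_FLAG_DOMAIN | HK_FLAG_WQ));
        max_vecs = clamp(hk_cpus, min_vecs, max_vecs);

        return pci_alloc_irq_vectors(pdev, min_vecs, max_vecs, flags);
}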
--
Nitesh