2021-04-12 06:59:24

by Jason Xing

[permalink] [raw]
Subject: [PATCH] i40e: fix the panic when running bpf in xdpdrv mode

From: Jason Xing <[email protected]>

Fix this by add more rules to calculate the value of @rss_size_max which
could be used in allocating the queues when bpf is loaded, which, however,
could cause the failure and then triger the NULL pointer of vsi->rx_rings.
Prio to this fix, the machine doesn't care about how many cpus are online
and then allocates 256 queues on the machine with 32 cpus online
actually.

Once the load of bpf begins, the log will go like this "failed to get
tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
failed".

Thus, I attach the key information of the crash-log here.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
Call Trace:
[2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
[2160294.717666] dev_xdp_install+0x4f/0x70
[2160294.718036] dev_change_xdp_fd+0x11f/0x230
[2160294.718380] ? dev_disable_lro+0xe0/0xe0
[2160294.718705] do_setlink+0xac7/0xe70
[2160294.719035] ? __nla_parse+0xed/0x120
[2160294.719365] rtnl_newlink+0x73b/0x860

Signed-off-by: Jason Xing <[email protected]>
Signed-off-by: Shujin Li <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9d..4e9a247 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
{
int err = 0;
int size;
+ u16 pow;

/* Set default capability flags */
pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
@@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
pf->rss_table_size = pf->hw.func_caps.rss_table_size;
pf->rss_size_max = min_t(int, pf->rss_size_max,
pf->hw.func_caps.num_tx_qp);
+
+ /* find the next higher power-of-2 of num cpus */
+ pow = roundup_pow_of_two(num_online_cpus());
+ pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
+
if (pf->hw.func_caps.rss) {
pf->flags |= I40E_FLAG_RSS_ENABLED;
pf->alloc_rss_size = min_t(int, pf->rss_size_max,
--
1.8.3.1


2021-04-13 07:18:51

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH] i40e: fix the panic when running bpf in xdpdrv mode

[email protected] wrote:

> From: Jason Xing <[email protected]>
>
> Re: [PATCH] i40e: fix the panic when running bpf in xdpdrv mode

Please use netdev style subject lines when patching net kernel to
indicate which kernel tree this is targeted at, "net" or "net-next"
[PATCH net v2] i40e: ...

> Fix this by add more rules to calculate the value of @rss_size_max which

Fix this panic by adding ...

> could be used in allocating the queues when bpf is loaded, which, however,
> could cause the failure and then triger the NULL pointer of vsi->rx_rings.

trigger

> Prio to this fix, the machine doesn't care about how many cpus are online
> and then allocates 256 queues on the machine with 32 cpus online
> actually.
>
> Once the load of bpf begins, the log will go like this "failed to get
> tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> failed".
>
> Thus, I attach the key information of the crash-log here.
>
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000000
> RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> Call Trace:
> [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> [2160294.717666] dev_xdp_install+0x4f/0x70
> [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> [2160294.718705] do_setlink+0xac7/0xe70
> [2160294.719035] ? __nla_parse+0xed/0x120
> [2160294.719365] rtnl_newlink+0x73b/0x860
>
> Signed-off-by: Jason Xing <[email protected]>
> Signed-off-by: Shujin Li <[email protected]>

if you send to "net" - I suspect you should supply a Fixes: line, above
the sign-offs.
In this case however, this bug has been here since the beginning of the
driver, but the patch will easily apply, so please supply

Fixes: 41c445ff0f48 ("i40e: main driver core")

> ---
> drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 521ea9d..4e9a247 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
> {
> int err = 0;
> int size;
> + u16 pow;
>
> /* Set default capability flags */
> pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
> @@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
> pf->rss_table_size = pf->hw.func_caps.rss_table_size;
> pf->rss_size_max = min_t(int, pf->rss_size_max,
> pf->hw.func_caps.num_tx_qp);
> +
> + /* find the next higher power-of-2 of num cpus */
> + pow = roundup_pow_of_two(num_online_cpus());
> + pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
> +

The fix itself is fine, and is correct as far as I can tell, thank you
for sending the patch!

> if (pf->hw.func_caps.rss) {
> pf->flags |= I40E_FLAG_RSS_ENABLED;
> pf->alloc_rss_size = min_t(int, pf->rss_size_max,


2021-04-13 12:28:52

by Jason Xing

[permalink] [raw]
Subject: Re: [PATCH] i40e: fix the panic when running bpf in xdpdrv mode

On Tue, Apr 13, 2021 at 5:52 AM Jesse Brandeburg
<[email protected]> wrote:
>
> [email protected] wrote:
>
> > From: Jason Xing <[email protected]>
> >
> > Re: [PATCH] i40e: fix the panic when running bpf in xdpdrv mode
>
> Please use netdev style subject lines when patching net kernel to
> indicate which kernel tree this is targeted at, "net" or "net-next"
> [PATCH net v2] i40e: ...
>
> > Fix this by add more rules to calculate the value of @rss_size_max which
>
> Fix this panic by adding ...
>
> > could be used in allocating the queues when bpf is loaded, which, however,
> > could cause the failure and then triger the NULL pointer of vsi->rx_rings.
>
> trigger
>
> > Prio to this fix, the machine doesn't care about how many cpus are online
> > and then allocates 256 queues on the machine with 32 cpus online
> > actually.
> >
> > Once the load of bpf begins, the log will go like this "failed to get
> > tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> > failed".
> >
> > Thus, I attach the key information of the crash-log here.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000000
> > RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> > Call Trace:
> > [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> > [2160294.717666] dev_xdp_install+0x4f/0x70
> > [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> > [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> > [2160294.718705] do_setlink+0xac7/0xe70
> > [2160294.719035] ? __nla_parse+0xed/0x120
> > [2160294.719365] rtnl_newlink+0x73b/0x860
> >
> > Signed-off-by: Jason Xing <[email protected]>
> > Signed-off-by: Shujin Li <[email protected]>
>
> if you send to "net" - I suspect you should supply a Fixes: line, above
> the sign-offs.
> In this case however, this bug has been here since the beginning of the
> driver, but the patch will easily apply, so please supply
>
> Fixes: 41c445ff0f48 ("i40e: main driver core")
>
> > ---
> > drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > index 521ea9d..4e9a247 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > @@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
> > {
> > int err = 0;
> > int size;
> > + u16 pow;
> >
> > /* Set default capability flags */
> > pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
> > @@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
> > pf->rss_table_size = pf->hw.func_caps.rss_table_size;
> > pf->rss_size_max = min_t(int, pf->rss_size_max,
> > pf->hw.func_caps.num_tx_qp);
> > +
> > + /* find the next higher power-of-2 of num cpus */
> > + pow = roundup_pow_of_two(num_online_cpus());
> > + pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
> > +
>
> The fix itself is fine, and is correct as far as I can tell, thank you
> for sending the patch!
>

Thanks for your advice. I'm going to send the patch v2 :)

Jason

> > if (pf->hw.func_caps.rss) {
> > pf->flags |= I40E_FLAG_RSS_ENABLED;
> > pf->alloc_rss_size = min_t(int, pf->rss_size_max,
>
>

2021-04-13 13:11:44

by Jason Xing

[permalink] [raw]
Subject: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

From: Jason Xing <[email protected]>

Fix this panic by adding more rules to calculate the value of @rss_size_max
which could be used in allocating the queues when bpf is loaded, which,
however, could cause the failure and then trigger the NULL pointer of
vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
cpus are online and then allocates 256 queues on the machine with 32 cpus
online actually.

Once the load of bpf begins, the log will go like this "failed to get
tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
failed".

Thus, I attach the key information of the crash-log here.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
Call Trace:
[2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
[2160294.717666] dev_xdp_install+0x4f/0x70
[2160294.718036] dev_change_xdp_fd+0x11f/0x230
[2160294.718380] ? dev_disable_lro+0xe0/0xe0
[2160294.718705] do_setlink+0xac7/0xe70
[2160294.719035] ? __nla_parse+0xed/0x120
[2160294.719365] rtnl_newlink+0x73b/0x860

Fixes: 41c445ff0f48 ("i40e: main driver core")

Signed-off-by: Jason Xing <[email protected]>
Signed-off-by: Shujin Li <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9d..4e9a247 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
{
int err = 0;
int size;
+ u16 pow;

/* Set default capability flags */
pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
@@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
pf->rss_table_size = pf->hw.func_caps.rss_table_size;
pf->rss_size_max = min_t(int, pf->rss_size_max,
pf->hw.func_caps.num_tx_qp);
+
+ /* find the next higher power-of-2 of num cpus */
+ pow = roundup_pow_of_two(num_online_cpus());
+ pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
+
if (pf->hw.func_caps.rss) {
pf->flags |= I40E_FLAG_RSS_ENABLED;
pf->alloc_rss_size = min_t(int, pf->rss_size_max,
--
1.8.3.1

2021-04-13 21:33:03

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

[email protected] wrote:

> From: Jason Xing <[email protected]>

Hi Jason,

Sorry, I missed this on the first time: Added intel-wired-lan,
please include on any future submissions for Intel drivers.
get-maintainers script might help here?

>
> Fix this panic by adding more rules to calculate the value of @rss_size_max
> which could be used in allocating the queues when bpf is loaded, which,
> however, could cause the failure and then trigger the NULL pointer of
> vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> cpus are online and then allocates 256 queues on the machine with 32 cpus
> online actually.
>
> Once the load of bpf begins, the log will go like this "failed to get
> tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> failed".
>
> Thus, I attach the key information of the crash-log here.
>
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000000
> RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> Call Trace:
> [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> [2160294.717666] dev_xdp_install+0x4f/0x70
> [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> [2160294.718705] do_setlink+0xac7/0xe70
> [2160294.719035] ? __nla_parse+0xed/0x120
> [2160294.719365] rtnl_newlink+0x73b/0x860
>
> Fixes: 41c445ff0f48 ("i40e: main driver core")
>

This Fixes line should be connected to the Sign offs with
no linefeeds between.

> Signed-off-by: Jason Xing <[email protected]>
> Signed-off-by: Shujin Li <[email protected]>

Did Shujin contribute to this patch? Why are they signing off? If
they developed this patch with you, it should say:
Co-developed-by: Shujin ....
Signed-off-by: Shujin ...
Signed-off-by: Jason ...

Your signature should be last if you sent the patch. The sign-offs are
like a chain of custody, please review
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by

Thanks,
Jesse

2021-04-14 10:02:07

by Jason Xing

[permalink] [raw]
Subject: Re: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

On Wed, Apr 14, 2021 at 12:27 AM Jesse Brandeburg
<[email protected]> wrote:
>
> [email protected] wrote:
>
> > From: Jason Xing <[email protected]>
>
> Hi Jason,
>
> Sorry, I missed this on the first time: Added intel-wired-lan,
> please include on any future submissions for Intel drivers.
> get-maintainers script might help here?
>

Hi, Jesse

In the first patch, I did send to intel-wired-lan but it told me that
it is open for the member only, so
I got rid of it in this patch v2.

> >
> > Fix this panic by adding more rules to calculate the value of @rss_size_max
> > which could be used in allocating the queues when bpf is loaded, which,
> > however, could cause the failure and then trigger the NULL pointer of
> > vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> > cpus are online and then allocates 256 queues on the machine with 32 cpus
> > online actually.
> >
> > Once the load of bpf begins, the log will go like this "failed to get
> > tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> > failed".
> >
> > Thus, I attach the key information of the crash-log here.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000000
> > RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> > Call Trace:
> > [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> > [2160294.717666] dev_xdp_install+0x4f/0x70
> > [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> > [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> > [2160294.718705] do_setlink+0xac7/0xe70
> > [2160294.719035] ? __nla_parse+0xed/0x120
> > [2160294.719365] rtnl_newlink+0x73b/0x860
> >
> > Fixes: 41c445ff0f48 ("i40e: main driver core")
> >
>
> This Fixes line should be connected to the Sign offs with
> no linefeeds between.
>
> > Signed-off-by: Jason Xing <[email protected]>
> > Signed-off-by: Shujin Li <[email protected]>
>
> Did Shujin contribute to this patch? Why are they signing off? If
> they developed this patch with you, it should say:
> Co-developed-by: Shujin ....
> Signed-off-by: Shujin ...
> Signed-off-by: Jason ...
>

Well, yes, I will add a Co-developed-by label later.

> Your signature should be last if you sent the patch. The sign-offs are
> like a chain of custody, please review
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by
>

Thanks so much for your detailed instructions. I'm about to correct
them all together and then resend the patch v3.

Jason

> Thanks,
> Jesse

2021-04-14 10:21:38

by Jason Xing

[permalink] [raw]
Subject: [PATCH net v3] i40e: fix the panic when running bpf in xdpdrv mode

From: Jason Xing <[email protected]>

Fix this panic by adding more rules to calculate the value of @rss_size_max
which could be used in allocating the queues when bpf is loaded, which,
however, could cause the failure and then trigger the NULL pointer of
vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
cpus are online and then allocates 256 queues on the machine with 32 cpus
online actually.

Once the load of bpf begins, the log will go like this "failed to get
tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
failed".

Thus, I attach the key information of the crash-log here.

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
Call Trace:
[2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
[2160294.717666] dev_xdp_install+0x4f/0x70
[2160294.718036] dev_change_xdp_fd+0x11f/0x230
[2160294.718380] ? dev_disable_lro+0xe0/0xe0
[2160294.718705] do_setlink+0xac7/0xe70
[2160294.719035] ? __nla_parse+0xed/0x120
[2160294.719365] rtnl_newlink+0x73b/0x860

Fixes: 41c445ff0f48 ("i40e: main driver core")
Co-developed-by: Shujin Li <[email protected]>
Signed-off-by: Shujin Li <[email protected]>
Signed-off-by: Jason Xing <[email protected]>
---
drivers/net/ethernet/intel/i40e/i40e_main.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 521ea9d..4e9a247 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11867,6 +11867,7 @@ static int i40e_sw_init(struct i40e_pf *pf)
{
int err = 0;
int size;
+ u16 pow;

/* Set default capability flags */
pf->flags = I40E_FLAG_RX_CSUM_ENABLED |
@@ -11885,6 +11886,11 @@ static int i40e_sw_init(struct i40e_pf *pf)
pf->rss_table_size = pf->hw.func_caps.rss_table_size;
pf->rss_size_max = min_t(int, pf->rss_size_max,
pf->hw.func_caps.num_tx_qp);
+
+ /* find the next higher power-of-2 of num cpus */
+ pow = roundup_pow_of_two(num_online_cpus());
+ pf->rss_size_max = min_t(int, pf->rss_size_max, pow);
+
if (pf->hw.func_caps.rss) {
pf->flags |= I40E_FLAG_RSS_ENABLED;
pf->alloc_rss_size = min_t(int, pf->rss_size_max,
--
1.8.3.1

2021-04-15 01:16:46

by Jason Xing

[permalink] [raw]
Subject: Re: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

On Wed, Apr 14, 2021 at 12:27 AM Jesse Brandeburg
<[email protected]> wrote:
>
> [email protected] wrote:
>
> > From: Jason Xing <[email protected]>
>
> Hi Jason,
>
> Sorry, I missed this on the first time: Added intel-wired-lan,
> please include on any future submissions for Intel drivers.
> get-maintainers script might help here?
>

Probably I got this wrong in the last email. Did you mean that I should add
intel-wired-lan in the title not the cc list? It seems I should put
this together on
the next submission like this:

[Intel-wired-lan] [PATCH net v4]

Am I missing something?

Thanks,
Jason

> >
> > Fix this panic by adding more rules to calculate the value of @rss_size_max
> > which could be used in allocating the queues when bpf is loaded, which,
> > however, could cause the failure and then trigger the NULL pointer of
> > vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> > cpus are online and then allocates 256 queues on the machine with 32 cpus
> > online actually.
> >
> > Once the load of bpf begins, the log will go like this "failed to get
> > tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> > failed".
> >
> > Thus, I attach the key information of the crash-log here.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000000
> > RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> > Call Trace:
> > [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> > [2160294.717666] dev_xdp_install+0x4f/0x70
> > [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> > [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> > [2160294.718705] do_setlink+0xac7/0xe70
> > [2160294.719035] ? __nla_parse+0xed/0x120
> > [2160294.719365] rtnl_newlink+0x73b/0x860
> >
> > Fixes: 41c445ff0f48 ("i40e: main driver core")
> >
>
> This Fixes line should be connected to the Sign offs with
> no linefeeds between.
>
> > Signed-off-by: Jason Xing <[email protected]>
> > Signed-off-by: Shujin Li <[email protected]>
>
> Did Shujin contribute to this patch? Why are they signing off? If
> they developed this patch with you, it should say:
> Co-developed-by: Shujin ....
> Signed-off-by: Shujin ...
> Signed-off-by: Jason ...
>
> Your signature should be last if you sent the patch. The sign-offs are
> like a chain of custody, please review
> https://www.kernel.org/doc/html/latest/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by
>
> Thanks,
> Jesse

2021-04-15 02:07:40

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH net v3] i40e: fix the panic when running bpf in xdpdrv mode

[email protected] wrote:

> From: Jason Xing <[email protected]>
>
> Fix this panic by adding more rules to calculate the value of @rss_size_max
> which could be used in allocating the queues when bpf is loaded, which,
> however, could cause the failure and then trigger the NULL pointer of
> vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> cpus are online and then allocates 256 queues on the machine with 32 cpus
> online actually.
>
> Once the load of bpf begins, the log will go like this "failed to get
> tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> failed".
>
> Thus, I attach the key information of the crash-log here.
>
> BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000000
> RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> Call Trace:
> [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> [2160294.717666] dev_xdp_install+0x4f/0x70
> [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> [2160294.718705] do_setlink+0xac7/0xe70
> [2160294.719035] ? __nla_parse+0xed/0x120
> [2160294.719365] rtnl_newlink+0x73b/0x860
>
> Fixes: 41c445ff0f48 ("i40e: main driver core")
> Co-developed-by: Shujin Li <[email protected]>
> Signed-off-by: Shujin Li <[email protected]>
> Signed-off-by: Jason Xing <[email protected]>

Reviewed-by: Jesse Brandeburg <[email protected]>

@Jakub/@DaveM - feel free to apply this directly.

2021-04-15 02:10:17

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

Jason Xing wrote:

> On Wed, Apr 14, 2021 at 12:27 AM Jesse Brandeburg
> <[email protected]> wrote:
> >
> > [email protected] wrote:
> >
> > > From: Jason Xing <[email protected]>
> >
> > Hi Jason,
> >
> > Sorry, I missed this on the first time: Added intel-wired-lan,
> > please include on any future submissions for Intel drivers.
> > get-maintainers script might help here?
> >
>
> Probably I got this wrong in the last email. Did you mean that I should add
> intel-wired-lan in the title not the cc list? It seems I should put
> this together on
> the next submission like this:
>
> [Intel-wired-lan] [PATCH net v4]

Your v3 submittal was correct. My intent was to make sure
intel-wired-lan was in CC:

If Kuba or Dave wants us to take the fix in via intel-wired-lan trees,
then we can do that, or they can apply it directly. I'll ack it on the
v3.

2021-04-15 05:18:21

by Jason Xing

[permalink] [raw]
Subject: Re: [PATCH net v2] i40e: fix the panic when running bpf in xdpdrv mode

On Thu, Apr 15, 2021 at 10:08 AM Jesse Brandeburg
<[email protected]> wrote:
>
> Jason Xing wrote:
>
> > On Wed, Apr 14, 2021 at 12:27 AM Jesse Brandeburg
> > <[email protected]> wrote:
> > >
> > > [email protected] wrote:
> > >
> > > > From: Jason Xing <[email protected]>
> > >
> > > Hi Jason,
> > >
> > > Sorry, I missed this on the first time: Added intel-wired-lan,
> > > please include on any future submissions for Intel drivers.
> > > get-maintainers script might help here?
> > >
> >
> > Probably I got this wrong in the last email. Did you mean that I should add
> > intel-wired-lan in the title not the cc list? It seems I should put
> > this together on
> > the next submission like this:
> >
> > [Intel-wired-lan] [PATCH net v4]
>
> Your v3 submittal was correct. My intent was to make sure
> intel-wired-lan was in CC:
>

Well, I get to know more about the whole thing.

> If Kuba or Dave wants us to take the fix in via intel-wired-lan trees,
> then we can do that, or they can apply it directly. I'll ack it on the
> v3.

Thanks, Jesse:)

Jason

>

2021-04-15 10:32:45

by Jesper Dangaard Brouer

[permalink] [raw]
Subject: Re: [PATCH net v3] i40e: fix the panic when running bpf in xdpdrv mode

On Wed, 14 Apr 2021 19:06:52 -0700
Jesse Brandeburg <[email protected]> wrote:

> [email protected] wrote:
>
> > From: Jason Xing <[email protected]>
> >
> > Fix this panic by adding more rules to calculate the value of @rss_size_max
> > which could be used in allocating the queues when bpf is loaded, which,
> > however, could cause the failure and then trigger the NULL pointer of
> > vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> > cpus are online and then allocates 256 queues on the machine with 32 cpus
> > online actually.
> >
> > Once the load of bpf begins, the log will go like this "failed to get
> > tracking for 256 queues for VSI 0 err -12" and this "setup of MAIN VSI
> > failed".
> >
> > Thus, I attach the key information of the crash-log here.
> >
> > BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000000
> > RIP: 0010:i40e_xdp+0xdd/0x1b0 [i40e]
> > Call Trace:
> > [2160294.717292] ? i40e_reconfig_rss_queues+0x170/0x170 [i40e]
> > [2160294.717666] dev_xdp_install+0x4f/0x70
> > [2160294.718036] dev_change_xdp_fd+0x11f/0x230
> > [2160294.718380] ? dev_disable_lro+0xe0/0xe0
> > [2160294.718705] do_setlink+0xac7/0xe70
> > [2160294.719035] ? __nla_parse+0xed/0x120
> > [2160294.719365] rtnl_newlink+0x73b/0x860
> >
> > Fixes: 41c445ff0f48 ("i40e: main driver core")
> > Co-developed-by: Shujin Li <[email protected]>
> > Signed-off-by: Shujin Li <[email protected]>
> > Signed-off-by: Jason Xing <[email protected]>
>
> Reviewed-by: Jesse Brandeburg <[email protected]>
>
> @Jakub/@DaveM - feel free to apply this directly.

Acked-by: Jesper Dangaard Brouer <[email protected]>

The crash/bug happens in this code:

static int i40e_xdp_setup(struct i40e_vsi *vsi, struct bpf_prog *prog,
struct netlink_ext_ack *extack)
{
[...]
for (i = 0; i < vsi->num_queue_pairs; i++)
WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog);


And this is a side effect of i40e_setup_pf_switch() failing with "setup
of MAIN VSI failed".

LGTM
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

2021-04-15 21:41:35

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net v3] i40e: fix the panic when running bpf in xdpdrv mode

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Wed, 14 Apr 2021 10:34:28 +0800 you wrote:
> From: Jason Xing <[email protected]>
>
> Fix this panic by adding more rules to calculate the value of @rss_size_max
> which could be used in allocating the queues when bpf is loaded, which,
> however, could cause the failure and then trigger the NULL pointer of
> vsi->rx_rings. Prio to this fix, the machine doesn't care about how many
> cpus are online and then allocates 256 queues on the machine with 32 cpus
> online actually.
>
> [...]

Here is the summary with links:
- [net,v3] i40e: fix the panic when running bpf in xdpdrv mode
https://git.kernel.org/netdev/net/c/4e39a072a6a0

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html