2023-11-16 11:42:18

by 黄杰

Subject: [PATCH net] veth: fix ethtool statistical errors

If peer->real_num_rx_queues > 1, the ethtool -S command for a
veth network device displays incorrect statistics.
The value of tx_idx is reset on each iteration, so even if
peer->real_num_rx_queues is greater than 1, the value of tx_idx
will remain constant. This results in incorrect statistical values.
To fix this issue, initialize tx_idx from pp_idx instead of idx.

Fixes: 5fe6e56776ba ("veth: rely on peer veth_rq for ndo_xdp_xmit accounting")
Signed-off-by: Albert Huang <[email protected]>
---
drivers/net/veth.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0deefd1573cf..3a8e3fc5eeb5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -225,7 +225,7 @@ static void veth_get_ethtool_stats(struct net_device *dev,
 	for (i = 0; i < peer->real_num_rx_queues; i++) {
 		const struct veth_rq_stats *rq_stats = &rcv_priv->rq[i].stats;
 		const void *base = (void *)&rq_stats->vs;
-		unsigned int start, tx_idx = idx;
+		unsigned int start, tx_idx = pp_idx;
 		size_t offset;

 		tx_idx += (i % dev->real_num_tx_queues) * VETH_TQ_STATS_LEN;
--
2.20.1


2023-11-17 09:26:21

by Lorenzo Bianconi

Subject: Re: [PATCH net] veth: fix ethtool statistical errors

> If peer->real_num_rx_queues > 1, the ethtool -S command for a
> veth network device displays incorrect statistics.
> The value of tx_idx is reset on each iteration, so even if
> peer->real_num_rx_queues is greater than 1, the value of tx_idx
> will remain constant. This results in incorrect statistical values.
> To fix this issue, initialize tx_idx from pp_idx instead of idx.
>
> Fixes: 5fe6e56776ba ("veth: rely on peer veth_rq for ndo_xdp_xmit accounting")
> Signed-off-by: Albert Huang <[email protected]>
> ---
> drivers/net/veth.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 0deefd1573cf..3a8e3fc5eeb5 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -225,7 +225,7 @@ static void veth_get_ethtool_stats(struct net_device *dev,
>  	for (i = 0; i < peer->real_num_rx_queues; i++) {
>  		const struct veth_rq_stats *rq_stats = &rcv_priv->rq[i].stats;
>  		const void *base = (void *)&rq_stats->vs;
> -		unsigned int start, tx_idx = idx;
> +		unsigned int start, tx_idx = pp_idx;
>  		size_t offset;
>
>  		tx_idx += (i % dev->real_num_tx_queues) * VETH_TQ_STATS_LEN;
> --
> 2.20.1
>

Hi Albert,

Can you please provide more details about the issue you are facing?
In particular, what is the number of configured tx and rx queues for both
peers?
tx_idx is the index of the current (local) tx queue and it must restart from
idx in each iteration otherwise we will have an issue when
peer->real_num_rx_queues is greater than dev->real_num_tx_queues.
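
To make the point above concrete, here is a minimal standalone sketch (not
driver code). It compares the slot reached when tx_idx restarts from idx on
every iteration with the slot reached when it starts from the running pp_idx,
as in the proposed change. TQ_STATS_LEN = 2 and all function names are
assumptions made for illustration only.

```c
/* Assumed number of stats per tx queue, for illustration only. */
#define TQ_STATS_LEN 2u

/* Slot used for peer rx queue i when tx_idx restarts from idx each time. */
unsigned int slot_restart(unsigned int idx, unsigned int i,
			  unsigned int num_tx)
{
	return idx + (i % num_tx) * TQ_STATS_LEN;
}

/* Slot used for peer rx queue i when tx_idx starts from the running pp_idx. */
unsigned int slot_carried(unsigned int idx, unsigned int i,
			  unsigned int num_tx)
{
	unsigned int pp_idx = idx, tx_idx = idx, q;

	for (q = 0; q <= i; q++) {
		tx_idx = pp_idx + (q % num_tx) * TQ_STATS_LEN;
		pp_idx = tx_idx + TQ_STATS_LEN;
	}
	return tx_idx;
}

/* First offset past the local tx stats block. */
unsigned int tx_block_end(unsigned int idx, unsigned int num_tx)
{
	return idx + num_tx * TQ_STATS_LEN;
}
```

With idx = 1, two local tx queues and four peer rx queues, slot_restart()
yields 1, 3, 1, 3, all inside the tx block that ends at offset 5, while
slot_carried() yields 1, 5, 7, 11, so every iteration after the first lands
outside the block.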

Regards,
Lorenzo



2023-11-20 09:46:37

by 黄杰

Subject: Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

Lorenzo Bianconi <[email protected]> wrote on Fri, Nov 17, 2023 at 17:26:
> Hi Albert,
>
> Can you please provide more details about the issue you are facing?
> In particular, what is the number of configured tx and rx queues for both
> peers?

Hi Lorenzo,
I found this while trying to add more output to ethtool for veth:
the information shown was incorrect, which is why I looked into
this code.

> tx_idx is the index of the current (local) tx queue and it must restart from
> idx in each iteration otherwise we will have an issue when
> peer->real_num_rx_queues is greater than dev->real_num_tx_queues.
>
OK. I don't know if this is a known issue.

BR
Albert


> Regards,
> Lorenzo

2023-11-20 09:52:16

by Lorenzo Bianconi

Subject: Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

> Hi Lorenzo,
> I found this while trying to add more output to ethtool for veth:
> the information shown was incorrect, which is why I looked into
> this code.

ack. Could you please share the veth pair tx/rx queue configuration?

Regards,
Lorenzo




2023-11-20 10:03:14

by 黄杰

Subject: Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

Lorenzo Bianconi <[email protected]> wrote on Mon, Nov 20, 2023 at 17:52:
> ack. Could you please share the veth pair tx/rx queue configuration?
>

dev:  tx 4, rx 4
peer: tx 1, rx 1

Could the following code still be a problem? It looks like pp_idx is not updated correctly:
page_pool_stats:
veth_get_page_pool_stats(dev, &data[pp_idx]);

BR
Albert


2023-11-20 10:59:36

by Lorenzo Bianconi

Subject: Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

> dev:  tx 4, rx 4
> peer: tx 1, rx 1
>
> Could the following code still be a problem? It looks like pp_idx is not updated correctly:
> page_pool_stats:
> veth_get_page_pool_stats(dev, &data[pp_idx]);

Thanks for pointing this out. This part is a bit tricky, but I think I can see the
issue now. Since there is just one peer rx queue, when the ndo_xdp_xmit callback
runs on dev, all dev xmit queues are squashed onto the single peer rx queue
(where we do the accounting) [0].
The issue is that ethtool displays all dev xmit queues, so we need to set pp_idx
properly in veth_get_ethtool_stats().
Can you please take a look at the patch below?

Regards,
Lorenzo

[0] https://github.com/LorenzoBianconi/net-next/blob/master/drivers/net/veth.c#L417

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 9980517ed8b0..8607eb8cf458 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -236,8 +236,8 @@ static void veth_get_ethtool_stats(struct net_device *dev,
 				data[tx_idx + j] += *(u64 *)(base + offset);
 			}
 		} while (u64_stats_fetch_retry(&rq_stats->syncp, start));
-		pp_idx = tx_idx + VETH_TQ_STATS_LEN;
 	}
+	pp_idx = idx + dev->real_num_tx_queues * VETH_TQ_STATS_LEN;

 page_pool_stats:
 	veth_get_page_pool_stats(dev, &data[pp_idx]);
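
As a rough standalone illustration of why the position of the pp_idx
assignment matters (a sketch, not driver code): the first function mimics the
loop leaving pp_idx one block past the last tx slot it touched, and the second
computes it from the number of local tx queues as in the patch above.
TQ_STATS_LEN = 2, the function names, and the idx value used below are
assumptions made for illustration only.

```c
/* Assumed number of stats per tx queue, for illustration only. */
#define TQ_STATS_LEN 2u

/* pp_idx as left by the loop: one block past the last tx slot touched. */
unsigned int pp_idx_from_loop(unsigned int idx, unsigned int num_tx,
			      unsigned int num_peer_rx)
{
	unsigned int pp_idx = idx, i;

	for (i = 0; i < num_peer_rx; i++) {
		unsigned int tx_idx = idx + (i % num_tx) * TQ_STATS_LEN;

		pp_idx = tx_idx + TQ_STATS_LEN;
	}
	return pp_idx;
}

/* pp_idx computed once after the loop from the local tx queue count. */
unsigned int pp_idx_from_tx_count(unsigned int idx, unsigned int num_tx)
{
	return idx + num_tx * TQ_STATS_LEN;
}
```

With a hypothetical idx = 10, 4 dev tx queues and 1 peer rx queue, the loop
leaves pp_idx at 12 while the tx block actually ends at 18, so the page pool
stats would overlap the slots of tx queues 1 to 3. The two functions agree
when the peer has as many rx queues as dev has tx queues.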




2023-11-20 11:02:35

by 黄杰

Subject: Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

黄杰 <[email protected]> wrote on Mon, Nov 20, 2023 at 18:02:
> dev:  tx 4, rx 4
> peer: tx 1, rx 1
>
> Could the following code still be a problem? It looks like pp_idx is not updated correctly:
> page_pool_stats:
> veth_get_page_pool_stats(dev, &data[pp_idx]);

I tested this locally and there is no problem with this code. I had not
fully understood this piece of code before.
Thanks.

BR
Albert
