2012-11-26 08:04:53

by Jason Wang

[permalink] [raw]
Subject: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

Some deivces do not free the old tx skbs immediately after it has been sent
(usually in tx interrupt). One such example is virtio-net which optimizes for
virt and only free the possible old tx skbs during the next packet sending. This
would lead the pktgen to wait forever in the refcount of the skb if no other
pakcet will be sent afterwards.

Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
notify the pktgen that the device does not free skb immediately after it has
been sent and let it not to wait for the refcount to be one.

Signed-off-by: Jason Wang <[email protected]>
---

Another choice is to introduce an new .ndo which could be called by pktgen
during the waiting to free the old tx skbs.

drivers/net/virtio_net.c | 3 ++-
include/uapi/linux/if.h | 2 ++
net/core/pktgen.c | 3 ++-
3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 26c502e..8262232 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1058,7 +1058,8 @@ static int virtnet_probe(struct virtio_device *vdev)
return -ENOMEM;

/* Set up network device as normal. */
- dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE;
+ dev->priv_flags |= IFF_UNICAST_FLT | IFF_LIVE_ADDR_CHANGE |
+ IFF_TX_SKB_FREE_DELAY;
dev->netdev_ops = &virtnet_netdev;
dev->features = NETIF_F_HIGHDMA;

diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
index 1ec407b..a0d2168 100644
--- a/include/uapi/linux/if.h
+++ b/include/uapi/linux/if.h
@@ -83,6 +83,8 @@
#define IFF_SUPP_NOFCS 0x80000 /* device supports sending custom FCS */
#define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address
* change when it's running */
+#define IFF_TX_SKB_FREE_DELAY 0x200000 /* device does free the tx skb
+ * immediately when it is sent */


#define IF_GET_IFACE 0x0001 /* for querying only */
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index b29dacf..85d4e53 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3269,7 +3269,8 @@ unlock:

/* If pkt_dev->count is zero, then run forever */
if ((pkt_dev->count != 0) && (pkt_dev->sofar >= pkt_dev->count)) {
- pktgen_wait_for_skb(pkt_dev);
+ if (!(pkt_dev->odev->priv_flags & IFF_TX_SKB_FREE_DELAY))
+ pktgen_wait_for_skb(pkt_dev);

/* Done with this */
pktgen_stop_device(pkt_dev);
--
1.7.1


2012-11-26 17:38:43

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Mon, 26 Nov 2012 15:56:52 +0800
Jason Wang <[email protected]> wrote:

> Some deivces do not free the old tx skbs immediately after it has been sent
> (usually in tx interrupt). One such example is virtio-net which optimizes for
> virt and only free the possible old tx skbs during the next packet sending. This
> would lead the pktgen to wait forever in the refcount of the skb if no other
> pakcet will be sent afterwards.
>
> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
> notify the pktgen that the device does not free skb immediately after it has
> been sent and let it not to wait for the refcount to be one.
>
> Signed-off-by: Jason Wang <[email protected]>

Another alternative would be using skb_orphan() and skb->destructor.
There are other cases where skb's are not freed right away.

2012-11-27 06:45:33

by Jason Wang

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> On Mon, 26 Nov 2012 15:56:52 +0800
> Jason Wang <[email protected]> wrote:
>
>> Some deivces do not free the old tx skbs immediately after it has been sent
>> (usually in tx interrupt). One such example is virtio-net which optimizes for
>> virt and only free the possible old tx skbs during the next packet sending. This
>> would lead the pktgen to wait forever in the refcount of the skb if no other
>> pakcet will be sent afterwards.
>>
>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
>> notify the pktgen that the device does not free skb immediately after it has
>> been sent and let it not to wait for the refcount to be one.
>>
>> Signed-off-by: Jason Wang <[email protected]>
> Another alternative would be using skb_orphan() and skb->destructor.
> There are other cases where skb's are not freed right away.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Stephen:

Do you mean registering a skb->destructor for pktgen then set and check
bits in skb->tx_flag?

2012-11-27 16:50:31

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Tue, 27 Nov 2012 14:45:13 +0800
Jason Wang <[email protected]> wrote:

> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> > On Mon, 26 Nov 2012 15:56:52 +0800
> > Jason Wang <[email protected]> wrote:
> >
> >> Some deivces do not free the old tx skbs immediately after it has been sent
> >> (usually in tx interrupt). One such example is virtio-net which optimizes for
> >> virt and only free the possible old tx skbs during the next packet sending. This
> >> would lead the pktgen to wait forever in the refcount of the skb if no other
> >> pakcet will be sent afterwards.
> >>
> >> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
> >> notify the pktgen that the device does not free skb immediately after it has
> >> been sent and let it not to wait for the refcount to be one.
> >>
> >> Signed-off-by: Jason Wang <[email protected]>
> > Another alternative would be using skb_orphan() and skb->destructor.
> > There are other cases where skb's are not freed right away.
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> Hi Stephen:
>
> Do you mean registering a skb->destructor for pktgen then set and check
> bits in skb->tx_flag?

Yes. Register a destructor that does something like update a counter (number of packets pending),
then just spin while number of packets pending is over threshold.

2012-11-28 06:49:13

by Jason Wang

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On 11/28/2012 12:49 AM, Stephen Hemminger wrote:
> On Tue, 27 Nov 2012 14:45:13 +0800
> Jason Wang <[email protected]> wrote:
>
>> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
>>> On Mon, 26 Nov 2012 15:56:52 +0800
>>> Jason Wang <[email protected]> wrote:
>>>
>>>> Some deivces do not free the old tx skbs immediately after it has been sent
>>>> (usually in tx interrupt). One such example is virtio-net which optimizes for
>>>> virt and only free the possible old tx skbs during the next packet sending. This
>>>> would lead the pktgen to wait forever in the refcount of the skb if no other
>>>> pakcet will be sent afterwards.
>>>>
>>>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
>>>> notify the pktgen that the device does not free skb immediately after it has
>>>> been sent and let it not to wait for the refcount to be one.
>>>>
>>>> Signed-off-by: Jason Wang <[email protected]>
>>> Another alternative would be using skb_orphan() and skb->destructor.
>>> There are other cases where skb's are not freed right away.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Hi Stephen:
>>
>> Do you mean registering a skb->destructor for pktgen then set and check
>> bits in skb->tx_flag?
> Yes. Register a destructor that does something like update a counter (number of packets pending),
> then just spin while number of packets pending is over threshold.
> --

Not sure this is the best method, since pktgen was used to test the tx
process of the device driver and NIC. If we use skb_orhpan(), we would
miss the test of tx completion part.

2012-11-28 16:54:21

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Wed, 28 Nov 2012 14:48:52 +0800
Jason Wang <[email protected]> wrote:

> On 11/28/2012 12:49 AM, Stephen Hemminger wrote:
> > On Tue, 27 Nov 2012 14:45:13 +0800
> > Jason Wang <[email protected]> wrote:
> >
> >> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> >>> On Mon, 26 Nov 2012 15:56:52 +0800
> >>> Jason Wang <[email protected]> wrote:
> >>>
> >>>> Some deivces do not free the old tx skbs immediately after it has been sent
> >>>> (usually in tx interrupt). One such example is virtio-net which optimizes for
> >>>> virt and only free the possible old tx skbs during the next packet sending. This
> >>>> would lead the pktgen to wait forever in the refcount of the skb if no other
> >>>> pakcet will be sent afterwards.
> >>>>
> >>>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY which could
> >>>> notify the pktgen that the device does not free skb immediately after it has
> >>>> been sent and let it not to wait for the refcount to be one.
> >>>>
> >>>> Signed-off-by: Jason Wang <[email protected]>
> >>> Another alternative would be using skb_orphan() and skb->destructor.
> >>> There are other cases where skb's are not freed right away.
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>> the body of a message to [email protected]
> >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >> Hi Stephen:
> >>
> >> Do you mean registering a skb->destructor for pktgen then set and check
> >> bits in skb->tx_flag?
> > Yes. Register a destructor that does something like update a counter (number of packets pending),
> > then just spin while number of packets pending is over threshold.
> > --
>
> Not sure this is the best method, since pktgen was used to test the tx
> process of the device driver and NIC. If we use skb_orhpan(), we would
> miss the test of tx completion part.

There are other places that delay freeing and your solution would mean finding
and fixing all those. Code that does that already has to use skb_orphan() to
work, and I was looking for a way that could use that. Introducing another flag
value seems like a long term burden.

Alternatively, virtio could do cleanup more aggressively. Maybe in response to
ring getting half full, or add a cleanup timer or something to avoid the problem.


2012-11-29 10:13:28

by Jason Wang

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Wednesday, November 28, 2012 08:53:05 AM Stephen Hemminger wrote:
> On Wed, 28 Nov 2012 14:48:52 +0800
>
> Jason Wang <[email protected]> wrote:
> > On 11/28/2012 12:49 AM, Stephen Hemminger wrote:
> > > On Tue, 27 Nov 2012 14:45:13 +0800
> > >
> > > Jason Wang <[email protected]> wrote:
> > >> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> > >>> On Mon, 26 Nov 2012 15:56:52 +0800
> > >>>
> > >>> Jason Wang <[email protected]> wrote:
> > >>>> Some deivces do not free the old tx skbs immediately after it has
> > >>>> been sent
> > >>>> (usually in tx interrupt). One such example is virtio-net which
> > >>>> optimizes for virt and only free the possible old tx skbs during the
> > >>>> next packet sending. This would lead the pktgen to wait forever in
> > >>>> the refcount of the skb if no other pakcet will be sent afterwards.
> > >>>>
> > >>>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY
> > >>>> which could notify the pktgen that the device does not free skb
> > >>>> immediately after it has been sent and let it not to wait for the
> > >>>> refcount to be one.
> > >>>>
> > >>>> Signed-off-by: Jason Wang <[email protected]>
> > >>>
> > >>> Another alternative would be using skb_orphan() and skb->destructor.
> > >>> There are other cases where skb's are not freed right away.
> > >>> --
> > >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> > >>> the body of a message to [email protected]
> > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> > >>
> > >> Hi Stephen:
> > >>
> > >> Do you mean registering a skb->destructor for pktgen then set and check
> > >> bits in skb->tx_flag?
> > >
> > > Yes. Register a destructor that does something like update a counter
> > > (number of packets pending), then just spin while number of packets
> > > pending is over threshold.
> > > --
> >
> > Not sure this is the best method, since pktgen was used to test the tx
> > process of the device driver and NIC. If we use skb_orhpan(), we would
> > miss the test of tx completion part.
>
> There are other places that delay freeing and your solution would mean
> finding and fixing all those. Code that does that already has to use
> skb_orphan() to work, and I was looking for a way that could use that.
> Introducing another flag value seems like a long term burden.
>

Get the point, will draft another version.
> Alternatively, virtio could do cleanup more aggressively. Maybe in response
> to ring getting half full, or add a cleanup timer or something to avoid the
> problem.

May worth to try. Another method is that virtio has a feature to notify guest
when tx ring is empty, we could free the old tx skbs there. But it may brings
extra overhead. If we could let virtio_net free the old tx skb timely, it
would be easier to bring BQL support to virtio_net also.

Thanks
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2012-11-29 11:27:28

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Thu, Nov 29, 2012 at 06:13:13PM +0800, Jason Wang wrote:
> On Wednesday, November 28, 2012 08:53:05 AM Stephen Hemminger wrote:
> > On Wed, 28 Nov 2012 14:48:52 +0800
> >
> > Jason Wang <[email protected]> wrote:
> > > On 11/28/2012 12:49 AM, Stephen Hemminger wrote:
> > > > On Tue, 27 Nov 2012 14:45:13 +0800
> > > >
> > > > Jason Wang <[email protected]> wrote:
> > > >> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> > > >>> On Mon, 26 Nov 2012 15:56:52 +0800
> > > >>>
> > > >>> Jason Wang <[email protected]> wrote:
> > > >>>> Some deivces do not free the old tx skbs immediately after it has
> > > >>>> been sent
> > > >>>> (usually in tx interrupt). One such example is virtio-net which
> > > >>>> optimizes for virt and only free the possible old tx skbs during the
> > > >>>> next packet sending. This would lead the pktgen to wait forever in
> > > >>>> the refcount of the skb if no other pakcet will be sent afterwards.
> > > >>>>
> > > >>>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY
> > > >>>> which could notify the pktgen that the device does not free skb
> > > >>>> immediately after it has been sent and let it not to wait for the
> > > >>>> refcount to be one.
> > > >>>>
> > > >>>> Signed-off-by: Jason Wang <[email protected]>
> > > >>>
> > > >>> Another alternative would be using skb_orphan() and skb->destructor.
> > > >>> There are other cases where skb's are not freed right away.
> > > >>> --
> > > >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> > > >>> the body of a message to [email protected]
> > > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > >>
> > > >> Hi Stephen:
> > > >>
> > > >> Do you mean registering a skb->destructor for pktgen then set and check
> > > >> bits in skb->tx_flag?
> > > >
> > > > Yes. Register a destructor that does something like update a counter
> > > > (number of packets pending), then just spin while number of packets
> > > > pending is over threshold.
> > > > --
> > >
> > > Not sure this is the best method, since pktgen was used to test the tx
> > > process of the device driver and NIC. If we use skb_orhpan(), we would
> > > miss the test of tx completion part.
> >
> > There are other places that delay freeing and your solution would mean
> > finding and fixing all those. Code that does that already has to use
> > skb_orphan() to work, and I was looking for a way that could use that.
> > Introducing another flag value seems like a long term burden.
> >
>
> Get the point, will draft another version.
>
> > Alternatively, virtio could do cleanup more aggressively. Maybe in response
> > to ring getting half full, or add a cleanup timer or something to avoid the
> > problem.

Timer would prevent complete deadlock but it is very expensive
in the virt scenario.
pulling at ring half full would only help if ring gets half full :)
which it does not have to.

>
> May worth to try. Another method is that virtio has a feature to notify guest
> when tx ring is empty, we could free the old tx skbs there.
> But it may brings
> extra overhead. If we could let virtio_net free the old tx skb timely, it
> would be easier to bring BQL support to virtio_net also.
>
> Thanks

We used to use notify on empty - it's still very slow.

> >
> >
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html

2012-11-29 17:22:21

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [net-next RFC] pktgen: don't wait for the device who doesn't free skb immediately after sent

On Thu, 29 Nov 2012 13:30:13 +0200
"Michael S. Tsirkin" <[email protected]> wrote:

> On Thu, Nov 29, 2012 at 06:13:13PM +0800, Jason Wang wrote:
> > On Wednesday, November 28, 2012 08:53:05 AM Stephen Hemminger wrote:
> > > On Wed, 28 Nov 2012 14:48:52 +0800
> > >
> > > Jason Wang <[email protected]> wrote:
> > > > On 11/28/2012 12:49 AM, Stephen Hemminger wrote:
> > > > > On Tue, 27 Nov 2012 14:45:13 +0800
> > > > >
> > > > > Jason Wang <[email protected]> wrote:
> > > > >> On 11/27/2012 01:37 AM, Stephen Hemminger wrote:
> > > > >>> On Mon, 26 Nov 2012 15:56:52 +0800
> > > > >>>
> > > > >>> Jason Wang <[email protected]> wrote:
> > > > >>>> Some deivces do not free the old tx skbs immediately after it has
> > > > >>>> been sent
> > > > >>>> (usually in tx interrupt). One such example is virtio-net which
> > > > >>>> optimizes for virt and only free the possible old tx skbs during the
> > > > >>>> next packet sending. This would lead the pktgen to wait forever in
> > > > >>>> the refcount of the skb if no other pakcet will be sent afterwards.
> > > > >>>>
> > > > >>>> Solving this issue by introducing a new flag IFF_TX_SKB_FREE_DELAY
> > > > >>>> which could notify the pktgen that the device does not free skb
> > > > >>>> immediately after it has been sent and let it not to wait for the
> > > > >>>> refcount to be one.
> > > > >>>>
> > > > >>>> Signed-off-by: Jason Wang <[email protected]>
> > > > >>>
> > > > >>> Another alternative would be using skb_orphan() and skb->destructor.
> > > > >>> There are other cases where skb's are not freed right away.
> > > > >>> --
> > > > >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> > > > >>> the body of a message to [email protected]
> > > > >>> More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > >>
> > > > >> Hi Stephen:
> > > > >>
> > > > >> Do you mean registering a skb->destructor for pktgen then set and check
> > > > >> bits in skb->tx_flag?
> > > > >
> > > > > Yes. Register a destructor that does something like update a counter
> > > > > (number of packets pending), then just spin while number of packets
> > > > > pending is over threshold.
> > > > > --
> > > >
> > > > Not sure this is the best method, since pktgen was used to test the tx
> > > > process of the device driver and NIC. If we use skb_orhpan(), we would
> > > > miss the test of tx completion part.
> > >
> > > There are other places that delay freeing and your solution would mean
> > > finding and fixing all those. Code that does that already has to use
> > > skb_orphan() to work, and I was looking for a way that could use that.
> > > Introducing another flag value seems like a long term burden.
> > >
> >
> > Get the point, will draft another version.
> >
> > > Alternatively, virtio could do cleanup more aggressively. Maybe in response
> > > to ring getting half full, or add a cleanup timer or something to avoid the
> > > problem.
>
> Timer would prevent complete deadlock but it is very expensive
> in the virt scenario.
> pulling at ring half full would only help if ring gets half full :)
> which it does not have to.

A timer that fires once (when idle) to mop up the last packet in a stream
would be not very expensive. Most of the time, it would just be continually
rescheduled.