2015-02-05 05:59:10

by Alexander Drozdov

[permalink] [raw]
Subject: [PATCH] af_packet: don't pass empty blocks for PACKET_V3

Don't close an empty block on timeout. Its meaningless to
pass it to the user. Moreover, passing empty blocks wastes
CPU & buffer space increasing probability of packets
dropping on small timeouts.

Side effect of this patch is indefinite user-space wait
in poll on idle links. But, I believe its better to set
timeout for poll(2) when needed than to get empty blocks
every millisecond when not needed.

Signed-off-by: Alexander Drozdov <[email protected]>
---
net/packet/af_packet.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9cfe2e1..9a2f70a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -698,6 +698,10 @@ static void prb_retire_rx_blk_timer_expired(unsigned long data)

if (pkc->last_kactive_blk_num == pkc->kactive_blk_num) {
if (!frozen) {
+ if (!BLOCK_NUM_PKTS(pbd)) {
+ /* An empty block. Just refresh the timer. */
+ goto refresh_timer;
+ }
prb_retire_current_block(pkc, po, TP_STATUS_BLK_TMO);
if (!prb_dispatch_next_block(pkc, po))
goto refresh_timer;
@@ -798,7 +802,11 @@ static void prb_close_block(struct tpacket_kbdq_core *pkc1,
h1->ts_last_pkt.ts_sec = last_pkt->tp_sec;
h1->ts_last_pkt.ts_nsec = last_pkt->tp_nsec;
} else {
- /* Ok, we tmo'd - so get the current time */
+ /* Ok, we tmo'd - so get the current time.
+ *
+ * It shouldn't really happen as we don't close empty
+ * blocks. See prb_retire_rx_blk_timer_expired().
+ */
struct timespec ts;
getnstimeofday(&ts);
h1->ts_last_pkt.ts_sec = ts.tv_sec;
--
1.9.1


2015-02-05 20:02:12

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH] af_packet: don't pass empty blocks for PACKET_V3

On Wed, Feb 4, 2015 at 9:58 PM, Alexander Drozdov <[email protected]> wrote:
> Don't close an empty block on timeout. Its meaningless to
> pass it to the user. Moreover, passing empty blocks wastes
> CPU & buffer space increasing probability of packets
> dropping on small timeouts.
>
> Side effect of this patch is indefinite user-space wait
> in poll on idle links. But, I believe its better to set
> timeout for poll(2) when needed than to get empty blocks
> every millisecond when not needed.

This change would break existing applications that have come
to depend on the periodic signal.

I don't disagree with the argument that the data ready signal
should be sent only when a block is full or a timer expires and
at least some data is waiting, but that is moot at this point.

>
> Signed-off-by: Alexander Drozdov <[email protected]>
> ---
> net/packet/af_packet.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 9cfe2e1..9a2f70a 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -698,6 +698,10 @@ static void prb_retire_rx_blk_timer_expired(unsigned long data)
>
> if (pkc->last_kactive_blk_num == pkc->kactive_blk_num) {
> if (!frozen) {
> + if (!BLOCK_NUM_PKTS(pbd)) {
> + /* An empty block. Just refresh the timer. */
> + goto refresh_timer;
> + }
> prb_retire_current_block(pkc, po, TP_STATUS_BLK_TMO);
> if (!prb_dispatch_next_block(pkc, po))
> goto refresh_timer;
> @@ -798,7 +802,11 @@ static void prb_close_block(struct tpacket_kbdq_core *pkc1,
> h1->ts_last_pkt.ts_sec = last_pkt->tp_sec;
> h1->ts_last_pkt.ts_nsec = last_pkt->tp_nsec;
> } else {
> - /* Ok, we tmo'd - so get the current time */
> + /* Ok, we tmo'd - so get the current time.
> + *
> + * It shouldn't really happen as we don't close empty
> + * blocks. See prb_retire_rx_blk_timer_expired().
> + */
> struct timespec ts;
> getnstimeofday(&ts);
> h1->ts_last_pkt.ts_sec = ts.tv_sec;
> --
> 1.9.1
>

2015-02-05 21:16:38

by Guy Harris

[permalink] [raw]
Subject: Re: [PATCH] af_packet: don't pass empty blocks for PACKET_V3


On Feb 5, 2015, at 12:01 PM, Willem de Bruijn <[email protected]> wrote:

> On Wed, Feb 4, 2015 at 9:58 PM, Alexander Drozdov <[email protected]> wrote:
>> Don't close an empty block on timeout. Its meaningless to
>> pass it to the user. Moreover, passing empty blocks wastes
>> CPU & buffer space increasing probability of packets
>> dropping on small timeouts.
>>
>> Side effect of this patch is indefinite user-space wait
>> in poll on idle links. But, I believe its better to set
>> timeout for poll(2) when needed than to get empty blocks
>> every millisecond when not needed.
>
> This change would break existing applications that have come
> to depend on the periodic signal.
>
> I don't disagree with the argument that the data ready signal
> should be sent only when a block is full or a timer expires and
> at least some data is waiting, but that is moot at this point.

For what it's worth, the BPF packet capture mechanism (which really needs a new name, to distinguish itself from the BPF packet filter language and its implementation(s), but I digress) has the same issue - when the timer expires, a wakeup is delivered even if there are no packets to read.

*However*, if there are no packets available, the buffers aren't rotated, so the empty buffer is left around to be filled up with packets, rather than being made the hold buffer.

Given that before the previous TPACKET_V3 change, wakeups were delivered when packets arrived rather than when a block was closed, presumably code using TPACKET_V3 was capable of dealing with wakeups being delivered when no new blocks had been made available to userland; could TPACKET_V3 work a bit more like BPF and deliver a wakeup when the timer expires *without* closing the empty block?-

2015-02-06 04:49:43

by Alexander Drozdov

[permalink] [raw]
Subject: Re: [PATCH] af_packet: don't pass empty blocks for PACKET_V3

On 06.02.2015 00:16:30 +0300 Guy Harris <[email protected]> wrote:
> On Feb 5, 2015, at 12:01 PM, Willem de Bruijn <[email protected]> wrote:
>
>> On Wed, Feb 4, 2015 at 9:58 PM, Alexander Drozdov <[email protected]> wrote:
>>> Don't close an empty block on timeout. Its meaningless to
>>> pass it to the user. Moreover, passing empty blocks wastes
>>> CPU & buffer space increasing probability of packets
>>> dropping on small timeouts.
>>>
>>> Side effect of this patch is indefinite user-space wait
>>> in poll on idle links. But, I believe its better to set
>>> timeout for poll(2) when needed than to get empty blocks
>>> every millisecond when not needed.
>> This change would break existing applications that have come
>> to depend on the periodic signal.
>>
>> I don't disagree with the argument that the data ready signal
>> should be sent only when a block is full or a timer expires and
>> at least some data is waiting, but that is moot at this point.
> For what it's worth, the BPF packet capture mechanism (which really needs a new name, to distinguish itself from the BPF packet filter language and its implementation(s), but I digress) has the same issue - when the timer expires, a wakeup is delivered even if there are no packets to read.
>
> *However*, if there are no packets available, the buffers aren't rotated, so the empty buffer is left around to be filled up with packets, rather than being made the hold buffer.
>
> Given that before the previous TPACKET_V3 change, wakeups were delivered when packets arrived rather than when a block was closed, presumably code using TPACKET_V3 was capable of dealing with wakeups being delivered when no new blocks had been made available to userland; could TPACKET_V3 work a bit more like BPF and deliver a wakeup when the timer expires *without* closing the empty block?
Thank you all for your comments! I'll try to create two patches:
1. Wakeup by timeout without closing the empty block
2. Allow to not wakeup by timeout (the feature should be explicitly requested by a user)

2015-02-06 06:54:28

by Alexander Drozdov

[permalink] [raw]
Subject: Re: [PATCH] af_packet: don't pass empty blocks for PACKET_V3

On 05.02.2015 23:01:38 +0300 Willem de Bruijn wrote:
> On Wed, Feb 4, 2015 at 9:58 PM, Alexander Drozdov <[email protected]> wrote:
>> Don't close an empty block on timeout. Its meaningless to
>> pass it to the user. Moreover, passing empty blocks wastes
>> CPU & buffer space increasing probability of packets
>> dropping on small timeouts.
>>
>> Side effect of this patch is indefinite user-space wait
>> in poll on idle links. But, I believe its better to set
>> timeout for poll(2) when needed than to get empty blocks
>> every millisecond when not needed.
> This change would break existing applications that have come
> to depend on the periodic signal.
>
> I don't disagree with the argument that the data ready signal
> should be sent only when a block is full or a timer expires and
> at least some data is waiting, but that is moot at this point.
I missed something. As pointed by Guy Harris <[email protected]>,
before the previous patch periodic signal was not delivered. The previous patch
(da413eec729dae5dc by Dan Collins <[email protected]>) is for 3.19 kernel only.
Should we care about existing 3.19-only applications?
>
>> Signed-off-by: Alexander Drozdov <[email protected]>
>> ---
>> net/packet/af_packet.c | 10 +++++++++-
>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 9cfe2e1..9a2f70a 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -698,6 +698,10 @@ static void prb_retire_rx_blk_timer_expired(unsigned long data)
>>
>> if (pkc->last_kactive_blk_num == pkc->kactive_blk_num) {
>> if (!frozen) {
>> + if (!BLOCK_NUM_PKTS(pbd)) {
>> + /* An empty block. Just refresh the timer. */
>> + goto refresh_timer;
>> + }
>> prb_retire_current_block(pkc, po, TP_STATUS_BLK_TMO);
>> if (!prb_dispatch_next_block(pkc, po))
>> goto refresh_timer;
>> @@ -798,7 +802,11 @@ static void prb_close_block(struct tpacket_kbdq_core *pkc1,
>> h1->ts_last_pkt.ts_sec = last_pkt->tp_sec;
>> h1->ts_last_pkt.ts_nsec = last_pkt->tp_nsec;
>> } else {
>> - /* Ok, we tmo'd - so get the current time */
>> + /* Ok, we tmo'd - so get the current time.
>> + *
>> + * It shouldn't really happen as we don't close empty
>> + * blocks. See prb_retire_rx_blk_timer_expired().
>> + */
>> struct timespec ts;
>> getnstimeofday(&ts);
>> h1->ts_last_pkt.ts_sec = ts.tv_sec;
>> --
>> 1.9.1
>>

2015-02-07 01:45:53

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH] af_packet: don't pass empty blocks for PACKET_V3

On Thu, Feb 5, 2015 at 10:54 PM, Alexander Drozdov <[email protected]> wrote:
> On 05.02.2015 23:01:38 +0300 Willem de Bruijn wrote:
>>
>> On Wed, Feb 4, 2015 at 9:58 PM, Alexander Drozdov <[email protected]>
>> wrote:
>>>
>>> Don't close an empty block on timeout. Its meaningless to
>>> pass it to the user. Moreover, passing empty blocks wastes
>>> CPU & buffer space increasing probability of packets
>>> dropping on small timeouts.
>>>
>>> Side effect of this patch is indefinite user-space wait
>>> in poll on idle links. But, I believe its better to set
>>> timeout for poll(2) when needed than to get empty blocks
>>> every millisecond when not needed.
>>
>> This change would break existing applications that have come
>> to depend on the periodic signal.
>>
>> I don't disagree with the argument that the data ready signal
>> should be sent only when a block is full or a timer expires and
>> at least some data is waiting, but that is moot at this point.
>
> I missed something. As pointed by Guy Harris <[email protected]>,
> before the previous patch periodic signal was not delivered. The previous
> patch
> (da413eec729dae5dc by Dan Collins <[email protected]>) is for 3.19 kernel
> only. Should we care about existing 3.19-only applications?

It does sound reasonable to expect processes to handle infinite sleep
on no data if that is the historical behavior of the interface.

>>
>>> Signed-off-by: Alexander Drozdov <[email protected]>
>>> ---
>>> net/packet/af_packet.c | 10 +++++++++-
>>> 1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>>> index 9cfe2e1..9a2f70a 100644
>>> --- a/net/packet/af_packet.c
>>> +++ b/net/packet/af_packet.c
>>> @@ -698,6 +698,10 @@ static void prb_retire_rx_blk_timer_expired(unsigned
>>> long data)
>>>
>>> if (pkc->last_kactive_blk_num == pkc->kactive_blk_num) {
>>> if (!frozen) {
>>> + if (!BLOCK_NUM_PKTS(pbd)) {
>>> + /* An empty block. Just refresh the
>>> timer. */
>>> + goto refresh_timer;
>>> + }
>>> prb_retire_current_block(pkc, po,
>>> TP_STATUS_BLK_TMO);
>>> if (!prb_dispatch_next_block(pkc, po))
>>> goto refresh_timer;
>>> @@ -798,7 +802,11 @@ static void prb_close_block(struct tpacket_kbdq_core
>>> *pkc1,
>>> h1->ts_last_pkt.ts_sec = last_pkt->tp_sec;
>>> h1->ts_last_pkt.ts_nsec = last_pkt->tp_nsec;
>>> } else {
>>> - /* Ok, we tmo'd - so get the current time */
>>> + /* Ok, we tmo'd - so get the current time.
>>> + *
>>> + * It shouldn't really happen as we don't close empty
>>> + * blocks. See prb_retire_rx_blk_timer_expired().
>>> + */
>>> struct timespec ts;
>>> getnstimeofday(&ts);
>>> h1->ts_last_pkt.ts_sec = ts.tv_sec;
>>> --
>>> 1.9.1
>>>
>

2015-02-24 05:18:45

by Alexander Drozdov

[permalink] [raw]
Subject: [RESEND PATCH] af_packet: don't pass empty blocks for PACKET_V3

Before da413eec729d ("packet: Fixed TPACKET V3 to signal poll when block is
closed rather than every packet") poll listening for an af_packet socket was
not signaled if there was no packets to process. After the patch poll is
signaled evety time when block retire timer expires. That happens because
af_packet closes the current block on timeout even if the block is empty.

Passing empty blocks to the user not only wastes CPU but also wastes ring
buffer space increasing probability of packets dropping on small timeouts.

Signed-off-by: Alexander Drozdov <[email protected]>
Cc: Dan Collins <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Guy Harris <[email protected]>
---
net/packet/af_packet.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9c28cec..1da46ef 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -698,6 +698,10 @@ static void prb_retire_rx_blk_timer_expired(unsigned long data)

if (pkc->last_kactive_blk_num == pkc->kactive_blk_num) {
if (!frozen) {
+ if (!BLOCK_NUM_PKTS(pbd)) {
+ /* An empty block. Just refresh the timer. */
+ goto refresh_timer;
+ }
prb_retire_current_block(pkc, po, TP_STATUS_BLK_TMO);
if (!prb_dispatch_next_block(pkc, po))
goto refresh_timer;
@@ -798,7 +802,11 @@ static void prb_close_block(struct tpacket_kbdq_core *pkc1,
h1->ts_last_pkt.ts_sec = last_pkt->tp_sec;
h1->ts_last_pkt.ts_nsec = last_pkt->tp_nsec;
} else {
- /* Ok, we tmo'd - so get the current time */
+ /* Ok, we tmo'd - so get the current time.
+ *
+ * It shouldn't really happen as we don't close empty
+ * blocks. See prb_retire_rx_blk_timer_expired().
+ */
struct timespec ts;
getnstimeofday(&ts);
h1->ts_last_pkt.ts_sec = ts.tv_sec;
--
1.9.1

2015-02-24 21:10:03

by David Miller

[permalink] [raw]
Subject: Re: [RESEND PATCH] af_packet: don't pass empty blocks for PACKET_V3

From: Alexander Drozdov <[email protected]>
Date: Tue, 24 Feb 2015 08:18:28 +0300

> Before da413eec729d ("packet: Fixed TPACKET V3 to signal poll when block is
> closed rather than every packet") poll listening for an af_packet socket was
> not signaled if there was no packets to process. After the patch poll is
> signaled evety time when block retire timer expires. That happens because
> af_packet closes the current block on timeout even if the block is empty.
>
> Passing empty blocks to the user not only wastes CPU but also wastes ring
> buffer space increasing probability of packets dropping on small timeouts.
>
> Signed-off-by: Alexander Drozdov <[email protected]>

Applied, thanks.