LINUX_MIB_OFOPRUNED counts how many times the ofo queue is pruned,
but sometimes we want to know how many packets were pruned from this queue;
that would help us track the dropped packets.
Since LINUX_MIB_OFOPRUNED is a useful event for us, introduce a new
SNMP counter, LINUX_MIB_OFOPRUNEDROP, which shows up in netstat as
OfoPruneDrop.
Signed-off-by: Yafang Shao <[email protected]>
---
include/uapi/linux/snmp.h | 1 +
net/ipv4/proc.c | 1 +
net/ipv4/tcp_input.c | 1 +
3 files changed, 3 insertions(+)
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index e5ebc83..c996fba 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -172,6 +172,7 @@ enum
LINUX_MIB_PRUNECALLED, /* PruneCalled */
LINUX_MIB_RCVPRUNED, /* RcvPruned */
LINUX_MIB_OFOPRUNED, /* OfoPruned */
+ LINUX_MIB_OFOPRUNEDROP, /* OfoPruneDrop */
LINUX_MIB_OUTOFWINDOWICMPS, /* OutOfWindowIcmps */
LINUX_MIB_LOCKDROPPEDICMPS, /* LockDroppedIcmps */
LINUX_MIB_ARPFILTER, /* ArpFilter */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index b46e4cf..c718295 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -180,6 +180,7 @@ static int sockstat_seq_show(struct seq_file *seq, void *v)
SNMP_MIB_ITEM("PruneCalled", LINUX_MIB_PRUNECALLED),
SNMP_MIB_ITEM("RcvPruned", LINUX_MIB_RCVPRUNED),
SNMP_MIB_ITEM("OfoPruned", LINUX_MIB_OFOPRUNED),
+ SNMP_MIB_ITEM("OfoPruneDrop", LINUX_MIB_OFOPRUNEDROP),
SNMP_MIB_ITEM("OutOfWindowIcmps", LINUX_MIB_OUTOFWINDOWICMPS),
SNMP_MIB_ITEM("LockDroppedIcmps", LINUX_MIB_LOCKDROPPEDICMPS),
SNMP_MIB_ITEM("ArpFilter", LINUX_MIB_ARPFILTER),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 91dbb9a..5267121 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4980,6 +4980,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
do {
prev = rb_prev(node);
rb_erase(node, &tp->out_of_order_queue);
+ NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNEDROP);
tcp_drop(sk, rb_to_skb(node));
sk_mem_reclaim(sk);
if (atomic_read(&sk->sk_rmem_alloc) <= sk->sk_rcvbuf &&
--
1.8.3.1
On 07/25/2018 06:06 AM, Yafang Shao wrote:
> LINUX_MIB_OFOPRUNED is used to count how many times ofo queue is pruned,
> but sometimes we want to know how many packets are pruned from this queue,
> that could help us to track the dropped packets.
>
> As LINUX_MIB_OFOPRUNED is a useful event for us, so I introduce a new
> SNMP counter LINUX_MIB_OFOPRUNEDROP, which could be showed in netstat as
> OfoPruneDrop.
Okay, but why track the number of skbs that are removed?
An skb can contain many segments (because of GRO and TCP coalescing),
so your claim of tracking dropped packets is ill defined.
Also, I would prefer to have the net tree merged into net-next first, since your patch
would add a merge conflict.
On Wed, Jul 25, 2018 at 9:33 PM, Eric Dumazet <[email protected]> wrote:
>
>
> On 07/25/2018 06:06 AM, Yafang Shao wrote:
>> LINUX_MIB_OFOPRUNED is used to count how many times ofo queue is pruned,
>> but sometimes we want to know how many packets are pruned from this queue,
>> that could help us to track the dropped packets.
>>
>> As LINUX_MIB_OFOPRUNED is a useful event for us, so I introduce a new
>> SNMP counter LINUX_MIB_OFOPRUNEDROP, which could be showed in netstat as
>> OfoPruneDrop.
>
>
> Okay, but why tracking number of skbs that are removed ?
>
Because we want to know why packets were dropped.
If that could be shown in netstat, we could easily see that they were
dropped due to ofo pruning.
> Skb can contain many segments (because of GRO and TCP coalescing)
>
> So your claim of tracking dropped packets is ill defined.
>
> Also I prefer having net tree being merged into net-next, since your patch would
> add a merge conflict.
>
>
On 07/25/2018 06:40 AM, Yafang Shao wrote:
>
> Because we want to know why packets were dropped.
> If that could be show in netstat, we could easily find that it is
> dropped due to ofo prune.
We already have a counter for these events: LINUX_MIB_OFOPRUNED.
You want to add another counter tracking the number of _skbs_,
which has no precise meaning for the user,
given that each skb can contain a variable number of _packets_.
On Wed, Jul 25, 2018 at 9:55 PM, Eric Dumazet <[email protected]> wrote:
>
>
> On 07/25/2018 06:40 AM, Yafang Shao wrote:
>
>>
>> Because we want to know why packets were dropped.
>> If that could be show in netstat, we could easily find that it is
>> dropped due to ofo prune.
>
>
> We have a counter already for these events : LINUX_MIB_OFOPRUNED
>
> You want to add another counter tracking number of _skbs_,
> which has no precise value for user,
> given each skb can contain a variable number of _packets_.
>
Got it.
But with LINUX_MIB_OFOPRUNED, we only know that tcp_prune_ofo_queue was
called; we have no idea how many skbs were dropped.
Thanks
Yafang
On Wed, Jul 25, 2018 at 9:59 PM, Yafang Shao <[email protected]> wrote:
> On Wed, Jul 25, 2018 at 9:55 PM, Eric Dumazet <[email protected]> wrote:
>>
>>
>> On 07/25/2018 06:40 AM, Yafang Shao wrote:
>>
>>>
>>> Because we want to know why packets were dropped.
>>> If that could be show in netstat, we could easily find that it is
>>> dropped due to ofo prune.
>>
>>
>> We have a counter already for these events : LINUX_MIB_OFOPRUNED
>>
>> You want to add another counter tracking number of _skbs_,
>> which has no precise value for user,
>> given each skb can contain a variable number of _packets_.
>>
>
> Got it.
>
> But with LINUX_MIB_OFOPRUNED, we only know tcp_prune_ofo_queue is
> called, but have no idea how many skbs are dropped.
>
Hi Eric,
LINUX_MIB_TCPOFOQUEUE, LINUX_MIB_TCPOFODROP and LINUX_MIB_TCPOFOMERGE
all count SKBs, but LINUX_MIB_OFOPRUNED alone counts events, which
could lead to misunderstanding.
So I think introducing a counter for the number of SKBs pruned would be
better; that would help us track the whole behavior of the ofo queue.
That is why I submitted this patch.
Thanks
Yafang
On 07/25/2018 06:42 PM, Yafang Shao wrote:
>
> Hi Eric,
>
> LINUX_MIB_TCPOFOQUEUE, LINUX_MIB_TCPOFODROP and LINUX_MIB_TCPOFOMERGE
> are all for the number of SKBs, but only LINUX_MIB_OFOPRUNED is for
> the event, that could lead misunderstading.
> So I think introducing a counter for the number of SKB pruned could be
> better, that could help us to track the whole behavior of ofo queue.
> That is why I submit this patch.
Sure, and I said your patch had issues.
You mixed 'packets' and 'skbs', but apparently you did not get my point.
I would rather not add another SNMP counter, and instead refine the current one
to track something more meaningful.
The notion of an 'skb' is internal to the kernel and cannot be mapped easily
to a 'number of network segments', which is probably more useful for the user.
I will send this instead:
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d51fa358b2b196d0f9c258b24354813f2128a675..141a062abd0660c8f6d049de1dc7c7ecf7a7230d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5001,18 +5001,19 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
struct rb_node *node, *prev;
+ unsigned int segs = 0;
int goal;
if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
return false;
- NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
goal = sk->sk_rcvbuf >> 3;
node = &tp->ooo_last_skb->rbnode;
do {
prev = rb_prev(node);
rb_erase(node, &tp->out_of_order_queue);
goal -= rb_to_skb(node)->truesize;
+ segs += max_t(u16, 1, skb_shinfo(rb_to_skb(node))->gso_segs);
tcp_drop(sk, rb_to_skb(node));
if (!prev || goal <= 0) {
sk_mem_reclaim(sk);
@@ -5023,6 +5024,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
}
node = prev;
} while (node);
+ NET_ADD_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED, segs);
tp->ooo_last_skb = rb_to_skb(prev);
/* Reset SACK state. A conforming SACK implementation will
On Thu, Jul 26, 2018 at 12:06 PM, Eric Dumazet <[email protected]> wrote:
>
>
> On 07/25/2018 06:42 PM, Yafang Shao wrote:
>
>>
>> Hi Eric,
>>
>> LINUX_MIB_TCPOFOQUEUE, LINUX_MIB_TCPOFODROP and LINUX_MIB_TCPOFOMERGE
>> are all for the number of SKBs, but only LINUX_MIB_OFOPRUNED is for
>> the event, that could lead misunderstading.
>> So I think introducing a counter for the number of SKB pruned could be
>> better, that could help us to track the whole behavior of ofo queue.
>> That is why I submit this patch.
>
> Sure, and I said your patch had issues.
> You mixed 'packets' and 'skbs' but apparently you do not get my point.
>
I had noticed that I made this mistake.
> I would rather not add another SNMP counter, and refine the current one,
> trying to track something more meaningful.
>
> The notion of 'skb' is internal to the kernel, and can not be mapped easily
> to 'number of network segments' which probably is more useful for the user.
>
> I will send this instead :
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index d51fa358b2b196d0f9c258b24354813f2128a675..141a062abd0660c8f6d049de1dc7c7ecf7a7230d 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -5001,18 +5001,19 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
> {
> struct tcp_sock *tp = tcp_sk(sk);
> struct rb_node *node, *prev;
> + unsigned int segs = 0;
> int goal;
>
> if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
> return false;
>
> - NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
> goal = sk->sk_rcvbuf >> 3;
> node = &tp->ooo_last_skb->rbnode;
> do {
> prev = rb_prev(node);
> rb_erase(node, &tp->out_of_order_queue);
> goal -= rb_to_skb(node)->truesize;
> + segs += max_t(u16, 1, skb_shinfo(rb_to_skb(node))->gso_segs);
> tcp_drop(sk, rb_to_skb(node));
> if (!prev || goal <= 0) {
> sk_mem_reclaim(sk);
> @@ -5023,6 +5024,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
> }
> node = prev;
> } while (node);
> + NET_ADD_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED, segs);
> tp->ooo_last_skb = rb_to_skb(prev);
>
> /* Reset SACK state. A conforming SACK implementation will
>
Your patch will make it more meaningful.
How about the other counters?
I think changing all of them from the number of SKBs to the number of
network segments would be better.
Thanks
Yafang
On 07/25/2018 09:31 PM, Yafang Shao wrote:
> On Thu, Jul 26, 2018 at 12:06 PM, Eric Dumazet <[email protected]> wrote:
>>
>>
>> On 07/25/2018 06:42 PM, Yafang Shao wrote:
>>
>>>
>>> Hi Eric,
>>>
>>> LINUX_MIB_TCPOFOQUEUE, LINUX_MIB_TCPOFODROP and LINUX_MIB_TCPOFOMERGE
>>> are all for the number of SKBs, but only LINUX_MIB_OFOPRUNED is for
>>> the event, that could lead misunderstading.
>>> So I think introducing a counter for the number of SKB pruned could be
>>> better, that could help us to track the whole behavior of ofo queue.
>>> That is why I submit this patch.
>>
>> Sure, and I said your patch had issues.
>> You mixed 'packets' and 'skbs' but apparently you do not get my point.
>>
>
> I had noticed that I made this mistake.
>
>> I would rather not add another SNMP counter, and refine the current one,
>> trying to track something more meaningful.
>>
>> The notion of 'skb' is internal to the kernel, and can not be mapped easily
>> to 'number of network segments' which probably is more useful for the user.
>>
>> I will send this instead :
>>
>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>> index d51fa358b2b196d0f9c258b24354813f2128a675..141a062abd0660c8f6d049de1dc7c7ecf7a7230d 100644
>> --- a/net/ipv4/tcp_input.c
>> +++ b/net/ipv4/tcp_input.c
>> @@ -5001,18 +5001,19 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
>> {
>> struct tcp_sock *tp = tcp_sk(sk);
>> struct rb_node *node, *prev;
>> + unsigned int segs = 0;
>> int goal;
>>
>> if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
>> return false;
>>
>> - NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
>> goal = sk->sk_rcvbuf >> 3;
>> node = &tp->ooo_last_skb->rbnode;
>> do {
>> prev = rb_prev(node);
>> rb_erase(node, &tp->out_of_order_queue);
>> goal -= rb_to_skb(node)->truesize;
>> + segs += max_t(u16, 1, skb_shinfo(rb_to_skb(node))->gso_segs);
>> tcp_drop(sk, rb_to_skb(node));
>> if (!prev || goal <= 0) {
>> sk_mem_reclaim(sk);
>> @@ -5023,6 +5024,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
>> }
>> node = prev;
>> } while (node);
>> + NET_ADD_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED, segs);
>> tp->ooo_last_skb = rb_to_skb(prev);
>>
>> /* Reset SACK state. A conforming SACK implementation will
>>
>
> You patch will make it more meaningful.
> How about the other ones?
Same thing, really.
Note that tcp_collapse() does not currently propagate skb_shinfo(skb)->gso_segs
when rebuilding the skbs, so that will need a fix.
On Thu, Jul 26, 2018 at 12:36 PM, Eric Dumazet <[email protected]> wrote:
>
>
> On 07/25/2018 09:31 PM, Yafang Shao wrote:
>> On Thu, Jul 26, 2018 at 12:06 PM, Eric Dumazet <[email protected]> wrote:
>>>
>>>
>>> On 07/25/2018 06:42 PM, Yafang Shao wrote:
>>>
>>>>
>>>> Hi Eric,
>>>>
>>>> LINUX_MIB_TCPOFOQUEUE, LINUX_MIB_TCPOFODROP and LINUX_MIB_TCPOFOMERGE
>>>> are all for the number of SKBs, but only LINUX_MIB_OFOPRUNED is for
>>>> the event, that could lead misunderstading.
>>>> So I think introducing a counter for the number of SKB pruned could be
>>>> better, that could help us to track the whole behavior of ofo queue.
>>>> That is why I submit this patch.
>>>
>>> Sure, and I said your patch had issues.
>>> You mixed 'packets' and 'skbs' but apparently you do not get my point.
>>>
>>
>> I had noticed that I made this mistake.
>>
>>> I would rather not add another SNMP counter, and refine the current one,
>>> trying to track something more meaningful.
>>>
>>> The notion of 'skb' is internal to the kernel, and can not be mapped easily
>>> to 'number of network segments' which probably is more useful for the user.
>>>
>>> I will send this instead :
>>>
>>> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
>>> index d51fa358b2b196d0f9c258b24354813f2128a675..141a062abd0660c8f6d049de1dc7c7ecf7a7230d 100644
>>> --- a/net/ipv4/tcp_input.c
>>> +++ b/net/ipv4/tcp_input.c
>>> @@ -5001,18 +5001,19 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
>>> {
>>> struct tcp_sock *tp = tcp_sk(sk);
>>> struct rb_node *node, *prev;
>>> + unsigned int segs = 0;
>>> int goal;
>>>
>>> if (RB_EMPTY_ROOT(&tp->out_of_order_queue))
>>> return false;
>>>
>>> - NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
>>> goal = sk->sk_rcvbuf >> 3;
>>> node = &tp->ooo_last_skb->rbnode;
>>> do {
>>> prev = rb_prev(node);
>>> rb_erase(node, &tp->out_of_order_queue);
>>> goal -= rb_to_skb(node)->truesize;
>>> + segs += max_t(u16, 1, skb_shinfo(rb_to_skb(node))->gso_segs);
>>> tcp_drop(sk, rb_to_skb(node));
>>> if (!prev || goal <= 0) {
>>> sk_mem_reclaim(sk);
>>> @@ -5023,6 +5024,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
>>> }
>>> node = prev;
>>> } while (node);
>>> + NET_ADD_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED, segs);
>>> tp->ooo_last_skb = rb_to_skb(prev);
>>>
>>> /* Reset SACK state. A conforming SACK implementation will
>>>
>>
>> You patch will make it more meaningful.
>> How about the other ones?
>
> Same thing really.
>
> Note that tcp_collapse() does not currently propagate skb_shinfo(skb)->gso_segs
> when rebuilding the skbs, so this will need a fix.
>
Got it.
Thank you very much for your comment.
Thanks
Yafang