2022-02-21 09:13:14

by Dongli Zhang

Subject: [PATCH net-next v3 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb

Commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") introduced
kfree_skb_reason() to help track why an skb is dropped.

tun and tap are commonly used as virtio-net/vhost-net backends. This series
uses kfree_skb_reason() to trace dropped skbs in those two drivers.

Changed since v1:
- Renamed many of the reasons to make them as generic as possible, so that
  they can be reused by core networking and other drivers.

Changed since v2:
- Declare drop_reason as type "enum skb_drop_reason".
- Handle the drop in the skb_list_walk_safe() case for the tap driver, and
  introduce kfree_skb_list_reason() for it.


The following reasons are introduced.

- SKB_DROP_REASON_SKB_CSUM

This is used whenever there is a checksum error with the sk_buff.

- SKB_DROP_REASON_SKB_COPY_DATA

The kernel may copy (or zero-copy) data to or from an sk_buff, e.g., via
zerocopy_sg_from_iter(), skb_copy_datagram_from_iter() and
skb_orphan_frags_rx(). This reason covers copy-related errors.

- SKB_DROP_REASON_SKB_GSO_SEG

Any error reported during GSO processing of the sk_buff. GSO processing is
frequent enough that we introduce a dedicated reason for it.

- SKB_DROP_REASON_SKB_PULL
- SKB_DROP_REASON_SKB_TRIM

Pulling or trimming sk_buff data is a frequent operation; these reasons cover
a failure of either.

- SKB_DROP_REASON_DEV_HDR

Any driver may report this when there is an error in the metadata on the DMA
ring buffer.

- SKB_DROP_REASON_DEV_READY

The device is not ready, not online, or not initialized to receive data.

- SKB_DROP_REASON_DEV_FILTER

David Ahern suggested SKB_DROP_REASON_TAP_FILTER. I changed from 'TAP' to 'DEV'
to make it more generic.

- SKB_DROP_REASON_FULL_RING

Suggested by Eric Dumazet.

- SKB_DROP_REASON_BPF_FILTER

Dropped by an eBPF filter.
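A minimal userspace sketch of how the reason names above can be kept in sync
with the short strings that appear after "reason:" in the trace. This is not
the kernel's definition: the real enum lives in include/linux/skbuff.h and the
string mapping in include/trace/events/skb.h. The DROP_REASONS() macro, the
helper names, and the enum values here are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the reasons named in this cover letter.
 * Order and numeric values are illustrative, not the kernel's. */
#define DROP_REASONS(X) \
	X(NOT_SPECIFIED)   \
	X(SKB_CSUM)        \
	X(SKB_COPY_DATA)   \
	X(SKB_GSO_SEG)     \
	X(SKB_PULL)        \
	X(SKB_TRIM)        \
	X(DEV_HDR)         \
	X(DEV_READY)       \
	X(DEV_FILTER)      \
	X(FULL_RING)       \
	X(BPF_FILTER)

/* Expand the list once into enum constants ... */
enum skb_drop_reason {
#define X(name) SKB_DROP_REASON_##name,
	DROP_REASONS(X)
#undef X
	SKB_DROP_REASON_MAX,
};

/* ... and once into the short names printed in the trace. */
static const char *const drop_reason_names[] = {
#define X(name) #name,
	DROP_REASONS(X)
#undef X
};

/* Map a reason to the string shown after "reason:". */
static const char *drop_reason_name(enum skb_drop_reason reason)
{
	assert((size_t)reason < (size_t)SKB_DROP_REASON_MAX);
	return drop_reason_names[reason];
}

/* Map a trace string back to a reason; SKB_DROP_REASON_MAX if unknown. */
static enum skb_drop_reason drop_reason_from_name(const char *name)
{
	for (size_t i = 0; i < (size_t)SKB_DROP_REASON_MAX; i++)
		if (strcmp(drop_reason_names[i], name) == 0)
			return (enum skb_drop_reason)i;
	return SKB_DROP_REASON_MAX;
}
```

Because both tables are generated from one list, adding a reason in one place
cannot leave the printed name out of date.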


This is the output for a TUN device.

# cat /sys/kernel/debug/tracing/trace_pipe
<idle>-0 [018] ..s1. 1478.130490: kfree_skb: skbaddr=00000000c4f21b8d protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
vhost-9003-9020 [012] b..1. 1478.196264: kfree_skb: skbaddr=00000000b174fb9b protocol=2054 location=000000001cf38db0 reason: FULL_RING
arping-9639 [018] b..1. 1479.082993: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING
<idle>-0 [012] b.s3. 1479.110472: kfree_skb: skbaddr=00000000e0c3681f protocol=4 location=000000001cf38db0 reason: FULL_RING
arping-9639 [018] b..1. 1480.083086: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING


This is the output for a TAP device.

# cat /sys/kernel/debug/tracing/trace_pipe
<idle>-0 [014] ..s1. 1096.418621: kfree_skb: skbaddr=00000000f8f41946 protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
arping-7006 [001] ..s1. 1096.843961: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
arping-7006 [001] ..s1. 1097.844035: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
arping-7006 [001] ..s1. 1098.844102: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
arping-7006 [001] ..s1. 1099.844160: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
arping-7006 [001] ..s1. 1100.844214: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
arping-7006 [001] ..s1. 1101.844230: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
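
For readers post-processing output like the above, here is a small userspace
sketch that pulls the protocol and reason out of one kfree_skb trace line. It
assumes the exact field layout shown in these samples; the struct and function
names are hypothetical. The protocol value 2054 is 0x0806, the ARP EtherType,
which matches the arping traffic in the samples.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ETHERTYPE_ARP 0x0806 /* 2054 in decimal */

/* Fields of interest from one kfree_skb trace line (hypothetical type). */
struct kfree_skb_event {
	unsigned int protocol; /* number printed after "protocol=" */
	char reason[32];       /* token printed after "reason:" */
};

/* Return 0 on success, -1 if the line does not match the layout above. */
static int parse_kfree_skb_line(const char *line, struct kfree_skb_event *ev)
{
	const char *p;

	assert(line != NULL && ev != NULL);

	/* Skip the task/CPU/timestamp prefix and start at the event name. */
	p = strstr(line, "kfree_skb: ");
	if (!p)
		return -1;

	/* %*s skips the hashed skbaddr and location pointers. */
	if (sscanf(p,
		   "kfree_skb: skbaddr=%*s protocol=%u location=%*s reason: %31s",
		   &ev->protocol, ev->reason) != 2)
		return -1;

	return 0;
}
```

Feeding it the FULL_RING lines above should yield protocol 2054 and the reason
string "FULL_RING", which makes it easy to count drops per reason.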


drivers/net/tap.c | 35 +++++++++++++++++++++++++----------
drivers/net/tun.c | 38 ++++++++++++++++++++++++++++++--------
include/linux/skbuff.h | 18 ++++++++++++++++++
include/trace/events/skb.h | 10 ++++++++++
net/core/skbuff.c | 11 +++++++++--
5 files changed, 92 insertions(+), 20 deletions(-)

Please let me know if you have any suggestions on the definitions of the reasons.

Thank you very much!

Dongli Zhang



2022-02-21 09:13:13

by Dongli Zhang

Subject: [PATCH net-next v3 3/4] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement"

No functional change.

This just splits the if statement into separate conditions, so that a later
patch can use kfree_skb_reason() to trace a distinct reason for each.

Cc: Joao Martins <[email protected]>
Cc: Joe Jin <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/net/tun.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index fed8544..aa27268 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1086,7 +1086,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
 		goto drop;

 	len = run_ebpf_filter(tun, skb, len);
-	if (len == 0 || pskb_trim(skb, len))
+	if (len == 0)
+		goto drop;
+
+	if (pskb_trim(skb, len))
 		goto drop;

 	if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC)))
--
1.8.3.1
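
A userspace sketch of why the split is useful and why it changes no behaviour.
The mock reason names are assumptions taken from the cover letter
(BPF_FILTER and SKB_TRIM); which reasons the later patch actually assigns is
not shown in this message. The filter result and trim result are passed in as
plain values, whereas in tun_net_xmit() pskb_trim() is a call that is only
reached when len is non-zero, in both the old and the new form.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel drop reasons. */
enum mock_drop_reason {
	MOCK_DROP_NONE,
	MOCK_DROP_BPF_FILTER,
	MOCK_DROP_SKB_TRIM,
};

/* Old shape: one combined test, so both failures are indistinguishable. */
static bool drops_combined(unsigned int len, int trim_err)
{
	return len == 0 || trim_err;
}

/* New shape: each condition has its own branch and can carry a reason. */
static enum mock_drop_reason drops_split(unsigned int len, int trim_err)
{
	if (len == 0)
		return MOCK_DROP_BPF_FILTER;

	if (trim_err)
		return MOCK_DROP_SKB_TRIM;

	return MOCK_DROP_NONE;
}

/* "No functional change": both shapes drop exactly the same inputs. */
static void check_no_functional_change(void)
{
	for (unsigned int len = 0; len < 3; len++)
		for (int trim_err = -1; trim_err <= 0; trim_err++)
			assert(drops_combined(len, trim_err) ==
			       (drops_split(len, trim_err) != MOCK_DROP_NONE));
}
```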

2022-02-21 09:54:58

by Dongli Zhang

Subject: [PATCH net-next v3 1/4] skbuff: introduce kfree_skb_list_reason()

Introduce kfree_skb_list_reason() to drop a list of sk_buffs with a specific
reason.

Cc: Joao Martins <[email protected]>
Cc: Joe Jin <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
---
include/linux/skbuff.h | 2 ++
net/core/skbuff.c | 11 +++++++++--
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a3e90ef..87ebe2f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1176,6 +1176,8 @@ static inline void kfree_skb(struct sk_buff *skb)
 }

 void skb_release_head_state(struct sk_buff *skb);
+void kfree_skb_list_reason(struct sk_buff *segs,
+			   enum skb_drop_reason reason);
 void kfree_skb_list(struct sk_buff *segs);
 void skb_dump(const char *level, const struct sk_buff *skb, bool full_pkt);
 void skb_tx_error(struct sk_buff *skb);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 9d0388be..dfdd71e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -777,15 +777,22 @@ void kfree_skb_reason(struct sk_buff *skb, enum skb_drop_reason reason)
 }
 EXPORT_SYMBOL(kfree_skb_reason);

-void kfree_skb_list(struct sk_buff *segs)
+void kfree_skb_list_reason(struct sk_buff *segs,
+			   enum skb_drop_reason reason)
 {
 	while (segs) {
 		struct sk_buff *next = segs->next;

-		kfree_skb(segs);
+		kfree_skb_reason(segs, reason);
 		segs = next;
 	}
 }
+EXPORT_SYMBOL(kfree_skb_list_reason);
+
+void kfree_skb_list(struct sk_buff *segs)
+{
+	kfree_skb_list_reason(segs, SKB_DROP_REASON_NOT_SPECIFIED);
+}
 EXPORT_SYMBOL(kfree_skb_list);

 /* Dump skb information and contents.
--
1.8.3.1
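
The pattern in this patch can be tried outside the kernel with a mock list.
The sketch below is a userspace analogue, not kernel code: struct buf, the
mock reasons, the per-reason counter, and make_list() are all hypothetical.
It shows the two points the patch relies on: the next pointer is read before
the current element is freed, and the old entry point becomes a thin wrapper
that passes the "not specified" reason.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum mock_reason { MOCK_NOT_SPECIFIED, MOCK_FULL_RING, MOCK_REASON_MAX };

/* Minimal singly linked buffer, standing in for struct sk_buff. */
struct buf {
	struct buf *next;
};

/* Counts frees per reason, standing in for the kfree_skb tracepoint. */
static unsigned int freed_by_reason[MOCK_REASON_MAX];

static void free_buf_reason(struct buf *b, enum mock_reason reason)
{
	freed_by_reason[reason]++;
	free(b);
}

/* Free every element of the list, attributing each free to 'reason'. */
static void free_buf_list_reason(struct buf *segs, enum mock_reason reason)
{
	while (segs) {
		struct buf *next = segs->next; /* read before segs is freed */

		free_buf_reason(segs, reason);
		segs = next;
	}
}

/* Existing callers keep working and report "not specified". */
static void free_buf_list(struct buf *segs)
{
	free_buf_list_reason(segs, MOCK_NOT_SPECIFIED);
}

/* Build a list of n buffers for the demonstration. */
static struct buf *make_list(size_t n)
{
	struct buf *head = NULL;

	while (n--) {
		struct buf *b = malloc(sizeof(*b));

		assert(b != NULL);
		b->next = head;
		head = b;
	}
	return head;
}
```

Calling free_buf_list_reason(make_list(3), MOCK_FULL_RING) attributes three
frees to the full-ring reason, while free_buf_list() keeps the old behaviour
for callers that have not been converted.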

2022-02-22 04:37:29

by David Ahern

Subject: Re: [PATCH net-next v3 1/4] skbuff: introduce kfree_skb_list_reason()

On 2/20/22 10:34 PM, Dongli Zhang wrote:
> This is to introduce kfree_skb_list_reason() to drop a list of sk_buff with
> a specific reason.
>
> Cc: Joao Martins <[email protected]>
> Cc: Joe Jin <[email protected]>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> include/linux/skbuff.h | 2 ++
> net/core/skbuff.c | 11 +++++++++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
>

Reviewed-by: David Ahern <[email protected]>


2022-02-22 05:10:19

by David Ahern

Subject: Re: [PATCH net-next v3 3/4] net: tun: split run_ebpf_filter() and pskb_trim() into different "if statement"

On 2/20/22 10:34 PM, Dongli Zhang wrote:
> No functional change.
>
> Just to split the if statement into different conditions to use
> kfree_skb_reason() to trace the reason later.
>
> Cc: Joao Martins <[email protected]>
> Cc: Joe Jin <[email protected]>
> Signed-off-by: Dongli Zhang <[email protected]>
> ---
> drivers/net/tun.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>

Reviewed-by: David Ahern <[email protected]>


2022-02-22 05:32:58

by Dongli Zhang

Subject: Re: [PATCH net-next v3 0/3] tun/tap: use kfree_skb_reason() to trace dropped skb

The subject should be [PATCH net-next v3 0/4], not [PATCH net-next v3 0/3].

Sorry for the mistake.

Dongli Zhang

On 2/20/22 9:34 PM, Dongli Zhang wrote:
> The commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") has
> introduced the kfree_skb_reason() to help track the reason.
>
> The tun and tap are commonly used as virtio-net/vhost-net backend. This is to
> use kfree_skb_reason() to trace the dropped skb for those two drivers.
>
> Changed since v1:
> - I have renamed many of the reasons since v1. I make them as generic as
> possible so that they can be re-used by core networking and drivers.
>
> Changed since v2:
> - declare drop_reason as type "enum skb_drop_reason"
> - handle the drop in skb_list_walk_safe() case for tap driver, and
> kfree_skb_list_reason() is introduced
>
>
> The following reasons are introduced.
>
> - SKB_DROP_REASON_SKB_CSUM
>
> This is used whenever there is checksum error with sk_buff.
>
> - SKB_DROP_REASON_SKB_COPY_DATA
>
> The kernel may (zero) copy the data to or from sk_buff, e.g.,
> zerocopy_sg_from_iter(), skb_copy_datagram_from_iter() and
> skb_orphan_frags_rx(). This reason is for the copy related error.
>
> - SKB_DROP_REASON_SKB_GSO_SEG
>
> Any error reported when GSO processing the sk_buff. It is frequent to process
> sk_buff gso data and we introduce a new reason to handle that.
>
> - SKB_DROP_REASON_SKB_PULL
> - SKB_DROP_REASON_SKB_TRIM
>
> It is frequent to pull to sk_buff data or trim the sk_buff data.
>
> - SKB_DROP_REASON_DEV_HDR
>
> Any driver may report error if there is any error in the metadata on the DMA
> ring buffer.
>
> - SKB_DROP_REASON_DEV_READY
>
> The device is not ready/online or initialized to receive data.
>
> - SKB_DROP_REASON_DEV_FILTER
>
> David Ahern suggested SKB_DROP_REASON_TAP_FILTER. I changed from 'TAP' to 'DEV'
> to make it more generic.
>
> - SKB_DROP_REASON_FULL_RING
>
> Suggested by Eric Dumazet.
>
> - SKB_DROP_REASON_BPF_FILTER
>
> Dropped by ebpf filter
>
>
> This is the output for TUN device.
>
> # cat /sys/kernel/debug/tracing/trace_pipe
> <idle>-0 [018] ..s1. 1478.130490: kfree_skb: skbaddr=00000000c4f21b8d protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
> vhost-9003-9020 [012] b..1. 1478.196264: kfree_skb: skbaddr=00000000b174fb9b protocol=2054 location=000000001cf38db0 reason: FULL_RING
> arping-9639 [018] b..1. 1479.082993: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING
> <idle>-0 [012] b.s3. 1479.110472: kfree_skb: skbaddr=00000000e0c3681f protocol=4 location=000000001cf38db0 reason: FULL_RING
> arping-9639 [018] b..1. 1480.083086: kfree_skb: skbaddr=00000000c4f21b8d protocol=2054 location=000000001cf38db0 reason: FULL_RING
>
>
> This is the output for TAP device.
>
> # cat /sys/kernel/debug/tracing/trace_pipe
> <idle>-0 [014] ..s1. 1096.418621: kfree_skb: skbaddr=00000000f8f41946 protocol=0 location=00000000aff342c7 reason: NOT_SPECIFIED
> arping-7006 [001] ..s1. 1096.843961: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
> arping-7006 [001] ..s1. 1097.844035: kfree_skb: skbaddr=000000002ec803a8 protocol=2054 location=000000009a57b32f reason: FULL_RING
> arping-7006 [001] ..s1. 1098.844102: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
> arping-7006 [001] ..s1. 1099.844160: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
> arping-7006 [001] ..s1. 1100.844214: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
> arping-7006 [001] ..s1. 1101.844230: kfree_skb: skbaddr=00000000295eb0da protocol=2054 location=000000009a57b32f reason: FULL_RING
>
>
> drivers/net/tap.c | 35 +++++++++++++++++++++++++----------
> drivers/net/tun.c | 38 ++++++++++++++++++++++++++++++--------
> include/linux/skbuff.h | 18 ++++++++++++++++++
> include/trace/events/skb.h | 10 ++++++++++
> net/core/skbuff.c | 11 +++++++++--
> 5 files changed, 92 insertions(+), 20 deletions(-)
>
> Please let me know if there is any suggestion on the definition of reasons.
>
> Thank you very much!
>
> Dongli Zhang
>
>