2022-02-03 20:43:14

by Dongli Zhang

Subject: [PATCH RFC 0/4] net: skb: to use (function, line) pair as reason for kfree_skb_reason()

This RFC seeks suggestions on how to track the reason that an sk_buff is
dropped.

Sometimes the kernel does not call kfree_skb() directly to drop the sk_buff.
Instead, it does a "goto drop" and calls kfree_skb() at the 'drop' label, so
a single call site serves many failure paths. This makes it difficult to
track the reason that the sk_buff is dropped.

Commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") introduced
kfree_skb_reason() to help track the reason. However, we may need to define
many reasons for each driver/subsystem.

I am going to trace the "goto drop" in TUN and TAP drivers. However, I will
need to introduce many new reasons if I re-use kfree_skb_reason().


There are some other options.

1. The 1st option is to introduce a new tracepoint, e.g., trace_drop_skb()
with the prototype below, to track the function and line number. We would
call trace_drop_skb() before each "goto drop".

TP_PROTO(struct sk_buff *skb, struct net_device *dev,
const char *function, unsigned int line),


2. The 2nd option is to call trace_kfree_skb() directly before each
"goto drop". We may then replace kfree_skb() at the 'drop' label with the
kfree_skb_notrace() below, as suggested by Joao Martins.

/**
 * kfree_skb_notrace - free an sk_buff without tracing
 * @skb: buffer to free
 *
 * Drop a reference to the buffer and free it if the usage count has
 * hit zero.
 */
void kfree_skb_notrace(struct sk_buff *skb)
{
	if (!skb_unref(skb))
		return;

	__kfree_skb(skb);
}
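
To make the flow of the 2nd option concrete, here is a minimal userspace
sketch, not kernel code. Every demo_* name is hypothetical: struct demo_pkt
stands in for sk_buff, demo_trace_free() for trace_kfree_skb(), and
demo_unref() for skb_unref(). It only illustrates the idea that the trace
event is emitted at the failure site, while the shared 'drop' label frees
without tracing again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct sk_buff: a reference count only. */
struct demo_pkt {
	int users;
	bool freed;
};

/* Counts emitted trace events; stands in for the real tracepoint. */
static unsigned int demo_trace_events;

static void demo_trace_free(const struct demo_pkt *p, const char *where)
{
	demo_trace_events++;
	printf("trace: pkt=%p dropped at %s\n", (const void *)p, where);
}

/* Drop one reference; return true only when it was the last one. */
static bool demo_unref(struct demo_pkt *p)
{
	if (!p)
		return false;
	assert(p->users > 0);
	return --p->users == 0;
}

/* Free without emitting a trace event, mirroring kfree_skb_notrace(). */
static void demo_free_notrace(struct demo_pkt *p)
{
	if (!demo_unref(p))
		return;
	p->freed = true;
}

static int demo_rx(struct demo_pkt *p, bool bad_header)
{
	if (bad_header) {
		/* Trace at the failure site, before jumping to 'drop'. */
		demo_trace_free(p, "bad_header");
		goto drop;
	}
	return 0;

drop:
	/* The common exit path must not trace a second time. */
	demo_free_notrace(p);
	return -1;
}
```

With a single reference, demo_rx(&p, true) emits exactly one trace event and
frees the packet. A packet that still has other users is only unreferenced.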


3. The last option is this RFC. To avoid introducing so many new reasons,
we use (__func__, __LINE__) to uniquely identify the location of
each "goto drop". The 'reason' introduced by
commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") is replaced
by the (function, line) pair.
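
The (function, line) idea can be sketched in plain userspace C as follows.
This is only an illustration under assumptions: the DEMO_DROP_* macros are
hypothetical stand-ins for the SKB_DROP_FUNC, SKB_DROP_LINE and
SKB_DROP_LINE_NONE names used in the TUN patch (their real definitions are
in a patch of this series that is not shown here), and demo_free_reason()
stands in for the modified kfree_skb_reason().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the SKB_DROP_* macros of this RFC. */
#define DEMO_DROP_FUNC		__func__
#define DEMO_DROP_LINE		__LINE__
#define DEMO_DROP_LINE_NONE	0

/* Remembers the most recent drop location. */
struct demo_drop {
	const char *function;
	unsigned int line;
};

static struct demo_drop demo_last_drop;

/* Stand-in for a free routine that reports (function, line). */
static void demo_free_reason(const char *function, unsigned int line)
{
	assert(function != NULL);
	demo_last_drop.function = function;
	demo_last_drop.line = line;
	printf("drop: function=%s line=%u\n", function, line);
}

static int demo_xmit(int attached, int filter_ok)
{
	unsigned int drop_line = DEMO_DROP_LINE_NONE;

	if (!attached) {
		/* Record where we decided to drop, then share one exit. */
		drop_line = DEMO_DROP_LINE;
		goto drop;
	}

	if (!filter_ok) {
		drop_line = DEMO_DROP_LINE;
		goto drop;
	}

	return 0;

drop:
	/* One call site, but the line identifies the failing check. */
	demo_free_reason(DEMO_DROP_FUNC, drop_line);
	return -1;
}
```

Each failing check stores a different line number, so the single call at the
'drop' label still tells the two failure paths apart without defining a new
reason for each of them.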

Below is sample output from trace_pipe with this RFC applied, when the
sk_buff is dropped by the TUN driver.


<idle>-0 [016] ..s1. 432.701987: kfree_skb: skbaddr=00000000a65c0a72 protocol=0 location=000000008a49d80c function=none line=0
<idle>-0 [003] b.s2. 432.704397: kfree_skb: skbaddr=00000000665e5ccd protocol=2048 location=00000000ec3b7129 function=tun_net_xmit line=1116
<idle>-0 [003] ..s1. 432.704400: kfree_skb: skbaddr=00000000e4c806f8 protocol=2048 location=000000002929642d function=none line=0
<idle>-0 [002] b.s2. 432.734617: kfree_skb: skbaddr=00000000079749b3 protocol=2048 location=00000000ec3b7129 function=tun_net_xmit line=1116
<idle>-0 [015] b.s2. 432.880571: kfree_skb: skbaddr=00000000e1542f1e protocol=34525 location=00000000ec3b7129 function=tun_net_xmit line=1116
<idle>-0 [015] ..s1. 432.880577: kfree_skb: skbaddr=000000004f3022b6 protocol=34525 location=00000000547c5c25 function=none line=0
<idle>-0 [002] b.s2. 432.886247: kfree_skb: skbaddr=0000000062990a71 protocol=2054 location=00000000ec3b7129 function=tun_net_xmit line=1116
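
For post-processing output like the above in userspace, a small parser is
enough. The sketch below is an assumption about how one might consume these
lines and is not part of this series: parse_drop_event() and
ethertype_name() are hypothetical helpers, and the field layout is taken
only from the sample lines shown here. The protocol field is the EtherType
in decimal (2048 = 0x0800, 2054 = 0x0806, 34525 = 0x86DD).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of interest from one kfree_skb trace line. */
struct drop_event {
	unsigned int protocol;
	char function[64];
	unsigned int line;
};

/* Returns 1 on success, 0 if the expected fields are missing. */
static int parse_drop_event(const char *text, struct drop_event *ev)
{
	const char *p;

	assert(text != NULL && ev != NULL);

	p = strstr(text, "protocol=");
	if (!p || sscanf(p, "protocol=%u", &ev->protocol) != 1)
		return 0;

	p = strstr(text, "function=");
	if (!p || sscanf(p, "function=%63s line=%u",
			 ev->function, &ev->line) != 2)
		return 0;

	return 1;
}

/* Map the EtherType values seen in the sample output to names. */
static const char *ethertype_name(unsigned int protocol)
{
	switch (protocol) {
	case 0x0800:
		return "IPv4";
	case 0x0806:
		return "ARP";
	case 0x86DD:
		return "IPv6";
	default:
		return "other";
	}
}
```

Feeding each trace_pipe line to parse_drop_event() and counting by
(function, line) would give a per-location drop histogram.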


drivers/net/tap.c | 30 ++++++++++++++++++++++--------
drivers/net/tun.c | 33 +++++++++++++++++++++++++--------
include/linux/skbuff.h | 24 +++++++-----------------
include/trace/events/skb.h | 37 ++++++++-----------------------------
net/core/dev.c | 3 ++-
net/core/skbuff.c | 10 ++++++----
net/ipv4/tcp_ipv4.c | 14 +++++++-------
net/ipv4/udp.c | 14 +++++++-------
8 files changed, 84 insertions(+), 81 deletions(-)


Would you please share your suggestions and feedback?

Thank you very much!

Dongli Zhang



2022-02-04 11:58:03

by Dongli Zhang

Subject: [PATCH RFC 3/4] net: tun: track dropped skb via kfree_skb_reason()

TUN can be used as a vhost-net backend. E.g., tun_net_xmit() is the
interface that forwards the skb from TUN to vhost-net/virtio-net.

However, there are many "goto drop" paths in the TUN driver. Therefore,
kfree_skb_reason() is used for each "goto drop" to help userspace
ftrace/eBPF track the reason for packet loss.

Cc: Joao Martins <[email protected]>
Cc: Joe Jin <[email protected]>
Signed-off-by: Dongli Zhang <[email protected]>
---
drivers/net/tun.c | 33 +++++++++++++++++++++++++--------
1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index fed85447701a..8f6c6d23a787 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1062,13 +1062,16 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
struct netdev_queue *queue;
struct tun_file *tfile;
int len = skb->len;
+ unsigned int drop_line = SKB_DROP_LINE_NONE;

rcu_read_lock();
tfile = rcu_dereference(tun->tfiles[txq]);

/* Drop packet if interface is not attached */
- if (!tfile)
+ if (!tfile) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

if (!rcu_dereference(tun->steering_prog))
tun_automq_xmit(tun, skb);
@@ -1078,19 +1081,27 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
/* Drop if the filter does not like it.
* This is a noop if the filter is disabled.
* Filter can be enabled only for the TAP devices. */
- if (!check_filter(&tun->txflt, skb))
+ if (!check_filter(&tun->txflt, skb)) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

if (tfile->socket.sk->sk_filter &&
- sk_filter(tfile->socket.sk, skb))
+ sk_filter(tfile->socket.sk, skb)) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

len = run_ebpf_filter(tun, skb, len);
- if (len == 0 || pskb_trim(skb, len))
+ if (len == 0 || pskb_trim(skb, len)) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

- if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC)))
+ if (unlikely(skb_orphan_frags_rx(skb, GFP_ATOMIC))) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

skb_tx_timestamp(skb);

@@ -1101,8 +1112,10 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)

nf_reset_ct(skb);

- if (ptr_ring_produce(&tfile->tx_ring, skb))
+ if (ptr_ring_produce(&tfile->tx_ring, skb)) {
+ drop_line = SKB_DROP_LINE;
goto drop;
+ }

/* NETIF_F_LLTX requires to do our own update of trans_start */
queue = netdev_get_tx_queue(dev, txq);
@@ -1119,7 +1132,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
drop:
atomic_long_inc(&dev->tx_dropped);
skb_tx_error(skb);
- kfree_skb(skb);
+ kfree_skb_reason(skb, SKB_DROP_FUNC, drop_line);
rcu_read_unlock();
return NET_XMIT_DROP;
}
@@ -1717,6 +1730,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
u32 rxhash = 0;
int skb_xdp = 1;
bool frags = tun_napi_frags_enabled(tfile);
+ unsigned int drop_line = SKB_DROP_LINE_NONE;

if (!(tun->flags & IFF_NO_PI)) {
if (len < sizeof(pi))
@@ -1820,9 +1834,10 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,

if (err) {
err = -EFAULT;
+ drop_line = SKB_DROP_LINE;
drop:
atomic_long_inc(&tun->dev->rx_dropped);
- kfree_skb(skb);
+ kfree_skb_reason(skb, SKB_DROP_FUNC, drop_line);
if (frags) {
tfile->napi.skb = NULL;
mutex_unlock(&tfile->napi_mutex);
@@ -1868,6 +1883,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
break;
case IFF_TAP:
if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
+ drop_line = SKB_DROP_LINE;
err = -ENOMEM;
goto drop;
}
@@ -1922,6 +1938,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
if (unlikely(!(tun->dev->flags & IFF_UP))) {
err = -EIO;
rcu_read_unlock();
+ drop_line = SKB_DROP_LINE;
goto drop;
}

--
2.17.1