On 7/15/22 4:55 AM, Zhengchao Shao wrote:
> Syzbot found an issue [1]: fq_codel_drop() tries to drop a flow without any
> skbs, that is, flow->head is NULL.
> The root cause, as [2] says, is that bpf_prog_test_run_skb() runs a bpf
> prog which redirects empty skbs.
> So we should check whether the length of a packet modified by a bpf prog,
> or by others such as bpf_prog_test, is valid before forwarding it directly.
>
> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
>
> Reported-by: [email protected]
> Signed-off-by: Zhengchao Shao <[email protected]>
> ---
> v3: modify debug print
> v2: need move checking to convert___skb_to_skb and add debug info
> v1: should not check len in fast path
>
> include/linux/skbuff.h | 8 ++++++++
> net/bpf/test_run.c | 3 +++
> net/core/dev.c | 1 +
> 3 files changed, 12 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f6a27ab19202..82e8368ba6e6 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2459,6 +2459,14 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
>
> #endif /* NET_SKBUFF_DATA_USES_OFFSET */
>
> +static inline void skb_assert_len(struct sk_buff *skb)
> +{
> +#ifdef CONFIG_DEBUG_NET
> + if (WARN_ONCE(!skb->len, "%s\n", __func__))
> + DO_ONCE_LITE(skb_dump, KERN_ERR, skb, false);
> +#endif /* CONFIG_DEBUG_NET */
> +}
> +
> /*
> * Add data to an sk_buff
> */
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2ca96acbc50a..dc9dc0bedca0 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -955,6 +955,9 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
> {
> struct qdisc_skb_cb *cb = (struct qdisc_skb_cb *)skb->cb;
>
> + if (!skb->len)
> + return -EINVAL;
From another recent report [0], I don't think this change fixes the issue
reported by syzbot. It probably makes sense to revert this patch.

AFAICT, this '!skb->len' test is done after

	if (is_l2)
		__skb_push(skb, hh_len);

Hence, skb->len is not zero in convert___skb_to_skb(). The proper place to test
skb->len is before __skb_push(), to ensure there is some network header after
the mac header. Alternatively, we may as well ensure "data_size_in > ETH_HLEN"
at the beginning.

The fix in [0] has been applied. If it turns out that skbs generated by
test_run cause other cases that need extra fixes in bpf_redirect_*, we will
need to revisit the earlier !skb->len check mentioned above, and the existing
test cases outside of test_progs would have to be adjusted accordingly.

[0]: https://lore.kernel.org/bpf/[email protected]/
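
To make the ordering point above concrete, here is a rough userspace sketch
(not kernel code). It assumes the test_run path first pulls the mac header off
the skb and then, for l2 programs, pushes hh_len back before
convert___skb_to_skb() runs. The names mock_skb, mock_pull_mac(), mock_push(),
check_after_push(), check_before_push() and mock_selfcheck() are made up for
illustration, and only skb->len is modelled.

```c
#include <assert.h>
#include <errno.h>

#define ETH_HLEN 14 /* Ethernet header length, same value as in the kernel */

/* Hypothetical stand-in for struct sk_buff: only the length is modelled. */
struct mock_skb {
	unsigned int len;
};

/* Models the mac header being pulled off the front of the packet. */
static void mock_pull_mac(struct mock_skb *skb)
{
	skb->len -= ETH_HLEN;
}

/* Models __skb_push(skb, hh_len): the header is made visible again. */
static void mock_push(struct mock_skb *skb, unsigned int hh_len)
{
	skb->len += hh_len;
}

/* Ordering as in the patch: the length test runs after the push. */
int check_after_push(unsigned int data_size_in, int is_l2)
{
	struct mock_skb skb = { .len = data_size_in };

	if (data_size_in < ETH_HLEN) /* assumed: too-short input is rejected early */
		return -EINVAL;
	mock_pull_mac(&skb);
	if (is_l2)
		mock_push(&skb, ETH_HLEN);
	if (!skb.len) /* cannot trigger for l2: len is at least ETH_HLEN here */
		return -EINVAL;
	return 0;
}

/* Suggested ordering: test the length before the push. */
int check_before_push(unsigned int data_size_in, int is_l2)
{
	struct mock_skb skb = { .len = data_size_in };

	if (data_size_in < ETH_HLEN) /* assumed: too-short input is rejected early */
		return -EINVAL;
	mock_pull_mac(&skb);
	if (!skb.len) /* nothing after the mac header */
		return -EINVAL;
	if (is_l2)
		mock_push(&skb, ETH_HLEN);
	return 0;
}

/* A packet of exactly ETH_HLEN bytes has no network header at all. */
void mock_selfcheck(void)
{
	assert(check_after_push(ETH_HLEN, 1) == 0);        /* slips through */
	assert(check_before_push(ETH_HLEN, 1) == -EINVAL); /* rejected */
}
```

In this model an input of exactly ETH_HLEN bytes passes the check placed after
the push but is rejected by the check placed before it, which is the same
effect as requiring "data_size_in > ETH_HLEN" up front.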
> +
> if (!__skb)
> return 0;
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index d588fd0a54ce..716df64fcfa5 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4168,6 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> bool again = false;
>
> skb_reset_mac_header(skb);
> + skb_assert_len(skb);
>
> if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);