On Wed, Jul 13, 2022 at 11:05 PM Zhengchao Shao
<[email protected]> wrote:
>
> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> skbs, that is, the flow->head is null.
> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> run a bpf prog which redirects empty skbs.
> So we should determine whether the length of the packet modified by bpf
> prog is valid before forwarding it directly.
>
> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
>
> Reported-by: [email protected]
> Signed-off-by: Zhengchao Shao <[email protected]>
Reviewed-by: Stanislav Fomichev <[email protected]>
Daniel, do you see any issues with this approach?
I wonder if we might also want to add some WARN_ON to the
__bpf_redirect_common routine gated by #ifdef CONFIG_DEBUG_NET ?
In case syscaller manages to hit similar issues elsewhere..
> ---
> v1: should not check len in fast path
>
> net/bpf/test_run.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 2ca96acbc50a..750d7d173a20 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -1152,6 +1152,12 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> ret = convert___skb_to_skb(skb, ctx);
> if (ret)
> goto out;
> +
> + if (skb->len == 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false);
> if (ret)
> goto out;
> --
> 2.17.1
>
On 7/14/22 8:22 PM, Stanislav Fomichev wrote:
> On Wed, Jul 13, 2022 at 11:05 PM Zhengchao Shao
> <[email protected]> wrote:
>>
>> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
>> skbs, that is, the flow->head is null.
>> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
>> run a bpf prog which redirects empty skbs.
>> So we should determine whether the length of the packet modified by bpf
>> prog is valid before forwarding it directly.
>>
>> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
>> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
>>
>> Reported-by: [email protected]
>> Signed-off-by: Zhengchao Shao <[email protected]>
>
> Reviewed-by: Stanislav Fomichev <[email protected]>
>
> Daniel, do you see any issues with this approach?
I think it's fine, maybe this could be folded into:
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 750d7d173a20..256cd18cfe22 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -957,6 +957,8 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
if (!__skb)
return 0;
+ if (!skb->len)
+ return -EINVAL;
/* make sure the fields we don't use are zeroed */
if (!range_is_zero(__skb, 0, offsetof(struct __sk_buff, mark)))
> I wonder if we might also want to add some WARN_ON to the
> __bpf_redirect_common routine gated by #ifdef CONFIG_DEBUG_NET ?
> In case syscaller manages to hit similar issues elsewhere..
Assuming the issue is generic (and CONFIG_DEBUG_NET only enabled by things like
syzkaller), couldn't we do something like the below?
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f6a27ab19202..c9988a785294 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2459,6 +2459,17 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
#endif /* NET_SKBUFF_DATA_USES_OFFSET */
+static inline void skb_assert_len(struct sk_buff *skb)
+{
+#ifdef CONFIG_DEBUG_NET
+ if (unlikey(!skb->len)) {
+ pr_err("%s\n", __func__);
+ skb_dump(KERN_ERR, skb, false);
+ WARN_ON_ONCE(1);
+ }
+#endif /* CONFIG_DEBUG_NET */
+}
+
/*
* Add data to an sk_buff
*/
diff --git a/net/core/dev.c b/net/core/dev.c
index 978ed0622d8f..53c4b9fd22c0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4168,7 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
bool again = false;
skb_reset_mac_header(skb);
-
+ skb_assert_len(skb);
if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
__skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
>> ---
>> v1: should not check len in fast path
>>
>> net/bpf/test_run.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
>> index 2ca96acbc50a..750d7d173a20 100644
>> --- a/net/bpf/test_run.c
>> +++ b/net/bpf/test_run.c
>> @@ -1152,6 +1152,12 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
>> ret = convert___skb_to_skb(skb, ctx);
>> if (ret)
>> goto out;
>> +
>> + if (skb->len == 0) {
>> + ret = -EINVAL;
>> + goto out;
>> + }
>> +
>> ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false);
>> if (ret)
>> goto out;
>> --
>> 2.17.1
>>
On Thu, Jul 14, 2022 at 1:39 PM Daniel Borkmann <[email protected]> wrote:
>
> On 7/14/22 8:22 PM, Stanislav Fomichev wrote:
> > On Wed, Jul 13, 2022 at 11:05 PM Zhengchao Shao
> > <[email protected]> wrote:
> >>
> >> Syzbot found an issue [1]: fq_codel_drop() try to drop a flow whitout any
> >> skbs, that is, the flow->head is null.
> >> The root cause, as the [2] says, is because that bpf_prog_test_run_skb()
> >> run a bpf prog which redirects empty skbs.
> >> So we should determine whether the length of the packet modified by bpf
> >> prog is valid before forwarding it directly.
> >>
> >> LINK: [1] https://syzkaller.appspot.com/bug?id=0b84da80c2917757915afa89f7738a9d16ec96c5
> >> LINK: [2] https://www.spinics.net/lists/netdev/msg777503.html
> >>
> >> Reported-by: [email protected]
> >> Signed-off-by: Zhengchao Shao <[email protected]>
> >
> > Reviewed-by: Stanislav Fomichev <[email protected]>
> >
> > Daniel, do you see any issues with this approach?
>
> I think it's fine, maybe this could be folded into:
>
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index 750d7d173a20..256cd18cfe22 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -957,6 +957,8 @@ static int convert___skb_to_skb(struct sk_buff *skb, struct __sk_buff *__skb)
>
> if (!__skb)
> return 0;
> + if (!skb->len)
> + return -EINVAL;
So putting it in convert___skb_to_skb lets us remove a goto :-) Works
for me, whichever you prefer.
> /* make sure the fields we don't use are zeroed */
> if (!range_is_zero(__skb, 0, offsetof(struct __sk_buff, mark)))
>
> > I wonder if we might also want to add some WARN_ON to the
> > __bpf_redirect_common routine gated by #ifdef CONFIG_DEBUG_NET ?
> > In case syscaller manages to hit similar issues elsewhere..
>
> Assuming the issue is generic (and CONFIG_DEBUG_NET only enabled by things like
> syzkaller), couldn't we do something like the below?
Based on [1] it's selected by DEBUG_KERNEL, so it feels like it's safe
to assume that it's mostly debugging/fuzzing workloads?
Your suggestion to put it into __dev_queue_xmit looks good to me!
[1] https://lore.kernel.org/netdev/[email protected]/
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f6a27ab19202..c9988a785294 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -2459,6 +2459,17 @@ static inline void skb_set_tail_pointer(struct sk_buff *skb, const int offset)
>
> #endif /* NET_SKBUFF_DATA_USES_OFFSET */
>
> +static inline void skb_assert_len(struct sk_buff *skb)
> +{
> +#ifdef CONFIG_DEBUG_NET
> + if (unlikey(!skb->len)) {
> + pr_err("%s\n", __func__);
> + skb_dump(KERN_ERR, skb, false);
> + WARN_ON_ONCE(1);
> + }
> +#endif /* CONFIG_DEBUG_NET */
> +}
> +
> /*
> * Add data to an sk_buff
> */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 978ed0622d8f..53c4b9fd22c0 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4168,7 +4168,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
> bool again = false;
>
> skb_reset_mac_header(skb);
> -
> + skb_assert_len(skb);
> if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_SCHED_TSTAMP))
> __skb_tstamp_tx(skb, NULL, NULL, skb->sk, SCM_TSTAMP_SCHED);
>
>
>
> >> ---
> >> v1: should not check len in fast path
> >>
> >> net/bpf/test_run.c | 6 ++++++
> >> 1 file changed, 6 insertions(+)
> >>
> >> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> >> index 2ca96acbc50a..750d7d173a20 100644
> >> --- a/net/bpf/test_run.c
> >> +++ b/net/bpf/test_run.c
> >> @@ -1152,6 +1152,12 @@ int bpf_prog_test_run_skb(struct bpf_prog *prog, const union bpf_attr *kattr,
> >> ret = convert___skb_to_skb(skb, ctx);
> >> if (ret)
> >> goto out;
> >> +
> >> + if (skb->len == 0) {
> >> + ret = -EINVAL;
> >> + goto out;
> >> + }
> >> +
> >> ret = bpf_test_run(prog, skb, repeat, &retval, &duration, false);
> >> if (ret)
> >> goto out;
> >> --
> >> 2.17.1
> >>
>