2017-12-28 01:19:08

by Tom Herbert

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <[email protected]> wrote:
>
>
> 27.12.2017, 23:14, "Dmitry Vyukov" <[email protected]>:
>> On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <[email protected]> wrote:
>>> 27.12.2017, 22:21, "Dmitry Vyukov" <[email protected]>:
>>>> On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <[email protected]> wrote:
>>>>> Did you try the patch I posted?
>>>>
>>>> Hi Tom,
>>>
>>> Hello Dmitry,
>>>
>>>> No. And I didn't know I need to. Why?
>>>> If you think the patch needs additional testing, you can ask syzbot to
>>>> test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>> Otherwise proceed with committing it. Or what are we waiting for?
>>>>
>>>> Thanks
>>>
>>> I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>> How do test it because there is no patch in the following bug?
>>
>> Hi Ozgur,
>>
>> I am not sure I completely understand what you mean. But the
>> reproducer for this bug (which one can use for testing) is here:
>> https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>> Tom also mentions there is some patch for this, but I don't know where
>> it is, it doesn't seem to be referenced from this thread.
>
> Hello Dmitry,
>
> Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
> I think Tom send patch to only you and are you tested?
>
> kcmsock.c will change and strp_data_ready I think locked.
>
> Tom, please send a patch for me? I can test and inform you.
>
Hi Ozgur,

I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!

Thanks,
Tom

> Regards
>
> Ozgur
>
>>> The fix patch should be for this net/kcm/kcmsock.c file and lock functions must be added calling sk_data_ready ().
>>> Regards
>>>
>>> Ozgur
>>>
>>>>> On Wed, Dec 27, 2017 at 10:25 AM, Dmitry Vyukov <[email protected]> wrote:
>>>>>> On Wed, Dec 6, 2017 at 4:44 PM, Dmitry Vyukov <[email protected]> wrote:
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> On 10/24/2017 08:20 AM, syzbot wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> syzkaller hit the following crash on 73d3393ada4f70fa3df5639c8d438f2f034c0ecb
>>>>>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>>>>>>>>>> compiler: gcc (GCC) 7.1.1 20170620
>>>>>>>>>> .config is attached
>>>>>>>>>> Raw console output is attached.
>>>>>>>>>> C reproducer is attached
>>>>>>>>>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>>>>>>>>>> for information about syzkaller reproducers
>>>>>>>>>>
>>>>>>>>>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_me include/net/sock.h:1505 [inline]
>>>>>>>>>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_user include/net/sock.h:1511 [inline]
>>>>>>>>>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
>>>>>>>>>> Kernel panic - not syncing: panic_on_warn set ...
>>>>>>>>>>
>>>>>>>>>> CPU: 0 PID: 2996 Comm: syzkaller142210 Not tainted 4.14.0-rc5+ #138
>>>>>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>>>>>>> Call Trace:
>>>>>>>>>> <IRQ>
>>>>>>>>>> __dump_stack lib/dump_stack.c:16 [inline]
>>>>>>>>>> dump_stack+0x194/0x257 lib/dump_stack.c:52
>>>>>>>>>> panic+0x1e4/0x417 kernel/panic.c:181
>>>>>>>>>> __warn+0x1c4/0x1d9 kernel/panic.c:542
>>>>>>>>>> report_bug+0x211/0x2d0 lib/bug.c:183
>>>>>>>>>> fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
>>>>>>>>>> do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
>>>>>>>>>> do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
>>>>>>>>>> do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
>>>>>>>>>> do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
>>>>>>>>>> invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
>>>>>>>>>> RIP: 0010:sock_owned_by_me include/net/sock.h:1505 [inline]
>>>>>>>>>> RIP: 0010:sock_owned_by_user include/net/sock.h:1511 [inline]
>>>>>>>>>> RIP: 0010:strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
>>>>>>>>>> RSP: 0018:ffff8801db206b18 EFLAGS: 00010206
>>>>>>>>>> RAX: ffff8801d1e02080 RBX: ffff8801dad74c48 RCX: 0000000000000000
>>>>>>>>>> RDX: 0000000000000100 RSI: ffff8801d29fa0a0 RDI: ffffffff85cbede0
>>>>>>>>>> RBP: ffff8801db206b38 R08: 0000000000000005 R09: 1ffffffff0ce0bcd
>>>>>>>>>> R10: ffff8801db206a00 R11: dffffc0000000000 R12: ffff8801d29fa000
>>>>>>>>>> R13: ffff8801dad74c50 R14: ffff8801d4350a92 R15: 0000000000000001
>>>>>>>>>> psock_data_ready+0x56/0x70 net/kcm/kcmsock.c:353
>>>>>>>>>
>>>>>>>>> Looks like KCM is calling sk_data_ready() without first taking the
>>>>>>>>> sock lock.
>>>>>>>>>
>>>>>>>>> /* Called with lower sock held */
>>>>>>>>> static void kcm_rcv_strparser(struct strparser *strp, struct sk_buff *skb)
>>>>>>>>> {
>>>>>>>>> [...]
>>>>>>>>> if (kcm_queue_rcv_skb(&kcm->sk, skb)) {
>>>>>>>>>
>>>>>>>>> In this case kcm->sk is not the same lock the comment is referring to.
>>>>>>>>> And kcm_queue_rcv_skb() will eventually call sk_data_ready().
>>>>>>>>>
>>>>>>>>> @Tom, how about wrapping the sk_data_ready call in {lock|release}_sock?
>>>>>>>>> I don't have anything better in mind immediately.
>>>>>>>> The sock locks are taken in reverse order in the send path so so
>>>>>>>> grabbing kcm sock lock with lower lock held to call sk_data_ready may
>>>>>>>> lead to deadlock like I think.
>>>>>>>>
>>>>>>>> It might be possible to change the order in the send path to do this.
>>>>>>>> Something like:
>>>>>>>>
>>>>>>>> trylock on lower socket lock
>>>>>>>> -if trylock fails
>>>>>>>> - release kcm sock lock
>>>>>>>> - lock lower sock
>>>>>>>> - lock kcm sock
>>>>>>>> - call sendpage locked function
>>>>>>>>
>>>>>>>> I admit that dealing with two levels of socket locks in the data path
>>>>>>>> is quite a pain :-)
>>>>>>>
>>>>>>> up
>>>>>>>
>>>>>>> still happening and we've lost 50K+ test VMs on this
>>>>>>
>>>>>> up
>>>>>>
>>>>>> Still happens and number of crashes crossed 60K, can we do something
>>>>>> with this please?


2017-12-28 07:55:09

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Thu, Dec 28, 2017 at 2:19 AM, Tom Herbert <[email protected]> wrote:
> On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <[email protected]> wrote:
>>
>>
>> 27.12.2017, 23:14, "Dmitry Vyukov" <[email protected]>:
>>> On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <[email protected]> wrote:
>>>> 27.12.2017, 22:21, "Dmitry Vyukov" <[email protected]>:
>>>>> On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <[email protected]> wrote:
>>>>>> Did you try the patch I posted?
>>>>>
>>>>> Hi Tom,
>>>>
>>>> Hello Dmitry,
>>>>
>>>>> No. And I didn't know I need to. Why?
>>>>> If you think the patch needs additional testing, you can ask syzbot to
>>>>> test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>>> Otherwise proceed with committing it. Or what are we waiting for?
>>>>>
>>>>> Thanks
>>>>
>>>> I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>>> How do test it because there is no patch in the following bug?
>>>
>>> Hi Ozgur,
>>>
>>> I am not sure I completely understand what you mean. But the
>>> reproducer for this bug (which one can use for testing) is here:
>>> https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>>> Tom also mentions there is some patch for this, but I don't know where
>>> it is, it doesn't seem to be referenced from this thread.
>>
>> Hello Dmitry,
>>
>> Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
>> I think Tom send patch to only you and are you tested?
>>
>> kcmsock.c will change and strp_data_ready I think locked.
>>
>> Tom, please send a patch for me? I can test and inform you.
>>
> Hi Ozgur,
>
> I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!

OK, I will work as your typist this time:

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
master

But I wonder what part of the following you don't understand? Do we
need to improve wording or something?

> If you think the patch needs additional testing, you can ask syzbot to test it.
> See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot

Also I don't know what git repo/branch you have in mind. Kernel
patches don't generally apply just to any tree. Fingers crossed that I
guessed correctly and it will apply.


Attachments:
strparser.patch (999.00 B)

2017-12-28 08:12:04

by syzbot

[permalink] [raw]
Subject: WARNING in strp_data_ready

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger
crash:

Reported-and-tested-by:
syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com

Note: the tag will also help syzbot to understand when the bug is fixed.

Tested on commit 5f520fc318764df800789edd202b5e3b55130613
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
compiler: gcc (GCC) 7.1.1 20170620
Patch is attached.
Kernel config is attached.


---
There is no WARRANTY for the result, to the extent permitted by applicable
law.
Except when otherwise stated in writing syzbot provides the result "AS IS"
without warranty of any kind, either expressed or implied, but not limited
to,
the implied warranties of merchantability and fittness for a particular
purpose.
The entire risk as to the quality of the result is with you. Should the
result
prove defective, you assume the cost of all necessary servicing, repair or
correction.


Attachments:
config.txt (121.80 kB)
patch.txt (906.00 B)
Download all attachments

2017-12-28 08:14:13

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Thu, Dec 28, 2017 at 9:12 AM, syzbot
<syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com>
wrote:
> Hello,
>
> syzbot has tested the proposed patch and the reproducer did not trigger
> crash:
>
> Reported-and-tested-by:
> syzbot+c91c53af67f9ebe599a337d2e70950366153b295@syzkaller.appspotmail.com

/\/\/\/\/\/\/\/\/\/\/\/\/\

Tom, please don't miss this part!

> Note: the tag will also help syzbot to understand when the bug is fixed.
>
> Tested on commit 5f520fc318764df800789edd202b5e3b55130613
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> Patch is attached.
> Kernel config is attached.
>
>
> ---
> There is no WARRANTY for the result, to the extent permitted by applicable
> law.
> Except when otherwise stated in writing syzbot provides the result "AS IS"
> without warranty of any kind, either expressed or implied, but not limited
> to,
> the implied warranties of merchantability and fittness for a particular
> purpose.
> The entire risk as to the quality of the result is with you. Should the
> result
> prove defective, you assume the cost of all necessary servicing, repair or
> correction.

2017-12-28 16:14:55

by Tom Herbert

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Thu, Dec 28, 2017 at 12:59 AM, Ozgur <[email protected]> wrote:
>
>
> 28.12.2017, 04:19, "Tom Herbert" <[email protected]>:
>> On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <[email protected]> wrote:
>>> 27.12.2017, 23:14, "Dmitry Vyukov" <[email protected]>:
>>>> On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <[email protected]> wrote:
>>>>> 27.12.2017, 22:21, "Dmitry Vyukov" <[email protected]>:
>>>>>> On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <[email protected]> wrote:
>>>>>>> Did you try the patch I posted?
>>>>>>
>>>>>> Hi Tom,
>>>>>
>>>>> Hello Dmitry,
>>>>>
>>>>>> No. And I didn't know I need to. Why?
>>>>>> If you think the patch needs additional testing, you can ask syzbot to
>>>>>> test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>>>> Otherwise proceed with committing it. Or what are we waiting for?
>>>>>>
>>>>>> Thanks
>>>>>
>>>>> I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>>>> How do test it because there is no patch in the following bug?
>>>>
>>>> Hi Ozgur,
>>>>
>>>> I am not sure I completely understand what you mean. But the
>>>> reproducer for this bug (which one can use for testing) is here:
>>>> https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>>>> Tom also mentions there is some patch for this, but I don't know where
>>>> it is, it doesn't seem to be referenced from this thread.
>>>
>>> Hello Dmitry,
>>>
>>> Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
>>> I think Tom send patch to only you and are you tested?
>>>
>>> kcmsock.c will change and strp_data_ready I think locked.
>>>
>>> Tom, please send a patch for me? I can test and inform you.
>>
>> Hi Ozgur,
>>
>> I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!
>>
>> Thanks,
>> Tom
>
> Hello Tom,
>
> Which are you use the repos? I pulled but I don't seen this patches.
>
They are not in any public repo yet. I posted the patches to netdev
list so they can be reviewed and tested by third parties. Posting
patches to the list a normal path to get patches into the kernel
(http://nickdesaulniers.github.io/blog/2017/05/16/submitting-your-first-patch-to-the-linux-kernel-and-responding-to-feedback/).

These patches were applied to net-next but are simple enough that they
should apply to other branches. I will repost and target to net per
Dave's directive once they are verified to fix the issue.

Tom

2017-12-28 16:33:45

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Thu, Dec 28, 2017 at 5:14 PM, Tom Herbert <[email protected]> wrote:
> On Thu, Dec 28, 2017 at 12:59 AM, Ozgur <[email protected]> wrote:
>>
>>
>> 28.12.2017, 04:19, "Tom Herbert" <[email protected]>:
>>> On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <[email protected]> wrote:
>>>> 27.12.2017, 23:14, "Dmitry Vyukov" <[email protected]>:
>>>>> On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <[email protected]> wrote:
>>>>>> 27.12.2017, 22:21, "Dmitry Vyukov" <[email protected]>:
>>>>>>> On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <[email protected]> wrote:
>>>>>>>> Did you try the patch I posted?
>>>>>>>
>>>>>>> Hi Tom,
>>>>>>
>>>>>> Hello Dmitry,
>>>>>>
>>>>>>> No. And I didn't know I need to. Why?
>>>>>>> If you think the patch needs additional testing, you can ask syzbot to
>>>>>>> test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>>>>> Otherwise proceed with committing it. Or what are we waiting for?
>>>>>>>
>>>>>>> Thanks
>>>>>>
>>>>>> I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>>>>> How do test it because there is no patch in the following bug?
>>>>>
>>>>> Hi Ozgur,
>>>>>
>>>>> I am not sure I completely understand what you mean. But the
>>>>> reproducer for this bug (which one can use for testing) is here:
>>>>> https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>>>>> Tom also mentions there is some patch for this, but I don't know where
>>>>> it is, it doesn't seem to be referenced from this thread.
>>>>
>>>> Hello Dmitry,
>>>>
>>>> Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
>>>> I think Tom send patch to only you and are you tested?
>>>>
>>>> kcmsock.c will change and strp_data_ready I think locked.
>>>>
>>>> Tom, please send a patch for me? I can test and inform you.
>>>
>>> Hi Ozgur,
>>>
>>> I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!
>>>
>>> Thanks,
>>> Tom
>>
>> Hello Tom,
>>
>> Which are you use the repos? I pulled but I don't seen this patches.
>>
> They are not in any public repo yet. I posted the patches to netdev
> list so they can be reviewed and tested by third parties. Posting
> patches to the list a normal path to get patches into the kernel
> (http://nickdesaulniers.github.io/blog/2017/05/16/submitting-your-first-patch-to-the-linux-kernel-and-responding-to-feedback/).
>
> These patches were applied to net-next but are simple enough that they
> should apply to other branches. I will repost and target to net per
> Dave's directive once they are verified to fix the issue.

FWIW they are already verified to fix the issue, see few emails up, also here:
https://groups.google.com/d/msg/syzkaller-bugs/Kxs05ziCpgY/fPdZcO_GAwAJ
and don't forget this:
https://groups.google.com/d/msg/syzkaller-bugs/Kxs05ziCpgY/uGjsrA3HAwAJ

2017-12-28 18:35:26

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Thu, Dec 28, 2017 at 7:21 PM, Ozgur <[email protected]> wrote:
>
>
> 28.12.2017, 19:33, "Dmitry Vyukov" <[email protected]>:
>> On Thu, Dec 28, 2017 at 5:14 PM, Tom Herbert <[email protected]> wrote:
>>> On Thu, Dec 28, 2017 at 12:59 AM, Ozgur <[email protected]> wrote:
>>>> 28.12.2017, 04:19, "Tom Herbert" <[email protected]>:
>>>>> On Wed, Dec 27, 2017 at 12:20 PM, Ozgur <[email protected]> wrote:
>>>>>> 27.12.2017, 23:14, "Dmitry Vyukov" <[email protected]>:
>>>>>>> On Wed, Dec 27, 2017 at 9:08 PM, Ozgur <[email protected]> wrote:
>>>>>>>> 27.12.2017, 22:21, "Dmitry Vyukov" <[email protected]>:
>>>>>>>>> On Wed, Dec 27, 2017 at 8:09 PM, Tom Herbert <[email protected]> wrote:
>>>>>>>>>> Did you try the patch I posted?
>>>>>>>>>
>>>>>>>>> Hi Tom,
>>>>>>>>
>>>>>>>> Hello Dmitry,
>>>>>>>>
>>>>>>>>> No. And I didn't know I need to. Why?
>>>>>>>>> If you think the patch needs additional testing, you can ask syzbot to
>>>>>>>>> test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot
>>>>>>>>> Otherwise proceed with committing it. Or what are we waiting for?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> I think we need to fixed patch for crash, in fact check to patch code and test solve the bug.
>>>>>>>> How do test it because there is no patch in the following bug?
>>>>>>>
>>>>>>> Hi Ozgur,
>>>>>>>
>>>>>>> I am not sure I completely understand what you mean. But the
>>>>>>> reproducer for this bug (which one can use for testing) is here:
>>>>>>> https://groups.google.com/forum/#!topic/syzkaller-bugs/Kxs05ziCpgY
>>>>>>> Tom also mentions there is some patch for this, but I don't know where
>>>>>>> it is, it doesn't seem to be referenced from this thread.
>>>>>>
>>>>>> Hello Dmitry,
>>>>>>
>>>>>> Ah, I'm sorry I don't seen Tom mail and I don't have a patch not tested :)
>>>>>> I think Tom send patch to only you and are you tested?
>>>>>>
>>>>>> kcmsock.c will change and strp_data_ready I think locked.
>>>>>>
>>>>>> Tom, please send a patch for me? I can test and inform you.
>>>>>
>>>>> Hi Ozgur,
>>>>>
>>>>> I reposted the patches as RFC "kcm: Fix lockdep issue". Please test if you can!
>>>>>
>>>>> Thanks,
>>>>> Tom
>>>>
>>>> Hello Tom,
>>>>
>>>> Which are you use the repos? I pulled but I don't seen this patches.
>>> They are not in any public repo yet. I posted the patches to netdev
>>> list so they can be reviewed and tested by third parties. Posting
>>> patches to the list a normal path to get patches into the kernel
>>> (http://nickdesaulniers.github.io/blog/2017/05/16/submitting-your-first-patch-to-the-linux-kernel-and-responding-to-feedback/).
>>>
>>> These patches were applied to net-next but are simple enough that they
>>> should apply to other branches. I will repost and target to net per
>>> Dave's directive once they are verified to fix the issue.
>
> Hello,
>
> thanks Tom and I have tested the fixed patch for linux-next builds and don't have to kernel panic. when nocheck funcs call sk_lock.owned and kernel doesn't give a panic. I have compiled and uploaded next-kernel.
>
> Dmitry,
> could you test it on linux-next?

If you are trying to test how many times I can repeat this, I can
repeat this lots of times:

If you think the patch needs additional testing, you can ask syzbot to
test it. See https://github.com/google/syzkaller/blob/master/docs/syzbot.md#communication-with-syzbot

2017-10-30 22:41:05

by Tom Herbert

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On Mon, Oct 30, 2017 at 2:44 PM, John Fastabend
<[email protected]> wrote:
> On 10/24/2017 08:20 AM, syzbot wrote:
>> Hello,
>>
>> syzkaller hit the following crash on 73d3393ada4f70fa3df5639c8d438f2f034c0ecb
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> C reproducer is attached
>> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
>> for information about syzkaller reproducers
>>
>>
>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_me include/net/sock.h:1505 [inline]
>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_user include/net/sock.h:1511 [inline]
>> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 0 PID: 2996 Comm: syzkaller142210 Not tainted 4.14.0-rc5+ #138
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>> <IRQ>
>> __dump_stack lib/dump_stack.c:16 [inline]
>> dump_stack+0x194/0x257 lib/dump_stack.c:52
>> panic+0x1e4/0x417 kernel/panic.c:181
>> __warn+0x1c4/0x1d9 kernel/panic.c:542
>> report_bug+0x211/0x2d0 lib/bug.c:183
>> fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
>> do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
>> do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
>> do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
>> do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
>> invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
>> RIP: 0010:sock_owned_by_me include/net/sock.h:1505 [inline]
>> RIP: 0010:sock_owned_by_user include/net/sock.h:1511 [inline]
>> RIP: 0010:strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
>> RSP: 0018:ffff8801db206b18 EFLAGS: 00010206
>> RAX: ffff8801d1e02080 RBX: ffff8801dad74c48 RCX: 0000000000000000
>> RDX: 0000000000000100 RSI: ffff8801d29fa0a0 RDI: ffffffff85cbede0
>> RBP: ffff8801db206b38 R08: 0000000000000005 R09: 1ffffffff0ce0bcd
>> R10: ffff8801db206a00 R11: dffffc0000000000 R12: ffff8801d29fa000
>> R13: ffff8801dad74c50 R14: ffff8801d4350a92 R15: 0000000000000001
>> psock_data_ready+0x56/0x70 net/kcm/kcmsock.c:353
>
> Looks like KCM is calling sk_data_ready() without first taking the
> sock lock.
>
> /* Called with lower sock held */
> static void kcm_rcv_strparser(struct strparser *strp, struct sk_buff *skb)
> {
> [...]
> if (kcm_queue_rcv_skb(&kcm->sk, skb)) {
>
> In this case kcm->sk is not the same lock the comment is referring to.
> And kcm_queue_rcv_skb() will eventually call sk_data_ready().
>
> @Tom, how about wrapping the sk_data_ready call in {lock|release}_sock?
> I don't have anything better in mind immediately.
>
The sock locks are taken in reverse order in the send path so so
grabbing kcm sock lock with lower lock held to call sk_data_ready may
lead to deadlock like I think.

It might be possible to change the order in the send path to do this.
Something like:

trylock on lower socket lock
-if trylock fails
- release kcm sock lock
- lock lower sock
- lock kcm sock
- call sendpage locked function

I admit that dealing with two levels of socket locks in the data path
is quite a pain :-)

Tom

> Thanks,
> John

From 1582720638726964977@xxx Mon Oct 30 21:47:03 +0000 2017
X-GM-THRID: 1582152825737332416
X-Gmail-Labels: Inbox,Category Forums

2017-10-30 21:47:03

by John Fastabend

[permalink] [raw]
Subject: Re: WARNING in strp_data_ready

On 10/24/2017 08:20 AM, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on 73d3393ada4f70fa3df5639c8d438f2f034c0ecb
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_me include/net/sock.h:1505 [inline]
> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 sock_owned_by_user include/net/sock.h:1511 [inline]
> WARNING: CPU: 0 PID: 2996 at ./include/net/sock.h:1505 strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 0 PID: 2996 Comm: syzkaller142210 Not tainted 4.14.0-rc5+ #138
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:52
>  panic+0x1e4/0x417 kernel/panic.c:181
>  __warn+0x1c4/0x1d9 kernel/panic.c:542
>  report_bug+0x211/0x2d0 lib/bug.c:183
>  fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178
>  do_trap_no_signal arch/x86/kernel/traps.c:212 [inline]
>  do_trap+0x260/0x390 arch/x86/kernel/traps.c:261
>  do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311
>  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
> RIP: 0010:sock_owned_by_me include/net/sock.h:1505 [inline]
> RIP: 0010:sock_owned_by_user include/net/sock.h:1511 [inline]
> RIP: 0010:strp_data_ready+0x2b7/0x390 net/strparser/strparser.c:404
> RSP: 0018:ffff8801db206b18 EFLAGS: 00010206
> RAX: ffff8801d1e02080 RBX: ffff8801dad74c48 RCX: 0000000000000000
> RDX: 0000000000000100 RSI: ffff8801d29fa0a0 RDI: ffffffff85cbede0
> RBP: ffff8801db206b38 R08: 0000000000000005 R09: 1ffffffff0ce0bcd
> R10: ffff8801db206a00 R11: dffffc0000000000 R12: ffff8801d29fa000
> R13: ffff8801dad74c50 R14: ffff8801d4350a92 R15: 0000000000000001
>  psock_data_ready+0x56/0x70 net/kcm/kcmsock.c:353

Looks like KCM is calling sk_data_ready() without first taking the
sock lock.

/* Called with lower sock held */
static void kcm_rcv_strparser(struct strparser *strp, struct sk_buff *skb)
{
[...]
if (kcm_queue_rcv_skb(&kcm->sk, skb)) {

In this case kcm->sk is not the same lock the comment is referring to.
And kcm_queue_rcv_skb() will eventually call sk_data_ready().

@Tom, how about wrapping the sk_data_ready call in {lock|release}_sock?
I don't have anything better in mind immediately.

Thanks,
John

From 1582152825737332416@xxx Tue Oct 24 15:21:54 +0000 2017
X-GM-THRID: 1582152825737332416
X-Gmail-Labels: Inbox,Category Forums