2015-12-30 11:18:34

by Andy Lutomirski

Subject: Linux 4.4-rc4 regression, bisected to "net: fix sock_wake_async() rcu protection"

On recent v4.4-rc releases, I can't run emacs. No, really, running
"emacs" in a GNOME 3 session makes gnome-shell think that emacs is
running, but no window is drawn, and the overall system UI is a bit
weird when the invisible emacs window is focused.

This is 100% reproducible.

There might be other symptoms involving gdb malfunctioning, but those
are, at best, sporadic. The emacs failure is entirely reliable. I
have no idea what the underlying failure mode is, but failure to wake
a socket waiter seems plausible. I also have no idea why oocalc,
gimp, vim, gedit, firefox, etc. aren't affected.

A somewhat unorthodox "git bisect" run blames:

commit ceb5d58b217098a657f3850b7a2640f995032e62
Author: Eric Dumazet <[email protected]>
Date: Sun Nov 29 20:03:11 2015 -0800

net: fix sock_wake_async() rcu protection

I've confirmed that v4.4-rc7 with that patch reverted works fine.

Since the offending commit was apparently a security fix, simply
reverting it might not be the best idea.

--Andy


2015-12-30 11:32:15

by Nicolai Stange

Subject: Re: Linux 4.4-rc4 regression, bisected to "net: fix sock_wake_async() rcu protection"

Andy Lutomirski <[email protected]> writes:

> On recent v4.4-rc releases, I can't run emacs. No, really, running
> "emacs" in a GNOME 3 session makes gnome-shell think that emacs is
> running, but no window is drawn, and the overall system UI is a bit
> weird when the invisible emacs window is focused.
>
> This is 100% reproducible.
>
> There might be other symptoms involving gdb malfunctioning, but those
> are, at best, sporadic. The emacs failure is entirely reliable. I
> have no idea what the underlying failure mode is, but failure to wake
> a socket waiter seems plausible. I also have no idea why oocalc,
> gimp, vim, gedit, firefox, etc. aren't affected.
>
> A somewhat unorthodox "git bisect" run blames:
>
> commit ceb5d58b217098a657f3850b7a2640f995032e62
> Author: Eric Dumazet <[email protected]>
> Date: Sun Nov 29 20:03:11 2015 -0800
>
> net: fix sock_wake_async() rcu protection
>
> I've confirmed that v4.4-rc7 with that patch reverted works fine.
>
> Since the offending commit was apparently a security fix, simply
> reverting it might not be the best idea.

Please have a look at https://lkml.kernel.org/g/[email protected]

I ran into the same issue and this one fixes it for me.

Best,

Nicolai

2015-12-30 13:55:07

by Eric Dumazet

Subject: Re: Linux 4.4-rc4 regression, bisected to "net: fix sock_wake_async() rcu protection"

On Wed, 2015-12-30 at 12:32 +0100, Nicolai Stange wrote:
> Andy Lutomirski <[email protected]> writes:
>
> > On recent v4.4-rc releases, I can't run emacs. No, really, running
> > "emacs" in a GNOME 3 session makes gnome-shell think that emacs is
> > running, but no window is drawn, and the overall system UI is a bit
> > weird when the invisible emacs window is focused.
> >
> > This is 100% reproducible.
> >
> > There might be other symptoms involving gdb malfunctioning, but those
> > are, at best, sporadic. The emacs failure is entirely reliable. I
> > have no idea what the underlying failure mode is, but failure to wake
> > a socket waiter seems plausible. I also have no idea why oocalc,
> > gimp, vim, gedit, firefox, etc. aren't affected.
> >
> > A somewhat unorthodox "git bisect" run blames:
> >
> > commit ceb5d58b217098a657f3850b7a2640f995032e62
> > Author: Eric Dumazet <[email protected]>
> > Date: Sun Nov 29 20:03:11 2015 -0800
> >
> > net: fix sock_wake_async() rcu protection
> >
> > I've confirmed that v4.4-rc7 with that patch reverted works fine.
> >
> > Since the offending commit was apparently a security fix, simply
> > reverting it might not be the best idea.
>
> Please have a look at https://lkml.kernel.org/g/[email protected]
>
> I ran into the same issue and this one fixes it for me.

Right, and the ozlabs pointers for this were:

v1:
https://patchwork.ozlabs.org/patch/561194/

v2:
https://patchwork.ozlabs.org/patch/561553/

Thanks.

2015-12-31 23:02:48

by Linus Torvalds

Subject: Re: Linux 4.4-rc4 regression, bisected to "net: fix sock_wake_async() rcu protection"

On Wed, Dec 30, 2015 at 5:55 AM, Eric Dumazet <[email protected]> wrote:
> On Wed, 2015-12-30 at 12:32 +0100, Nicolai Stange wrote:
>>
>> Please have a look at https://lkml.kernel.org/g/[email protected]
>>
>> I ran into the same issue and this one fixes it for me.
>
> Right, and the ozlabs pointers for this were:
> v2:
> https://patchwork.ozlabs.org/patch/561553/

Ok, that's merged into my tree through the networking merge today.

Andy, if you still see it with current -git, holler.

Linus