2017-09-12 22:35:45

by Laura Abbott

[permalink] [raw]
Subject: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

Hi,

Fedora got a bug report
https://bugzilla.redhat.com/show_bug.cgi?id=1432684 of a regression with
automatic spice port
assignment. The libvirt team reduced this to the attached test
case run as follows:

In a separate terminal, qemu-kvm -vnc 127.0.0.1:0 to grab port 5900.
Then do this:

$ gcc bind-collision.c && ./a.out
bind: Address already in use
AF_INET check failed.
$ gcc -D CHECK_IPV6 bind-collision.c && ./a.out
AF_INET6 success
AF_INET success
$ gcc bind-collision.c && ./a.out
AF_INET success

Bisection showed this behavior to be caused by

commit 319554f284dda9f2737d09df82ba3610bd8ddea3
Author: Josef Bacik <[email protected]>
Date: Thu Jan 19 17:47:46 2017 -0500

inet: don't use sk_v6_rcv_saddr directly

When comparing two sockets we need to use inet6_rcv_saddr so we get
a NULL
sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our
comparison function
can be wrong.

Fixes: 637bc8b ("inet: reset tb->fastreuseport when adding a
reuseport sk")
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David S. Miller <[email protected]>


And reverting fixed both the standalone test case and the spice issue.

Any ideas?

Thanks,
Laura


Attachments:
bind-collision.c (1.98 kB)

2017-09-12 23:13:42

by Josef Bacik

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

First I’m super sorry for the top post, I’m at plumbers and I forgot to upload my muttrc to my new cloud instance, so I’m screwed using outlook.

I have a completely untested, uncompiled patch that I think will fix the problem, would you mind giving it a go? Thanks,

Josef

On 9/12/17, 3:36 PM, "Laura Abbott" <[email protected]> wrote:

Hi,

Fedora got a bug report
https://bugzilla.redhat.com/show_bug.cgi?id=1432684 of a regression with
automatic spice port
assignment. The libvirt team reduced this to the attached test
case run as follows:

In a separate terminal, qemu-kvm -vnc 127.0.0.1:0 to grab port 5900.
Then do this:

$ gcc bind-collision.c && ./a.out
bind: Address already in use
AF_INET check failed.
$ gcc -D CHECK_IPV6 bind-collision.c && ./a.out
AF_INET6 success
AF_INET success
$ gcc bind-collision.c && ./a.out
AF_INET success

Bisection showed this behavior to be caused by

commit 319554f284dda9f2737d09df82ba3610bd8ddea3
Author: Josef Bacik <[email protected]>
Date: Thu Jan 19 17:47:46 2017 -0500

inet: don't use sk_v6_rcv_saddr directly

When comparing two sockets we need to use inet6_rcv_saddr so we get
a NULL
sk_v6_rcv_saddr if the socket isn't AF_INET6, otherwise our
comparison function
can be wrong.

Fixes: 637bc8b ("inet: reset tb->fastreuseport when adding a
reuseport sk")
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David S. Miller <[email protected]>


And reverting fixed both the standalone test case and the spice issue.

Any ideas?

Thanks,
Laura



Attachments:
0001-net-set-tb-fast_sk_family.patch (1.28 kB)
0001-net-set-tb-fast_sk_family.patch

2017-09-13 15:44:56

by Laura Abbott

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On 09/12/2017 04:12 PM, Josef Bacik wrote:
> First I’m super sorry for the top post, I’m at plumbers and I forgot to upload my muttrc to my new cloud instance, so I’m screwed using outlook.
>
> I have a completely untested, uncompiled patch that I think will fix the problem, would you mind giving it a go? Thanks,
>
> Josef

Thanks for the quick turnaround. Unfortunately, the problem is still
reproducible according to the reporter.

Thanks,
Laura

2017-09-13 17:29:01

by Josef Bacik

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

Sorry I thought I had made this other fix, can you apply this on top of the other one and try that? I have more things to try if this doesn’t work, sorry you are playing go between, but I want to make sure I know _which_ fix actually fixes the problem, and then clean up in followup patches. Thanks,

Josef

On 9/13/17, 8:45 AM, "Laura Abbott" <[email protected]> wrote:

On 09/12/2017 04:12 PM, Josef Bacik wrote:
> First I’m super sorry for the top post, I’m at plumbers and I forgot to upload my muttrc to my new cloud instance, so I’m screwed using outlook.
>
> I have a completely untested, uncompiled patch that I think will fix the problem, would you mind giving it a go? Thanks,
>
> Josef

Thanks for the quick turnaround. Unfortunately, the problem is still
reproducible according to the reporter.

Thanks,
Laura



Attachments:
0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch (1.06 kB)
0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch

2017-09-13 17:40:22

by Cole Robinson

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On 09/13/2017 01:28 PM, Josef Bacik wrote:
> Sorry I thought I had made this other fix, can you apply this on top of the other one and try that? I have more things to try if this doesn’t work, sorry you are playing go between, but I want to make sure I know _which_ fix actually fixes the problem, and then clean up in followup patches. Thanks,
>

I'm the bug reporter. I'll combine the two patches and report back

Thanks,
Cole

2017-09-13 19:13:47

by Cole Robinson

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On 09/13/2017 01:40 PM, Cole Robinson wrote:
> On 09/13/2017 01:28 PM, Josef Bacik wrote:
>> Sorry I thought I had made this other fix, can you apply this on top of the other one and try that? I have more things to try if this doesn’t work, sorry you are playing go between, but I want to make sure I know _which_ fix actually fixes the problem, and then clean up in followup patches. Thanks,
>>
>
> I'm the bug reporter. I'll combine the two patches and report back
>

Nope, issue is still present with both patches applied. Tried my own build and
a package Laura provided

Thanks,
Cole

2017-09-13 19:44:55

by Josef Bacik

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

Alright thanks, this should fix it.

Josef

On 9/13/17, 12:14 PM, "Cole Robinson" <[email protected]> wrote:

On 09/13/2017 01:40 PM, Cole Robinson wrote:
> On 09/13/2017 01:28 PM, Josef Bacik wrote:
>> Sorry I thought I had made this other fix, can you apply this on top of the other one and try that? I have more things to try if this doesn’t work, sorry you are playing go between, but I want to make sure I know _which_ fix actually fixes the problem, and then clean up in followup patches. Thanks,
>>
>
> I'm the bug reporter. I'll combine the two patches and report back
>

Nope, issue is still present with both patches applied. Tried my own build and
a package Laura provided

Thanks,
Cole




Attachments:
0001-net-don-t-fast-patch-mismatched-sockets-in-STRICT-mo.patch (1.15 kB)
0001-net-don-t-fast-patch-mismatched-sockets-in-STRICT-mo.patch

2017-09-13 19:46:10

by Chuck Ebbert

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On Wed, 13 Sep 2017 17:28:25 +0000
Josef Bacik <[email protected]> wrote:

> Sorry I thought I had made this other fix, can you apply this on top
> of the other one and try that? I have more things to try if this
> doesn’t work, sorry you are playing go between, but I want to make
> sure I know _which_ fix actually fixes the problem, and then clean up
> in followup patches. Thanks,
>
> Josef
>
> On 9/13/17, 8:45 AM, "Laura Abbott" <[email protected]> wrote:
>
> On 09/12/2017 04:12 PM, Josef Bacik wrote:
> > First I’m super sorry for the top post, I’m at plumbers and I
> > forgot to upload my muttrc to my new cloud instance, so I’m screwed
> > using outlook.
> >
> > I have a completely untested, uncompiled patch that I think will
> > fix the problem, would you mind giving it a go? Thanks,
> >
> > Josef
>
> Thanks for the quick turnaround. Unfortunately, the problem is still
> reproducible according to the reporter.
>
> Thanks,
> Laura

I am confused by the patch that originally caused this:

if (sk->sk_family == AF_INET6)
return ipv6_rcv_saddr_equal(&sk->sk_v6_rcv_saddr,
- &sk2->sk_v6_rcv_saddr,
+ inet6_rcv_saddr(sk2),
sk->sk_rcv_saddr,
sk2->sk_rcv_saddr,

Shouldn't the first argument also be changed to use inet6_rcv_saddr()?




2017-09-13 20:11:54

by Josef Bacik

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression


> On Sep 13, 2017, at 12:46 PM, Chuck Ebbert <[email protected]> wrote:
>
> On Wed, 13 Sep 2017 17:28:25 +0000
> Josef Bacik <[email protected]> wrote:
>
>> Sorry I thought I had made this other fix, can you apply this on top
>> of the other one and try that? I have more things to try if this
>> doesn’t work, sorry you are playing go between, but I want to make
>> sure I know _which_ fix actually fixes the problem, and then clean up
>> in followup patches. Thanks,
>>
>> Josef
>>
>> On 9/13/17, 8:45 AM, "Laura Abbott" <[email protected]> wrote:
>>
>> On 09/12/2017 04:12 PM, Josef Bacik wrote:
>>> First I’m super sorry for the top post, I’m at plumbers and I
>>> forgot to upload my muttrc to my new cloud instance, so I’m screwed
>>> using outlook.
>>>
>>> I have a completely untested, uncompiled patch that I think will
>>> fix the problem, would you mind giving it a go? Thanks,
>>>
>>> Josef
>>
>> Thanks for the quick turnaround. Unfortunately, the problem is still
>> reproducible according to the reporter.
>>
>> Thanks,
>> Laura
>
> I am confused by the patch that originally caused this:
>
> if (sk->sk_family == AF_INET6)
> return ipv6_rcv_saddr_equal(&sk->sk_v6_rcv_saddr,
> - &sk2->sk_v6_rcv_saddr,
> + inet6_rcv_saddr(sk2),
> sk->sk_rcv_saddr,
> sk2->sk_rcv_saddr,
>
> Shouldn't the first argument also be changed to use inet6_rcv_saddr()?

No we know sk is IPv6 so it's alright to use directly. Thanks,

Josef

2017-09-13 22:49:39

by Cole Robinson

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On 09/13/2017 03:44 PM, Josef Bacik wrote:
> Alright thanks, this should fix it.
>

Still no luck with all three patches applied to fedora 4.12.8-300 RPM. Pretty
sure I didn't mess up the testing but since I rarely do kernel builds it's not
impossible...

Thanks,
Cole

2017-09-15 17:52:46

by Josef Bacik

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

Finally got access to a box to run this down myself. This patch on top of the other patches fixes the problem for me, could you verify it works for you? Thanks,

Josef

On 9/13/17, 3:49 PM, "Cole Robinson" <[email protected]> wrote:

On 09/13/2017 03:44 PM, Josef Bacik wrote:
> Alright thanks, this should fix it.
>

Still no luck with all three patches applied to fedora 4.12.8-300 RPM. Pretty
sure I didn't mess up the testing but since I rarely do kernel builds it's not
impossible...

Thanks,
Cole




Attachments:
0001-net-call-sk_reuseport_match-if-we-are-a-reusesock.patch (1.19 kB)
0001-net-call-sk_reuseport_match-if-we-are-a-reusesock.patch

2017-09-17 13:17:17

by Cole Robinson

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On 09/15/2017 01:51 PM, Josef Bacik wrote:
> Finally got access to a box to run this down myself. This patch on top of the other patches fixes the problem for me, could you verify it works for you? Thanks,
>

Yup I can confirm that patch fixes things when applied on top of the
previous 3 patches. Thanks! Please tag those patches for stable releases
if appropriate, this is affecting a decent amount of libvirt users

Thanks,
Cole

2017-09-18 08:20:10

by Marc Haber

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

On Sun, Sep 17, 2017 at 09:17:13AM -0400, Cole Robinson wrote:
> On 09/15/2017 01:51 PM, Josef Bacik wrote:
> > Finally got access to a box to run this down myself. This patch on top of the other patches fixes the problem for me, could you verify it works for you? Thanks,
> >
>
> Yup I can confirm that patch fixes things when applied on top of the
> previous 3 patches. Thanks! Please tag those patches for stable releases
> if appropriate, this is affecting a decent amount of libvirt users

I can also confirm that these four patches fix things for me (on
Debian) as well. Thanks!

I would love to have this in one of Greg's next 4.13 releases.

Greetings
Marc

--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421

2017-11-13 08:10:40

by Marc Haber

[permalink] [raw]
Subject: Re: 319554f284dd ("inet: don't use sk_v6_rcv_saddr directly") causes bind port regression

Hi,

those four patches never made it into any 4.13 release.

0001-net-call-sk_reuseport_match-if-we-are-a-reusesock.patch
0001-net-don-t-fast-patch-mismatched-sockets-in-STRICT-mo.patch
0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch
0001-net-set-tb-fast_sk_family.patch

And I have just seen that the first two are not even in 4.14. What does
that mean for libvirt users on systems runnign a 4.14 kernel?

The third and fourth patch
(0001-net-use-inet6_rcv_saddr-to-compare-sockets.patch and
0001-net-set-tb-fast_sk_family.patch) seem to be in 4.14.

Greetings
Marc

On Mon, Sep 18, 2017 at 10:02:32AM +0200, Marc Haber wrote:
> On Sun, Sep 17, 2017 at 09:17:13AM -0400, Cole Robinson wrote:
> > On 09/15/2017 01:51 PM, Josef Bacik wrote:
> > > Finally got access to a box to run this down myself. This patch on top of the other patches fixes the problem for me, could you verify it works for you? Thanks,
> > >
> >
> > Yup I can confirm that patch fixes things when applied on top of the
> > previous 3 patches. Thanks! Please tag those patches for stable releases
> > if appropriate, this is affecting a decent amount of libvirt users
>
> I can also confirm that these four patches fix things for me (on
> Debian) as well. Thanks!
>
> I would love to have this in one of Greg's next 4.13 releases.

--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421

From 1578866232400623675@xxx Mon Sep 18 08:42:54 +0000 2017
X-GM-THRID: 1578375222454021652
X-Gmail-Labels: Inbox,Category Forums