2008-03-26 18:43:35

by Mike Snitzer

[permalink] [raw]
Subject: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive

I'm seeing that nbd_device's socket is getting set to NULL in the
middle of nbd_read_stat()'s sock_xmit().

There appears to be a race where 'nbd-client -d' requests that an NBD
device first disconnect from the nbd-server (via NBD_DISCONNECT ioctl)
and then set the NBD device's socket to NULL, etc (via
NBD_CLEAR_SOCK).

Both NBD_DISCONNECT and NBD_CLEAR_SOCK take the nbd_device's tx_lock
(which protects the socket during transmits) _but_ for receives the
socket can be set to NULL (via NBD_CLEAR_SOCK) at any time while
inside sock_xmit(); as such NBD_CLEAR_SOCK can cause a NULL pointer in
sock_xmit().

Analyzing the crash it is clear that the NULL pointer comes when
sock_xmit()'s do {} while() dereferences the nbd_device's socket with:
sock->sk->sk_allocation = GFP_NOIO;
I also saw that the sock_xmit() caller is nbd_read_stat().

The sequence looks like this:

nbd1: NBD_DISCONNECT
[NOTE: a sock_xmit() send attempt is made on behalf of NBD_DISCONNECT]
nbd1: Send control failed (result -32)
...
[NBD is still dequeueing requests]
...
Race: [NBD_CLEAR_SOCK ioctl][FATAL: nbd_read_stat()'s sock_xmit()
receive attempt causes NULL pointer]

In practice this looks like:

nbd1: NBD_DISCONNECT
nbd1: Send control failed (result -32)
end_request: I/O error, dev nbd1, sector 0
end_request: I/O error, dev nbd1, sector 8032264
md: super_written gets error=-5, uptodate=0
raid1: Disk failure on nbd1, disabling device.
Operation continuing on 1 devices
Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
[<ffffffff88b1e125>] :nbd:sock_xmit+0x9d/0x301

The fact that sock_xmit() in receive mode is unprotected seems to be
the WHY a NULL pointer is possible; but I'm still trying to identify
the HOW.

But for me this begs the question: why isn't the nbd_device's socket
always protected during sock_xmit() for both
transmits and receives; rather than just transmits (via tx_lock)!?

Any help on the "right" fix would be appreciated, thanks.
Mike


2008-03-27 12:36:14

by Paul Clements

[permalink] [raw]
Subject: Re: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive

Mike Snitzer wrote:

> In practice this looks like:
>
> nbd1: NBD_DISCONNECT
> nbd1: Send control failed (result -32)
> end_request: I/O error, dev nbd1, sector 0
> end_request: I/O error, dev nbd1, sector 8032264
> md: super_written gets error=-5, uptodate=0
> raid1: Disk failure on nbd1, disabling device.
> Operation continuing on 1 devices
> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
> [<ffffffff88b1e125>] :nbd:sock_xmit+0x9d/0x301

> The fact that sock_xmit() in receive mode is unprotected seems to be
> the WHY a NULL pointer is possible; but I'm still trying to identify
> the HOW.

Do you know who is setting the socket NULL? Is it already NULL when you
get to this point? Is it the nbd-client -d? Is it the original
nbd-client/kernel that does it? Figuring that out would help narrow down
the cause.

> But for me this begs the question: why isn't the nbd_device's socket
> always protected during sock_xmit() for both
> transmits and receives; rather than just transmits (via tx_lock)!?

It would deadlock if we held the lock over both. Generally we don't have
to worry about receives, since they're always done in the nbd-client
process, so we have control over when and how it exits and cleans up.
The odd case, as you've discovered, is when another process (nbd-client
-d) comes along and starts mucking with the queue and socket. Would
"kill -9 <nbd-client-pid>" work for you instead? That is what I use to
break the connection, and it's safe, as it tells the original nbd-client
to exit (which it does cleanly and safely).

--
Paul

2008-03-27 13:21:35

by Mike Snitzer

[permalink] [raw]
Subject: Re: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive

On Thu, Mar 27, 2008 at 8:35 AM, Paul Clements
<[email protected]> wrote:
> Mike Snitzer wrote:
>
> > In practice this looks like:
> >
> > nbd1: NBD_DISCONNECT
> > nbd1: Send control failed (result -32)
> > end_request: I/O error, dev nbd1, sector 0
> > end_request: I/O error, dev nbd1, sector 8032264
> > md: super_written gets error=-5, uptodate=0
> > raid1: Disk failure on nbd1, disabling device.
> > Operation continuing on 1 devices
> > Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
> > [<ffffffff88b1e125>] :nbd:sock_xmit+0x9d/0x301
>
> > The fact that sock_xmit() in receive mode is unprotected seems to be
> > the WHY a NULL pointer is possible; but I'm still trying to identify
> > the HOW.
>
> Do you know who is setting the socket NULL? Is it already NULL when you
> get to this point? Is it the nbd-client -d? Is it the original
> nbd-client/kernel that does it? Figuring that out would help narrow down
> the cause.

I believe that NBD_CLEAR_SOCK from 'nbd-client -d' sets it to NULL.
lo->sock is already NULL on entry to sock_xmit().

So simply checking if the sock_xmit's 'sock' is NULL _should_ avoid
any possibility of a NULL pointer Oops because sock can't be !NULL
after the negative check (because of the sock = lo->sock assignment).
That is, unless I'm missing somewhere in the rest of the kernel (not
nbd) that would take action to set a socket to NULL?

The attached patch seems reasonable. I'll be testing today to verify
it fixes the problem.

> > But for me this begs the question: why isn't the nbd_device's socket
> > always protected during sock_xmit() for both
> > transmits and receives; rather than just transmits (via tx_lock)!?
>
> It would deadlock if we held the lock over both. Generally we don't have
> to worry about receives, since they're always done in the nbd-client
> process, so we have control over when and how it exits and cleans up.
> The odd case, as you've discovered, is when another process (nbd-client
> -d) comes along and starts mucking with the queue and socket. Would
> "kill -9 <nbd-client-pid>" work for you instead? That is what I use to
> break the connection, and it's safe, as it tells the original nbd-client
> to exit (which it does cleanly and safely).

I'm aware tx_lock can't be held over both; I was suggesting maybe
another lock but that feels like overkill.

I use 'nbd-client -d' and then resort to 'kill -9' IFF 'nbd-client -d'
returned non-zero.
But it sounds like simply using 'kill -9' could be a near-term
workaround, I'll try this as well and will report back.

thanks,
Mike


Attachments:
(No filename) (2.57 kB)
nbd_sock_xmit_oops.patch (611.00 B)
Download all attachments

2008-03-27 21:12:46

by Mike Snitzer

[permalink] [raw]
Subject: Re: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive

On Thu, Mar 27, 2008 at 9:21 AM, Mike Snitzer <[email protected]> wrote:
> On Thu, Mar 27, 2008 at 8:35 AM, Paul Clements
> <[email protected]> wrote:
> > Mike Snitzer wrote:
> >
> > > In practice this looks like:
> > >
> > > nbd1: NBD_DISCONNECT
> > > nbd1: Send control failed (result -32)
> > > end_request: I/O error, dev nbd1, sector 0
> > > end_request: I/O error, dev nbd1, sector 8032264
> > > md: super_written gets error=-5, uptodate=0
> > > raid1: Disk failure on nbd1, disabling device.
> > > Operation continuing on 1 devices
> > > Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
> > > [<ffffffff88b1e125>] :nbd:sock_xmit+0x9d/0x301
> >
> > > The fact that sock_xmit() in receive mode is unprotected seems to be
> > > the WHY a NULL pointer is possible; but I'm still trying to identify
> > > the HOW.
> >
> > Do you know who is setting the socket NULL? Is it already NULL when you
> > get to this point? Is it the nbd-client -d? Is it the original
> > nbd-client/kernel that does it? Figuring that out would help narrow down
> > the cause.
>
> I believe that NBD_CLEAR_SOCK from 'nbd-client -d' sets it to NULL.
> lo->sock is already NULL on entry to sock_xmit().
>
> So simply checking if the sock_xmit's 'sock' is NULL _should_ avoid
> any possibility of a NULL pointer Oops because sock can't be !NULL
> after the negative check (because of the sock = lo->sock assignment).
> That is, unless I'm missing somewhere in the rest of the kernel (not
> nbd) that would take action to set a socket to NULL?
>
> The attached patch seems reasonable. I'll be testing today to verify
> it fixes the problem.

The nbd_sock_xmit_oops.patch from my previous email does fix this issue.

I verified this by making the race window wider with the attached patch.

The sequence to hit this race (with a patched nbd) is:

# insmod ./nbd_exposed.ko
# nbd-client 192.168.114.152 9011 /dev/nbd1
Negotiation: ..size = 3148739KB
bs=1024, sz=3148739
# dd if=/dev/nbd1 of=/dev/null &
[ NOTE: wait to see "sleeping 30s before recv in sock_xmit" in the syslog ]
# nbd-client -d /dev/nbd1

In the system log you'll see something like:

nbd: registered device at major 43
nbd1:
nbd1: sleeping 30s before recv in sock_xmit
nbd1: NBD_DISCONNECT
nbd1: sleeping 30s after setting sock to NULL in NBD_CLEAR_SOCK
nbd1: sleeping 30s before recv in sock_xmit
Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
[<ffffffff883a917c>] :nbd:sock_xmit+0xf0/0x33e
PGD 141b52067 PUD 144eaf067 PMD 0
Oops: 0000 [1] SMP
...

With the nbd_sock_xmit_oops.patch applied this Oops no longer happens.
I think my original patch's use of EINVAL should be changed to EIO.
Any other comments on the fix are welcome.

Paul, would you like an updated patch that you push upstream? Or
should I just submit the patch and Cc: you?

Mike


Attachments:
(No filename) (2.84 kB)
nbd_sock_xmit_expose_race.patch (1.18 kB)
Download all attachments

2008-03-28 03:17:50

by Paul Clements

[permalink] [raw]
Subject: Re: nbd: Oops because nbd doesn't prevent NBD_CLEAR_SOCK while sock_xmit() is working on a receive

Mike Snitzer wrote:

> With the nbd_sock_xmit_oops.patch applied this Oops no longer happens.
> I think my original patch's use of EINVAL should be changed to EIO.
> Any other comments on the fix are welcome.

No, EINVAL is fine, so the original patch looks good to me.

> Paul, would you like an updated patch that you push upstream? Or
> should I just submit the patch and Cc: you?

The original one is fine, go ahead and send it.

Thanks,
Paul