LinuxLists.cc - [v2] nbd: fix potential NULL pointer fault in nbd_genl

2020-01-20 12:48:48

Subject: [v2] nbd: fix potential NULL pointer fault in nbd_genl_disconnect

Open /dev/nbdX first, the config_refs will be 1 and
the pointers in nbd_device are still null. Disconnect
/dev/nbdX, then reference a null recv_workq. The
protection by config_refs in nbd_genl_disconnect is useless.

To fix it, just add a check for a non null task_recv in
nbd_genl_disconnect.

Signed-off-by: Sun Ke <[email protected]>
---
v1 -> v2:

add an omitted mutex_unlock.
---
drivers/block/nbd.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b4607dd96185..668bc9cb92ed 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -2008,6 +2008,10 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
index);
return -EINVAL;
}
+ if (!nbd->task_recv) {
+ mutex_unlock(&nbd_index_mutex);
+ return -EINVAL;
+ }
if (!refcount_inc_not_zero(&nbd->refs)) {
mutex_unlock(&nbd_index_mutex);
printk(KERN_ERR "nbd: device at index %d is going down\n",
--
2.17.2

2020-01-21 14:12:41

by Josef Bacik

[permalink] [raw]

Subject: Re: [v2] nbd: fix potential NULL pointer fault in nbd_genl_disconnect

On 1/20/20 7:45 AM, Sun Ke wrote:
> Open /dev/nbdX first, the config_refs will be 1 and
> the pointers in nbd_device are still null. Disconnect
> /dev/nbdX, then reference a null recv_workq. The
> protection by config_refs in nbd_genl_disconnect is useless.
>
> To fix it, just add a check for a non null task_recv in
> nbd_genl_disconnect.
>
> Signed-off-by: Sun Ke <[email protected]>
> ---
> v1 -> v2:
>
> add an omitted mutex_unlock.
> ---
> drivers/block/nbd.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index b4607dd96185..668bc9cb92ed 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -2008,6 +2008,10 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
> index);
> return -EINVAL;
> }
> + if (!nbd->task_recv) {
> + mutex_unlock(&nbd_index_mutex);
> + return -EINVAL;
> + }
> if (!refcount_inc_not_zero(&nbd->refs)) {
> mutex_unlock(&nbd_index_mutex);
> printk(KERN_ERR "nbd: device at index %d is going down\n",
>

This doesn't even really protect us, we need to have the nbd->config_lock held
here to make sure it's ok. The IOCTL path is safe because it creates the device
on open so it's sure to exist by the time we get to the disconnect, we don't
have that for genl_disconnect. So I'd add the config_mutex before getting the
config_ref, and then do the check, something like

mutex_lock(&nbd->config_lock);
if (!refcount_inc_not_zero(&nbd->refs)) {
}
if (!nbd->recv_workq) {
}
mutex_unlock(&nbd->config_lock);

Thanks,

Josef

2020-01-22 01:09:15

by Mike Christie

[permalink] [raw]

Subject: Re: [v2] nbd: fix potential NULL pointer fault in nbd_genl_disconnect

On 01/21/2020 08:09 AM, Josef Bacik wrote:
> On 1/20/20 7:45 AM, Sun Ke wrote:
>> Open /dev/nbdX first, the config_refs will be 1 and
>> the pointers in nbd_device are still null. Disconnect
>> /dev/nbdX, then reference a null recv_workq. The
>> protection by config_refs in nbd_genl_disconnect is useless.
>>
>> To fix it, just add a check for a non null task_recv in
>> nbd_genl_disconnect.
>>
>> Signed-off-by: Sun Ke <[email protected]>
>> ---
>> v1 -> v2:
>>
>> add an omitted mutex_unlock.
>> ---
>> drivers/block/nbd.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index b4607dd96185..668bc9cb92ed 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -2008,6 +2008,10 @@ static int nbd_genl_disconnect(struct sk_buff
>> *skb, struct genl_info *info)
>> index);
>> return -EINVAL;
>> }
>> + if (!nbd->task_recv) {
>> + mutex_unlock(&nbd_index_mutex);
>> + return -EINVAL;
>> + }
>> if (!refcount_inc_not_zero(&nbd->refs)) {
>> mutex_unlock(&nbd_index_mutex);
>> printk(KERN_ERR "nbd: device at index %d is going down\n",
>>
>
> This doesn't even really protect us, we need to have the
> nbd->config_lock held here to make sure it's ok. The IOCTL path is safe
> because it creates the device on open so it's sure to exist by the time
> we get to the disconnect, we don't have that for genl_disconnect. So
> I'd add the config_mutex before getting the config_ref, and then do the
> check, something like
>
> mutex_lock(&nbd->config_lock);
> if (!refcount_inc_not_zero(&nbd->refs)) {
> }
> if (!nbd->recv_workq) {
> }
> mutex_unlock(&nbd->config_lock);
>

We will be doing a mix of checks/behavior. Maybe we want to settle on one?

It seems the code, before my patch, would let you do a open() then do a
nbd_genl_disconnect. This function would then try to cleanup what it
could and return success.

To keep the current behavior/style in nbd_disconnect_and_put would you
want to do:

nbd_disconnect_and_put()

....

if (nbd->task_recv)
flush_workqueue(nbd->recv_workq);

?

Alternatively, I think if we want to make it so calling
nbd_genl_disconnect is not allowed on a device that we have not done a
successful nbd_genl_connect/nbd_start_device call on then we want to add
the new state bit to indicate nbd_start_device was successful.

Or, we could stick to one variable that gets set at start and always use
that to indicate nbd_start_device was called ok. For example, for
nbd_genl_reconfigure we already check if task_recv is set to check if
nbd_start_device was called successfully.