LinuxLists.cc - [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup

2021-07-15 05:18:39

Subject: [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup_queue()

Now that error handling for add_disk*() calls is added, we must
accept a common form for when errors are detected on the the
add_disk*() calls, and that is to call blk_cleanup_disk() on
error always. One of the corner cases possible is a driver bug
where the queue is already gone and we cannot blk_get_queue(),
and so may be NULL. When blk_cleanup_disk() is called in this
case blk_cleanup_queue() will crash with a null dereference.

Make this an accepted condition and just skip it. This allows us
to also test for it safely with error injection.

Signed-off-by: Luis Chamberlain <[email protected]>
---
block/blk-core.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/block/blk-core.c b/block/blk-core.c
index 04477697ee4b..156f7d3b4bd9 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -372,6 +372,9 @@ void blk_cleanup_queue(struct request_queue *q)
/* cannot be called from atomic context */
might_sleep();

+ if (!q)
+ return;
+
WARN_ON_ONCE(blk_queue_registered(q));

/* mark @q DYING, no new request or merges will be allowed afterwards */
--
2.27.0

2021-07-15 08:26:35

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup_queue()

On Wed, Jul 14, 2021 at 09:55:31PM -0700, Luis Chamberlain wrote:
> Now that error handling for add_disk*() calls is added, we must
> accept a common form for when errors are detected on the the
> add_disk*() calls, and that is to call blk_cleanup_disk() on
> error always. One of the corner cases possible is a driver bug
> where the queue is already gone and we cannot blk_get_queue(),
> and so may be NULL. When blk_cleanup_disk() is called in this
> case blk_cleanup_queue() will crash with a null dereference.
>
> Make this an accepted condition and just skip it. This allows us
> to also test for it safely with error injection.

So you plan to call blk_cleanup_disk when add_disk fails?

For all drivers using blk_alloc_disk/blk_mq_alloc_disk there should
always be a queue. The others ones aren't ready to handle errors
from add_disk yet in any way I think (and I plan to fix this up
ASAP).

2021-07-15 19:34:28

by Luis Chamberlain

[permalink] [raw]

Subject: Re: [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup_queue()

On Thu, Jul 15, 2021 at 08:11:57AM +0100, Christoph Hellwig wrote:
> On Wed, Jul 14, 2021 at 09:55:31PM -0700, Luis Chamberlain wrote:
> > Now that error handling for add_disk*() calls is added, we must
> > accept a common form for when errors are detected on the the
> > add_disk*() calls, and that is to call blk_cleanup_disk() on
> > error always. One of the corner cases possible is a driver bug
> > where the queue is already gone and we cannot blk_get_queue(),
> > and so may be NULL. When blk_cleanup_disk() is called in this
> > case blk_cleanup_queue() will crash with a null dereference.
> >
> > Make this an accepted condition and just skip it. This allows us
> > to also test for it safely with error injection.
>
> So you plan to call blk_cleanup_disk when add_disk fails?

Yes, they can open code things if they wish as well, but when possible yes.

> For all drivers using blk_alloc_disk/blk_mq_alloc_disk there should
> always be a queue. The others ones aren't ready to handle errors
> from add_disk yet in any way I think (and I plan to fix this up
> ASAP).

Have an example in mind?

Luis

2021-07-19 09:52:27

by Christoph Hellwig

[permalink] [raw]

Subject: Re: [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup_queue()

On Thu, Jul 15, 2021 at 12:07:26PM -0700, Luis Chamberlain wrote:
> > For all drivers using blk_alloc_disk/blk_mq_alloc_disk there should
> > always be a queue. The others ones aren't ready to handle errors
> > from add_disk yet in any way I think (and I plan to fix this up
> > ASAP).
>
> Have an example in mind?

The only ones left are nvme, dasd and scsi. NVMe is trivial (attached),
dasd needs a little more work, I need to send up a WIP to the
maintainers. scsi is the real problem and will require a fair amount
of work.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 11779be42186..35da956acef6 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3726,9 +3726,14 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
if (!ns)
goto out_free_id;

- ns->queue = blk_mq_init_queue(ctrl->tagset);
- if (IS_ERR(ns->queue))
+ disk = blk_mq_alloc_disk(ctrl->tagset, ns);
+ if (IS_ERR(disk))
goto out_free_ns;
+ disk->fops = &nvme_bdev_ops;
+ disk->private_data = ns;
+
+ ns->disk = disk;
+ ns->queue = disk->queue;

if (ctrl->opts && ctrl->opts->data_digest)
blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, ns->queue);
@@ -3737,20 +3742,12 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
if (ctrl->ops->flags & NVME_F_PCI_P2PDMA)
blk_queue_flag_set(QUEUE_FLAG_PCI_P2PDMA, ns->queue);

- ns->queue->queuedata = ns;
ns->ctrl = ctrl;
kref_init(&ns->kref);

if (nvme_init_ns_head(ns, nsid, ids, id->nmic & NVME_NS_NMIC_SHARED))
- goto out_free_queue;
-
- disk = alloc_disk_node(0, node);
- if (!disk)
- goto out_unlink_ns;
+ goto out_cleanup_disk;

- disk->fops = &nvme_bdev_ops;
- disk->private_data = ns;
- disk->queue = ns->queue;
/*
* Without the multipath code enabled, multiple controller per
* subsystems are visible as devices and thus we cannot use the
@@ -3759,15 +3756,14 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
if (!nvme_mpath_set_disk_name(ns, disk->disk_name, &disk->flags))
sprintf(disk->disk_name, "nvme%dn%d", ctrl->instance,
ns->head->instance);
- ns->disk = disk;

if (nvme_update_ns_info(ns, id))
- goto out_put_disk;
+ goto out_unlink_ns;

if ((ctrl->quirks & NVME_QUIRK_LIGHTNVM) && id->vs[0] == 0x1) {
if (nvme_nvm_register(ns, disk->disk_name, node)) {
dev_warn(ctrl->device, "LightNVM init failure\n");
- goto out_put_disk;
+ goto out_unlink_ns;
}
}

@@ -3786,10 +3782,7 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
kfree(id);

return;
- out_put_disk:
- /* prevent double queue cleanup */
- ns->disk->queue = NULL;
- put_disk(ns->disk);
+
out_unlink_ns:
mutex_lock(&ctrl->subsys->lock);
list_del_rcu(&ns->siblings);
@@ -3797,8 +3790,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid,
list_del_init(&ns->head->entry);
mutex_unlock(&ctrl->subsys->lock);
nvme_put_ns_head(ns->head);
- out_free_queue:
- blk_cleanup_queue(ns->queue);
+ out_cleanup_disk:
+ blk_cleanup_disk(disk);
out_free_ns:
kfree(ns);
out_free_id:

2021-07-20 00:28:43

by Luis Chamberlain

[permalink] [raw]

Subject: Re: [PATCH v2 6/6] block: skip queue if NULL on blk_cleanup_queue()

On Mon, Jul 19, 2021 at 10:50:13AM +0100, Christoph Hellwig wrote:
> On Thu, Jul 15, 2021 at 12:07:26PM -0700, Luis Chamberlain wrote:
> > > For all drivers using blk_alloc_disk/blk_mq_alloc_disk there should
> > > always be a queue. The others ones aren't ready to handle errors
> > > from add_disk yet in any way I think (and I plan to fix this up
> > > ASAP).
> >
> > Have an example in mind?
>
> The only ones left are nvme, dasd and scsi. NVMe is trivial (attached),
> dasd needs a little more work, I need to send up a WIP to the
> maintainers. scsi is the real problem and will require a fair amount
> of work.

Alright, I'll wait until these changes are merged before sending those
driver's conversions. Without this patch though, there's no way to test
the error injection in case the driver was buggy and removed the queue
by mistake. Recall my first patch moved that check to the beginning too.

So we either skip that error injection test ... or not sure what else to do.

Luis