Hello,
when I physically remove a CF-Card while it is still mounted and
accessed, the result is a kernel oops. I traced the problem down to the
following situation:
1) Physical device is removed and the corresponding ide_trive_t gets
deleted, reference to the contained *queue is given back and queue gets
deleted.
2) struct block_device still exists, because the device is still
mounted, and points to a struct gendisk, which contains the - now
invalid - pointer to the queue that was destructed earlier.
3) As soon as a request is sent to the block device, the queue pointer
is used and the kernel oopses.
To fix this matter, I've made the following changes:
1a) When a queue is attached to a struct gendisk by ide_disk_probe(),
its reference counter shall be incremented by a blk_get_queue()-call.
1b) When a disk is released by disk_release(), its queue's reference
count shall be decremented by calling blk_cleanup_queue().
2a) When a physical drive is released by drive_release_dev(), the
corresponding queue is marked dead, so that no further calls to the
physical device's queue-handler are made.
2b) When a request is submitted to a dead queue using
generic_make_request(), the request shall be failed immedaiately with
-ENXIO which causes the caller to recive a "Bus error". This is the same
beaviour as when a USB-Storage device gets pulled while in use.
diff -uprN -X b/Documentation/dontdiff a/drivers/block/genhd.c
b/drivers/block/genhd.c
--- a/drivers/block/genhd.c 2005-08-08 15:30:13.000000000 +0200
+++ b/drivers/block/genhd.c 2005-09-05 02:07:46.000000000 +0200
@@ -415,6 +415,9 @@ static struct attribute * default_attrs[
static void disk_release(struct kobject * kobj)
{
struct gendisk *disk = to_disk(kobj);
+
+ blk_cleanup_queue(disk->queue);
+
kfree(disk->random);
kfree(disk->part);
free_disk_stats(disk);
diff -uprN -X b/Documentation/dontdiff a/drivers/block/ll_rw_blk.c
b/drivers/block/ll_rw_blk.c
--- a/drivers/block/ll_rw_blk.c 2005-08-13 15:54:15.000000000 +0200
+++ b/drivers/block/ll_rw_blk.c 2005-09-05 02:08:35.000000000 +0200
@@ -2877,15 +2877,26 @@ end_io:
}
if (unlikely(bio_sectors(bio) > q->max_hw_sectors)) {
- printk("bio too big device %s (%u > %u)\n",
+ printk(KERN_ERR
+ "generic_make_request: "
+ "bio too big device %s (%u > %u)\n",
bdevname(bio->bi_bdev, b),
bio_sectors(bio),
q->max_hw_sectors);
goto end_io;
}
- if (unlikely(test_bit(QUEUE_FLAG_DEAD, &q->queue_flags)))
- goto end_io;
+ if (unlikely(test_bit(QUEUE_FLAG_DEAD, &q->queue_flags))) {
+ printk(KERN_ERR
+ "generic_make_request: access to "
+ "dead device %s (%Lu)\n",
+ bdevname(bio->bi_bdev, b),
+ (long long) bio->bi_sector);
+ bio_endio(bio, bio->bi_size, -ENXIO);
+ break;
+ }
block_wait_queue_running(q);
diff -uprN -X b/Documentation/dontdiff a/drivers/ide/ide-disk.c
b/drivers/ide/ide-disk.c
--- a/drivers/ide/ide-disk.c 2005-08-24 17:58:02.000000000 +0200
+++ b/drivers/ide/ide-disk.c 2005-09-05 02:10:30.000000000 +0200
@@ -1224,6 +1221,9 @@ static int ide_disk_probe(struct device
if (!g)
goto out_free_idkp;
+ if(0 != blk_get_queue(drive->queue))
+ goto out_free_idkp;
+
ide_init_disk(g, drive);
ide_register_subdriver(drive, &idedisk_driver);
diff -uprN -X b/Documentation/dontdiff a/drivers/ide/ide-probe.c
b/drivers/ide/ide-probe.c
--- a/drivers/ide/ide-probe.c 2005-08-24 17:58:02.000000000 +0200
+++ b/drivers/ide/ide-probe.c 2005-09-05 02:10:59.000000000 +0200
@@ -1321,6 +1321,13 @@ static void drive_release_dev (struct de
drive->id = NULL;
}
drive->present = 0;
+
+ /* Set the queue dead, so it won't call us anymore */
+ set_bit(QUEUE_FLAG_DEAD, &drive->queue->queue_flags);
+
+ /* remove pointer to ide drive as it will be gone, soon */
+ drive->queue->queuedata = NULL;
+
/* Messed up locking ... */
spin_unlock_irq(&ide_lock);
blk_cleanup_queue(drive->queue);
Signed-off-by: Thomas Kleffel <[email protected]>
Best regards,
Thomas
On 9/6/05, Thomas Kleffel <[email protected]> wrote:
> Hello,
Hi,
> when I physically remove a CF-Card while it is still mounted and
> accessed, the result is a kernel oops. I traced the problem down to the
> following situation:
>
> 1) Physical device is removed and the corresponding ide_trive_t gets
> deleted, reference to the contained *queue is given back and queue gets
> deleted.
>
> 2) struct block_device still exists, because the device is still
> mounted, and points to a struct gendisk, which contains the - now
> invalid - pointer to the queue that was destructed earlier.
>
> 3) As soon as a request is sent to the block device, the queue pointer
> is used and the kernel oopses.
>
> To fix this matter, I've made the following changes:
>
> 1a) When a queue is attached to a struct gendisk by ide_disk_probe(),
> its reference counter shall be incremented by a blk_get_queue()-call.
>
> 1b) When a disk is released by disk_release(), its queue's reference
> count shall be decremented by calling blk_cleanup_queue().
Which conflicts with IDE layer reference counting & locking scheme
as IDE layer can send (special) requests to the device without any
device driver loaded.
> 2a) When a physical drive is released by drive_release_dev(), the
> corresponding queue is marked dead, so that no further calls to the
> physical device's queue-handler are made.
>
> 2b) When a request is submitted to a dead queue using
> generic_make_request(), the request shall be failed immedaiately with
> -ENXIO which causes the caller to recive a "Bus error". This is the same
> beaviour as when a USB-Storage device gets pulled while in use.
The fix would be to fail the previous requests during removal of the
device [ somebody already posted a patch to do that but I didn't have
time to give it proper thought ] or alternatively to add the offline state to
the device [ so processing of the requests would be resumed after device
is online again ].
Thanks,
Bartlomiej
> diff -uprN -X b/Documentation/dontdiff a/drivers/block/genhd.c
> b/drivers/block/genhd.c
> --- a/drivers/block/genhd.c 2005-08-08 15:30:13.000000000 +0200
> +++ b/drivers/block/genhd.c 2005-09-05 02:07:46.000000000 +0200
> @@ -415,6 +415,9 @@ static struct attribute * default_attrs[
> static void disk_release(struct kobject * kobj)
> {
> struct gendisk *disk = to_disk(kobj);
> +
> + blk_cleanup_queue(disk->queue);
> +
> kfree(disk->random);
> kfree(disk->part);
> free_disk_stats(disk);
> diff -uprN -X b/Documentation/dontdiff a/drivers/block/ll_rw_blk.c
> b/drivers/block/ll_rw_blk.c
> --- a/drivers/block/ll_rw_blk.c 2005-08-13 15:54:15.000000000 +0200
> +++ b/drivers/block/ll_rw_blk.c 2005-09-05 02:08:35.000000000 +0200
> @@ -2877,15 +2877,26 @@ end_io:
> }
>
> if (unlikely(bio_sectors(bio) > q->max_hw_sectors)) {
> - printk("bio too big device %s (%u > %u)\n",
> + printk(KERN_ERR
> + "generic_make_request: "
> + "bio too big device %s (%u > %u)\n",
> bdevname(bio->bi_bdev, b),
> bio_sectors(bio),
> q->max_hw_sectors);
> goto end_io;
> }
>
> - if (unlikely(test_bit(QUEUE_FLAG_DEAD, &q->queue_flags)))
> - goto end_io;
> + if (unlikely(test_bit(QUEUE_FLAG_DEAD, &q->queue_flags))) {
> + printk(KERN_ERR
> + "generic_make_request: access to "
> + "dead device %s (%Lu)\n",
> + bdevname(bio->bi_bdev, b),
> + (long long) bio->bi_sector);
> + bio_endio(bio, bio->bi_size, -ENXIO);
> + break;
> + }
>
> block_wait_queue_running(q);
>
> diff -uprN -X b/Documentation/dontdiff a/drivers/ide/ide-disk.c
> b/drivers/ide/ide-disk.c
> --- a/drivers/ide/ide-disk.c 2005-08-24 17:58:02.000000000 +0200
> +++ b/drivers/ide/ide-disk.c 2005-09-05 02:10:30.000000000 +0200
> @@ -1224,6 +1221,9 @@ static int ide_disk_probe(struct device
> if (!g)
> goto out_free_idkp;
>
> + if(0 != blk_get_queue(drive->queue))
> + goto out_free_idkp;
> +
> ide_init_disk(g, drive);
>
> ide_register_subdriver(drive, &idedisk_driver);
> diff -uprN -X b/Documentation/dontdiff a/drivers/ide/ide-probe.c
> b/drivers/ide/ide-probe.c
> --- a/drivers/ide/ide-probe.c 2005-08-24 17:58:02.000000000 +0200
> +++ b/drivers/ide/ide-probe.c 2005-09-05 02:10:59.000000000 +0200
> @@ -1321,6 +1321,13 @@ static void drive_release_dev (struct de
> drive->id = NULL;
> }
> drive->present = 0;
> +
> + /* Set the queue dead, so it won't call us anymore */
> + set_bit(QUEUE_FLAG_DEAD, &drive->queue->queue_flags);
> +
> + /* remove pointer to ide drive as it will be gone, soon */
> + drive->queue->queuedata = NULL;
> +
> /* Messed up locking ... */
> spin_unlock_irq(&ide_lock);
> blk_cleanup_queue(drive->queue);
>
>
> Signed-off-by: Thomas Kleffel <[email protected]>
>
> Best regards,
> Thomas
Hi,
Bartlomiej Zolnierkiewicz wrote:
>> [...]
>>1b) When a disk is released by disk_release(), its queue's reference
>>count shall be decremented by calling blk_cleanup_queue().
>
>
> Which conflicts with IDE layer reference counting & locking scheme
> as IDE layer can send (special) requests to the device without any
> device driver loaded.
But the IDE layer won't do that anymore after ide_unregister() was
called, while a device driver doesn't know that the device doesn't exist
anymore, so it may still access the queue until it releases the disk.
>
>
>>2a) When a physical drive is released by drive_release_dev(), the
>>corresponding queue is marked dead, so that no further calls to the
>>physical device's queue-handler are made.
>>
>>2b) When a request is submitted to a dead queue using
>>generic_make_request(), the request shall be failed immedaiately with
>>-ENXIO which causes the caller to recive a "Bus error". This is the same
>>beaviour as when a USB-Storage device gets pulled while in use.
>
>
> The fix would be to fail the previous requests during removal of the
> device [ somebody already posted a patch to do that but I didn't have
> time to give it proper thought ] or alternatively to add the offline state to
> the device [ so processing of the requests would be resumed after device
> is online again ].
I don't think it'd be wise to resume request processing on the same
device as the CF Card inserted again might not be the same one as the
user pulled out before. Performing old, cached writes on the new card
could destroy innocent data.
From your responses I read that the correct solution would be to keep
the old pysical device as long as the ide layer still has references to
it and to fail all requests in the meantime.
Is that correct?
Best regards,
Thomas
On 9/12/05, Thomas Kleffel <[email protected]> wrote:
> Hi,
>
> Bartlomiej Zolnierkiewicz wrote:
>
> >> [...]
> >>1b) When a disk is released by disk_release(), its queue's reference
> >>count shall be decremented by calling blk_cleanup_queue().
> >
> >
> > Which conflicts with IDE layer reference counting & locking scheme
> > as IDE layer can send (special) requests to the device without any
> > device driver loaded.
>
> But the IDE layer won't do that anymore after ide_unregister() was
> called, while a device driver doesn't know that the device doesn't exist
> anymore, so it may still access the queue until it releases the disk.
The point is that you have added blk_cleanup_queue() to the ide-disk
driver release function and it is perfectly fine for IDE layer to send requests
after unloading ide-disk driver.
> >>2a) When a physical drive is released by drive_release_dev(), the
> >>corresponding queue is marked dead, so that no further calls to the
> >>physical device's queue-handler are made.
> >>
> >>2b) When a request is submitted to a dead queue using
> >>generic_make_request(), the request shall be failed immedaiately with
> >>-ENXIO which causes the caller to recive a "Bus error". This is the same
> >>beaviour as when a USB-Storage device gets pulled while in use.
> >
> >
> > The fix would be to fail the previous requests during removal of the
> > device [ somebody already posted a patch to do that but I didn't have
> > time to give it proper thought ] or alternatively to add the offline state to
> > the device [ so processing of the requests would be resumed after device
> > is online again ].
>
> I don't think it'd be wise to resume request processing on the same
> device as the CF Card inserted again might not be the same one as the
> user pulled out before. Performing old, cached writes on the new card
> could destroy innocent data.
Yes, that is why failing old requests is a preferred solution.
> From your responses I read that the correct solution would be to keep
> the old pysical device as long as the ide layer still has references to
> it and to fail all requests in the meantime.
>
> Is that correct?
Yes, we should just fail all the requests if the device is not present.
[ What do you mean by the old physical device, old 'ide_drive_t *'? ]
Regards,
Bartlomiej
> Best regards,
>
> Thomas
>
>
>
>
In-Reply-To: <[email protected]>
On Tue, 13 Sep 2005 at 10:36:42 +0200, Bartlomiej Zolnierkiewicz wrote:
> > From your responses I read that the correct solution would be to keep
> > the old pysical device as long as the ide layer still has references to
> > it and to fail all requests in the meantime.
> >
> > Is that correct?
>
> Yes, we should just fail all the requests if the device is not present.
> [ What do you mean by the old physical device, old 'ide_drive_t *'? ]
Jens Axboe posted this patch for the same problem on 2 Aug:
===========================================================================
=====
That's not quite true, q is not invalid after this call. It will only be
invalid when it is freed (which doesn't happen from here but rather from
the blk_cleanup_queue() call when the reference count drops to 0).
This is still not perfect, but a lot better. Does it work for you?
--- linux-2.6.12/drivers/ide/ide-disk.c~ 2005-08-02
12:48:16.000000000 +0200
+++ linux-2.6.12/drivers/ide/ide-disk.c 2005-08-02 12:48:32.000000000 +0200
@@ -1054,6 +1054,7 @@
drive->driver_data = NULL;
drive->devfs_name[0] = '\0';
g->private_data = NULL;
+ g->disk = NULL;
put_disk(g);
kfree(idkp);
}
__
Chuck