Devices are not unmounted inside a domU after a xl block-detach.
After xl block-detach, blkfront_closing() is called with state ==
XenbusStateConnected. It detects that the device is still in use and
only switches the state to XenbusStateClosing. blkfront_closing() is then
called a second time but returns immediately because state ==
XenbusStateClosing. Thus the device stays mounted inside the domU.
To fix this, emit a KOBJ_OFFLINE uevent even if the device has users.
With this patch, inside domU, udev has:
KERNEL[16994.526789] offline /devices/vbd-51728/block/xvdb (block)
KERNEL[16994.796197] remove /devices/virtual/bdi/202:16 (bdi)
KERNEL[16994.797167] remove /devices/vbd-51728/block/xvdb (block)
UDEV [16994.798035] remove /devices/virtual/bdi/202:16 (bdi)
UDEV [16994.809429] offline /devices/vbd-51728/block/xvdb (block)
UDEV [16994.842365] remove /devices/vbd-51728/block/xvdb (block)
KERNEL[16995.461991] remove /devices/vbd-51728 (xen)
UDEV [16995.462549] remove /devices/vbd-51728 (xen)
Without the patch, it had:
KERNEL[30.862764] remove /devices/vbd-51728 (xen)
UDEV [30.867838] remove /devices/vbd-51728 (xen)
Signed-off-by: Pascal Bouchareine <[email protected]>
Signed-off-by: Fatih Acar <[email protected]>
Signed-off-by: Vincent Legout <[email protected]>
---
drivers/block/xen-blkfront.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 39459631667c..da0b0444ee1f 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2185,8 +2185,10 @@ static void blkfront_closing(struct blkfront_info *info)
 	mutex_lock(&bdev->bd_mutex);
 
 	if (bdev->bd_openers) {
-		xenbus_dev_error(xbdev, -EBUSY,
-				 "Device in use; refusing to close");
+		dev_warn(disk_to_dev(info->gd),
+			 "detaching %s with pending users\n",
+			 xbdev->nodename);
+		kobject_uevent(&disk_to_dev(info->gd)->kobj, KOBJ_OFFLINE);
 		xenbus_switch_state(xbdev, XenbusStateClosing);
 	} else {
 		xlvbd_release_gendisk(info);
--
2.13.2
On Tue, Jul 04, 2017 at 01:48:32PM +0200, Vincent Legout wrote:
> Devices are not unmounted inside a domU after a xl block-detach.
>
> After xl block-detach, blkfront_closing() is called with state ==
> XenbusStateConnected, it detects that the device is still in use and
> only switches state to XenbusStateClosing. blkfront_closing() is called
> a second time but returns immediately because state ==
> XenbusStateClosing. Thus the device keeps being mounted inside the domU.
>
> To fix this, emit a KOBJ_OFFLINE uevent even if the device has users.
>
> With this patch, inside domU, udev has:
>
> KERNEL[16994.526789] offline /devices/vbd-51728/block/xvdb (block)
> KERNEL[16994.796197] remove /devices/virtual/bdi/202:16 (bdi)
> KERNEL[16994.797167] remove /devices/vbd-51728/block/xvdb (block)
> UDEV [16994.798035] remove /devices/virtual/bdi/202:16 (bdi)
> UDEV [16994.809429] offline /devices/vbd-51728/block/xvdb (block)
> UDEV [16994.842365] remove /devices/vbd-51728/block/xvdb (block)
> KERNEL[16995.461991] remove /devices/vbd-51728 (xen)
> UDEV [16995.462549] remove /devices/vbd-51728 (xen)
I'm not an expert on udev, but aren't those messages duplicated? You
seem to get one message from udev and another one from the kernel.
> While without, it had:
>
> KERNEL[30.862764] remove /devices/vbd-51728 (xen)
> UDEV [30.867838] remove /devices/vbd-51728 (xen)
>
> Signed-off-by: Pascal Bouchareine <[email protected]>
> Signed-off-by: Fatih Acar <[email protected]>
> Signed-off-by: Vincent Legout <[email protected]>
>
> drivers/block/xen-blkfront.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> index 39459631667c..da0b0444ee1f 100644
> --- a/drivers/block/xen-blkfront.c
> +++ b/drivers/block/xen-blkfront.c
> @@ -2185,8 +2185,10 @@ static void blkfront_closing(struct blkfront_info *info)
> mutex_lock(&bdev->bd_mutex);
>
> if (bdev->bd_openers) {
> - xenbus_dev_error(xbdev, -EBUSY,
> - "Device in use; refusing to close");
> + dev_warn(disk_to_dev(info->gd),
> + "detaching %s with pending users\n",
> + xbdev->nodename);
> + kobject_uevent(&disk_to_dev(info->gd)->kobj, KOBJ_OFFLINE);
What happens if you simply remove the xenbus_dev_error but don't add
the kobject_uevent?
I'm asking because I don't see any other block device calling
directly kobject_uevent, and I'm sure this should be pretty similar to
what virtio or USB do when a block device is hot-unplugged.
For example blk_unregister_queue already contains a call to trigger a
kobject_uevent.
Thanks, Roger.
On Tue, Jul 04, 2017 at 05:59:27PM +0100, Roger Pau Monné wrote :
> On Tue, Jul 04, 2017 at 01:48:32PM +0200, Vincent Legout wrote:
> > Devices are not unmounted inside a domU after a xl block-detach.
> >
> > After xl block-detach, blkfront_closing() is called with state ==
> > XenbusStateConnected, it detects that the device is still in use and
> > only switches state to XenbusStateClosing. blkfront_closing() is called
> > a second time but returns immediately because state ==
> > XenbusStateClosing. Thus the device keeps being mounted inside the domU.
> >
> > To fix this, emit a KOBJ_OFFLINE uevent even if the device has users.
> >
> > With this patch, inside domU, udev has:
> >
> > KERNEL[16994.526789] offline /devices/vbd-51728/block/xvdb (block)
> > KERNEL[16994.796197] remove /devices/virtual/bdi/202:16 (bdi)
> > KERNEL[16994.797167] remove /devices/vbd-51728/block/xvdb (block)
> > UDEV [16994.798035] remove /devices/virtual/bdi/202:16 (bdi)
> > UDEV [16994.809429] offline /devices/vbd-51728/block/xvdb (block)
> > UDEV [16994.842365] remove /devices/vbd-51728/block/xvdb (block)
> > KERNEL[16995.461991] remove /devices/vbd-51728 (xen)
> > UDEV [16995.462549] remove /devices/vbd-51728 (xen)
>
> I'm not an expert on udev, but aren't those messages duplicated? You
> seem to get one message from udev and another one from the kernel.
I'm not either, but this seems to be the expected behavior. At least
that's what I get on a few different setups.
> > While without, it had:
> >
> > KERNEL[30.862764] remove /devices/vbd-51728 (xen)
> > UDEV [30.867838] remove /devices/vbd-51728 (xen)
> >
> > Signed-off-by: Pascal Bouchareine <[email protected]>
> > Signed-off-by: Fatih Acar <[email protected]>
> > Signed-off-by: Vincent Legout <[email protected]>
> >
> > drivers/block/xen-blkfront.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
> > index 39459631667c..da0b0444ee1f 100644
> > --- a/drivers/block/xen-blkfront.c
> > +++ b/drivers/block/xen-blkfront.c
> > @@ -2185,8 +2185,10 @@ static void blkfront_closing(struct blkfront_info *info)
> > mutex_lock(&bdev->bd_mutex);
> >
> > if (bdev->bd_openers) {
> > - xenbus_dev_error(xbdev, -EBUSY,
> > - "Device in use; refusing to close");
> > + dev_warn(disk_to_dev(info->gd),
> > + "detaching %s with pending users\n",
> > + xbdev->nodename);
> > + kobject_uevent(&disk_to_dev(info->gd)->kobj, KOBJ_OFFLINE);
>
> What happens if you simply remove the xenbus_dev_error but don't add
> the kobject_uevent?
I just tested and I've got the same behavior as before if I do that
(i.e. no unmount inside domU).
> I'm asking because I don't see any other block device calling
> directly kobject_uevent, and I'm sure this should be pretty similar to
> what virtio or USB do when a block device is hot-unplugged.
I don't know if this is the right thing to do, but a call to
kobject_uevent_env was added in xen-blkfront a few months ago:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89515d0255c918e08aa4085956c79bf17615fda5
> For example blk_unregister_queue already contains a call to trigger a
> kobject_uevent.
Without the patch, blkif_release and xlvbd_release_gendisk are never
called, and no call to blk_unregister_queue is made.
blkif_release expects the device to be unused. Calling
xlvbd_release_gendisk directly instead of kobject_uevent seems to block
in del_gendisk, which calls invalidate_partition and then fsync_bdev.
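To make the sequence described above easier to follow, here is a rough
user-space model of the two blkfront_closing() calls and of the release
path. It is only a sketch under simplifying assumptions: the names
(model_dev, model_closing, model_release, the MODEL_* states) are made
up for illustration and are not the driver's real structures, and the
"offline_sent" flag merely stands in for the KOBJ_OFFLINE uevent.

```c
#include <stdbool.h>

/* Simplified stand-ins for the xenbus states involved; not the kernel enum. */
enum model_state { MODEL_CONNECTED, MODEL_CLOSING, MODEL_CLOSED };

/* Hypothetical model of a frontend block device, for illustration only. */
struct model_dev {
	enum model_state state;
	int openers;        /* models bdev->bd_openers */
	bool offline_sent;  /* models a KOBJ_OFFLINE uevent having been emitted */
	bool disk_released; /* models xlvbd_release_gendisk() having run */
};

/*
 * Models blkfront_closing(): it returns early when the state is already
 * Closing; otherwise it either releases the disk (no users) or only
 * switches to Closing. emit_offline selects the patched behaviour.
 */
static void model_closing(struct model_dev *dev, bool emit_offline)
{
	if (dev->state == MODEL_CLOSING)
		return; /* second call: nothing happens */

	if (dev->openers > 0) {
		if (emit_offline)
			dev->offline_sent = true; /* userspace gets a chance to unmount */
		dev->state = MODEL_CLOSING;
	} else {
		dev->disk_released = true;
		dev->state = MODEL_CLOSED;
	}
}

/*
 * Models the release path: the disk is only torn down once the last
 * opener goes away while the device is in the Closing state.
 */
static void model_release(struct model_dev *dev)
{
	if (dev->openers > 0 && --dev->openers == 0 &&
	    dev->state == MODEL_CLOSING) {
		dev->disk_released = true;
		dev->state = MODEL_CLOSED;
	}
}
```

In this model, calling model_closing() twice on a device with one
opener and emit_offline == false leaves it stuck in MODEL_CLOSING with
nothing released, which matches what I see without the patch. With
emit_offline == true the flag is set, and a later model_release()
(the unmount) completes the teardown.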
Vincent
>>> On 05.07.17 at 10:08, <[email protected]> wrote:
> Without the patch, blkif_release and xlvbd_release_gendisk are never
> called, and no call to blk_unregister_queue is made.
But isn't that what needs to be fixed then? The device should be
removed once its last user goes away (which would be at the time
the umount is eventually done aiui).
Jan
On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
> >>> On 05.07.17 at 10:08, <[email protected]> wrote:
> > Without the patch, blkif_release and xlvbd_release_gendisk are never
> > called, and no call to blk_unregister_queue is made.
>
> But isn't that what needs to be fixed then? The device should be
> removed once its last user goes away (which would be at the time
> the umount is eventually done aiui).
You mean that block-detach should fail if the device is still mounted,
or that we should find a way to wait until all the users are gone?
I'm not saying that's not what should be done, but that's not what I get.
The device is removed after a block-detach, even if still mounted. So
the system is left in an unstable state without the patch.
I also just saw the --force option of xl block-detach, but from a quick
look it seems this option was actually only in xm and never in xl.
Vincent
>>> On 05.07.17 at 14:37, <[email protected]> wrote:
> On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
>> >>> On 05.07.17 at 10:08, <[email protected]> wrote:
>> > Without the patch, blkif_release and xlvbd_release_gendisk are never
>> > called, and no call to blk_unregister_queue is made.
>>
>> But isn't that what needs to be fixed then? The device should be
>> removed once its last user goes away (which would be at the time
>> the umount is eventually done aiui).
>
> You mean that block-detach should fail if the device is still mounted?
> or find a way to wait until all the users are gone?
>
> I don't say that's not what should be done, but that's not what I get.
> The device is removed after a block-detach, even if still mounted. So
> the system is left in an unstable state without the patch.
Unstable? I'd expect subsequent I/O to fail for that device, yes, but
that's still a stable system. Are you observing anything else?
Jan
On Wed, Jul 05, 2017 at 06:53:25AM -0600, Jan Beulich wrote :
> >>> On 05.07.17 at 14:37, <[email protected]> wrote:
> > On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
> >> >>> On 05.07.17 at 10:08, <[email protected]> wrote:
> >> > Without the patch, blkif_release and xlvbd_release_gendisk are never
> >> > called, and no call to blk_unregister_queue is made.
> >>
> >> But isn't that what needs to be fixed then? The device should be
> >> removed once its last user goes away (which would be at the time
> >> the umount is eventually done aiui).
> >
> > You mean that block-detach should fail if the device is still mounted?
> > or find a way to wait until all the users are gone?
> >
> > I don't say that's not what should be done, but that's not what I get.
> > The device is removed after a block-detach, even if still mounted. So
> > the system is left in an unstable state without the patch.
>
> Unstable? I'd expect subsequent I/O to fail for that device, yes, but
> that's still a stable system. Are you observing anything else?
Yes, that's what I meant by unstable, nothing else. Sorry for the
confusion.
Vincent
On Wed, Jul 05, 2017 at 03:30:00PM +0200, Vincent Legout wrote:
> On Wed, Jul 05, 2017 at 06:53:25AM -0600, Jan Beulich wrote :
> > >>> On 05.07.17 at 14:37, <[email protected]> wrote:
> > > On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
> > >> >>> On 05.07.17 at 10:08, <[email protected]> wrote:
> > >> > Without the patch, blkif_release and xlvbd_release_gendisk are never
> > >> > called, and no call to blk_unregister_queue is made.
> > >>
> > >> But isn't that what needs to be fixed then? The device should be
> > >> removed once its last user goes away (which would be at the time
> > >> the umount is eventually done aiui).
> > >
> > > You mean that block-detach should fail if the device is still mounted?
> > > or find a way to wait until all the users are gone?
> > >
> > > I don't say that's not what should be done, but that's not what I get.
> > > The device is removed after a block-detach, even if still mounted. So
> > > the system is left in an unstable state without the patch.
> >
> > Unstable? I'd expect subsequent I/O to fail for that device, yes, but
> > that's still a stable system. Are you observing anything else?
>
> Yes, that's what I meant by unstable, nothing else. Sorry for the
> confusion.
IMHO, this should behave in exactly the same way as hot-unplugging a USB
drive that's mounted. Can you confirm that's the case?
Roger.
Hello,
Sorry for such a long delay. I'm still interested in having this patch
merged.
I've tried to make the patch more generic and move it to xenbus as
discussed during the Xen summit, but I'm not sure how or if it's
possible. Would doing something in xenbus_otherend_changed() make sense?
But do we have enough information there? I'd be happy to get any advice,
I've re-attached the original patch.
On Fri, Jul 07, 2017 at 09:10:53AM +0100, Roger Pau Monné wrote :
> On Wed, Jul 05, 2017 at 03:30:00PM +0200, Vincent Legout wrote:
> > On Wed, Jul 05, 2017 at 06:53:25AM -0600, Jan Beulich wrote :
> > > >>> On 05.07.17 at 14:37, <[email protected]> wrote:
> > > > On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
> > > >> >>> On 05.07.17 at 10:08, <[email protected]> wrote:
> > > >> > Without the patch, blkif_release and xlvbd_release_gendisk are never
> > > >> > called, and no call to blk_unregister_queue is made.
> > > >>
> > > >> But isn't that what needs to be fixed then? The device should be
> > > >> removed once its last user goes away (which would be at the time
> > > >> the umount is eventually done aiui).
> > > >
> > > > You mean that block-detach should fail if the device is still mounted?
> > > > or find a way to wait until all the users are gone?
> > > >
> > > > I don't say that's not what should be done, but that's not what I get.
> > > > The device is removed after a block-detach, even if still mounted. So
> > > > the system is left in an unstable state without the patch.
> > >
> > > Unstable? I'd expect subsequent I/O to fail for that device, yes, but
> > > that's still a stable system. Are you observing anything else?
> >
> > Yes, that's what I meant by unstable, nothing else. Sorry for the
> > confusion.
>
> IMHO, this should behave in the same exact way as hot-unplugging a USB
> drive that's mounted, can you confirm that's correct?
I agree. And if I'm not wrong, it currently doesn't behave the same as
USB device unplugging. The patch tries to fix that.
Thanks,
Vincent
On 05/09/17 09:28, Vincent Legout wrote:
> Hello,
>
> Sorry for such a long delay. I'm still interested in having this patch
> merged.
>
> I've tried to make the patch more generic and move it to xenbus as
> discussed during the Xen summit, but I'm not sure how or if it's
> possible. Would doing something in xenbus_otherend_changed() make sense?
> But do we have enough information there? I'd be happy to get any advice,
> I've re-attached the original patch.
Maybe you could add a callback to struct xenbus_driver which is called
by xenbus_otherend_changed() if available and which will return the
missing information (e.g. the kobj).
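A rough user-space sketch of that idea, to show the shape of such an
optional hook. Everything here is hypothetical: the callback name
get_kobj, the model_* types and model_otherend_closing() do not exist
in xenbus, and the counter only stands in for emitting a uevent on the
returned kobject.

```c
#include <stddef.h>

/* Stand-in for the kernel's struct kobject; illustration only. */
struct model_kobject {
	const char *name;
	int offline_events; /* counts modelled KOBJ_OFFLINE notifications */
};

struct model_device;

/* Hypothetical driver ops; get_kobj is the optional callback. */
struct model_driver {
	struct model_kobject *(*get_kobj)(struct model_device *dev);
};

struct model_device {
	const struct model_driver *drv;
	struct model_kobject disk_kobj;
};

/*
 * Models the bus-level handler: notify only if the driver provides the
 * hook and the hook returns an object. Returns 1 if an event was
 * recorded, 0 otherwise.
 */
static int model_otherend_closing(struct model_device *dev)
{
	struct model_kobject *kobj;

	if (!dev->drv || !dev->drv->get_kobj)
		return 0;

	kobj = dev->drv->get_kobj(dev);
	if (!kobj)
		return 0;

	kobj->offline_events++;
	return 1;
}

/* What a block frontend might return: the object backing its disk. */
static struct model_kobject *model_blk_get_kobj(struct model_device *dev)
{
	return &dev->disk_kobj;
}
```

Drivers that leave the hook unset keep their current behaviour, so
only frontends that actually have something to notify would need to
implement it.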
Juergen
>
> On Fri, Jul 07, 2017 at 09:10:53AM +0100, Roger Pau Monné wrote :
>> On Wed, Jul 05, 2017 at 03:30:00PM +0200, Vincent Legout wrote:
>>> On Wed, Jul 05, 2017 at 06:53:25AM -0600, Jan Beulich wrote :
>>>>>>> On 05.07.17 at 14:37, <[email protected]> wrote:
>>>>> On Wed, Jul 05, 2017 at 02:17:24AM -0600, Jan Beulich wrote :
>>>>>>>>> On 05.07.17 at 10:08, <[email protected]> wrote:
>>>>>>> Without the patch, blkif_release and xlvbd_release_gendisk are never
>>>>>>> called, and no call to blk_unregister_queue is made.
>>>>>>
>>>>>> But isn't that what needs to be fixed then? The device should be
>>>>>> removed once its last user goes away (which would be at the time
>>>>>> the umount is eventually done aiui).
>>>>>
>>>>> You mean that block-detach should fail if the device is still mounted?
>>>>> or find a way to wait until all the users are gone?
>>>>>
>>>>> I don't say that's not what should be done, but that's not what I get.
>>>>> The device is removed after a block-detach, even if still mounted. So
>>>>> the system is left in an unstable state without the patch.
>>>>
>>>> Unstable? I'd expect subsequent I/O to fail for that device, yes, but
>>>> that's still a stable system. Are you observing anything else?
>>>
>>> Yes, that's what I meant by unstable, nothing else. Sorry for the
>>> confusion.
>>
>> IMHO, this should behave in the same exact way as hot-unplugging a USB
>> drive that's mounted, can you confirm that's correct?
>
> I agree. And if I'm not wrong, it currently doesn't behave the same as
> USB device unplugging. The patch tries to fix that.
>
> Thanks,
> Vincent
>
On Wed, Sep 06, 2017 at 12:18:03PM +0200, Juergen Gross wrote:
> On 05/09/17 09:28, Vincent Legout wrote:
> > Hello,
> >
> > Sorry for such a long delay. I'm still interested in having this patch
> > merged.
> >
> > I've tried to make the patch more generic and move it to xenbus as
> > discussed during the Xen summit, but I'm not sure how or if it's
> > possible. Would doing something in xenbus_otherend_changed() make sense?
> > But do we have enough information there? I'd be happy to get any advice,
> > I've re-attached the original patch.
>
> Maybe you could add a callback to struct xenbus_driver which is called
> by xenbus_otherend_changed() if available and which will return the
> missing information (e.g. the kobj).
Hello,
I'm still unsure we should emit KOBJ_OFFLINE directly, mostly because I
don't see any other block device doing so. AFAICT it seems to be used
only by CPU and memory hotplug. Maybe xenbus should instead call the
device_offline function on each device it wants to remove?
From my limited understanding of Linux bus handling, this seems to be
more in line with what ACPI does, for example.
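For reference, a simplified user-space model of what I understand
device_offline() to do in the driver core: refuse if offlining is
disabled for the device, do nothing if the bus has no offline/online
callbacks, and otherwise call the bus offline callback and emit
KOBJ_OFFLINE only when that succeeds. This is a sketch from memory, not
the kernel code: the sketch_* names are invented, locking and the check
of child devices are left out, and the counter stands in for the uevent.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_dev;

/* Stand-in for the bus callbacks that make a device offline-capable. */
struct sketch_bus {
	int (*offline)(struct sketch_dev *dev);
	int (*online)(struct sketch_dev *dev);
};

struct sketch_dev {
	const struct sketch_bus *bus;
	bool offline_disabled; /* offlining not permitted for this device */
	bool offline;          /* device is currently offline */
	int offline_uevents;   /* counts modelled KOBJ_OFFLINE uevents */
};

/*
 * Returns 0 on success (or when the bus does not support offlining),
 * 1 if the device was already offline, or a negative errno.
 */
static int sketch_device_offline(struct sketch_dev *dev)
{
	int ret;

	if (dev->offline_disabled)
		return -EPERM;
	if (!dev->bus || !dev->bus->offline || !dev->bus->online)
		return 0;
	if (dev->offline)
		return 1;

	ret = dev->bus->offline(dev);
	if (ret == 0) {
		dev->offline_uevents++; /* uevent only after the bus agreed */
		dev->offline = true;
	}
	return ret;
}

/* Example bus callbacks: one that succeeds, one that reports a busy device. */
static int sketch_bus_offline_ok(struct sketch_dev *dev)
{
	(void)dev;
	return 0;
}

static int sketch_bus_offline_busy(struct sketch_dev *dev)
{
	(void)dev;
	return -EBUSY;
}

static int sketch_bus_online_ok(struct sketch_dev *dev)
{
	(void)dev;
	return 0;
}
```

The point of going through the bus is that the bus callback gets to
veto or prepare the transition before userspace is told about it,
rather than the driver emitting the uevent on its own.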
Thanks, Roger.