LinuxLists.cc - [PATCH v2 0/4] xen-blkback: support live update

2019-12-10 11:35:21

Subject: [PATCH v2 0/4] xen-blkback: support live update

Patch #1 is clean-up for an apparent mis-feature.

Paul Durrant (4):
xenbus: move xenbus_dev_shutdown() into frontend code...
xenbus: limit when state is forced to closed
xen/interface: re-define FRONT/BACK_RING_ATTACH()
xen-blkback: support dynamic unbind/bind

drivers/block/xen-blkback/xenbus.c | 59 +++++++++++++++-------
drivers/xen/xenbus/xenbus.h | 2 -
drivers/xen/xenbus/xenbus_probe.c | 35 ++++---------
drivers/xen/xenbus/xenbus_probe_backend.c | 1 -
drivers/xen/xenbus/xenbus_probe_frontend.c | 24 ++++++++-
include/xen/interface/io/ring.h | 29 ++++-------
include/xen/xenbus.h | 1 +
7 files changed, 84 insertions(+), 67 deletions(-)
---
Cc: Boris Ostrovsky <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: "Roger Pau Monné" <[email protected]>
Cc: Stefano Stabellini <[email protected]>
--
2.20.1

2019-12-10 11:36:22

by Paul Durrant

[permalink] [raw]

Subject: [PATCH v2 2/4] xenbus: limit when state is forced to closed

If a driver probe() fails then leave the xenstore state alone. There is no
reason to modify it as the failure may be due to transient resource
allocation issues and hence a subsequent probe() may succeed.

If the driver supports re-binding then only force state to closed during
remove() only in the case when the toolstack may need to clean up. This can
be detected by checking whether the state in xenstore has been set to
closing prior to device removal.

NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
which defaults to false. Subsequent patches will add support to
some backend drivers.

Signed-off-by: Paul Durrant <[email protected]>
---
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>

v2:
- Introduce the 'allow_rebind' flag
- Expand the commit comment
---
drivers/xen/xenbus/xenbus_probe.c | 12 ++++++++++--
include/xen/xenbus.h | 1 +
2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index a10311c348b9..9303ff35b2bd 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -255,7 +255,6 @@ int xenbus_dev_probe(struct device *_dev)
module_put(drv->driver.owner);
fail:
xenbus_dev_error(dev, err, "xenbus_dev_probe on %s", dev->nodename);
- xenbus_switch_state(dev, XenbusStateClosed);
return err;
}
EXPORT_SYMBOL_GPL(xenbus_dev_probe);
@@ -276,7 +275,16 @@ int xenbus_dev_remove(struct device *_dev)

free_otherend_details(dev);

- xenbus_switch_state(dev, XenbusStateClosed);
+ /*
+ * If the toolstack has forced the device state to closing then set
+ * the state to closed now to allow it to be cleaned up.
+ * Similarly, if the driver does not support re-bind, set the
+ * closed.
+ */
+ if (!drv->allow_rebind ||
+ xenbus_read_driver_state(dev->nodename) == XenbusStateClosing)
+ xenbus_switch_state(dev, XenbusStateClosed);
+
return 0;
}
EXPORT_SYMBOL_GPL(xenbus_dev_remove);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816d5f8c..24228a102141 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -93,6 +93,7 @@ struct xenbus_device_id
struct xenbus_driver {
const char *name; /* defaults to ids[0].devicetype */
const struct xenbus_device_id *ids;
+ bool allow_rebind; /* avoid setting xenstore closed during remove */
int (*probe)(struct xenbus_device *dev,
const struct xenbus_device_id *id);
void (*otherend_changed)(struct xenbus_device *dev,
--
2.20.1

2019-12-10 11:36:30

by Paul Durrant

[permalink] [raw]

Subject: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

By simply re-attaching to shared rings during connect_ring() rather than
assuming they are freshly allocated (i.e assuming the counters are zero)
it is possible for vbd instances to be unbound and re-bound from and to
(respectively) a running guest.

This has been tested by running:

while true;
do fio --name=randwrite --ioengine=libaio --iodepth=16 \
--rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
done

in a PV guest whilst running:

while true;
do echo vbd-$DOMID-$VBD >unbind;
echo unbound;
sleep 5;
echo vbd-$DOMID-$VBD >bind;
echo bound;
sleep 3;
done

in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
re-bind its system disk image.

This is a highly useful feature for a backend module as it allows it to be
unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
This was also tested by running:

while true;
do echo vbd-$DOMID-$VBD >unbind;
echo unbound;
sleep 5;
rmmod xen-blkback;
echo unloaded;
sleep 1;
modprobe xen-blkback;
echo bound;
cd $(pwd);
sleep 3;
done

in dom0 whilst running the same loop as above in the (single) PV guest.

Some (less stressful) testing has also been done using a Windows HVM guest
with the latest 9.0 PV drivers installed.

Signed-off-by: Paul Durrant <[email protected]>
---
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: "Roger Pau Monné" <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>

v2:
- Apply a sanity check to the value of rsp_prod and fail the re-attach
if it is implausible
- Set allow_rebind to prevent ring from being closed on unbind
- Update test workload from dd to fio (with verification)
---
drivers/block/xen-blkback/xenbus.c | 59 +++++++++++++++++++++---------
1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index e8c5c54e1d26..13d09630b237 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref,
{
int err;
struct xen_blkif *blkif = ring->blkif;
+ struct blkif_common_sring *sring_common;
+ RING_IDX rsp_prod, req_prod;

/* Already connected through? */
if (ring->irq)
@@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref,
if (err < 0)
return err;

+ sring_common = (struct blkif_common_sring *)ring->blk_ring;
+ rsp_prod = READ_ONCE(sring_common->rsp_prod);
+ req_prod = READ_ONCE(sring_common->req_prod);
+
switch (blkif->blk_protocol) {
case BLKIF_PROTOCOL_NATIVE:
{
- struct blkif_sring *sring;
- sring = (struct blkif_sring *)ring->blk_ring;
- BACK_RING_INIT(&ring->blk_rings.native, sring,
- XEN_PAGE_SIZE * nr_grefs);
+ struct blkif_sring *sring_native =
+ (struct blkif_sring *)ring->blk_ring;
+ unsigned int size = __RING_SIZE(sring_native,
+ XEN_PAGE_SIZE * nr_grefs);
+
+ BACK_RING_ATTACH(&ring->blk_rings.native, sring_native,
+ rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+ err = (req_prod - rsp_prod > size) ? -EIO : 0;
break;
}
case BLKIF_PROTOCOL_X86_32:
{
- struct blkif_x86_32_sring *sring_x86_32;
- sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
- BACK_RING_INIT(&ring->blk_rings.x86_32, sring_x86_32,
- XEN_PAGE_SIZE * nr_grefs);
+ struct blkif_x86_32_sring *sring_x86_32 =
+ (struct blkif_x86_32_sring *)ring->blk_ring;
+ unsigned int size = __RING_SIZE(sring_x86_32,
+ XEN_PAGE_SIZE * nr_grefs);
+
+ BACK_RING_ATTACH(&ring->blk_rings.x86_32, sring_x86_32,
+ rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+ err = (req_prod - rsp_prod > size) ? -EIO : 0;
break;
}
case BLKIF_PROTOCOL_X86_64:
{
- struct blkif_x86_64_sring *sring_x86_64;
- sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring;
- BACK_RING_INIT(&ring->blk_rings.x86_64, sring_x86_64,
- XEN_PAGE_SIZE * nr_grefs);
+ struct blkif_x86_64_sring *sring_x86_64 =
+ (struct blkif_x86_64_sring *)ring->blk_ring;
+ unsigned int size = __RING_SIZE(sring_x86_64,
+ XEN_PAGE_SIZE * nr_grefs);
+
+ BACK_RING_ATTACH(&ring->blk_rings.x86_64, sring_x86_64,
+ rsp_prod, XEN_PAGE_SIZE * nr_grefs);
+ err = (req_prod - rsp_prod > size) ? -EIO : 0;
break;
}
default:
BUG();
}
+ if (err < 0)
+ goto fail;

err = bind_interdomain_evtchn_to_irqhandler(blkif->domid, evtchn,
xen_blkif_be_int, 0,
"blkif-backend", ring);
- if (err < 0) {
- xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring);
- ring->blk_rings.common.sring = NULL;
- return err;
- }
+ if (err < 0)
+ goto fail;
ring->irq = err;

return 0;
+
+fail:
+ xenbus_unmap_ring_vfree(blkif->be->dev, ring->blk_ring);
+ ring->blk_rings.common.sring = NULL;
+ return err;
}

static int xen_blkif_disconnect(struct xen_blkif *blkif)
@@ -1121,7 +1143,8 @@ static struct xenbus_driver xen_blkbk_driver = {
.ids = xen_blkbk_ids,
.probe = xen_blkbk_probe,
.remove = xen_blkbk_remove,
- .otherend_changed = frontend_changed
+ .otherend_changed = frontend_changed,
+ .allow_rebind = true,
};

int xen_blkif_xenbus_init(void)
--
2.20.1

2019-12-11 10:08:02

by Roger Pau Monne

[permalink] [raw]

Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

On Tue, Dec 10, 2019 at 11:33:45AM +0000, Paul Durrant wrote:
> If a driver probe() fails then leave the xenstore state alone. There is no
> reason to modify it as the failure may be due to transient resource
> allocation issues and hence a subsequent probe() may succeed.
>
> If the driver supports re-binding then only force state to closed during
> remove() only in the case when the toolstack may need to clean up. This can
> be detected by checking whether the state in xenstore has been set to
> closing prior to device removal.
>
> NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
> which defaults to false. Subsequent patches will add support to
> some backend drivers.

My intention was to specify whether you want to close the
backends on unbind in sysfs, so that an user can decide at runtime,
rather than having a hardcoded value in the driver.

Anyway, I'm less sure whether such runtime tunable is useful at all,
so let's leave it out and can always be added afterwards. At the end
of day a user wrongly doing a rmmod blkback can always recover
gracefully by loading blkback again with your proposed approach to
leave connections open on module removal.

Sorry for the extra work.

Thanks, Roger.

2019-12-11 10:16:07

by Paul Durrant

[permalink] [raw]

Subject: RE: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

> -----Original Message-----
> From: Roger Pau Monn? <[email protected]>
> Sent: 11 December 2019 10:06
> To: Durrant, Paul <[email protected]>
> Cc: [email protected]; [email protected]; Juergen
> Gross <[email protected]>; Stefano Stabellini <[email protected]>;
> Boris Ostrovsky <[email protected]>
> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
> to closed
>
> On Tue, Dec 10, 2019 at 11:33:45AM +0000, Paul Durrant wrote:
> > If a driver probe() fails then leave the xenstore state alone. There is
> no
> > reason to modify it as the failure may be due to transient resource
> > allocation issues and hence a subsequent probe() may succeed.
> >
> > If the driver supports re-binding then only force state to closed during
> > remove() only in the case when the toolstack may need to clean up. This
> can
> > be detected by checking whether the state in xenstore has been set to
> > closing prior to device removal.
> >
> > NOTE: Re-bind support is indicated by new boolean in struct
> xenbus_driver,
> > which defaults to false. Subsequent patches will add support to
> > some backend drivers.
>
> My intention was to specify whether you want to close the
> backends on unbind in sysfs, so that an user can decide at runtime,
> rather than having a hardcoded value in the driver.
>
> Anyway, I'm less sure whether such runtime tunable is useful at all,
> so let's leave it out and can always be added afterwards. At the end
> of day a user wrongly doing a rmmod blkback can always recover
> gracefully by loading blkback again with your proposed approach to
> leave connections open on module removal.
>
> Sorry for the extra work.
>

Does this mean you don't think the extra driver flag is necessary any more? NB: now that xenbus actually takes module references you can't accidentally rmmod any more :-)

Paul

2019-12-11 10:22:32

by Jürgen Groß

[permalink] [raw]

Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

On 11.12.19 11:14, Durrant, Paul wrote:
>> -----Original Message-----
>> From: Roger Pau Monné <[email protected]>
>> Sent: 11 December 2019 10:06
>> To: Durrant, Paul <[email protected]>
>> Cc: [email protected]; [email protected]; Juergen
>> Gross <[email protected]>; Stefano Stabellini <[email protected]>;
>> Boris Ostrovsky <[email protected]>
>> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
>> to closed
>>
>> On Tue, Dec 10, 2019 at 11:33:45AM +0000, Paul Durrant wrote:
>>> If a driver probe() fails then leave the xenstore state alone. There is
>> no
>>> reason to modify it as the failure may be due to transient resource
>>> allocation issues and hence a subsequent probe() may succeed.
>>>
>>> If the driver supports re-binding then only force state to closed during
>>> remove() only in the case when the toolstack may need to clean up. This
>> can
>>> be detected by checking whether the state in xenstore has been set to
>>> closing prior to device removal.
>>>
>>> NOTE: Re-bind support is indicated by new boolean in struct
>> xenbus_driver,
>>> which defaults to false. Subsequent patches will add support to
>>> some backend drivers.
>>
>> My intention was to specify whether you want to close the
>> backends on unbind in sysfs, so that an user can decide at runtime,
>> rather than having a hardcoded value in the driver.
>>
>> Anyway, I'm less sure whether such runtime tunable is useful at all,
>> so let's leave it out and can always be added afterwards. At the end
>> of day a user wrongly doing a rmmod blkback can always recover
>> gracefully by loading blkback again with your proposed approach to
>> leave connections open on module removal.
>>
>> Sorry for the extra work.
>>
>
> Does this mean you don't think the extra driver flag is necessary any more? NB: now that xenbus actually takes module references you can't accidentally rmmod any more :-)

I'd like it to be kept, please.

Juergen

2019-12-11 10:46:42

by Roger Pau Monne

[permalink] [raw]

Subject: Re: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

On Tue, Dec 10, 2019 at 11:33:47AM +0000, Paul Durrant wrote:
> By simply re-attaching to shared rings during connect_ring() rather than
> assuming they are freshly allocated (i.e assuming the counters are zero)
> it is possible for vbd instances to be unbound and re-bound from and to
> (respectively) a running guest.
>
> This has been tested by running:
>
> while true;
> do fio --name=randwrite --ioengine=libaio --iodepth=16 \
> --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
> done
>
> in a PV guest whilst running:
>
> while true;
> do echo vbd-$DOMID-$VBD >unbind;
> echo unbound;
> sleep 5;

Is there anyway to know when the unbind has finished? AFAICT
xen_blkif_disconnect will return EBUSY if there are in flight
requests, and the disconnect won't be completed until those requests
are finished.

> echo vbd-$DOMID-$VBD >bind;
> echo bound;
> sleep 3;
> done
>
> in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
> re-bind its system disk image.
>
> This is a highly useful feature for a backend module as it allows it to be
> unloaded and re-loaded (i.e. updated) without requiring domUs to be halted.
> This was also tested by running:
>
> while true;
> do echo vbd-$DOMID-$VBD >unbind;
> echo unbound;
> sleep 5;
> rmmod xen-blkback;
> echo unloaded;
> sleep 1;
> modprobe xen-blkback;
> echo bound;
> cd $(pwd);
> sleep 3;
> done
>
> in dom0 whilst running the same loop as above in the (single) PV guest.
>
> Some (less stressful) testing has also been done using a Windows HVM guest
> with the latest 9.0 PV drivers installed.
>
> Signed-off-by: Paul Durrant <[email protected]>
> ---
> Cc: Konrad Rzeszutek Wilk <[email protected]>
> Cc: "Roger Pau Monn?" <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Stefano Stabellini <[email protected]>
>
> v2:
> - Apply a sanity check to the value of rsp_prod and fail the re-attach
> if it is implausible
> - Set allow_rebind to prevent ring from being closed on unbind
> - Update test workload from dd to fio (with verification)
> ---
> drivers/block/xen-blkback/xenbus.c | 59 +++++++++++++++++++++---------
> 1 file changed, 41 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> index e8c5c54e1d26..13d09630b237 100644
> --- a/drivers/block/xen-blkback/xenbus.c
> +++ b/drivers/block/xen-blkback/xenbus.c
> @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref,
> {
> int err;
> struct xen_blkif *blkif = ring->blkif;
> + struct blkif_common_sring *sring_common;
> + RING_IDX rsp_prod, req_prod;
>
> /* Already connected through? */
> if (ring->irq)
> @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring *ring, grant_ref_t *gref,
> if (err < 0)
> return err;
>
> + sring_common = (struct blkif_common_sring *)ring->blk_ring;
> + rsp_prod = READ_ONCE(sring_common->rsp_prod);
> + req_prod = READ_ONCE(sring_common->req_prod);
> +
> switch (blkif->blk_protocol) {
> case BLKIF_PROTOCOL_NATIVE:
> {
> - struct blkif_sring *sring;
> - sring = (struct blkif_sring *)ring->blk_ring;
> - BACK_RING_INIT(&ring->blk_rings.native, sring,
> - XEN_PAGE_SIZE * nr_grefs);
> + struct blkif_sring *sring_native =
> + (struct blkif_sring *)ring->blk_ring;

I think you can constify both sring_native and sring_common (and the
other instances below).

> + unsigned int size = __RING_SIZE(sring_native,
> + XEN_PAGE_SIZE * nr_grefs);
> +
> + BACK_RING_ATTACH(&ring->blk_rings.native, sring_native,
> + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> + err = (req_prod - rsp_prod > size) ? -EIO : 0;
> break;
> }
> case BLKIF_PROTOCOL_X86_32:
> {
> - struct blkif_x86_32_sring *sring_x86_32;
> - sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
> - BACK_RING_INIT(&ring->blk_rings.x86_32, sring_x86_32,
> - XEN_PAGE_SIZE * nr_grefs);
> + struct blkif_x86_32_sring *sring_x86_32 =
> + (struct blkif_x86_32_sring *)ring->blk_ring;
> + unsigned int size = __RING_SIZE(sring_x86_32,
> + XEN_PAGE_SIZE * nr_grefs);
> +
> + BACK_RING_ATTACH(&ring->blk_rings.x86_32, sring_x86_32,
> + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> + err = (req_prod - rsp_prod > size) ? -EIO : 0;
> break;
> }
> case BLKIF_PROTOCOL_X86_64:
> {
> - struct blkif_x86_64_sring *sring_x86_64;
> - sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring;
> - BACK_RING_INIT(&ring->blk_rings.x86_64, sring_x86_64,
> - XEN_PAGE_SIZE * nr_grefs);
> + struct blkif_x86_64_sring *sring_x86_64 =
> + (struct blkif_x86_64_sring *)ring->blk_ring;
> + unsigned int size = __RING_SIZE(sring_x86_64,
> + XEN_PAGE_SIZE * nr_grefs);
> +
> + BACK_RING_ATTACH(&ring->blk_rings.x86_64, sring_x86_64,
> + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> + err = (req_prod - rsp_prod > size) ? -EIO : 0;

This is repeated for all ring types, might be worth to pull it out of
the switch...

> break;
> }
> default:
> BUG();
> }
> + if (err < 0)
> + goto fail;

...and placed here instead?

Thanks, Roger.

2019-12-11 10:53:43

by Paul Durrant

[permalink] [raw]

Subject: RE: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

> -----Original Message-----
> From: Roger Pau Monn? <[email protected]>
> Sent: 11 December 2019 10:46
> To: Durrant, Paul <[email protected]>
> Cc: [email protected]; [email protected]; Konrad
> Rzeszutek Wilk <[email protected]>; Jens Axboe <[email protected]>;
> Boris Ostrovsky <[email protected]>; Juergen Gross
> <[email protected]>; Stefano Stabellini <[email protected]>
> Subject: Re: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind
>
> On Tue, Dec 10, 2019 at 11:33:47AM +0000, Paul Durrant wrote:
> > By simply re-attaching to shared rings during connect_ring() rather than
> > assuming they are freshly allocated (i.e assuming the counters are zero)
> > it is possible for vbd instances to be unbound and re-bound from and to
> > (respectively) a running guest.
> >
> > This has been tested by running:
> >
> > while true;
> > do fio --name=randwrite --ioengine=libaio --iodepth=16 \
> > --rw=randwrite --bs=4k --direct=1 --size=1G --verify=crc32;
> > done
> >
> > in a PV guest whilst running:
> >
> > while true;
> > do echo vbd-$DOMID-$VBD >unbind;
> > echo unbound;
> > sleep 5;
>
> Is there anyway to know when the unbind has finished? AFAICT
> xen_blkif_disconnect will return EBUSY if there are in flight
> requests, and the disconnect won't be completed until those requests
> are finished.

Yes, the device sysfs node will disappear when remove() completes.

>
> > echo vbd-$DOMID-$VBD >bind;
> > echo bound;
> > sleep 3;
> > done
> >
> > in dom0 from /sys/bus/xen-backend/drivers/vbd to continuously unbind and
> > re-bind its system disk image.
> >
> > This is a highly useful feature for a backend module as it allows it to
> be
> > unloaded and re-loaded (i.e. updated) without requiring domUs to be
> halted.
> > This was also tested by running:
> >
> > while true;
> > do echo vbd-$DOMID-$VBD >unbind;
> > echo unbound;
> > sleep 5;
> > rmmod xen-blkback;
> > echo unloaded;
> > sleep 1;
> > modprobe xen-blkback;
> > echo bound;
> > cd $(pwd);
> > sleep 3;
> > done
> >
> > in dom0 whilst running the same loop as above in the (single) PV guest.
> >
> > Some (less stressful) testing has also been done using a Windows HVM
> guest
> > with the latest 9.0 PV drivers installed.
> >
> > Signed-off-by: Paul Durrant <[email protected]>
> > ---
> > Cc: Konrad Rzeszutek Wilk <[email protected]>
> > Cc: "Roger Pau Monn?" <[email protected]>
> > Cc: Jens Axboe <[email protected]>
> > Cc: Boris Ostrovsky <[email protected]>
> > Cc: Juergen Gross <[email protected]>
> > Cc: Stefano Stabellini <[email protected]>
> >
> > v2:
> > - Apply a sanity check to the value of rsp_prod and fail the re-attach
> > if it is implausible
> > - Set allow_rebind to prevent ring from being closed on unbind
> > - Update test workload from dd to fio (with verification)
> > ---
> > drivers/block/xen-blkback/xenbus.c | 59 +++++++++++++++++++++---------
> > 1 file changed, 41 insertions(+), 18 deletions(-)
> >
> > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> blkback/xenbus.c
> > index e8c5c54e1d26..13d09630b237 100644
> > --- a/drivers/block/xen-blkback/xenbus.c
> > +++ b/drivers/block/xen-blkback/xenbus.c
> > @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring
> *ring, grant_ref_t *gref,
> > {
> > int err;
> > struct xen_blkif *blkif = ring->blkif;
> > + struct blkif_common_sring *sring_common;
> > + RING_IDX rsp_prod, req_prod;
> >
> > /* Already connected through? */
> > if (ring->irq)
> > @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring
> *ring, grant_ref_t *gref,
> > if (err < 0)
> > return err;
> >
> > + sring_common = (struct blkif_common_sring *)ring->blk_ring;
> > + rsp_prod = READ_ONCE(sring_common->rsp_prod);
> > + req_prod = READ_ONCE(sring_common->req_prod);
> > +
> > switch (blkif->blk_protocol) {
> > case BLKIF_PROTOCOL_NATIVE:
> > {
> > - struct blkif_sring *sring;
> > - sring = (struct blkif_sring *)ring->blk_ring;
> > - BACK_RING_INIT(&ring->blk_rings.native, sring,
> > - XEN_PAGE_SIZE * nr_grefs);
> > + struct blkif_sring *sring_native =
> > + (struct blkif_sring *)ring->blk_ring;
>
> I think you can constify both sring_native and sring_common (and the
> other instances below).

Yes, I can do that. I don't think the macros would mind.

>
> > + unsigned int size = __RING_SIZE(sring_native,
> > + XEN_PAGE_SIZE * nr_grefs);
> > +
> > + BACK_RING_ATTACH(&ring->blk_rings.native, sring_native,
> > + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> > + err = (req_prod - rsp_prod > size) ? -EIO : 0;
> > break;
> > }
> > case BLKIF_PROTOCOL_X86_32:
> > {
> > - struct blkif_x86_32_sring *sring_x86_32;
> > - sring_x86_32 = (struct blkif_x86_32_sring *)ring->blk_ring;
> > - BACK_RING_INIT(&ring->blk_rings.x86_32, sring_x86_32,
> > - XEN_PAGE_SIZE * nr_grefs);
> > + struct blkif_x86_32_sring *sring_x86_32 =
> > + (struct blkif_x86_32_sring *)ring->blk_ring;
> > + unsigned int size = __RING_SIZE(sring_x86_32,
> > + XEN_PAGE_SIZE * nr_grefs);
> > +
> > + BACK_RING_ATTACH(&ring->blk_rings.x86_32, sring_x86_32,
> > + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> > + err = (req_prod - rsp_prod > size) ? -EIO : 0;
> > break;
> > }
> > case BLKIF_PROTOCOL_X86_64:
> > {
> > - struct blkif_x86_64_sring *sring_x86_64;
> > - sring_x86_64 = (struct blkif_x86_64_sring *)ring->blk_ring;
> > - BACK_RING_INIT(&ring->blk_rings.x86_64, sring_x86_64,
> > - XEN_PAGE_SIZE * nr_grefs);
> > + struct blkif_x86_64_sring *sring_x86_64 =
> > + (struct blkif_x86_64_sring *)ring->blk_ring;
> > + unsigned int size = __RING_SIZE(sring_x86_64,
> > + XEN_PAGE_SIZE * nr_grefs);
> > +
> > + BACK_RING_ATTACH(&ring->blk_rings.x86_64, sring_x86_64,
> > + rsp_prod, XEN_PAGE_SIZE * nr_grefs);
> > + err = (req_prod - rsp_prod > size) ? -EIO : 0;
>
> This is repeated for all ring types, might be worth to pull it out of
> the switch...
>

I did wonder about that... I'll do in v3.

> > break;
> > }
> > default:
> > BUG();
> > }
> > + if (err < 0)
> > + goto fail;
>
> ...and placed here instead?

Indeed.

Cheers,
Paul

>
> Thanks, Roger.

2019-12-11 13:31:08

by Paul Durrant

[permalink] [raw]

Subject: RE: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced to closed

> -----Original Message-----
> From: Jürgen Groß <[email protected]>
> Sent: 11 December 2019 10:21
> To: Durrant, Paul <[email protected]>; Roger Pau Monné
> <[email protected]>
> Cc: [email protected]; [email protected]; Stefano
> Stabellini <[email protected]>; Boris Ostrovsky
> <[email protected]>
> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is forced
> to closed
>
> On 11.12.19 11:14, Durrant, Paul wrote:
> >> -----Original Message-----
> >> From: Roger Pau Monné <[email protected]>
> >> Sent: 11 December 2019 10:06
> >> To: Durrant, Paul <[email protected]>
> >> Cc: [email protected]; [email protected];
> Juergen
> >> Gross <[email protected]>; Stefano Stabellini <[email protected]>;
> >> Boris Ostrovsky <[email protected]>
> >> Subject: Re: [Xen-devel] [PATCH v2 2/4] xenbus: limit when state is
> forced
> >> to closed
> >>
> >> On Tue, Dec 10, 2019 at 11:33:45AM +0000, Paul Durrant wrote:
> >>> If a driver probe() fails then leave the xenstore state alone. There
> is
> >> no
> >>> reason to modify it as the failure may be due to transient resource
> >>> allocation issues and hence a subsequent probe() may succeed.
> >>>
> >>> If the driver supports re-binding then only force state to closed
> during
> >>> remove() only in the case when the toolstack may need to clean up.
> This
> >> can
> >>> be detected by checking whether the state in xenstore has been set to
> >>> closing prior to device removal.
> >>>
> >>> NOTE: Re-bind support is indicated by new boolean in struct
> >> xenbus_driver,
> >>> which defaults to false. Subsequent patches will add support to
> >>> some backend drivers.
> >>
> >> My intention was to specify whether you want to close the
> >> backends on unbind in sysfs, so that an user can decide at runtime,
> >> rather than having a hardcoded value in the driver.
> >>
> >> Anyway, I'm less sure whether such runtime tunable is useful at all,
> >> so let's leave it out and can always be added afterwards. At the end
> >> of day a user wrongly doing a rmmod blkback can always recover
> >> gracefully by loading blkback again with your proposed approach to
> >> leave connections open on module removal.
> >>
> >> Sorry for the extra work.
> >>
> >
> > Does this mean you don't think the extra driver flag is necessary any
> more? NB: now that xenbus actually takes module references you can't
> accidentally rmmod any more :-)
>
> I'd like it to be kept, please.
>

Ok. I'll leave this patch alone then.

Paul

> Juergen

2019-12-11 14:47:29

by Paul Durrant

[permalink] [raw]

Subject: RE: [PATCH v2 4/4] xen-blkback: support dynamic unbind/bind

> -----Original Message-----
> > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
> > blkback/xenbus.c
> > > index e8c5c54e1d26..13d09630b237 100644
> > > --- a/drivers/block/xen-blkback/xenbus.c
> > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > @@ -181,6 +181,8 @@ static int xen_blkif_map(struct xen_blkif_ring
> > *ring, grant_ref_t *gref,
> > > {
> > > int err;
> > > struct xen_blkif *blkif = ring->blkif;
> > > + struct blkif_common_sring *sring_common;
> > > + RING_IDX rsp_prod, req_prod;
> > >
> > > /* Already connected through? */
> > > if (ring->irq)
> > > @@ -191,46 +193,66 @@ static int xen_blkif_map(struct xen_blkif_ring
> > *ring, grant_ref_t *gref,
> > > if (err < 0)
> > > return err;
> > >
> > > + sring_common = (struct blkif_common_sring *)ring->blk_ring;
> > > + rsp_prod = READ_ONCE(sring_common->rsp_prod);
> > > + req_prod = READ_ONCE(sring_common->req_prod);
> > > +
> > > switch (blkif->blk_protocol) {
> > > case BLKIF_PROTOCOL_NATIVE:
> > > {
> > > - struct blkif_sring *sring;
> > > - sring = (struct blkif_sring *)ring->blk_ring;
> > > - BACK_RING_INIT(&ring->blk_rings.native, sring,
> > > - XEN_PAGE_SIZE * nr_grefs);
> > > + struct blkif_sring *sring_native =
> > > + (struct blkif_sring *)ring->blk_ring;
> >
> > I think you can constify both sring_native and sring_common (and the
> > other instances below).
>
> Yes, I can do that. I don't think the macros would mind.
>

Spoke to soon. They do mind, of course, because the sring pointer in the front/back ring is not (and should not) be const. I can const sring_common but no others.

Paul