2015-07-22 06:40:28

by Bob Liu

Subject: [PATCH v2 1/3] xen-blkfront: introduce blkfront_gather_backend_features()

There is a bug when migrating from a !feature-persistent host to a
feature-persistent host, because domU still thinks the new host/backend
doesn't support persistent grants.
Dmesg shows:
backed has not unmapped grant: 839
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 773
backed has not unmapped grant: 839

The fix is to recheck feature-persistent of new backend in blkif_recover().
See: https://lkml.org/lkml/2015/5/25/469

As Roger suggested, we can split the part of blkfront_connect that checks for
optional features (persistent grants, indirect descriptors and flush/barrier
features) into a separate function and call it from both blkfront_connect and
blkif_recover.

Signed-off-by: Bob Liu <[email protected]>
---
Changes in v2:
* Also put blkfront_setup_indirect() inside
---
drivers/block/xen-blkfront.c | 122 +++++++++++++++++++++++-------------------
1 file changed, 68 insertions(+), 54 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5b45ee5..3b193cf 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -181,6 +181,7 @@ static DEFINE_SPINLOCK(minor_lock);
 	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)

 static int blkfront_setup_indirect(struct blkfront_info *info);
+static int blkfront_gather_backend_features(struct blkfront_info *info);

 static int get_id_from_freelist(struct blkfront_info *info)
 {
@@ -1514,7 +1515,7 @@ static int blkif_recover(struct blkfront_info *info)
 	info->shadow_free = info->ring.req_prod_pvt;
 	info->shadow[BLK_RING_SIZE(info)-1].req.u.rw.id = 0x0fffffff;

-	rc = blkfront_setup_indirect(info);
+	rc = blkfront_gather_backend_features(info);
 	if (rc) {
 		kfree(copy);
 		return rc;
@@ -1694,20 +1695,13 @@ static void blkfront_setup_discard(struct blkfront_info *info)

 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-	unsigned int indirect_segments, segs;
+	unsigned int segs;
 	int err, i;

-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-max-indirect-segments", "%u", &indirect_segments,
-			    NULL);
-	if (err) {
-		info->max_indirect_segments = 0;
+	if (info->max_indirect_segments == 0)
 		segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
-	} else {
-		info->max_indirect_segments = min(indirect_segments,
-						  xen_blkif_max_segments);
+	else
 		segs = info->max_indirect_segments;
-	}

 	err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE(info));
 	if (err)
@@ -1771,6 +1765,68 @@ out_of_memory:
 }

 /*
+ * Gather all backend feature-*
+ */
+static int blkfront_gather_backend_features(struct blkfront_info *info)
+{
+	int err;
+	int barrier, flush, discard, persistent;
+	unsigned int indirect_segments;
+
+	info->feature_flush = 0;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-barrier", "%d", &barrier,
+			    NULL);
+
+	/*
+	 * If there's no "feature-barrier" defined, then it means
+	 * we're dealing with a very old backend which writes
+	 * synchronously; nothing to do.
+	 *
+	 * If there are barriers, then we use flush.
+	 */
+	if (!err && barrier)
+		info->feature_flush = REQ_FLUSH | REQ_FUA;
+	/*
+	 * And if there is "feature-flush-cache" use that above
+	 * barriers.
+	 */
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-flush-cache", "%d", &flush,
+			    NULL);
+
+	if (!err && flush)
+		info->feature_flush = REQ_FLUSH;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-discard", "%d", &discard,
+			    NULL);
+
+	if (!err && discard)
+		blkfront_setup_discard(info);
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-persistent", "%u", &persistent,
+			    NULL);
+	if (err)
+		info->feature_persistent = 0;
+	else
+		info->feature_persistent = persistent;
+
+	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
+			    "feature-max-indirect-segments", "%u", &indirect_segments,
+			    NULL);
+	if (err)
+		info->max_indirect_segments = 0;
+	else
+		info->max_indirect_segments = min(indirect_segments,
+						  xen_blkif_max_segments);
+
+	return blkfront_setup_indirect(info);
+}
+
+/*
  * Invoked when the backend is finally 'ready' (and has told produced
  * the details about the physical device - #sectors, size, etc).
  */
@@ -1781,7 +1837,6 @@ static void blkfront_connect(struct blkfront_info *info)
 	unsigned int physical_sector_size;
 	unsigned int binfo;
 	int err;
-	int barrier, flush, discard, persistent;

 	switch (info->connected) {
 	case BLKIF_STATE_CONNECTED:
@@ -1838,48 +1893,7 @@ static void blkfront_connect(struct blkfront_info *info)
 	if (err != 1)
 		physical_sector_size = sector_size;

-	info->feature_flush = 0;
-
-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-barrier", "%d", &barrier,
-			    NULL);
-
-	/*
-	 * If there's no "feature-barrier" defined, then it means
-	 * we're dealing with a very old backend which writes
-	 * synchronously; nothing to do.
-	 *
-	 * If there are barriers, then we use flush.
-	 */
-	if (!err && barrier)
-		info->feature_flush = REQ_FLUSH | REQ_FUA;
-	/*
-	 * And if there is "feature-flush-cache" use that above
-	 * barriers.
-	 */
-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-flush-cache", "%d", &flush,
-			    NULL);
-
-	if (!err && flush)
-		info->feature_flush = REQ_FLUSH;
-
-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-discard", "%d", &discard,
-			    NULL);
-
-	if (!err && discard)
-		blkfront_setup_discard(info);
-
-	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
-			    "feature-persistent", "%u", &persistent,
-			    NULL);
-	if (err)
-		info->feature_persistent = 0;
-	else
-		info->feature_persistent = persistent;
-
-	err = blkfront_setup_indirect(info);
+	err = blkfront_gather_backend_features(info);
 	if (err) {
 		xenbus_dev_fatal(info->xbdev, err, "setup_indirect at %s",
 				 info->xbdev->otherend);
--
1.7.10.4
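
The idea behind the patch above (one helper that re-reads the backend's optional features, called from both the connect path and the recover path) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `mock_backend`, `mock_frontend`, `mock_read()`, `mock_connect()` and `mock_recover()` are hypothetical stand-ins invented for illustration, not kernel or xenbus APIs, and the clamp against `xen_blkif_max_segments` is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_KEYS 8

/* Hypothetical stand-in for the feature keys a backend advertises. */
struct mock_backend {
	const char *keys[MOCK_MAX_KEYS];
	int values[MOCK_MAX_KEYS];
	size_t nr;
};

/* Hypothetical stand-in for the feature state cached by the frontend. */
struct mock_frontend {
	bool feature_persistent;
	unsigned int max_indirect_segments;
};

/* Return 0 and store the value if the key exists, -1 otherwise. */
static int mock_read(const struct mock_backend *be, const char *key, int *out)
{
	assert(be->nr <= MOCK_MAX_KEYS);
	for (size_t i = 0; i < be->nr; i++) {
		if (strcmp(be->keys[i], key) == 0) {
			*out = be->values[i];
			return 0;
		}
	}
	return -1;
}

/* Single place that refreshes every cached feature from the backend. */
static void gather_backend_features(struct mock_frontend *fe,
				    const struct mock_backend *be)
{
	int val;

	/* A missing key means the feature is not supported. */
	fe->feature_persistent =
		mock_read(be, "feature-persistent", &val) == 0 && val != 0;

	if (mock_read(be, "feature-max-indirect-segments", &val) == 0 && val > 0)
		fe->max_indirect_segments = (unsigned int)val;
	else
		fe->max_indirect_segments = 0;
}

/* Initial connection to a backend. */
static void mock_connect(struct mock_frontend *fe, const struct mock_backend *be)
{
	gather_backend_features(fe, be);
}

/* Reconnection after migration: the backend may differ, so re-read. */
static void mock_recover(struct mock_frontend *fe, const struct mock_backend *be)
{
	gather_backend_features(fe, be);
}
```

Calling `mock_connect()` against a backend without `feature-persistent` and then `mock_recover()` against one that advertises it leaves the frontend with the new backend's settings, which is the behaviour the commit message asks for.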


2015-07-22 06:40:30

by Bob Liu

Subject: [PATCH v2 2/3] xen-blkfront: don't add indirect pages to list when !feature_persistent

We should consider info->feature_persistent when adding an indirect page to the
list info->indirect_pages, otherwise the BUG_ON() in blkif_free() would be
triggered.

Signed-off-by: Bob Liu <[email protected]>
---
drivers/block/xen-blkfront.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 3b193cf..5dd591d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1125,8 +1125,10 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
-				list_add(&indirect_page->lru, &info->indirect_pages);
+				if (!info->feature_persistent) {
+					indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+					list_add(&indirect_page->lru, &info->indirect_pages);
+				}
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
 			}
--
1.7.10.4

2015-07-22 06:40:33

by Bob Liu

Subject: [PATCH v2] xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

The BUG_ON() in purge_persistent_gnt() will be triggered when the previous purge
work hasn't finished.
There is a work_pending() check before this BUG_ON, but it doesn't account for
work that is still running.

Signed-off-by: Bob Liu <[email protected]>
---
Change in v2:
* Replace with work_busy()
---
drivers/block/xen-blkback/blkback.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index ced9677..954c002 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -369,8 +369,8 @@ static void purge_persistent_gnt(struct xen_blkif *blkif)
 		return;
 	}

-	if (work_pending(&blkif->persistent_purge_work)) {
-		pr_alert_ratelimited("Scheduled work from previous purge is still pending, cannot purge list\n");
+	if (work_busy(&blkif->persistent_purge_work)) {
+		pr_alert_ratelimited("Scheduled work from previous purge is still busy, cannot purge list\n");
 		return;
 	}

--
1.7.10.4
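
The difference between the two guards in the patch above can be shown with a simplified userspace model. This is a sketch only: `model_work` and the `MODEL_BUSY_*` flags are hypothetical names that imitate the distinction between "queued but not started" and "currently executing"; they are not the kernel's `struct work_struct` or its workqueue implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags modelled on the pending/running distinction. */
#define MODEL_BUSY_PENDING (1u << 0)
#define MODEL_BUSY_RUNNING (1u << 1)

/* Simplified work item: it can be queued, executing, both, or neither. */
struct model_work {
	bool queued;   /* waiting on the queue, not yet started */
	bool running;  /* a worker is executing it right now */
};

/* Models a "pending" test: true only while the item is still queued. */
static bool model_work_pending(const struct model_work *w)
{
	assert(w != NULL);
	return w->queued;
}

/* Models a "busy" test: reports both the queued and the running state. */
static unsigned int model_work_busy(const struct model_work *w)
{
	unsigned int ret = 0;

	assert(w != NULL);
	if (w->queued)
		ret |= MODEL_BUSY_PENDING;
	if (w->running)
		ret |= MODEL_BUSY_RUNNING;
	return ret;
}

/* Old guard: lets a new purge through while the previous one is running. */
static bool old_guard_allows_purge(const struct model_work *w)
{
	return !model_work_pending(w);
}

/* New guard: blocks a new purge until the previous one has fully finished. */
static bool new_guard_allows_purge(const struct model_work *w)
{
	return model_work_busy(w) == 0;
}
```

With an item that has left the queue but is still executing, the old guard lets a second purge proceed while the new guard holds it back, which matches the window described in the commit message.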

2015-07-23 08:59:28

by Roger Pau Monne

Subject: Re: [PATCH v2 2/3] xen-blkfront: don't add indirect pages to list when !feature_persistent

On 22/07/15 at 8.40, Bob Liu wrote:
> We should consider info->feature_persistent when adding indriect page to list
                                                          ^ indirect
> info->indirect_pages, else the BUG_ON() in blkif_free() would be triggered.
>
> Signed-off-by: Bob Liu <[email protected]>

Thanks, this looks correct indeed. If we are using persistent grants the
indirect_pages list should always be empty because blkfront has
pre-allocated enough persistent pages to fill all requests on the ring.

Acked-by: Roger Pau Monné <[email protected]>

Should be backported to stable branches.

Roger.
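
The invariant described in this reply (with persistent grants, the list of spare indirect pages stays empty) can be modelled in a few lines. This is a hedged sketch: `model_info` and both helpers are hypothetical, and a counter stands in for the real `info->indirect_pages` list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant frontend state. */
struct model_info {
	bool feature_persistent;
	unsigned int indirect_pages; /* length of the spare indirect page list */
};

/*
 * Completion path: hand the indirect page back to the spare list only
 * when persistent grants are not in use.
 */
static void model_complete_indirect(struct model_info *info)
{
	assert(info != NULL);
	if (!info->feature_persistent)
		info->indirect_pages++;
}

/*
 * Teardown check: a non-empty spare list is only legal without
 * persistent grants. Returns true when the invariant holds.
 */
static bool model_free_invariant_holds(const struct model_info *info)
{
	assert(info != NULL);
	return info->indirect_pages == 0 || !info->feature_persistent;
}
```

Without the `feature_persistent` check in the completion path, a persistent-grant frontend would grow the list and fail the teardown check, which is the situation the patch avoids.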

2015-07-23 09:00:34

by Roger Pau Monne

Subject: Re: [PATCH v2 1/3] xen-blkfront: introduce blkfront_gather_backend_features()

On 22/07/15 at 8.40, Bob Liu wrote:
> There is a bug when migrating from a !feature-persistent host to a
> feature-persistent host, because domU still thinks the new host/backend
> doesn't support persistent grants.
> Dmesg shows:
> backed has not unmapped grant: 839
> backed has not unmapped grant: 773
> backed has not unmapped grant: 773
> backed has not unmapped grant: 773
> backed has not unmapped grant: 839
>
> The fix is to recheck feature-persistent of new backend in blkif_recover().
> See: https://lkml.org/lkml/2015/5/25/469
>
> As Roger suggested, we can split the part of blkfront_connect that checks for
> optional features (persistent grants, indirect descriptors and flush/barrier
> features) into a separate function and call it from both blkfront_connect and
> blkif_recover.
>
> Signed-off-by: Bob Liu <[email protected]>

Acked-by: Roger Pau Monné <[email protected]>

2015-07-23 09:07:28

by Roger Pau Monne

Subject: Re: [PATCH v2] xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

On 22/07/15 at 8.40, Bob Liu wrote:
> The BUG_ON() in purge_persistent_gnt() will be triggered when the previous purge
> work hasn't finished.
> There is a work_pending() check before this BUG_ON, but it doesn't account for
> work that is still running.
>
> Signed-off-by: Bob Liu <[email protected]>

Acked-by: Roger Pau Monné <[email protected]>

Should be backported to stable branches.

Roger.

2015-07-24 13:11:04

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH v2] xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

On Thu, Jul 23, 2015 at 11:06:58AM +0200, Roger Pau Monné wrote:
> On 22/07/15 at 8.40, Bob Liu wrote:
> > The BUG_ON() in purge_persistent_gnt() will be triggered when the previous purge
> > work hasn't finished.
> > There is a work_pending() check before this BUG_ON, but it doesn't account for
> > work that is still running.
> >
> > Signed-off-by: Bob Liu <[email protected]>
>
> Acked-by: Roger Pau Monné <[email protected]>
>
> Should be backported to stable branches.

I applied all the patches and I am now testing them for regressions.

Thank you!
>
> Roger.
>