2020-04-20 14:42:08

by Pankaj Gupta

[permalink] [raw]
Subject: [RFC 0/2] virtio-pmem: Asynchronous flush

Jeff reported preflush order issue with the existing implementation
of virtio pmem preflush. Dan suggested[1] to implement asynchronous flush
for virtio pmem using work queue as done in md/RAID. This patch series
intends to solve the preflush ordering issue and also makes the flush
asynchronous from the submitting thread POV.

Submitting this patch series for feeback and is in WIP. I have
done basic testing and currently doing more testing.

Pankaj Gupta (2):
pmem: make nvdimm_flush asynchronous
virtio_pmem: Async virtio-pmem flush

drivers/nvdimm/nd_virtio.c | 66 ++++++++++++++++++++++++++----------
drivers/nvdimm/pmem.c | 15 ++++----
drivers/nvdimm/region_devs.c | 3 +-
drivers/nvdimm/virtio_pmem.c | 9 +++++
drivers/nvdimm/virtio_pmem.h | 12 +++++++
5 files changed, 78 insertions(+), 27 deletions(-)

[1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2
--
2.20.1


2020-04-20 14:42:47

by Pankaj Gupta

[permalink] [raw]
Subject: [RFC 2/2] virtio_pmem: Async virtio-pmem flush

This patch enables asynchronous flush for virtio pmem
using work queue. Also, it coalesce the flush requests
when a flush is already in process.

Signed-off-by: Pankaj Gupta <[email protected]>
---
drivers/nvdimm/nd_virtio.c | 62 ++++++++++++++++++++++++++----------
drivers/nvdimm/virtio_pmem.c | 9 ++++++
drivers/nvdimm/virtio_pmem.h | 14 ++++++++
3 files changed, 68 insertions(+), 17 deletions(-)

diff --git a/drivers/nvdimm/nd_virtio.c b/drivers/nvdimm/nd_virtio.c
index 10351d5b49fa..ef53d0a0d134 100644
--- a/drivers/nvdimm/nd_virtio.c
+++ b/drivers/nvdimm/nd_virtio.c
@@ -97,29 +97,57 @@ static int virtio_pmem_flush(struct nd_region *nd_region)
return err;
};

+static void submit_async_flush(struct work_struct *ws);
+
/* The asynchronous flush callback function */
int async_pmem_flush(struct nd_region *nd_region, struct bio *bio)
{
- /*
- * Create child bio for asynchronous flush and chain with
- * parent bio. Otherwise directly call nd_region flush.
- */
- if (bio && bio->bi_iter.bi_sector != -1) {
- struct bio *child = bio_alloc(GFP_ATOMIC, 0);
-
- if (!child)
- return -ENOMEM;
- bio_copy_dev(child, bio);
- child->bi_opf = REQ_PREFLUSH;
- child->bi_iter.bi_sector = -1;
- bio_chain(child, bio);
- submit_bio(child);
- return 0;
+ struct virtio_device *vdev = nd_region->provider_data;
+ struct virtio_pmem *vpmem = vdev->priv;
+ ktime_t start = ktime_get_boottime();
+
+ spin_lock_irq(&vpmem->lock);
+ wait_event_lock_irq(vpmem->sb_wait,
+ !vpmem->flush_bio ||
+ ktime_after(vpmem->last_flush, start),
+ vpmem->lock);
+
+ if (!ktime_after(vpmem->last_flush, start)) {
+ WARN_ON(vpmem->flush_bio);
+ vpmem->flush_bio = bio;
+ bio = NULL;
}
- if (virtio_pmem_flush(nd_region))
- return -EIO;
+ spin_unlock_irq(&vpmem->lock);

+ if (!bio) {
+ INIT_WORK(&vpmem->flush_work, submit_async_flush);
+ queue_work(vpmem->pmem_wq, &vpmem->flush_work);
+ return 1;
+ }
+
+ bio->bi_opf &= ~REQ_PREFLUSH;
return 0;
};
EXPORT_SYMBOL_GPL(async_pmem_flush);
+
+static void submit_async_flush(struct work_struct *ws)
+{
+ struct virtio_pmem *vpmem = container_of(ws, struct virtio_pmem, flush_work);
+ struct bio *bio = vpmem->flush_bio;
+
+ vpmem->start_flush = ktime_get_boottime();
+ bio->bi_status = errno_to_blk_status(virtio_pmem_flush(vpmem->nd_region));
+ vpmem->last_flush = vpmem->start_flush;
+ vpmem->flush_bio = NULL;
+ wake_up(&vpmem->sb_wait);
+
+ /* Submit parent bio only for PREFLUSH */
+ if (bio && (bio->bi_opf & REQ_PREFLUSH)) {
+ bio->bi_opf &= ~REQ_PREFLUSH;
+ submit_bio(bio);
+ } else if (bio && (bio->bi_opf & REQ_FUA)) {
+ bio->bi_opf &= ~REQ_FUA;
+ bio_endio(bio);
+ }
+}
MODULE_LICENSE("GPL");
diff --git a/drivers/nvdimm/virtio_pmem.c b/drivers/nvdimm/virtio_pmem.c
index 5e3d07b47e0c..4ab135d820fd 100644
--- a/drivers/nvdimm/virtio_pmem.c
+++ b/drivers/nvdimm/virtio_pmem.c
@@ -24,6 +24,7 @@ static int init_vq(struct virtio_pmem *vpmem)
return PTR_ERR(vpmem->req_vq);

spin_lock_init(&vpmem->pmem_lock);
+ spin_lock_init(&vpmem->lock);
INIT_LIST_HEAD(&vpmem->req_list);

return 0;
@@ -58,6 +59,12 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
goto out_err;
}

+ vpmem->pmem_wq = alloc_workqueue("vpmem_wq", WQ_MEM_RECLAIM, 0);
+ if (!vpmem->pmem_wq) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+ init_waitqueue_head(&vpmem->sb_wait);
virtio_cread(vpmem->vdev, struct virtio_pmem_config,
start, &vpmem->start);
virtio_cread(vpmem->vdev, struct virtio_pmem_config,
@@ -90,10 +97,12 @@ static int virtio_pmem_probe(struct virtio_device *vdev)
goto out_nd;
}
nd_region->provider_data = dev_to_virtio(nd_region->dev.parent->parent);
+ vpmem->nd_region = nd_region;
return 0;
out_nd:
nvdimm_bus_unregister(vpmem->nvdimm_bus);
out_vq:
+ destroy_workqueue(vpmem->pmem_wq);
vdev->config->del_vqs(vdev);
out_err:
return err;
diff --git a/drivers/nvdimm/virtio_pmem.h b/drivers/nvdimm/virtio_pmem.h
index 0dddefe594c4..9d3615a324bf 100644
--- a/drivers/nvdimm/virtio_pmem.h
+++ b/drivers/nvdimm/virtio_pmem.h
@@ -35,9 +35,23 @@ struct virtio_pmem {
/* Virtio pmem request queue */
struct virtqueue *req_vq;

+ struct bio *flush_bio;
+ /* last_flush is when the last completed flush was started */
+ ktime_t last_flush, start_flush;
+
+ /* work queue for deferred flush */
+ struct work_struct flush_work;
+ struct workqueue_struct *pmem_wq;
+
+ /* Synchronize flush wait queue data */
+ spinlock_t lock;
+ /* for waiting for previous flush to complete */
+ wait_queue_head_t sb_wait;
+
/* nvdimm bus registers virtio pmem device */
struct nvdimm_bus *nvdimm_bus;
struct nvdimm_bus_descriptor nd_desc;
+ struct nd_region *nd_region;

/* List to store deferred work if virtqueue is full */
struct list_head req_list;
--
2.20.1

2021-03-11 16:34:07

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC 0/2] virtio-pmem: Asynchronous flush

On 20.04.20 15:19, Pankaj Gupta wrote:
> Jeff reported preflush order issue with the existing implementation
> of virtio pmem preflush. Dan suggested[1] to implement asynchronous flush
> for virtio pmem using work queue as done in md/RAID. This patch series
> intends to solve the preflush ordering issue and also makes the flush
> asynchronous from the submitting thread POV.
>
> Submitting this patch series for feeback and is in WIP. I have
> done basic testing and currently doing more testing.
>
> Pankaj Gupta (2):
> pmem: make nvdimm_flush asynchronous
> virtio_pmem: Async virtio-pmem flush
>
> drivers/nvdimm/nd_virtio.c | 66 ++++++++++++++++++++++++++----------
> drivers/nvdimm/pmem.c | 15 ++++----
> drivers/nvdimm/region_devs.c | 3 +-
> drivers/nvdimm/virtio_pmem.c | 9 +++++
> drivers/nvdimm/virtio_pmem.h | 12 +++++++
> 5 files changed, 78 insertions(+), 27 deletions(-)
>
> [1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2
>

Just wondering, was there any follow up of this or are we still waiting
for feedback? :)

--
Thanks,

David / dhildenb

2021-03-12 05:12:41

by Pankaj Gupta

[permalink] [raw]
Subject: Re: [RFC 0/2] virtio-pmem: Asynchronous flush

Hi David,

> > Jeff reported preflush order issue with the existing implementation
> > of virtio pmem preflush. Dan suggested[1] to implement asynchronous flush
> > for virtio pmem using work queue as done in md/RAID. This patch series
> > intends to solve the preflush ordering issue and also makes the flush
> > asynchronous from the submitting thread POV.
> >
> > Submitting this patch series for feeback and is in WIP. I have
> > done basic testing and currently doing more testing.
> >
> > Pankaj Gupta (2):
> > pmem: make nvdimm_flush asynchronous
> > virtio_pmem: Async virtio-pmem flush
> >
> > drivers/nvdimm/nd_virtio.c | 66 ++++++++++++++++++++++++++----------
> > drivers/nvdimm/pmem.c | 15 ++++----
> > drivers/nvdimm/region_devs.c | 3 +-
> > drivers/nvdimm/virtio_pmem.c | 9 +++++
> > drivers/nvdimm/virtio_pmem.h | 12 +++++++
> > 5 files changed, 78 insertions(+), 27 deletions(-)
> >
> > [1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2
> >
>
> Just wondering, was there any follow up of this or are we still waiting
> for feedback? :)

Thank you for bringing this up.

My apologies I could not followup on this. I have another version in my local
tree but could not post it as I was not sure if I solved the problem
correctly. I will
clean it up and post for feedback as soon as I can.

P.S: Due to serious personal/family health issues I am not able to
devote much time
on this with other professional commitments. I feel bad that I have
this unfinished task.
Just in last one year things have not been stable for me & my family
and still not getting :(

Best regards,
Pankaj

2021-03-12 06:12:06

by Dan Williams

[permalink] [raw]
Subject: Re: [RFC 0/2] virtio-pmem: Asynchronous flush

On Thu, Mar 11, 2021 at 8:21 PM Pankaj Gupta
<[email protected]> wrote:
>
> Hi David,
>
> > > Jeff reported preflush order issue with the existing implementation
> > > of virtio pmem preflush. Dan suggested[1] to implement asynchronous flush
> > > for virtio pmem using work queue as done in md/RAID. This patch series
> > > intends to solve the preflush ordering issue and also makes the flush
> > > asynchronous from the submitting thread POV.
> > >
> > > Submitting this patch series for feeback and is in WIP. I have
> > > done basic testing and currently doing more testing.
> > >
> > > Pankaj Gupta (2):
> > > pmem: make nvdimm_flush asynchronous
> > > virtio_pmem: Async virtio-pmem flush
> > >
> > > drivers/nvdimm/nd_virtio.c | 66 ++++++++++++++++++++++++++----------
> > > drivers/nvdimm/pmem.c | 15 ++++----
> > > drivers/nvdimm/region_devs.c | 3 +-
> > > drivers/nvdimm/virtio_pmem.c | 9 +++++
> > > drivers/nvdimm/virtio_pmem.h | 12 +++++++
> > > 5 files changed, 78 insertions(+), 27 deletions(-)
> > >
> > > [1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2
> > >
> >
> > Just wondering, was there any follow up of this or are we still waiting
> > for feedback? :)
>
> Thank you for bringing this up.
>
> My apologies I could not followup on this. I have another version in my local
> tree but could not post it as I was not sure if I solved the problem
> correctly. I will
> clean it up and post for feedback as soon as I can.
>
> P.S: Due to serious personal/family health issues I am not able to
> devote much time
> on this with other professional commitments. I feel bad that I have
> this unfinished task.
> Just in last one year things have not been stable for me & my family
> and still not getting :(

No worries Pankaj. Take care of yourself and your family. The
community can handle this for you. I'm open to coaching somebody
through what's involved to get this fix landed.

2021-03-12 08:57:36

by David Hildenbrand

[permalink] [raw]
Subject: Re: [RFC 0/2] virtio-pmem: Asynchronous flush

On 12.03.21 07:02, Dan Williams wrote:
> On Thu, Mar 11, 2021 at 8:21 PM Pankaj Gupta
> <[email protected]> wrote:
>>
>> Hi David,
>>
>>>> Jeff reported preflush order issue with the existing implementation
>>>> of virtio pmem preflush. Dan suggested[1] to implement asynchronous flush
>>>> for virtio pmem using work queue as done in md/RAID. This patch series
>>>> intends to solve the preflush ordering issue and also makes the flush
>>>> asynchronous from the submitting thread POV.
>>>>
>>>> Submitting this patch series for feeback and is in WIP. I have
>>>> done basic testing and currently doing more testing.
>>>>
>>>> Pankaj Gupta (2):
>>>> pmem: make nvdimm_flush asynchronous
>>>> virtio_pmem: Async virtio-pmem flush
>>>>
>>>> drivers/nvdimm/nd_virtio.c | 66 ++++++++++++++++++++++++++----------
>>>> drivers/nvdimm/pmem.c | 15 ++++----
>>>> drivers/nvdimm/region_devs.c | 3 +-
>>>> drivers/nvdimm/virtio_pmem.c | 9 +++++
>>>> drivers/nvdimm/virtio_pmem.h | 12 +++++++
>>>> 5 files changed, 78 insertions(+), 27 deletions(-)
>>>>
>>>> [1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2
>>>>
>>>
>>> Just wondering, was there any follow up of this or are we still waiting
>>> for feedback? :)
>>
>> Thank you for bringing this up.
>>
>> My apologies I could not followup on this. I have another version in my local
>> tree but could not post it as I was not sure if I solved the problem
>> correctly. I will
>> clean it up and post for feedback as soon as I can.
>>
>> P.S: Due to serious personal/family health issues I am not able to
>> devote much time
>> on this with other professional commitments. I feel bad that I have
>> this unfinished task.
>> Just in last one year things have not been stable for me & my family
>> and still not getting :(
>
> No worries Pankaj. Take care of yourself and your family. The
> community can handle this for you. I'm open to coaching somebody
> through what's involved to get this fix landed.

Absolutely, no need to worry for now - take care of yourself and your
loved ones! I was merely stumbling over this series while cleaning up my
inbox, wondering if this is still stuck waiting for review/feedback. No
need to rush anything or be stressed.

In case I have time to look into this in the future, I'd coordinate in
this thread (especially, asking for feedback again so I know where this
series stands)!

--
Thanks,

David / dhildenb