2019-07-24 04:26:11

by John Hubbard

[permalink] [raw]
Subject: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

From: John Hubbard <[email protected]>

Hi,

This is mostly Jerome's work, converting the block/bio and related areas
to call put_user_page*() instead of put_page(). Because I've changed
Jerome's patches, in some cases significantly, I'd like to get his
feedback before we actually leave him listed as the author (he might
want to disown some or all of these).

I added a new patch, in order to make this work with Christoph Hellwig's
recent overhaul to bio_release_pages(): "block: bio_release_pages: use
flags arg instead of bool".

I've started the series with a patch that I've posted in another
series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
because I'm not sure which of these will go in first, and this allows each
to stand alone.

Testing: not much beyond build and boot testing has been done yet. And
I'm not set up to even exercise all of it (especially the IB parts) at
run time.

Anyway, changes here are:

* Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
it is time to release the pages. That allows choosing between put_page()
and put_user_page*().

* Pass in one more piece of information to bio_release_pages: a "from_gup"
parameter. Similar use as above.

* Change the block layer, and several file systems, to use
put_user_page*().

[1] https://lore.kernel.org/r/[email protected]
And please note the correction email that I posted as a follow-up,
if you're looking closely at that patch. :) The fixed version is
included here.

John Hubbard (3):
mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
block: bio_release_pages: use flags arg instead of bool
fs/ceph: fix a build warning: returning a value from void function

Jérôme Glisse (9):
iov_iter: add helper to test if an iter would use GUP v2
block: bio_release_pages: convert put_page() to put_user_page*()
block_dev: convert put_page() to put_user_page*()
fs/nfs: convert put_page() to put_user_page*()
vhost-scsi: convert put_page() to put_user_page*()
fs/cifs: convert put_page() to put_user_page*()
fs/fuse: convert put_page() to put_user_page*()
fs/ceph: convert put_page() to put_user_page*()
9p/net: convert put_page() to put_user_page*()

block/bio.c | 81 ++++++++++++---
drivers/infiniband/core/umem.c | 5 +-
drivers/infiniband/hw/hfi1/user_pages.c | 5 +-
drivers/infiniband/hw/qib/qib_user_pages.c | 5 +-
drivers/infiniband/hw/usnic/usnic_uiom.c | 5 +-
drivers/infiniband/sw/siw/siw_mem.c | 8 +-
drivers/vhost/scsi.c | 13 ++-
fs/block_dev.c | 22 +++-
fs/ceph/debugfs.c | 2 +-
fs/ceph/file.c | 62 ++++++++---
fs/cifs/cifsglob.h | 3 +
fs/cifs/file.c | 22 +++-
fs/cifs/misc.c | 19 +++-
fs/direct-io.c | 2 +-
fs/fuse/dev.c | 22 +++-
fs/fuse/file.c | 53 +++++++---
fs/nfs/direct.c | 10 +-
include/linux/bio.h | 22 +++-
include/linux/mm.h | 5 +-
include/linux/uio.h | 11 ++
mm/gup.c | 115 +++++++++------------
net/9p/trans_common.c | 14 ++-
net/9p/trans_common.h | 3 +-
net/9p/trans_virtio.c | 18 +++-
24 files changed, 357 insertions(+), 170 deletions(-)

--
2.22.0


2019-07-24 04:26:22

by John Hubbard

[permalink] [raw]
Subject: [PATCH 03/12] block: bio_release_pages: use flags arg instead of bool

From: John Hubbard <[email protected]>

In commit d241a95f3514 ("block: optionally mark pages dirty in
bio_release_pages"), new "bool mark_dirty" argument was added to
bio_release_pages.

In upcoming work, another bool argument (to indicate that the pages came
from get_user_pages) is going to be added. That's one bool too many,
because it's not desirable have calls of the form:

foo(true, false, true, etc);

In order to prepare for that, change the argument from a bool, to a
typesafe (enum-based) flags argument.

Cc: Christoph Hellwig <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: Minwoo Im <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
---
block/bio.c | 12 ++++++------
fs/block_dev.c | 4 ++--
fs/direct-io.c | 2 +-
include/linux/bio.h | 13 ++++++++++++-
4 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 299a0e7651ec..7675e2de509d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -833,7 +833,7 @@ int bio_add_page(struct bio *bio, struct page *page,
}
EXPORT_SYMBOL(bio_add_page);

-void bio_release_pages(struct bio *bio, bool mark_dirty)
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags)
{
struct bvec_iter_all iter_all;
struct bio_vec *bvec;
@@ -842,7 +842,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
return;

bio_for_each_segment_all(bvec, bio, iter_all) {
- if (mark_dirty && !PageCompound(bvec->bv_page))
+ if ((flags & BIO_RP_MARK_DIRTY) && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
put_page(bvec->bv_page);
}
@@ -1421,7 +1421,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
return bio;

out_unmap:
- bio_release_pages(bio, false);
+ bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
return ERR_PTR(ret);
}
@@ -1437,7 +1437,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
*/
void bio_unmap_user(struct bio *bio)
{
- bio_release_pages(bio, bio_data_dir(bio) == READ);
+ bio_release_pages(bio, bio_rp_dirty_flag(bio_data_dir(bio) == READ));
bio_put(bio);
bio_put(bio);
}
@@ -1683,7 +1683,7 @@ static void bio_dirty_fn(struct work_struct *work)
while ((bio = next) != NULL) {
next = bio->bi_private;

- bio_release_pages(bio, true);
+ bio_release_pages(bio, BIO_RP_MARK_DIRTY);
bio_put(bio);
}
}
@@ -1699,7 +1699,7 @@ void bio_check_pages_dirty(struct bio *bio)
goto defer;
}

- bio_release_pages(bio, false);
+ bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
return;
defer:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 4707dfff991b..9fe6616f8788 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
}
__set_current_state(TASK_RUNNING);

- bio_release_pages(&bio, should_dirty);
+ bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
if (unlikely(bio.bi_status))
ret = blk_status_to_errno(bio.bi_status);

@@ -329,7 +329,7 @@ static void blkdev_bio_end_io(struct bio *bio)
if (should_dirty) {
bio_check_pages_dirty(bio);
} else {
- bio_release_pages(bio, false);
+ bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
}
}
diff --git a/fs/direct-io.c b/fs/direct-io.c
index ae196784f487..423ef431ddda 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -551,7 +551,7 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
if (dio->is_async && should_dirty) {
bio_check_pages_dirty(bio); /* transfers ownership */
} else {
- bio_release_pages(bio, should_dirty);
+ bio_release_pages(bio, bio_rp_dirty_flag(should_dirty));
bio_put(bio);
}
return err;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 3cdb84cdc488..2715e55679c1 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -440,7 +440,18 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
void __bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int off);
int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
-void bio_release_pages(struct bio *bio, bool mark_dirty);
+
+enum bio_rp_flags_t {
+ BIO_RP_NORMAL = 0,
+ BIO_RP_MARK_DIRTY = 1,
+};
+
+static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
+{
+ return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
+}
+
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
struct rq_map_data;
extern struct bio *bio_map_user_iov(struct request_queue *,
struct iov_iter *, gfp_t);
--
2.22.0

2019-07-24 04:26:57

by John Hubbard

[permalink] [raw]
Subject: [PATCH 08/12] fs/cifs: convert put_page() to put_user_page*()

From: Jérôme Glisse <[email protected]>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Boaz Harrosh <[email protected]>
Cc: Steve French <[email protected]>
---
fs/cifs/cifsglob.h | 3 +++
fs/cifs/file.c | 22 +++++++++++++++++-----
fs/cifs/misc.c | 19 +++++++++++++++----
3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index fe610e7e3670..e95cb82bfa50 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1283,6 +1283,7 @@ struct cifs_aio_ctx {
* If yes, iter is a copy of the user passed iov_iter
*/
bool direct_io;
+ bool from_gup;
};

struct cifs_readdata;
@@ -1317,6 +1318,7 @@ struct cifs_readdata {
struct cifs_credits credits;
unsigned int nr_pages;
struct page **pages;
+ bool from_gup;
};

struct cifs_writedata;
@@ -1343,6 +1345,7 @@ struct cifs_writedata {
struct cifs_credits credits;
unsigned int nr_pages;
struct page **pages;
+ bool from_gup;
};

/*
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 97090693d182..84fa7e0a578f 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2571,8 +2571,13 @@ cifs_uncached_writedata_release(struct kref *refcount)
struct cifs_writedata, refcount);

kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
- for (i = 0; i < wdata->nr_pages; i++)
- put_page(wdata->pages[i]);
+ if (wdata->from_gup) {
+ for (i = 0; i < wdata->nr_pages; i++)
+ put_user_page(wdata->pages[i]);
+ } else {
+ for (i = 0; i < wdata->nr_pages; i++)
+ put_page(wdata->pages[i]);
+ }
cifs_writedata_release(refcount);
}

@@ -2781,7 +2786,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
break;
}

-
+ wdata->from_gup = iov_iter_get_pages_use_gup(from);
wdata->page_offset = start;
wdata->tailsz =
nr_pages > 1 ?
@@ -2797,6 +2802,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
add_credits_and_wake_if(server, credits, 0);
break;
}
+ wdata->from_gup = false;

rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
if (rc) {
@@ -3238,8 +3244,12 @@ cifs_uncached_readdata_release(struct kref *refcount)
unsigned int i;

kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
- for (i = 0; i < rdata->nr_pages; i++) {
- put_page(rdata->pages[i]);
+ if (rdata->from_gup) {
+ for (i = 0; i < rdata->nr_pages; i++)
+ put_user_page(rdata->pages[i]);
+ } else {
+ for (i = 0; i < rdata->nr_pages; i++)
+ put_page(rdata->pages[i]);
}
cifs_readdata_release(refcount);
}
@@ -3502,6 +3512,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
break;
}

+ rdata->from_gup = iov_iter_get_pages_use_gup(&direct_iov);
npages = (cur_len + start + PAGE_SIZE-1) / PAGE_SIZE;
rdata->page_offset = start;
rdata->tailsz = npages > 1 ?
@@ -3519,6 +3530,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
rc = -ENOMEM;
break;
}
+ rdata->from_gup = false;

rc = cifs_read_allocate_pages(rdata, npages);
if (rc) {
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index f383877a6511..5a04c34fea05 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -822,10 +822,18 @@ cifs_aio_ctx_release(struct kref *refcount)
if (ctx->bv) {
unsigned i;

- for (i = 0; i < ctx->npages; i++) {
- if (ctx->should_dirty)
- set_page_dirty(ctx->bv[i].bv_page);
- put_page(ctx->bv[i].bv_page);
+ if (ctx->from_gup) {
+ for (i = 0; i < ctx->npages; i++) {
+ if (ctx->should_dirty)
+ set_page_dirty(ctx->bv[i].bv_page);
+ put_user_page(ctx->bv[i].bv_page);
+ }
+ } else {
+ for (i = 0; i < ctx->npages; i++) {
+ if (ctx->should_dirty)
+ set_page_dirty(ctx->bv[i].bv_page);
+ put_page(ctx->bv[i].bv_page);
+ }
}
kvfree(ctx->bv);
}
@@ -881,6 +889,9 @@ setup_aio_ctx_iter(struct cifs_aio_ctx *ctx, struct iov_iter *iter, int rw)

saved_len = count;

+ /* This is only use by cifs_aio_ctx_release() */
+ ctx->from_gup = iov_iter_get_pages_use_gup(iter);
+
while (count && npages < max_pages) {
rc = iov_iter_get_pages(iter, pages, count, max_pages, &start);
if (rc < 0) {
--
2.22.0

2019-07-24 04:27:12

by John Hubbard

[permalink] [raw]
Subject: [PATCH 10/12] fs/ceph: convert put_page() to put_user_page*()

From: Jérôme Glisse <[email protected]>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Use the enhanced put_user_pages_dirty_lock().

Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Boaz Harrosh <[email protected]>
Cc: "Yan, Zheng" <[email protected]>
Cc: Sage Weil <[email protected]>
Cc: Ilya Dryomov <[email protected]>
---
fs/ceph/file.c | 62 ++++++++++++++++++++++++++++++++++++++------------
1 file changed, 48 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 685a03cc4b77..c628a1f96978 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -158,18 +158,26 @@ static ssize_t iter_get_bvecs_alloc(struct iov_iter *iter, size_t maxsize,
return bytes;
}

-static void put_bvecs(struct bio_vec *bvecs, int num_bvecs, bool should_dirty)
+static void put_bvecs(struct bio_vec *bv, int num_bvecs, bool should_dirty,
+ bool from_gup)
{
int i;

+
for (i = 0; i < num_bvecs; i++) {
- if (bvecs[i].bv_page) {
+ if (!bv[i].bv_page)
+ continue;
+
+ if (from_gup) {
+ put_user_pages_dirty_lock(&bv[i].bv_page, 1,
+ should_dirty);
+ } else {
if (should_dirty)
- set_page_dirty_lock(bvecs[i].bv_page);
- put_page(bvecs[i].bv_page);
+ set_page_dirty_lock(bv[i].bv_page);
+ put_page(bv[i].bv_page);
}
}
- kvfree(bvecs);
+ kvfree(bv);
}

/*
@@ -730,6 +738,7 @@ struct ceph_aio_work {
};

static void ceph_aio_retry_work(struct work_struct *work);
+static void ceph_aio_from_gup_retry_work(struct work_struct *work);

static void ceph_aio_complete(struct inode *inode,
struct ceph_aio_request *aio_req)
@@ -774,7 +783,7 @@ static void ceph_aio_complete(struct inode *inode,
kfree(aio_req);
}

-static void ceph_aio_complete_req(struct ceph_osd_request *req)
+static void _ceph_aio_complete_req(struct ceph_osd_request *req, bool from_gup)
{
int rc = req->r_result;
struct inode *inode = req->r_inode;
@@ -793,7 +802,9 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)

aio_work = kmalloc(sizeof(*aio_work), GFP_NOFS);
if (aio_work) {
- INIT_WORK(&aio_work->work, ceph_aio_retry_work);
+ INIT_WORK(&aio_work->work, from_gup ?
+ ceph_aio_from_gup_retry_work :
+ ceph_aio_retry_work);
aio_work->req = req;
queue_work(ceph_inode_to_client(inode)->inode_wq,
&aio_work->work);
@@ -830,7 +841,7 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
}

put_bvecs(osd_data->bvec_pos.bvecs, osd_data->num_bvecs,
- aio_req->should_dirty);
+ aio_req->should_dirty, from_gup);
ceph_osdc_put_request(req);

if (rc < 0)
@@ -840,7 +851,17 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
return;
}

-static void ceph_aio_retry_work(struct work_struct *work)
+static void ceph_aio_complete_req(struct ceph_osd_request *req)
+{
+ _ceph_aio_complete_req(req, false);
+}
+
+static void ceph_aio_from_gup_complete_req(struct ceph_osd_request *req)
+{
+ _ceph_aio_complete_req(req, true);
+}
+
+static void _ceph_aio_retry_work(struct work_struct *work, bool from_gup)
{
struct ceph_aio_work *aio_work =
container_of(work, struct ceph_aio_work, work);
@@ -891,7 +912,8 @@ static void ceph_aio_retry_work(struct work_struct *work)

ceph_osdc_put_request(orig_req);

- req->r_callback = ceph_aio_complete_req;
+ req->r_callback = from_gup ? ceph_aio_from_gup_complete_req :
+ ceph_aio_complete_req;
req->r_inode = inode;
req->r_priv = aio_req;

@@ -899,13 +921,23 @@ static void ceph_aio_retry_work(struct work_struct *work)
out:
if (ret < 0) {
req->r_result = ret;
- ceph_aio_complete_req(req);
+ _ceph_aio_complete_req(req, from_gup);
}

ceph_put_snap_context(snapc);
kfree(aio_work);
}

+static void ceph_aio_retry_work(struct work_struct *work)
+{
+ _ceph_aio_retry_work(work, false);
+}
+
+static void ceph_aio_from_gup_retry_work(struct work_struct *work)
+{
+ _ceph_aio_retry_work(work, true);
+}
+
static ssize_t
ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
struct ceph_snap_context *snapc,
@@ -927,6 +959,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
loff_t pos = iocb->ki_pos;
bool write = iov_iter_rw(iter) == WRITE;
bool should_dirty = !write && iter_is_iovec(iter);
+ bool from_gup = iov_iter_get_pages_use_gup(iter);

if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP)
return -EROFS;
@@ -1023,7 +1056,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
aio_req->num_reqs++;
atomic_inc(&aio_req->pending_reqs);

- req->r_callback = ceph_aio_complete_req;
+ req->r_callback = !from_gup ? ceph_aio_complete_req :
+ ceph_aio_from_gup_complete_req;
req->r_inode = inode;
req->r_priv = aio_req;
list_add_tail(&req->r_private_item, &aio_req->osd_reqs);
@@ -1054,7 +1088,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
len = ret;
}

- put_bvecs(bvecs, num_pages, should_dirty);
+ put_bvecs(bvecs, num_pages, should_dirty, from_gup);
ceph_osdc_put_request(req);
if (ret < 0)
break;
@@ -1093,7 +1127,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
req, false);
if (ret < 0) {
req->r_result = ret;
- ceph_aio_complete_req(req);
+ _ceph_aio_complete_req(req, from_gup);
}
}
return -EIOCBQUEUED;
--
2.22.0

2019-07-24 04:27:36

by John Hubbard

[permalink] [raw]
Subject: [PATCH 05/12] block_dev: convert put_page() to put_user_page*()

From: Jérôme Glisse <[email protected]>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* reworked to be compatible with recent bio_release_pages() changes.

Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Boaz Harrosh <[email protected]>
---
block/bio.c | 13 +++++++++++++
fs/block_dev.c | 22 +++++++++++++++++-----
include/linux/bio.h | 8 ++++++++
3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 74f9eba2583b..3b9f66e64bc1 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1746,6 +1746,19 @@ void bio_check_pages_dirty(struct bio *bio)
__bio_check_pages_dirty(bio, false);
}

+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty)
+{
+ enum bio_rp_flags_t flags = BIO_RP_NORMAL;
+
+ if (mark_dirty)
+ flags |= BIO_RP_MARK_DIRTY;
+
+ if (iov_iter_get_pages_use_gup(iter))
+ flags |= BIO_RP_FROM_GUP;
+
+ return flags;
+}
+
void update_io_ticks(struct hd_struct *part, unsigned long now)
{
unsigned long stamp;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9fe6616f8788..d53abaf31e54 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
}
__set_current_state(TASK_RUNNING);

- bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
+ bio_release_pages(&bio, bio_rp_flags(iter, should_dirty));
if (unlikely(bio.bi_status))
ret = blk_status_to_errno(bio.bi_status);

@@ -295,7 +295,7 @@ static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
}

-static void blkdev_bio_end_io(struct bio *bio)
+static void _blkdev_bio_end_io(struct bio *bio, bool from_gup)
{
struct blkdev_dio *dio = bio->bi_private;
bool should_dirty = dio->should_dirty;
@@ -327,13 +327,23 @@ static void blkdev_bio_end_io(struct bio *bio)
}

if (should_dirty) {
- bio_check_pages_dirty(bio);
+ __bio_check_pages_dirty(bio, from_gup);
} else {
- bio_release_pages(bio, BIO_RP_NORMAL);
+ bio_release_pages(bio, bio_rp_gup_flag(from_gup));
bio_put(bio);
}
}

+static void blkdev_bio_end_io(struct bio *bio)
+{
+ _blkdev_bio_end_io(bio, false);
+}
+
+static void blkdev_bio_from_gup_end_io(struct bio *bio)
+{
+ _blkdev_bio_end_io(bio, true);
+}
+
static ssize_t
__blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
{
@@ -380,7 +390,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
bio->bi_iter.bi_sector = pos >> 9;
bio->bi_write_hint = iocb->ki_hint;
bio->bi_private = dio;
- bio->bi_end_io = blkdev_bio_end_io;
+ bio->bi_end_io = iov_iter_get_pages_use_gup(iter) ?
+ blkdev_bio_from_gup_end_io :
+ blkdev_bio_end_io;
bio->bi_ioprio = iocb->ki_ioprio;

ret = bio_iov_iter_get_pages(bio, iter);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d68a40c2c9d4..b9460d1a4679 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -452,6 +452,13 @@ static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
}

+static inline enum bio_rp_flags_t bio_rp_gup_flag(bool from_gup)
+{
+ return from_gup ? BIO_RP_FROM_GUP : BIO_RP_NORMAL;
+}
+
+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty);
+
void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
struct rq_map_data;
extern struct bio *bio_map_user_iov(struct request_queue *,
@@ -463,6 +470,7 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
gfp_t, int);
extern void bio_set_pages_dirty(struct bio *bio);
extern void bio_check_pages_dirty(struct bio *bio);
+void __bio_check_pages_dirty(struct bio *bio, bool from_gup);

void generic_start_io_acct(struct request_queue *q, int op,
unsigned long sectors, struct hd_struct *part);
--
2.22.0

2019-07-24 04:27:50

by John Hubbard

[permalink] [raw]
Subject: [PATCH 07/12] vhost-scsi: convert put_page() to put_user_page*()

From: Jérôme Glisse <[email protected]>

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Changed a WARN_ON to a BUG_ON.

Signed-off-by: Jérôme Glisse <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Jan Kara <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Johannes Thumshirn <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Boaz Harrosh <[email protected]>
Cc: Miklos Szeredi <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Stefan Hajnoczi <[email protected]>
---
drivers/vhost/scsi.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index a9caf1bc3c3e..282565ab5e3f 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -329,11 +329,11 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd)

if (tv_cmd->tvc_sgl_count) {
for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
- put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+ put_user_page(sg_page(&tv_cmd->tvc_sgl[i]));
}
if (tv_cmd->tvc_prot_sgl_count) {
for (i = 0; i < tv_cmd->tvc_prot_sgl_count; i++)
- put_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
+ put_user_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
}

vhost_scsi_put_inflight(tv_cmd->inflight);
@@ -630,6 +630,13 @@ vhost_scsi_map_to_sgl(struct vhost_scsi_cmd *cmd,
size_t offset;
unsigned int npages = 0;

+ /*
+ * Here in all cases we should have an IOVEC which use GUP. If that is
+ * not the case then we will wrongly call put_user_page() and the page
+ * refcount will go wrong (this is in vhost_scsi_release_cmd())
+ */
+ WARN_ON(!iov_iter_get_pages_use_gup(iter));
+
bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
VHOST_SCSI_PREALLOC_UPAGES, &offset);
/* No pages were pinned */
@@ -681,7 +688,7 @@ vhost_scsi_iov_to_sgl(struct vhost_scsi_cmd *cmd, bool write,
while (p < sg) {
struct page *page = sg_page(p++);
if (page)
- put_page(page);
+ put_user_page(page);
}
return ret;
}
--
2.22.0

2019-07-24 04:35:51

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 07/12] vhost-scsi: convert put_page() to put_user_page*()

On 7/23/19 9:25 PM, [email protected] wrote:
> From: Jérôme Glisse <[email protected]>
>
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page().
>
> This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
>
> Changes from Jérôme's original patch:
>
> * Changed a WARN_ON to a BUG_ON.
>

Clearly, the above commit log has it backwards (this is quite my night
for typos). Please read that as "changed a BUG_ON to a WARN_ON".

I'll correct the commit description in next iteration of this patchset.

...

> + /*
> + * Here in all cases we should have an IOVEC which use GUP. If that is
> + * not the case then we will wrongly call put_user_page() and the page
> + * refcount will go wrong (this is in vhost_scsi_release_cmd())
> + */
> + WARN_ON(!iov_iter_get_pages_use_gup(iter));
> +
...

thanks,
--
John Hubbard
NVIDIA

2019-07-24 05:32:15

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 03/12] block: bio_release_pages: use flags arg instead of bool

On Tue, Jul 23, 2019 at 09:25:09PM -0700, [email protected] wrote:
> From: John Hubbard <[email protected]>
>
> In commit d241a95f3514 ("block: optionally mark pages dirty in
> bio_release_pages"), new "bool mark_dirty" argument was added to
> bio_release_pages.
>
> In upcoming work, another bool argument (to indicate that the pages came
> from get_user_pages) is going to be added. That's one bool too many,
> because it's not desirable have calls of the form:

All pages releases by bio_release_pages should come from
get_get_user_pages, so I don't really see the point here.

2019-07-24 06:18:51

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

On Tue, Jul 23, 2019 at 09:25:06PM -0700, [email protected] wrote:
> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
> it is time to release the pages. That allows choosing between put_page()
> and put_user_page*().
>
> * Pass in one more piece of information to bio_release_pages: a "from_gup"
> parameter. Similar use as above.
>
> * Change the block layer, and several file systems, to use
> put_user_page*().

I think we can do this in a simple and better way. We have 5 ITER_*
types. Of those ITER_DISCARD as the name suggests never uses pages, so
we can skip handling it. ITER_PIPE is rejected іn the direct I/O path,
which leaves us with three.

Out of those ITER_BVEC needs a user page reference, so we want to call
put_user_page* on it. ITER_BVEC always already has page reference,
which means in the block direct I/O path path we alread don't take
a page reference. We should extent that handling to all other calls
of iov_iter_get_pages / iov_iter_get_pages_alloc. I think we should
just reject ITER_KVEC for direct I/O as well as we have no users and
it is rather pointless. Alternatively if we see a use for it the
callers should always have a life page reference anyway (or might
be on kmalloc memory), so we really should not take a reference either.

In other words: the only time we should ever have to put a page in
this patch is when they are user pages. We'll need to clean up
various bits of code for that, but that can be done gradually before
even getting to the actual put_user_pages conversion.

2019-07-29 20:58:36

by Jerome Glisse

[permalink] [raw]
Subject: Re: [PATCH 03/12] block: bio_release_pages: use flags arg instead of bool

On Tue, Jul 23, 2019 at 10:30:53PM -0700, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:09PM -0700, [email protected] wrote:
> > From: John Hubbard <[email protected]>
> >
> > In commit d241a95f3514 ("block: optionally mark pages dirty in
> > bio_release_pages"), new "bool mark_dirty" argument was added to
> > bio_release_pages.
> >
> > In upcoming work, another bool argument (to indicate that the pages came
> > from get_user_pages) is going to be added. That's one bool too many,
> > because it's not desirable have calls of the form:
>
> All pages releases by bio_release_pages should come from
> get_get_user_pages, so I don't really see the point here.

No they do not all comes from GUP for see various callers
of bio_check_pages_dirty() for instance iomap_dio_zero()

I have carefully tracked down all this and i did not do
anyconvertion just for the fun of it :)

Cheers,
J?r?me

2019-08-05 22:55:28

by John Hubbard

[permalink] [raw]
Subject: Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, [email protected] wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>> it is time to release the pages. That allows choosing between put_page()
>> and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>> parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>> put_user_page*().
>
> I think we can do this in a simple and better way. We have 5 ITER_*
> types. Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it. ITER_PIPE is rejected іn the direct I/O path,
> which leaves us with three.
>

Hi Christoph,

Are you working on anything like this? Or on the put_user_bvec() idea?
Please let me know, otherwise I'll go in and implement something here.


thanks,
--
John Hubbard
NVIDIA

> Out of those ITER_BVEC needs a user page reference, so we want to call
> put_user_page* on it. ITER_BVEC always already has page reference,
> which means in the block direct I/O path path we alread don't take
> a page reference. We should extent that handling to all other calls
> of iov_iter_get_pages / iov_iter_get_pages_alloc. I think we should
> just reject ITER_KVEC for direct I/O as well as we have no users and
> it is rather pointless. Alternatively if we see a use for it the
> callers should always have a life page reference anyway (or might
> be on kmalloc memory), so we really should not take a reference either.
>
> In other words: the only time we should ever have to put a page in
> this patch is when they are user pages. We'll need to clean up
> various bits of code for that, but that can be done gradually before
> even getting to the actual put_user_pages conversion.
>

2019-08-07 06:36:06

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote:
> On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> > On Tue, Jul 23, 2019 at 09:25:06PM -0700, [email protected] wrote:
> >> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
> >> Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
> >> it is time to release the pages. That allows choosing between put_page()
> >> and put_user_page*().
> >>
> >> * Pass in one more piece of information to bio_release_pages: a "from_gup"
> >> parameter. Similar use as above.
> >>
> >> * Change the block layer, and several file systems, to use
> >> put_user_page*().
> >
> > I think we can do this in a simple and better way. We have 5 ITER_*
> > types. Of those ITER_DISCARD as the name suggests never uses pages, so
> > we can skip handling it. ITER_PIPE is rejected іn the direct I/O path,
> > which leaves us with three.
> >
>
> Hi Christoph,
>
> Are you working on anything like this?

I was hoping I could steer you towards it. But if you don't want to do
it yourself I'll add it to my ever growing todo list.

> Or on the put_user_bvec() idea?

I have a prototype from two month ago:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec

but that only survived the most basic testing, so it'll need more work,
which I'm not sure when I'll find time for.