From: Ming Lei
To: Jeff Moyer
Cc: Jens Axboe, Linux Kernel Mailing List
Subject: Re: [PATCH 2/2] blk-plug: don't flush nested plug lists
Date: Tue, 7 Apr 2015 17:19:35 +0800

Hi Jeff,

On Tue, Apr 7, 2015 at 3:14 AM, Jeff Moyer wrote:
> The way the on-stack plugging currently works, each nesting level
> flushes its own list of I/Os.  This can be less than optimal (read:
> awful) for certain workloads.  For example, consider an application
> that issues asynchronous O_DIRECT I/Os.  It can send down a bunch of
> I/Os together in a single io_submit call, only to have each of them
> dispatched individually down in the bowels of the direct I/O code.
> The reason is that there are blk_plug's instantiated both at the
> upper call site in do_io_submit and down in do_direct_IO.  The latter
> will submit as little as 1 I/O at a time (if you have a small enough
> I/O size) instead of performing the batching that the plugging
> infrastructure is supposed to provide.
>
> Now, for the case where there is an elevator involved, this doesn't
> really matter too much.  The elevator will keep the I/O around long
> enough for it to be merged.  However, in cases where there is no
> elevator (like blk-mq), I/Os are simply dispatched immediately.
>
> Try this, for example (note I'm using a virtio-blk device, so it's
> using the blk-mq single queue path, though I've also reproduced this
> with the Micron P320h):
>
> fio --rw=read --bs=4k --iodepth=128 --iodepth_batch=16 --iodepth_batch_complete=16 --runtime=10s --direct=1 --filename=/dev/vdd --name=job1 --ioengine=libaio --time_based
>
> If you run that on a current kernel, you will get zero merges.  Zero!
> After this patch, you will get many merges (the actual number depends
> on how fast your storage is, obviously), and much better throughput.
> Here are results from my test rig:
>
> Unpatched kernel:
>   Read B/W:     283,638 KB/s
>   Read Merges:  0
>
> Patched kernel:
>   Read B/W:     873,224 KB/s
>   Read Merges:  2,046K

The data is impressive, but it might be better to provide some latency
numbers as well.
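Something along these lines would capture it; fio already prints
slat/clat/lat and completion-latency percentiles per job, and an
explicit percentile list (assuming your fio build supports
--percentile_list, which recent versions do) makes the tail easy to
compare:

    fio --rw=read --bs=4k --iodepth=128 --iodepth_batch=16 \
        --iodepth_batch_complete=16 --runtime=10s --direct=1 \
        --filename=/dev/vdd --name=job1 --ioengine=libaio --time_based \
        --percentile_list=50:90:99:99.9

Comparing the clat percentiles from patched and unpatched runs would
show whether the deeper batching costs anything in per-I/O completion
latency.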
> I considered several approaches to solving the problem:
> 1) get rid of the inner-most plugs
> 2) handle nesting by using only one on-stack plug
> 2a) #2, except use a per-cpu blk_plug struct, which may clean up the
>     code a bit at the expense of memory footprint
>
> Option 1 will be tricky or impossible to do, since inner-most plug
> lists are sometimes the only plug lists, depending on the call path.
> Option 2 is what this patch implements.  Option 2a is perhaps a
> better idea, but since I already implemented option 2, I figured I'd
> post it for comments and opinions before rewriting it.
>
> Much of the patch involves modifying call sites to blk_start_plug,
> since its signature is changed.  The meat of the patch is actually

I am wondering if the return type of blk_start_plug() really has to be
changed, since the active plug is always the top-level plug with your
patch, and blk_finish_plug() can find the active plug from
current->plug.

> pretty simple and constrained to block/blk-core.c and
> include/linux/blkdev.h.  The only tricky bits were places where plugs
> were finished and then restarted to flush out I/O.  There, I went
> ahead and exported blk_flush_plug_list and called that directly.
>
> Comments would be greatly appreciated.
>
> Signed-off-by: Jeff Moyer
> ---
>  block/blk-core.c                    | 33 +++++++++++++++++++--------------
>  block/blk-lib.c                     |  6 +++---
>  block/blk-throttle.c                |  6 +++---
>  drivers/block/xen-blkback/blkback.c |  6 +++---
>  drivers/md/dm-bufio.c               | 15 +++++++--------
>  drivers/md/dm-kcopyd.c              |  6 +++---
>  drivers/md/dm-thin.c                |  6 +++---
>  drivers/md/md.c                     |  6 +++---
>  drivers/md/raid1.c                  |  6 +++---
>  drivers/md/raid10.c                 |  6 +++---
>  drivers/md/raid5.c                  | 12 ++++++------
>  drivers/target/target_core_iblock.c |  6 +++---
>  fs/aio.c                            |  6 +++---
>  fs/block_dev.c                      |  6 +++---
>  fs/btrfs/scrub.c                    |  6 +++---
>  fs/btrfs/transaction.c              |  6 +++---
>  fs/btrfs/tree-log.c                 | 16 ++++++++--------
>  fs/btrfs/volumes.c                  | 12 +++++-------
>  fs/buffer.c                         |  6 +++---
>  fs/direct-io.c                      |  8 +++++---
>  fs/ext4/file.c                      |  6 +++---
>  fs/ext4/inode.c                     | 12 +++++-------
>  fs/f2fs/checkpoint.c                |  6 +++---
>  fs/f2fs/gc.c                        |  6 +++---
>  fs/f2fs/node.c                      |  6 +++---
>  fs/jbd/checkpoint.c                 |  6 +++---
>  fs/jbd/commit.c                     | 10 +++++-----
>  fs/jbd2/checkpoint.c                |  6 +++---
>  fs/jbd2/commit.c                    |  6 +++---
>  fs/mpage.c                          |  6 +++---
>  fs/xfs/xfs_buf.c                    | 12 ++++++------
>  fs/xfs/xfs_dir2_readdir.c           |  6 +++---
>  fs/xfs/xfs_itable.c                 |  6 +++---
>  include/linux/blkdev.h              |  3 ++-
>  mm/madvise.c                        |  6 +++---
>  mm/page-writeback.c                 |  6 +++---
>  mm/readahead.c                      |  6 +++---
>  mm/swap_state.c                     |  6 +++---
>  mm/vmscan.c                         |  6 +++---
>  39 files changed, 155 insertions(+), 152 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 794c3e7..64f3f2a 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -3002,7 +3002,7 @@ EXPORT_SYMBOL(kblockd_schedule_delayed_work_on);
>
>  /**
>   * blk_start_plug - initialize blk_plug and track it inside the task_struct
> - * @plug: The &struct blk_plug that needs to be initialized
> + * @plug: The on-stack &struct blk_plug that needs to be initialized
>   *
>   * Description:
>   *   Tracking blk_plug inside the task_struct will help with auto-flushing the
> @@ -3013,26 +3013,29 @@ EXPORT_SYMBOL(kblockd_schedule_delayed_work_on);
>   *   page belonging to that request that is currently residing in our private
>   *   plug. By flushing the pending I/O when the process goes to sleep, we avoid
>   *   this kind of deadlock.
> + *
> + * Returns: a pointer to the active &struct blk_plug
>   */
> -void blk_start_plug(struct blk_plug *plug)
> +struct blk_plug *blk_start_plug(struct blk_plug *plug)

The signature change might not be necessary.

>  {
>  	struct task_struct *tsk = current;
>
> +	if (tsk->plug) {
> +		tsk->plug->depth++;
> +		return tsk->plug;
> +	}
> +
> +	plug->depth = 1;
>  	INIT_LIST_HEAD(&plug->list);
>  	INIT_LIST_HEAD(&plug->mq_list);
>  	INIT_LIST_HEAD(&plug->cb_list);
>
>  	/*
> -	 * If this is a nested plug, don't actually assign it. It will be
> -	 * flushed on its own.
> +	 * Store ordering should not be needed here, since a potential
> +	 * preempt will imply a full memory barrier
>  	 */
> -	if (!tsk->plug) {
> -		/*
> -		 * Store ordering should not be needed here, since a potential
> -		 * preempt will imply a full memory barrier
> -		 */
> -		tsk->plug = plug;
> -	}
> +	tsk->plug = plug;
> +	return tsk->plug;
>  }
>  EXPORT_SYMBOL(blk_start_plug);

tsk->plug is always returned from this function, which means tsk->plug
is always the active plug.
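To make the depth-counting semantics concrete, here is a minimal
user-space model of the scheme (a toy sketch, not kernel code): nested
starts just bump a counter on the top-level plug, and only the
outermost finish actually flushes.

#include <stdio.h>

struct plug { int depth; };

static struct plug *current_plug;          /* stands in for current->plug */

static struct plug *start_plug(struct plug *p)
{
	if (current_plug) {
		current_plug->depth++;     /* nested: reuse the top plug */
		return current_plug;
	}
	p->depth = 1;                      /* outermost plug becomes active */
	current_plug = p;
	return p;
}

static void finish_plug(struct plug *p)
{
	if (--p->depth > 0)
		return;                    /* inner finish: nothing flushed */
	printf("flush!\n");                /* only the outermost finish flushes */
	current_plug = NULL;
}

int main(void)
{
	struct plug outer, inner;
	struct plug *op = start_plug(&outer);  /* e.g. do_io_submit() */
	struct plug *ip = start_plug(&inner);  /* e.g. do_direct_IO(); returns &outer */

	finish_plug(ip);                   /* depth 2 -> 1, no flush */
	finish_plug(op);                   /* depth 1 -> 0, prints "flush!" once */
	return 0;
}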
> @@ -3176,13 +3179,15 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>
>  	local_irq_restore(flags);
>  }
> +EXPORT_SYMBOL_GPL(blk_flush_plug_list);
>
>  void blk_finish_plug(struct blk_plug *plug)
>  {
> -	blk_flush_plug_list(plug, false);
> +	if (--plug->depth > 0)
> +		return;

The active plug should be current->plug, assuming blk_finish_plug() is
always paired with blk_start_plug().

>
> -	if (plug == current->plug)
> -		current->plug = NULL;
> +	blk_flush_plug_list(plug, false);
> +	current->plug = NULL;
>  }
>  EXPORT_SYMBOL(blk_finish_plug);
>
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index 7688ee3..e2d2448 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -48,7 +48,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  	struct bio_batch bb;
>  	struct bio *bio;
>  	int ret = 0;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	if (!q)
>  		return -ENXIO;
> @@ -81,7 +81,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  	bb.flags = 1 << BIO_UPTODATE;
>  	bb.wait = &wait;
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	while (nr_sects) {
>  		unsigned int req_sects;
>  		sector_t end_sect, tmp;
> @@ -128,7 +128,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  		 */
>  		cond_resched();
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);

Like the comment above, blk_finish_plug() can figure out the active
plug on its own, so it looks unnecessary to introduce a return value
for blk_start_plug().
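For reference, here is a rough sketch of the alternative I have in
mind: keep the original void signature, still count nesting depth, and
let blk_finish_plug() trust current->plug instead of its argument.
This assumes blk_start_plug()/blk_finish_plug() are always strictly
paired; it is only an illustration, not part of the posted patch.

void blk_start_plug(struct blk_plug *plug)
{
	struct task_struct *tsk = current;

	if (tsk->plug) {
		/* nested: the top-level plug stays the active one */
		tsk->plug->depth++;
		return;
	}

	plug->depth = 1;
	INIT_LIST_HEAD(&plug->list);
	INIT_LIST_HEAD(&plug->mq_list);
	INIT_LIST_HEAD(&plug->cb_list);
	tsk->plug = plug;
}

void blk_finish_plug(struct blk_plug *plug)
{
	/*
	 * @plug may be an inner plug that was never activated; the
	 * active one is always current->plug.
	 */
	struct blk_plug *active = current->plug;

	if (--active->depth > 0)
		return;

	blk_flush_plug_list(active, false);
	current->plug = NULL;
}

With this shape, none of the call sites would need to change at all,
which would drop most of the churn in the patch.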
>
>  	/* Wait for bios in-flight */
>  	if (!atomic_dec_and_test(&bb.done))
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 5b9c6d5..f57bbd3 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -1266,7 +1266,7 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
>  	struct request_queue *q = td->queue;
>  	struct bio_list bio_list_on_stack;
>  	struct bio *bio;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>  	int rw;
>
>  	bio_list_init(&bio_list_on_stack);
> @@ -1278,10 +1278,10 @@ static void blk_throtl_dispatch_work_fn(struct work_struct *work)
>  	spin_unlock_irq(q->queue_lock);
>
>  	if (!bio_list_empty(&bio_list_on_stack)) {
> -		blk_start_plug(&plug);
> +		plug = blk_start_plug(&onstack_plug);
>  		while((bio = bio_list_pop(&bio_list_on_stack)))
>  			generic_make_request(bio);
> -		blk_finish_plug(&plug);
> +		blk_finish_plug(plug);
>  	}
>  }
>
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index 2a04d34..f075182 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -1207,7 +1207,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	struct bio **biolist = pending_req->biolist;
>  	int i, nbio = 0;
>  	int operation;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>  	bool drain = false;
>  	struct grant_page **pages = pending_req->segments;
>  	unsigned short req_operation;
> @@ -1368,13 +1368,13 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	}
>
>  	atomic_set(&pending_req->pendcnt, nbio);
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>
>  	for (i = 0; i < nbio; i++)
>  		submit_bio(operation, biolist[i]);
>
>  	/* Let the I/Os go.. */
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>
>  	if (operation == READ)
>  		blkif->st_rd_sect += preq.nr_sects;
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index 86dbbc7..a6cbc6f 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -706,8 +706,8 @@ static void __write_dirty_buffer(struct dm_buffer *b,
>
>  static void __flush_write_list(struct list_head *write_list)
>  {
> -	struct blk_plug plug;
> -	blk_start_plug(&plug);
> +	struct blk_plug *plug, onstack_plug;
> +	plug = blk_start_plug(&onstack_plug);
>  	while (!list_empty(write_list)) {
>  		struct dm_buffer *b =
>  			list_entry(write_list->next, struct dm_buffer, write_list);
> @@ -715,7 +715,7 @@ static void __flush_write_list(struct list_head *write_list)
>  		submit_io(b, WRITE, b->block, write_endio);
>  		dm_bufio_cond_resched();
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>  }
>
>  /*
> @@ -1110,13 +1110,13 @@ EXPORT_SYMBOL_GPL(dm_bufio_new);
>  void dm_bufio_prefetch(struct dm_bufio_client *c,
>  		       sector_t block, unsigned n_blocks)
>  {
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	LIST_HEAD(write_list);
>
>  	BUG_ON(dm_bufio_in_request());
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	dm_bufio_lock(c);
>
>  	for (; n_blocks--; block++) {
> @@ -1126,9 +1126,8 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
>  				 &write_list);
>  		if (unlikely(!list_empty(&write_list))) {
>  			dm_bufio_unlock(c);
> -			blk_finish_plug(&plug);
>  			__flush_write_list(&write_list);
> -			blk_start_plug(&plug);
> +			blk_flush_plug_list(plug, false);
>  			dm_bufio_lock(c);
>  		}
>  		if (unlikely(b != NULL)) {
> @@ -1149,7 +1148,7 @@ void dm_bufio_prefetch(struct dm_bufio_client *c,
>  	dm_bufio_unlock(c);
>
>  flush_plug:
> -	blk_finish_plug(&plug);
> +
blk_finish_plug(plug); > } > EXPORT_SYMBOL_GPL(dm_bufio_prefetch); > > diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c > index 3a7cade..fed371f 100644 > --- a/drivers/md/dm-kcopyd.c > +++ b/drivers/md/dm-kcopyd.c > @@ -580,7 +580,7 @@ static void do_work(struct work_struct *work) > { > struct dm_kcopyd_client *kc = container_of(work, > struct dm_kcopyd_client, kcopyd_work); > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > /* > * The order that these are called is *very* important. > @@ -589,11 +589,11 @@ static void do_work(struct work_struct *work) > * list. io jobs call wake when they complete and it all > * starts again. > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > process_jobs(&kc->complete_jobs, kc, run_complete_job); > process_jobs(&kc->pages_jobs, kc, run_pages_job); > process_jobs(&kc->io_jobs, kc, run_io_job); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > /* > diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c > index 921aafd..6a7459a 100644 > --- a/drivers/md/dm-thin.c > +++ b/drivers/md/dm-thin.c > @@ -1775,7 +1775,7 @@ static void process_thin_deferred_bios(struct thin_c *tc) > unsigned long flags; > struct bio *bio; > struct bio_list bios; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > unsigned count = 0; > > if (tc->requeue_mode) { > @@ -1799,7 +1799,7 @@ static void process_thin_deferred_bios(struct thin_c *tc) > > spin_unlock_irqrestore(&tc->lock, flags); > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > while ((bio = bio_list_pop(&bios))) { > /* > * If we've got no free new_mapping structs, and processing > @@ -1824,7 +1824,7 @@ static void process_thin_deferred_bios(struct thin_c *tc) > dm_pool_issue_prefetches(pool->pmd); > } > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > static int cmp_cells(const void *lhs, const void *rhs) > diff --git a/drivers/md/md.c b/drivers/md/md.c > index 717daad..9f24719 100644 > --- a/drivers/md/md.c > +++ b/drivers/md/md.c > @@ -7402,7 +7402,7 @@ void md_do_sync(struct md_thread *thread) > int skipped = 0; > struct md_rdev *rdev; > char *desc, *action = NULL; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > /* just incase thread restarts... 
 */
>  	if (test_bit(MD_RECOVERY_DONE, &mddev->recovery))
> @@ -7572,7 +7572,7 @@ void md_do_sync(struct md_thread *thread)
>  	md_new_event(mddev);
>  	update_time = jiffies;
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	while (j < max_sectors) {
>  		sector_t sectors;
>
> @@ -7686,7 +7686,7 @@ void md_do_sync(struct md_thread *thread)
>  	/*
>  	 * this also signals 'finished resyncing' to md_stop
>  	 */
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>  	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
>
>  	/* tell personality that we are finished */
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index d34e238..2608bc3 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2399,11 +2399,11 @@ static void raid1d(struct md_thread *thread)
>  	unsigned long flags;
>  	struct r1conf *conf = mddev->private;
>  	struct list_head *head = &conf->retry_list;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	md_check_recovery(mddev);
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	for (;;) {
>
>  		flush_pending_writes(conf);
> @@ -2441,7 +2441,7 @@ static void raid1d(struct md_thread *thread)
>  		if (mddev->flags & ~(1<<MD_CHANGE_PENDING))
>  			md_check_recovery(mddev);
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>  }
>
>  static int init_resync(struct r1conf *conf)
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index a7196c4..7bea5c7 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -2791,11 +2791,11 @@ static void raid10d(struct md_thread *thread)
>  	unsigned long flags;
>  	struct r10conf *conf = mddev->private;
>  	struct list_head *head = &conf->retry_list;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	md_check_recovery(mddev);
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	for (;;) {
>
>  		flush_pending_writes(conf);
> @@ -2835,7 +2835,7 @@ static void raid10d(struct md_thread *thread)
>  		if (mddev->flags & ~(1<<MD_CHANGE_PENDING))
>  			md_check_recovery(mddev);
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>  }
>
>  static int init_resync(struct r10conf *conf)
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cd2f96b..59e7090 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -5259,11 +5259,11 @@ static void raid5_do_work(struct work_struct *work)
>  	struct r5conf *conf = group->conf;
>  	int group_id = group - conf->worker_groups;
>  	int handled;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	pr_debug("+++ raid5worker active\n");
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	handled = 0;
>  	spin_lock_irq(&conf->device_lock);
>  	while (1) {
> @@ -5281,7 +5281,7 @@ static void raid5_do_work(struct work_struct *work)
>  	pr_debug("%d stripes handled\n", handled);
>
>  	spin_unlock_irq(&conf->device_lock);
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>
>  	pr_debug("--- raid5worker inactive\n");
>  }
> @@ -5298,13 +5298,13 @@ static void raid5d(struct md_thread *thread)
>  	struct mddev *mddev = thread->mddev;
>  	struct r5conf *conf = mddev->private;
>  	int handled;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>
>  	pr_debug("+++ raid5d active\n");
>
>  	md_check_recovery(mddev);
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	handled = 0;
>  	spin_lock_irq(&conf->device_lock);
>  	while (1) {
> @@ -5352,7 +5352,7 @@ static void raid5d(struct md_thread *thread)
>  	spin_unlock_irq(&conf->device_lock);
>
>  	async_tx_issue_pending_all();
> -
blk_finish_plug(&plug); > + blk_finish_plug(plug); > > pr_debug("--- raid5d inactive\n"); > } > diff --git a/drivers/target/target_core_iblock.c b/drivers/target/target_core_iblock.c > index d4a4b0f..248a6bb 100644 > --- a/drivers/target/target_core_iblock.c > +++ b/drivers/target/target_core_iblock.c > @@ -361,13 +361,13 @@ iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num) > > static void iblock_submit_bios(struct bio_list *list, int rw) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > struct bio *bio; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > while ((bio = bio_list_pop(list))) > submit_bio(rw, bio); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > static void iblock_end_io_flush(struct bio *bio, int err) > diff --git a/fs/aio.c b/fs/aio.c > index f8e52a1..b1c3583 100644 > --- a/fs/aio.c > +++ b/fs/aio.c > @@ -1575,7 +1575,7 @@ long do_io_submit(aio_context_t ctx_id, long nr, > struct kioctx *ctx; > long ret = 0; > int i = 0; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > if (unlikely(nr < 0)) > return -EINVAL; > @@ -1592,7 +1592,7 @@ long do_io_submit(aio_context_t ctx_id, long nr, > return -EINVAL; > } > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > /* > * AKPM: should this return a partial result if some of the IOs were > @@ -1616,7 +1616,7 @@ long do_io_submit(aio_context_t ctx_id, long nr, > if (ret) > break; > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > percpu_ref_put(&ctx->users); > return i ? i : ret; > diff --git a/fs/block_dev.c b/fs/block_dev.c > index 975266b..928d3e0 100644 > --- a/fs/block_dev.c > +++ b/fs/block_dev.c > @@ -1598,10 +1598,10 @@ static long block_ioctl(struct file *file, unsigned cmd, unsigned long arg) > ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from) > { > struct file *file = iocb->ki_filp; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > ssize_t ret; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > ret = __generic_file_write_iter(iocb, from); > if (ret > 0) { > ssize_t err; > @@ -1609,7 +1609,7 @@ ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from) > if (err < 0) > ret = err; > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > return ret; > } > EXPORT_SYMBOL_GPL(blkdev_write_iter); > diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c > index ec57687..768bac3 100644 > --- a/fs/btrfs/scrub.c > +++ b/fs/btrfs/scrub.c > @@ -2967,7 +2967,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, > struct btrfs_root *root = fs_info->extent_root; > struct btrfs_root *csum_root = fs_info->csum_root; > struct btrfs_extent_item *extent; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > u64 flags; > int ret; > int slot; > @@ -3088,7 +3088,7 @@ static noinline_for_stack int scrub_stripe(struct scrub_ctx *sctx, > * collect all data csums for the stripe to avoid seeking during > * the scrub. This might currently (crc32) end up to be about 1MB > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > /* > * now find all extents for each stripe and scrub them > @@ -3316,7 +3316,7 @@ out: > scrub_wr_submit(sctx); > mutex_unlock(&sctx->wr_ctx.wr_lock); > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > btrfs_free_path(path); > btrfs_free_path(ppath); > return ret < 0 ? 
ret : 0; > diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c > index 8be4278..b5a8078 100644 > --- a/fs/btrfs/transaction.c > +++ b/fs/btrfs/transaction.c > @@ -979,11 +979,11 @@ static int btrfs_write_and_wait_marked_extents(struct btrfs_root *root, > { > int ret; > int ret2; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > ret = btrfs_write_marked_extents(root, dirty_pages, mark); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > ret2 = btrfs_wait_marked_extents(root, dirty_pages, mark); > > if (ret) > diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c > index c5b8ba3..1623924 100644 > --- a/fs/btrfs/tree-log.c > +++ b/fs/btrfs/tree-log.c > @@ -2519,7 +2519,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > struct btrfs_root *log_root_tree = root->fs_info->log_root_tree; > int log_transid = 0; > struct btrfs_log_ctx root_log_ctx; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > mutex_lock(&root->log_mutex); > log_transid = ctx->log_transid; > @@ -2571,10 +2571,10 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > /* we start IO on all the marked extents here, but we don't actually > * wait for them until later. > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > ret = btrfs_write_marked_extents(log, &log->dirty_log_pages, mark); > if (ret) { > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > btrfs_abort_transaction(trans, root, ret); > btrfs_free_logged_extents(log, log_transid); > btrfs_set_log_full_commit(root->fs_info, trans); > @@ -2619,7 +2619,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > if (!list_empty(&root_log_ctx.list)) > list_del_init(&root_log_ctx.list); > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > btrfs_set_log_full_commit(root->fs_info, trans); > > if (ret != -ENOSPC) { > @@ -2635,7 +2635,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > } > > if (log_root_tree->log_transid_committed >= root_log_ctx.log_transid) { > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > mutex_unlock(&log_root_tree->log_mutex); > ret = root_log_ctx.log_ret; > goto out; > @@ -2643,7 +2643,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > > index2 = root_log_ctx.log_transid % 2; > if (atomic_read(&log_root_tree->log_commit[index2])) { > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > ret = btrfs_wait_marked_extents(log, &log->dirty_log_pages, > mark); > btrfs_wait_logged_extents(trans, log, log_transid); > @@ -2669,7 +2669,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > * check the full commit flag again > */ > if (btrfs_need_log_full_commit(root->fs_info, trans)) { > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > btrfs_wait_marked_extents(log, &log->dirty_log_pages, mark); > btrfs_free_logged_extents(log, log_transid); > mutex_unlock(&log_root_tree->log_mutex); > @@ -2680,7 +2680,7 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans, > ret = btrfs_write_marked_extents(log_root_tree, > &log_root_tree->dirty_log_pages, > EXTENT_DIRTY | EXTENT_NEW); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > if (ret) { > btrfs_set_log_full_commit(root->fs_info, trans); > btrfs_abort_transaction(trans, root, ret); > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 8222f6f..0e215ff 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -261,7 +261,7 @@ static noinline void run_scheduled_bios(struct btrfs_device *device) > 
unsigned long last_waited = 0; > int force_reg = 0; > int sync_pending = 0; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > /* > * this function runs all the bios we've collected for > @@ -269,7 +269,7 @@ static noinline void run_scheduled_bios(struct btrfs_device *device) > * another device without first sending all of these down. > * So, setup a plug here and finish it off before we return > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > bdi = blk_get_backing_dev_info(device->bdev); > fs_info = device->dev_root->fs_info; > @@ -358,8 +358,7 @@ loop_lock: > if (pending_bios == &device->pending_sync_bios) { > sync_pending = 1; > } else if (sync_pending) { > - blk_finish_plug(&plug); > - blk_start_plug(&plug); > + blk_flush_plug_list(plug, false); > sync_pending = 0; > } > > @@ -415,8 +414,7 @@ loop_lock: > } > /* unplug every 64 requests just for good measure */ > if (batch_run % 64 == 0) { > - blk_finish_plug(&plug); > - blk_start_plug(&plug); > + blk_flush_plug_list(plug, false); > sync_pending = 0; > } > } > @@ -431,7 +429,7 @@ loop_lock: > spin_unlock(&device->io_lock); > > done: > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > static void pending_bios_fn(struct btrfs_work *work) > diff --git a/fs/buffer.c b/fs/buffer.c > index 20805db..727d642 100644 > --- a/fs/buffer.c > +++ b/fs/buffer.c > @@ -717,10 +717,10 @@ static int fsync_buffers_list(spinlock_t *lock, struct list_head *list) > struct list_head tmp; > struct address_space *mapping; > int err = 0, err2; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > INIT_LIST_HEAD(&tmp); > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > spin_lock(lock); > while (!list_empty(list)) { > @@ -758,7 +758,7 @@ static int fsync_buffers_list(spinlock_t *lock, struct list_head *list) > } > > spin_unlock(lock); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > spin_lock(lock); > > while (!list_empty(&tmp)) { > diff --git a/fs/direct-io.c b/fs/direct-io.c > index e181b6b..e79c0c6 100644 > --- a/fs/direct-io.c > +++ b/fs/direct-io.c > @@ -488,6 +488,8 @@ static int dio_bio_complete(struct dio *dio, struct bio *bio) > static void dio_await_completion(struct dio *dio) > { > struct bio *bio; > + > + blk_flush_plug(current); > do { > bio = dio_await_one(dio); > if (bio) > @@ -1108,7 +1110,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, > struct dio *dio; > struct dio_submit sdio = { 0, }; > struct buffer_head map_bh = { 0, }; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > unsigned long align = offset | iov_iter_alignment(iter); > > if (rw & WRITE) > @@ -1231,7 +1233,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, > > sdio.pages_in_io += iov_iter_npages(iter, INT_MAX); > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > retval = do_direct_IO(dio, &sdio, &map_bh); > if (retval) > @@ -1262,7 +1264,7 @@ do_blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode, > if (sdio.bio) > dio_bio_submit(dio, &sdio); > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > /* > * It is possible that, we return short IO due to end of file. 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c > index 33a09da..fd0cf21 100644 > --- a/fs/ext4/file.c > +++ b/fs/ext4/file.c > @@ -94,7 +94,7 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > struct file *file = iocb->ki_filp; > struct inode *inode = file_inode(iocb->ki_filp); > struct mutex *aio_mutex = NULL; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int o_direct = io_is_direct(file); > int overwrite = 0; > size_t length = iov_iter_count(from); > @@ -139,7 +139,7 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > > iocb->private = &overwrite; > if (o_direct) { > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > > /* check whether we do a DIO overwrite or not */ > @@ -183,7 +183,7 @@ ext4_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > ret = err; > } > if (o_direct) > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > errout: > if (aio_mutex) > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 5cb9a21..d4b645b 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -2284,7 +2284,7 @@ static int ext4_writepages(struct address_space *mapping, > int needed_blocks, rsv_blocks = 0, ret = 0; > struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb); > bool done; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > bool give_up_on_write = false; > > trace_ext4_writepages(inode, wbc); > @@ -2298,11 +2298,9 @@ static int ext4_writepages(struct address_space *mapping, > goto out_writepages; > > if (ext4_should_journal_data(inode)) { > - struct blk_plug plug; > - > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > ret = write_cache_pages(mapping, wbc, __writepage, mapping); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > goto out_writepages; > } > > @@ -2368,7 +2366,7 @@ retry: > if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages) > tag_pages_for_writeback(mapping, mpd.first_page, mpd.last_page); > done = false; > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > while (!done && mpd.first_page <= mpd.last_page) { > /* For each extent of pages we use new io_end */ > mpd.io_submit.io_end = ext4_init_io_end(inode, GFP_KERNEL); > @@ -2438,7 +2436,7 @@ retry: > if (ret) > break; > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > if (!ret && !cycled && wbc->nr_to_write > 0) { > cycled = 1; > mpd.last_page = writeback_index - 1; > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c > index 7f794b7..de2b522 100644 > --- a/fs/f2fs/checkpoint.c > +++ b/fs/f2fs/checkpoint.c > @@ -810,10 +810,10 @@ static int block_operations(struct f2fs_sb_info *sbi) > .nr_to_write = LONG_MAX, > .for_reclaim = 0, > }; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int err = 0; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > retry_flush_dents: > f2fs_lock_all(sbi); > @@ -846,7 +846,7 @@ retry_flush_nodes: > goto retry_flush_nodes; > } > out: > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > return err; > } > > diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c > index 76adbc3..082e961 100644 > --- a/fs/f2fs/gc.c > +++ b/fs/f2fs/gc.c > @@ -661,12 +661,12 @@ static void do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno, > { > struct page *sum_page; > struct f2fs_summary_block *sum; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > /* read segment summary of victim */ > sum_page = get_sum_page(sbi, segno); > > - blk_start_plug(&plug); > + plug = 
blk_start_plug(&onstack_plug); > > sum = page_address(sum_page); > > @@ -678,7 +678,7 @@ static void do_garbage_collect(struct f2fs_sb_info *sbi, unsigned int segno, > gc_data_segment(sbi, sum->entries, gc_list, segno, gc_type); > break; > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > stat_inc_seg_count(sbi, GET_SUM_TYPE((&sum->footer))); > stat_inc_call_count(sbi->stat_info); > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c > index 97bd9d3..e1fc81e 100644 > --- a/fs/f2fs/node.c > +++ b/fs/f2fs/node.c > @@ -1066,7 +1066,7 @@ repeat: > struct page *get_node_page_ra(struct page *parent, int start) > { > struct f2fs_sb_info *sbi = F2FS_P_SB(parent); > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > struct page *page; > int err, i, end; > nid_t nid; > @@ -1086,7 +1086,7 @@ repeat: > else if (err == LOCKED_PAGE) > goto page_hit; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > /* Then, try readahead for siblings of the desired node */ > end = start + MAX_RA_NODE; > @@ -1098,7 +1098,7 @@ repeat: > ra_node_page(sbi, nid); > } > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > lock_page(page); > if (unlikely(page->mapping != NODE_MAPPING(sbi))) { > diff --git a/fs/jbd/checkpoint.c b/fs/jbd/checkpoint.c > index 08c0304..29b550d 100644 > --- a/fs/jbd/checkpoint.c > +++ b/fs/jbd/checkpoint.c > @@ -258,12 +258,12 @@ static void > __flush_batch(journal_t *journal, struct buffer_head **bhs, int *batch_count) > { > int i; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (i = 0; i < *batch_count; i++) > write_dirty_buffer(bhs[i], WRITE_SYNC); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > for (i = 0; i < *batch_count; i++) { > struct buffer_head *bh = bhs[i]; > diff --git a/fs/jbd/commit.c b/fs/jbd/commit.c > index bb217dc..ccb32f0 100644 > --- a/fs/jbd/commit.c > +++ b/fs/jbd/commit.c > @@ -311,7 +311,7 @@ void journal_commit_transaction(journal_t *journal) > int first_tag = 0; > int tag_flag; > int i; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int write_op = WRITE; > > /* > @@ -444,10 +444,10 @@ void journal_commit_transaction(journal_t *journal) > * Now start flushing things to disk, in the order they appear > * on the transaction lists. Data blocks go first. > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > err = journal_submit_data_buffers(journal, commit_transaction, > write_op); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > /* > * Wait for all previously submitted IO to complete. > @@ -503,7 +503,7 @@ void journal_commit_transaction(journal_t *journal) > err = 0; > } > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > journal_write_revoke_records(journal, commit_transaction, write_op); > > @@ -697,7 +697,7 @@ start_journal_io: > } > } > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > /* Lo and behold: we have just managed to send a transaction to > the log. 
Before we can commit it, wait for the IO so far to > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c > index 988b32e..31b5e73 100644 > --- a/fs/jbd2/checkpoint.c > +++ b/fs/jbd2/checkpoint.c > @@ -182,12 +182,12 @@ static void > __flush_batch(journal_t *journal, int *batch_count) > { > int i; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (i = 0; i < *batch_count; i++) > write_dirty_buffer(journal->j_chkpt_bhs[i], WRITE_SYNC); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > for (i = 0; i < *batch_count; i++) { > struct buffer_head *bh = journal->j_chkpt_bhs[i]; > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c > index b73e021..713d26a 100644 > --- a/fs/jbd2/commit.c > +++ b/fs/jbd2/commit.c > @@ -390,7 +390,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > int tag_bytes = journal_tag_bytes(journal); > struct buffer_head *cbh = NULL; /* For transactional checksums */ > __u32 crc32_sum = ~0; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > /* Tail of the journal */ > unsigned long first_block; > tid_t first_tid; > @@ -555,7 +555,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) > if (err) > jbd2_journal_abort(journal, err); > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > jbd2_journal_write_revoke_records(journal, commit_transaction, > &log_bufs, WRITE_SYNC); > > @@ -805,7 +805,7 @@ start_journal_io: > __jbd2_journal_abort_hard(journal); > } > > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > /* Lo and behold: we have just managed to send a transaction to > the log. Before we can commit it, wait for the IO so far to > diff --git a/fs/mpage.c b/fs/mpage.c > index 3e79220..926dc42 100644 > --- a/fs/mpage.c > +++ b/fs/mpage.c > @@ -676,10 +676,10 @@ int > mpage_writepages(struct address_space *mapping, > struct writeback_control *wbc, get_block_t get_block) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int ret; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > if (!get_block) > ret = generic_writepages(mapping, wbc); > @@ -695,7 +695,7 @@ mpage_writepages(struct address_space *mapping, > if (mpd.bio) > mpage_bio_submit(WRITE, mpd.bio); > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > return ret; > } > EXPORT_SYMBOL(mpage_writepages); > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index 1790b00..776ac5a 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -1208,7 +1208,7 @@ STATIC void > _xfs_buf_ioapply( > struct xfs_buf *bp) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int rw; > int offset; > int size; > @@ -1281,7 +1281,7 @@ _xfs_buf_ioapply( > */ > offset = bp->b_offset; > size = BBTOB(bp->b_io_length); > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (i = 0; i < bp->b_map_count; i++) { > xfs_buf_ioapply_map(bp, i, &offset, &size, rw); > if (bp->b_error) > @@ -1289,7 +1289,7 @@ _xfs_buf_ioapply( > if (size <= 0) > break; /* all done */ > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > /* > @@ -1772,7 +1772,7 @@ __xfs_buf_delwri_submit( > struct list_head *io_list, > bool wait) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > struct xfs_buf *bp, *n; > int pinned = 0; > > @@ -1806,7 +1806,7 @@ __xfs_buf_delwri_submit( > > list_sort(NULL, io_list, xfs_buf_cmp); > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); 
> list_for_each_entry_safe(bp, n, io_list, b_list) { > bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_ASYNC | XBF_WRITE_FAIL); > bp->b_flags |= XBF_WRITE | XBF_ASYNC; > @@ -1823,7 +1823,7 @@ __xfs_buf_delwri_submit( > > xfs_buf_submit(bp); > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > return pinned; > } > diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c > index 098cd78..6074671 100644 > --- a/fs/xfs/xfs_dir2_readdir.c > +++ b/fs/xfs/xfs_dir2_readdir.c > @@ -275,7 +275,7 @@ xfs_dir2_leaf_readbuf( > struct xfs_inode *dp = args->dp; > struct xfs_buf *bp = *bpp; > struct xfs_bmbt_irec *map = mip->map; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int error = 0; > int length; > int i; > @@ -404,7 +404,7 @@ xfs_dir2_leaf_readbuf( > /* > * Do we need more readahead? > */ > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (mip->ra_index = mip->ra_offset = i = 0; > mip->ra_want > mip->ra_current && i < mip->map_blocks; > i += geo->fsbcount) { > @@ -455,7 +455,7 @@ xfs_dir2_leaf_readbuf( > } > } > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > out: > *bpp = bp; > diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c > index 82e3142..b890aa4 100644 > --- a/fs/xfs/xfs_itable.c > +++ b/fs/xfs/xfs_itable.c > @@ -179,7 +179,7 @@ xfs_bulkstat_ichunk_ra( > struct xfs_inobt_rec_incore *irec) > { > xfs_agblock_t agbno; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int blks_per_cluster; > int inodes_per_cluster; > int i; /* inode chunk index */ > @@ -188,7 +188,7 @@ xfs_bulkstat_ichunk_ra( > blks_per_cluster = xfs_icluster_size_fsb(mp); > inodes_per_cluster = blks_per_cluster << mp->m_sb.sb_inopblog; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (i = 0; i < XFS_INODES_PER_CHUNK; > i += inodes_per_cluster, agbno += blks_per_cluster) { > if (xfs_inobt_maskn(i, inodes_per_cluster) & ~irec->ir_free) { > @@ -196,7 +196,7 @@ xfs_bulkstat_ichunk_ra( > &xfs_inode_buf_ops); > } > } > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > } > > /* > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h > index 7f9a516..4617fce 100644 > --- a/include/linux/blkdev.h > +++ b/include/linux/blkdev.h > @@ -1091,6 +1091,7 @@ static inline void blk_post_runtime_resume(struct request_queue *q, int err) {} > * schedule() where blk_schedule_flush_plug() is called. 
> */ > struct blk_plug { > + int depth; /* number of nested plugs */ > struct list_head list; /* requests */ > struct list_head mq_list; /* blk-mq requests */ > struct list_head cb_list; /* md requires an unplug callback */ > @@ -1106,7 +1107,7 @@ struct blk_plug_cb { > }; > extern struct blk_plug_cb *blk_check_plugged(blk_plug_cb_fn unplug, > void *data, int size); > -extern void blk_start_plug(struct blk_plug *); > +extern struct blk_plug *blk_start_plug(struct blk_plug *); > extern void blk_finish_plug(struct blk_plug *); > extern void blk_flush_plug_list(struct blk_plug *, bool); > > diff --git a/mm/madvise.c b/mm/madvise.c > index d551475..1f7d5ad 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -463,7 +463,7 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) > int error = -EINVAL; > int write; > size_t len; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > #ifdef CONFIG_MEMORY_FAILURE > if (behavior == MADV_HWPOISON || behavior == MADV_SOFT_OFFLINE) > @@ -503,7 +503,7 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) > if (vma && start > vma->vm_start) > prev = vma; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > for (;;) { > /* Still start < end. */ > error = -ENOMEM; > @@ -539,7 +539,7 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior) > vma = find_vma(current->mm, start); > } > out: > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > if (write) > up_write(¤t->mm->mmap_sem); > else > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 644bcb6..9369d5e 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -2011,16 +2011,16 @@ static int __writepage(struct page *page, struct writeback_control *wbc, > int generic_writepages(struct address_space *mapping, > struct writeback_control *wbc) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > int ret; > > /* deal with chardevs and other special file */ > if (!mapping->a_ops->writepage) > return 0; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > ret = write_cache_pages(mapping, wbc, __writepage, mapping); > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > return ret; > } > > diff --git a/mm/readahead.c b/mm/readahead.c > index 9356758..c3350b5 100644 > --- a/mm/readahead.c > +++ b/mm/readahead.c > @@ -111,11 +111,11 @@ EXPORT_SYMBOL(read_cache_pages); > static int read_pages(struct address_space *mapping, struct file *filp, > struct list_head *pages, unsigned nr_pages) > { > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > unsigned page_idx; > int ret; > > - blk_start_plug(&plug); > + plug = blk_start_plug(&onstack_plug); > > if (mapping->a_ops->readpages) { > ret = mapping->a_ops->readpages(filp, mapping, pages, nr_pages); > @@ -136,7 +136,7 @@ static int read_pages(struct address_space *mapping, struct file *filp, > ret = 0; > > out: > - blk_finish_plug(&plug); > + blk_finish_plug(plug); > > return ret; > } > diff --git a/mm/swap_state.c b/mm/swap_state.c > index 405923f..3f6b8ec 100644 > --- a/mm/swap_state.c > +++ b/mm/swap_state.c > @@ -455,7 +455,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask, > unsigned long offset = entry_offset; > unsigned long start_offset, end_offset; > unsigned long mask; > - struct blk_plug plug; > + struct blk_plug *plug, onstack_plug; > > mask = swapin_nr_pages(offset) - 1; > if (!mask) > @@ -467,7 +467,7 @@ struct page *swapin_readahead(swp_entry_t 
entry, gfp_t gfp_mask,
>  	if (!start_offset)	/* First page is swap header. */
>  		start_offset++;
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	for (offset = start_offset; offset <= end_offset ; offset++) {
>  		/* Ok, do the async read-ahead now */
>  		page = read_swap_cache_async(swp_entry(swp_type(entry), offset),
> @@ -478,7 +478,7 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
>  			SetPageReadahead(page);
>  		page_cache_release(page);
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>
>  	lru_add_drain();	/* Push any new pages onto the LRU now */
>  skip:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 5e8eadd..fd29974 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2130,7 +2130,7 @@ static void shrink_lruvec(struct lruvec *lruvec, int swappiness,
>  	enum lru_list lru;
>  	unsigned long nr_reclaimed = 0;
>  	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
> -	struct blk_plug plug;
> +	struct blk_plug *plug, onstack_plug;
>  	bool scan_adjusted;
>
>  	get_scan_count(lruvec, swappiness, sc, nr, lru_pages);
> @@ -2152,7 +2152,7 @@ static void shrink_lruvec(struct lruvec *lruvec, int swappiness,
>  	scan_adjusted = (global_reclaim(sc) && !current_is_kswapd() &&
>  			 sc->priority == DEF_PRIORITY);
>
> -	blk_start_plug(&plug);
> +	plug = blk_start_plug(&onstack_plug);
>  	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
>  	       nr[LRU_INACTIVE_FILE]) {
>  		unsigned long nr_anon, nr_file, percentage;
> @@ -2222,7 +2222,7 @@ static void shrink_lruvec(struct lruvec *lruvec, int swappiness,
>
>  		scan_adjusted = true;
>  	}
> -	blk_finish_plug(&plug);
> +	blk_finish_plug(plug);
>  	sc->nr_reclaimed += nr_reclaimed;
>
>  	/*
> --
> 1.8.3.1

--
Ming Lei