Subject: Re: [PATCH 3/4] blk-mq: establish new mapping before cpu starts handling requests
From: Akinobu Mita
To: Ming Lei
Cc: Linux Kernel Mailing List, Jens Axboe
Date: Sat, 27 Jun 2015 02:14:17 +0900
Message-ID: <1435338857.30650.6.camel@mita-ThinkPad-T540p>
References: <1434894751-6877-1-git-send-email-akinobu.mita@gmail.com>
	 <1434894751-6877-4-git-send-email-akinobu.mita@gmail.com>
	 <20150625234030.4dc99725@tom-T450>

Akinobu Mita wrote:
> 2015-06-26 0:40 GMT+09:00 Ming Lei:
> > On Thu, 25 Jun 2015 21:49:43 +0900
> > Akinobu Mita wrote:
> >> For example, there is a single hw queue (hctx) and two CPU queues
> >> (ctx0 for CPU0, and ctx1 for CPU1).  Now CPU1 has just been onlined,
> >> and a request is inserted into ctx1->rq_list; bit0 is set in the
> >> pending bitmap because ctx1->index_hw is still zero.
> >>
> >> Then, while running the hw queue, flush_busy_ctxs() finds bit0 set
> >> in the pending bitmap and tries to retrieve the requests in
> >> hctx->ctxs[0].rq_list.  But hctx->ctxs[0] is ctx0, so the request in
> >> ctx1->rq_list is ignored.
> >
> > Per the current design, the request should have been inserted into
> > ctx0 instead of ctx1, because ctx1 isn't mapped yet even though
> > ctx1->cpu has become ONLINE.
> >
> > So how about the following patch?  It looks much simpler.
>
> OK, I'll try this patch to see if the problem disappears.

This doesn't fix the problem, because:

> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index f537796..2f45b73 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1034,7 +1034,12 @@ void blk_mq_insert_request(struct request *rq, bool at_head, bool run_queue,
> >  	struct blk_mq_ctx *ctx = rq->mq_ctx, *current_ctx;
> >
> >  	current_ctx = blk_mq_get_ctx(q);
> > -	if (!cpu_online(ctx->cpu))
> > +	/*
> > +	 * ctx->cpu may become ONLINE but ctx hasn't been mapped to
> > +	 * hctx yet because there is a tiny race window between
> > +	 * ctx->cpu ONLINE and doing the remap
> > +	 */
> > +	if (!blk_mq_ctx_mapped(ctx))
> >  		rq->mq_ctx = ctx = current_ctx;

A process running on the just-onlined CPU1 in the above example can
satisfy this condition, and current_ctx will then be ctx1.  So the same
scenario can still happen (the request is ignored by flush_busy_ctxs());
the toy model below illustrates the race.
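To make the mismatch easier to see, here is a minimal userspace sketch
of the scenario (illustration only, not kernel code: the struct fields
and the two-entry ctxs[] array are simplified stand-ins for the real
blk-mq structures):

#include <stdio.h>

/* toy stand-ins for struct blk_mq_ctx / struct blk_mq_hw_ctx */
struct ctx {
	int index_hw;	/* which bit this ctx sets in hctx->pending */
	int nr_rq;	/* requests sitting in ctx->rq_list */
};

struct hctx {
	unsigned long pending;	/* pending bitmap */
	struct ctx *ctxs[2];	/* indexed by ctx->index_hw */
};

int main(void)
{
	struct ctx ctx0 = { .index_hw = 0 };
	/* CPU1 was just onlined; blk_mq_map_swqueue() hasn't run yet,
	 * so ctx1->index_hw is still 0 -- the same slot as ctx0 */
	struct ctx ctx1 = { .index_hw = 0 };
	struct hctx h = { .ctxs = { &ctx0, NULL } };

	/* insert on CPU1: request queued in ctx1, bit0 set */
	ctx1.nr_rq++;
	h.pending |= 1UL << ctx1.index_hw;

	/* flush_busy_ctxs(): for each set bit, drain hctx->ctxs[bit] */
	for (int bit = 0; bit < 2; bit++) {
		if (!(h.pending & (1UL << bit)))
			continue;
		/* bit0 leads to ctx0, not ctx1 -- the request is missed */
		printf("bit%d -> ctx%d, drained %d request(s)\n",
		       bit, bit, h.ctxs[bit]->nr_rq);
	}
	printf("stranded in ctx1->rq_list: %d\n", ctx1.nr_rq);
	return 0;
}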
I found a simple alternative solution that assigns offline CPUs a
unique ctx->index_hw:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 594eea0..a8fcfbf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1787,10 +1787,11 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 	 */
 	queue_for_each_ctx(q, ctx, i) {
 		/* If the cpu isn't online, the cpu is mapped to first hctx */
-		if (!cpu_online(i))
-			continue;
+		if (!cpu_online(i) && cpu_possible(i))
+			hctx = q->mq_ops->map_queue(q, 0);
+		else
+			hctx = q->mq_ops->map_queue(q, i);
 
-		hctx = q->mq_ops->map_queue(q, i);
 		cpumask_set_cpu(i, hctx->cpumask);
 		ctx->index_hw = hctx->nr_ctx;
 		hctx->ctxs[hctx->nr_ctx++] = ctx;
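For what it's worth, rerunning the toy model above with that assignment
applied shows why it helps: once ctx1 owns a unique index_hw, the bit
set at insert time and the slot scanned by flush_busy_ctxs() agree
again (again a sketch, not the real kernel code path):

#include <stdio.h>

struct ctx { int index_hw; int nr_rq; };

int main(void)
{
	/* with the patch, the offline/just-onlined CPU's ctx is mapped
	 * to the first hctx and gets its own slot right away */
	struct ctx ctx0 = { .index_hw = 0 };
	struct ctx ctx1 = { .index_hw = 1 };
	struct ctx *ctxs[2] = { &ctx0, &ctx1 };
	unsigned long pending = 0;

	ctx1.nr_rq++;				/* insert on CPU1 */
	pending |= 1UL << ctx1.index_hw;	/* sets bit1, not bit0 */

	/* flush_busy_ctxs(): bit1 -> ctxs[1] == ctx1, request found */
	for (int bit = 0; bit < 2; bit++)
		if (pending & (1UL << bit))
			printf("bit%d -> drained %d request(s)\n",
			       bit, ctxs[bit]->nr_rq);
	return 0;
}

The key point is that every possible CPU's ctx gets a valid, unique
index_hw at map time, so a request inserted during the hotplug window
is never invisible to the flush path.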