Date: Thu, 25 Jun 2015 21:49:43 +0900
Subject: Re: [PATCH 3/4] blk-mq: establish new mapping before cpu starts handling requests
From: Akinobu Mita
To: Ming Lei
Cc: Linux Kernel Mailing List, Jens Axboe
References: <1434894751-6877-1-git-send-email-akinobu.mita@gmail.com> <1434894751-6877-4-git-send-email-akinobu.mita@gmail.com>

2015-06-25 17:07 GMT+09:00 Ming Lei:
> On Thu, Jun 25, 2015 at 10:56 AM, Akinobu Mita wrote:
>> 2015-06-25 1:24 GMT+09:00 Ming Lei:
>>> On Wed, Jun 24, 2015 at 10:34 PM, Akinobu Mita wrote:
>>>> Hi Ming,
>>>>
>>>> 2015-06-24 18:46 GMT+09:00 Ming Lei:
>>>>> On Sun, Jun 21, 2015 at 9:52 PM, Akinobu Mita wrote:
>>>>>> ctx->index_hw is zero for the CPUs which have never been onlined since
>>>>>> the block queue was initialized. If one of those CPUs is hotadded and
>>>>>> starts handling request before new mappings are established, pending
>>>>>
>>>>> Could you explain a bit what the handling request is? The fact is that
>>>>> blk_mq_queue_reinit() is run after all queues are put into freezing.
>>>>
>>>> Notifier callbacks for CPU_ONLINE action can be run on the other CPU
>>>> than the CPU which was just onlined. So it is possible for the
>>>> process running on the just onlined CPU to insert request and run
>>>> hw queue before blk_mq_queue_reinit_notify() is actually called with
>>>> action=CPU_ONLINE.
>>>
>>> You are right because blk_mq_queue_reinit_notify() is always run after
>>> the CPU becomes UP, so there is a tiny window in which the CPU is up
>>> but the mapping is not yet updated. Per current design, the CPU just onlined
>>> is still mapped to hw queue 0 until the mapping is updated by
>>> blk_mq_queue_reinit_notify().
>>>
>>> But I am wondering why it is a problem and why you think flush_busy_ctxs
>>> can't find the requests on the software queue in this situation?
>>
>> The problem happens when the CPU has just been onlined first time
>> since the request queue was initialized. At this time ctx->index_hw
>> for the CPU is still zero before blk_mq_queue_reinit_notify is called.
>>
>> The request can be inserted to ctx->rq_list, but blk_mq_hctx_mark_pending()
>> marks busy for wrong bit position as ctx->index_hw is zero.
>
> It isn't wrong bit since the CPU onlined just is still mapped to hctx 0 at that
> time.

ctx->index_hw is not the CPU queue to hw queue mapping.
ctx->index_hw is the index in hctx->ctxs[] for this ctx.
Each ctx in a hw queue should have a unique ctx->index_hw.

This problem can be reproduced with a single hw queue.
(The script in the cover letter reproduces it with a single hw queue.)

>> flush_busy_ctxs() only retrieves the requests from software queues
>> which are marked busy. So the request just inserted is ignored as
>> the corresponding bit position is not busy.
>
> Before making the remap in blk_mq_queue_reinit() for the CPU topo change,
> the request queue will be put into freezing first and all requests
> inserted to hctx 0
> should be retrieved and scheduled out. So can the request be ignored by
> flush_busy_ctxs()?

For example, there is a single hw queue (hctx) and two CPU queues
(ctx0 for CPU0, and ctx1 for CPU1). Now CPU1 has just been onlined, and
a request is inserted into ctx1->rq_list. This sets bit0 in the pending
bitmap, because ctx1->index_hw is still zero.
Then, while running the hw queue, flush_busy_ctxs() finds that bit0 is
set in the pending bitmap and tries to retrieve requests from
hctx->ctxs[0].rq_list. But hctx->ctxs[0] is ctx0, so the request in
ctx1->rq_list is ignored.
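
The single hw queue example above can be modeled in user space. The
following is only a rough sketch, not the kernel code: struct ctx,
struct hctx, insert_request(), flush_busy() and simulate_hotadd() are
hypothetical stand-ins, rq_list is reduced to a counter, and the pending
bitmap is a single word. It assumes ctx1 is already reachable through
ctxs[1], which is a simplification of the real mapping.

```c
#include <assert.h>

#define MAX_CTX 8

/* Simplified software queue: rq_list is modeled as a plain counter. */
struct ctx {
	unsigned int index_hw;   /* slot of this ctx in hctx->ctxs[] */
	unsigned int nr_queued;  /* stand-in for ctx->rq_list */
};

/* Simplified hardware queue with a one-word pending bitmap. */
struct hctx {
	struct ctx *ctxs[MAX_CTX];
	unsigned int nr_ctx;
	unsigned long pending;
};

/* Queue one request and mark the ctx busy at bit ctx->index_hw. */
static void insert_request(struct hctx *h, struct ctx *c)
{
	assert(c->index_hw < MAX_CTX);
	c->nr_queued++;
	h->pending |= 1UL << c->index_hw;
}

/* Drain every ctx whose pending bit is set; return requests retrieved. */
static unsigned int flush_busy(struct hctx *h)
{
	unsigned int bit, total = 0;

	for (bit = 0; bit < h->nr_ctx; bit++) {
		if (!(h->pending & (1UL << bit)))
			continue;
		/* The bit position selects the ctx, not the CPU number. */
		total += h->ctxs[bit]->nr_queued;
		h->ctxs[bit]->nr_queued = 0;
		h->pending &= ~(1UL << bit);
	}
	return total;
}

/*
 * One hw queue, ctx0 for CPU0 and ctx1 for CPU1. When remapped is 0,
 * ctx1 still carries the stale index_hw of zero while CPU1 inserts a
 * request. Returns the number of requests the flush retrieved.
 */
static unsigned int simulate_hotadd(int remapped)
{
	struct ctx ctx0 = { .index_hw = 0, .nr_queued = 0 };
	struct ctx ctx1 = { .index_hw = remapped ? 1 : 0, .nr_queued = 0 };
	struct hctx h = { .ctxs = { &ctx0, &ctx1 }, .nr_ctx = 2, .pending = 0 };

	insert_request(&h, &ctx1);
	return flush_busy(&h);
}
```

In this model simulate_hotadd(0) retrieves nothing, because bit0 points
the flush at ctx0 while the request sits in ctx1. simulate_hotadd(1)
retrieves the one request, because ctx1 then has its own bit.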