From: Jianchao Wang
To: axboe@kernel.dk
Cc: hch@lst.de, jthumshirn@suse.de, hare@suse.de, josef@toxicpanda.com,
    bvanassche@acm.org, sagi@grimberg.me, keith.busch@intel.com,
    jsmart2021@gmail.com, linux-nvme@lists.infradead.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH V2 0/8]: blk-mq: use static_rqs to iterate busy tags
Date: Mon, 25 Mar 2019 13:28:01 +0800
Message-Id: <1553491689-1730-1-git-send-email-jianchao.w.wang@oracle.com>

Hi Jens

As we know, there is a risk of accessing stale requests when iterating
in-flight requests with tags->rqs[]. This has been discussed in the
following threads:
[1] https://marc.info/?l=linux-scsi&m=154511693912752&w=2
[2] https://marc.info/?l=linux-block&m=154526189023236&w=2

A typical scenario could be:

blk_mq_get_request          blk_mq_queue_tag_busy_iter
  -> blk_mq_get_tag
                              -> bt_for_each
                                -> bt_iter
                                  -> rq = tags->rqs[]
                                  -> rq->q
  -> blk_mq_rq_ctx_init
    -> data->hctx->tags->rqs[rq->tag] = rq;

The root cause is that there is a window between setting the bit in the
tag sbitmap and setting tags->rqs[]. During that window the iterator
sees the tag as busy but can still read an old pointer from tags->rqs[].

This patch set fixes the issue by iterating requests with
tags->static_rqs[] instead of tags->rqs[], which changes dynamically.
Moreover, we try to get a non-zero q_usage_counter before accessing
hctxs and tags. This avoids racing with updating nr_hw_queues, switching
the io scheduler and even queue cleanup, all of which are done under a
frozen and drained queue.

The 1st patch gets rid of the useless synchronize_rcu in
__blk_mq_update_nr_hw_queues.
The 2nd patch modifies blk_mq_queue_tag_busy_iter to use
tags->static_rqs[] instead of tags->rqs[] to iterate the busy tags.
The 3rd ~ 7th patches switch from blk_mq_tagset_busy_iter to
blk_mq_queue_tag_busy_iter, which is safer.
The 8th patch gets rid of blk_mq_tagset_busy_iter.

Change log
V1 -> V2:
 - Add a wrapper to hide the 'inflight' parameter from users, based on
   Sagi's suggestion.
 - Other misc changes to comments.

Jianchao Wang (8)
 blk-mq: get rid of the synchronize_rcu in
 blk-mq: use static_rqs instead of rqs to iterate tags
 blk-mq: use blk_mq_queue_tag_inflight_iter in debugfs
 mtip32xx: use blk_mq_queue_tag_inflight_iter
 nbd: use blk_mq_queue_tag_inflight_iter
 skd: use blk_mq_queue_tag_inflight_iter
 nvme: use blk_mq_queue_tag_inflight_iter
 blk-mq: remove blk_mq_tagset_busy_iter

diff stat
 block/blk-mq-debugfs.c            |   2 +-
 block/blk-mq-tag.c                | 193 ++++++++++++++------------------------
 block/blk-mq-tag.h                |   4 +-
 block/blk-mq.c                    |  31 ++----
 drivers/block/mtip32xx/mtip32xx.c |   6 +-
 drivers/block/nbd.c               |   2 +-
 drivers/block/skd_main.c          |   4 +-
 drivers/nvme/host/core.c          |  12 +++
 drivers/nvme/host/fc.c            |  10 +-
 drivers/nvme/host/nvme.h          |   2 +
 drivers/nvme/host/pci.c           |   5 +-
 drivers/nvme/host/rdma.c          |   4 +-
 drivers/nvme/host/tcp.c           |   5 +-
 drivers/nvme/target/loop.c        |   4 +-
 include/linux/blk-mq.h            |   7 +-
 15 files changed, 119 insertions(+), 172 deletions(-)

Thanks
Jianchao
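
For readers who want to see the window between setting the tag bit and setting tags->rqs[] in isolation, here is a minimal userspace sketch. It is not kernel code and not taken from the patches: the struct layouts, the `freed` flag, and the helpers `get_tag`, `rq_ctx_init`, `count_stale` and `setup` are all hypothetical simplifications. It only models the two allocation steps and the two lookup arrays.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_TAGS 4

/* Hypothetical, simplified stand-in for struct request. */
struct request {
	int tag;
	bool freed;	/* true if this object belongs to a released pool */
};

/* Hypothetical, simplified stand-in for struct blk_mq_tags. */
struct tags {
	bool busy[NR_TAGS];			/* models the tag sbitmap */
	struct request *rqs[NR_TAGS];		/* written late, in rq_ctx_init() */
	struct request *static_rqs[NR_TAGS];	/* fixed while the tag set lives */
};

/* Allocation step 1 (models blk_mq_get_tag): only the bit is set. */
static void get_tag(struct tags *t, int tag)
{
	t->busy[tag] = true;
}

/* Allocation step 2 (models blk_mq_rq_ctx_init): rqs[] is updated. */
static void rq_ctx_init(struct tags *t, int tag)
{
	struct request *rq = t->static_rqs[tag];

	rq->tag = tag;
	t->rqs[tag] = rq;
}

/* Count busy tags whose looked-up request points at a released object. */
static int count_stale(const struct tags *t, bool use_static_rqs)
{
	int stale = 0;

	for (int tag = 0; tag < NR_TAGS; tag++) {
		const struct request *rq;

		if (!t->busy[tag])
			continue;
		rq = use_static_rqs ? t->static_rqs[tag] : t->rqs[tag];
		if (rq && rq->freed)
			stale++;
	}
	return stale;
}

/* Initial state: rqs[1] still points at a request from a released pool. */
static void setup(struct tags *t, struct request *old_pool,
		  struct request *cur_pool)
{
	for (int i = 0; i < NR_TAGS; i++) {
		old_pool[i] = (struct request){ .tag = -1, .freed = true };
		cur_pool[i] = (struct request){ .tag = -1, .freed = false };
		t->busy[i] = false;
		t->rqs[i] = NULL;
		t->static_rqs[i] = &cur_pool[i];
	}
	t->rqs[1] = &old_pool[1];	/* leftover pointer from an earlier user */
}
```

Calling `setup()`, then `get_tag(&t, 1)`, and then `count_stale()` before `rq_ctx_init(&t, 1)` reproduces the window: the lookup through `rqs[]` reaches the released object, while the lookup through `static_rqs[]` does not. In this model a request found through `static_rqs[]` during the window is valid memory but not yet initialized, so a real iterator presumably still has to check the request's state before using it.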