Subject: Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)
From: Jens Axboe
To: Christian Borntraeger, Bart Van Assche, virtualization@lists.linux-foundation.org, linux-block@vger.kernel.org, mst@redhat.com, jasowang@redhat.com, linux-kernel@vger.kernel.org, Christoph Hellwig
Date: Tue, 21 Nov 2017 11:39:38 -0700

On 11/21/2017 11:27 AM, Jens Axboe wrote:
> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>
>>
>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>>>> Bisect points to
>>>>>
>>>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>>>> Author: Christoph Hellwig
>>>>> Date:   Mon Jun 26 12:20:57 2017 +0200
>>>>>
>>>>>     blk-mq: Create hctx for each present CPU
>>>>>
>>>>>     commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>>>>
>>>>>     Currently we only create hctx for online CPUs, which can lead to a lot
>>>>>     of churn due to frequent soft offline / online operations.  Instead
>>>>>     allocate one for each present CPU to avoid this and dramatically simplify
>>>>>     the code.
>>>>>
>>>>>     Signed-off-by: Christoph Hellwig
>>>>>     Reviewed-by: Jens Axboe
>>>>>     Cc: Keith Busch
>>>>>     Cc: linux-block@vger.kernel.org
>>>>>     Cc: linux-nvme@lists.infradead.org
>>>>>     Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de
>>>>>     Signed-off-by: Thomas Gleixner
>>>>>     Cc: Oleksandr Natalenko
>>>>>     Cc: Mike Galbraith
>>>>>     Signed-off-by: Greg Kroah-Hartman
>>>>
>>>> I wonder if we're simply not getting the masks updated correctly. I'll
>>>> take a look.
>>>
>>> Can't make it trigger here. We do init for each present CPU, which means
>>> that if I offline a few CPUs here and register a queue, those still show
>>> up as present (just offline) and get mapped accordingly.
>>>
>>> From the looks of it, your setup is different. If the CPU doesn't show
>>> up as present and it gets hotplugged, then I can see how this condition
>>> would trigger. What environment are you running this in? We might have
>>> to re-introduce the cpu hotplug notifier, right now we just monitor
>>> for a dead cpu and handle that.
>>
>> I am not doing a hot unplug and the replug, I use KVM and add a previously
>> not available CPU.
>>
>> in libvirt/virsh speak:
>>   4
>
> So that's why we run into problems. It's not present when we load the device,
> but becomes present and online afterwards.
>
> Christoph, we used to handle this just fine, your patch broke it.
>
> I'll see if I can come up with an appropriate fix.

Can you try the below?


diff --git a/block/blk-mq.c b/block/blk-mq.c
index b600463791ec..ab3a66e7bd03 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -40,6 +40,7 @@
 static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
+static void blk_mq_map_swqueue(struct request_queue *q);
 
 static int blk_mq_poll_stats_bkt(const struct request *rq)
 {
@@ -1947,6 +1950,15 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 	return -ENOMEM;
 }
 
+static int blk_mq_hctx_notify_prepare(unsigned int cpu, struct hlist_node *node)
+{
+	struct blk_mq_hw_ctx *hctx;
+
+	hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp);
+	blk_mq_map_swqueue(hctx->queue);
+	return 0;
+}
+
 /*
  * 'cpu' is going away. splice any existing rq_list entries from this
  * software queue to the hw queue dispatch list, and ensure that it
@@ -1958,7 +1970,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node)
 	struct blk_mq_ctx *ctx;
 	LIST_HEAD(tmp);
 
-	hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp_dead);
+	hctx = hlist_entry_safe(node, struct blk_mq_hw_ctx, cpuhp);
 	ctx = __blk_mq_get_ctx(hctx->queue, cpu);
 
 	spin_lock(&ctx->lock);
@@ -1981,8 +1993,7 @@ static int blk_mq_hctx_notify_dead(unsigned int cpu, struct hlist_node *node)
 
 static void blk_mq_remove_cpuhp(struct blk_mq_hw_ctx *hctx)
 {
-	cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_DEAD,
-					    &hctx->cpuhp_dead);
+	cpuhp_state_remove_instance_nocalls(CPUHP_BLK_MQ_PREPARE, &hctx->cpuhp);
 }
 
 /* hctx->ctxs will be freed in queue's release handler */
@@ -2039,7 +2050,7 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	hctx->queue = q;
 	hctx->flags = set->flags & ~BLK_MQ_F_TAG_SHARED;
 
-	cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead);
+	cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_PREPARE, &hctx->cpuhp);
 
 	hctx->tags = set->tags[hctx_idx];
 
@@ -2974,7 +2987,8 @@ static int __init blk_mq_init(void)
 	BUILD_BUG_ON((REQ_ATOM_STARTED / BITS_PER_BYTE) !=
 			(REQ_ATOM_COMPLETE / BITS_PER_BYTE));
 
-	cpuhp_setup_state_multi(CPUHP_BLK_MQ_DEAD, "block/mq:dead", NULL,
+	cpuhp_setup_state_multi(CPUHP_BLK_MQ_PREPARE, "block/mq:prepare",
+				blk_mq_hctx_notify_prepare,
 				blk_mq_hctx_notify_dead);
 	return 0;
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 95c9a5c862e2..a6f03e9464fb 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -52,7 +52,7 @@ struct blk_mq_hw_ctx {
 
 	atomic_t		nr_active;
 
-	struct hlist_node	cpuhp_dead;
+	struct hlist_node	cpuhp;
 	struct kobject		kobj;
 
 	unsigned long		poll_considered;
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index ec32c4c5eb30..28b0fc9229c8 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -48,7 +48,7 @@ enum cpuhp_state {
 	CPUHP_BLOCK_SOFTIRQ_DEAD,
 	CPUHP_ACPI_CPUDRV_DEAD,
 	CPUHP_S390_PFAULT_DEAD,
-	CPUHP_BLK_MQ_DEAD,
+	CPUHP_BLK_MQ_PREPARE,
 	CPUHP_FS_BUFF_DEAD,
 	CPUHP_PRINTK_DEAD,
 	CPUHP_MM_MEMCQ_DEAD,

-- 
Jens Axboe