Subject: Re: 4.14: WARNING: CPU: 4 PID: 2895 at block/blk-mq.c:1144 with virtio-blk (also 4.12 stable)
To: Christian Borntraeger, Bart Van Assche,
 "virtualization@lists.linux-foundation.org",
 "linux-block@vger.kernel.org", "mst@redhat.com",
 "jasowang@redhat.com", "linux-kernel@vger.kernel.org",
 Christoph Hellwig
References: <9c5eec5d-f542-4d76-6933-6fe31203ce09@de.ibm.com>
 <1511205644.2396.32.camel@wdc.com>
 <04526c98-ffc5-1eca-3aa8-50f9212c4323@de.ibm.com>
 <5c9f2228-0a8b-8225-7038-e6cb3f31ca0b@kernel.dk>
 <2e44dbd3-2f90-c267-560c-91d1d4b0e892@de.ibm.com>
 <823b9dd5-7781-5a72-03ff-bc931433fc19@kernel.dk>
 <15f232d2-2aaa-df7c-57e8-2f710e051e84@de.ibm.com>
 <055f040d-3f9a-a8fd-e8e2-326c6b9094a1@kernel.dk>
 <1aeecf2e-a68e-4c18-5912-2473f457e6ea@de.ibm.com>
 <8fedc2ad-d775-7789-742c-92ca928a3aee@kernel.dk>
From: Jens Axboe
Date: Tue, 21 Nov 2017 13:14:09 -0700
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/21/2017 01:12 PM, Christian Borntraeger wrote:
>
>
> On 11/21/2017 08:30 PM, Jens Axboe wrote:
>> On 11/21/2017 12:15 PM, Christian Borntraeger wrote:
>>>
>>>
>>> On 11/21/2017 07:39 PM, Jens Axboe wrote:
>>>> On 11/21/2017 11:27 AM, Jens Axboe wrote:
>>>>> On 11/21/2017 11:12 AM, Christian Borntraeger wrote:
>>>>>>
>>>>>>
>>>>>> On 11/21/2017 07:09 PM, Jens Axboe wrote:
>>>>>>> On 11/21/2017 10:27 AM, Jens Axboe wrote:
>>>>>>>> On 11/21/2017 03:14 AM, Christian Borntraeger wrote:
>>>>>>>>> Bisect points to
>>>>>>>>>
>>>>>>>>> 1b5a7455d345b223d3a4658a9e5fce985b7998c1 is the first bad commit
>>>>>>>>> commit 1b5a7455d345b223d3a4658a9e5fce985b7998c1
>>>>>>>>> Author: Christoph Hellwig
>>>>>>>>> Date: Mon Jun 26 12:20:57 2017 +0200
>>>>>>>>>
>>>>>>>>>     blk-mq: Create hctx for each present CPU
>>>>>>>>>
>>>>>>>>>     commit 4b855ad37194f7bdbb200ce7a1c7051fecb56a08 upstream.
>>>>>>>>>
>>>>>>>>>     Currently we only create hctx for online CPUs, which can lead to a lot
>>>>>>>>>     of churn due to frequent soft offline / online operations. Instead
>>>>>>>>>     allocate one for each present CPU to avoid this and dramatically simplify
>>>>>>>>>     the code.
>>>>>>>>>
>>>>>>>>>     Signed-off-by: Christoph Hellwig
>>>>>>>>>     Reviewed-by: Jens Axboe
>>>>>>>>>     Cc: Keith Busch
>>>>>>>>>     Cc: linux-block@vger.kernel.org
>>>>>>>>>     Cc: linux-nvme@lists.infradead.org
>>>>>>>>>     Link: http://lkml.kernel.org/r/20170626102058.10200-3-hch@lst.de
>>>>>>>>>     Signed-off-by: Thomas Gleixner
>>>>>>>>>     Cc: Oleksandr Natalenko
>>>>>>>>>     Cc: Mike Galbraith
>>>>>>>>>     Signed-off-by: Greg Kroah-Hartman
>>>>>>>>
>>>>>>>> I wonder if we're simply not getting the masks updated correctly. I'll
>>>>>>>> take a look.
>>>>>>>
>>>>>>> Can't make it trigger here. We do init for each present CPU, which means
>>>>>>> that if I offline a few CPUs here and register a queue, those still show
>>>>>>> up as present (just offline) and get mapped accordingly.
>>>>>>>
>>>>>>> From the looks of it, your setup is different. If the CPU doesn't show
>>>>>>> up as present and it gets hotplugged, then I can see how this condition
>>>>>>> would trigger. What environment are you running this in? We might have
>>>>>>> to re-introduce the cpu hotplug notifier, right now we just monitor
>>>>>>> for a dead cpu and handle that.
>>>>>>
>>>>>> I am not doing a hot unplug and the replug, I use KVM and add a previously
>>>>>> not available CPU.
>>>>>>
>>>>>> in libvirt/virsh speak:
>>>>>> 4
>>>>>
>>>>> So that's why we run into problems. It's not present when we load the device,
>>>>> but becomes present and online afterwards.
>>>>>
>>>>> Christoph, we used to handle this just fine, your patch broke it.
>>>>>
>>>>> I'll see if I can come up with an appropriate fix.
>>>>
>>>> Can you try the below?
>>>
>>>
>>> It does prevent the crash but it seems that the new CPU is not "used " after the hotplug for mq:
>>>
>>>
>>> output with 2 cpus:
>>> /sys/kernel/debug/block/vda
>>> /sys/kernel/debug/block/vda/hctx0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/completed
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/merged
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/cpu0/rq_list
>>> /sys/kernel/debug/block/vda/hctx0/active
>>> /sys/kernel/debug/block/vda/hctx0/run
>>> /sys/kernel/debug/block/vda/hctx0/queued
>>> /sys/kernel/debug/block/vda/hctx0/dispatched
>>> /sys/kernel/debug/block/vda/hctx0/io_poll
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/sched_tags
>>> /sys/kernel/debug/block/vda/hctx0/tags_bitmap
>>> /sys/kernel/debug/block/vda/hctx0/tags
>>> /sys/kernel/debug/block/vda/hctx0/ctx_map
>>> /sys/kernel/debug/block/vda/hctx0/busy
>>> /sys/kernel/debug/block/vda/hctx0/dispatch
>>> /sys/kernel/debug/block/vda/hctx0/flags
>>> /sys/kernel/debug/block/vda/hctx0/state
>>> /sys/kernel/debug/block/vda/sched
>>> /sys/kernel/debug/block/vda/sched/dispatch
>>> /sys/kernel/debug/block/vda/sched/starved
>>> /sys/kernel/debug/block/vda/sched/batching
>>> /sys/kernel/debug/block/vda/sched/write_next_rq
>>> /sys/kernel/debug/block/vda/sched/write_fifo_list
>>> /sys/kernel/debug/block/vda/sched/read_next_rq
>>> /sys/kernel/debug/block/vda/sched/read_fifo_list
>>> /sys/kernel/debug/block/vda/write_hints
>>> /sys/kernel/debug/block/vda/state
>>> /sys/kernel/debug/block/vda/requeue_list
>>> /sys/kernel/debug/block/vda/poll_stat
>>
>> Try this, basically just a revert.
>
> Yes, seems to work.
>
> Tested-by: Christian Borntraeger

Great, thanks for testing.

> Do you know why the original commit made it into 4.12 stable? After all
> it has no Fixes tag and no cc stable-

I was wondering the same thing when you said it was in 4.12.stable and
not in 4.12 release. That patch should absolutely not have gone into
stable, it's not marked as such and it's not fixing a problem that is
stable worthy. In fact, it's causing a regression...

Greg? Upstream commit is mentioned higher up, start of the email.

-- 
Jens Axboe
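
For anyone following the thread, the failure mode described above (the
CPU-to-queue mapping is set up for the CPUs present when the device is
loaded, and a CPU that only becomes present afterwards is never mapped)
can be illustrated with a toy user-space model. This is only a sketch
under stated assumptions, not blk-mq code: ToyQueue,
build_cpu_to_hctx_map, the round-robin policy and the remap() step are
all hypothetical names and choices made for illustration.

```python
# Toy user-space model of the failure discussed in this thread.
# Nothing here is blk-mq code; every name is hypothetical.


def build_cpu_to_hctx_map(cpus, nr_hw_queues):
    """Assign each CPU in `cpus` a hardware-queue index, round-robin."""
    return {cpu: i % nr_hw_queues for i, cpu in enumerate(sorted(cpus))}


class ToyQueue:
    def __init__(self, present_cpus, nr_hw_queues):
        self.nr_hw_queues = nr_hw_queues
        # The map is built once, from the CPUs present at registration.
        self.cpu_to_hctx = build_cpu_to_hctx_map(present_cpus, nr_hw_queues)

    def hctx_for(self, cpu):
        """Return the queue index for `cpu`, or None if it was never mapped."""
        return self.cpu_to_hctx.get(cpu)

    def remap(self, cpus):
        """Rebuild the map, as a hotplug callback might (an assumption)."""
        self.cpu_to_hctx = build_cpu_to_hctx_map(cpus, self.nr_hw_queues)


if __name__ == "__main__":
    q = ToyQueue(present_cpus={0, 1}, nr_hw_queues=2)
    print(q.hctx_for(2))  # CPU 2 only became present after registration
    q.remap({0, 1, 2})    # stand-in for reacting to the hot-add
    print(q.hctx_for(2))
```

In this model hctx_for(2) returns None until remap() is called, which
loosely mirrors a CPU added after the queue was registered having no
mapping until something reacts to the hotplug event.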