Date: Fri, 9 Sep 2022 16:24:40 +0800
From: Ming Lei
To: Christoph Hellwig
Cc: Dusty Mabe, Jens Axboe, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
    ming.lei@redhat.com
Subject: Re: regression caused by block: freeze the queue earlier in del_gendisk
References: <017845ae-fbae-70f6-5f9e-29aff2742b8c@dustymabe.com> <20220907073324.GB23826@lst.de>
In-Reply-To: <20220907073324.GB23826@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 07, 2022 at 09:33:24AM +0200, Christoph Hellwig wrote:
> On Thu, Sep 01, 2022 at 03:06:08PM +0800, Ming Lei wrote:
> > It is a bit hard to associate the above commit with reported issue.
>
> So the messages clearly are about something trying to open a device
> that went away at the block layer, but somehow does not get removed
> in time by udev (which seems to be a userspace bug in CoreOS).  But
> even with that we really should not hang.

Xiao Ni provided a script[1] which can more or less reproduce the issue.

- create raid
#./imsm.sh imsm /dev/md/test 1 /dev/sda /dev/sdb
#ls /dev/md/
[root@ktest-36 md]# ls -l /dev/md/
total 0
lrwxrwxrwx. 1 root root 8 Sep  9 08:10 imsm -> ../md127
lrwxrwxrwx. 1 root root 8 Sep  9 08:10 test -> ../md126

- destroy the two raid devices
# mdadm --stop /dev/md/test /dev/md/imsm
mdadm: stopped /dev/md/test
mdadm: stopped /dev/md/imsm

# lsblk
...
md126     9:126  0    0B  0 md
md127     9:127  0    0B  0 md

md126 is actually added back after it is deleted, accompanied by the log
message "block device autoloading is deprecated and will be removed.",
and the bcc stack trace shows that the device is added by mdadm:

08:20:03 456    456    kworker/6:2     del_gendisk disk b'md126'
        b'del_gendisk+0x1 [kernel]'
        b'md_kobj_release+0x34 [kernel]'
        b'kobject_put+0x87 [kernel]'
        b'process_one_work+0x1c4 [kernel]'
        b'worker_thread+0x4d [kernel]'
        b'kthread+0xe6 [kernel]'
        b'ret_from_fork+0x1f [kernel]'

08:20:03 2476   2476   mdadm           device_add_disk disk b'md126'
        b'device_add_disk+0x1 [kernel]'
        b'md_alloc+0x3ba [kernel]'
        b'md_probe+0x25 [kernel]'
        b'blk_request_module+0x5f [kernel]'
        b'blkdev_get_no_open+0x5c [kernel]'
        b'blkdev_get_by_dev.part.0+0x1e [kernel]'
        b'blkdev_open+0x52 [kernel]'
        b'do_dentry_open+0x1ce [kernel]'
        b'path_openat+0xc43 [kernel]'
        b'do_filp_open+0xa1 [kernel]'
        b'do_sys_openat2+0x7c [kernel]'
        b'__x64_sys_openat+0x5c [kernel]'
        b'do_syscall_64+0x37 [kernel]'
        b'entry_SYSCALL_64_after_hwframe+0x63 [kernel]'

Also, removal of the md device is deferred by scheduling work on a
workqueue, and the device is actually deleted in mddev's release handler:

        mddev_delayed_delete():
                kobject_put(&mddev->kobj)
        ...
        md_kobj_release():
                del_gendisk(mddev->gendisk);

>
> Now that fact that it did hang before and this now becomes reproducible
> also makes me assume the change is not the root cause.  It might still
> be a good vehicle to fix the issue for real, but it really broadens
> the scope.
>

[1] create one imsm raid1

./imsm.sh imsm /dev/md/test 1 /dev/sda /dev/sdb

#!/bin/bash
export IMSM_NO_PLATFORM=1
export IMSM_DEVNAME_AS_SERIAL=1

echo ""
echo "==========================================================="
echo "./test.sh container raid devlist level devnum"
echo "example: ./test.sh imsm /dev/md/test 1 /dev/loop0 /dev/loop1"
echo "==========================================================="
echo ""

container=$1
raid=$2
level=$3

shift 3
dev_num=$#
dev_list=$@

mdadm -CR $container -e imsm -n $dev_num $dev_list
mdadm -CR $raid -l $level -n $dev_num $dev_list

[2] destroy created raid devices

mdadm --stop /dev/md/test /dev/md/imsm


Thanks,
Ming
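
A minimal sketch, not part of the original report, for spotting the symptom
shown in the lsblk output earlier in this mail: md nodes that are still
listed with a size of 0B after "mdadm --stop". The function name
stale_md_devices is hypothetical, and the input is assumed to use the
default lsblk column layout (NAME MAJ:MIN RM SIZE RO TYPE) seen above.

```shell
#!/bin/bash
# Hypothetical helper: read lsblk-style lines on stdin and print the names
# of devices whose TYPE column is "md" and whose SIZE column is "0B".
# Assumes the default column order: NAME MAJ:MIN RM SIZE RO TYPE ...
stale_md_devices() {
    awk '$6 == "md" && $4 == "0B" { print $1 }'
}

# Assumed usage on a test machine, after running [2]:
#   lsblk -l | stale_md_devices
```

With the two lsblk lines quoted above as input, this prints md126 and md127.
An empty result after [2] would suggest the devices were not added back.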