Received: by 2002:ab2:68c1:0:b0:1fd:9a81:d0e4 with SMTP id e1csp87889lqp; Sat, 8 Jun 2024 07:16:35 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCXoByC5w1OOvs7GYyJUVmMw8d9ZHMxqpS0k14han3hAC81VKrd4bhaSqcCwwU6pxRBV3dDsObhm6DWmnZ2HthgyNzxMHTwJ3rUzRU2Uqw== X-Google-Smtp-Source: AGHT+IFSd844sRPtleXlszA1pMxrLMFXc6oSZnQBfX+xwmjtFea8R3z+WQ9snU/hTsfSX4j5DfDp X-Received: by 2002:a05:622a:20d:b0:43f:ed86:45c2 with SMTP id d75a77b69052e-44041c2b0ebmr73140241cf.56.1717856195221; Sat, 08 Jun 2024 07:16:35 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1717856195; cv=pass; d=google.com; s=arc-20160816; b=YbYXY7IP0EdvKHkxO4jnUjjmAYaUhkVYpZd3XQA9NpmTqtMFbP68P22pptZPHB02rv MhCOnbuZJRoAnNHpXQxDkBSLpW7v+c8ZePUaX9BgDXOfu4vXGKaqP+6a61Cob4ZEFijM dOhvFspcyhjyK93KSRVusjk1/f2klYrtDNa88NNRYhBRzR3N91pFO5+uURa2yqOYIKgZ yFA0Uds+U0QIJ4RV6o5CmVVPcDJxOLXFoWMBJWzIlRpTF+atDltnwHxsN1f18rRZzh93 Vh2HOwAs/Z64xaxKtazvIPV3BgoOHmWQ6xuTRylCHvgggy8vmpxVxfooUWi59H2y2Gb0 J/7g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:list-unsubscribe:list-subscribe:list-id:precedence :references:message-id:subject:cc:to:from:date:dkim-signature; bh=e4i3eRJmBmN+iQA9CJ+0NHv/UewJVWHJgwRHQO6ZQpM=; fh=MV+5krNP7R3t57wFhesGrGH9wn/42HMg5NQ2zdPGZWY=; b=dYDBqoXeNGjzVSVBQa2yITzgj9XNWr+iKJkJPk4iW8glZy3Lu3T5cAoWZhLgN+sa8H hKiDjrpMS7v/VIrrFhhj1JbpQm8aj6EO+kmmxR1gIeN/HnU2hRjLXk8leINl9zOxP2n2 niyyzfgWu4SKE+vbNlhOET8dBC7t4twTw2ppheyWLJSNqokI+uopQwb4n1ySfPdKYs4q GGlcmzap59AqQRTmAx1ITeH04o4mKda9dIwmtgWE74F8bpjejJtQWlTdv2Kflr5qbU4I 6kwJSJSEUz6pX99CJ9pZAV4HdXyHz83ErJHQc8VBkH40KLvtX7qsGEi5X43SzlgBB15c 6qOg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bGMlaSGz; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-207056-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-207056-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id d75a77b69052e-44038b39bf6si63837781cf.431.2024.06.08.07.16.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 08 Jun 2024 07:16:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-207056-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bGMlaSGz; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-207056-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-207056-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id D9E721C20D38 for ; Sat, 8 Jun 2024 14:16:34 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 36336179967; Sat, 8 Jun 2024 14:16:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="bGMlaSGz" Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 755A827457 for ; Sat, 8 Jun 2024 14:16:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717856188; cv=none; b=Jh0glPiydLoo9a/4DWIh6HGV2CR1yjpZBDb6NK8U0KWZFCvb9CtjOlPRXqvJK7ks64QYsc0OFK44Fr/BPU78FABWRYXiJU5B/2ZitLY8albpXVEFy1QmclMfQaO1WR+AhD4v4NWko82ebLpLRusyHU67o4X5VsucOajwgLRbdiU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717856188; c=relaxed/simple; bh=y1/pdGFNQ4bmBupNSPXczZk3gfpVzHE5/xtJhhxuZEM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=bwOivWlKuPbvVanKgjGRBvMLWXJFvoUqdioXEY/Kv3itCLeTvi6mUaBn/mNkK8Rruh8moEEe9WaIJD7I7JI5Jer3gyzyvLkvhjFD3Gy1MDa9m4SL3eFF8FEAiSjuPCB7kneaG0XQ8zdx5zYZ65lmH8FB/+n96t6MzsXX3cZg3mc= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=bGMlaSGz; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1717856185; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e4i3eRJmBmN+iQA9CJ+0NHv/UewJVWHJgwRHQO6ZQpM=; b=bGMlaSGz6APPr9zNP1Yot6WMUL439akzptfCVcUxIxPzZeBGOutPqcwcvWsPzlCVVZ5X0h GBolX0+kV72R7SpjeqfE/7Yk+Y+whmZHF/4S+DvEbE9IjcJj0VoOM9g8Cn1rHQRTVrVh4H zSxkvLGTfiQvFdC6Rc25y3P5/CoeVb8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-531-ESfm0zbmO4CqmPNoso-9zA-1; Sat, 08 Jun 2024 10:16:21 -0400 X-MC-Unique: ESfm0zbmO4CqmPNoso-9zA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 41D21185A780; Sat, 8 Jun 2024 14:16:21 +0000 (UTC) Received: from fedora (unknown [10.72.112.37]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 117253C23; Sat, 8 Jun 2024 14:16:14 +0000 (UTC) Date: Sat, 8 Jun 2024 22:16:10 +0800 From: Ming Lei To: Li Nan Cc: Changhui Zhong , axboe@kernel.dk, ZiyangZhang@linux.alibaba.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yi.zhang@huawei.com, houtao1@huawei.com, yangerkun@huawei.com, ming.lei@redhat.com Subject: Re: [PATCH] ublk_drv: fix NULL pointer dereference in ublk_ctrl_start_recovery() Message-ID: References: <225f4c8e-0e2c-8f4b-f87d-69f4677af572@huaweicloud.com> <918f128b-f752-2d66-ca60-7d9c711ed928@huaweicloud.com> <2c46587e-0621-b21e-fbc1-fd69e87def03@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 On Sat, Jun 08, 2024 at 02:34:47PM +0800, Li Nan wrote: > > > 在 2024/6/6 17:52, Ming Lei 写道: > > On Thu, Jun 06, 2024 at 04:05:33PM +0800, Li Nan wrote: > > > > > > > > > 在 2024/6/6 12:48, Changhui Zhong 写道: > > > > > > [...] > > > > > > > > > > > > > Hi Changhui, > > > > > > > > > > The hang is actually expected because recovery fails. > > > > > > > > > > Please pull the latest ublksrv and check if the issue can still be > > > > > reproduced: > > > > > > > > > > https://github.com/ublk-org/ublksrv > > > > > > > > > > BTW, one ublksrv segfault and two test cleanup issues are fixed. > > > > > > > > > > Thanks, > > > > > Ming > > > > > > > > > > > > > Hi,Ming and Nan > > > > > > > > after applying the new patch and pulling the latest ublksrv, > > > > I ran the test for 4 hours and did not observe any task hang. > > > > the test results looks good! > > > > > > > > Thanks, > > > > Changhui > > > > > > > > > > > > . > > > > > > Thanks for you test! > > > > > > However, I got a NULL pointer dereference bug with ublksrv. It is not > > > > BTW, your patch isn't related with generic/004 which won't touch > > recovery code path. > > > > > introduced by this patch. It seems io was issued after deleting disk. And > > > it can be reproduced by: > > > > > > while true; do make test T=generic/004; done > > > > We didn't see that when running such test with linus tree, and usually > > Changhui run generic test for hours. > > > > > > > > [ 1524.286485] running generic/004 > > > [ 1529.110875] blk_print_req_error: 109 callbacks suppressed > > ... > > > [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000 > > > [ 1541.171734] #PF: supervisor write access in kernel mode > > > [ 1541.172271] #PF: error_code(0x0002) - not-present page > > > [ 1541.172798] PGD 0 P4D 0 > > > [ 1541.173065] Oops: Oops: 0002 [#1] PREEMPT SMP > > > [ 1541.173515] CPU: 0 PID: 43707 Comm: ublk Not tainted > > > 6.9.0-next-20240523-00004-g9bc7e95c7323 #454 > > > [ 1541.174417] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > > 1.16.1-2.fc37 04/01/2014 > > > [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300 > > > > This one looks one io_uring issue. > > > > Care to provide which line of source code points to by 'io_fallback_tw+0x252'? > > > > gdb> l *(io_fallback_tw+0x252) > > > (gdb) list * io_fallback_tw+0x252 > 0xffffffff81d79dc2 is in io_fallback_tw > (./arch/x86/include/asm/atomic64_64.h:25). > 20 __WRITE_ONCE(v->counter, i); > 21 } > 22 > 23 static __always_inline void arch_atomic64_add(s64 i, atomic64_t *v) > 24 { > 25 asm volatile(LOCK_PREFIX "addq %1,%0" > 26 : "=m" (v->counter) > 27 : "er" (i), "m" (v->counter) : "memory"); > 28 } > > The corresponding code is: > io_fallback_tw > percpu_ref_get(&last_ctx->refs); [ 1541.171010] BUG: kernel NULL pointer dereference, address: 0000000000000000 ... [ 1541.175311] RIP: 0010:io_fallback_tw+0x252/0x300 So looks the 'struct io_ring_ctx' instance is freed and ctx->refs.data becomes NULL when calling percpu_ref_get(&last_ctx->refs), not clear how it can happen since request grabs one ctx reference. Maybe you can try kasan and see if more useful info can be dumped. > > I have the vmcore of this issue. If you have any other needs, please let me > know. Not try remote vmcore debug yet, will think further if any useful hint is needed. > The space of the root path has been filled up by > ublksrv(tests/tmpublk_loop_data_xxx), which may the issue be related to this? It isn't supposed to be related, since the backing loop image is created by dd and it isn't sparse image, and the test runs fio over ublk block device only. Thanks, Ming