Subject: Re: [PATCH v2] bcache: fix panic due to cache_set is null
To: Yi Li
Cc: yilikernel@gmail.com, kent.overstreet@gmail.com, linux-bcache@vger.kernel.org, linux-kernel@vger.kernel.org, Guo Chao
References: <20201203094711.3236551-1-yili@winhong.com>
From: Coly Li
Date: Thu, 3 Dec 2020 19:27:45 +0800
In-Reply-To: <20201203094711.3236551-1-yili@winhong.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/3/20 5:47 PM, Yi Li wrote:
> bcache_device_detach will release the cache_set after hotunplug cache
> disk.
>
> Here is how the issue happens.
> 1) cached_dev_free do cancel_writeback_rate_update_dwork
>    without bch_register_lock.
> 2) Writing the writeback_percent by sysfs with
>    bch_register_lock will insert a writeback_rate_update work.
> 3) cached_dev_free with bch_register_lock to do bcache_device_free.
>    dc->disk.cl will be set NULL
> 4) update_writeback_rate will crash when access dc->disk.cl

The analysis makes sense, good catch! Thank you for helping me
understand the problem.

>
> Fixes: 80265d8dfd77 ("bcache: acquire bch_register_lock later in cached_dev_free()")
>
> IP: [] update_writeback_rate+0x59/0x3a0 [bcache]
> PGD 879620067 PUD 8755d3067 PMD 0
> Oops: 0000 [#1] SMP
> CPU: 8 PID: 1005702 Comm: kworker/8:0 Tainted: G 4.4.0+10 #1
> Hardware name: Intel BIOS SE5C610.86B.01.01.0021.032120170601 03/21/2017
> Workqueue: events update_writeback_rate [bcache]
> task: ffff8808786f3800 ti: ffff88077082c000 task.ti: ffff88077082c000
> RIP: e030:[] update_writeback_rate+0x59/0x3a0 [bcache]
> RSP: e02b:ffff88077082fde0 EFLAGS: 00010202
> RAX: 0000000000000018 RBX: ffff8808047f0b08 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffff88088170dab8 RDI: ffff88088170dab8
> RBP: ffff88077082fe18 R08: 000000000000000a R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000017bc8 R12: 0000000000000000
> R13: ffff8808047f0000 R14: 0000000000000200 R15: ffff8808047f0b08
> FS: 00007f157b6d6700(0000) GS:ffff880881700000(0000) knlGS:0000000000000000
> CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000368 CR3: 0000000875c05000 CR4: 0000000000040660
> Stack:
> 0000000000000001 0000000000007ff0 ffff88085ff600c0 ffff880881714e80
> ffff880881719500 0000000000000200 ffff8808047f0b08 ffff88077082fe60
> ffffffff81088c0c 0000000081714e80 0000000000000000 ffff880881714e80
> Call Trace:
> [] process_one_work+0x1fc/0x3b0
> [] worker_thread+0x2a5/0x470
> [] ? __schedule+0x648/0x870
> [] ? rescuer_thread+0x300/0x300
> [] kthread+0xd5/0xe0
> [] ? kthread_stop+0x110/0x110
> [] ret_from_fork+0x3f/0x70
> [] ? kthread_stop+0x110/0x110
>
> Reported-by: Guo Chao
> Signed-off-by: Yi Li
> ---
>  drivers/md/bcache/super.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 46a00134a36a..8b341f756ac0 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -1334,9 +1334,6 @@ static void cached_dev_free(struct closure *cl)
>  {
>  	struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
>  
> -	if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
> -		cancel_writeback_rate_update_dwork(dc);
> -
>  	if (!IS_ERR_OR_NULL(dc->writeback_thread))
>  		kthread_stop(dc->writeback_thread);
>  	if (!IS_ERR_OR_NULL(dc->status_update_thread))
> @@ -1344,6 +1341,9 @@ static void cached_dev_free(struct closure *cl)
>  
>  	mutex_lock(&bch_register_lock);
>  
> +	if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
> +		cancel_writeback_rate_update_dwork(dc);
> +
>  	if (atomic_read(&dc->running))
>  		bd_unlink_disk_holder(dc->bdev, dc->disk.disk);
>  	bcache_device_free(&dc->disk);
>

This change is problematic: the writeback rate kworker must be stopped
before the writeback and status_update threads, otherwise you may
encounter other problems.

While reviewing your patch I also found another similar potential
problem. This is tricky, let me think about how to fix it ...

Thank you again for catching this issue.

Coly Li
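
The four steps in the quoted commit message can be modeled in userspace as a deterministic, single-threaded sketch. Everything below (struct toy_dev, the toy_* helpers, the work_queued flag) is hypothetical and only mimics the ordering described above; it is not bcache code, and it assumes that a cancel removes the pending work synchronously. It only illustrates the re-queue window, and says nothing about the thread-stop ordering concern raised in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the kernel structure. */
struct toy_dev {
	bool wb_running;      /* models the BCACHE_DEV_WB_RUNNING flag */
	bool work_queued;     /* models a pending writeback_rate_update work */
	const int *cache_set; /* models the pointer cleared on device free */
};

/* Models the cancel call: test-and-clear the flag, then drop the work. */
static void toy_cancel_rate_update(struct toy_dev *d)
{
	assert(d != NULL);
	if (d->wb_running) {
		d->wb_running = false;
		d->work_queued = false;
	}
}

/* Models step 2: a sysfs write that queues the work if none is running. */
static void toy_sysfs_store_percent(struct toy_dev *d)
{
	assert(d != NULL);
	if (d->cache_set != NULL && !d->wb_running) {
		d->wb_running = true;
		d->work_queued = true;
	}
}

/* Models step 3: freeing the device clears the cache set pointer. */
static void toy_device_free(struct toy_dev *d)
{
	assert(d != NULL);
	d->cache_set = NULL;
}

/* Models step 4: the work would dereference a NULL pointer if it runs now. */
static bool toy_work_hits_null(const struct toy_dev *d)
{
	assert(d != NULL);
	return d->work_queued && d->cache_set == NULL;
}

/*
 * Replays the four steps in order. With cancel_under_lock == false the
 * cancel happens first (the ordering described as buggy); with true it
 * happens after the sysfs write, i.e. where the patch moves it.
 * Returns true if the queued work would see a NULL cache set.
 */
bool toy_race(bool cancel_under_lock)
{
	static const int fake_cache_set = 0;
	struct toy_dev d = {
		.wb_running = true,
		.work_queued = true,
		.cache_set = &fake_cache_set,
	};

	if (!cancel_under_lock)
		toy_cancel_rate_update(&d); /* step 1, before the lock */
	toy_sysfs_store_percent(&d);        /* step 2, re-queue attempt */
	if (cancel_under_lock)
		toy_cancel_rate_update(&d); /* patched position, under the lock */
	toy_device_free(&d);                /* step 3 */
	return toy_work_hits_null(&d);      /* step 4 */
}
```

In this model, toy_race(false) reports that the re-queued work would run against a NULL pointer, while toy_race(true) does not, because the cancel now happens after the last point at which the work can be queued.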