From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Coly Li, Michael Lyle, Hannes Reinecke, Huijun Tang, Jens Axboe, Sasha Levin
Subject: [PATCH 4.16 147/272] bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
Date: Mon, 28 May 2018 12:03:00 +0200
Message-Id: <20180528100253.359068093@linuxfoundation.org>
X-Mailer: git-send-email
 2.17.0
In-Reply-To: <20180528100240.256525891@linuxfoundation.org>
References: <20180528100240.256525891@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

4.16-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Coly Li

[ Upstream commit fadd94e05c02afec7b70b0b14915624f1782f578 ]

In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
cached_dev_get() is called when creating dc->writeback_thread, and
cached_dev_put() is called when exiting dc->writeback_thread. This
modification works well unless the bcache device is detached manually by
    'echo 1 > /sys/block/bcache/bcache/detach'
because this sysfs interface only calls bch_cached_dev_detach(), which
wakes up dc->writeback_thread but does not stop it.

The reason is that, before patch "bcache: fix cached_dev->count usage for
bch_cache_set_error()", bch_writeback_thread() called cached_dev_put()
when the cache was no longer dirty after writeback, and
cached_dev_make_request() called cached_dev_get() when a new write request
turned the cache from clean to dirty. Since dc->count is no longer updated
in these locations, the refcount dc->count cannot be dropped after the
cache becomes clean, and cached_dev_detach_finish() won't be called to
detach the bcache device.

This patch fixes the issue by checking whether BCACHE_DEV_DETACHING is
set inside bch_writeback_thread(). If this bit is set and the cache is
clean (no existing writeback_keys), break the while-loop, call
cached_dev_put() and quit the writeback thread.

Please note that if the cache is still dirty, the writeback thread should
continue to perform writeback even when BCACHE_DEV_DETACHING is set; this
is the original design of manual detach.
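
The effect of reordering the sleep test can be checked outside the kernel.
The following is a minimal userspace sketch, not kernel code: the helper
names (should_sleep_old, should_sleep_new, count_disagreements) are
hypothetical, and plain booleans stand in for the BCACHE_DEV_DETACHING bit,
dc->has_dirty and dc->writeback_running. It only compares the boolean
expression removed by the diff with the one added by it.

```c
#include <assert.h>
#include <stdbool.h>

/* Sleep test before the patch (the expression removed by the diff). */
static bool should_sleep_old(bool detaching, bool has_dirty,
			     bool writeback_running)
{
	return !has_dirty || (!detaching && !writeback_running);
}

/* Sleep test after the patch (the expression added by the diff). */
static bool should_sleep_new(bool detaching, bool has_dirty,
			     bool writeback_running)
{
	return !detaching && (!has_dirty || !writeback_running);
}

/*
 * Walk all eight input combinations and count those on which the two
 * tests disagree. Every disagreement must be the "detaching and clean"
 * case, where the old test sleeps and the new one keeps running.
 */
static int count_disagreements(void)
{
	int diff = 0;

	for (int d = 0; d <= 1; d++)
		for (int h = 0; h <= 1; h++)
			for (int r = 0; r <= 1; r++) {
				bool o = should_sleep_old(d, h, r);
				bool n = should_sleep_new(d, h, r);

				if (o != n) {
					assert(d && !h);
					assert(o && !n);
					diff++;
				}
			}
	return diff;
}
```

Calling count_disagreements() from a small main() exercises the whole
truth table. In this model the two tests differ only when the device is
detaching and the cache is already clean, which is the case in which the
thread has to stay awake to reach the new break statement.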

It is safe to do the following check without locking, let me explain why,
+	if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
+	    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {

If the kernel thread does not sleep and continues to run because the
conditions are not updated in time on the running CPU core, it just
consumes more CPU cycles and does no harm. This should-sleep-but-run case
is safe here. We just focus on the should-run-but-sleep condition, which
means the writeback thread goes to sleep by mistake while it should
continue to run.
1, First of all, whether or not the writeback thread is hung,
   kthread_stop() from cached_dev_detach_finish() will wake it up and
   terminate it by making kthread_should_stop() return true. And in normal
   run time, the bit at index BCACHE_DEV_DETACHING is always cleared, so
   the condition
	!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)
   is always true and can be ignored as a constant value.
2, If one of the following conditions is true, the writeback thread should
   go to sleep,
   "!atomic_read(&dc->has_dirty)" or "!dc->writeback_running"
   Each of them independently controls whether the writeback thread should
   sleep or not, let's analyse them one by one.
2.1 condition "!atomic_read(&dc->has_dirty)"
   If dc->has_dirty is set from 0 to 1 on another CPU core, bcache will
   call bch_writeback_queue() immediately or call bch_writeback_add(),
   which indirectly calls bch_writeback_queue() too. In
   bch_writeback_queue(), wake_up_process(dc->writeback_thread) is called.
   It sets the writeback thread's task state to TASK_RUNNING with an
   implicit memory barrier, then tries to wake up the writeback thread.
   In the writeback thread, the task state is set to TASK_INTERRUPTIBLE
   before doing the condition check. If the other CPU core sets the
   TASK_RUNNING state after the writeback thread sets TASK_INTERRUPTIBLE,
   the writeback thread will be scheduled to run very soon because its
   state is not TASK_INTERRUPTIBLE.
   If the other CPU core sets the TASK_RUNNING state before the writeback
   thread sets TASK_INTERRUPTIBLE, the implicit memory barrier of
   wake_up_process() will make sure the modification of dc->has_dirty on
   the other CPU core is updated and observed on the CPU core of the
   writeback thread. Therefore the condition check will correctly be
   false, and the thread continues the writeback code without sleeping.
2.2 condition "!dc->writeback_running"
   dc->writeback_running can be changed via a sysfs file, and every time
   it is modified, a following bch_writeback_queue() is always called. So
   the change is always observed on the CPU core of the writeback thread.
   If dc->writeback_running is changed from 0 to 1 on another CPU core,
   this condition check will observe the modification and allow the
   writeback thread to continue to run without sleeping.
Now we can see that, even without locking protection, checking multiple
conditions is safe here; no deadlock or process hang will happen.

I composed a separate patch because the patch "bcache: fix
cached_dev->count usage for bch_cache_set_error()" already got a
"Reviewed-by:" from Hannes Reinecke. Also this fix is not trivial and
deserves a separate patch.

Signed-off-by: Coly Li
Reviewed-by: Michael Lyle
Cc: Hannes Reinecke
Cc: Huijun Tang
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
---
 drivers/md/bcache/writeback.c |   20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -565,9 +565,15 @@ static int bch_writeback_thread(void *ar
 	while (!kthread_should_stop()) {
 		down_write(&dc->writeback_lock);
 		set_current_state(TASK_INTERRUPTIBLE);
-		if (!atomic_read(&dc->has_dirty) ||
-		    (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
-		     !dc->writeback_running)) {
+		/*
+		 * If the bache device is detaching, skip here and continue
+		 * to perform writeback. Otherwise, if no dirty data on cache,
+		 * or there is dirty data on cache but writeback is disabled,
+		 * the writeback thread should sleep here and wait for others
+		 * to wake up it.
+		 */
+		if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
+		    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {
 			up_write(&dc->writeback_lock);
 
 			if (kthread_should_stop()) {
@@ -588,6 +594,14 @@ static int bch_writeback_thread(void *ar
 			cached_dev_put(dc);
 			SET_BDEV_STATE(&dc->sb, BDEV_STATE_CLEAN);
 			bch_write_bdev_super(dc, NULL);
+			/*
+			 * If bcache device is detaching via sysfs interface,
+			 * writeback thread should stop after there is no dirty
+			 * data on cache. BCACHE_DEV_DETACHING flag is set in
+			 * bch_cached_dev_detach().
+			 */
+			if (test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags))
+				break;
 		}
 
 		up_write(&dc->writeback_lock);
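
The control flow that the two hunks above produce can be modelled in a few
lines of userspace C. This is a simplified sketch under stated
assumptions, not the driver code: struct model_dev, enum model_exit and
model_writeback_loop() are hypothetical names, a single integer stands in
for both dc->has_dirty and the writeback_keys tree, each loop iteration
pretends to write back exactly one key, and the final refcount decrement
stands in for the cached_dev_put() that the commit message says is called
when the thread quits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct cached_dev. */
struct model_dev {
	bool detaching;		/* models the BCACHE_DEV_DETACHING bit */
	bool writeback_running;	/* models dc->writeback_running */
	int dirty_keys;		/* models has_dirty and writeback_keys */
	int refcount;		/* models the dc->count reference of the thread */
};

enum model_exit { MODEL_SLEPT, MODEL_DETACH_EXIT, MODEL_STEP_LIMIT };

/*
 * Run the modelled loop for at most max_iters iterations and report why
 * it stopped: it would have gone to sleep, it quit because the device is
 * detaching and clean, or it ran out of iterations.
 */
static enum model_exit model_writeback_loop(struct model_dev *dc,
					    int max_iters)
{
	for (int i = 0; i < max_iters; i++) {
		bool has_dirty = dc->dirty_keys > 0;

		/* Same shape as the patched sleep test. */
		if (!dc->detaching &&
		    (!has_dirty || !dc->writeback_running))
			return MODEL_SLEPT;

		/* Assumption: one key is written back per iteration. */
		if (dc->dirty_keys > 0)
			dc->dirty_keys--;

		/* Clean and detaching: drop the reference and quit. */
		if (dc->dirty_keys == 0 && dc->detaching) {
			assert(dc->refcount > 0);
			dc->refcount--;
			return MODEL_DETACH_EXIT;
		}
	}
	return MODEL_STEP_LIMIT;
}
```

In this model a detaching device with three dirty keys finishes writeback
first and only then quits with its reference dropped, while a device that
is not detaching goes to sleep once it is clean and keeps its reference.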