Received: by 10.213.65.68 with SMTP id h4csp695398imn; Tue, 13 Mar 2018 18:41:55 -0700 (PDT) X-Google-Smtp-Source: AG47ELup8K/txc2/rzWP7h9xN32hUC8yrmlvkV2h6Mbt0h5LnqMlrEGqfKB3scYtW221jhdTLuVn X-Received: by 10.99.125.75 with SMTP id m11mr2136966pgn.391.1520991715589; Tue, 13 Mar 2018 18:41:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1520991715; cv=none; d=google.com; s=arc-20160816; b=lUN4wdVtnMFP3EWGsL/izTUT/XXehUQgRxfRoKomdaerXGFtjqO9rUwOXT5ILKYA3s n1SQwuKTaF/LQReSiAYeYZyGRCdk0RKM5rbhksnkiNRnPby7nK9s/Qi2dJ+x+nGTG3We iBKIyHAtnXB5kyaw/kofhHPLb2jO+UvjefZj8VguX5e0KJjQWbIk2cgDgWhWqm4c5xbu gA0CjurA2UN1eX8ZfIk1uIKIL6KbMZz/rC2WKR+4yWZa+chsx8LrzWgutOgjBo4Dg+4b 85+9jpWZQDXhTA6GDrbMthzRWgUPGDh4lgmzlPMxOhp+UG7MNb4FIrSaTSjDUAfVU7i7 mJ1w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=9MzKgCS8QCIUuc6uSm9SrEO+FLcgBjwpgNfWPfsvl/M=; b=me+8eieLHmWMlfFx+SOGuqhkb0CMBCTqCEYfSwMK1VlWh+ZWiol/OxNS+R/LDqetxr +q3JKBo9HGzUpU2iREggJa06TZ25egXcfeLB/IrcmOuLk2PFSnSLruni+xEAO+q+ziB2 AYUuqwwfHUy1CojKwSF6p8m8JA28Cyn5Cd7SO6YjYC0F6Rt1Ua2tKZDcCX54EwYdWpTP FJvkgfvs02k6GxH5xhE5XqLMPFCVTZL6ismioD9gX/z+tcBKa7c7iibXgB/h+XxiCfY1 niqhEN/wdcDHhwzMV2qatMgi0eyD6Lv2F1WHvC+2YRHLdFk5g5tWT0jxMSYRUFtnpwKr KYsQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k198si1020687pgc.545.2018.03.13.18.41.41; Tue, 13 Mar 2018 18:41:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933164AbeCNBkg (ORCPT + 99 others); Tue, 13 Mar 2018 21:40:36 -0400 Received: from magic.merlins.org ([209.81.13.136]:33565 "EHLO mail1.merlins.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932636AbeCNBke (ORCPT ); Tue, 13 Mar 2018 21:40:34 -0400 Received: from 107-182-35-68.volcanocom.com ([107.182.35.68]:57916 helo=legolas.merlins.org) by mail1.merlins.org with esmtpsa (Cipher TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.87 #1) id 1evvP2-0003rJ-Ue by authid with srv_auth_plain; Tue, 13 Mar 2018 18:40:29 -0700 Received: from merlin by legolas.merlins.org with local (Exim 4.80) (envelope-from ) id 1evvP2-0004KN-9n; Tue, 13 Mar 2018 18:40:28 -0700 Date: Tue, 13 Mar 2018 18:40:28 -0700 From: Marc MERLIN To: Michael Lyle , linux-bcache@vger.kernel.org Cc: Tang Junhui , Jens Axboe Subject: Re: [PATCH 4.14 095/140] bcache: fix crashes in duplicate cache device register Message-ID: <20180314014028.tybbyq2gtmh64mga@merlins.org> References: <20180313152458.201155692@linuxfoundation.org> <20180313152504.537862990@linuxfoundation.org> <20180313161907.tu4afzqrgmkvzzxv@merlins.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Sysadmin: BOFH X-URL: http://marc.merlins.org/ User-Agent: NeoMutt/20160916 (1.7.0) X-SA-No-Run: Yes X-SA-Exim-Connect-IP: 107.182.35.68 X-SA-Exim-Mail-From: marc@merlins.org Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [linux-kernel to bcc, moving back to bcache list] On Tue, Mar 13, 2018 at 10:26:33AM -0700, Michael Lyle wrote: > Though note you're still not safe from -that-. If there's duplicate > UUIDs around because you've duplicated devices, there's just no sane > way to tell which is the "right one" to attach to. Thanks for clearing that up, Mike. So, what happened to me was 1) I dd'ed drive1 to drive2 (raw device) 2) while that was going on, I ran fdisk on drive2 to fix a partition type 3) saving fdisk caused drive2 to be rescanned by the kernel 4) udev said, oh, a bcache partition, yummy, let me register that 5) instead I got a kernel crash that got fixed by this patch 6) tried to reboot a few times, and each time the kernel would crash early, until I found out it was bcache, removed drive2, system came back up 7) by then, my bcache filesystem was heavily corrupted and unsuable If there is a duplicate cache device UUID, wouldn't bcache just use the first one it sees and ignore the 2nd one? In my case this would have been the safe thing and I'm guessing in most cases, whatever device the UUID got duplicated on, will come 2nd in the boot order, and therefore is safer to ignore, even if the duplicate situation isn't safe per se. What do you think? Thanks, Marc > Mike > > On Tue, Mar 13, 2018 at 9:19 AM, Marc MERLIN wrote: > > On Tue, Mar 13, 2018 at 04:24:58PM +0100, Greg Kroah-Hartman wrote: > >> 4.14-stable review patch. If anyone has any objections, please let me know. > > > > Just in case someone is considering whether it's important to merge, the > > bug did crash my kernel of course, but I'm virtually certain it was also > > responsible for corrupting my existing bcache device enough that I had > > to restore it from backup. > > > > Thanks again to Tang for fixing it. > > > > > >> ------------------ > >> > >> From: Tang Junhui > >> > >> commit cc40daf91bdddbba72a4a8cd0860640e06668309 upstream. > >> > >> Kernel crashed when register a duplicate cache device, the call trace is > >> bellow: > >> [ 417.643790] CPU: 1 PID: 16886 Comm: bcache-register Tainted: G > >> W OE 4.15.5-amd64-preempt-sysrq-20171018 #2 > >> [ 417.643861] Hardware name: LENOVO 20ERCTO1WW/20ERCTO1WW, BIOS > >> N1DET41W (1.15 ) 12/31/2015 > >> [ 417.643870] RIP: 0010:bdevname+0x13/0x1e > >> [ 417.643876] RSP: 0018:ffffa3aa9138fd38 EFLAGS: 00010282 > >> [ 417.643884] RAX: 0000000000000000 RBX: ffff8c8f2f2f8000 RCX: ffffd6701f8 > >> c7edf > >> [ 417.643890] RDX: ffffa3aa9138fd88 RSI: ffffa3aa9138fd88 RDI: 00000000000 > >> 00000 > >> [ 417.643895] RBP: ffffa3aa9138fde0 R08: ffffa3aa9138fae8 R09: 00000000000 > >> 1850e > >> [ 417.643901] R10: ffff8c8eed34b271 R11: ffff8c8eed34b250 R12: 00000000000 > >> 00000 > >> [ 417.643906] R13: ffffd6701f78f940 R14: ffff8c8f38f80000 R15: ffff8c8ea7d > >> 90000 > >> [ 417.643913] FS: 00007fde7e66f500(0000) GS:ffff8c8f61440000(0000) knlGS: > >> 0000000000000000 > >> [ 417.643919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > >> [ 417.643925] CR2: 0000000000000314 CR3: 00000007e6fa0001 CR4: 00000000003 > >> 606e0 > >> [ 417.643931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000 > >> 00000 > >> [ 417.643938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000 > >> 00400 > >> [ 417.643946] Call Trace: > >> [ 417.643978] register_bcache+0x1117/0x1270 [bcache] > >> [ 417.643994] ? slab_pre_alloc_hook+0x15/0x3c > >> [ 417.644001] ? slab_post_alloc_hook.isra.44+0xa/0x1a > >> [ 417.644013] ? kernfs_fop_write+0xf6/0x138 > >> [ 417.644020] kernfs_fop_write+0xf6/0x138 > >> [ 417.644031] __vfs_write+0x31/0xcc > >> [ 417.644043] ? current_kernel_time64+0x10/0x36 > >> [ 417.644115] ? __audit_syscall_entry+0xbf/0xe3 > >> [ 417.644124] vfs_write+0xa5/0xe2 > >> [ 417.644133] SyS_write+0x5c/0x9f > >> [ 417.644144] do_syscall_64+0x72/0x81 > >> [ 417.644161] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 > >> [ 417.644169] RIP: 0033:0x7fde7e1c1974 > >> [ 417.644175] RSP: 002b:00007fff13009a38 EFLAGS: 00000246 ORIG_RAX: 0000000 > >> 000000001 > >> [ 417.644183] RAX: ffffffffffffffda RBX: 0000000001658280 RCX: 00007fde7e1c > >> 1974 > >> [ 417.644188] RDX: 000000000000000a RSI: 0000000001658280 RDI: 000000000000 > >> 0001 > >> [ 417.644193] RBP: 000000000000000a R08: 0000000000000003 R09: 000000000000 > >> 0077 > >> [ 417.644198] R10: 000000000000089e R11: 0000000000000246 R12: 000000000000 > >> 0001 > >> [ 417.644203] R13: 000000000000000a R14: 7fffffffffffffff R15: 000000000000 > >> 0000 > >> [ 417.644213] Code: c7 c2 83 6f ee 98 be 20 00 00 00 48 89 df e8 6c 27 3b 0 > >> 0 48 89 d8 5b c3 0f 1f 44 00 00 48 8b 47 70 48 89 f2 48 8b bf 80 00 00 00 <8 > >> b> b0 14 03 00 00 e9 73 ff ff ff 0f 1f 44 00 00 48 8b 47 40 39 > >> [ 417.644302] RIP: bdevname+0x13/0x1e RSP: ffffa3aa9138fd38 > >> [ 417.644306] CR2: 0000000000000314 > >> > >> When registering duplicate cache device in register_cache(), after failure > >> on calling register_cache_set(), bch_cache_release() will be called, then > >> bdev will be freed, so bdevname(bdev, name) caused kernel crash. > >> > >> Since bch_cache_release() will free bdev, so in this patch we make sure > >> bdev being freed if register_cache() fail, and do not free bdev again in > >> register_bcache() when register_cache() fail. > >> > >> Signed-off-by: Tang Junhui > >> Reported-by: Marc MERLIN > >> Tested-by: Michael Lyle > >> Reviewed-by: Michael Lyle > >> Cc: > >> Signed-off-by: Jens Axboe > >> Signed-off-by: Greg Kroah-Hartman > >> > >> --- > >> drivers/md/bcache/super.c | 16 ++++++++++------ > >> 1 file changed, 10 insertions(+), 6 deletions(-) > >> > >> --- a/drivers/md/bcache/super.c > >> +++ b/drivers/md/bcache/super.c > >> @@ -1181,7 +1181,7 @@ static void register_bdev(struct cache_s > >> > >> return; > >> err: > >> - pr_notice("error opening %s: %s", bdevname(bdev, name), err); > >> + pr_notice("error %s: %s", bdevname(bdev, name), err); > >> bcache_device_stop(&dc->disk); > >> } > >> > >> @@ -1849,6 +1849,8 @@ static int register_cache(struct cache_s > >> const char *err = NULL; /* must be set for any error case */ > >> int ret = 0; > >> > >> + bdevname(bdev, name); > >> + > >> memcpy(&ca->sb, sb, sizeof(struct cache_sb)); > >> ca->bdev = bdev; > >> ca->bdev->bd_holder = ca; > >> @@ -1857,11 +1859,12 @@ static int register_cache(struct cache_s > >> ca->sb_bio.bi_io_vec[0].bv_page = sb_page; > >> get_page(sb_page); > >> > >> - if (blk_queue_discard(bdev_get_queue(ca->bdev))) > >> + if (blk_queue_discard(bdev_get_queue(bdev))) > >> ca->discard = CACHE_DISCARD(&ca->sb); > >> > >> ret = cache_alloc(ca); > >> if (ret != 0) { > >> + blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL); > >> if (ret == -ENOMEM) > >> err = "cache_alloc(): -ENOMEM"; > >> else > >> @@ -1884,14 +1887,14 @@ static int register_cache(struct cache_s > >> goto out; > >> } > >> > >> - pr_info("registered cache device %s", bdevname(bdev, name)); > >> + pr_info("registered cache device %s", name); > >> > >> out: > >> kobject_put(&ca->kobj); > >> > >> err: > >> if (err) > >> - pr_notice("error opening %s: %s", bdevname(bdev, name), err); > >> + pr_notice("error %s: %s", name, err); > >> > >> return ret; > >> } > >> @@ -1980,6 +1983,7 @@ static ssize_t register_bcache(struct ko > >> if (err) > >> goto err_close; > >> > >> + err = "failed to register device"; > >> if (SB_IS_BDEV(sb)) { > >> struct cached_dev *dc = kzalloc(sizeof(*dc), GFP_KERNEL); > >> if (!dc) > >> @@ -1994,7 +1998,7 @@ static ssize_t register_bcache(struct ko > >> goto err_close; > >> > >> if (register_cache(sb, sb_page, bdev, ca) != 0) > >> - goto err_close; > >> + goto err; > >> } > >> out: > >> if (sb_page) > >> @@ -2007,7 +2011,7 @@ out: > >> err_close: > >> blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL); > >> err: > >> - pr_info("error opening %s: %s", path, err); > >> + pr_info("error %s: %s", path, err); > >> ret = -EINVAL; > >> goto out; > >> } > >> > >> > >> > > > > -- > > "A mouse is a device used to point at the xterm you want to type in" - A.S.R. > > Microsoft is to operating systems .... > > .... what McDonalds is to gourmet cooking > > Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 > -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08