In-Reply-To: <09f2480f-e8e8-645b-6d94-b6ae4ca47806@gentoo.org>
References: <20171109193715.GB21978@ZenIV.linux.org.uk> <40ad7c6e-f0d7-959a-bf29-d3e3843f5d31@gentoo.org> <23f7da04-95f7-24e7-ee70-ce40c5b8fee3@gentoo.org> <67939ef3-29c6-762c-7afe-46cc69630d95@gentoo.org> <3d948180-6bd7-c4e9-5ac8-5baef9cc15a7@gentoo.org> <09f2480f-e8e8-645b-6d94-b6ae4ca47806@gentoo.org>
From: Kees Cook
Date: Fri, 17 Nov 2017 13:26:07 -0800
Subject: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
To: Patrick McLean
Cc: Linus Torvalds, Emese Revfy, Al Viro, Bruce Fields, "Darrick J. Wong", Linux Kernel Mailing List, Linux NFS Mailing List, stable, Thorsten Leemhuis, "kernel-hardening@lists.openwall.com"
Sender: linux-nfs-owner@vger.kernel.org

On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean wrote:
> On 2017-11-16 04:54 PM, Kees Cook wrote:
>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote:
>>> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
>>>> Boris Lukashev points out that Patrick should probably check a newer
>>>> version of gcc.
>>>>
>>>> I looked around, and in one of the emails, Patrick said:
>>>>
>>>> "No changes, both the working and broken kernels were built with
>>>> distro-provided gcc 5.4.0 and binutils 2.28.1"
>>>>
>>>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
>>>> it's a bug-fix release to a pretty old branch that is not exactly new.
>>>>
>>>> It would probably be good to check if the problems persist with gcc
>>>> 6.x or 7.x..
>>>> I have no idea which gcc version the randstruct people
>>>> tend to use themselves.
>>>
>>> I just tested it with gcc 7.2, and was able to reproduce the NULL
>>> pointer dereference, the backtrace looks slightly different this time.
>>>
>>> I will also test with binutils 2.29, though I doubt that will make any
>>> difference.
>>>
>>>> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at 0000000000000560
>>>> [ 56.166563] IP: vfs_statfs+0x7c/0xc0
>>>> [ 56.167249] PGD 0 P4D 0
>>>> [ 56.167860] Oops: 0000 [#1] SMP
>>>> [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_multiport xt_addrtype iptable_mangle iptable>
>>>> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O 4.14.0-git-kratos-1 #1
>>>> [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
>>>> [ 56.182729] task: ffff88040c412a00 task.stack: ffffc90002c18000
>>>> [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
>>>> [ 56.184341] RSP: 0018:ffffc90002c1bb28 EFLAGS: 00010202
>>>> [ 56.185143] RAX: 0000000000000000 RBX: ffffc90002c1bbf0 RCX: 0000000000000020
>>>> [ 56.186085] RDX: 0000000000001801 RSI: 0000000000001801 RDI: 0000000000000000
>>>> [ 56.187066] RBP: ffffc90002c1bbc0 R08: ffffffffffffff00 R09: 00000000000000ff
>>>> [ 56.188268] R10: 000000000038be3a R11: ffff880408b18258 R12: 0000000000000000
>>>> [ 56.189336] R13: ffff88040c23ad00 R14: ffff88040b874000 R15: ffffc90002c1bbf0
>>>> [ 56.190444] FS: 0000000000000000(0000) GS:ffff88041fc00000(0000) knlGS:0000000000000000
>>>> [ 56.191876] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [ 56.192843] CR2: 0000000000000560 CR3: 0000000001e0a002 CR4: 00000000001606f0
>>>> [ 56.193898] Call Trace:
>>>> [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90
>>>> [ 56.195267] ? generic_permission+0x12c/0x1a0
>>>> [ 56.196025] nfsd4_encode_getattr+0x25/0x30
>>>> [ 56.196753] nfsd4_encode_operation+0x98/0x1b0
>>>> [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0
>>>> [ 56.198268] nfsd_dispatch+0xe8/0x220
>>>> [ 56.198968] svc_process_common+0x475/0x640
>>>> [ 56.199696] ? nfsd_destroy+0x60/0x60
>>>> [ 56.200404] svc_process+0xf2/0x1a0
>>>> [ 56.201079] nfsd+0xe3/0x150
>>>> [ 56.201706] kthread+0x117/0x130
>>>> [ 56.202354] ? kthread_create_on_node+0x40/0x40
>>>> [ 56.203100] ret_from_fork+0x25/0x30
>>>> [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
>>>> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: ffffc90002c1bb28
>>>> [ 56.207110] CR2: 0000000000000560
>>>> [ 56.207763] ---[ end trace d452986a80f64aaa ]---
>>>
>>>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote:
>>>>>
>>>>> I'll take a closer look at this and see if I can provide something to
>>>>> narrow it down.
>>
>> How reliable is this crash? The best idea I have to isolate it would
>> be to bisect the additions of the __randomize_layout markings on
>> various structures. I would start with the ones Al is most upset to
>> see randomized. ;)
>
> It's pretty reliable, once I get a bad seed I can reproduce the crash
> pretty quickly.
>
>>
>> All that said, I'd like to better understand the BIOS side of this a
>> little better. In the first email in this thread, you showed two BUGs
>> separated by a little time, which implies to me that the NULL deref
>> and the BIOS no longer POSTing are separate (though seemingly related)
>> issues. Have you had machines survive the BUG without blowing up the
>> BIOS?
>
> We had 3 machines die due to the BIOS issue (all of them pretty quickly
> with the bad-seed kernel). All the dead machines had the same
> motherboard model.
> I have not managed to reproduce the issue again on
> the machine I restored via the IPMI interface, I suspect that it may be
> a bug in the BIOS that was fixed in a more recent version.
>
>>
>> I'm still trying to wrap my head around how the BIOS could be blowing
>> up. I assume there's some magic memory address that is getting poked
>> as a result of some struct randomization bug, so tracking that down
>> should be possible assuming you can stand reflashing your BIOS across
>> the bisects.
>
> That is our theory, some magic memory address that caused an overwrite
> of the flash where the BIOS code is stored. We are working under the
> assumption that it was fixed in a more recent BIOS update, since I have
> not managed to reproduce the issue on the resurrected machine.

Okay, well that's certainly better than having to reflash at every
bisection step! :)

>> For the first step, I'd try a revert of
>> 9225331b310821760f39ba55b00b8973602adbb5, which enables a large
>> portion of struct randomization. If that doesn't change things, I can
>> provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9
>> and then re-applies __randomize_layout one structure per patch, and
>> you could bisect that?
>
> Sure, I can bisect that.

Okay, that should at least let us know if this is a specific struct
that is not expecting to get randomized, or if there is some deeper
flaw.

Here's the tree, based on 4.14:

https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/randstruct/bisection

With commit d9e12200852d, all randomization selections are reverted. I
would expect this to be a "good" kernel for the bisect. The very end
of the series (commit d893c17b3146), everything is back to being
randomized. I would expect this to be a "bad" kernel. Each step
between those two commits adds randomization to a single struct (with
the filesystem stuff near the front).

Here's hoping it'll be something obvious. :)

Thanks for taking the time to debug this!
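
For anyone following along: the bisect over that series is just a
binary search for the first per-struct patch that turns a "good" kernel
into a "bad" one. A minimal sketch in Python, assuming the crash is
monotonic across the series (every kernel at or after the culprit patch
is bad, which also assumes the same bad seed is kept for every build).
The patch names and the is_bad() check are hypothetical placeholders,
not taken from the actual tree; is_bad() stands in for "build, boot,
run the NFS workload, look for the oops".

```python
from typing import Callable, List, Sequence


def first_bad(patches: Sequence[str], is_bad: Callable[[int], bool]) -> str:
    """Return the first patch in the series that produces a bad kernel.

    is_bad(i) reports whether a kernel with patches[0..i] applied crashes.
    The end of the series is assumed to be bad, as in the tree above.
    """
    lo, hi = 0, len(patches) - 1  # invariant: the culprit index is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid          # culprit is at mid or earlier
        else:
            lo = mid + 1      # culprit is after mid
    return patches[lo]


# Hypothetical series: one struct re-randomized per patch (names are made up).
patches = ["struct_a", "struct_b", "struct_c", "struct_d", "struct_e"]
culprit = "struct_d"  # pretend this is the struct that must not be randomized

tested: List[int] = []  # records which steps would need a build and boot


def is_bad(i: int) -> bool:
    tested.append(i)
    return i >= patches.index(culprit)


found = first_bad(patches, is_bad)
print(found, "found after", len(tested), "test kernels")
```

With N patches in the series this needs about log2(N) build-and-boot
cycles, which is why one struct per patch keeps the search manageable.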

-Kees

--
Kees Cook
Pixel Security