From: Linus Torvalds
Date: Mon, 18 Dec 2017 15:50:52 -0800
Subject: Re: proc_flush_task oops
To: Dave Jones, Al Viro, Linux Kernel
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 18, 2017 at 3:10 PM, Dave Jones wrote:
> On Mon, Dec 18, 2017 at 10:15:41PM +0000, Al Viro wrote:
> > On Mon, Dec 18, 2017 at 04:44:38PM -0500, Dave Jones wrote:
> > > I've hit this twice today. It's odd, because afaics, none of this code
> > > has really changed in a long time.
> >
> > Which tree had that been?
>
> Linus, rc4.

Ok, so the original report was marked as spam for me for whatever
reason.

I ended up re-analyzing the oops, but came to the same conclusion you
did: it's a NULL mnt pointer in proc_flush_task_mnt().

The code disassembles to

   0:   c1 e2 04                shl    $0x4,%edx
   3:   44 8b 60 30             mov    0x30(%rax),%r12d
   7:   48 8b 40 38             mov    0x38(%rax),%rax
   b:   44 8b 34 11             mov    (%rcx,%rdx,1),%r14d
   f:   48 c7 c2 60 3a f5 81    mov    $0xffffffff81f53a60,%rdx
  16:   44 89 e1                mov    %r12d,%ecx
  19:   4c 8b 68 58             mov    0x58(%rax),%r13
  1d:   e8 4b b4 77 00          callq  0x77b46d
  22:   89 44 24 14             mov    %eax,0x14(%rsp)
  26:   48 8d 74 24 10          lea    0x10(%rsp),%rsi
  2b:*  49 8b 7d 00             mov    0x0(%r13),%rdi     <-- trapping instruction
  2f:   e8 b9 6a f9 ff          callq  0xfffffffffff96aed
  34:   48 85 c0                test   %rax,%rax
  37:   74 1a                   je     0x53
  39:   48 89 c7                mov    %rax,%rdi

and just matching that up against the code I see generated, that first
call is the call to snprintf, and the second call is to
d_hash_and_lookup.

So it's one of these two patterns (pid vs tgid):

        name.len = snprintf(buf, sizeof(buf), "%d", pid);
        /* no ->d_hash() rejects on procfs */
        dentry = d_hash_and_lookup(mnt->mnt_root, &name);

and that "mov 0x0(%r13),%rdi" that traps is "mnt->mnt_root".

But I don't see what would have changed in this area recently.

Do you end up saving the seeds that cause crashes? Is this
reproducible? (Other than seeing it twice, of course)

                Linus
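
A note on why the trapping load has a zero displacement: the disassembly is consistent with mnt_root being the first member of struct vfsmount, so "mnt->mnt_root" becomes a load from offset 0 of the pointer held in %r13, and a NULL mnt faults on exactly that instruction. The sketch below is user-space C with stand-in types. Every name ending in _sketch is hypothetical and is not a kernel definition; only the member order and the snprintf step mirror the pattern quoted above. The NULL check is there to make the failure mode visible, not as a proposed fix.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for struct dentry; contents are irrelevant here. */
struct dentry_sketch {
    int unused;
};

/* Stand-in for struct vfsmount. Assumption: mnt_root comes first,
 * which matches the 0x0 displacement in the trapping instruction. */
struct vfsmount_sketch {
    struct dentry_sketch *mnt_root;
    void *mnt_sb;
    int mnt_flags;
};

/* Stand-in for the name/length pair handed to the lookup. */
struct name_sketch {
    unsigned int len;
    const char *name;
};

/* mnt->mnt_root is a load from offset 0 of mnt. */
_Static_assert(offsetof(struct vfsmount_sketch, mnt_root) == 0,
               "mnt_root is expected at offset 0");

/*
 * Hypothetical helper: prepares the two arguments that the quoted
 * pattern passes to d_hash_and_lookup(). Returns 0 on success, or -1
 * if mnt is NULL (the case that oopsed in the report).
 */
static int lookup_args_sketch(const struct vfsmount_sketch *mnt, int pid,
                              char *buf, size_t buflen,
                              struct name_sketch *name,
                              struct dentry_sketch **root)
{
    if (!mnt)
        return -1;              /* without this check, the next load faults */

    name->name = buf;
    name->len = (unsigned int)snprintf(buf, buflen, "%d", pid);
    *root = mnt->mnt_root;      /* the "mov 0x0(%r13),%rdi" equivalent */
    return 0;
}
```

Calling lookup_args_sketch() with a NULL mnt returns -1 instead of dereferencing it; with a valid mnt it fills in the decimal pid string and its length, and hands back the root dentry pointer.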