Date: Mon, 30 Dec 2019 19:32:24 +1100
From: Aleksa Sarai
To: Linus Torvalds
Cc: Al Viro, David Howells, Eric Biederman, stable, Christian Brauner, Serge Hallyn, dev@opencontainers.org, Linux Containers, Linux API, linux-fsdevel, Linux Kernel Mailing List
Subject: Re: [PATCH RFC 0/1] mount: universally disallow mounting over symlinks
Message-ID: <20191230083224.sbk2jspqmup43obs@yavin.dot.cyphar.com>
References: <20191230052036.8765-1-cyphar@cyphar.com> <20191230054413.GX4203@ZenIV.linux.org.uk>
 <20191230054913.c5avdjqbygtur2l7@yavin.dot.cyphar.com> <20191230072959.62kcojxpthhdwmfa@yavin.dot.cyphar.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2019-12-29, Linus Torvalds wrote:
> On Sun, Dec 29, 2019 at 11:30 PM Aleksa Sarai wrote:
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000000
>
> Would you mind building with debug info, and then running the oops through
>
>    scripts/decode_stacktrace.sh
>
> which makes those addresses much more legible.

Will do.

> > #PF: supervisor instruction fetch in kernel mode
> > #PF: error_code(0x0010) - not-present page
>
> Somebody jumped through a NULL pointer.
>
> > RAX: 0000000000000000 RBX: ffff906d0cc3bb40 RCX: 0000000000000abc
> > RDX: 0000000000000089 RSI: ffff906d74623cc0 RDI: ffff906d74475df0
> > RBP: ffff906d74475df0 R08: ffffd70b7fb24c20 R09: ffff906d066a5000
> > R10: 0000000000000000 R11: 8080807fffffffff R12: ffff906d74623cc0
> > R13: 0000000000000089 R14: ffffb70b82963dc0 R15: 0000000000000080
> > FS:  00007fbc2a8f0540(0000) GS:ffff906dcf500000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffffffffd6 CR3: 00000003c68f8001 CR4: 00000000003606e0
> > Call Trace:
> >  __lookup_slow+0x94/0x160
>
> And "__lookup_slow()" has two indirect calls (they aren't obvious with
> retpoline, but look for something like
>
>    call __x86_indirect_thunk_rax
>
> which is the modern sad way of doing "call *%rax").
> One is for revalidating an old dentry, but the one I _suspect_ you
> trigger is this one:
>
>         old = inode->i_op->lookup(inode, dentry, flags);
>
> but I thought we only could get here if we know it's a directory.
>
> How did we miss the "d_can_lookup()", which is what should check that
> yes, we can call that ->lookup() routine.

I'll try applying a trivial patch to add d_can_lookup() to see if it
fixes the immediate issue.

> This is why I have that suspicion that it's somehow that O_PATH fd
> opened in another process without O_PATH causes confusion...
>
> So what I think has happened is that because of the O_PATH thing,
> we've ended up with an inode that has never been truly opened (because
> O_PATH skips that part), but then with the /proc//fd/xyz open, we
> now have a file descriptor that _looks_ like it is valid, and we're
> treating that inode as if it can be used.

I'm not sure I agree -- as I mentioned in my other mail, re-opening
through /proc/self/fd/$n works *very* well and has for a long time (in
fact, both LXC and runc depend on this working).

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
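
For readers following along, the re-open pattern referred to above (an O_PATH
descriptor re-opened through /proc/self/fd/$n) can be sketched from userspace
roughly as follows. This is only a minimal illustration in Python, not the code
LXC or runc actually use; the helper name reopen_via_procfs is hypothetical,
and it assumes a Linux system with /proc mounted.

```python
import os
import tempfile


def reopen_via_procfs(path: str) -> bytes:
    """Open `path` with O_PATH, then re-open it for reading via /proc/self/fd.

    Hypothetical helper for illustration only; assumes Linux with /proc mounted.
    """
    # O_PATH yields a handle that refers to the file but cannot be used for I/O.
    path_fd = os.open(path, os.O_PATH | os.O_CLOEXEC)
    try:
        # Opening the procfs magic link gives a regular, readable descriptor
        # for the same file, without walking `path` a second time.
        real_fd = os.open(f"/proc/self/fd/{path_fd}", os.O_RDONLY | os.O_CLOEXEC)
        try:
            return os.read(real_fd, 4096)
        finally:
            os.close(real_fd)
    finally:
        os.close(path_fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello\n")
        tmp.flush()
        print(reopen_via_procfs(tmp.name))
```

The sketch covers only the ordinary regular-file case; it does not attempt to
reproduce the mount-over-symlink scenario or the oops discussed in this thread.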