Subject: Re: [PATCH v7 03/14] x86/cet/ibt: Add IBT legacy code bitmap setup function
From: Andy Lutomirski
Date: Sat, 15 Jun 2019 08:30:08 -0700
To: Dave Hansen
Cc: Yu-cheng Yu, Peter Zijlstra, x86@kernel.org, "H. Peter Anvin",
    Thomas Gleixner, Ingo Molnar, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann,
    Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
    Eugene Syromiatnikov, Florian Weimer, "H.J. Lu", Jann Horn,
    Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
    Pavel Machek, Randy Dunlap, "Ravi V. Shankar", Vedvyas Shanbhogue,
    Dave Martin
In-Reply-To: <5d7012f6-7ab9-fd3d-4a11-294258e48fb5@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Jun 14, 2019, at 3:06 PM, Dave Hansen wrote:
>
>> On 6/14/19 2:34 PM, Yu-cheng Yu wrote:
>> On Fri, 2019-06-14 at 13:57 -0700, Dave Hansen wrote:
>>>> I have a related question:
>>>>
>>>> Do we allow the application to read the bitmap, or any fault from the
>>>> application on bitmap pages?
>>>
>>> We have to allow apps to read it. Otherwise they can't execute
>>> instructions.
>>
>> What I meant was, if an app executes some legacy code that results in bitmap
>> lookup, but the bitmap page is not yet populated, and if we then populate that
>> page with all-zero, a #CP should follow. So do we even populate that zero page
>> at all?
>>
>> I think we should; a #CP is more obvious to the user at least.
>
> Please make an effort to un-Intel-ificate your messages as much as
> possible. I'd really prefer that folks say "missing end branch fault"
> rather than #CP. I had to Google "#CP".
>
> I *think* you are saying that: The *only* lookups to this bitmap are on
> "missing end branch" conditions. Normal, proper-functioning code
> execution that has ENDBR instructions in it will never even look at the
> bitmap. The only case when we reference the bitmap locations is when
> the processor is about to do a "missing end branch fault" so that it can
> be suppressed. Any population with the zero page would be done when
> code had already encountered a "missing end branch" condition, and
> populating with a zero-filled page will guarantee that a "missing end
> branch fault" will result. You're arguing that we should just figure
> this out at fault time and not ever reach the "missing end branch fault"
> at all.
>
> Is that right?
>
> If so, that's an architecture subtlety that I missed until now and which
> went entirely unmentioned in the changelog and discussion up to this
> point. Let's make sure that nobody else has to walk that path by
> improving our changelog, please.
>
> In any case, I don't think this is worth special-casing our zero-fill
> code, FWIW. It's not performance critical and not worth the complexity.
> If apps want to handle the signals and abuse this to fill space up with
> boring page table contents, they're welcome to. There are much easier
> ways to consume a lot of memory.

Isn’t it a special case either way? Either we look at CR2 and populate
a page, or we look at CR2 and the “tracker” state and send a different
signal. Admittedly the former is very common in the kernel.

>
>>> We don't have to allow them to (populating) fault on it. But, if we
>>> don't, we need some kind of kernel interface to avoid the faults.
>>
>> The plan is:
>>
>> * Move STACK_TOP (and vdso) down to give space to the bitmap.
>
> Even for apps with 57-bit address spaces?
>
>> * Reserve the bitmap space from (mm->start_stack + PAGE_SIZE) to cover a code
>> size of TASK_SIZE_LOW, which is (TASK_SIZE_LOW / PAGE_SIZE / 8).
>
> The bitmap size is determined by CR4.LA57, not the app. If you place
> the bitmap here, won't references to it for high addresses go into the
> high address space?
>
> Specifically, on a CR4.LA57=0 system, we have 48 bits of address space,
> so 128TB for apps. You are proposing sticking the bitmap above the
> stack which is near the top of that 128TB address space. But on a
> 5-level paging system with CR4.LA57=1, there could be valid data at
> 129TB. Is there something keeping that data from being mistaken for
> being part of the bitmap?
>

I think we need to make the vma be full sized: it should cover the
entire range that the CPU might access. If that means it spans the
48-bit boundary, so be it.

> Also, if you're limiting it to TASK_SIZE_LOW, please don't forget that
> this is yet another thing that probably won't work with the vsyscall
> page. Please make sure you consider it and mention it in your next post.

Why not? The vsyscall page is at a negative address.

>
>> * Mmap the space only when the app issues the first mark-legacy prctl. This
>> avoids the core-dump issue for most apps and the accounting problem that
>> MAP_NORESERVE probably won't solve

What happens if there’s another VMA there by the time you map it?
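
For reference, the sizing arithmetic behind the "full sized" point can be
sketched roughly as below. This is only an illustration, not code from the
patch: it assumes 4 KiB pages and one bitmap bit per page (which is what
the quoted TASK_SIZE_LOW / PAGE_SIZE / 8 formula implies), it rounds the
user address space to 2^47 and 2^56 bytes, and the helper names
bitmap_bytes(), bitmap_byte_offset() and bitmap_covers() are made up for
this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12                   /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Bytes of bitmap needed to cover `span` bytes at one bit per page. */
static uint64_t bitmap_bytes(uint64_t span)
{
    assert(span % PAGE_SIZE == 0);      /* the span must be page aligned */
    return (span >> PAGE_SHIFT) / 8;
}

/* Byte offset, from the bitmap base, of the byte holding the bit for `addr`. */
static uint64_t bitmap_byte_offset(uint64_t addr)
{
    return (addr >> PAGE_SHIFT) / 8;
}

/* True if a bitmap sized for `span` bytes contains the bit for `addr`. */
static bool bitmap_covers(uint64_t span, uint64_t addr)
{
    return bitmap_byte_offset(addr) < bitmap_bytes(span);
}
```

Under those assumptions, bitmap_bytes(1ULL << 47) comes to 4 GiB and
bitmap_bytes(1ULL << 56) to 2 TiB. An address at 129 TiB maps to a byte
offset just past 4 GiB, so bitmap_covers(1ULL << 47, 129ULL << 40) is
false: a lookup for that address would land beyond a bitmap sized only
for the low 128 TiB.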