Subject: Re: [RFC PATCH v4 3/9] x86/cet/ibt: Add IBT legacy code bitmap allocation function
From: Yu-cheng Yu
To: Eugene Syromiatnikov, Andy Lutomirski
Cc: X86 ML, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, LKML,
 "linux-doc@vger.kernel.org", Linux-MM, linux-arch, Linux API,
 Arnd Bergmann, Balbir Singh, Cyrill Gorcunov, Dave Hansen,
 Florian Weimer, "H. J. Lu", Jann Horn, Jonathan Corbet, Kees Cook,
 Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek, Peter Zijlstra,
 Randy Dunlap, "Shankar, Ravi V", "Shanbhogue, Vedvyas"
Date: Wed, 10 Oct 2018 08:56:03 -0700
In-Reply-To: <20181005172622.GD19360@asgard.redhat.com>
References: <20180921150553.21016-1-yu-cheng.yu@intel.com>
 <20180921150553.21016-4-yu-cheng.yu@intel.com>
 <20181003195702.GF32759@asgard.redhat.com>
 <5BF3AE8F-CC2A-4160-9FF6-FEA171A76371@amacapital.net>
 <20181005172622.GD19360@asgard.redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2018-10-05 at 10:26 -0700, Eugene Syromiatnikov wrote:
> On Fri, Oct 05, 2018 at 10:07:46AM -0700, Andy Lutomirski wrote:
> > On Fri, Oct 5, 2018 at 10:03 AM Yu-cheng Yu wrote:
> > >
> > > On Fri, 2018-10-05 at 09:28 -0700, Andy Lutomirski wrote:
> > > > > On Oct 5, 2018, at 9:13 AM, Yu-cheng Yu wrote:
> > > > >
> > > > > > On Wed, 2018-10-03 at 21:57 +0200, Eugene Syromiatnikov wrote:
> > > > > > > On Fri, Sep 21, 2018 at 08:05:47AM -0700, Yu-cheng Yu wrote:
> > > > > > > Indirect branch tracking provides an optional legacy code bitmap
> > > > > > > that indicates locations of non-IBT compatible code. When set,
> > > > > > > each bit in the bitmap represents a page in the linear address is
> > > > > > > legacy code.
> > > > > > >
> > > > > > > We allocate the bitmap only when the application requests it.
> > > > > > > Most applications do not need the bitmap.
> > > > > > >
> > > > > > > Signed-off-by: Yu-cheng Yu
> > > > > > > ---
> > > > > > >  arch/x86/kernel/cet.c | 45 +++++++++++++++++++++++++++++++++++++++++++
> > > > > > >  1 file changed, 45 insertions(+)
> > > > > > >
> > > > > > > diff --git a/arch/x86/kernel/cet.c b/arch/x86/kernel/cet.c
> > > > > > > index 6adfe795d692..a65d9745af08 100644
> > > > > > > --- a/arch/x86/kernel/cet.c
> > > > > > > +++ b/arch/x86/kernel/cet.c
> > > > > > > @@ -314,3 +314,48 @@ void cet_disable_ibt(void)
> > > > > > >  	wrmsrl(MSR_IA32_U_CET, r);
> > > > > > >  	current->thread.cet.ibt_enabled = 0;
> > > > > > >  }
> > > > > > > +
> > > > > > > +int cet_setup_ibt_bitmap(void)
> > > > > > > +{
> > > > > > > +	u64 r;
> > > > > > > +	unsigned long bitmap;
> > > > > > > +	unsigned long size;
> > > > > > > +
> > > > > > > +	if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > > > > +		return -EOPNOTSUPP;
> > > > > > > +
> > > > > > > +	if (!current->thread.cet.ibt_bitmap_addr) {
> > > > > > > +		/*
> > > > > > > +		 * Calculate size and put in thread header.
> > > > > > > +		 * may_expand_vm() needs this information.
> > > > > > > +		 */
> > > > > > > +		size = TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE;
> > > > > >
> > > > > > TASK_SIZE_MAX is likely needed here, as an application can easily switch
> > > > > > between long an 32-bit protected mode. And then the case of a CPU that
> > > > > > doesn't support 5LPT.
> > > > >
> > > > > If we had calculated bitmap size from TASK_SIZE_MAX, all 32-bit apps would have
> > > > > failed the allocation for bitmap size > TASK_SIZE. Please see values below,
> > > > > which is printed from the current code.
> > > > >
> > > > > Yu-cheng
> > > > >
> > > > >
> > > > > x64:
> > > > > TASK_SIZE_MAX = 0000 7fff ffff f000
> > > > > TASK_SIZE     = 0000 7fff ffff f000
> > > > > bitmap size   = 0000 0000 ffff ffff
> > > > >
> > > > > x32:
> > > > > TASK_SIZE_MAX = 0000 7fff ffff f000
> > > > > TASK_SIZE     = 0000 0000 ffff e000
> > > > > bitmap size   = 0000 0000 0001 ffff
> > > > >
> > > > >
> > > > I haven’t followed all the details here, but I have a general policy of
> > > > objecting to any new use of TASK_SIZE. If you really really need to depend on
> > > > 32-bitness in new code, please figure out what exactly you mean by “32-bit”
> > > > and use an explicit check.
> > >
> > > The explicit check would be:
> > >
> > >     test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX
> > >
> > > which is the same as TASK_SIZE.
> >
> > But this is only ever done in response to a syscall, right? So
> > wouldn't in_compat_syscall() be the right check?
> >
> > Also, this whole thing makes me extremely nervous. The MSR only
> > contains the start address, not the size, right? So what prevents
> > some goof from causing the CPU to read way past the end of the bitmap
> > if the bitmap is short because the kernel thought it was supposed to
> > be 32-bit?
>
> That's what I've mentioned initially: every syscall made with int 0x80
> is interpreted as compat, even if it was made from long mode.
>
> > I'm inclined to suggest something awful-ish: always allocate the
> > bitmap as though it's for a 64-bit process, and just let it be at a
> > high address. And add a syscall or arch_prctl() to manipulate it for
> > the benefit of 32-bit programs that can't address it directly.
>
> That's likely the only way to go.

This bitmap is needed only when the app does dlopen() a non-IBT .so file.
Most applications do not need it. Can't we let dlopen mmap() the bitmap
when needed and pass it to the kernel?

Yu-cheng
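
For reference, the size arithmetic discussed in the thread can be reproduced in userspace. The following is only a sketch, not the kernel code: the helper names ibt_bitmap_size() and ibt_bitmap_byte_offset() are made up for illustration, the 4 KiB page size is assumed, and the linear one-bit-per-page layout is taken from the patch description rather than from any hardware documentation. The two TASK_SIZE constants are the values printed earlier in the thread.

```c
#include <stdint.h>

/* Assumptions for illustration: 4 KiB pages, 8 bits per byte. */
#define SKETCH_PAGE_SIZE     4096ULL
#define SKETCH_BITS_PER_BYTE 8ULL

/* TASK_SIZE values as quoted in the thread. */
#define TASK_SIZE_X64 0x00007ffffffff000ULL
#define TASK_SIZE_X32 0x00000000ffffe000ULL

/*
 * Hypothetical helper: bitmap size in bytes for an address space of
 * task_size bytes, one bit per page. Mirrors the expression
 * TASK_SIZE / PAGE_SIZE / BITS_PER_BYTE from the patch.
 */
static uint64_t ibt_bitmap_size(uint64_t task_size)
{
	return task_size / SKETCH_PAGE_SIZE / SKETCH_BITS_PER_BYTE;
}

/*
 * Hypothetical helper: byte offset within the bitmap that would hold
 * the bit for the page containing addr, assuming the bitmap is indexed
 * linearly by page number.
 */
static uint64_t ibt_bitmap_byte_offset(uint64_t addr)
{
	return addr / SKETCH_PAGE_SIZE / SKETCH_BITS_PER_BYTE;
}
```

With these definitions, ibt_bitmap_size(TASK_SIZE_X64) gives 0xffffffff and ibt_bitmap_size(TASK_SIZE_X32) gives 0x1ffff, matching the printed values. Under the same assumed layout, ibt_bitmap_byte_offset() for a high 64-bit address such as 0x00007fff00000000 is 0xfffe0000, far beyond a bitmap sized for a 32-bit task, which is the read-past-the-end concern raised above.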