Message-ID: <1531350028.15351.102.camel@intel.com>
Subject: Re: [RFC PATCH v2 22/27] x86/cet/ibt: User-mode indirect branch tracking support
From: Yu-cheng Yu
To: Dave Hansen, x86@kernel.org, "H. Peter Anvin", Thomas Gleixner,
    Ingo Molnar, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org,
    linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
    Balbir Singh, Cyrill Gorcunov, Florian Weimer, "H.J. Lu", Jann Horn,
    Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
    Pavel Machek, Peter Zijlstra, "Ravi V. Shankar", Vedvyas Shanbhogue
Date: Wed, 11 Jul 2018 16:00:28 -0700
References: <20180710222639.8241-1-yu-cheng.yu@intel.com>
    <20180710222639.8241-23-yu-cheng.yu@intel.com>
    <3a7e9ce4-03c6-cc28-017b-d00108459e94@linux.intel.com>
    <1531347019.15351.89.camel@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 2018-07-11 at 15:40 -0700, Dave Hansen wrote:
> On 07/11/2018 03:10 PM, Yu-cheng Yu wrote:
> > On Tue, 2018-07-10 at 17:11 -0700, Dave Hansen wrote:
> > > Is this feature *integral* to shadow stacks?  Or, should it just be
> > > in a different series?
> > The whole CET series is mostly about SHSTK and only a minority for
> > IBT.  IBT changes cannot be applied by itself without first applying
> > SHSTK changes.  Would the titles help, e.g. x86/cet/ibt,
> > x86/cet/shstk, etc.?
> That doesn't really answer what I asked, though.
>
> Do shadow stacks *require* IBT?  Or, should we concentrate on merging
> shadow stacks themselves first and then do IBT at a later time, in a
> different patch series?
>
> But, yes, better patch titles would help, although I'm not sure that's
> quite the format that Ingo and Thomas prefer.

Shadow stack does not require IBT, but they complement each other.  If
we can resolve the legacy bitmap, both features can be merged at the
same time.

> > > > +int cet_setup_ibt_bitmap(void)
> > > > +{
> > > > +	u64 r;
> > > > +	unsigned long bitmap;
> > > > +	unsigned long size;
> > > > +
> > > > +	if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > +		return -EOPNOTSUPP;
> > > > +
> > > > +	size = TASK_SIZE_MAX / PAGE_SIZE / BITS_PER_BYTE;
> > > Just a note: this table is going to be gigantic on 5-level paging
> > > systems, and userspace won't, by default use any of that extra
> > > address space.  I think it ends up being a 512GB allocation in a
> > > 128TB address space.
> > >
> > > Is that a problem?
> > >
> > > On 5-level paging systems, maybe we should just stick it up in the
> > > high part of the address space.
> > We do not know in advance if dlopen() needs to create the bitmap.  Do
> > we always reserve high address or force legacy libs to low address?
> Does it matter?  Does code ever get pointers to this area?  Might they
> be depending on high address bits for the IBT being clear?

GLIBC does the bitmap setup.  It sets bits in there.
I thought you wanted a smaller bitmap?  One way is forcing legacy libs
to low address, or not having the bitmap at all, i.e. turn IBT off.

> > > > +	bitmap = ibt_mmap(0, size);
> > > > +
> > > > +	if (bitmap >= TASK_SIZE_MAX)
> > > > +		return -ENOMEM;
> > > > +
> > > > +	bitmap &= PAGE_MASK;
> > > We're page-aligning the result of an mmap()?  Why?
> > This may not be necessary.  The lower bits of MSR_IA32_U_CET are
> > settings and not part of the bitmap address.  Is this is safer?
> No.  If we have mmap() returning non-page-aligned addresses, we have
> bigger problems.  Worst-case, do
>
> 	WARN_ON_ONCE(bitmap & ~PAGE_MASK);

Ok.

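To make the sizing and alignment points above concrete, here is a rough
userspace sketch of the one-bit-per-page arithmetic behind
size = TASK_SIZE_MAX / PAGE_SIZE / BITS_PER_BYTE, plus the kind of check
that WARN_ON_ONCE(bitmap & ~PAGE_MASK) would perform.  The SKETCH_*
constants and both function names are hypothetical, 4 KiB pages are
assumed, and the address-space sizes are rounded to powers of two; none
of this is taken from the patch.

```c
#include <stdint.h>

/* Assumed values for illustration only; not taken from the patch. */
#define SKETCH_PAGE_SHIFT    12u   /* 4 KiB pages */
#define SKETCH_BITS_PER_BYTE 8u

/* One bit per page: bytes of bitmap needed to cover task_size bytes. */
static uint64_t ibt_bitmap_bytes(uint64_t task_size)
{
	return (task_size >> SKETCH_PAGE_SHIFT) / SKETCH_BITS_PER_BYTE;
}

/* Nonzero if addr has any bits set below the page boundary. */
static int is_page_misaligned(uint64_t addr)
{
	uint64_t page_mask = ~((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1);

	return (addr & ~page_mask) != 0;
}
```

Under these assumptions, ibt_bitmap_bytes(1ULL << 47) is 4 GiB and
ibt_bitmap_bytes(1ULL << 56) is 2 TiB, so the bitmap grows by a factor
of 512 when the address space does.
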
> > > > +	current->thread.cet.ibt_bitmap_addr = bitmap;
> > > > +	current->thread.cet.ibt_bitmap_size = size;
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +void cet_disable_ibt(void)
> > > > +{
> > > > +	u64 r;
> > > > +
> > > > +	if (!cpu_feature_enabled(X86_FEATURE_IBT))
> > > > +		return;
> > > Does this need a check for being already disabled?
> > We need that.  We cannot write to those MSRs if the CPU does not
> > support it.
> No, I mean for code doing cet_disable_ibt() twice in a row.

Got it.

> > > > +	rdmsrl(MSR_IA32_U_CET, r);
> > > > +	r &= ~(MSR_IA32_CET_ENDBR_EN | MSR_IA32_CET_LEG_IW_EN |
> > > > +	       MSR_IA32_CET_NO_TRACK_EN);
> > > > +	wrmsrl(MSR_IA32_U_CET, r);
> > > > +	current->thread.cet.ibt_enabled = 0;
> > > > +}
> > > What's the locking for current->thread.cet?
> > Now CET is not locked until the application calls ARCH_CET_LOCK.
> No, I mean what is the in-kernel locking for the current->thread.cet
> data structure?  Is there none because it's only every modified via
> current->thread and it's entirely thread-local?

Yes, that is the case.
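
For the "twice in a row" case raised above, one possible shape is an
early return keyed on the per-thread ibt_enabled flag.  The sketch below
models that in plain userspace C: struct sketch_cet, the SKETCH_* bit
positions, and the write counter are all hypothetical stand-ins for
thread.cet, the MSR_IA32_CET_* flags, and wrmsrl(), chosen only so the
logic can be exercised outside the kernel.

```c
#include <stdint.h>

/* Placeholder bit positions, chosen for this sketch only. */
#define SKETCH_ENDBR_EN    (UINT64_C(1) << 2)
#define SKETCH_LEG_IW_EN   (UINT64_C(1) << 3)
#define SKETCH_NO_TRACK_EN (UINT64_C(1) << 4)
#define SKETCH_IBT_BITS \
	(SKETCH_ENDBR_EN | SKETCH_LEG_IW_EN | SKETCH_NO_TRACK_EN)

/* Hypothetical stand-in for the per-thread CET state and the MSR. */
struct sketch_cet {
	uint64_t msr_u_cet;   /* models MSR_IA32_U_CET */
	int ibt_enabled;      /* models thread.cet.ibt_enabled */
	unsigned msr_writes;  /* counts simulated wrmsrl() calls */
};

/* Disable IBT; a repeated call returns early without another write. */
static void sketch_disable_ibt(struct sketch_cet *cet)
{
	if (!cet->ibt_enabled)
		return;

	cet->msr_u_cet &= ~SKETCH_IBT_BITS;  /* clear only the IBT bits */
	cet->msr_writes++;
	cet->ibt_enabled = 0;
}
```

Because the state is only reached through the current thread, the
sketch takes no lock, matching the thread-local reasoning above.
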