Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp5456595imm; Tue, 12 Jun 2018 08:07:05 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJ3aattLKFq6MmODPLjfwMyjH6JOqgd5MQleF2UqCD8ys8+dJ6USPDP1GUMz50nYVIm5V9Y X-Received: by 2002:a65:5c89:: with SMTP id a9-v6mr634461pgt.51.1528816025585; Tue, 12 Jun 2018 08:07:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528816025; cv=none; d=google.com; s=arc-20160816; b=m0bOcORMvDcBMUn82NZu6qEkC5dp6Z3KUofbYoLOu+NzbE8y8JvfkRvguFdg6ZT2Lp hjCGi3bkIj3BBFqjDi616aHJ2do5Z2u0r1fIgSYwFf26XCZw0RHhTQJyJ0GailgT5XHe TE/vl4ph6pG8RnAf7dsEO2+UpKYQYzWfhW5Ju/H8aUpx7Nonl0/8VvkIIvsN5qxGC3nX R1sEVVYz6bG+mKgRCqZ2YjT1GpsjRx4C3VrgX3OjZ4ZSYf/STy47jzfjykE5Evc4rvZF ro98veZZnNJDyVKR2Pkn7mASG76XE/PYGAhmiR34q5tbb4Sp+xKbx2e9grFS+Xv7p+5x dPsA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:date:cc:to:from:subject:message-id :arc-authentication-results; bh=MAJKBFfRPpHqwATgwptgPSK3HbicNoVFbKNizoSiRfk=; b=wCI65DHdCy/KScOzOhss67aGFaCs0T9QYDC2saznfp9yIY/Pxav33/Vn1ciCwcxsvy z+J5rugKICyqyNUn6DQ4rA6Gwnwiw5jE43Obw+vdvSYasIKphyJjjSPKl5UcQcJ6ikL4 2sVDB1JD390BTLrfWOKamqlvwLOjpT8sizKtH/CQMUIN8HXC7CjwsBw4RsNmjz71KJmZ QSsgywov6BeDB0TwW4Gbw/eBt1gM8FtKWEiz84QuhKEBAo7A1DrULWNzLFa9O7W+Y5bo 5SSjtjy5Q27EMBedEb3p4+1sNOv/SQRWnWagbBkiCOhzMMYA7LqyDTSZlT6X5GJHWNiL z68g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z3-v6si281094plb.246.2018.06.12.08.06.51; Tue, 12 Jun 2018 08:07:05 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934347AbeFLPGN (ORCPT + 99 others); Tue, 12 Jun 2018 11:06:13 -0400 Received: from mga06.intel.com ([134.134.136.31]:10586 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932905AbeFLPGL (ORCPT ); Tue, 12 Jun 2018 11:06:11 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 12 Jun 2018 08:06:10 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,215,1526367600"; d="scan'208";a="58608801" Received: from 2b52.sc.intel.com (HELO [143.183.136.147]) ([143.183.136.147]) by orsmga003.jf.intel.com with ESMTP; 12 Jun 2018 08:06:09 -0700 Message-ID: <1528815781.8271.15.camel@2b52.sc.intel.com> Subject: Re: [PATCH 01/10] x86/cet: User-mode shadow stack support From: Yu-cheng Yu To: Balbir Singh Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-arch@vger.kernel.org, x86@kernel.org, "H. Peter Anvin" , Thomas Gleixner , Ingo Molnar , "H.J. Lu" , Vedvyas Shanbhogue , "Ravi V. Shankar" , Dave Hansen , Andy Lutomirski , Jonathan Corbet , Oleg Nesterov , Arnd Bergmann , Mike Kravetz Date: Tue, 12 Jun 2018 08:03:01 -0700 In-Reply-To: <0e80c181-83b2-457f-a419-01e79f94ca1c@gmail.com> References: <20180607143807.3611-1-yu-cheng.yu@intel.com> <20180607143807.3611-2-yu-cheng.yu@intel.com> <0e80c181-83b2-457f-a419-01e79f94ca1c@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.10.4-0ubuntu2 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, 2018-06-12 at 21:56 +1000, Balbir Singh wrote: > > On 08/06/18 00:37, Yu-cheng Yu wrote: > > This patch adds basic shadow stack enabling/disabling routines. > > A task's shadow stack is allocated from memory with VM_SHSTK > > flag set and read-only protection. The shadow stack is > > allocated to a fixed size and that can be changed by the system > > admin. > > > > I presume a read-only permission on the kernel side, but it > can be written by hardware? Yes, the shadow stack is written by the processor when a call instruction is executed. ... > > > > diff --git a/arch/x86/include/asm/cet.h b/arch/x86/include/asm/cet.h > > new file mode 100644 > > index 000000000000..9d5bc1efc9b7 > > --- /dev/null > > +++ b/arch/x86/include/asm/cet.h > > @@ -0,0 +1,32 @@ > > +/* SPDX-License-Identifier: GPL-2.0 */ > > +#ifndef _ASM_X86_CET_H > > +#define _ASM_X86_CET_H > > + > > +#ifndef __ASSEMBLY__ > > +#include > > + > > +struct task_struct; > > +/* > > + * Per-thread CET status > > + */ > > +struct cet_stat { > > stat sounds like statistics, just expand out to status please I will make it 'cet_status'. > > + unsigned long shstk_base; > > + unsigned long shstk_size; > > + unsigned int shstk_enabled:1; > > +}; > > + > > +#ifdef CONFIG_X86_INTEL_CET > > +unsigned long cet_get_shstk_ptr(void); > > For the current task? Why does _ptr routine return an unsigned long? What about cet_get_shstk_addr()? ... > > diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h > > index fda2114197b3..428d13828ba9 100644 > > --- a/arch/x86/include/asm/msr-index.h > > +++ b/arch/x86/include/asm/msr-index.h > > @@ -770,4 +770,18 @@ > > #define MSR_VM_IGNNE 0xc0010115 > > #define MSR_VM_HSAVE_PA 0xc0010117 > > > > +/* Control-flow Enforcement Technology MSRs */ > > +#define MSR_IA32_U_CET 0x6a0 > > +#define MSR_IA32_S_CET 0x6a2 > > +#define MSR_IA32_PL0_SSP 0x6a4 > > +#define MSR_IA32_PL3_SSP 0x6a7 > > +#define MSR_IA32_INT_SSP_TAB 0x6a8 > > some comments on the purpose of the MSR would be nice Sure. ... > > I think there was a comment about this being TASK_SIZE_MAX > > > + > > + rdmsrl(MSR_IA32_U_CET, r); > > + wrmsrl(MSR_IA32_U_CET, r | MSR_IA32_CET_SHSTK_EN); > > + wrmsrl(MSR_IA32_PL3_SSP, addr); > > Should the enable happen before setting addr? I would expect to do this in the opposite order. I will check. > > + return 0; > > +} > > + > > +unsigned long cet_get_shstk_ptr(void) > > +{ > > + unsigned long ptr; > > + > > + if (!current->thread.cet.shstk_enabled) > > + return 0; > > + > > + rdmsrl(MSR_IA32_PL3_SSP, ptr); > > + return ptr; > > +} > > + > > +static unsigned long shstk_mmap(unsigned long addr, unsigned long len) > > +{ > > + struct mm_struct *mm = current->mm; > > + unsigned long populate; > > + > > + down_write(&mm->mmap_sem); > > + addr = do_mmap(NULL, addr, len, PROT_READ, > > + MAP_ANONYMOUS | MAP_PRIVATE, VM_SHSTK, > > + 0, &populate, NULL); > > + up_write(&mm->mmap_sem); > > What happens if the mmap fails for any reason? I presume the caller disables shadow stack on this process? This is from exec(), and that fails. > > + > > + if (populate) > > + mm_populate(addr, populate); > > + > > + return addr; > > +} > > + > > +int cet_setup_shstk(void) > > +{ > > + unsigned long addr, size; > > + > > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) > > + return -EOPNOTSUPP; > > + > > + size = SHSTK_SIZE; > > + addr = shstk_mmap(0, size); > > + > > + if (addr >= TASK_SIZE) > > + return -ENOMEM; > > + > > TASK_SIZE_MAX? Yes. > > > + cet_set_shstk_ptr(addr + size - sizeof(void *)); > > + current->thread.cet.shstk_base = addr; > > + current->thread.cet.shstk_size = size; > > + current->thread.cet.shstk_enabled = 1; > > + return 0; > > +} > > + > > +void cet_disable_shstk(void) > > +{ > > + u64 r; > > + > > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) > > + return; > > + > > + rdmsrl(MSR_IA32_U_CET, r); > > + r &= ~(MSR_IA32_CET_SHSTK_EN); > > + wrmsrl(MSR_IA32_U_CET, r); > > + wrmsrl(MSR_IA32_PL3_SSP, 0); > > Again, I'd expect the order to be the reverse > > > + current->thread.cet.shstk_enabled = 0; > > +} > > + > > +void cet_disable_free_shstk(struct task_struct *tsk) > > +{ > > + if (!cpu_feature_enabled(X86_FEATURE_SHSTK) || > > + !tsk->thread.cet.shstk_enabled) > > + return; > > + > > + if (tsk == current) > > + cet_disable_shstk(); > > + > > + /* > > + * Free only when tsk is current or shares mm > > + * with current but has its own shstk. > > + */ > > + if (tsk->mm && (tsk->mm == current->mm) && > > + (tsk->thread.cet.shstk_base)) { > > Does the caller hold a reference to tsk->mm? If (tsk->mm == current->mm), i.e. it is current or it is a pthread of current, then yes. Yu-cheng