From: Mike Rapoport
To: linux-kernel@vger.kernel.org
Cc: Alexandre Chartre, Andy Lutomirski, Borislav Petkov, Dave Hansen,
    "H. Peter Anvin", Ingo Molnar, James Bottomley, Jonathan Adams,
    Kees Cook, Paul Turner, Peter Zijlstra, Thomas Gleixner,
    linux-mm@kvack.org, linux-security-module@vger.kernel.org,
    x86@kernel.org, Mike Rapoport
Subject: [RFC PATCH 4/7] x86/sci: hook up isolated system call entry and exit
Date: Fri, 26 Apr 2019 00:45:51 +0300
Message-Id: <1556228754-12996-5-git-send-email-rppt@linux.ibm.com>
In-Reply-To: <1556228754-12996-1-git-send-email-rppt@linux.ibm.com>
References: <1556228754-12996-1-git-send-email-rppt@linux.ibm.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: linux-kernel@vger.kernel.org

When a system call is required to run in an isolated context, CR3 is
switched to the SCI page table and a per-cpu variable holds an offset
from the original CR3. This offset is used to switch back to the full
kernel context when a trap occurs during an isolated system call.
Signed-off-by: Mike Rapoport
---
 arch/x86/entry/common.c      | 61 ++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/process_64.c |  5 ++++
 kernel/exit.c                |  3 +++
 3 files changed, 69 insertions(+)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 7bc105f..8f2a6fd 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -25,12 +25,14 @@
 #include
 #include
 #include
+#include
 #include

 #include
 #include
 #include
 #include
+#include

 #define CREATE_TRACE_POINTS
 #include
@@ -269,6 +271,50 @@ __visible inline void syscall_return_slowpath(struct pt_regs *regs)
 }

 #ifdef CONFIG_X86_64
+
+#ifdef CONFIG_SYSCALL_ISOLATION
+static inline bool sci_required(unsigned long nr)
+{
+	return false;
+}
+
+static inline unsigned long sci_syscall_enter(unsigned long nr)
+{
+	unsigned long sci_cr3, kernel_cr3;
+	unsigned long asid;
+
+	kernel_cr3 = __read_cr3();
+	asid = kernel_cr3 & ~PAGE_MASK;
+
+	sci_cr3 = build_cr3(current->sci->pgd, 0) & PAGE_MASK;
+	sci_cr3 |= (asid | (1 << X86_CR3_SCI_PCID_BIT));
+
+	current->in_isolated_syscall = 1;
+	current->sci->cr3_offset = kernel_cr3 - sci_cr3;
+
+	this_cpu_write(cpu_sci.sci_syscall, 1);
+	this_cpu_write(cpu_sci.sci_cr3_offset, current->sci->cr3_offset);
+
+	write_cr3(sci_cr3);
+
+	return kernel_cr3;
+}
+
+static inline void sci_syscall_exit(unsigned long cr3)
+{
+	if (cr3) {
+		write_cr3(cr3);
+		current->in_isolated_syscall = 0;
+		this_cpu_write(cpu_sci.sci_syscall, 0);
+		sci_clear_data();
+	}
+}
+#else
+static inline bool sci_required(unsigned long nr) { return false; }
+static inline unsigned long sci_syscall_enter(unsigned long nr) { return 0; }
+static inline void sci_syscall_exit(unsigned long cr3) {}
+#endif
+
 __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
 {
 	struct thread_info *ti;
@@ -286,10 +332,25 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
 	 */
 	nr &= __SYSCALL_MASK;
 	if (likely(nr < NR_syscalls)) {
+		unsigned long sci_cr3 = 0;
+
 		nr = array_index_nospec(nr, NR_syscalls);
+
+		if (sci_required(nr)) {
+			int err = sci_init(current);
+
+			if (err) {
+				regs->ax = err;
+				goto err_return_from_syscall;
+			}
+			sci_cr3 = sci_syscall_enter(nr);
+		}
+
 		regs->ax = sys_call_table[nr](regs);
+		sci_syscall_exit(sci_cr3);
 	}

+err_return_from_syscall:
 	syscall_return_slowpath(regs);
 }
 #endif
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6a62f4a..b8aa624 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -55,6 +55,8 @@
 #include
 #include
 #include
+#include
+
 #ifdef CONFIG_IA32_EMULATION
 /* Not included via unistd.h */
 #include
@@ -581,6 +583,9 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	switch_to_extra(prev_p, next_p);

+	/* update syscall isolation per-cpu data */
+	sci_switch_to(next_p);
+
 #ifdef CONFIG_XEN_PV
 	/*
 	 * On Xen PV, IOPL bits in pt_regs->flags have no effect, and
diff --git a/kernel/exit.c b/kernel/exit.c
index 2639a30..8e81353 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -62,6 +62,7 @@
 #include
 #include
 #include
+#include

 #include
 #include
@@ -859,6 +860,8 @@ void __noreturn do_exit(long code)
 	tsk->exit_code = code;
 	taskstats_exit(tsk, group_dead);

+	sci_exit(tsk);
+
 	exit_mm();

 	if (group_dead)
--
2.7.4
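
For readers following the control flow in do_syscall_64(), here is a small user-space model of the enter/exit pairing. It is a sketch only, with assumptions throughout: struct model_cpu and every model_* function are hypothetical stand-ins, and nothing here touches real CR3 or per-cpu data. It illustrates the convention the patch relies on: enter returns the value to restore, the caller initialises that token to 0, and exit does nothing when the token is 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the CPU and task state the patch touches. */
struct model_cpu {
	unsigned long cr3;		/* currently loaded page-table root */
	bool in_isolated_syscall;	/* mirrors current->in_isolated_syscall */
};

/* Switch to the isolated root and hand back the root to restore later. */
static inline unsigned long model_enter(struct model_cpu *cpu, unsigned long sci_cr3)
{
	unsigned long kernel_cr3 = cpu->cr3;

	/* Zero is reserved as the "not isolated" token. */
	assert(kernel_cr3 != 0);
	cpu->in_isolated_syscall = true;
	cpu->cr3 = sci_cr3;
	return kernel_cr3;
}

/* A zero token means enter was never called, so there is nothing to undo. */
static inline void model_exit(struct model_cpu *cpu, unsigned long token)
{
	if (token) {
		cpu->cr3 = token;
		cpu->in_isolated_syscall = false;
	}
}

/* Example handler: reports which root it ran under. */
static inline unsigned long model_handler(const struct model_cpu *cpu)
{
	return cpu->cr3;
}

/* Dispatch in the same order as the patched do_syscall_64(). */
static inline unsigned long model_dispatch(struct model_cpu *cpu, bool isolate,
					   unsigned long sci_cr3)
{
	unsigned long token = 0;
	unsigned long seen;

	if (isolate)
		token = model_enter(cpu, sci_cr3);
	seen = model_handler(cpu);
	model_exit(cpu, token);
	return seen;
}

/* True if the handler saw the expected root and the state was restored. */
static inline bool model_check(unsigned long kernel_cr3, unsigned long sci_cr3,
			       bool isolate)
{
	struct model_cpu cpu = { .cr3 = kernel_cr3, .in_isolated_syscall = false };
	unsigned long seen = model_dispatch(&cpu, isolate, sci_cr3);

	return seen == (isolate ? sci_cr3 : kernel_cr3) &&
	       cpu.cr3 == kernel_cr3 && !cpu.in_isolated_syscall;
}
```

Calling model_check(0x1000, 0x2000, true) exercises the isolated path and model_check(0x1000, 0x2000, false) the ordinary one; in both cases the model ends on the kernel root with the flag cleared.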