Date: Fri, 26 Apr 2019 10:31:44 +0200
From: Ingo Molnar
To: Mike Rapoport
Cc: linux-kernel@vger.kernel.org, Alexandre Chartre, Andy Lutomirski, Borislav Petkov, Dave Hansen, "H. Peter Anvin", Ingo Molnar, James Bottomley, Jonathan Adams, Kees Cook, Paul Turner, Peter Zijlstra, Thomas Gleixner, linux-mm@kvack.org, linux-security-module@vger.kernel.org, x86@kernel.org, Linus Torvalds, Peter Zijlstra, Andrew Morton
Subject: Re: [RFC PATCH 2/7] x86/sci: add core implementation for system call isolation
Message-ID: <20190426083144.GA126896@gmail.com>
References: <1556228754-12996-1-git-send-email-rppt@linux.ibm.com> <1556228754-12996-3-git-send-email-rppt@linux.ibm.com>
In-Reply-To: <1556228754-12996-3-git-send-email-rppt@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Mike Rapoport wrote:

> When enabled, the system call isolation (SCI) would allow execution of
> the system calls with reduced page tables.
> These page tables are almost
> identical to the user page tables in PTI. The only addition is the code
> page containing system call entry function that will continue
> execution after the context switch.
>
> Unlike PTI page tables, there is no sharing at higher levels and all
> the hierarchy for SCI page tables is cloned.
>
> The SCI page tables are created when a system call that requires
> isolation is executed for the first time.
>
> Whenever a system call should be executed in the isolated environment,
> the context is switched to the SCI page tables. Any further access to
> the kernel memory will generate a page fault. The page fault handler
> can verify that the access is safe and grant it or kill the task
> otherwise.
>
> The initial SCI implementation allows access to any kernel data, but it
> limits access to the code in the following way:
> * calls and jumps to known code symbols without offset are allowed
> * calls and jumps into a known symbol with offset are allowed only if that
>   symbol was already accessed and the offset is in the next page
> * all other code accesses are blocked
>
> After the isolated system call finishes, the mappings created during its
> execution are cleared.
>
> The entire SCI page table is lazily freed at task exit() time.

So this basically uses a mechanism similar to PTI, with the same horrendous CR3 switching overhead whenever a syscall seeks "protection", an overhead that is only somewhat mitigated by PCID.

This might work on PTI-encumbered CPUs.

AMD CPUs, however, don't need PTI, nor do they have PCID.

So this feature hurts the CPU maker who didn't mess up, and it hurts future CPUs that don't need PTI.

I really don't like where this is going. In a couple of years I really want to be able to think of PTI as a bad dream that is, fortunately, mostly over.
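As a reading aid, the three code-access rules quoted above can be sketched roughly as follows. This is not code from the patch: `struct sci_symbol`, `sci_lookup()`, `sci_code_access_allowed()` and the 4 KiB page size are hypothetical names and assumptions, and "the offset is in the next page" is read here as "the target lies in the page right after the one holding the symbol's start", which is only one possible interpretation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCI_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical record for one known code symbol (not from the patch). */
struct sci_symbol {
	uintptr_t start;	/* address of the symbol's first byte */
	size_t size;		/* length of the symbol in bytes */
	bool accessed;		/* already entered during this isolated syscall */
};

static uintptr_t sci_page_of(uintptr_t addr)
{
	return addr >> SCI_PAGE_SHIFT;
}

/* Return the symbol containing addr, or NULL if addr is in no known symbol. */
static struct sci_symbol *sci_lookup(struct sci_symbol *syms, size_t n,
				     uintptr_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= syms[i].start && addr - syms[i].start < syms[i].size)
			return &syms[i];
	}
	return NULL;
}

/* Decide whether a call or jump to target would be granted. */
static bool sci_code_access_allowed(struct sci_symbol *syms, size_t n,
				    uintptr_t target)
{
	struct sci_symbol *sym = sci_lookup(syms, n, target);

	if (!sym)
		return false;		/* rule 3: everything else is blocked */

	if (target == sym->start) {
		sym->accessed = true;	/* rule 1: symbol without offset */
		return true;
	}

	/* rule 2: symbol with offset, already accessed, target in next page */
	return sym->accessed &&
	       sci_page_of(target) == sci_page_of(sym->start) + 1;
}
```

In this sketch a `false` return corresponds to the "kill the task" outcome of the page fault handler described in the quoted text, and a `true` return corresponds to granting the access.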
I have the feeling that compiler-level protection that avoids corrupting the stack in the first place is going to have lower overhead, and would work in a much broader range of environments. Do we have an analysis of what the compiler would have to do to prevent most ROP attacks, and what the runtime cost of that is?

I mean, C# and Java programs aren't able to corrupt the stack as long as the language runtime is correct. It has to be possible, right?

Thanks,

	Ingo