Date: Fri, 12 Jul 2019 12:44:02 +0200 (CEST)
From: Thomas Gleixner
To: Dave Hansen
cc: Alexandre Chartre, pbonzini@redhat.com, rkrcmar@redhat.com,
    mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
    dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
    kvm@vger.kernel.org, x86@kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, konrad.wilk@oracle.com,
    jan.setjeeilers@oracle.com, liran.alon@oracle.com, jwadams@google.com,
    graf@amazon.de, rppt@linux.vnet.ibm.com
Subject: Re: [RFC v2 00/27] Kernel Address Space Isolation
In-Reply-To: <5cab2a0e-1034-8748-fcbe-a17cf4fa2cd4@intel.com>
References: <1562855138-19507-1-git-send-email-alexandre.chartre@oracle.com>
    <5cab2a0e-1034-8748-fcbe-a17cf4fa2cd4@intel.com>

On Thu, 11 Jul 2019, Dave Hansen wrote:

> On 7/11/19 7:25 AM, Alexandre Chartre wrote:
> > - Kernel code mapped to the ASI page-table has been reduced to:
> >   . the entire kernel (I still need to test with only the kernel text)
> >   . the cpu entry area (because we need the GDT to be mapped)
> >   . the cpu ASI session (for managing ASI)
> >   . the current stack
> >
> > - Optionally, an ASI can request the following kernel mapping to be added:
> >   . the stack canary
> >   . the cpu offsets (this_cpu_off)
> >   . the current task
> >   . RCU data (rcu_data)
> >   . CPU HW events (cpu_hw_events).
>
> I don't see the per-cpu areas in here.  But, the ASI macros in
> entry_64.S (and asi_start_abort()) use per-cpu data.
>
> Also, this stuff seems to do naughty stuff (calling C code, touching
> per-cpu data) before the PTI CR3 writes have been done.  But, I don't
> see anything excluding PTI and this code from coexisting.

That ASI thing is just PTI on steroids.

So why do we need two versions of the same thing? That's absolutely bonkers
and will just introduce subtle bugs and conflicting decisions all over the
place.

The need for ASI is very tightly coupled to the need for PTI and there is
absolutely no point in keeping them separate.

The only difference vs. interrupts and exceptions is that the PTI logic
cares whether they enter from user or from kernel space while ASI only
cares about the kernel entry.

But most exception/interrupt transitions do not need to be handled at the
entry code level because on VMEXIT the exit reason clearly tells whether a
switch to the kernel CR3 is necessary or not. So this has to be handled at
the VMM level already in a very clean and simple way.

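To make the exit-reason argument concrete, here is a toy model in Python.
It is not kernel code. The exit reason names, the HANDLED_ON_RESTRICTED set
and all class and function names are made up for illustration; which exits
can really be serviced without the full kernel mapping is an assumption
here and would have to be worked out from the VMM code.

```python
from enum import Enum, auto


class ExitReason(Enum):
    # Illustrative names only; not the hardware encoding of VM-exit reasons.
    CPUID = auto()
    HLT = auto()
    EXTERNAL_INTERRUPT = auto()
    NMI = auto()
    MACHINE_CHECK = auto()


# Assumption for this sketch: these exits can be serviced while still
# running on the restricted page table. The real split is not known here.
HANDLED_ON_RESTRICTED = {ExitReason.CPUID, ExitReason.HLT}


class Cpu:
    """Tracks which page table the modelled CPU is currently using."""

    def __init__(self):
        self.cr3 = "restricted"
        self.switches = 0  # number of switches to the full kernel table

    def switch_to_kernel_cr3(self):
        if self.cr3 != "kernel":
            self.cr3 = "kernel"
            self.switches += 1

    def switch_to_restricted_cr3(self):
        self.cr3 = "restricted"


def handle_vmexit(cpu, reason):
    """Decide in the VMM, from the exit reason alone, whether to switch."""
    if reason not in HANDLED_ON_RESTRICTED:
        cpu.switch_to_kernel_cr3()
    return cpu.cr3


def run_guest(cpu, exits):
    """Replay a sequence of exit reasons; return the CR3 used for each."""
    used = []
    for reason in exits:
        # Models VMENTER: the guest always runs on the restricted table.
        cpu.switch_to_restricted_cr3()
        used.append(handle_vmexit(cpu, reason))
    return used
```

The only place that ever calls switch_to_kernel_cr3() in this model is
handle_vmexit(), which is the point being made: the decision sits in one
spot in the VMM path and nothing in the generic entry code has to know
about it.
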
I'm not a virt wizard, but according to code inspection and instrumentation
even the NMI on the host is actually reinjected manually into the host via
'int $2' after the VMEXIT and for MCE it looks like manual handling as
well.

So why do we need to sprinkle that muck all over the entry code?

From a semantic perspective VMENTER/VMEXIT are very similar to the return
to user / enter from user mechanics. Just that the transition happens in
the VMM code and not at the regular user/kernel transition points.

So why do you want to treat that differently? There is absolutely zero
reason to do so. And there is no reason to create a pointlessly different
version of PTI which introduces yet another variant of a restricted page
table instead of just reusing and extending what's there already.

Thanks,

	tglx