From: Andy Lutomirski
Date: Fri, 19 Jan 2018 08:30:33 -0800
Subject: Re: [PATCH 02/16] x86/entry/32: Enter the kernel via trampoline stack
To: Joerg Roedel
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, "H . Peter Anvin",
    X86 ML, LKML, Linux-MM, Linus Torvalds, Dave Hansen, Josh Poimboeuf,
    Juergen Gross, Peter Zijlstra, Borislav Petkov, Jiri Kosina,
    Boris Ostrovsky, Brian Gerst, David Laight, Denys Vlasenko,
    Eduardo Valentin, Greg KH, Will Deacon, "Liguori, Anthony",
    Daniel Gruss, Hugh Dickins, Kees Cook, Andrea Arcangeli, Waiman Long,
    Joerg Roedel

On Fri, Jan 19, 2018 at 1:55 AM, Joerg Roedel wrote:
> Hey Andy,
>
> On Wed, Jan 17, 2018 at 10:10:23AM -0800, Andy Lutomirski wrote:
>> On Wed, Jan 17, 2018 at 1:18 AM, Joerg Roedel wrote:
>
>> > Just read up on vm86 mode control transfers and the stack layout then.
>> > Looks like I need to check for eflags.vm=1 and copy four more registers
>> > from/to the entry stack. Thanks for pointing that out.
>>
>> You could just copy those slots unconditionally. After all, you're
>> slowing down entries by an epic amount due to writing CR3 on with PCID
>> off, so four words copied should be entirely lost in the noise. OTOH,
>> checking for VM86 mode is just a single bt against EFLAGS.
>>
>> With the modern (rewritten a year or two ago by Brian Gerst) vm86
>> code, all the slots (those actually in pt_regs) are in the same
>> location regardless of whether we're in VM86 mode or not, but we're
>> still fiddling with the bottom of the stack.
>> Since you're controlling
>> the switch to the kernel thread stack, you can easily just write the
>> frame to the correct location, so you should not need to context
>> switch sp1 -- you can do it sanely and leave sp1 as the actual bottom
>> of the kernel stack no matter what. In fact, you could probably avoid
>> context switching sp0, either, which would be a nice cleanup.
>
> I am not sure what you mean by "not context switching sp0/sp1" ...

You're supposed to read what I meant, not what I said...

I meant that we could have sp0 have a genuinely constant value per
CPU. That means that the entry trampoline ends up with RIP, etc. in a
different place depending on whether VM was in use, but the entry
trampoline code should be able to handle that.

sp1 would have a value that varies by task, but it could just point to
the top of the stack instead of being changed depending on whether VM
is in use. Instead, the entry trampoline would offset the registers as
needed to keep pt_regs in the right place.

I think you already figured all of that out, though :)
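
The frame-offset idea described above can be sketched as a small
user-space C model. This is an illustration only, not the kernel's
entry code: hw_push_frame(), copy_to_task_stack() and the constants are
hypothetical names. It assumes the 32-bit hardware frame layout (eip,
cs, eflags, esp, ss, plus es, ds, fs, gs when entering from VM86 mode)
and that EFLAGS.VM is bit 17.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EFLAGS_VM (UINT32_C(1) << 17) /* EFLAGS.VM is bit 17 */

enum {
    IRET_WORDS = 5,     /* eip, cs, eflags, esp, ss */
    VM86_SEG_WORDS = 4, /* es, ds, fs, gs: pushed only on entry from VM86 */
    FULL_WORDS = IRET_WORDS + VM86_SEG_WORDS
};

/*
 * Model of the hardware push onto the entry stack. sp0 is the constant
 * per-CPU top of that stack; frame[] is laid out lowest address first
 * (eip, cs, eflags, esp, ss, then es, ds, fs, gs for VM86).
 * Returns the resulting stack pointer, which points at the saved eip.
 */
static uint32_t *hw_push_frame(uint32_t *sp0, const uint32_t *frame)
{
    size_t words = (frame[2] & EFLAGS_VM) ? FULL_WORDS : IRET_WORDS;
    uint32_t *sp = sp0 - words;

    memcpy(sp, frame, words * sizeof(*sp));
    return sp;
}

/*
 * Hypothetical trampoline step: copy the frame from the entry stack to
 * the task stack. The destination is always task_top - FULL_WORDS, so
 * the saved eip lands at the same offset whether or not the VM86
 * segment slots are present.
 */
static uint32_t *copy_to_task_stack(const uint32_t *entry_sp,
                                    uint32_t *task_top)
{
    assert(entry_sp != NULL && task_top != NULL);

    /* eflags sits two words above eip in the hardware frame */
    size_t words = (entry_sp[2] & EFLAGS_VM) ? FULL_WORDS : IRET_WORDS;
    uint32_t *dst = task_top - FULL_WORDS;

    memcpy(dst, entry_sp, words * sizeof(*dst));
    return dst;
}
```

With a constant sp0, hw_push_frame() returns a different pointer for
the two cases (sp0 minus 5 words versus sp0 minus 9 words), while
copy_to_task_stack() returns the same destination for both. That fixed
destination is the property that keeps pt_regs in one place.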