From: Andy Lutomirski
Date: Fri, 20 Jul 2018 10:09:26 -0700
Subject: Re: [PATCH 3/3] x86/entry/32: Copy only ptregs on paranoid entry/exit path
To: Joerg Roedel
Cc: Thomas Gleixner, Ingo Molnar, "H . Peter Anvin", X86 ML, LKML,
    Linux-MM, Linus Torvalds, Andy Lutomirski, Dave Hansen,
    Josh Poimboeuf, Juergen Gross, Peter Zijlstra, Borislav Petkov,
    Jiri Kosina, Boris Ostrovsky, Brian Gerst, David Laight,
    Denys Vlasenko, Eduardo Valentin, Greg KH, Will Deacon,
    "Liguori, Anthony", Daniel Gruss, Hugh Dickins, Kees Cook,
    Andrea Arcangeli, Waiman Long, Pavel Machek, "David H . Gutteridge",
    Joerg Roedel, Arnaldo Carvalho de Melo, Alexander Shishkin,
    Jiri Olsa, Namhyung Kim
In-Reply-To: <1532103744-31902-4-git-send-email-joro@8bytes.org>
References: <1532103744-31902-1-git-send-email-joro@8bytes.org>
    <1532103744-31902-4-git-send-email-joro@8bytes.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 20, 2018 at 9:22 AM, Joerg Roedel wrote:
> From: Joerg Roedel
>
> The code that switches from entry- to task-stack when we
> enter from kernel-mode copies the full entry-stack contents
> to the task-stack.
>
> That is because we don't trust that the entry-stack
> contents. But actually we can trust its contents if we are
> not scheduled between entry and exit.
>
> So do less copying and move only the ptregs over to the
> task-stack in this code-path.
>
> Suggested-by: Andy Lutomirski
> Signed-off-by: Joerg Roedel
> ---
>  arch/x86/entry/entry_32.S | 70 +++++++++++++++++++++++++----------------------
>  1 file changed, 38 insertions(+), 32 deletions(-)
>
> diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
> index 2767c62..90166b2 100644
> --- a/arch/x86/entry/entry_32.S
> +++ b/arch/x86/entry/entry_32.S
> @@ -469,33 +469,48 @@
>   * segment registers on the way back to user-space or when the
>   * sysenter handler runs with eflags.tf set.
>   *
> - * When we switch to the task-stack here, we can't trust the
> - * contents of the entry-stack anymore, as the exception handler
> - * might be scheduled out or moved to another CPU. Therefore we
> - * copy the complete entry-stack to the task-stack and set a
> - * marker in the iret-frame (bit 31 of the CS dword) to detect
> - * what we've done on the iret path.
> + * When we switch to the task-stack here, we extend the
> + * stack-frame we copy to include the entry-stack %esp and a
> + * pseudo %ss value so that we have a full ptregs struct on the
> + * stack. We set a marker in the frame (bit 31 of the CS dword).
>   *
> - * On the iret path we copy everything back and switch to the
> - * entry-stack, so that the interrupted kernel code-path
> - * continues on the same stack it was interrupted with.
> + * On the iret path we read %esp from the PT_OLDESP slot on the
> + * stack and copy ptregs (except oldesp and oldss) to it, when
> + * we find the marker set. Then we switch to the %esp we read,
> + * so that the interrupted kernel code-path continues on the
> + * same stack it was interrupted with.

Can you give an example of the exact scenario in which any of this
copying happens and why it's needed?  IMO you should just be able to
*run* on the entry stack without copying anything at all.
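
For readers following the quoted comment, the "marker in the frame (bit 31 of the CS dword)" can be pictured with a small C sketch. This is only an illustration, not the code from the patch: the real logic lives in assembly in entry_32.S, and the names CS_MARKER_BIT, cs_mark(), cs_is_marked(), cs_unmark() and cs_selector() are made up here. The only facts it relies on are that a segment selector is 16 bits wide and that the saved CS occupies a 32-bit slot, so the upper bits are free to carry a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the quoted comment only says "bit 31 of the CS dword". */
#define CS_MARKER_BIT    (UINT32_C(1) << 31)

/* A segment selector occupies the low 16 bits of the saved 32-bit CS slot. */
#define CS_SELECTOR_MASK UINT32_C(0xffff)

/* Entry path: remember that the frame was moved off the entry-stack. */
static inline uint32_t cs_mark(uint32_t cs_dword)
{
	return cs_dword | CS_MARKER_BIT;
}

/* Exit path: was the marker set on entry? */
static inline bool cs_is_marked(uint32_t cs_dword)
{
	return (cs_dword & CS_MARKER_BIT) != 0;
}

/* Exit path: drop the marker again before the frame is used for iret. */
static inline uint32_t cs_unmark(uint32_t cs_dword)
{
	return cs_dword & ~CS_MARKER_BIT;
}

/* The selector itself is unaffected by the marker. */
static inline uint16_t cs_selector(uint32_t cs_dword)
{
	return (uint16_t)(cs_dword & CS_SELECTOR_MASK);
}

/* Round-trip check with an arbitrary example selector value. */
static void cs_marker_selfcheck(void)
{
	const uint32_t saved_cs = 0x60;
	const uint32_t marked = cs_mark(saved_cs);

	assert(cs_is_marked(marked));
	assert(!cs_is_marked(saved_cs));
	assert(cs_selector(marked) == saved_cs);
	assert(cs_unmark(marked) == saved_cs);
}
```

The point of keeping the flag in the otherwise unused upper half of the CS slot is that the exit path can tell from the frame alone whether it has to switch stacks back, without any per-CPU state.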