Date: Mon, 5 Mar 2018 19:25:24 +0100
From: Joerg Roedel
To: Brian Gerst
Cc: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
	the arch/x86 maintainers, Linux Kernel Mailing List, Linux-MM,
	Linus Torvalds, Andy Lutomirski, Dave Hansen, Josh Poimboeuf,
	Juergen Gross, Peter Zijlstra, Borislav Petkov, Jiri Kosina,
	Boris Ostrovsky, David Laight, Denys Vlasenko, Eduardo Valentin,
	Greg KH, Will Deacon, "Liguori, Anthony", Daniel Gruss,
	Hugh Dickins, Kees Cook, Andrea Arcangeli, Waiman Long,
	Pavel Machek, Joerg Roedel
Subject: Re: [PATCH 11/34] x86/entry/32: Handle Entry from Kernel-Mode on Entry-Stack
Message-ID: <20180305182524.GT16484@8bytes.org>
References: <1520245563-8444-1-git-send-email-joro@8bytes.org>
	<1520245563-8444-12-git-send-email-joro@8bytes.org>

Hi Brian,

thanks for your review and helpful input.

On Mon, Mar 05, 2018 at 11:41:01AM -0500, Brian Gerst wrote:
> On Mon, Mar 5, 2018 at 5:25 AM, Joerg Roedel wrote:
> > +.Lentry_from_kernel_\@:
> > +
> > +	/*
> > +	 * This handles the case when we enter the kernel from
> > +	 * kernel-mode and %esp points to the entry-stack. When this
> > +	 * happens we need to switch to the task-stack to run C code,
> > +	 * but switch back to the entry-stack again when we approach
> > +	 * iret and return to the interrupted code-path. This usually
> > +	 * happens when we hit an exception while restoring user-space
> > +	 * segment registers on the way back to user-space.
> > +	 *
> > +	 * When we switch to the task-stack here, we can't trust the
> > +	 * contents of the entry-stack anymore, as the exception handler
> > +	 * might be scheduled out or moved to another CPU. Therefore we
> > +	 * copy the complete entry-stack to the task-stack and set a
> > +	 * marker in the iret-frame (bit 31 of the CS dword) to detect
> > +	 * what we've done on the iret path.
>
> We don't need to worry about preemption changing the entry stack. The
> faults that IRET or segment loads can generate just run the exception
> fixup handler and return. Interrupts were disabled when the fault
> occurred, so the kernel cannot be preempted. The other case to watch
> is #DB on SYSENTER, but that simply returns and doesn't sleep either.
>
> We can keep the same process as the existing debug/NMI handlers -
> leave the current exception pt_regs on the entry stack and just switch
> to the task stack for the call to the handler. Then switch back to
> the entry stack and continue. No copying needed.

Okay, I'll look into that. Will it even be true for fully preemptible
and RT kernels that there can't be any preemption of these handlers?

> > +	/* Mark stackframe as coming from entry stack */
> > +	orl	$CS_FROM_ENTRY_STACK, PT_CS(%esp)
>
> Not all 32-bit processors will zero-extend segment pushes. You will
> need to explicitly clear the bit in the case where we didn't switch
> CR3.

Okay, thanks, will add that.

Regards,

	Joerg
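
To make the zero-extension point concrete, here is a rough C model of the marker handling. It is only a sketch and not the entry code from the patch: the struct layout, the helper names and the example values are made up for illustration, and the value of CS_FROM_ENTRY_STACK is assumed from the "bit 31 of the CS dword" wording in the quoted comment. It shows why a path that does not set the marker has to clear it explicitly when the upper 16 bits of the pushed CS slot may hold stale data.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: marker is bit 31 of the CS dword, as in the quoted comment. */
#define CS_FROM_ENTRY_STACK (1u << 31)

/* Hypothetical, simplified stand-in for the saved iret frame. */
struct iret_frame_sketch {
	uint32_t ip;
	uint32_t cs;	/* selector in bits 0-15, bits 16-31 may be stale */
	uint32_t flags;
};

/* Path that switched away from the entry stack: set the marker. */
static void mark_from_entry_stack(struct iret_frame_sketch *f)
{
	f->cs |= CS_FROM_ENTRY_STACK;
}

/* Path that did not set the marker: clear it explicitly. */
static void clear_entry_stack_marker(struct iret_frame_sketch *f)
{
	f->cs &= ~CS_FROM_ENTRY_STACK;
}

/* What the iret path would test to decide whether to switch back. */
static int came_from_entry_stack(const struct iret_frame_sketch *f)
{
	return (f->cs & CS_FROM_ENTRY_STACK) != 0;
}

/* The real selector only lives in the low 16 bits. */
static uint16_t cs_selector(const struct iret_frame_sketch *f)
{
	return (uint16_t)(f->cs & 0xffffu);
}

/* Walks through the stale-upper-bits case; returns 0 when all checks hold. */
static int entry_stack_marker_demo(void)
{
	/* Example value: selector 0x60 with garbage in the upper half. */
	struct iret_frame_sketch f = { 0, 0xffff0060u, 0 };

	/* Without the explicit clear, the stale bit looks like a marker. */
	assert(came_from_entry_stack(&f));

	clear_entry_stack_marker(&f);
	assert(!came_from_entry_stack(&f));
	assert(cs_selector(&f) == 0x60);

	return 0;
}
```

Calling entry_stack_marker_demo() runs the scenario: the frame starts out with a false positive, and only after the explicit clear does the check on the iret path give the right answer. The selector in the low 16 bits is untouched either way.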