References: <20191219115812.102620-1-brgerst@gmail.com> <431a146f6461402da61d09fff155f35b@AcuMS.aculab.com>
In-Reply-To: <431a146f6461402da61d09fff155f35b@AcuMS.aculab.com>
From: Brian Gerst
Date: Fri, 20 Dec 2019 07:18:02 -0500
Subject: Re: [PATCH] x86: Remove force_iret()
To: David Laight
Cc: Andy Lutomirski, X86 ML, LKML, Ingo Molnar, "H. Peter Anvin", Boris Ostrovsky, Oleg Nesterov
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 20, 2019 at 5:10 AM David Laight wrote:
>
> From: Brian Gerst
> > Sent: 20 December 2019 03:48
> > On Thu, Dec 19, 2019 at 8:50 PM Andy Lutomirski wrote:
> > >
> > > On Thu, Dec 19, 2019 at 3:58 AM Brian Gerst wrote:
> > > >
> > > > force_iret() was originally intended to prevent the return to user mode with
> > > > the SYSRET or SYSEXIT instructions, in cases where the register state could
> > > > have been changed to be incompatible with those instructions.
> > >
> > > It's more than that. Before the big syscall rework, we didn't restore
> > > the caller-saved regs. See:
> > >
> > > commit 21d375b6b34ff511a507de27bf316b3dde6938d9
> > > Author: Andy Lutomirski
> > > Date: Sun Jan 28 10:38:49 2018 -0800
> > >
> > >     x86/entry/64: Remove the SYSCALL64 fast path
> > >
> > > So if you changed r12, for example, the change would get lost.
> >
> > force_iret() specifically dealt with changes to CS, SS and EFLAGS.
> > Saving and restoring the extra registers was a different problem
> > although it affected the same functions like ptrace, signals, and
> > exec.
>
> Is it ever possible for any of the segment registers to refer to the LDT
> and for another thread to invalidate the entries 'very late'?
> So even though the values were valid when changed, they are
> invalid during the 'return to user' sequence.

Not in the SYSRET case, where the kernel requires that CS and SS are
static segments in the GDT. Any userspace context that uses LDT
segments for CS/SS must return with IRET. There is fault handling for
IRET (fixup_bad_iret()) for this case.

> I remember writing a signal handler that 'corrupted' all the
> segment registers (etc) and fixing the NetBSD kernel to handle
> all the faults restoring the segment registers and IRET faulting
> in kernel (IIRC invalid user %SS or %CS).
> (IRET can also fault in user space, but that is a normal fault.)
>
> Is it actually cheaper to properly validate the segment registers,
> or take the 'hit' of the slightly slower IRET path and get the cpu
> to do it for you?

SYSRET is faster because it avoids segment table lookups and
permission checks for CS and SS. It simply sets the selectors to
values set in an MSR and the attributes (base, limit, etc.) to fixed
values. It is up to the OS to make sure the actual segment
descriptors in memory match those default attributes.

--
Brian Gerst
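
For illustration, the rule described above (SYSRET only when CS and SS
are the fixed GDT selectors, IRET otherwise) can be sketched in C. This
is a simplified model and not the kernel's code: struct frame,
can_use_sysret() and is_canonical() are hypothetical names, the
selector values are assumed to be the usual 64-bit Linux user
selectors, and the RCX/R11 checks come from how the SYSRET instruction
reloads RIP and RFLAGS rather than from anything stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified saved user state; not the kernel's struct pt_regs. */
struct frame {
    uint64_t ip, cs, flags, sp, ss, cx, r11;
};

/* Assumed values of the static user selectors in the GDT (64-bit Linux). */
#define USER_CS  0x33
#define USER_SS  0x2b
#define FLAGS_TF (1ULL << 8)   /* trap flag */
#define FLAGS_RF (1ULL << 16)  /* resume flag */

/* True if addr is canonical for 48-bit virtual addresses:
 * bits 63..47 must be all zeros or all ones. */
static bool is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;
    return top == 0 || top == 0x1ffff;
}

/* Decide whether the fast SYSRET return is safe; otherwise use IRET. */
static bool can_use_sysret(const struct frame *f)
{
    /* SYSRET loads fixed selectors from an MSR, so any other CS/SS
     * (for example an LDT segment) has to go through IRET. */
    if (f->cs != USER_CS || f->ss != USER_SS)
        return false;
    /* SYSRET takes RIP from RCX and RFLAGS from R11. */
    if (f->cx != f->ip || f->r11 != f->flags)
        return false;
    /* Flags that SYSRET cannot restore faithfully. */
    if (f->flags & (FLAGS_RF | FLAGS_TF))
        return false;
    /* A non-canonical return address must fault on the IRET path. */
    if (!is_canonical(f->ip))
        return false;
    return true;
}
```

A caller would test can_use_sysret() on the saved user state just
before returning and fall back to IRET whenever it returns false. IRET
then performs the descriptor lookups and permission checks itself, so
a segment invalidated late shows up as a fault in the IRET path
instead of being loaded unchecked.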