Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp6493214yba; Wed, 1 May 2019 13:28:43 -0700 (PDT) X-Google-Smtp-Source: APXvYqwN218/8Co+azC/MTIdBCTB3HWujRfUMk2p4ufFGmHL5c8EgwTzUGD16weiQKI1mlwXetom X-Received: by 2002:aa7:80d1:: with SMTP id a17mr27289133pfn.156.1556742523481; Wed, 01 May 2019 13:28:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1556742523; cv=none; d=google.com; s=arc-20160816; b=Nji4F88v1T6jUng+vD7G2KzzY3mXahRWSDEqDd6Fb2C08RNTtY27jhCtC2eD0mDdfs HMyjSTHniErbYiRD4T032YdOwogt+rER/fMybGPuS+sLgkGt9HV5KS8YPQxm8IHc1/C3 nnK2UoAmPAeeciGu+BsG7w2b4rkcpyxajf9F/moTLG3fvWznCZ7sOGCrzFrOHjEqWey3 yj9fcOjHZq2ACW8Km1iL0MAtZ+9WvQBvhmarMegy1VsgdjG4NZGpkiN7qW57FjSLo85T V5xycEIo1OknpTP/Qvxsi8rq1ez4Y+GBZ8K0SC/npb5gHJJyYstNQjgnrfD8wbH8MEfk HBgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:references:message-id :content-transfer-encoding:cc:date:in-reply-to:from:subject :mime-version:dkim-signature; bh=FSF7/Ot6zFZjWolRIiiMJFDPBlKnObZA93vCGCliBc0=; b=aNspWrbp1gPmsujP8uth9X2ddawUjHSXVVPtLfBF74QrVccJ+Vn+yCK7vQ5/B2R6KP c8AJ7aRee/QuMpQDYO6HMccc00v2T0cmFmSulpXFMGo03SykuUGGfTWLvSXa4+74Uxof vMkytYk3GDhA8bA0b4Zt9sJocR5ACA4ChXF1p0qcDhdidAzg1fIDgXP6xBdKO42GndMy 93Ke5qisDhoHu9kdjhCZ6nszwrfFcxAGqaoN0o4CyUYaf2NDX3MZGTMpbl1FapcKJd4J d2XVp4jirzrjit9yk5+FnpdgNXnkAP4SYkIHS4KZZ8sPiCYrY0LjB+YwD3/qQ6sZ7Y1q jVQQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=MjjlszWE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r12si23794258pli.103.2019.05.01.13.28.28; Wed, 01 May 2019 13:28:43 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@amacapital-net.20150623.gappssmtp.com header.s=20150623 header.b=MjjlszWE; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726225AbfEAU0A (ORCPT + 99 others); Wed, 1 May 2019 16:26:00 -0400 Received: from mail-pl1-f193.google.com ([209.85.214.193]:43213 "EHLO mail-pl1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726077AbfEAU0A (ORCPT ); Wed, 1 May 2019 16:26:00 -0400 Received: by mail-pl1-f193.google.com with SMTP id n8so8644715plp.10 for ; Wed, 01 May 2019 13:25:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amacapital-net.20150623.gappssmtp.com; s=20150623; h=mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=FSF7/Ot6zFZjWolRIiiMJFDPBlKnObZA93vCGCliBc0=; b=MjjlszWEp2k51s+D16yH7570R3HNDO+yiKMzcb62dpX5XPDgDdEeG15uxijmnIIuy5 ImillGZgFSmmza64MgKbel48kER2ZlyFs/DZP23HtW4UlHR0mxM5znVNc1D/F3V/Njx5 8ZIHyc7ZyPMTidQXW0V+pBm8owats8Kc5RQKfAVWyCvogJ364QdWd+fRiJcbhAp04wHu ACLaDJ5v0ftNPNcxQaZ0rXzJjzl1Z/69Hvm+XTEtjYtC7fgkUpZ1P/azGVkbbaIOuzId W11gXkjcBUD9t10LEUlXDae9Jj40OSlmO07pAeg6WbZKOgnKHtaEFAwSCIY0T4X0pmm8 APCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=FSF7/Ot6zFZjWolRIiiMJFDPBlKnObZA93vCGCliBc0=; b=jIm/ffBNUtnqU0WdWiyLKUYfx7a0ixgRrSoN091PIhpcwS+XvKFACac2SfHUqYVuvN wdamfBI3E1BNEHmtnVsXVk1s5lCQXcqTAAGc5hyWhkpXwq8HtrXMVaAODXFLVRB9K5G+ eYwKbQ9MMS0iUUX/177VVwVdV7h91eyOl6WDO7viXUvKu1pqzyi+6+VlPomuX1kQujlT Vh1vlq33vUEOULlH/Edea/Rxp413Ok562JsS3qkFl496qEGZ+/fdVczkJU8b1Xz+w3WK 1fioGKyXE+G8Sr3RQDFLR19xMttXnfjszZtpZVgJRr3MbkfIehjFJ5DmRWYGphHqNyyt Z2Hg== X-Gm-Message-State: APjAAAWZSuNFTpIAVY2lEugaTh1l41mspC83wAqm9PCv1Qw8M5db0nC+ Q1RE83hKN46NQZEWFphxTBWVeQ== X-Received: by 2002:a17:902:5c5:: with SMTP id f63mr76569541plf.327.1556742359199; Wed, 01 May 2019 13:25:59 -0700 (PDT) Received: from ?IPv6:2600:1010:b02a:b64e:1024:4224:172b:986b? ([2600:1010:b02a:b64e:1024:4224:172b:986b]) by smtp.gmail.com with ESMTPSA id 128sm47662235pgb.47.2019.05.01.13.25.57 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 01 May 2019 13:25:58 -0700 (PDT) Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (1.0) Subject: Re: [RESEND PATCH v6 08/12] x86/fsgsbase/64: Use the per-CPU base as GSBASE at the paranoid_entry From: Andy Lutomirski X-Mailer: iPhone Mail (16E227) In-Reply-To: <7029A32B-958E-4C1E-8B5F-D49BA68E4755@intel.com> Date: Wed, 1 May 2019 13:25:56 -0700 Cc: Andy Lutomirski , Thomas Gleixner , Ingo Molnar , "H . Peter Anvin" , Andi Kleen , "Shankar, Ravi V" , LKML , Dave Hansen Content-Transfer-Encoding: quoted-printable Message-Id: <2863FA6C-F783-4322-9A01-4A2B8A7817A3@amacapital.net> References: <1552680405-5265-1-git-send-email-chang.seok.bae@intel.com> <1552680405-5265-9-git-send-email-chang.seok.bae@intel.com> <5DCF2089-98EC-42D3-96C3-6ECCDA0B18E2@amacapital.net> <0816B012-44E8-40FB-8003-33C4841CD0E1@intel.com> <7029A32B-958E-4C1E-8B5F-D49BA68E4755@intel.com> To: "Bae, Chang Seok" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org > On May 1, 2019, at 1:21 PM, Bae, Chang Seok wro= te: >=20 >=20 >>> On May 1, 2019, at 11:01, Bae, Chang Seok wro= te: >>>=20 >>> On May 1, 2019, at 10:40, Andy Lutomirski wrote: >>>=20 >>> On Wed, May 1, 2019 at 6:52 AM Bae, Chang Seok wrote: >>>>=20 >>>>=20 >>>>> On Apr 5, 2019, at 06:50, Andy Lutomirski wrote:= >>>>>=20 >>>>> Furthermore, if you folks even want me to review this series, the ptra= ce tests need to be in place. On inspection of the current code (after the d= ebacle a few releases back), it appears the SETREGSET=E2=80=99s effect depen= ds on the current values in the registers =E2=80=94 it does not actually see= m to reliably load the whole state. So my confidence will be greatly increas= ed if your series first adds a test that detects that bug (and fails!), then= fixes the bug in a tiny little patch, then adds FSGSBASE, and keeps the tes= t working. >>>>>=20 >>>>=20 >>>> I think I need to understand the issue. Appreciate if you can elaborate= a little bit. >>>>=20 >>>=20 >>> This patch series gives a particular behavior to PTRACE_SETREGS and >>> PTRACE_POKEUSER. There should be a test case that validates that >>> behavior, including testing the weird cases where gs !=3D 0 and gsbase >>> contains unusual values. Some existing tests might be pretty close to >>> doing what's needed. >>>=20 >>> Beyond that, the current putreg() code does this: >>>=20 >>> case offsetof(struct user_regs_struct,gs_base): >>> /* >>> * Exactly the same here as the %fs handling above. >>> */ >>> if (value >=3D TASK_SIZE_MAX) >>> return -EIO; >>> if (child->thread.gsbase !=3D value) >>> return do_arch_prctl_64(child, ARCH_SET_GS, value); >>> return 0; >>>=20 >>> and do_arch_prctl_64(), in turn, does this: >>>=20 >>> case ARCH_SET_GS: { >>> if (unlikely(arg2 >=3D TASK_SIZE_MAX)) >>> return -EPERM; >>>=20 >>> preempt_disable(); >>> /* >>> * ARCH_SET_GS has always overwritten the index >>> * and the base. Zero is the most sensible value >>> * to put in the index, and is the only value that >>> * makes any sense if FSGSBASE is unavailable. >>> */ >>> if (task =3D=3D current) { >>> [not used for ptrace] >>> } else { >>> task->thread.gsindex =3D 0; >>> x86_gsbase_write_task(task, arg2); >>> } >>>=20 >>> ... >>>=20 >>> So writing the value that was already there to gsbase via putreg() >>> does nothing, but writing a *different* value implicitly clears gs, >>> but writing a different value will clear gs. >>>=20 >>> This behavior is, AFAICT, complete nonsense. It happens to work >>> because usually gdb writes the same value back, and, in any case, gs >>> comes *after* gsbase in user_regs_struct, so gs gets replaced anyway. >>> But I think that this behavior should be fixed up and probably tested. >>> Certainly the behavior should *not* be the same on a fsgsbase kernel, >>> and and the fsgsbase behavior definitely needs a selftest. >>=20 >> Okay, got the point; now crystal clear. >>=20 >> I have my own test case for that though, need to find a very simple and >> acceptable solution. >>=20 >=20 > One solution that I recall, HPA once suggested, is: > Write registers in a reverse order from user_regs_struct, for SETREGS >=20 > Assuming these for clarification, first: > * old and new index !=3D 0 > * taking GS as an example though, should be the same with FS >=20 > Then, interesting cases would be something like these, without FSGSBASE: > Case (a), when index only changed to (new index): > (Then, the result after SETREGS would be) > GS =3D (new index), GSBASE =3D the base fetched from (new index) > Case (b), when base only changed to (new base): > Case (c), when both are changed: > GS =3D 0, GSBASE =3D (new base) >=20 > Now, with FSGSBASE: > Case (a): > GS =3D (new index), GSBASE =3D (old base) > Case (b): > GS =3D (old index), GSBASE =3D (new base) > Case (c): > GS =3D (new index), GSBASE =3D (new base) >=20 > As a reference, today's kernel behavior, without FSGSBASE: > Case (a): > GS =3D (new index), GSBASE =3D the base fetched from (new index) > Case (b): > GS =3D (old index), GSBASE =3D (old base) > Case (c): > GS =3D (new index), GSBASE =3D the base fetched from (new index) >=20 > Now, with that reverse ordering and taking that "GSBASE is important" [1],= > it looks like to be working in terms of its base value: > Case (b) and (c) will behave the same as with FSGSBASE > Case (a) still differs between w/ and w/o FSGSBASE. > Well, I'd say this bit comes from the 'new model' vs. the 'leagcy > model'. So, then okay with that. Any thoughts? >=20 >=20 >=20 This seems more complicated than needed. How about we just remove all the m= agic and make putreg on the base registers never change the selector. As far as I can tell, the only downside is that, on a non-FSGSBASE kernel, s= etting only the base if the selector already has a nonzero value won=E2=80=99= t work, but I would be quite surprised if this breaks anything.=