From: Kyle Huey
Date: Tue, 25 Aug 2020 13:03:36 -0700
Subject: Re: [REGRESSION] x86/cpu fsgsbase breaks TLS in 32 bit rr tracees on a 64 bit system
To: Andy Lutomirski
Cc: "H. Peter Anvin", "Robert O'Callahan", "Bae, Chang Seok", Thomas Gleixner, Ingo Molnar, Andi Kleen, "Shankar, Ravi V", LKML, "Hansen, Dave"

On Tue, Aug 25, 2020 at 12:32 PM Andy Lutomirski wrote:
>
> On Tue, Aug 25, 2020 at 11:50 AM Kyle Huey wrote:
> >
> > On Tue, Aug 25, 2020 at 10:31 AM Kyle Huey wrote:
> > >
> > > On Tue, Aug 25, 2020 at 9:46 AM Andy Lutomirski wrote:
> > > >
> > > > On Tue, Aug 25, 2020 at 9:32 AM Kyle Huey wrote:
> > > > >
> > > > > On Tue, Aug 25, 2020 at 9:12 AM Andy Lutomirski wrote:
> > > > > > I don’t like this at all. Your behavior really shouldn’t depend on
> > > > > > whether the new instructions are available.
> > > > > > Also, some day I would
> > > > > > like to change Linux to have the new behavior even if FSGSBASE
> > > > > > instructions are not available, and this will break rr again. (The
> > > > > > current !FSGSBASE behavior is an ugly optimization of dubious value.
> > > > > > I would not go so far as to describe it as correct.)
> > > > >
> > > > > Ok.
> > > > >
> > > > > > I would suggest you do one of the following things:
> > > > > >
> > > > > > 1. Use int $0x80 directly to load 32-bit regs into a child. This
> > > > > > might dramatically simplify your code and should just do the right
> > > > > > thing.
> > > > >
> > > > > I don't know what that means.
> > > >
> > > > This is untested, but what I mean is:
> > > >
> > > > static int ptrace32(int req, pid_t pid, int addr, int data) {
> > > >     int ret;
> > > >     /* new enough kernels won't clobber r8, etc. */
> > > >     asm volatile ("int $0x80" : "=a" (ret) : "a" (26 /* ptrace */), "b"
> > > > (req), "c" (pid), "d" (addr), "S" (data) : "flags", "r8", "r9", "r10",
> > > > "r11");
> > > >     return ret;
> > > > }
> > > >
> > > > with a handful of caveats:
> > > >
> > > > - This won't compile with -fPIC, I think. Instead you'll need to
> > > > write a little bit of asm to set up and restore ebx yourself. gcc is
> > > > silly like this.
> > > >
> > > > - Note that addr is an int. You'll need to mmap(..., MAP_32BIT, ...)
> > > > to get a buffer that can be pointed to with an int.
> > > >
> > > > The advantage is that this should work on all kernels that support
> > > > 32-bit mode at all.
> > > >
> > > > >
> > > > > > 2. Something like your patch but make it unconditional.
> > > > > >
> > > > > > 3. Ask for, and receive, real kernel support for setting FS and GS in
> > > > > > the way that 32-bit code expects.
> > > > >
> > > > > I think the easiest way forward for us would be a PTRACE_GET/SETREGSET
> > > > > like operation that operates on the regsets according to the
> > > > > *tracee*'s bitness (rather than the tracer, as it works currently).
> > > > > Does that sound workable?
> > > > >
> > > >
> > > > Strictly speaking, on Linux, there is no unified concept of a task's
> > > > bitness, so "set all these registers according to the target's
> > > > bitness" is not well defined. We could easily give you a
> > > > PTRACE_SETREGS_X86_32, etc, though.
> > >
> > > In the process of responding to this I spent some time doing code
> > > inspection and discovered a subtlety in the ptrace API that I was
> > > previously unaware of. PTRACE_GET/SETREGS use the regset views
> > > corresponding to the tracer but PTRACE_GET/SETREGSET use the regset
> > > views corresponding to the tracee. This means it is possible for us
> > > today to set FS/GS "the old way" with a 64 bit tracer/32 bit tracee
> > > combo, as long as we use PTRACE_SETREGSET with NT_PRSTATUS instead of
> > > PTRACE_SETREGS.
> >
> > Alright I reverted the previous changes and switched us to use
> > PTRACE_SETREGSET with NT_PRSTATUS[0] and our 32 bit tests pass again.
> > I assume this behavior will remain unchanged indefinitely even when
> > the fs/gsbase manipulation instructions are not available since the 32
> > bit user_regs_struct can't grow?
>
> Correct.
>
> I think it would be reasonable to add new, less erratic PTRACE_SETxyz
> modes, but the old ones will stay as is.
>
> Strictly speaking, the issue you had wasn't a ptrace change. It was a
> context switching change. Before the change, you were programming
> garbage into the tracee state, but the context switch code wasn't
> capable of restoring the garbage correctly and instead happened to
> restore the state you actually wanted. With the new code, the context
> switch actually worked correctly (for some value of correctly), and
> the tracee crashed.
> Not that this distinction really matters.

Yes. We've been feeding the kernel trash for years. You were just
letting us get away with it before ;)

- Kyle
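
For readers following the thread, here is a minimal sketch of the approach
described above: a 64-bit tracer setting FS/GS in a 32-bit tracee through
PTRACE_GETREGSET/PTRACE_SETREGSET with NT_PRSTATUS, so that the kernel uses
the tracee's regset view. This is not rr's actual code. The names regs32 and
set_fs_gs32 are hypothetical, and the struct layout is an assumption that
mirrors the 17-slot i386 user_regs_struct. The tracee is assumed to be
already attached and stopped.

```c
#include <elf.h>         /* NT_PRSTATUS */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>     /* struct iovec */

/* Hypothetical name. Assumed layout: the i386 user_regs_struct,
 * seventeen 32-bit slots in this order (68 bytes total). */
struct regs32 {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
    uint32_t xds, xes, xfs, xgs;
    uint32_t orig_eax, eip, xcs, eflags, esp, xss;
};

/* Hypothetical helper: set the FS and GS selectors of a stopped 32-bit
 * tracee. Returns 0 on success, -1 on failure with errno set. */
int set_fs_gs32(pid_t pid, uint16_t fs, uint16_t gs)
{
    /* Oversized buffer so the size the kernel reports back tells us
     * which regset view the tracee actually has. */
    union {
        struct regs32 r;
        unsigned char raw[256];
    } buf;
    struct iovec iov = { .iov_base = &buf, .iov_len = sizeof(buf) };

    memset(&buf, 0, sizeof(buf));
    if (ptrace(PTRACE_GETREGSET, pid, (void *)(uintptr_t)NT_PRSTATUS, &iov) == -1)
        return -1;

    /* The kernel rewrites iov_len to the number of bytes it returned.
     * Anything other than the 32-bit size means this is not a 32-bit
     * view, so refuse to write a 32-bit layout into it. */
    if (iov.iov_len != sizeof(struct regs32)) {
        errno = EINVAL;
        return -1;
    }

    buf.r.xfs = fs;
    buf.r.xgs = gs;

    iov.iov_len = sizeof(struct regs32);
    if (ptrace(PTRACE_SETREGSET, pid, (void *)(uintptr_t)NT_PRSTATUS, &iov) == -1)
        return -1;
    return 0;
}
```

A tracer would call set_fs_gs32(pid, fs, gs) at a ptrace stop, in place of
a PTRACE_SETREGS call that goes through the tracer's 64-bit view. Checking
the returned iov_len is a design choice in this sketch, meant to avoid
writing a 32-bit layout into a tracee that turns out to be 64-bit.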