Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp3540314pxb; Mon, 4 Apr 2022 20:31:01 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxuxMqDSrLmwIFCeHUiyAOqm/HAV9fy96EWxJ1ZO4lrgTiPRQLSpVVTlJnoY7+ClP43BL+f X-Received: by 2002:a17:903:22ca:b0:14c:d9cf:a463 with SMTP id y10-20020a17090322ca00b0014cd9cfa463mr1532489plg.32.1649129460935; Mon, 04 Apr 2022 20:31:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649129460; cv=none; d=google.com; s=arc-20160816; b=zfzuK7Qz1V6H8WavnG5rMEaSI5/LNvcHYpqbwTAqKDA9Dew9zM9x+yM4zZYvCRMK7w gZQxui4ocObZFEZRikiGoCEX7uGnfMeB/o0Wd8WXVhSjG02yzcx+NqcwahGhkiWgeCGu udoCWoTBIa8mUJAQcsjMqT8KBzAtWUbTOEiS5BWlxNMjFHFs6VKcoI5TQyBpSysP0WuE 5FnFo2a17Ro+/O2YTV9MgZ7gPWg7kJ8JNZT7+jjsC8AHHAOIpwfhUo9euBo54nQuwX4i WRchKa18Evc875LnrirXVlQ8YsOy5yUNamDQGnB7ji4rc6Nj4HDL+Yf9v+ToVNm37yBt 7VYQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:date:references:in-reply-to :message-id:mime-version:user-agent:dkim-signature; bh=Nkxj8G+iYlMavmfB8XWt9H61R4ZJM+Uggwjz1OOSuqY=; b=aXI24tWLd9bvOPNH9SuJ2NjGnvCj2jfHWfWtkO5ZYXgll9yA/7OS6IlRuKiY+uhol6 nggvWNcfNmDAFnFg8O7F6ngSVZ+fYGF7/F0cBiUypW1S/ghdSgsVRP7EyJ7OhkflL/Vq eL6dqZeWVvvZI7L9cmQJztmrK3ttviC4xCNw1d2LF9BEk5PT7oWnc2Fb32nZyLpyP1VY MnP9Nma5BPHUmoXJdFAQhNaQ74xGMTv65ynHTgZ4CEIYbrIUaI5gSk8oPX3whstGS/0j /4OIl1cRcSOinSo7J7RQX3KHM6IdxtjMcKUMZsBVZ7exOxFeHSyiv4mpGT/DFO/1RzfK I3PQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=C3lWKOp8; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id f6-20020a056a0022c600b004fa3a8dffc3si12576447pfj.122.2022.04.04.20.31.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Apr 2022 20:31:00 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=C3lWKOp8; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id B9D2535B258; Mon, 4 Apr 2022 18:30:27 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348850AbiDATnY (ORCPT + 99 others); Fri, 1 Apr 2022 15:43:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236441AbiDATnX (ORCPT ); Fri, 1 Apr 2022 15:43:23 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 573461D1936 for ; Fri, 1 Apr 2022 12:41:33 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E582E61720 for ; Fri, 1 Apr 2022 19:41:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B29F3C340EC; Fri, 1 Apr 2022 19:41:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1648842092; bh=/Y8/OhXYBEbqn8OGJCyAUnOAQ6cUj7tjx72r1WjnUtU=; h=In-Reply-To:References:Date:From:To:Cc:Subject:From; b=C3lWKOp89sC4JSn0gJXCcVN3ynzqLuzDSn3cQ0vm3jR2hQSMz9gWfjFcTCtC/BLSl ez5hkKK2MAT6zyFiUOuGKzHzyPiXmHlS7dtSV9VZ45bUbTjyxgIDSYzcG1QdJQ1tdQ ckTph4jrL0mTbPNp+z8K+5NYehN3MuzMYv2veDnDKwXtZJwPAS3RX2cBCYpJ33192X G34DV6ddfimMLM3eCbEJq/YXTlcQEyyBcV9c+RCkXrpkPdy93q8wDii9l/6MEMkL0i UCBselLIVbIccn28g8hOoKjUGgXKQ07i6ua8Ea6PpaEOQSe40gBqaWP3ahd/uCW0gC UYO1myyuM+V+A== Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailauth.nyi.internal (Postfix) with ESMTP id 6696F27C0054; Fri, 1 Apr 2022 15:41:30 -0400 (EDT) Received: from imap48 ([10.202.2.98]) by compute2.internal (MEProxy); Fri, 01 Apr 2022 15:41:30 -0400 X-ME-Sender: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvvddrudeiiedgudegtdcutefuodetggdotefrod ftvfcurfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfgh necuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd enucfjughrpefofgggkfgjfhffhffvufgtsehttdertderredtnecuhfhrohhmpedftehn ugihucfnuhhtohhmihhrshhkihdfuceolhhuthhosehkvghrnhgvlhdrohhrgheqnecugg ftrfgrthhtvghrnhephfelhfdugeetvdfhfeeuveevtdegueekhfffheetffdtleevtdeh tdeivdeuvedunecuffhomhgrihhnpehkvghrnhgvlhdrohhrghenucevlhhushhtvghruf hiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegrnhguhidomhgvshhmthhprghu thhhphgvrhhsohhnrghlihhthidqudduiedukeehieefvddqvdeifeduieeitdekqdhluh htoheppehkvghrnhgvlhdrohhrgheslhhinhhugidrlhhuthhordhush X-ME-Proxy: Received: by mailuser.nyi.internal (Postfix, from userid 501) id 3AFB921E006E; Fri, 1 Apr 2022 15:41:29 -0400 (EDT) X-Mailer: MessagingEngine.com Webmail Interface User-Agent: Cyrus-JMAP/3.7.0-alpha0-382-g88b93171a9-fm-20220330.001-g88b93171 Mime-Version: 1.0 Message-Id: <57a435ac-1ae3-49a0-bacf-307dc1a9349d@www.fastmail.com> In-Reply-To: References: <20220310111545.10852-1-bharata@amd.com> <6a5076ad-405e-4e5e-af55-fe2a6b01467d@www.fastmail.com> Date: Fri, 01 Apr 2022 12:41:07 -0700 From: "Andy Lutomirski" To: "Bharata B Rao" , "Linux Kernel Mailing List" Cc: linux-mm@kvack.org, "the arch/x86 maintainers" , "Kirill A. Shutemov" , "Thomas Gleixner" , "Ingo Molnar" , "Borislav Petkov" , "Dave Hansen" , "Catalin Marinas" , "Will Deacon" , shuah@kernel.org, "Oleg Nesterov" , ananth.narayan@amd.com Subject: Re: [RFC PATCH v0 0/6] x86/AMD: Userspace address tagging Content-Type: text/plain X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 23, 2022, at 12:48 AM, Bharata B Rao wrote: > On 3/22/2022 3:59 AM, Andy Lutomirski wrote: >> On Thu, Mar 10, 2022, at 3:15 AM, Bharata B Rao wrote: >>> Hi, >>> >>> This patchset makes use of Upper Address Ignore (UAI) feature available >>> on upcoming AMD processors to provide user address tagging support for x86/AMD. >>> >>> UAI allows software to store a tag in the upper 7 bits of a logical >>> address [63:57]. When enabled, the processor will suppress the >>> traditional canonical address checks on the addresses. More information >>> about UAI can be found in section 5.10 of 'AMD64 Architecture >>> Programmer's Manual, Vol 2: System Programming' which is available from >>> >>> https://bugzilla.kernel.org/attachment.cgi?id=300549 >> >> I hate to be a pain, but I'm really not convinced that this feature is suitable for Linux. There are a few reasons: >> >> Right now, the concept that the high bit of an address determines whether it's a user or a kernel address is fairly fundamental to the x86_64 (and x86_32!) code. It may not be strictly necessary to preserve this, but violating it would require substantial thought. With UAI enabled, kernel and user addresses are, functionally, interleaved. This makes things like access_ok checks, and more generally anything that operates on a range of addresses, behave potentially quite differently. A lot of auditing of existing code would be needed to make it safe. > > Ok got that. However can you point to me a few instances in the current > kernel code where such assumption of high bit being user/kernel address > differentiator exists so that I get some idea of what it takes to > audit all such cases? Anything that thinks that an address >= some value means kernel. > > Also wouldn't the problem of high bit be solved by using only the > 6 out of 7 available bits in UAI and leaving the 63rd bit alone? > The hardware will still ignore the top bit, but this should take > care of the requirement of high bit being 0/1 for user/kernel in the > x86_64 kernel. Wouldn't that work? Maybe, but that seems quite ugly. This will make the userspace and kernel semantics diverge. > >> >> UAI looks like it wasn't intended to be context switched and, indeed, your series doesn't context switch it. As far as I'm concerned, this is an error, and if we support UAI at all, we should context switch it. Yes, this will be slow, perhaps painfully slow. AMD knows how to fix it by, for example, reading the Intel SDM. By *not* context switching UAI, we force it on for all user code, including unsuspecting user code, as well as for kernel code. Do we actually want it on for kernel code? With LAM, in contrast, the semantics for kernel pointers vs user pointers actually make sense and can be set per mm, which will make things like io_uring (in theory) do the right thing. > > I plan to enable/disable UAI based on the next task's settings by > doing MSR write to EFER during context switch. I will have to measure > how much additional cost an MSR write in context switch path brings in. > However given that without a hardware feature like ARM64 MTE, this would > primarily be used in non-production environments. Hence I wonder if MSR > write cost could be tolerated? I'm not sure what you mean by a feature like ARM64 MTE. > > Regarding enabling UAI for kernel, I will have to check how clean and > efficient it would be to disable/enable UAI on user/kernel entry/exit > points. > >> >> UAI and LAM are incompatible from a userspace perspective. Since LAM is pretty clearly superior [0], it seems like a better long term outcome would be for programs that want tag bits to target LAM and for AMD to support LAM if there is demand. For that matter, do we actually expect any userspace to want to support UAI? (Are there existing too-clever sandboxes that would be broken by enabling UAI?) >> >> Given that UAI is not efficiently context switched, the implementation of uaccess is rather bizarre. With the implementation in this series in particular, if the access_ok checks ever get out of sync with actual user access, a user access could be emitted with the high bits not masked despite the range check succeeding due to masking, which would, unless great care is taken, result in a "user" access hitting the kernel range. That's no good. > > Okay, I guess if context switching and sticking to 6 bits as mentioned > earlier is feasible, this concern too goes away unless I am missing something. I think it does go away.