X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <420fcb06-c3c3-4e8f-a82d-be2fb2ef444d@app.fastmail.com>
In-Reply-To: <20240307133916.3782068-3-yosryahmed@google.com>
References: <20240307133916.3782068-1-yosryahmed@google.com>
 <20240307133916.3782068-3-yosryahmed@google.com>
Date: Thu, 07 Mar 2024 17:34:21 -0800
From: "Andy Lutomirski"
To: "Yosry Ahmed", "Andrew Morton"
Cc: "Thomas Gleixner", "Ingo Molnar", "Borislav Petkov", "Dave Hansen",
 "Peter Zijlstra (Intel)", "Kirill A. Shutemov",
 "the arch/x86 maintainers", linux-mm@kvack.org,
 "Linux Kernel Mailing List"
Subject: Re: [RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during
 context switching
Content-Type: text/plain

Catching up a bit...

On Thu, Mar 7, 2024, at 5:39 AM, Yosry Ahmed wrote:
> During context switching, if we are not switching to new mm and no TLB
> flush is needed, we do not write CR3.
> However, it is possible that a
> user thread enables LAM while a kthread is running on a different CPU
> with the old LAM CR3 mask. If the kthread context switches into any
> thread of that user process, it may not write CR3 with the new LAM mask,
> which would cause the user thread to run with a misconfigured CR3 that
> disables LAM on the CPU.

So I think (off the top of my head -- haven't thought about it all that
hard) that LAM is logically like PCE and LDT: it's a property of an mm
that is only rarely changed, and it doesn't really belong as part of the
tlb_gen mechanism.  And, critically, it's not worth the effort and
complexity to try to optimize LAM changes when we have a lazy CPU (just
like PCE and LDT) (whereas TLB flushes are performance critical and are
absolutely worth optimizing).

So...

>
> Fix this by making sure we write a new CR3 if LAM is not up-to-date. No
> problems were observed in practice, this was found by code inspection.

I think it should be fixed with a much bigger hammer: explicit IPIs.
Just don't ever let it get out of date, like install_ldt().

>
> Not that it is possible that mm->context.lam_cr3_mask changes throughout
> switch_mm_irqs_off(). But since LAM can only be enabled by a
> single-threaded process on its own behalf, in that case we cannot be
> switching to a user thread in that same process, we can only be
> switching to another kthread using the borrowed mm or a different user
> process, which should be fine.

The thought process is even simpler with the IPI: it *can* change while
switching, but it will resynchronize immediately once IRQs turn back on.
And whoever changes it will *synchronize* with us, which would otherwise
require extremely complex logic to get right.

And...

> -	if (!was_lazy)
> -		return;
> +	if (was_lazy) {
> +		/*
> +		 * Read the tlb_gen to check whether a flush is needed.
> +		 * If the TLB is up to date, just use it.  The barrier
> +		 * synchronizes with the tlb_gen increment in the TLB
> +		 * shootdown code.
> +		 */
> +		smp_mb();

This is actually rather expensive -- from old memory, we're talking maybe
20 cycles here, but this path is *very* hot and we try fairly hard to
make it be fast.  If we get the happy PCID path, it's maybe 100-200
cycles, so this is like a 10% regression.  Ouch.

And you can delete all of this if you accept my suggestion.
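
[Editorial sketch, not part of the original mail.]  The "explicit IPI"
idea above can be illustrated with a small userspace model.  This is not
kernel code and not the patch that was eventually merged: struct mm_model,
struct cpu_model, cpus[], LAM_U57_MODEL, enable_lam(), sync_lam_on_cpu()
and switch_to() are all hypothetical names invented for illustration, and
the "IPI" is modeled as a plain synchronous loop over CPUs.  The point it
tries to show is that if the writer of the LAM mask updates every CPU that
has the mm loaded (lazy or not), the switch path needs neither a LAM check
nor an extra barrier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4

/* Illustrative value only: stands in for a LAM enable bit in CR3. */
#define LAM_U57_MODEL (1ULL << 61)

/* Hypothetical stand-in for an mm; only the field discussed in the thread. */
struct mm_model {
	uint64_t lam_cr3_mask;
};

/* Hypothetical per-CPU state: the loaded mm and the LAM bits live in "CR3". */
struct cpu_model {
	const struct mm_model *loaded_mm;
	uint64_t cr3_lam_bits;
	bool lazy;	/* true while a kthread runs on a borrowed mm */
};

static struct cpu_model cpus[NR_CPUS];

/* What the modeled IPI handler does on the target CPU. */
static void sync_lam_on_cpu(int cpu, const struct mm_model *mm)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	/* Only CPUs that currently have this mm loaded need a reload. */
	if (cpus[cpu].loaded_mm == mm)
		cpus[cpu].cr3_lam_bits = mm->lam_cr3_mask;
}

/*
 * Model of enabling LAM: publish the new mask, then synchronously visit
 * every CPU.  Lazy CPUs are deliberately not skipped; the change is rare,
 * so it is not worth optimizing.
 */
static void enable_lam(struct mm_model *mm, uint64_t mask)
{
	mm->lam_cr3_mask = mask;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sync_lam_on_cpu(cpu, mm);
}

/*
 * Model of the switch path with no LAM check at all: when the mm is
 * unchanged, "CR3" is not rewritten.
 */
static void switch_to(int cpu, const struct mm_model *next, bool lazy)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (cpus[cpu].loaded_mm != next) {
		cpus[cpu].loaded_mm = next;
		cpus[cpu].cr3_lam_bits = next->lam_cr3_mask;
	}
	cpus[cpu].lazy = lazy;
}
```

In this model, a CPU that is lazily holding the mm when enable_lam() runs
already has the new mask by the time it switches back to a user thread of
that process, so the unchanged-mm fast path in switch_to() stays free of
any LAM-specific work.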