Received: by 2002:a05:7412:3b8b:b0:fc:a2b0:25d7 with SMTP id nd11csp3041128rdb; Tue, 13 Feb 2024 05:21:57 -0800 (PST) X-Google-Smtp-Source: AGHT+IFeM1wqncdpuJlLqw6B59GAoxEGcXWgK3KgW78617gh6x3PAMucf3B5OAHRkxij5o0u2eIA X-Received: by 2002:a0c:9b19:0:b0:68c:d9e4:63c4 with SMTP id b25-20020a0c9b19000000b0068cd9e463c4mr9265156qve.63.1707830516739; Tue, 13 Feb 2024 05:21:56 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1707830516; cv=pass; d=google.com; s=arc-20160816; b=Z+kK4pHRcu22bsst2hovD+0Q3U7D/Ykxs5dwzAZvv5nkTCq3I5gnohcYbZrygy+ZSl TRx6lqOOPSHq23BebEioC/Hu12PXpizHreXFD5tgZUFtGi5va9f7oMnbAfBNt2ZZP9ao V7AgYvZ1wHHv0OCYZqpyQT1e4xEpCf4AKRdvres1dQTq1ubP7RElsRPI8VkUcnWyoO6j oOMOzsbl5BehiN276izqQUMUNmOzmMW9boX8ZvU4pmD4UsOQoHQu2npOdfxSHNnPsY3s 8VTiuei+vfjf6V1/8E+4Ld4Ln3gasJH3txROArMkqVWn+QdFOEUJv0dLi5e6tas4It/y Yu8w== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:date:message-id; bh=8Fhg4tQxBXbadhl07KiA74KUc3LqRQ3zdnhXSy51HgY=; fh=8GpZjtOhnJCcSWndkV5R22zn1kFBR32BylE8lP9YH0A=; b=xREQqA7eJ3wk3liJDKnC2c3CcRbglBIPF4w02bOVfCw9xaHSgM6M5kS9XnCKM3ehbs RLus8Zwxe0DcQCVrH+3TWzaDm00R3Rt3UBCvRd6rWmmGYSvyFL97VNJP2m4yGd7C2WtS 89+Cm0+VX5FnRYMlxyjcdI0i85voi6aNSfGWkQtafUxfeQNhwUb89CcxZhk6Y+o/GpqF 0UFAMWmJCIk+18E/1BXkbN+qVPhb8k93klpS72TGVE7EoxhDs+D51OMT0saWU9twJ7mp mJWhyqYgOB1bl4ibLhP0J/oqLJPqc55UiWzhlLFFvNkmnIAiL/48oPSXaQxGV5QFbMgD uz9A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-63576-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-63576-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com X-Forwarded-Encrypted: i=2; AJvYcCURy6Bl111w/ed2qs8DVWdEVndOKHt1umDEr5WSxCvmT+TQB8auFqXlQPW2LP6MVzyPAc8lCteHxAHIlXglChAdfZzE1hDwO7A/DKm92Q== Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id fn3-20020ad45d63000000b0068cd9f35a23si2720513qvb.542.2024.02.13.05.21.56 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Feb 2024 05:21:56 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-63576-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=arm.com dmarc=pass fromdomain=arm.com); spf=pass (google.com: domain of linux-kernel+bounces-63576-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-63576-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 5B6CE1C263C9 for ; Tue, 13 Feb 2024 13:21:10 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id D6BED55794; Tue, 13 Feb 2024 13:21:04 +0000 (UTC) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 8DABD55C01 for ; Tue, 13 Feb 2024 13:21:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=217.140.110.172 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707830464; cv=none; b=XiQXdAkY03hz44icvmlAJx2RJR21OS6G1eUP3iubYJgAEFfVdBdEoSfI43oOyoxqJiCQTziI23HTC6RGziKoqDi5kERQdiFiJVH+5WcmQbGB0G0rUvzgAltI3AHgOQMvY0jhRK7AVmnFE4i4Ud5onqBtCoVHiqhyL5+n0a3lGuY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707830464; c=relaxed/simple; bh=FmvIhG9gIv0Yx6WYx650HxW0UcaYosnlTfhklJmt5D0=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=jFyOssl8E+OGnOwPcYl2e6PxbAlWfCFBVVEE3pE5WdUU8KWl4k3eSUhXezL+jTxLeoO8KjcVXSvitrFr5j5I0PfVF75vMGfYuvYsda3ykfLUYX1w599r2kofsHiyZe/LNXMNC/J0AT5FkpMfpVLZWKFV2Wtk09zrUQzFyDfKGRU= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com; spf=pass smtp.mailfrom=arm.com; arc=none smtp.client-ip=217.140.110.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1C4FADA7; Tue, 13 Feb 2024 05:21:41 -0800 (PST) Received: from [10.1.36.184] (XHFQ2J9959.cambridge.arm.com [10.1.36.184]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3E9C13F762; Tue, 13 Feb 2024 05:20:56 -0800 (PST) Message-ID: Date: Tue, 13 Feb 2024 13:20:55 +0000 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings Content-Language: en-GB To: David Hildenbrand , Mark Rutland Cc: Catalin Marinas , Will Deacon , Ard Biesheuvel , Marc Zyngier , James Morse , Andrey Ryabinin , Andrew Morton , Matthew Wilcox , Kefeng Wang , John Hubbard , Zi Yan , Barry Song <21cnbao@gmail.com>, Alistair Popple , Yang Shi , Nicholas Piggin , Christophe Leroy , "Aneesh Kumar K.V" , "Naveen N. Rao" , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , linux-arm-kernel@lists.infradead.org, x86@kernel.org, linuxppc-dev@lists.ozlabs.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20240202080756.1453939-1-ryan.roberts@arm.com> <20240202080756.1453939-20-ryan.roberts@arm.com> <64395ae4-3a7d-45dd-8f1d-ea6b232829c5@arm.com> <41499621-482f-455b-9f68-b43ea8052557@redhat.com> <1d302d7a-50ab-4ab4-b049-75ed4a71a87d@arm.com> <99e2a92c-f2a2-4e1e-8ce2-08caae2cb7e4@redhat.com> From: Ryan Roberts In-Reply-To: <99e2a92c-f2a2-4e1e-8ce2-08caae2cb7e4@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On 13/02/2024 13:13, David Hildenbrand wrote: > On 13.02.24 14:06, Ryan Roberts wrote: >> On 13/02/2024 12:19, David Hildenbrand wrote: >>> On 13.02.24 13:06, Ryan Roberts wrote: >>>> On 12/02/2024 20:38, Ryan Roberts wrote: >>>>> [...] >>>>> >>>>>>>>> +static inline bool mm_is_user(struct mm_struct *mm) >>>>>>>>> +{ >>>>>>>>> +    /* >>>>>>>>> +     * Don't attempt to apply the contig bit to kernel mappings, because >>>>>>>>> +     * dynamically adding/removing the contig bit can cause page faults. >>>>>>>>> +     * These racing faults are ok for user space, since they get >>>>>>>>> serialized >>>>>>>>> +     * on the PTL. But kernel mappings can't tolerate faults. >>>>>>>>> +     */ >>>>>>>>> +    return mm != &init_mm; >>>>>>>>> +} >>>>>>>> >>>>>>>> We also have the efi_mm as a non-user mm, though I don't think we >>>>>>>> manipulate >>>>>>>> that while it is live, and I'm not sure if that needs any special handling. >>>>>>> >>>>>>> Well we never need this function in the hot (order-0 folio) path, so I >>>>>>> think I >>>>>>> could add a check for efi_mm here with performance implication. It's >>>>>>> probably >>>>>>> safest to explicitly exclude it? What do you think? >>>>>> >>>>>> Oops: This should have read "I think I could add a check for efi_mm here >>>>>> *without* performance implication" >>>>> >>>>> It turns out that efi_mm is only defined when CONFIG_EFI is enabled. I can do >>>>> this: >>>>> >>>>> return mm != &init_mm && (!IS_ENABLED(CONFIG_EFI) || mm != &efi_mm); >>>>> >>>>> Is that acceptable? This is my preference, but nothing else outside of efi >>>>> references this symbol currently. >>>>> >>>>> Or perhaps I can convince myself that its safe to treat efi_mm like userspace. >>>>> There are a couple of things that need to be garanteed for it to be safe: >>>>> >>>>>     - The PFNs of present ptes either need to have an associated struct >>>>> page or >>>>>       need to have the PTE_SPECIAL bit set (either pte_mkspecial() or >>>>>       pte_mkdevmap()) >>>>> >>>>>     - Live mappings must either be static (no changes that could cause >>>>> fold/unfold >>>>>       while live) or the system must be able to tolerate a temporary fault >>>>> >>>>> Mark suggests efi_mm is not manipulated while live, so that meets the latter >>>>> requirement, but I'm not sure about the former? >>>> >>>> I've gone through all the efi code, and conclude that, as Mark suggests, the >>>> mappings are indeed static. And additionally, the ptes are populated using only >>>> the _private_ ptep API, so there is no issue here. As just discussed with Mark, >>>> my prefereence is to not make any changes to code, and just add a comment >>>> describing why efi_mm is safe. >>>> >>>> Details: >>>> >>>> * Registered with ptdump >>>>       * ptep_get_lockless() >>>> * efi_create_mapping -> create_pgd_mapping … -> init_pte: >>>>       * __ptep_get() >>>>       * __set_pte() >>>> * efi_memattr_apply_permissions -> efi_set_mapping_permissions … -> >>>> set_permissions >>>>       * __ptep_get() >>>>       * __set_pte() >>> >>> Sound good. We could add some VM_WARN_ON if we ever get the efi_mm via the >>> "official" APIs. >> >> We could, but that would lead to the same linkage issue, which I'm trying to >> avoid in the first place: >> >> VM_WARN_ON(IS_ENABLED(CONFIG_EFI) && mm == efi_mm); >> >> This creates new source code dependencies, which I would rather avoid if >> possible. > > Just a thought, you could have a is_efi_mm() function that abstracts all that. > > diff --git a/include/linux/efi.h b/include/linux/efi.h > index c74f47711f0b..152f5fa66a2a 100644 > --- a/include/linux/efi.h > +++ b/include/linux/efi.h > @@ -692,6 +692,15 @@ extern struct efi { >   >  extern struct mm_struct efi_mm; >   > +static inline void is_efi_mm(struct mm_struct *mm) > +{ > +#ifdef CONFIG_EFI > +       return mm == &efi_mm; > +#else > +       return false; > +#endif > +} > + >  static inline int >  efi_guidcmp (efi_guid_t left, efi_guid_t right) >  { > > That would definitely work, but in that case, I might as well just check for it in mm_is_user() (and personally I would change the name to mm_is_efi()): static inline bool mm_is_user(struct mm_struct *mm) { return mm != &init_mm && !mm_is_efi(mm); } Any objections?