From: Axel Rasmussen
Date: Mon, 10 Jul 2023 10:19:54 -0700
Subject: Re: [PATCH v4 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general
To: Andrew Morton
Cc: Alexander Viro, Brian Geffon, Christian Brauner, David Hildenbrand,
    Gaosheng Cui, Huang Ying, Hugh Dickins, James Houghton,
    "Jan Alexander Steffens (heftig)", Jiaqi Yan, Jonathan Corbet,
    Kefeng Wang, "Liam R. Howlett", Miaohe Lin, Mike Kravetz,
    "Mike Rapoport (IBM)", Muchun Song, Nadav Amit, Naoya Horiguchi,
    Peter Xu, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
    Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kselftest@vger.kernel.org
References: <20230707215540.2324998-1-axelrasmussen@google.com>
    <20230707215540.2324998-2-axelrasmussen@google.com>
    <20230708180850.bc938ab49fbfb38b83c367c8@linux-foundation.org>
In-Reply-To: <20230708180850.bc938ab49fbfb38b83c367c8@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jul 8, 2023 at 6:08 PM Andrew Morton wrote:
>
> On Fri, 7 Jul 2023 14:55:33 -0700 Axel Rasmussen wrote:
>
> > Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
> > UFFDIO_POISON, so make some various preparations for that:
> >
> > First, rename it to just PTE_MARKER_POISONED. The "SWAPIN" can be
> > confusing since we're going to re-use it for something not really
> > related to swap. This can be particularly confusing for things like
> > hugetlbfs, which doesn't support swap whatsoever. Also rename some
> > various helper functions.
> >
> > Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
> > seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
> > But, since we're going to re-use it, we want it to go ahead and copy it
> > just like non-hugetlbfs memory does today. Since the code to do this is
> > more complicated now, pull it out into a helper which can be re-used in
> > both places. While we're at it, also make it slightly more explicit in
> > its handling of e.g. uffd wp markers.
> >
> > For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
> > an error entry, return VM_FAULT_HWPOISON. For most cases this change
> > doesn't matter, e.g.
> > a userspace program would receive a SIGBUS either
> > way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
> > out of the box, instead of giving a SIGBUS to the hypervisor and
> > requiring it to somehow inject an MCE.
> >
> > Finally, for hugetlbfs faults, handle PTE_MARKER_POISONED, and return
> > VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
> > because the lack of swap support means we'll never end up with such a
> > PTE anyway, but this behavior will be needed once such entries *can*
> > show up via UFFDIO_POISON.
> >
> > --- a/include/linux/mm_inline.h
> > +++ b/include/linux/mm_inline.h
> > @@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
> >       return atomic_read(&mm->tlb_flush_pending) > 1;
> >  }
> >
> > +/*
> > + * Computes the pte marker to copy from the given source entry into dst_vma.
> > + * If no marker should be copied, returns 0.
> > + * The caller should insert a new pte created with make_pte_marker().
> > + */
> > +static inline pte_marker copy_pte_marker(
> > +             swp_entry_t entry, struct vm_area_struct *dst_vma)
> > +{
> > +     pte_marker srcm = pte_marker_get(entry);
> > +     /* Always copy error entries. */
> > +     pte_marker dstm = srcm & PTE_MARKER_POISONED;
> > +
> > +     /* Only copy PTE markers if UFFD register matches. */
> > +     if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
> > +             dstm |= PTE_MARKER_UFFD_WP;
> > +
> > +     return dstm;
> > +}
>
> Breaks the build with CONFIG_MMU=n (arm allnoconfig).  pte_marker isn't
> defined.
>
> I'll slap #ifdef CONFIG_MMU around this function, but probably something more
> fine-grained could be used, like CONFIG_PTE_MARKER_UFFD_WP.  Please
> consider.

Whoops, sorry about this. This function "ought" to be in
include/linux/swapops.h where it would be inside a #ifdef CONFIG_MMU
anyway, but it can't be because it uses userfaultfd_wp() so there'd be
a circular include.
I think just wrapping it in CONFIG_MMU is the right way.

But, this has also made me realize we need to not advertise
UFFDIO_POISON as supported unless we have CONFIG_MMU. I don't want
HAVE_ARCH_USERFAULTFD_WP for that, because it's only enabled on
x86_64, whereas I want to support at least arm64 as well. I don't see
a strong reason not to just use CONFIG_MMU for this too; this feature
depends on the API in swapops.h, which uses that ifdef, so I don't see
a lot of value out of creating a new but equivalent config option.

I'll make the needed changes (and also address Peter's comment above)
and send out a v5.

>
> btw, both copy_pte_marker() and pte_install_uffd_wp_if_needed() look
> far too large to justify inlining.  Please review the desirability of
> this.
>
>
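
For reference, a minimal user-space sketch of the marker-copy logic
quoted above. This is an illustration only, not the kernel code: the
bit values, the vma_model struct, and the *_model helper names are all
assumptions made up for the example, and the entry is passed as a bare
marker value rather than a swp_entry_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel definitions; bit values are assumed. */
typedef uint64_t pte_marker;
#define PTE_MARKER_UFFD_WP  ((pte_marker)1 << 0)
#define PTE_MARKER_POISONED ((pte_marker)1 << 1)

/* Toy model of a VMA: only records whether uffd write-protect is registered. */
struct vma_model {
	bool uffd_wp;
};

/* Models userfaultfd_wp(): true if the VMA has uffd-wp registered. */
static bool vma_model_uffd_wp(const struct vma_model *vma)
{
	return vma->uffd_wp;
}

/*
 * Models copy_pte_marker() from the quoted patch: returns the marker bits
 * to install in the destination, or 0 if nothing should be copied.
 */
static pte_marker copy_pte_marker_model(pte_marker srcm,
					const struct vma_model *dst_vma)
{
	/* Always copy error (poison) entries. */
	pte_marker dstm = srcm & PTE_MARKER_POISONED;

	/* Only copy the uffd-wp bit if the destination is registered for it. */
	if ((srcm & PTE_MARKER_UFFD_WP) && vma_model_uffd_wp(dst_vma))
		dstm |= PTE_MARKER_UFFD_WP;

	return dstm;
}
```

In this model the poison bit always survives a copy, while the uffd-wp
bit survives only when the destination is registered for write-protect.
In the kernel the real helper would additionally sit inside the
#ifdef CONFIG_MMU discussed above.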