Date: Mon, 3 Jul 2023 10:48:27 +0100
Subject: Re: [PATCH v1 11/14] arm64/mm: Wire up PTE_CONT for user mappings
To: John Hubbard, Catalin Marinas, Will Deacon, Ard Biesheuvel,
 Marc Zyngier, Oliver Upton, James Morse, Suzuki K Poulose, Zenghui Yu,
 Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
 Vincenzo Frascino, Andrew Morton, Anshuman Khandual, Matthew Wilcox,
 Yu Zhao
, Mark Rutland
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20230622144210.2623299-1-ryan.roberts@arm.com>
 <20230622144210.2623299-12-ryan.roberts@arm.com>
From: Ryan Roberts

On 30/06/2023 02:54, John Hubbard wrote:
> On 6/22/23 07:42, Ryan Roberts wrote:
>> With the ptep API sufficiently refactored, we can now introduce a new
>> "contpte" API layer, which transparently manages the PTE_CONT bit for
>> user mappings. Whenever it detects a set of PTEs that meet the
>> requirements for a contiguous range, the PTEs are re-painted with the
>> PTE_CONT bit.
>>
>> This initial change provides a baseline that can be optimized in future
>> commits. That said, fold/unfold operations (which imply tlb
>> invalidation) are avoided where possible with a few tricks for
>> access/dirty bit management.
>>
>> Write-enable and write-protect modifications are likely non-optimal and
>> likely incur a regression in fork() performance. This will be addressed
>> separately.
>>
>> Signed-off-by: Ryan Roberts
>> ---
>
> Hi Ryan!
>
> While trying out the full series from your gitlab features/granule_perf/all
> branch, I found it necessary to EXPORT a symbol in order to build this.

Thanks for the bug report!

> Please see below:
>
> ...
>> +
>> +pte_t contpte_ptep_get(pte_t *ptep, pte_t orig_pte)
>> +{
>> +    /*
>> +     * Gather access/dirty bits, which may be populated in any of the ptes
>> +     * of the contig range.
>> +     * We are guaranteed to be holding the PTL, so any
>> +     * contiguous range cannot be unfolded or otherwise modified under our
>> +     * feet.
>> +     */
>> +
>> +    pte_t pte;
>> +    int i;
>> +
>> +    ptep = contpte_align_down(ptep);
>> +
>> +    for (i = 0; i < CONT_PTES; i++, ptep++) {
>> +        pte = __ptep_get(ptep);
>> +
>> +        /*
>> +         * Deal with the partial contpte_ptep_get_and_clear_full() case,
>> +         * where some of the ptes in the range may be cleared but others
>> +         * are still to do. See contpte_ptep_get_and_clear_full().
>> +         */
>> +        if (pte_val(pte) == 0)
>> +            continue;
>> +
>> +        if (pte_dirty(pte))
>> +            orig_pte = pte_mkdirty(orig_pte);
>> +
>> +        if (pte_young(pte))
>> +            orig_pte = pte_mkyoung(orig_pte);
>> +    }
>> +
>> +    return orig_pte;
>> +}
>
> Here we need something like this, in order to get it to build in all
> possible configurations:
>
> EXPORT_SYMBOL_GPL(contpte_ptep_get);
>
> (and a corresponding "#include <linux/export.h>" at the top of the file).
>
> Because, the static inline functions invoke this routine, above.

A quick grep through the drivers directory shows:

ptep_get() is used by:
  - drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c
  - drivers/misc/sgi-gru/grufault.c
  - drivers/vfio/vfio_iommu_type1.c
  - drivers/xen/privcmd.c

set_pte_at() is used by:
  - drivers/gpu/drm/i915/i915_mm.c
  - drivers/xen/xlate_mmu.c

None of the other symbols are called, but I guess it is possible that
out-of-tree modules are calling others. So on the basis that these symbols
were previously pure inline, I propose to export all the contpte_* symbols
using EXPORT_SYMBOL() so that anything that was previously calling them
successfully can continue to do so. Will include in v2.

Thanks,
Ryan

>
> thanks,
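
For anyone following the thread without the tree to hand, the gather loop in
the quoted contpte_ptep_get() can be modelled in plain userspace C. This is
only a sketch under stated assumptions: the pte_t layout, the MODEL_YOUNG and
MODEL_DIRTY bit positions, the CONT_PTES value of 16, the helper bodies and
the model_* names are all hypothetical stand-ins chosen for illustration, not
the arm64 definitions. Unlike the kernel helper, the model takes the table
base explicitly so that the align-down step can be written as index
arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model values; the real arm64 definitions differ. */
#define CONT_PTES   16
#define MODEL_YOUNG (1ULL << 0)
#define MODEL_DIRTY (1ULL << 1)

typedef struct { uint64_t val; } pte_t;

static uint64_t pte_val(pte_t pte) { return pte.val; }
static int pte_dirty(pte_t pte) { return (pte.val & MODEL_DIRTY) != 0; }
static int pte_young(pte_t pte) { return (pte.val & MODEL_YOUNG) != 0; }
static pte_t pte_mkdirty(pte_t pte) { pte.val |= MODEL_DIRTY; return pte; }
static pte_t pte_mkyoung(pte_t pte) { pte.val |= MODEL_YOUNG; return pte; }

/*
 * Model of the align-down step: round the entry's index within the table
 * down to a multiple of CONT_PTES, i.e. the first entry of its block.
 */
static pte_t *model_align_down(pte_t *table, pte_t *ptep)
{
    size_t idx = (size_t)(ptep - table);

    return table + (idx / CONT_PTES) * CONT_PTES;
}

/*
 * Model of the gather loop: OR the dirty and young bits found anywhere in
 * the block into orig_pte, skipping entries that are already cleared.
 */
static pte_t model_contpte_ptep_get(pte_t *table, pte_t *ptep, pte_t orig_pte)
{
    int i;

    ptep = model_align_down(table, ptep);

    for (i = 0; i < CONT_PTES; i++, ptep++) {
        pte_t pte = *ptep;

        if (pte_val(pte) == 0)
            continue;

        if (pte_dirty(pte))
            orig_pte = pte_mkdirty(orig_pte);

        if (pte_young(pte))
            orig_pte = pte_mkyoung(orig_pte);
    }

    return orig_pte;
}
```

In this model, reading any entry of a block reports dirty/young if any sibling
entry in the same block carries those bits, while entries in a neighbouring
block are unaffected.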