Date: Sun, 28 Feb 2021 13:50:58 -0800
From: Fangrui Song
To: Bill Wendling
Cc: Jonathan Corbet, Masahiro Yamada, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org, clang-built-linux@googlegroups.com, Andrew Morton, Nathan Chancellor, Nick Desaulniers, Sami Tolvanen
Subject: Re: [PATCH v8] pgo: add clang's Profile Guided Optimization infrastructure
Message-ID: <20210228215058.r5425p6zidwolhw7@google.com>
References: <20210122101156.3257143-1-morbo@google.com> <20210226222030.3718075-1-morbo@google.com> <20210228185214.sdmpytoh37nyvwgm@google.com>
In-Reply-To: <20210228185214.sdmpytoh37nyvwgm@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021-02-28, Fangrui Song wrote:
>Reviewed-by: Fangrui Song
>
>Some minor items below:
>
>On 2021-02-26, 'Bill Wendling' via Clang Built Linux wrote:
>>From: Sami Tolvanen
>>
>>Enable the use of clang's Profile-Guided Optimization[1]. To generate a
>>profile, the kernel is instrumented with PGO counters, a representative
>>workload is run, and the raw profile data is collected from
>>/sys/kernel/debug/pgo/profraw.
>>
>>The raw profile data must be processed by clang's "llvm-profdata" tool
>>before it can be used during recompilation:
>>
>>  $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
>>  $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>>
>>Multiple raw profiles may be merged during this step.
>>
>>The data can now be used by the compiler:
>>
>>  $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
>>
>>This initial submission is restricted to x86, as that's the platform we
>>know works. This restriction can be lifted once other platforms have
>>been verified to work with PGO.
>>
>>Note that this method of profiling the kernel is clang-native, unlike
>>the clang support in kernel/gcov.
>>
>>[1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
>>
>>Signed-off-by: Sami Tolvanen
>>Co-developed-by: Bill Wendling
>>Signed-off-by: Bill Wendling
>>---
>>v8: - Rebased on top-of-tree.
>>v7: - Fix minor build failure reported by Sedat.
>>v6: - Add better documentation about the locking scheme and other things.
>>    - Rename macros to better match the same macros in LLVM's source code.
>>v5: - Correct padding calculation, discovered by Nathan Chancellor.
>>v4: - Remove non-x86 Makefile changes and use "hweight64" instead of using our
>>      own popcount implementation, based on Nick Desaulniers's comment.
>>v3: - Added change log section based on Sedat Dilek's comments.
>>v2: - Added "__llvm_profile_instrument_memop" based on Nathan Chancellor's
>>      testing.
>>    - Corrected documentation, re PGO flags when using LTO, based on Fangrui
>>      Song's comments.
>>--- >>Documentation/dev-tools/index.rst | 1 + >>Documentation/dev-tools/pgo.rst | 127 +++++++++ >>MAINTAINERS | 9 + >>Makefile | 3 + >>arch/Kconfig | 1 + >>arch/x86/Kconfig | 1 + >>arch/x86/boot/Makefile | 1 + >>arch/x86/boot/compressed/Makefile | 1 + >>arch/x86/crypto/Makefile | 4 + >>arch/x86/entry/vdso/Makefile | 1 + >>arch/x86/kernel/vmlinux.lds.S | 2 + >>arch/x86/platform/efi/Makefile | 1 + >>arch/x86/purgatory/Makefile | 1 + >>arch/x86/realmode/rm/Makefile | 1 + >>arch/x86/um/vdso/Makefile | 1 + >>drivers/firmware/efi/libstub/Makefile | 1 + >>include/asm-generic/vmlinux.lds.h | 44 +++ >>kernel/Makefile | 1 + >>kernel/pgo/Kconfig | 35 +++ >>kernel/pgo/Makefile | 5 + >>kernel/pgo/fs.c | 389 ++++++++++++++++++++++++++ >>kernel/pgo/instrument.c | 189 +++++++++++++ >>kernel/pgo/pgo.h | 203 ++++++++++++++ >>scripts/Makefile.lib | 10 + >>24 files changed, 1032 insertions(+) >>create mode 100644 Documentation/dev-tools/pgo.rst >>create mode 100644 kernel/pgo/Kconfig >>create mode 100644 kernel/pgo/Makefile >>create mode 100644 kernel/pgo/fs.c >>create mode 100644 kernel/pgo/instrument.c >>create mode 100644 kernel/pgo/pgo.h >> >>diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst >>index f7809c7b1ba9..8d6418e85806 100644 >>--- a/Documentation/dev-tools/index.rst >>+++ b/Documentation/dev-tools/index.rst >>@@ -26,6 +26,7 @@ whole; patches welcome! >> kgdb >> kselftest >> kunit/index >>+ pgo >> >> >>.. only:: subproject and html >>diff --git a/Documentation/dev-tools/pgo.rst b/Documentation/dev-tools/pgo.rst >>new file mode 100644 >>index 000000000000..b7f11d8405b7 >>--- /dev/null >>+++ b/Documentation/dev-tools/pgo.rst >>@@ -0,0 +1,127 @@ >>+.. SPDX-License-Identifier: GPL-2.0 >>+ >>+=============================== >>+Using PGO with the Linux kernel >>+=============================== >>+ >>+Clang's profiling kernel support (PGO_) enables profiling of the Linux kernel >>+when building with Clang. 
The profiling data is exported via the ``pgo`` >>+debugfs directory. >>+ >>+.. _PGO: https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization >>+ >>+ >>+Preparation >>+=========== >>+ >>+Configure the kernel with: >>+ >>+.. code-block:: make >>+ >>+ CONFIG_DEBUG_FS=y >>+ CONFIG_PGO_CLANG=y >>+ >>+Note that kernels compiled with profiling flags will be significantly larger >>+and run slower. >>+ >>+Profiling data will only become accessible once debugfs has been mounted: >>+ >>+.. code-block:: sh >>+ >>+ mount -t debugfs none /sys/kernel/debug >>+ >>+ >>+Customization >>+============= >>+ >>+You can enable or disable profiling for individual file and directories by >>+adding a line similar to the following to the respective kernel Makefile: >>+ >>+- For a single file (e.g. main.o) >>+ >>+ .. code-block:: make >>+ >>+ PGO_PROFILE_main.o := y >>+ >>+- For all files in one directory >>+ >>+ .. code-block:: make >>+ >>+ PGO_PROFILE := y >>+ >>+To exclude files from being profiled use >>+ >>+ .. code-block:: make >>+ >>+ PGO_PROFILE_main.o := n >>+ >>+and >>+ >>+ .. code-block:: make >>+ >>+ PGO_PROFILE := n >>+ >>+Only files which are linked to the main kernel image or are compiled as kernel >>+modules are supported by this mechanism. >>+ >>+ >>+Files >>+===== >>+ >>+The PGO kernel support creates the following files in debugfs: >>+ >>+``/sys/kernel/debug/pgo`` >>+ Parent directory for all PGO-related files. >>+ >>+``/sys/kernel/debug/pgo/reset`` >>+ Global reset file: resets all coverage data to zero when written to. >>+ >>+``/sys/kernel/debug/profraw`` >>+ The raw PGO data that must be processed with ``llvm_profdata``. >>+ >>+ >>+Workflow >>+======== >>+ >>+The PGO kernel can be run on the host or test machines. The data though should >>+be analyzed with Clang's tools from the same Clang version as the kernel was >>+compiled. Clang's tolerant of version skew, but it's easier to use the same >>+Clang version. 
>>+ >>+The profiling data is useful for optimizing the kernel, analyzing coverage, >>+etc. Clang offers tools to perform these tasks. >>+ >>+Here is an example workflow for profiling an instrumented kernel with PGO and >>+using the result to optimize the kernel: >>+ >>+1) Install the kernel on the TEST machine. >>+ >>+2) Reset the data counters right before running the load tests >>+ >>+ .. code-block:: sh >>+ >>+ $ echo 1 > /sys/kernel/debug/pgo/reset >>+ >>+3) Run the load tests. >>+ >>+4) Collect the raw profile data >>+ >>+ .. code-block:: sh >>+ >>+ $ cp -a /sys/kernel/debug/pgo/profraw /tmp/vmlinux.profraw >>+ >>+5) (Optional) Download the raw profile data to the HOST machine. >>+ >>+6) Process the raw profile data >>+ >>+ .. code-block:: sh >>+ >>+ $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw >>+ >>+ Note that multiple raw profile data files can be merged during this step. >>+ >>+7) Rebuild the kernel using the profile data (PGO disabled) >>+ >>+ .. code-block:: sh >>+ >>+ $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ... 
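
For step 6, a minimal sketch of merging raw profiles from several workload
runs, written as a small shell helper. The helper name merge_profiles and the
run1/run2 file names are hypothetical and not part of the patch; the only tool
behaviour assumed is that "llvm-profdata merge" accepts more than one input
file:

```sh
#!/bin/sh
# merge_profiles OUT IN...
# Hypothetical helper (not part of the patch): merge one or more raw
# profiles (IN...) into a single indexed profile (OUT) that can then be
# passed to the compiler via -fprofile-use=OUT.
merge_profiles() {
    if [ "$#" -lt 2 ]; then
        echo "usage: merge_profiles OUT IN..." >&2
        return 1
    fi
    out=$1
    shift
    # All remaining arguments are raw profile files to merge.
    llvm-profdata merge --output="$out" "$@"
}
```

For example, "merge_profiles vmlinux.profdata run1.profraw run2.profraw"
would produce the vmlinux.profdata used by the rebuild in step 7.
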
>>diff --git a/MAINTAINERS b/MAINTAINERS >>index c71664ca8bfd..3a6668792bc5 100644 >>--- a/MAINTAINERS >>+++ b/MAINTAINERS >>@@ -14019,6 +14019,15 @@ S: Maintained >>F: include/linux/personality.h >>F: include/uapi/linux/personality.h >> >>+PGO BASED KERNEL PROFILING >>+M: Sami Tolvanen >>+M: Bill Wendling >>+R: Nathan Chancellor >>+R: Nick Desaulniers >>+S: Supported >>+F: Documentation/dev-tools/pgo.rst >>+F: kernel/pgo >>+ >>PHOENIX RC FLIGHT CONTROLLER ADAPTER >>M: Marcus Folkesson >>L: linux-input@vger.kernel.org >>diff --git a/Makefile b/Makefile >>index 6ecd0d22e608..b57d4d44c799 100644 >>--- a/Makefile >>+++ b/Makefile >>@@ -657,6 +657,9 @@ endif # KBUILD_EXTMOD >># Defaults to vmlinux, but the arch makefile usually adds further targets >>all: vmlinux >> >>+CFLAGS_PGO_CLANG := -fprofile-generate >>+export CFLAGS_PGO_CLANG >>+ >>CFLAGS_GCOV := -fprofile-arcs -ftest-coverage \ >> $(call cc-option,-fno-tree-loop-im) \ >> $(call cc-disable-warning,maybe-uninitialized,) >>diff --git a/arch/Kconfig b/arch/Kconfig >>index 2bb30673d8e6..111e642a2af7 100644 >>--- a/arch/Kconfig >>+++ b/arch/Kconfig >>@@ -1192,6 +1192,7 @@ config ARCH_HAS_ELFCORE_COMPAT >> bool >> >>source "kernel/gcov/Kconfig" >>+source "kernel/pgo/Kconfig" >> >>source "scripts/gcc-plugins/Kconfig" >> >>diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig >>index cd4b9b1204a8..c9808583b528 100644 >>--- a/arch/x86/Kconfig >>+++ b/arch/x86/Kconfig >>@@ -99,6 +99,7 @@ config X86 >> select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096 >> select ARCH_SUPPORTS_LTO_CLANG if X86_64 >> select ARCH_SUPPORTS_LTO_CLANG_THIN if X86_64 >>+ select ARCH_SUPPORTS_PGO_CLANG if X86_64 >> select ARCH_USE_BUILTIN_BSWAP >> select ARCH_USE_QUEUED_RWLOCKS >> select ARCH_USE_QUEUED_SPINLOCKS >>diff --git a/arch/x86/boot/Makefile b/arch/x86/boot/Makefile >>index fe605205b4ce..383853e32f67 100644 >>--- a/arch/x86/boot/Makefile >>+++ b/arch/x86/boot/Makefile >>@@ -71,6 +71,7 @@ KBUILD_AFLAGS := $(KBUILD_CFLAGS) 
-D__ASSEMBLY__ >>KBUILD_CFLAGS += $(call cc-option,-fmacro-prefix-map=$(srctree)/=) >>KBUILD_CFLAGS += -fno-asynchronous-unwind-tables >>GCOV_PROFILE := n >>+PGO_PROFILE := n >>UBSAN_SANITIZE := n >> >>$(obj)/bzImage: asflags-y := $(SVGA_MODE) >>diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile >>index e0bc3988c3fa..ed12ab65f606 100644 >>--- a/arch/x86/boot/compressed/Makefile >>+++ b/arch/x86/boot/compressed/Makefile >>@@ -54,6 +54,7 @@ CFLAGS_sev-es.o += -I$(objtree)/arch/x86/lib/ >> >>KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__ >>GCOV_PROFILE := n >>+PGO_PROFILE := n >>UBSAN_SANITIZE :=n >> >>KBUILD_LDFLAGS := -m elf_$(UTS_MACHINE) >>diff --git a/arch/x86/crypto/Makefile b/arch/x86/crypto/Makefile >>index b28e36b7c96b..4b2e9620c412 100644 >>--- a/arch/x86/crypto/Makefile >>+++ b/arch/x86/crypto/Makefile >>@@ -4,6 +4,10 @@ >> >>OBJECT_FILES_NON_STANDARD := y >> >>+# Disable PGO for curve25519-x86_64. With PGO enabled, clang runs out of >>+# registers for some of the functions. 
>>+PGO_PROFILE_curve25519-x86_64.o := n >>+ >>obj-$(CONFIG_CRYPTO_TWOFISH_586) += twofish-i586.o >>twofish-i586-y := twofish-i586-asm_32.o twofish_glue.o >>obj-$(CONFIG_CRYPTO_TWOFISH_X86_64) += twofish-x86_64.o >>diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile >>index 05c4abc2fdfd..f7421e44725a 100644 >>--- a/arch/x86/entry/vdso/Makefile >>+++ b/arch/x86/entry/vdso/Makefile >>@@ -180,6 +180,7 @@ quiet_cmd_vdso = VDSO $@ >>VDSO_LDFLAGS = -shared --hash-style=both --build-id=sha1 \ >> $(call ld-option, --eh-frame-hdr) -Bsymbolic >>GCOV_PROFILE := n >>+PGO_PROFILE := n >> >>quiet_cmd_vdso_and_check = VDSO $@ >> cmd_vdso_and_check = $(cmd_vdso); $(cmd_vdso_check) >>diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S >>index efd9e9ea17f2..f6cab2316c46 100644 >>--- a/arch/x86/kernel/vmlinux.lds.S >>+++ b/arch/x86/kernel/vmlinux.lds.S >>@@ -184,6 +184,8 @@ SECTIONS >> >> BUG_TABLE >> >>+ PGO_CLANG_DATA >>+ >> ORC_UNWIND_TABLE >> >> . = ALIGN(PAGE_SIZE); >>diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile >>index 84b09c230cbd..5f22b31446ad 100644 >>--- a/arch/x86/platform/efi/Makefile >>+++ b/arch/x86/platform/efi/Makefile >>@@ -2,6 +2,7 @@ >>OBJECT_FILES_NON_STANDARD_efi_thunk_$(BITS).o := y >>KASAN_SANITIZE := n >>GCOV_PROFILE := n >>+PGO_PROFILE := n >> >>obj-$(CONFIG_EFI) += quirks.o efi.o efi_$(BITS).o efi_stub_$(BITS).o >>obj-$(CONFIG_EFI_MIXED) += efi_thunk_$(BITS).o >>diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile >>index 95ea17a9d20c..36f20e99da0b 100644 >>--- a/arch/x86/purgatory/Makefile >>+++ b/arch/x86/purgatory/Makefile >>@@ -23,6 +23,7 @@ targets += purgatory.ro purgatory.chk >> >># Sanitizer, etc. runtimes are unavailable and cannot be linked here. 
>>GCOV_PROFILE := n >>+PGO_PROFILE := n >>KASAN_SANITIZE := n >>UBSAN_SANITIZE := n >>KCSAN_SANITIZE := n >>diff --git a/arch/x86/realmode/rm/Makefile b/arch/x86/realmode/rm/Makefile >>index 83f1b6a56449..21797192f958 100644 >>--- a/arch/x86/realmode/rm/Makefile >>+++ b/arch/x86/realmode/rm/Makefile >>@@ -76,4 +76,5 @@ KBUILD_CFLAGS := $(REALMODE_CFLAGS) -D_SETUP -D_WAKEUP \ >>KBUILD_AFLAGS := $(KBUILD_CFLAGS) -D__ASSEMBLY__ >>KBUILD_CFLAGS += -fno-asynchronous-unwind-tables >>GCOV_PROFILE := n >>+PGO_PROFILE := n >>UBSAN_SANITIZE := n >>diff --git a/arch/x86/um/vdso/Makefile b/arch/x86/um/vdso/Makefile >>index 5943387e3f35..54f5768f5853 100644 >>--- a/arch/x86/um/vdso/Makefile >>+++ b/arch/x86/um/vdso/Makefile >>@@ -64,6 +64,7 @@ quiet_cmd_vdso = VDSO $@ >> >>VDSO_LDFLAGS = -fPIC -shared -Wl,--hash-style=sysv >>GCOV_PROFILE := n >>+PGO_PROFILE := n >> >># >># Install the unstripped copy of vdso*.so listed in $(vdso-install-y). >>diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile >>index c23466e05e60..724fb389bb9d 100644 >>--- a/drivers/firmware/efi/libstub/Makefile >>+++ b/drivers/firmware/efi/libstub/Makefile >>@@ -42,6 +42,7 @@ KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_SCS), $(KBUILD_CFLAGS)) >>KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS)) >> >>GCOV_PROFILE := n >>+PGO_PROFILE := n >># Sanitizer runtimes are unavailable and cannot be linked here. >>KASAN_SANITIZE := n >>KCSAN_SANITIZE := n >>diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h >>index 6786f8c0182f..4a0c21b840b3 100644 >>--- a/include/asm-generic/vmlinux.lds.h >>+++ b/include/asm-generic/vmlinux.lds.h >>@@ -329,6 +329,49 @@ >>#define DTPM_TABLE() >>#endif >> >>+#ifdef CONFIG_PGO_CLANG >>+#define PGO_CLANG_DATA \ >>+ __llvm_prf_data : AT(ADDR(__llvm_prf_data) - LOAD_OFFSET) { \ >>+ . = ALIGN(8); \ >>+ __llvm_prf_start = .; \ >>+ __llvm_prf_data_start = .; \ >>+ KEEP(*(__llvm_prf_data)) \ >>+ . 
= ALIGN(8); \
>>+    __llvm_prf_data_end = .; \
>>+  } \
>
>Some minor items on linker script usage. The end of a metadata section
>usually does not need alignment. Does the . = ALIGN(8) have
>significance? Ditto below.
>
>
>
>This is an item about LD_DEAD_CODE_DATA_ELIMINATION. Feel free to
>postpone after this patch is in tree:
>
>  KEEP(*(__llvm_prf_data))
>
>KEEP should be dropped.
>
>I have been involved in improving GC (my recent interests on such
>metadata sections :)
>https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order)
>
>With LLVM>=13 (https://reviews.llvm.org/D96757), __llvm_prf_* associated
>to non-COMDAT text sections can be GCed as well. KEEP would
>unnecessarily retain them under LD_DEAD_CODE_DATA_ELIMINATION.
>
>For older releases (at least 10<=LLVM<13), such __llvm_prf_* sections
>are not in zero flag section groups so they usually cannot be discarded.
>So perhaps with KEEP or without KEEP, you won't find many size
>differences.
>
>>+  __llvm_prf_cnts : AT(ADDR(__llvm_prf_cnts) - LOAD_OFFSET) { \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_cnts_start = .; \
>>+    KEEP(*(__llvm_prf_cnts)) \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_cnts_end = .; \
>>+  } \
>>+  __llvm_prf_names : AT(ADDR(__llvm_prf_names) - LOAD_OFFSET) { \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_names_start = .; \
>>+    KEEP(*(__llvm_prf_names)) \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_names_end = .; \
>>+    . = ALIGN(8); \
>>+  } \
>
>__llvm_prf_names does not need alignment.
>It is often 1 in userspace programs.
>
>>+  __llvm_prf_vals : AT(ADDR(__llvm_prf_vals) - LOAD_OFFSET) { \
>>+    __llvm_prf_vals_start = .; \
>>+    KEEP(*(__llvm_prf_vals)) \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_vals_end = .; \
>>+    . = ALIGN(8); \
>>+  } \
>>+  __llvm_prf_vnds : AT(ADDR(__llvm_prf_vnds) - LOAD_OFFSET) { \
>>+    __llvm_prf_vnds_start = .; \
>>+    KEEP(*(__llvm_prf_vnds)) \
>>+    . = ALIGN(8); \
>>+    __llvm_prf_vnds_end = .; \
>>+    __llvm_prf_end = .; \
>>+  }
>
>In userspace PGO instrumentation, the start is often aligned by 16.
>The end does not need alignment.

Thinking more, my suggestion is to drop the explicit alignment annotations
entirely:

  __llvm_prf_vals : AT(ADDR(__llvm_prf_vals) - LOAD_OFFSET) { \
    __llvm_prf_vals_start = .; \
    *(__llvm_prf_vals) \
    __llvm_prf_vals_end = .; \
  } \
  __llvm_prf_vnds : AT(ADDR(__llvm_prf_vnds) - LOAD_OFFSET) { \
    __llvm_prf_vnds_start = .; \
    *(__llvm_prf_vnds) \
    __llvm_prf_vnds_end = .; \
    __llvm_prf_end = .; \
  }
  // _cnts, _names and _data are similar. Just delete all ALIGN.
  // I deleted KEEP above to facilitate --gc-sections as well.

Let the linker derive the output section alignment from the alignments of
the input sections
(https://lld.llvm.org/ELF/linker_script.html#output-section-alignment).

Omitting the alignment is probably preferable in most cases. The exception
is an output section with no input sections (either none emitted at all, or
all discarded by ld --gc-sections). That is a very rare event: it happened
with commit 793f49a87aae ("firmware_loader: align .builtin_fw to 8"), but it
is unlikely to happen with PGO.

>
>>+#else
>>+#define PGO_CLANG_DATA
>>+#endif
>>+
>>#define KERNEL_DTB() \
>>  STRUCT_ALIGN(); \
>>  __dtb_start = .; \
>>@@ -1105,6 +1148,7 @@
>>  CONSTRUCTORS \
>>  } \
>>  BUG_TABLE \
>>+ PGO_CLANG_DATA
>>
>>#define INIT_TEXT_SECTION(inittext_align) \
>> . 
= ALIGN(inittext_align); \ >>diff --git a/kernel/Makefile b/kernel/Makefile >>index 320f1f3941b7..a2a23ef2b12f 100644 >>--- a/kernel/Makefile >>+++ b/kernel/Makefile >>@@ -111,6 +111,7 @@ obj-$(CONFIG_BPF) += bpf/ >>obj-$(CONFIG_KCSAN) += kcsan/ >>obj-$(CONFIG_SHADOW_CALL_STACK) += scs.o >>obj-$(CONFIG_HAVE_STATIC_CALL_INLINE) += static_call.o >>+obj-$(CONFIG_PGO_CLANG) += pgo/ >> >>obj-$(CONFIG_PERF_EVENTS) += events/ >> >>diff --git a/kernel/pgo/Kconfig b/kernel/pgo/Kconfig >>new file mode 100644 >>index 000000000000..76a640b6cf6e >>--- /dev/null >>+++ b/kernel/pgo/Kconfig >>@@ -0,0 +1,35 @@ >>+# SPDX-License-Identifier: GPL-2.0-only >>+menu "Profile Guided Optimization (PGO) (EXPERIMENTAL)" >>+ >>+config ARCH_SUPPORTS_PGO_CLANG >>+ bool >>+ >>+config PGO_CLANG >>+ bool "Enable clang's PGO-based kernel profiling" >>+ depends on DEBUG_FS >>+ depends on ARCH_SUPPORTS_PGO_CLANG >>+ depends on CC_IS_CLANG && CLANG_VERSION >= 120000 >>+ help >>+ This option enables clang's PGO (Profile Guided Optimization) based >>+ code profiling to better optimize the kernel. >>+ >>+ If unsure, say N. >>+ >>+ Run a representative workload for your application on a kernel >>+ compiled with this option and download the raw profile file from >>+ /sys/kernel/debug/pgo/profraw. This file needs to be processed with >>+ llvm-profdata. It may be merged with other collected raw profiles. >>+ >>+ Copy the resulting profile file into vmlinux.profdata, and enable >>+ KCFLAGS=-fprofile-use=vmlinux.profdata to produce an optimized >>+ kernel. >>+ >>+ Note that a kernel compiled with profiling flags will be >>+ significantly larger and run slower. Also be sure to exclude files >>+ from profiling which are not linked to the kernel image to prevent >>+ linker errors. >>+ >>+ Note that the debugfs filesystem has to be mounted to access >>+ profiling data. 
>>+ >>+endmenu >>diff --git a/kernel/pgo/Makefile b/kernel/pgo/Makefile >>new file mode 100644 >>index 000000000000..41e27cefd9a4 >>--- /dev/null >>+++ b/kernel/pgo/Makefile >>@@ -0,0 +1,5 @@ >>+# SPDX-License-Identifier: GPL-2.0 >>+GCOV_PROFILE := n >>+PGO_PROFILE := n >>+ >>+obj-y += fs.o instrument.o >>diff --git a/kernel/pgo/fs.c b/kernel/pgo/fs.c >>new file mode 100644 >>index 000000000000..1678df3b7d64 >>--- /dev/null >>+++ b/kernel/pgo/fs.c >>@@ -0,0 +1,389 @@ >>+// SPDX-License-Identifier: GPL-2.0 >>+/* >>+ * Copyright (C) 2019 Google, Inc. >>+ * >>+ * Author: >>+ * Sami Tolvanen >>+ * >>+ * This software is licensed under the terms of the GNU General Public >>+ * License version 2, as published by the Free Software Foundation, and >>+ * may be copied, distributed, and modified under those terms. >>+ * >>+ * This program is distributed in the hope that it will be useful, >>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of >>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >>+ * GNU General Public License for more details. >>+ * >>+ */ >>+ >>+#define pr_fmt(fmt) "pgo: " fmt >>+ >>+#include >>+#include >>+#include >>+#include >>+#include >>+#include >>+#include "pgo.h" >>+ >>+static struct dentry *directory; >>+ >>+struct prf_private_data { >>+ void *buffer; >>+ unsigned long size; >>+}; >>+ >>+/* >>+ * Raw profile data format: >>+ * >>+ * - llvm_prf_header >>+ * - __llvm_prf_data >>+ * - __llvm_prf_cnts >>+ * - __llvm_prf_names >>+ * - zero padding to 8 bytes >>+ * - for each llvm_prf_data in __llvm_prf_data: >>+ * - llvm_prf_value_data >>+ * - llvm_prf_value_record + site count array >>+ * - llvm_prf_value_node_data >>+ * ... >>+ * ... >>+ * ... 
>>+ */ >>+ >>+static void prf_fill_header(void **buffer) >>+{ >>+ struct llvm_prf_header *header = *(struct llvm_prf_header **)buffer; >>+ >>+#ifdef CONFIG_64BIT >>+ header->magic = LLVM_INSTR_PROF_RAW_MAGIC_64; >>+#else >>+ header->magic = LLVM_INSTR_PROF_RAW_MAGIC_32; >>+#endif >>+ header->version = LLVM_VARIANT_MASK_IR_PROF | LLVM_INSTR_PROF_RAW_VERSION; >>+ header->data_size = prf_data_count(); >>+ header->padding_bytes_before_counters = 0; >>+ header->counters_size = prf_cnts_count(); >>+ header->padding_bytes_after_counters = 0; >>+ header->names_size = prf_names_count(); >>+ header->counters_delta = (u64)__llvm_prf_cnts_start; >>+ header->names_delta = (u64)__llvm_prf_names_start; >>+ header->value_kind_last = LLVM_INSTR_PROF_IPVK_LAST; >>+ >>+ *buffer += sizeof(*header); >>+} >>+ >>+/* >>+ * Copy the source into the buffer, incrementing the pointer into buffer in the >>+ * process. >>+ */ >>+static void prf_copy_to_buffer(void **buffer, void *src, unsigned long size) >>+{ >>+ memcpy(*buffer, src, size); >>+ *buffer += size; >>+} >>+ >>+static u32 __prf_get_value_size(struct llvm_prf_data *p, u32 *value_kinds) >>+{ >>+ struct llvm_prf_value_node **nodes = >>+ (struct llvm_prf_value_node **)p->values; >>+ u32 kinds = 0; >>+ u32 size = 0; >>+ unsigned int kind; >>+ unsigned int n; >>+ unsigned int s = 0; >>+ >>+ for (kind = 0; kind < ARRAY_SIZE(p->num_value_sites); kind++) { >>+ unsigned int sites = p->num_value_sites[kind]; >>+ >>+ if (!sites) >>+ continue; >>+ >>+ /* Record + site count array */ >>+ size += prf_get_value_record_size(sites); >>+ kinds++; >>+ >>+ if (!nodes) >>+ continue; >>+ >>+ for (n = 0; n < sites; n++) { >>+ u32 count = 0; >>+ struct llvm_prf_value_node *site = nodes[s + n]; >>+ >>+ while (site && ++count <= U8_MAX) >>+ site = site->next; >>+ >>+ size += count * >>+ sizeof(struct llvm_prf_value_node_data); >>+ } >>+ >>+ s += sites; >>+ } >>+ >>+ if (size) >>+ size += sizeof(struct llvm_prf_value_data); >>+ >>+ if (value_kinds) >>+ 
*value_kinds = kinds; >>+ >>+ return size; >>+} >>+ >>+static u32 prf_get_value_size(void) >>+{ >>+ u32 size = 0; >>+ struct llvm_prf_data *p; >>+ >>+ for (p = __llvm_prf_data_start; p < __llvm_prf_data_end; p++) >>+ size += __prf_get_value_size(p, NULL); >>+ >>+ return size; >>+} >>+ >>+/* Serialize the profiling's value. */ >>+static void prf_serialize_value(struct llvm_prf_data *p, void **buffer) >>+{ >>+ struct llvm_prf_value_data header; >>+ struct llvm_prf_value_node **nodes = >>+ (struct llvm_prf_value_node **)p->values; >>+ unsigned int kind; >>+ unsigned int n; >>+ unsigned int s = 0; >>+ >>+ header.total_size = __prf_get_value_size(p, &header.num_value_kinds); >>+ >>+ if (!header.num_value_kinds) >>+ /* Nothing to write. */ >>+ return; >>+ >>+ prf_copy_to_buffer(buffer, &header, sizeof(header)); >>+ >>+ for (kind = 0; kind < ARRAY_SIZE(p->num_value_sites); kind++) { >>+ struct llvm_prf_value_record *record; >>+ u8 *counts; >>+ unsigned int sites = p->num_value_sites[kind]; >>+ >>+ if (!sites) >>+ continue; >>+ >>+ /* Profiling value record. */ >>+ record = *(struct llvm_prf_value_record **)buffer; >>+ *buffer += prf_get_value_record_header_size(); >>+ >>+ record->kind = kind; >>+ record->num_value_sites = sites; >>+ >>+ /* Site count array. */ >>+ counts = *(u8 **)buffer; >>+ *buffer += prf_get_value_record_site_count_size(sites); >>+ >>+ /* >>+ * If we don't have nodes, we can skip updating the site count >>+ * array, because the buffer is zero filled. 
>>+		 */
>>+		if (!nodes)
>>+			continue;
>>+
>>+		for (n = 0; n < sites; n++) {
>>+			u32 count = 0;
>>+			struct llvm_prf_value_node *site = nodes[s + n];
>>+
>>+			while (site && ++count <= U8_MAX) {
>>+				prf_copy_to_buffer(buffer, site,
>>+						   sizeof(struct llvm_prf_value_node_data));
>>+				site = site->next;
>>+			}
>>+
>>+			counts[n] = (u8)count;
>>+		}
>>+
>>+		s += sites;
>>+	}
>>+}
>>+
>>+static void prf_serialize_values(void **buffer)
>>+{
>>+	struct llvm_prf_data *p;
>>+
>>+	for (p = __llvm_prf_data_start; p < __llvm_prf_data_end; p++)
>>+		prf_serialize_value(p, buffer);
>>+}
>>+
>>+static inline unsigned long prf_get_padding(unsigned long size)
>>+{
>>+	return 7 & (sizeof(u64) - size % sizeof(u64));
>>+}
>>+
>>+static unsigned long prf_buffer_size(void)
>>+{
>>+	return sizeof(struct llvm_prf_header) +
>>+			prf_data_size() +
>>+			prf_cnts_size() +
>>+			prf_names_size() +
>>+			prf_get_padding(prf_names_size()) +
>>+			prf_get_value_size();
>>+}
>>+
>>+/*
>>+ * Serialize the profiling data into a format LLVM's tools can understand.
>>+ * Note: caller *must* hold pgo_lock.
>>+ */
>>+static int prf_serialize(struct prf_private_data *p)
>>+{
>>+	int err = 0;
>>+	void *buffer;
>>+
>>+	p->size = prf_buffer_size();
>>+	p->buffer = vzalloc(p->size);
>>+
>>+	if (!p->buffer) {
>>+		err = -ENOMEM;
>>+		goto out;
>>+	}
>>+
>>+	buffer = p->buffer;
>>+
>>+	prf_fill_header(&buffer);
>>+	prf_copy_to_buffer(&buffer, __llvm_prf_data_start, prf_data_size());
>>+	prf_copy_to_buffer(&buffer, __llvm_prf_cnts_start, prf_cnts_size());
>>+	prf_copy_to_buffer(&buffer, __llvm_prf_names_start, prf_names_size());
>>+	buffer += prf_get_padding(prf_names_size());
>>+
>>+	prf_serialize_values(&buffer);
>>+
>>+out:
>>+	return err;
>>+}
>>+
>>+/* open() implementation for PGO. Creates a copy of the profiling data set. */
>>+static int prf_open(struct inode *inode, struct file *file)
>>+{
>>+	struct prf_private_data *data;
>>+	unsigned long flags;
>>+	int err;
>>+
>>+	data = kzalloc(sizeof(*data), GFP_KERNEL);
>>+	if (!data) {
>>+		err = -ENOMEM;
>>+		goto out;
>>+	}
>>+
>>+	flags = prf_lock();
>>+
>>+	err = prf_serialize(data);
>>+	if (unlikely(err)) {
>>+		kfree(data);
>>+		goto out_unlock;
>>+	}
>>+
>>+	file->private_data = data;
>>+
>>+out_unlock:
>>+	prf_unlock(flags);
>>+out:
>>+	return err;
>>+}
>>+
>>+/* read() implementation for PGO. */
>>+static ssize_t prf_read(struct file *file, char __user *buf, size_t count,
>>+			loff_t *ppos)
>>+{
>>+	struct prf_private_data *data = file->private_data;
>>+
>>+	BUG_ON(!data);
>>+
>>+	return simple_read_from_buffer(buf, count, ppos, data->buffer,
>>+				       data->size);
>>+}
>>+
>>+/* release() implementation for PGO. Release resources allocated by open(). */
>>+static int prf_release(struct inode *inode, struct file *file)
>>+{
>>+	struct prf_private_data *data = file->private_data;
>>+
>>+	if (data) {
>>+		vfree(data->buffer);
>>+		kfree(data);
>>+	}
>>+
>>+	return 0;
>>+}
>>+
>>+static const struct file_operations prf_fops = {
>>+	.owner = THIS_MODULE,
>>+	.open = prf_open,
>>+	.read = prf_read,
>>+	.llseek = default_llseek,
>>+	.release = prf_release
>>+};
>>+
>>+/* write() implementation for resetting PGO's profile data. */
>>+static ssize_t reset_write(struct file *file, const char __user *addr,
>>+			   size_t len, loff_t *pos)
>>+{
>>+	struct llvm_prf_data *data;
>>+
>>+	memset(__llvm_prf_cnts_start, 0, prf_cnts_size());
>>+
>>+	for (data = __llvm_prf_data_start; data < __llvm_prf_data_end; data++) {
>>+		struct llvm_prf_value_node **vnodes;
>>+		u64 current_vsite_count;
>>+		u32 i;
>>+
>>+		if (!data->values)
>>+			continue;
>>+
>>+		current_vsite_count = 0;
>>+		vnodes = (struct llvm_prf_value_node **)data->values;
>>+
>>+		for (i = LLVM_INSTR_PROF_IPVK_FIRST; i <= LLVM_INSTR_PROF_IPVK_LAST; i++)
>>+			current_vsite_count += data->num_value_sites[i];
>>+
>>+		for (i = 0; i < current_vsite_count; i++) {
>>+			struct llvm_prf_value_node *current_vnode = vnodes[i];
>>+
>>+			while (current_vnode) {
>>+				current_vnode->count = 0;
>>+				current_vnode = current_vnode->next;
>>+			}
>>+		}
>>+	}
>>+
>>+	return len;
>>+}
>>+
>>+static const struct file_operations prf_reset_fops = {
>>+	.owner = THIS_MODULE,
>>+	.write = reset_write,
>>+	.llseek = noop_llseek,
>>+};
>>+
>>+/* Create debugfs entries. */
>>+static int __init pgo_init(void)
>>+{
>>+	directory = debugfs_create_dir("pgo", NULL);
>>+	if (!directory)
>>+		goto err_remove;
>>+
>>+	if (!debugfs_create_file("profraw", 0600, directory, NULL,
>>+				 &prf_fops))
>>+		goto err_remove;
>>+
>>+	if (!debugfs_create_file("reset", 0200, directory, NULL,
>>+				 &prf_reset_fops))
>>+		goto err_remove;
>>+
>>+	return 0;
>>+
>>+err_remove:
>>+	pr_err("initialization failed\n");
>>+	return -EIO;
>>+}
>>+
>>+/* Remove debugfs entries. */
>>+static void __exit pgo_exit(void)
>>+{
>>+	debugfs_remove_recursive(directory);
>>+}
>>+
>>+module_init(pgo_init);
>>+module_exit(pgo_exit);
>>diff --git a/kernel/pgo/instrument.c b/kernel/pgo/instrument.c
>>new file mode 100644
>>index 000000000000..62ff5cfce7b1
>>--- /dev/null
>>+++ b/kernel/pgo/instrument.c
>>@@ -0,0 +1,189 @@
>>+// SPDX-License-Identifier: GPL-2.0
>>+/*
>>+ * Copyright (C) 2019 Google, Inc.
>>+ *
>>+ * Author:
>>+ *	Sami Tolvanen
>>+ *
>>+ * This software is licensed under the terms of the GNU General Public
>>+ * License version 2, as published by the Free Software Foundation, and
>>+ * may be copied, distributed, and modified under those terms.
>>+ *
>>+ * This program is distributed in the hope that it will be useful,
>>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>+ * GNU General Public License for more details.
>>+ *
>>+ */
>>+
>>+#define pr_fmt(fmt) "pgo: " fmt
>>+
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include
>>+#include "pgo.h"
>>+
>>+/*
>>+ * This lock guards both profile count updating and serialization of the
>>+ * profiling data. Keeping both of these activities separate via locking
>>+ * ensures that we don't try to serialize data that's only partially updated.
>>+ */
>>+static DEFINE_SPINLOCK(pgo_lock);
>>+static int current_node;
>>+
>>+unsigned long prf_lock(void)
>>+{
>>+	unsigned long flags;
>>+
>>+	spin_lock_irqsave(&pgo_lock, flags);
>>+
>>+	return flags;
>>+}
>>+
>>+void prf_unlock(unsigned long flags)
>>+{
>>+	spin_unlock_irqrestore(&pgo_lock, flags);
>>+}
>>+
>>+/*
>>+ * Return a newly allocated profiling value node which contains the tracked
>>+ * value by the value profiler.
>>+ * Note: caller *must* hold pgo_lock.
>>+ */
>>+static struct llvm_prf_value_node *allocate_node(struct llvm_prf_data *p,
>>+						 u32 index, u64 value)
>>+{
>>+	if (&__llvm_prf_vnds_start[current_node + 1] >= __llvm_prf_vnds_end)
>>+		return NULL; /* Out of nodes */
>>+
>>+	current_node++;
>>+
>>+	/* Make sure the node is entirely within the section */
>>+	if (&__llvm_prf_vnds_start[current_node] >= __llvm_prf_vnds_end ||
>>+	    &__llvm_prf_vnds_start[current_node + 1] > __llvm_prf_vnds_end)
>>+		return NULL;
>>+
>>+	return &__llvm_prf_vnds_start[current_node];
>>+}
>>+
>>+/*
>>+ * Counts the number of times a target value is seen.
>>+ *
>>+ * Records the target value for the index if not seen before. Otherwise,
>>+ * increments the counter associated w/ the target value.
>>+ */
>>+void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index);
>>+void __llvm_profile_instrument_target(u64 target_value, void *data, u32 index)
>>+{
>>+	struct llvm_prf_data *p = (struct llvm_prf_data *)data;
>>+	struct llvm_prf_value_node **counters;
>>+	struct llvm_prf_value_node *curr;
>>+	struct llvm_prf_value_node *min = NULL;
>>+	struct llvm_prf_value_node *prev = NULL;
>>+	u64 min_count = U64_MAX;
>>+	u8 values = 0;
>>+	unsigned long flags;
>>+
>>+	if (!p || !p->values)
>>+		return;
>>+
>>+	counters = (struct llvm_prf_value_node **)p->values;
>>+	curr = counters[index];
>>+
>>+	while (curr) {
>>+		if (target_value == curr->value) {
>>+			curr->count++;
>>+			return;
>>+		}
>>+
>>+		if (curr->count < min_count) {
>>+			min_count = curr->count;
>>+			min = curr;
>>+		}
>>+
>>+		prev = curr;
>>+		curr = curr->next;
>>+		values++;
>>+	}
>>+
>>+	if (values >= LLVM_INSTR_PROF_MAX_NUM_VAL_PER_SITE) {
>>+		if (!min->count || !(--min->count)) {
>>+			curr = min;
>>+			curr->value = target_value;
>>+			curr->count++;
>>+		}
>>+		return;
>>+	}
>>+
>>+	/* Lock when updating the value node structure. */
>>+	flags = prf_lock();
>>+
>>+	curr = allocate_node(p, index, target_value);
>>+	if (!curr)
>>+		goto out;
>>+
>>+	curr->value = target_value;
>>+	curr->count++;
>>+
>>+	if (!counters[index])
>>+		counters[index] = curr;
>>+	else if (prev && !prev->next)
>>+		prev->next = curr;
>>+
>>+out:
>>+	prf_unlock(flags);
>>+}
>>+EXPORT_SYMBOL(__llvm_profile_instrument_target);
>>+
>>+/* Counts the number of times a range of targets values are seen. */
>>+void __llvm_profile_instrument_range(u64 target_value, void *data,
>>+				     u32 index, s64 precise_start,
>>+				     s64 precise_last, s64 large_value);
>>+void __llvm_profile_instrument_range(u64 target_value, void *data,
>>+				     u32 index, s64 precise_start,
>>+				     s64 precise_last, s64 large_value)
>>+{
>>+	if (large_value != S64_MIN && (s64)target_value >= large_value)
>>+		target_value = large_value;
>>+	else if ((s64)target_value < precise_start ||
>>+		 (s64)target_value > precise_last)
>>+		target_value = precise_last + 1;
>>+
>>+	__llvm_profile_instrument_target(target_value, data, index);
>>+}
>>+EXPORT_SYMBOL(__llvm_profile_instrument_range);
>>+
>>+static u64 inst_prof_get_range_rep_value(u64 value)
>>+{
>>+	if (value <= 8)
>>+		/* The first ranges are individually tracked, use it as is. */
>>+		return value;
>>+	else if (value >= 513)
>>+		/* The last range is mapped to its lowest value. */
>>+		return 513;
>>+	else if (hweight64(value) == 1)
>>+		/* If it's a power of two, use it as is. */
>>+		return value;
>>+
>>+	/* Otherwise, take to the previous power of two + 1. */
>>+	return (1 << (64 - __builtin_clzll(value) - 1)) + 1;
>>+}

`1 << ...` is another very minor issue. I sent https://reviews.llvm.org/D97640
to fix it upstream. The overflow won't happen in practice because the function
is only used for the size parameter of memory operations (e.g. memcpy).

>>+/*
>>+ * The target values are partitioned into multiple ranges. The range spec is
>>+ * defined in compiler-rt/include/profile/InstrProfData.inc.
>>+ */
>>+void __llvm_profile_instrument_memop(u64 target_value, void *data,
>>+				     u32 counter_index);
>>+void __llvm_profile_instrument_memop(u64 target_value, void *data,
>>+				     u32 counter_index)
>>+{
>>+	u64 rep_value;
>>+
>>+	/* Map the target value to the representative value of its range. */
>>+	rep_value = inst_prof_get_range_rep_value(target_value);
>>+	__llvm_profile_instrument_target(rep_value, data, counter_index);
>>+}
>>+EXPORT_SYMBOL(__llvm_profile_instrument_memop);
>>diff --git a/kernel/pgo/pgo.h b/kernel/pgo/pgo.h
>>new file mode 100644
>>index 000000000000..ddc8d3002fe5
>>--- /dev/null
>>+++ b/kernel/pgo/pgo.h
>>@@ -0,0 +1,203 @@
>>+/* SPDX-License-Identifier: GPL-2.0 */
>>+/*
>>+ * Copyright (C) 2019 Google, Inc.
>>+ *
>>+ * Author:
>>+ *	Sami Tolvanen
>>+ *
>>+ * This software is licensed under the terms of the GNU General Public
>>+ * License version 2, as published by the Free Software Foundation, and
>>+ * may be copied, distributed, and modified under those terms.
>>+ *
>>+ * This program is distributed in the hope that it will be useful,
>>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>>+ * GNU General Public License for more details.
>>+ *
>>+ */
>>+
>>+#ifndef _PGO_H
>>+#define _PGO_H
>>+
>>+/*
>>+ * Note: These internal LLVM definitions must match the compiler version.
>>+ * See llvm/include/llvm/ProfileData/InstrProfData.inc in LLVM's source code.
>>+ */
>>+
>>+#define LLVM_INSTR_PROF_RAW_MAGIC_64 \
>>+		((u64)255 << 56 | \
>>+		 (u64)'l' << 48 | \
>>+		 (u64)'p' << 40 | \
>>+		 (u64)'r' << 32 | \
>>+		 (u64)'o' << 24 | \
>>+		 (u64)'f' << 16 | \
>>+		 (u64)'r' << 8 | \
>>+		 (u64)129)
>>+#define LLVM_INSTR_PROF_RAW_MAGIC_32 \
>>+		((u64)255 << 56 | \
>>+		 (u64)'l' << 48 | \
>>+		 (u64)'p' << 40 | \
>>+		 (u64)'r' << 32 | \
>>+		 (u64)'o' << 24 | \
>>+		 (u64)'f' << 16 | \
>>+		 (u64)'R' << 8 | \
>>+		 (u64)129)
>>+
>>+#define LLVM_INSTR_PROF_RAW_VERSION 5
>>+#define LLVM_INSTR_PROF_DATA_ALIGNMENT 8
>>+#define LLVM_INSTR_PROF_IPVK_FIRST 0
>>+#define LLVM_INSTR_PROF_IPVK_LAST 1
>>+#define LLVM_INSTR_PROF_MAX_NUM_VAL_PER_SITE 255
>>+
>>+#define LLVM_VARIANT_MASK_IR_PROF (0x1ULL << 56)
>>+#define LLVM_VARIANT_MASK_CSIR_PROF (0x1ULL << 57)
>>+
>>+/**
>>+ * struct llvm_prf_header - represents the raw profile header data structure.
>>+ * @magic: the magic token for the file format.
>>+ * @version: the version of the file format.
>>+ * @data_size: the number of entries in the profile data section.
>>+ * @padding_bytes_before_counters: the number of padding bytes before the
>>+ *	counters.
>>+ * @counters_size: the size in bytes of the LLVM profile section containing the
>>+ *	counters.
>>+ * @padding_bytes_after_counters: the number of padding bytes after the
>>+ *	counters.
>>+ * @names_size: the size in bytes of the LLVM profile section containing the
>>+ *	counters' names.
>>+ * @counters_delta: the beginning of the LLMV profile counters section.
>>+ * @names_delta: the beginning of the LLMV profile names section.
>>+ * @value_kind_last: the last profile value kind.
>>+ */
>>+struct llvm_prf_header {
>>+	u64 magic;
>>+	u64 version;
>>+	u64 data_size;
>>+	u64 padding_bytes_before_counters;
>>+	u64 counters_size;
>>+	u64 padding_bytes_after_counters;
>>+	u64 names_size;
>>+	u64 counters_delta;
>>+	u64 names_delta;
>>+	u64 value_kind_last;
>>+};
>>+
>>+/**
>>+ * struct llvm_prf_data - represents the per-function control structure.
>>+ * @name_ref: the reference to the function's name.
>>+ * @func_hash: the hash value of the function.
>>+ * @counter_ptr: a pointer to the profile counter.
>>+ * @function_ptr: a pointer to the function.
>>+ * @values: the profiling values associated with this function.
>>+ * @num_counters: the number of counters in the function.
>>+ * @num_value_sites: the number of value profile sites.
>>+ */
>>+struct llvm_prf_data {
>>+	const u64 name_ref;
>>+	const u64 func_hash;
>>+	const void *counter_ptr;
>>+	const void *function_ptr;
>>+	void *values;
>>+	const u32 num_counters;
>>+	const u16 num_value_sites[LLVM_INSTR_PROF_IPVK_LAST + 1];
>>+} __aligned(LLVM_INSTR_PROF_DATA_ALIGNMENT);
>>+
>>+/**
>>+ * structure llvm_prf_value_node_data - represents the data part of the struct
>>+ *	llvm_prf_value_node data structure.
>>+ * @value: the value counters.
>>+ * @count: the counters' count.
>>+ */
>>+struct llvm_prf_value_node_data {
>>+	u64 value;
>>+	u64 count;
>>+};
>>+
>>+/**
>>+ * struct llvm_prf_value_node - represents an internal data structure used by
>>+ *	the value profiler.
>>+ * @value: the value counters.
>>+ * @count: the counters' count.
>>+ * @next: the next value node.
>>+ */
>>+struct llvm_prf_value_node {
>>+	u64 value;
>>+	u64 count;
>>+	struct llvm_prf_value_node *next;
>>+};
>>+
>>+/**
>>+ * struct llvm_prf_value_data - represents the value profiling data in indexed
>>+ *	format.
>>+ * @total_size: the total size in bytes including this field.
>>+ * @num_value_kinds: the number of value profile kinds that has value profile
>>+ *	data.
>>+ */
>>+struct llvm_prf_value_data {
>>+	u32 total_size;
>>+	u32 num_value_kinds;
>>+};
>>+
>>+/**
>>+ * struct llvm_prf_value_record - represents the on-disk layout of the value
>>+ *	profile data of a particular kind for one function.
>>+ * @kind: the kind of the value profile record.
>>+ * @num_value_sites: the number of value profile sites.
>>+ * @site_count_array: the first element of the array that stores the number
>>+ *	of profiled values for each value site.
>>+ */
>>+struct llvm_prf_value_record {
>>+	u32 kind;
>>+	u32 num_value_sites;
>>+	u8 site_count_array[];
>>+};
>>+
>>+#define prf_get_value_record_header_size() \
>>+	offsetof(struct llvm_prf_value_record, site_count_array)
>>+#define prf_get_value_record_site_count_size(sites) \
>>+	roundup((sites), 8)
>>+#define prf_get_value_record_size(sites) \
>>+	(prf_get_value_record_header_size() + \
>>+	 prf_get_value_record_site_count_size((sites)))
>>+
>>+/* Data sections */
>>+extern struct llvm_prf_data __llvm_prf_data_start[];
>>+extern struct llvm_prf_data __llvm_prf_data_end[];
>>+
>>+extern u64 __llvm_prf_cnts_start[];
>>+extern u64 __llvm_prf_cnts_end[];
>>+
>>+extern char __llvm_prf_names_start[];
>>+extern char __llvm_prf_names_end[];
>>+
>>+extern struct llvm_prf_value_node __llvm_prf_vnds_start[];
>>+extern struct llvm_prf_value_node __llvm_prf_vnds_end[];
>>+
>>+/* Locking for vnodes */
>>+extern unsigned long prf_lock(void);
>>+extern void prf_unlock(unsigned long flags);
>>+
>>+#define __DEFINE_PRF_SIZE(s) \
>>+	static inline unsigned long prf_ ## s ## _size(void) \
>>+	{ \
>>+		unsigned long start = \
>>+			(unsigned long)__llvm_prf_ ## s ## _start; \
>>+		unsigned long end = \
>>+			(unsigned long)__llvm_prf_ ## s ## _end; \
>>+		return roundup(end - start, \
>>+				sizeof(__llvm_prf_ ## s ## _start[0])); \
>>+	} \
>>+	static inline unsigned long prf_ ## s ## _count(void) \
>>+	{ \
>>+		return prf_ ## s ## _size() / \
>>+			sizeof(__llvm_prf_ ## s ## _start[0]); \
>>+	}
>>+
>>+__DEFINE_PRF_SIZE(data);
>>+__DEFINE_PRF_SIZE(cnts);
>>+__DEFINE_PRF_SIZE(names);
>>+__DEFINE_PRF_SIZE(vnds);
>>+
>>+#undef __DEFINE_PRF_SIZE
>>+
>>+#endif /* _PGO_H */
>>diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
>>index eee59184de64..48a65d092c5b 100644
>>--- a/scripts/Makefile.lib
>>+++ b/scripts/Makefile.lib
>>@@ -139,6 +139,16 @@ _c_flags += $(if $(patsubst n%,, \
>> 		$(CFLAGS_GCOV))
>>endif
>>
>>+#
>>+# Enable clang's PGO profiling flags for a file or directory depending on
>>+# variables PGO_PROFILE_obj.o and PGO_PROFILE.
>>+#
>>+ifeq ($(CONFIG_PGO_CLANG),y)
>>+_c_flags += $(if $(patsubst n%,, \
>>+		$(PGO_PROFILE_$(basetarget).o)$(PGO_PROFILE)y), \
>>+		$(CFLAGS_PGO_CLANG))
>>+endif
>>+
>>#
>># Enable address sanitizer flags for kernel except some files or directories
>># we don't want to check (depends on variables KASAN_SANITIZE_obj.o, KASAN_SANITIZE)
>>--
>>2.30.1.766.gb4fecdf3b7-goog
>>
>>--
>>You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
>>To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
>>To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20210226222030.3718075-1-morbo%40google.com.
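
To illustrate the `1 << ...` remark on inst_prof_get_range_rep_value() above, here is a small userspace sketch. It is not code from the patch and not the content of D97640: range_rep_value() is a hypothetical copy of the helper that uses 1ULL and __builtin_popcountll() in place of hweight64(), and prev_pow2_plus_one() is a hypothetical variant without the 8/513 clamps. Both assume the GCC/Clang builtins. In the clamped function only values 9..512 reach the shift, so the shift count is at most 8 and a plain int `1` cannot overflow there; the unclamped helper shows the case where the 64-bit constant actually matters.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace copy of inst_prof_get_range_rep_value(),
 * written with a 64-bit constant in the shift (assumption: this is the
 * kind of change the review comment has in mind).
 */
static inline uint64_t range_rep_value(uint64_t value)
{
	if (value <= 8)
		return value;		/* tracked individually */
	if (value >= 513)
		return 513;		/* last range maps to its lowest value */
	if (__builtin_popcountll(value) == 1)
		return value;		/* exact power of two */

	/* Previous power of two + 1; shift count is 3..8 here. */
	return (1ULL << (64 - __builtin_clzll(value) - 1)) + 1;
}

/*
 * Hypothetical unclamped variant: with a plain int `1`, a shift count
 * of 31 or more would be undefined behaviour. Caller must pass v != 0,
 * because __builtin_clzll(0) is undefined.
 */
static inline uint64_t prev_pow2_plus_one(uint64_t v)
{
	return (1ULL << (64 - __builtin_clzll(v) - 1)) + 1;
}

/* Small self-check exercising both helpers. */
static inline void range_rep_selfcheck(void)
{
	assert(range_rep_value(10) == 9);	/* 8 + 1 */
	assert(range_rep_value(100) == 65);	/* 64 + 1 */
	assert(range_rep_value(511) == 257);	/* 256 + 1 */
	assert(prev_pow2_plus_one((1ULL << 40) + 5) == (1ULL << 40) + 1);
}
```

Calling range_rep_selfcheck() from a test program exercises the interesting cases: sizes in 9..512 collapse to "previous power of two + 1", and only the unclamped helper ever shifts past bit 31.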