References: <20220920082005.2459826-1-denik@chromium.org> <20220922053145.944786-1-denik@chromium.org> <87h70zk83g.wl-maz@kernel.org>
From: Denis Nikitin
Date: Thu, 6 Oct 2022 09:28:17 -0700
Subject: Re: [PATCH v2] KVM: arm64: nvhe: Fix build with profile optimization
To: Marc Zyngier
Cc: Catalin Marinas, Will Deacon, James Morse, Alexandru Elisei, Nick Desaulniers, David Brazdil, linux-arm-kernel@lists.infradead.org, kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org, Manoj Gupta

Hi Marc,

This problem currently blocks the PGO roll on the ChromeOS kernel and we
need some kind of a solution. Could you please take a look?

Thanks,
Denis

On Thu, Sep 29, 2022 at 9:13 AM Denis Nikitin wrote:
>
> Hi Marc,
>
> Please let me know what you think about this approach.
>
> Thanks,
> Denis
>
> On Thu, Sep 22, 2022 at 11:04 PM Manoj Gupta wrote:
> >
> >
> >
> > On Thu, Sep 22, 2022 at 10:01 PM Denis Nikitin wrote:
> >>
> >> Hi Mark,
> >>
> >> On Thu, Sep 22, 2022 at 3:38 AM Marc Zyngier wrote:
> >> >
> >> > I was really hoping that you'd just drop the flags from the CFLAGS
> >> > instead of removing the generated section. Something like:
> >> >
> >> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> >> > index b5c5119c7396..e5b2d43925b4 100644
> >> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> >> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> >> > @@ -88,7 +88,7 @@ quiet_cmd_hypcopy = HYPCOPY $@
> >> >
> >> >  # Remove ftrace, Shadow Call Stack, and CFI CFLAGS.
> >> >  # This is equivalent to the 'notrace', '__noscs', and '__nocfi' annotations.
> >> > -KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI), $(KBUILD_CFLAGS))
> >> > +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_FTRACE) $(CC_FLAGS_SCS) $(CC_FLAGS_CFI) -fprofile-sample-use, $(KBUILD_CFLAGS))
> >> >
> >> >  # KVM nVHE code is run at a different exception code with a different map, so
> >> >  # compiler instrumentation that inserts callbacks or checks into the code may
> >>
> >> Sorry, I moved on with a different approach and didn't explain the rationale.
> >>
> >> Like you mentioned before, the flag `-fprofile-sample-use` does not appear
> >> in the kernel. And it looks confusing when the flag is disabled or filtered out
> >> here. This was the first reason.
> >>
> >> The root cause of the build failure wasn't the compiler profile guided
> >> optimization but the extra metadata in SHT_REL section which llvm injected
> >> into kvm_nvhe.tmp.o for further link optimization.
> >> If we remove the .llvm.call-graph-profile section we fix the build and avoid
> >> potential problems with relocations optimized by the linker. The profile
> >> guided optimization will still be applied by the compiler.
> >>
> >> Let me know what you think about it.
> >>
> >> >
> >> > However, I even failed to reproduce your problem using LLVM 14 as
> >> > packaged by Debian (if that matters, I'm using an arm64 build
> >> > machine). I build the kernel with:
> >> >
> >> > $ make LLVM=1 KCFLAGS=-fprofile-sample-use -j8 vmlinux
> >> >
> >> > and the offending object only contains the following sections:
> >> >
> >
> >
> > Just some comments based on my ChromeOS build experience.
> >
> > fprofile-sample-use needs the profile file name argument to read the pgo data from
> > i.e. -fprofile-sample-use=/path/to/gcov.profile.
> >
> > Since the path to filename can change, it makes filtering out more difficult.
> > It is certainly possible to find and filter the exact argument by some string search of KCFLAGS.
> > But passing -fno-profile-sample-use is easier and less error prone which I believe the previous patch version tried to do.
> >
> >
> >> > arch/arm64/kvm/hyp/nvhe/kvm_nvhe.tmp.o:     file format elf64-littleaarch64
> >> >
> >> > Sections:
> >> > Idx Name          Size      VMA               LMA               File off  Algn
> >> >   0 .hyp.idmap.text 00000ae4  0000000000000000  0000000000000000  00000800  2**11
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >> >   1 .hyp.text     0000e988  0000000000000000  0000000000000000  00001800  2**11
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
> >> >   2 .hyp.data..ro_after_init 00000820  0000000000000000  0000000000000000  00010188  2**3
> >> >                   CONTENTS, ALLOC, LOAD, DATA
> >> >   3 .hyp.rodata   00002e70  0000000000000000  0000000000000000  000109a8  2**3
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >> >   4 .hyp.data..percpu 00001ee0  0000000000000000  0000000000000000  00013820  2**4
> >> >                   CONTENTS, ALLOC, LOAD, DATA
> >> >   5 .hyp.bss      00001158  0000000000000000  0000000000000000  00015700  2**3
> >> >                   ALLOC
> >> >   6 .comment      0000001f  0000000000000000  0000000000000000  00017830  2**0
> >> >                   CONTENTS, READONLY
> >> >   7 .llvm_addrsig 000000b8  0000000000000000  0000000000000000  0001784f  2**0
> >> >                   CONTENTS, READONLY, EXCLUDE
> >> >   8 .altinstructions 00001284  0000000000000000  0000000000000000  00015700  2**0
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >> >   9 __jump_table  00000960  0000000000000000  0000000000000000  00016988  2**3
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
> >> >  10 __bug_table   0000051c  0000000000000000  0000000000000000  000172e8  2**2
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, DATA
> >> >  11 __kvm_ex_table 00000028  0000000000000000  0000000000000000  00017808  2**3
> >> >                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
> >> >  12 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00027370  2**0
> >> >                   CONTENTS, READONLY
> >> >
> >> > So what am I missing to trigger this issue? Does it rely on something
> >> > like PGO, which is not upstream yet? A bit of handholding would be
> >> > much appreciated.
> >>
> >> Right, it relies on the PGO profile.
> >> On ChromeOS we collect the sample PGO profile from Arm devices with
> >> enabled CoreSight/ETM. You can find more details on ETM at
> >> https://www.kernel.org/doc/Documentation/trace/coresight/coresight.rst.
> >>
> >> https://github.com/Linaro/OpenCSD/blob/master/decoder/tests/auto-fdo/autofdo.md
> >> contains information about the pipeline of collecting, processing, and applying
> >> the profile.
> >>
> >
> > Generally the difficult part is in collecting a good matching profile for the workload.
> > So I think this patch is better than previous since it still keeps the compiler optimization for the hot code paths
> > in the file but removes the problematic section.
> >
> > Thanks,
> > Manoj
> >
> >
> >>
> >> >
> >> > Thanks,
> >> >
> >> > M.
> >> >
> >> > --
> >> > Without deviation from the norm, progress is not possible.
> >>
> >> Thanks,
> >> Denis
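
On the filtering point raised in the quoted discussion (the flag carries a
profile path that can change): one way to picture the "string search of
KCFLAGS" is to match the flag by its prefix rather than by its exact value.
Below is a minimal Python sketch of that idea. It is an illustration only:
the `filter_out` helper and the sample flag list are hypothetical and are not
part of the kernel build or of any patch in this thread.

```python
from fnmatch import fnmatchcase


def filter_out(patterns, flags):
    """Return the flags that match none of the glob patterns, order preserved.

    Hypothetical helper: loosely mirrors the idea of dropping words from a
    flag list by pattern instead of by exact string.
    """
    return [f for f in flags if not any(fnmatchcase(f, p) for p in patterns)]


# Hypothetical flag list; the profile path is the placeholder used in the thread.
kcflags = ["-O2", "-fprofile-sample-use=/path/to/gcov.profile", "-g"]

# Match on the prefix so the result does not depend on the profile path.
remaining = filter_out(["-fprofile-sample-use=*"], kcflags)
print(remaining)
```

GNU make's `filter-out` also accepts `%` wildcards, so a comparable prefix
match may be expressible directly in a Makefile; that has not been tried
against the build discussed here.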