Date: Mon, 14 Jun 2021 12:44:52 +0200
From: Peter Zijlstra
To: Bill Wendling
Cc: Kees Cook, Jonathan Corbet, Masahiro Yamada, Linux Doc Mailing List, LKML, Linux Kbuild mailing list, clang-built-linux, Andrew Morton, Nathan Chancellor, Nick Desaulniers, Sami Tolvanen, Fangrui Song, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", andreyknvl@gmail.com, dvyukov@google.com, elver@google.com, johannes.berg@intel.com, oberpar@linux.vnet.ibm.com, linux-toolchains@vger.kernel.org
Subject: Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
References: <20210111081821.3041587-1-morbo@google.com> <20210407211704.367039-1-morbo@google.com> <20210612202505.GG68208@worktop.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 14, 2021 at 02:39:41AM -0700, Bill Wendling wrote:
> On Mon, Jun 14, 2021 at 2:01 AM Peter Zijlstra wrote:
>
> > Because having GCOV, KCOV and PGO all do essentially the same thing
> > differently, makes heaps of sense?
> >
> It does when you're dealing with one toolchain without access to another.

Here's a sekrit, don't tell anyone, but you can get a free copy of GCC
right here:

  https://gcc.gnu.org/

We also have this linux-toolchains list (Cc'ed now) that contains folks
from both sides.

> > I understand that the compilers actually generates radically different
> > instrumentation for the various cases, but essentially they're all
> > collecting (function/branch) arcs.
> >
> That's true, but there's no one format for profiling data that's
> usable between all compilers. I'm not even sure there's a good way to
> translate between, say, gcov and llvm's format. To make matters more
> complicated, each compiler's format is tightly coupled to a specific
> version of that compiler. And depending on *how* the data is collected
> (e.g. sampling or instrumentation), it may not give us the full
> benefit of FDO/PGO.

I'm thinking that something simple like:

	struct arc {
		u64 from;
		u64 to;
		u64 nr;
		u64 cntrs[0];
	};

goes a very long way. Stick a header on that says how large cntrs[] is,
and some other data (like load offset and whatnot) and you should be
good.

Combine that with the executable image (say /proc/kcore) to recover
what's @from (call, jmp or conditional branch) and I'm thinking one
ought to be able to construct lots of useful data.

I've also been led to believe that the KCOV data format is not in fact
dependent on which toolchain is used.

> > I'm thinking it might be about time to build _one_ infrastructure for
> > that and define a kernel arc format and call it a day.
> >
> That may be nice, but it's a rather large request.

Given GCOV just died, perhaps you can look at what KCOV does and see if
that can be extended to do as you want. KCOV is actively used and we
actually tripped over all the fun little noinstr bugs at the time.
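
For illustration only, the arc record plus header described above might be laid out roughly as follows. This is a userspace sketch, not anything proposed in the thread: struct arc_hdr, its field names, and the helpers arc_size(), arc_at() and arc_image_addr() are all hypothetical, and it assumes that nr counts how often the arc was taken and that every record in a buffer carries the same number of cntrs[] entries. It uses a C99 flexible array member where the mail writes cntrs[0].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 in userspace */

struct arc {
	u64 from;	/* address of the call, jmp or conditional branch */
	u64 to;		/* address of the branch target */
	u64 nr;		/* assumption: number of times the arc was taken */
	u64 cntrs[];	/* extra counters; length comes from the header */
};

/* Hypothetical header; the field names are illustrative only. */
struct arc_hdr {
	u64 nr_cntrs;	 /* entries in every record's cntrs[] */
	u64 nr_arcs;	 /* number of arc records following the header */
	u64 load_offset; /* offset at which the image was loaded */
};

/* Size in bytes of one arc record as described by this header. */
static size_t arc_size(const struct arc_hdr *hdr)
{
	return sizeof(struct arc) + hdr->nr_cntrs * sizeof(u64);
}

/* Return the i-th record in a buffer laid out as header + records. */
static struct arc *arc_at(void *buf, size_t i)
{
	struct arc_hdr *hdr = buf;

	assert(i < hdr->nr_arcs);
	return (struct arc *)((char *)buf + sizeof(*hdr) + i * arc_size(hdr));
}

/* Map a recorded runtime address back to an offset within the image. */
static u64 arc_image_addr(const struct arc_hdr *hdr, u64 addr)
{
	return addr - hdr->load_offset;
}
```

Because the records are variable-sized, a consumer has to step through the buffer by arc_size() rather than index an array of struct arc directly; the address returned by arc_image_addr() is what one would then look up in the executable image to classify @from.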