References: <20210111081821.3041587-1-morbo@google.com>
 <20210407211704.367039-1-morbo@google.com>
 <20210612202505.GG68208@worktop.programming.kicks-ass.net>
From: Marco Elver
Date: Mon, 14 Jun 2021 16:16:16 +0200
Subject: Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
To: Peter Zijlstra
Cc: Bill Wendling, Kees Cook, Jonathan Corbet, Masahiro Yamada,
 Linux Doc Mailing List, LKML, Linux Kbuild mailing list,
 clang-built-linux, Andrew Morton, Nathan Chancellor, Nick Desaulniers,
 Sami Tolvanen, Fangrui Song,
 "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", Andrey Konovalov,
 Dmitry Vyukov, johannes.berg@intel.com, oberpar@linux.vnet.ibm.com,
 linux-toolchains@vger.kernel.org

On Mon, 14 Jun 2021 at 12:45, Peter Zijlstra wrote:
[...]
> I've also been led to believe that the KCOV data format is not in fact
> dependent on which toolchain is used.

Correct, we use KCOV with both gcc and clang. Both gcc and clang emit
the same instrumentation for -fsanitize-coverage. Thus, the user-space
portion and interface is indeed identical:
https://www.kernel.org/doc/html/latest/dev-tools/kcov.html

> > > I'm thinking it might be about time to build _one_ infrastructure for
> > > that and define a kernel arc format and call it a day.
> > >
> > That may be nice, but it's a rather large request.
>
> Given GCOV just died, perhaps you can look at what KCOV does and see if
> that can be extended to do as you want. KCOV is actively used and
> we actually tripped over all the fun little noinstr bugs at the time.

There might be a subtle mismatch between coverage instrumentation for
testing/fuzzing and for profiling. (Disclaimer: I'm not too familiar
with Clang-PGO's requirements.) For example, while for testing/fuzzing
we may only need to know whether a code path has been visited, for
profiling the "hotness" might be of interest. Therefore, the user-space
exported data format can make several trade-offs in complexity.

In theory, I imagine there's a limit to how generic one could make
profiling information, because one compiler's optimizations are not
another compiler's optimizations. On the other hand, it may be doable
to collect unified profiling information for common stuff, but I guess
there's little motivation for figuring out the common ground given that
the producer and consumer of the PGO data are the same compiler by
design (unlike coverage info for testing/fuzzing).

Therefore, if KCOV's exposed information does not match PGO's
requirements today, I'm not sure what realistically can be done
without turning KCOV into a monster. KCOV is optimized for
testing/fuzzing coverage, and I'm not sure how complex we can or want
to make it to cater to a new use case.

My intuition is that the simpler design is to have 2 subsystems for
instrumentation-based coverage collection: one for testing/fuzzing,
and the other for profiling.

Alas, there's the problem of GCOV, which should be replaceable by KCOV
for most use cases. But it would be good to hear from GCOV users, if
there are any.

But since, as we learned, GCOV is broken on x86 now, I see these options:

1. Remove GCOV, make KCOV the de-facto test-coverage collection
subsystem.
Introduce PGO-instrumentation subsystem for profile collection only,
and make it _very_ clear that KCOV != PGO data as hinted above. A
pre-requisite is that compiler-support for PGO instrumentation adds
selective instrumentation support, likely just making attribute
no_instrument_function do the right thing.

2. Like (1) but also keep GCOV, given proper support for attribute
no_instrument_function would probably fix it (?).

3. Keep GCOV (and KCOV of course). Somehow extract PGO profiles from KCOV.

4. Somehow extract PGO profiles from GCOV, or modify kernel/gcov to do so.

Thanks.
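
To illustrate the mismatch described above between "has this code path
been visited" (testing/fuzzing) and "how hot is it" (profiling), here is
a minimal user-space sketch. It is a toy model only, not how KCOV or
Clang's PGO runtime are implemented: the names cov_hit(), prof_hit(),
workload() and NUM_EDGES are hypothetical, and the per-edge arrays are
an assumed simplification of whatever data each subsystem really keeps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical number of instrumented edges; not taken from KCOV or Clang. */
#define NUM_EDGES 4

/* Coverage-style record: only whether an edge was ever visited. */
static bool visited[NUM_EDGES];

/* Profile-style record: how often each edge was executed ("hotness"). */
static uint64_t counter[NUM_EDGES];

/* Hypothetical hook standing in for testing/fuzzing instrumentation. */
static void cov_hit(size_t edge)
{
	assert(edge < NUM_EDGES);
	visited[edge] = true;
}

/* Hypothetical hook standing in for profiling instrumentation. */
static void prof_hit(size_t edge)
{
	assert(edge < NUM_EDGES);
	counter[edge]++;
}

/* Toy workload: edge 0 is the loop body, edge 1 is a rarely taken branch. */
void workload(unsigned int iterations)
{
	for (unsigned int i = 0; i < iterations; i++) {
		cov_hit(0);
		prof_hit(0);
		if (i % 100 == 0) {
			cov_hit(1);
			prof_hit(1);
		}
	}
}
```

After workload(1000), both edges 0 and 1 are marked visited, so the
coverage view cannot tell them apart; only the counters show that edge 0
ran 1000 times and edge 1 ran 10 times. That extra per-edge state is the
kind of complexity trade-off in the exported data format mentioned above.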