Subject: Re: [PATCH v2 4/5] perf kvm: Support sampling guest callchains
From: Maxim Levitsky
To: Tianyi Liu, seanjc@google.com, pbonzini@redhat.com, peterz@infradead.org, mingo@redhat.com, acme@kernel.org
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, kvm@vger.kernel.org, x86@kernel.org, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com
Date: Tue, 10 Oct 2023 19:12:11 +0300

On Sun, 2023-10-08 at 22:57 +0800, Tianyi Liu wrote:
> This patch provides support for sampling guests' callchains.
>
> The signature of `get_perf_callchain` has been modified to explicitly
> specify whether it needs to sample the host or guest callchain.
> Based on the context, it will distribute the sampling request to one of
> `perf_callchain_user`, `perf_callchain_kernel`, or `perf_callchain_guest`.
>
> The reason for separately implementing `perf_callchain_user` and
> `perf_callchain_kernel` is that the kernel may utilize special unwinders
> such as `ORC`.
> However, for the guest, we only support stackframe-based
> unwinding, so the implementation is generic and only needs to be
> separately implemented for 32-bit and 64-bit.
>
> Signed-off-by: Tianyi Liu
> ---
>  arch/x86/events/core.c     | 56 +++++++++++++++++++++++++++++++-------
>  include/linux/perf_event.h |  3 +-
>  kernel/bpf/stackmap.c      |  8 +++---
>  kernel/events/callchain.c  | 27 +++++++++++++++++-
>  kernel/events/core.c       |  7 ++++-
>  5 files changed, 84 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index 185f902e5..ea4c86175 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2758,11 +2758,6 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
>  	struct unwind_state state;
>  	unsigned long addr;
>
> -	if (perf_guest_state()) {
> -		/* TODO: We don't support guest os callchain now */
> -		return;
> -	}
> -
>  	if (perf_callchain_store(entry, regs->ip))
>  		return;
>
> @@ -2778,6 +2773,52 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
>  	}
>  }
>
> +static inline void
> +perf_callchain_guest32(struct perf_callchain_entry_ctx *entry)
> +{
> +	struct stack_frame_ia32 frame;
> +	const struct stack_frame_ia32 *fp;
> +
> +	fp = (void *)perf_guest_get_frame_pointer();
> +	while (fp && entry->nr < entry->max_stack) {
> +		if (!perf_guest_read_virt(&fp->next_frame, &frame.next_frame,

This should be fp->next_frame.

> +					  sizeof(frame.next_frame)))
> +			break;
> +		if (!perf_guest_read_virt(&fp->return_address, &frame.return_address,

Same here.

> +					  sizeof(frame.return_address)))
> +			break;
> +		perf_callchain_store(entry, frame.return_address);
> +		fp = (void *)frame.next_frame;
> +	}
> +}
> +
> +void
> +perf_callchain_guest(struct perf_callchain_entry_ctx *entry)
> +{
> +	struct stack_frame frame;
> +	const struct stack_frame *fp;
> +	unsigned int guest_state;
> +
> +	guest_state = perf_guest_state();
> +	perf_callchain_store(entry, perf_guest_get_ip());
> +
> +	if (guest_state & PERF_GUEST_64BIT) {
> +		fp = (void *)perf_guest_get_frame_pointer();
> +		while (fp && entry->nr < entry->max_stack) {
> +			if (!perf_guest_read_virt(&fp->next_frame, &frame.next_frame,

Same here.

> +						  sizeof(frame.next_frame)))
> +				break;
> +			if (!perf_guest_read_virt(&fp->return_address, &frame.return_address,

And here.

> +						  sizeof(frame.return_address)))
> +				break;
> +			perf_callchain_store(entry, frame.return_address);
> +			fp = (void *)frame.next_frame;
> +		}
> +	} else {
> +		perf_callchain_guest32(entry);
> +	}
> +}

For symmetry, maybe it makes sense to have perf_callchain_guest32 and
perf_callchain_guest64 and then make perf_callchain_guest call each?
No strong opinion on this of course.

> +
>  static inline int
>  valid_user_frame(const void __user *fp, unsigned long size)
>  {
> @@ -2861,11 +2902,6 @@ perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs
>  	struct stack_frame frame;
>  	const struct stack_frame __user *fp;
>
> -	if (perf_guest_state()) {
> -		/* TODO: We don't support guest os callchain now */
> -		return;
> -	}
> -
>  	/*
>  	 * We don't know what to do with VM86 stacks.. ignore them for now.
>  	 */
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index d0f937a62..a2baf4856 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1545,9 +1545,10 @@ DECLARE_PER_CPU(struct perf_callchain_entry, perf_callchain_entry);
>
>  extern void perf_callchain_user(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs);
>  extern void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs);
> +extern void perf_callchain_guest(struct perf_callchain_entry_ctx *entry);
>  extern struct perf_callchain_entry *
>  get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> -		   u32 max_stack, bool crosstask, bool add_mark);
> +		   bool host, bool guest, u32 max_stack, bool crosstask, bool add_mark);
>  extern int get_callchain_buffers(int max_stack);
>  extern void put_callchain_buffers(void);
>  extern struct perf_callchain_entry *get_callchain_entry(int *rctx);
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 458bb80b1..2e88d4639 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -294,8 +294,8 @@ BPF_CALL_3(bpf_get_stackid, struct pt_regs *, regs, struct bpf_map *, map,
>  	if (max_depth > sysctl_perf_event_max_stack)
>  		max_depth = sysctl_perf_event_max_stack;
>
> -	trace = get_perf_callchain(regs, 0, kernel, user, max_depth,
> -				   false, false);
> +	trace = get_perf_callchain(regs, 0, kernel, user, true, false,
> +				   max_depth, false, false);
>
>  	if (unlikely(!trace))
>  		/* couldn't fetch the stack trace */
> @@ -420,8 +420,8 @@ static long __bpf_get_stack(struct pt_regs *regs, struct task_struct *task,
>  	else if (kernel && task)
>  		trace = get_callchain_entry_for_task(task, max_depth);
>  	else
> -		trace = get_perf_callchain(regs, 0, kernel, user, max_depth,
> -					   false, false);
> +		trace = get_perf_callchain(regs, 0, kernel, user, true, false,
> +					   max_depth, false, false);
>  	if (unlikely(!trace))
>  		goto err_fault;
>
> diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
> index 1273be843..7e80729e9 100644
> --- a/kernel/events/callchain.c
> +++ b/kernel/events/callchain.c
> @@ -45,6 +45,10 @@ __weak void perf_callchain_user(struct perf_callchain_entry_ctx *entry,
>  {
>  }
>
> +__weak void perf_callchain_guest(struct perf_callchain_entry_ctx *entry)
> +{
> +}
> +
>  static void release_callchain_buffers_rcu(struct rcu_head *head)
>  {
>  	struct callchain_cpus_entries *entries;
> @@ -178,11 +182,12 @@ put_callchain_entry(int rctx)
>
>  struct perf_callchain_entry *
>  get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
> -		   u32 max_stack, bool crosstask, bool add_mark)
> +		   bool host, bool guest, u32 max_stack, bool crosstask, bool add_mark)
>  {
>  	struct perf_callchain_entry *entry;
>  	struct perf_callchain_entry_ctx ctx;
>  	int rctx;
> +	unsigned int guest_state;
>
>  	entry = get_callchain_entry(&rctx);
>  	if (!entry)
> @@ -194,6 +199,26 @@ get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
>  	ctx.contexts = 0;
>  	ctx.contexts_maxed = false;
>
> +	guest_state = perf_guest_state();
> +	if (guest_state) {
> +		if (!guest)
> +			goto exit_put;
> +		if (user && (guest_state & PERF_GUEST_USER)) {
> +			if (add_mark)
> +				perf_callchain_store_context(&ctx, PERF_CONTEXT_GUEST_USER);
> +			perf_callchain_guest(&ctx);
> +		}
> +		if (kernel && !(guest_state & PERF_GUEST_USER)) {
> +			if (add_mark)
> +				perf_callchain_store_context(&ctx, PERF_CONTEXT_GUEST_KERNEL);
> +			perf_callchain_guest(&ctx);
> +		}
> +		goto exit_put;
> +	}
> +
> +	if (unlikely(!host))
> +		goto exit_put;
> +
>  	if (kernel && !user_mode(regs)) {
>  		if (add_mark)
>  			perf_callchain_store_context(&ctx, PERF_CONTEXT_KERNEL);
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index eaba00ec2..b3401f403 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -7559,6 +7559,8 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
>  {
>  	bool kernel = !event->attr.exclude_callchain_kernel;
>  	bool user = !event->attr.exclude_callchain_user;
> +	bool host = !event->attr.exclude_host;
> +	bool guest = !event->attr.exclude_guest;
>  	/* Disallow cross-task user callchains. */
>  	bool crosstask = event->ctx->task && event->ctx->task != current;
>  	const u32 max_stack = event->attr.sample_max_stack;
> @@ -7567,7 +7569,10 @@ perf_callchain(struct perf_event *event, struct pt_regs *regs)
>  	if (!kernel && !user)
>  		return &__empty_callchain;
>
> -	callchain = get_perf_callchain(regs, 0, kernel, user,
> +	if (!host && !guest)
> +		return &__empty_callchain;
> +
> +	callchain = get_perf_callchain(regs, 0, kernel, user, host, guest,
>  				       max_stack, crosstask, true);
>  	return callchain ?: &__empty_callchain;
>  }

Best regards,
	Maxim Levitsky
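
A rough sketch of the symmetric 32-bit/64-bit split suggested above, for
illustration only. It is not the patch's code: guest_mem, read_virt,
struct chain, chain_store and the callchain_guest* names are hypothetical
user-space stand-ins for guest memory, perf_guest_read_virt,
struct perf_callchain_entry_ctx, perf_callchain_store and the
perf_callchain_guest* functions. It assumes each frame stores the saved
frame pointer immediately followed by the return address, and it treats a
guest virtual address as a plain offset into a byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHAIN_MAX 16

/* Hypothetical guest memory: a "guest virtual address" is an offset here. */
static uint8_t guest_mem[256];

/* Hypothetical stand-in for perf_guest_read_virt(); returns 1 on success. */
static int read_virt(uint64_t addr, void *dst, size_t len)
{
	if (addr >= sizeof(guest_mem) || len > sizeof(guest_mem) - addr)
		return 0;
	memcpy(dst, guest_mem + addr, len);
	return 1;
}

/* Hypothetical stand-in for struct perf_callchain_entry_ctx. */
struct chain {
	uint64_t ip[CHAIN_MAX];
	unsigned int nr, max_stack;
};

static void chain_store(struct chain *c, uint64_t ip)
{
	assert(c->max_stack <= CHAIN_MAX);
	if (c->nr < c->max_stack)
		c->ip[c->nr++] = ip;
}

/* Assumed frame layouts: saved frame pointer, then return address. */
struct frame32 { uint32_t next_frame; uint32_t return_address; };
struct frame64 { uint64_t next_frame; uint64_t return_address; };

static void callchain_guest32(struct chain *c, uint64_t fp)
{
	struct frame32 f;

	/* Stop on a NULL frame pointer, a full chain, or a failed read. */
	while (fp && c->nr < c->max_stack) {
		if (!read_virt(fp, &f, sizeof(f)))
			break;
		chain_store(c, f.return_address);
		fp = f.next_frame;
	}
}

static void callchain_guest64(struct chain *c, uint64_t fp)
{
	struct frame64 f;

	while (fp && c->nr < c->max_stack) {
		if (!read_virt(fp, &f, sizeof(f)))
			break;
		chain_store(c, f.return_address);
		fp = f.next_frame;
	}
}

/* Record the sampled IP first, then pick the walker by guest bitness. */
static void callchain_guest(struct chain *c, int is_64bit,
			    uint64_t ip, uint64_t fp)
{
	chain_store(c, ip);
	if (is_64bit)
		callchain_guest64(c, fp);
	else
		callchain_guest32(c, fp);
}
```

With this shape the dispatcher holds no unwinding logic of its own, and the
two walkers differ only in the frame type they read. Each walk is bounded by
max_stack, so a frame-pointer loop in guest memory cannot spin forever.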