References: <20220329181935.2183-1-beaub@linux.microsoft.com> <20220329201057.GA2549@kbox> <20220329231137.GA3357@kbox>
In-Reply-To: <20220329231137.GA3357@kbox>
From: Song Liu
Date: Wed, 30 Mar 2022 09:06:24 -0700
Subject: Re: [PATCH] tracing/user_events: Add eBPF interface for user_event created events
To: Beau Belgrave
Cc: Alexei Starovoitov, Steven Rostedt, Masami Hiramatsu, linux-trace-devel, LKML, bpf, Network Development, linux-arch, Mathieu Desnoyers
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 29, 2022 at 4:11 PM Beau Belgrave wrote:
>
> On Tue, Mar 29, 2022 at 03:31:31PM -0700, Alexei Starovoitov wrote:
> > On Tue, Mar 29, 2022 at 1:11 PM Beau Belgrave wrote:
> > >
> > > On Tue, Mar 29, 2022 at 12:50:40PM -0700, Alexei Starovoitov wrote:
> > > > On Tue, Mar 29, 2022 at 11:19 AM Beau Belgrave wrote:
> > > > >
> > > > > Send user_event data to attached eBPF programs for user_event based perf
> > > > > events.
> > > > >
> > > > > Add BPF_ITER flag to allow user_event data to have a zero copy path into
> > > > > eBPF programs if required.
> > > > >
> > > > > Update documentation to describe new flags and structures for eBPF
> > > > > integration.
> > > > >
> > > > > Signed-off-by: Beau Belgrave
> > > >
> > > > The commit describes _what_ it does, but says nothing about _why_.
> > > > At present I see no use out of bpf and user_events connection.
> > > > The whole user_events feature looks redundant to me.
> > > > We have uprobes and usdt. It doesn't look to me that
> > > > user_events provide anything new that wasn't available earlier.
> > >
> > > A lot of the why, in general, for user_events is covered in the first
> > > change in the series.
> > > Link: https://lore.kernel.org/all/20220118204326.2169-1-beaub@linux.microsoft.com/
> > >
> > > The why was also covered in Linux Plumbers Conference 2021 within the
> > > tracing microconference.
> > >
> > > An example of why we want user_events:
> > > Managed code running that emits data out via Open Telemetry.
> > > Since it's managed there isn't a stub location to patch, it moves.
> > > We watch the Open Telemetry spans in an eBPF program, when a span takes
> > > too long we collect stack data and perform other actions.
> > > With user_events and perf we can monitor the entire system from the root
> > > container without having to have relay agents within each
> > > cgroup/namespace taking up resources.
> > > We do not need to enter each cgroup mnt space and determine the correct
> > > patch location or the right version of each binary for processes that
> > > use user_events.
> > >
> > > An example of why we want eBPF integration:
> > > We also have scenarios where we are live decoding the data quickly.
> > > Having user_data fed directly to eBPF lets us cast the data coming in to
> > > a struct and decode very very quickly to determine if something is
> > > wrong.
> > > We can take that data quickly and put it into maps to perform further
> > > aggregation as required.
> > > We have scenarios that have "skid" problems, where we need to grab
> > > further data exactly when the process that had the problem was running.
> > > eBPF lets us do all of this that we cannot easily do otherwise.
> > >
> > > Another benefit from user_events is the tracing is much faster than
> > > uprobes or others using int 3 traps. This is critical to us to enable on
> > > production systems.
> >
> > None of it makes sense to me.
>
> Sorry.
>
> > To take advantage of user_events user space has to be modified
> > and writev syscalls inserted.
>
> Yes, both user_events and lttng require user space modifications to do
> tracing correctly. The syscall overheads are real, and the cost depends
> on the mitigations around spectre/meltdown.
>
> > This is not cheap and I cannot see a production system using this interface.
>
> But you are fine with uprobe costs? uprobes appear to be much more costly
> than a syscall approach on the hardware I've run on.

Can we achieve the same/similar performance with sys_bpf(BPF_PROG_RUN)?

Thanks,
Song

>
> > All you did is a poor man version of lttng that doesn't rely
> > on such heavy instrumentation.
>
> Well I am a frugal person. :)
>
> This work has solved some critical issues we've been having, and I would
> appreciate a review of the code if possible.
>
> Thanks,
> -Beau