X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240311093526.1010158-1-dongmenglong.8@bytedance.com>
 <20240311093526.1010158-2-dongmenglong.8@bytedance.com>
 <20240328111330.194dcbe5@gandalf.local.home>
In-Reply-To: <20240328111330.194dcbe5@gandalf.local.home>
From: 梦龙董
Date: Sat, 30 Mar 2024 11:18:29 +0800
Subject: Re: [External] Re: [PATCH bpf-next v2 1/9] bpf: tracing: add support to record and check the accessed args
To: Steven Rostedt
Cc: Alexei Starovoitov, Jiri Olsa, Andrii Nakryiko, Alexei Starovoitov,
 Daniel Borkmann, Martin KaFai Lau, Eddy Z, Song Liu, Yonghong Song,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Alexander Gordeev,
 Christian Borntraeger, Sven Schnelle, "David S. Miller", David Ahern,
 Dave Hansen, X86 ML, Mathieu Desnoyers, Quentin Monnet, bpf,
 linux-arm-kernel, LKML, linux-riscv, linux-s390, Network Development,
 linux-trace-kernel@vger.kernel.org,
 "open list:KERNEL SELFTEST FRAMEWORK",
 linux-stm32@st-md-mailman.stormreply.com

On Thu, Mar 28, 2024 at 11:11 PM Steven Rostedt wrote:
>
> On Thu, 28 Mar 2024 22:43:46 +0800
> 梦龙董 wrote:
>
> > I have done a simple benchmark on creating 1000
> > trampolines. It is slow, quite slow, which consume up to
> > 60s. We can't do it this way.
> >
> > Now, I have a bad idea. How about we introduce
> > a "dynamic trampoline"? The basic logic of it can be:
> >
> > """
> > save regs
> > bpfs = trampoline_lookup_ip(ip)
> > fentry = bpfs->fentries
> > while fentry:
> >   fentry(ctx)
> >   fentry = fentry->next
> >
> > call origin
> > save return value
> >
> > fexit = bpfs->fexits
> > while fexit:
> >   fexit(ctx)
> >   fexit = fexit->next
> >
> > xxxxxx
> > """
> >
> > And we lookup the "bpfs" by the function ip in a hash map
> > in trampoline_lookup_ip. The type of "bpfs" is:
> >
> > struct bpf_array {
> >   struct bpf_prog *fentries;
> >   struct bpf_prog *fexits;
> >   struct bpf_prog *modify_returns;
> > }
> >
> > When we need to attach the bpf progA to function A/B/C,
> > we only need to create the bpf_arrayA, bpf_arrayB, bpf_arrayC
> > and add the progA to them, and insert them to the hash map
> > "direct_call_bpfs", and attach the "dynamic trampoline" to
> > A/B/C.
> > If bpf_arrayA exist, just add progA to the tail of
> > bpf_arrayA->fentries. When we need to attach progB to
> > B/C, just add progB to bpf_arrayB->fentries and
> > bpf_arrayB->fentries.
> >
> > Compared to the trampoline, extra overhead is introduced
> > by the hash lookuping.
> >
> > I have not begun to code yet, and I am not sure the overhead is
> > acceptable. Considering that we also need to do hash lookup
> > by the function in kprobe_multi, maybe the overhead is
> > acceptable?
>
> Sounds like you are just recreating the function management that ftrace
> has. It also can add thousands of trampolines very quickly, because it does
> it in batches. It takes special synchronization steps to attach to fentry.
> ftrace (and I believe multi-kprobes) updates all the attachments for each
> step, so the synchronization needed is only done once.
>

Yes, it is fast to register a trampoline for a kernel function in the
managed ftrace, via
register_fentry->register_ftrace_direct->ftrace_add_rec_direct.
That path adds the trampoline to the hash table "direct_functions".
The trampoline is then called in the following steps (I'm not sure if
I understand it correctly):

    ftrace_regs_caller
    |
    __ftrace_ops_list_func
        -> call_direct_funcs -> save trampoline to pt_regs->orig_ax
    |
    call pt_regs->orig_ax if not NULL

The logic above means that only one trampoline can be called, so a
kernel function can only have one trampoline.

My original idea was to register all the shared trampolines with the
managed ftrace. For example, if we have the shared trampoline1 for
functions A/B/C and the shared trampoline2 for functions B/C/D, then
I would register both trampoline1 and trampoline2 for functions B/C.
However, that can't work, as we can't call 2 trampolines for one
function.

Then I thought that we could create a "dynamic trampoline". The logic
for the non-ftrace-managed case is simple: we only need to replace the
"nop" of all the target functions with "call dynamic_trampoline".
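The single-trampoline limitation described above can be pictured with a
small userspace model. This is only a sketch under stated assumptions,
not the kernel's implementation: the table here mimics the role of
"direct_functions" (one trampoline address per function ip), and the
names direct_entry, register_direct, direct_lookup,
shared_trampoline_example and SLOTS are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS 64 /* hypothetical table size, must be a power of two */

/* One entry per traced function: ip maps to exactly one trampoline. */
struct direct_entry {
    uintptr_t ip;    /* traced function address; 0 marks a free slot */
    uintptr_t tramp; /* the only trampoline this function can call */
};

static struct direct_entry direct_functions[SLOTS];

/* Open addressing: return the slot holding ip, or the free slot for it. */
static struct direct_entry *find_slot(uintptr_t ip)
{
    size_t i = (size_t)((ip >> 4) & (SLOTS - 1));

    for (size_t n = 0; n < SLOTS; n++) {
        struct direct_entry *e = &direct_functions[i];

        if (e->ip == ip || e->ip == 0)
            return e;
        i = (i + 1) & (SLOTS - 1);
    }
    return NULL; /* table full */
}

/* Returns 0 on success, -1 if ip already has a trampoline or no room. */
int register_direct(uintptr_t ip, uintptr_t tramp)
{
    struct direct_entry *e;

    assert(ip != 0); /* 0 is reserved as the free-slot marker */
    e = find_slot(ip);
    if (!e || e->ip == ip)
        return -1;
    e->ip = ip;
    e->tramp = tramp;
    return 0;
}

/* Returns the trampoline registered for ip, or 0 if there is none. */
uintptr_t direct_lookup(uintptr_t ip)
{
    struct direct_entry *e = find_slot(ip);

    return (e && e->ip == ip) ? e->tramp : 0;
}

/* Replays trampoline1 on A/B/C and trampoline2 on B/C/D.
 * Returns how many registrations were refused. */
int shared_trampoline_example(void)
{
    enum { A = 0x1000, B = 0x2000, C = 0x3000, D = 0x4000 };
    const uintptr_t tramp1 = 0xa000, tramp2 = 0xb000;
    const uintptr_t t1_funcs[] = { A, B, C }, t2_funcs[] = { B, C, D };
    int refused = 0;

    for (size_t i = 0; i < 3; i++)
        refused += register_direct(t1_funcs[i], tramp1) != 0;
    for (size_t i = 0; i < 3; i++)
        refused += register_direct(t2_funcs[i], tramp2) != 0;
    return refused;
}
```

In this model the two registrations of trampoline2 on B and C are
refused, because those slots already hold trampoline1. Registering one
shared dynamic trampoline address for every target function avoids the
conflict, since each function then needs only a single entry.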
And for the ftrace-managed case, the logic is the same, except that
the trampoline we add to the "direct_functions" hash is the
dynamic-trampoline:

    ftrace_regs_caller
    |
    __ftrace_ops_list_func
        -> call_direct_funcs -> save dynamic-trampoline to pt_regs->orig_ax
    |
    call pt_regs->orig_ax(dynamic-trampoline) if not NULL

And in the dynamic-trampoline, we can call prog1 for A, call prog1 and
prog2 for B/C, and call prog2 for D. And the registration is fast
enough.

> If you really want to have thousands of functions, why not just register it
> with ftrace itself. It will give you the arguments via the ftrace_regs
> structure. Can't you just register a program as the callback?
>

Hmm... I don't understand. The main purposes for me to use TRACING
are:

1. We can directly access the memory, which is more efficient.
2. We can obtain the function args in FEXIT, which kretprobe can't do.
   This is the main reason.

Thanks!
Menglong Dong

> It will probably make your accounting much easier, and just let ftrace
> handle the fentry logic. That's what it was made to do.
>
> -- Steve
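
The dynamic-trampoline dispatch discussed in this mail (prog1 for A,
prog1 and prog2 for B/C, prog2 for D) can be sketched in userspace C as
follows. This is a rough illustration under stated assumptions, not
kernel code: struct func_progs stands in for the per-function "bpfs"
record, a linear scan replaces the hash lookup, fixed-size arrays
replace the linked lists, modify_returns is left out, and every name
other than trampoline_lookup_ip and dynamic_trampoline is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_FUNCS 16 /* hypothetical table capacity */
#define MAX_PROGS 4  /* hypothetical per-function program limit */

struct call_ctx {
    uintptr_t ip; /* address of the traced function */
    long arg;     /* one argument, for illustration */
    long ret;     /* return value, filled in before fexit programs run */
};

typedef void (*prog_fn)(struct call_ctx *ctx);
typedef long (*orig_fn)(long arg);

/* Per-function record; stands in for the "bpfs" entry in the pseudocode. */
struct func_progs {
    uintptr_t ip; /* lookup key, must be nonzero; 0 marks a free slot */
    orig_fn origin;
    prog_fn fentries[MAX_PROGS], fexits[MAX_PROGS];
    size_t nr_fentry, nr_fexit;
};

static struct func_progs table[MAX_FUNCS];

/* A linear scan stands in for the hash lookup by ip. */
static struct func_progs *trampoline_lookup_ip(uintptr_t ip, int create)
{
    struct func_progs *free_slot = NULL;

    for (size_t i = 0; i < MAX_FUNCS; i++) {
        if (table[i].ip == ip)
            return &table[i];
        if (table[i].ip == 0 && !free_slot)
            free_slot = &table[i];
    }
    if (!create || !free_slot)
        return NULL;
    free_slot->ip = ip;
    return free_slot;
}

/* Append programs to the tail of one function's lists. */
int attach(uintptr_t ip, orig_fn origin, prog_fn fentry, prog_fn fexit)
{
    struct func_progs *fp = trampoline_lookup_ip(ip, 1);

    if (!fp || (fentry && fp->nr_fentry == MAX_PROGS) ||
        (fexit && fp->nr_fexit == MAX_PROGS))
        return -1;
    fp->origin = origin;
    if (fentry)
        fp->fentries[fp->nr_fentry++] = fentry;
    if (fexit)
        fp->fexits[fp->nr_fexit++] = fexit;
    return 0;
}

/* One shared trampoline body for every attached function. */
long dynamic_trampoline(uintptr_t ip, long arg)
{
    struct func_progs *fp = trampoline_lookup_ip(ip, 0);
    struct call_ctx ctx = { .ip = ip, .arg = arg, .ret = 0 };

    assert(fp != NULL); /* only installed on attached functions */
    for (size_t i = 0; i < fp->nr_fentry; i++)
        fp->fentries[i](&ctx);
    ctx.ret = fp->origin(arg); /* "call origin", "save return value" */
    for (size_t i = 0; i < fp->nr_fexit; i++)
        fp->fexits[i](&ctx);
    return ctx.ret;
}

/* Example programs and a stand-in for the traced kernel functions. */
int prog1_calls, prog2_calls;
long last_fexit_arg, last_fexit_ret;

static void prog1_fentry(struct call_ctx *ctx) { (void)ctx; prog1_calls++; }
static void prog2_fentry(struct call_ctx *ctx) { (void)ctx; prog2_calls++; }
static void prog2_fexit(struct call_ctx *ctx)
{
    last_fexit_arg = ctx->arg; /* FEXIT still sees the arguments */
    last_fexit_ret = ctx->ret;
}
static long double_it(long x) { return 2 * x; }

enum { FUNC_A = 0x1000, FUNC_B = 0x2000, FUNC_C = 0x3000, FUNC_D = 0x4000 };

/* prog1 on A/B/C and prog2 on B/C/D, as in the example in the text. */
void setup_example(void)
{
    attach(FUNC_A, double_it, prog1_fentry, NULL);
    attach(FUNC_B, double_it, prog1_fentry, NULL);
    attach(FUNC_C, double_it, prog1_fentry, NULL);
    attach(FUNC_B, double_it, prog2_fentry, prog2_fexit);
    attach(FUNC_C, double_it, prog2_fentry, prog2_fexit);
    attach(FUNC_D, double_it, prog2_fentry, prog2_fexit);
}
```

After setup_example(), calling dynamic_trampoline(FUNC_B, 4) runs both
fentry programs, calls the original function, and then runs the fexit
program with both the argument and the return value available. The
per-call cost added over a dedicated trampoline is the lookup by ip,
which is the overhead the quoted text worries about.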