X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240404190146.1898103-1-elver@google.com>
From: Marco Elver
Date: Mon, 8 Apr 2024 11:30:11 +0200
Subject: Re: [PATCH bpf-next 1/2] bpf: Introduce bpf_probe_write_user_registered()
To: Andrii Nakryiko
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Eduard Zingerman, Song Liu, Yonghong Song, John Fastabend, KP Singh,
 Stanislav Fomichev, Hao Luo, Jiri Olsa, Dmitry Vyukov, Steven Rostedt,
 Masami Hiramatsu, Mathieu Desnoyers, bpf, "open list:DOCUMENTATION",
 linux-trace-kernel@vger.kernel.org, LKML

On Fri, 5 Apr 2024 at 22:28, Andrii Nakryiko wrote:
>
> On Fri, Apr 5, 2024 at 1:28 AM Marco Elver wrote:
> >
> > On Fri, 5 Apr 2024 at 01:23, Alexei Starovoitov wrote:
[...]
> > > and the tasks can use mmaped array shared across all or unique to each
> > > process.
> > > And both bpf and user space can read/write them with a single instruction.
> >
> > That's BPF_F_MMAPABLE, right?
> >
> > That does not work because the mmapped region is global. Our requirements are:
> >
> > 1. Single tracing BPF program.
> >
> > 2. Per-process (per VM) memory region (here it's per-thread, but each
> > thread just registers the same process-wide region). No sharing
> > between processes.
> >
> > 3. From #2 it follows: exec unregisters the registered memory region;
> > fork gets a cloned region.
> >
> > 4. Unprivileged processes can do prctl(REGISTER). Some of them might
> > not be able to use the bpf syscall.
> >
> > The reason for #2 is that each user space process also writes to the
> > memory region (read by the BPF program to make updates depending on
> > what state it finds), and having shared state between processes
> > doesn't work here.
> >
> > Is there any reasonable BPF facility that can do this today? (If
> > BPF_F_MMAPABLE could do it while satisfying requirements 2-4, I'd be a
> > happy camper.)
>
> You could simulate something like this with multi-element ARRAY +
> BPF_F_MMAPABLE, though you'd need to pre-allocate up to max number of
> processes, so it's not an exact fit.

Right, for production use this is infeasible.
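
To make the pre-allocation concern concrete, here is a minimal sketch in plain C. It is a userspace model of a fixed-size slot array, not BPF code: `slot_table`, `claim_slot()`, `prealloc_bytes()` and `MAX_SLOTS` are hypothetical names chosen for illustration, and the tiny slot count is an assumption to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cap on concurrently registered processes. */
#define MAX_SLOTS 4

/* Model of a pre-allocated array with one slot per process. */
struct slot_table {
	int32_t owner[MAX_SLOTS]; /* 0 = free, otherwise the owning pid */
};

/*
 * Return the slot index already owned by pid, or claim the first free
 * slot. Returns -1 once every slot is taken: the array cannot grow,
 * which is why it must be sized for the worst case up front.
 */
static int claim_slot(struct slot_table *t, int32_t pid)
{
	assert(pid > 0); /* 0 is reserved as the "free" marker */

	for (int i = 0; i < MAX_SLOTS; i++)
		if (t->owner[i] == pid)
			return i;

	for (int i = 0; i < MAX_SLOTS; i++) {
		if (t->owner[i] == 0) {
			t->owner[i] = pid;
			return i;
		}
	}
	return -1;
}

/* Memory reserved up front, whether or not any slot is ever used. */
static uint64_t prealloc_bytes(uint64_t max_entries, uint64_t value_size)
{
	return max_entries * value_size;
}
```

Under these assumptions, a fifth distinct pid gets -1 from `claim_slot()`. Sizing for the worst case instead, with an assumed 4 KiB region per process and 4194304 entries (the largest pid_max on 64-bit Linux), `prealloc_bytes(4194304, 4096)` comes to 16 GiB reserved regardless of how many processes actually register.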

> But what seems to be much closer is using BPF task-local storage, if
> we support mmap()'ing its memory into user-space. We've had previous
> discussions on how to achieve this (the simplest being that
> mmap(task_local_map_fd, ...) maps current thread's part of BPF task
> local storage). You won't get automatic cloning (you'd have to do that
> from the BPF program on fork/exec tracepoint, for example), and within
> the process you'd probably want to have just one thread (main?) to
> mmap() initially and just share the pointer across all relevant
> threads.

In the way you imagine it, would that allow all threads to share the
same memory, despite it being task-local? Presumably each task's local
storage would be mapped to just point to the same memory?

> But this is a more generic building block, IMO. This relying
> on BPF map also means pinning is possible and all the other BPF map
> abstraction benefits.

Deployment-wise it will make things harder, because unprivileged
processes still have to somehow get the map's shared fd in order to
mmap() it. Not unsolvable, and in general what you describe looks
interesting, but I currently can't see how it will be simpler.

In the absence of all that, is a safer "bpf_probe_write_user()" like the
one I proposed in this patch ("bpf_probe_write_user_registered()") at
all appealing?

Thanks,
-- Marco
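
For reference, the "fork gets a cloned region, no sharing between processes" semantics discussed above can be sketched with ordinary userspace memory. This is only an analogy, assuming a private anonymous mapping as a stand-in for the registered region; it uses no BPF facility, and `fork_clones_region()` is a hypothetical helper name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child inherited the parent's data at fork() and the
 * child's later write stayed invisible to the parent, 0 if either
 * property failed, and -1 on a system call error.
 */
static int fork_clones_region(size_t len)
{
	assert(len >= sizeof("parent"));

	/* MAP_PRIVATE: the mapping is copy-on-write across fork(). */
	char *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (region == MAP_FAILED)
		return -1;
	strcpy(region, "parent");

	pid_t pid = fork();
	if (pid < 0) {
		munmap(region, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: starts with a clone of the data, then diverges. */
		int inherited = strcmp(region, "parent") == 0;
		strcpy(region, "child");
		_exit(inherited ? 0 : 1);
	}

	int status = 0;
	if (waitpid(pid, &status, 0) < 0) {
		munmap(region, len);
		return -1;
	}
	int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
	/* Parent: its copy must not reflect the child's write. */
	int parent_unchanged = strcmp(region, "parent") == 0;

	munmap(region, len);
	return child_ok && parent_unchanged;
}
```

Calling `fork_clones_region(4096)` exercises both halves of the requirement: the child begins with the parent's contents, and neither side observes the other's writes afterwards. The mapping also disappears on exec, as any mapping does, which loosely mirrors "exec unregisters".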