From: Florent Revest
Date: Tue, 22 Dec 2020 21:38:33 +0100
Subject: Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper
To: Andrii Nakryiko
Cc: Alexei Starovoitov, Yonghong Song, KP Singh, bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Florent Revest, open list
References: <194b5a6e6e30574a035a3e3baa98d7fde7f91f1c.camel@chromium.org> <221fb873-80fc-5407-965e-b075c964fa13@fb.com> <20201218032009.ycmyqn2kjs3ynfbp@ast-mbp> <20201218203655.clqyeeamwicvej5z@ast-mbp>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Dec 18, 2020 at 9:47 PM Andrii Nakryiko wrote:
>
> On Fri, Dec 18, 2020 at 12:36 PM Alexei Starovoitov wrote:
> >
> > On Fri, Dec 18, 2020 at 10:53:57AM -0800, Andrii Nakryiko wrote:
> > > On Thu, Dec 17, 2020 at 7:20 PM Alexei Starovoitov wrote:
> > > >
> > > > On Thu, Dec 17, 2020 at 09:26:09AM -0800, Yonghong Song wrote:
> > > > >
> > > > >
> > > > > On 12/17/20 7:31 AM, Florent Revest wrote:
> > > > > > On Mon, Dec 14, 2020 at 7:47 AM Yonghong Song wrote:
> > > > > > > On 12/11/20 6:40 AM, Florent Revest wrote:
> > > > > > > > On Wed, Dec 2, 2020 at 10:18 PM Alexei Starovoitov wrote:
> > > > > > > > > I still think that adopting printk/vsnprintf for this instead of
> > > > > > > > > reinventing the wheel
> > > > > > > > > is more flexible and easier to maintain long term.
> > > > > > > > > Almost the same layout can be done with vsnprintf
> > > > > > > > > with exception of \0 char.
> > > > > > > > > More meaningful names, etc.
> > > > > > > > > See Documentation/core-api/printk-formats.rst
> > > > > > > >
> > > > > > > > I agree this would be nice. I finally got a bit of time to experiment
> > > > > > > > with this and I noticed a few things:
> > > > > > > >
> > > > > > > > First of all, because helpers only have 5 arguments, if we use two for
> > > > > > > > the output buffer and its size and two for the format string and its
> > > > > > > > size, we are only left with one argument for a modifier. This is still
> > > > > > > > enough for our usecase (where we'd only use "%ps" for example) but it
> > > > > > > > does not strictly-speaking allow for the same layout that Andrii
> > > > > > > > proposed.
> > > > > > >
> > > > > > > See helper bpf_seq_printf. It packs all arguments for format string and
> > > > > > > puts them into an array. bpf_seq_printf will unpack them as it parsed
> > > > > > > through the format string. So it should be doable to have more than
> > > > > > > "%ps" in format string.
> > > > > >
> > > > > > This could be a nice trick, thank you for the suggestion Yonghong :)
> > > > > >
> > > > > > My understanding is that this would also require two extra args (one
> > > > > > for the array of arguments and one for the size of this array) so it
> > > > > > would still not fit the 5 arguments limit I described in my previous
> > > > > > email.
> > > > > > eg: this would not be possible:
> > > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > > >                   const char *fmt, u32 fmt_size,
> > > > > >                   const void *data, u32 data_len)
> > > > >
> > > > > Right. bpf allows only up to 5 parameters.
> > > > > >
> > > > > > Would you then suggest that we also put the format string and its
> > > > > > length in the first and second cells of this array and have something
> > > > > > along the line of:
> > > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > > >                   const void *args, u32 args_len) ?
> > > > > > This seems like a fairly opaque signature to me and harder to verify.
> > > > >
> > > > > One way is to define an explicit type for args, something like
> > > > > struct bpf_fmt_str_data {
> > > > >     char *fmt;
> > > > >     u64 fmt_len;
> > > > >     u64 data[];
> > > > > };
> > > >
> > > > that feels a bit convoluted.
> > > >
> > > > The reason I feel unease with the helper as was originally proposed
> > > > and with Andrii's proposal is all the extra strlen and strcpy that
> > > > needs to be done. In the helper we have to call kallsyms_lookup()

Note that vsprintf itself calls __sprint_symbol which does the same
thing as my helper (a kallsyms_lookup followed by a strcpy and a
strlen)

> > > > which is ok interface for what it was designed to do,
> > > > but it's awkward to use to construct new string ("%s [%s]", sym, modname)
> > > > or to send two strings into a ring buffer.
> > > > Andrii's zero separator idea will simplify bpf prog, but user space
> > > > would need to do strlen anyway if it needs to pretty print.
> > > > If we take pain on converting addr to sym+modname let's figure out
> > > > how to make it easy for the bpf prog to do and easy for user space to consume.
> > > > That's why I proposed snprintf.

Both solutions are fine with us but I feel that the snprintf would be
generally more helpful for BPF.

> > >
> > > I have nothing against snprintf support for symbols. But
> > > bpf_ksym_resolve() solves only a partially overlapping problem, so
> > > deserves to be added in addition to snprintf support.
> > > With snprintf,
> > > it will be hard to avoid two lookups of the same symbol to print "%s
> > > [%s]" form, so there is a performance loss, which is probably bigger
> > > than a simple search for a zero-byte.
> >
> > I suspect we're not on the same page in terms of what printf can do.
> > See Documentation/core-api/printk-formats.rst and lib/vsprintf.c:symbol_string()
> > It's exactly one lookup in sprintf implementation.
> > bpf_snprintf(buf, "%ps", addr) would be equivalent to
> > {
> >     ksym_resolve(sym, modname, addr, SYM | MOD);
> >     printf("%s [%s]", sym, modname);
> > }
>
> Ah, I missed that we'll have a single specifier for "%s [%s]" format.
> My assumption was that we have one for symbol name only and another
> for symbol module. Yeah, then it's fine from the performance
> perspective.
>
> >
> > > But bpf_ksym_resolve() can be
> > > used flexibly. You can either do two separate bpf_ksym_resolve() calls
> > > to get symbol name (and its length) and symbol's module (and its
> > > length), if you need to process it programmatically in BPF program. Or
> > > you can bundle it together and let user-space process it. User-space
> > > will need to copy data anyways because it can't stay in
> > > perfbuf/ringbuf for long. So scanning for zero delimiters will be
> > > negligible, it will just bring data into cache. All I'm saying is that
> > > ksym_resolve() gives flexibility which snprintf can't provide.
> >
> > Well, with snprintf there will be no way to print mod symbol
> > without modname, but imo it's a good thing.
> > What is the use case for getting mod symbol without modname?
>
> For easier post-processing on the user side. Instead of parsing
> "vmlinux_symbol" or "module_symbol [module_name]" (two non-uniform
> variants already), user-space would just get two separate strings. I
> just like APIs that don't assume how I am going to use them :), so
> "symbol [module]" format is a bit more inconvenient than decomposed
> pieces.
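To make the post-processing cost concrete, here is a rough user-space sketch of what splitting the two variants could look like. It assumes only the two output shapes quoted above ("vmlinux_symbol" and "module_symbol [module_name]"); the struct ksym_parts type, the split_ksym() function and the buffer sizes are hypothetical names picked for illustration, not anything from the kernel or libbpf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the decomposed pieces (illustration only). */
struct ksym_parts {
    char sym[128]; /* symbol name */
    char mod[64];  /* module name, empty string for built-in symbols */
};

/*
 * Hypothetical user-space helper: split "sym" or "sym [mod]" into its
 * pieces. Returns 2 if both a symbol and a module were found, 1 if only
 * a symbol was found, and 0 if the input held no symbol at all.
 */
static int split_ksym(const char *text, struct ksym_parts *out)
{
    int n;

    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /*
     * "%127s" reads the symbol up to the first whitespace, then the
     * scanset "%63[^]]" reads everything up to the closing bracket.
     * For a built-in symbol the literal '[' fails to match and only
     * the first conversion is counted.
     */
    n = sscanf(text, "%127s [%63[^]]]", out->sym, out->mod);
    return n < 0 ? 0 : n;
}
```

With decomposed pieces (two separate strings), user space would not need this step at all, which is the convenience argument being made above; with the combined format, a small parser along these lines would be enough.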
> >
> > > Additionally, with ksym_resolve() being able to return base address,
> > > it's now possible to do a bunch of new stuff, from in-BPF
> > > symbolization to additional things like correlating memory accesses or
> > > function calls, etc.
> >
> > Getting adjusted base address could be useful some day, but why now? What for?
>
> I proposed that only if we do bpf_ksym_resolve(). No need to support
> that in snprintf case, of course.
>
> >
> > > bits), my point is that ksym_resolve() is more powerful than
> > > snprintf(): the latter can be used pretty much only for
> > > pretty-printing.
> >
> > Potentially yes. I think the stated goal was pretty printing.
>
> That's fine if we do only snprintf, yes. But if a separate helper,
> then we should think more broadly.

Let's start with only snprintf then, this solves our usecase and if a
different need arises in the future (eg: offset) we could design a new
helper around that need.

>
> >
> > >
> > >
> > > > As far as 6 arg issue:
> > > > long bpf_snprintf(const char *out, u32 out_size,
> > > >                   const char *fmt, u32 fmt_size,
> > > >                   const void *data, u32 data_len);
> > > > Yeah. It won't work as-is, but fmt_size is unnecessary nowadays.
> > > > The verifier understands read-only data.
> > > > Hence the helper can be:
> > > > long bpf_snprintf(const char *out, u32 out_size,
> > >
> > > With the power of BTF, we can also put these two correlated values
> > > into a single struct and pass a pointer to it. It will take only one
> > > parameter for one memory region. Alternative is the "fat pointer"
> > > approach that Go and Rust use, but it's less flexible overall.
> >
> > I think it will be less flexible when output size is fixed by the type info.
> > With explicit size the bpf_snprintf() can print directly into ringbuffer.
> > Multiple bpf_snprintf() will be able to fill it one by one reducing
> > space available at every step.
> > bpf_snprintf() would need to return the number of bytes, of course.
> > Just like probe_read_str.
>
> Ok, I should have probably demonstrated with an example. I don't
> propose to specify the size through BTF itself. I was thinking about:
>
> struct bpf_mem_ptr {
>     void *data;
>     size_t size;
> };
>
>
> struct bpf_mem_ptr p = { ptr, 123 };
> bpf_whatever_helper(&p, ...);
>
>
> bpf_whatever_helper() will specify that the first argument has to be
> PTR_TO_BTF_ID where btf_id corresponds to struct bpf_mem_ptr. Hope
> this helps.
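
For reference, the two ideas quoted above (a pointer-plus-size struct, and successive prints that each consume part of the remaining space and return the number of bytes written) can be combined in a plain user-space analogy. This is only a sketch under assumptions: mem_ptr_printf() is a hypothetical name, it is built on libc's vsnprintf(), and it says nothing about how an actual BPF helper or the verifier would handle struct bpf_mem_ptr.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the struct quoted above: one memory region, one size. */
struct bpf_mem_ptr {
    void *data;
    size_t size;
};

/*
 * Hypothetical user-space stand-in (not a BPF helper): format into the
 * region described by p, then advance p past the bytes written so that
 * the next call appends. Returns the number of bytes written, not
 * counting the trailing NUL, or -1 if the output did not fit. On -1 the
 * region descriptor is left unchanged.
 */
static long mem_ptr_printf(struct bpf_mem_ptr *p, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(p != NULL && fmt != NULL);
    if (p->size == 0)
        return -1;

    va_start(ap, fmt);
    n = vsnprintf(p->data, p->size, fmt, ap);
    va_end(ap);

    /* vsnprintf returns the untruncated length; >= size means truncated. */
    if (n < 0 || (size_t)n >= p->size)
        return -1;

    /* Shrink the region: the next write overwrites the trailing NUL. */
    p->data = (char *)p->data + n;
    p->size -= (size_t)n;
    return n;
}
```

Starting from struct bpf_mem_ptr p = { buf, sizeof(buf) }, a call with "%s" followed by a call with " [%s]" leaves "sym [mod]"-style text in buf, with p.size reduced at every step, which mirrors the "fill it one by one" behaviour described for the ringbuffer case.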