From: Florent Revest
Date: Thu, 22 Apr 2021 17:34:55 +0200
Subject: Re: [PATCH] bpf: remove pointless code from bpf_do_trace_printk()
To: Rasmus Villemoes
Cc: Andrii Nakryiko, Daniel Borkmann, Alan Maguire, Steven Rostedt, bpf, open list, Alexei Starovoitov
References: <20210421190736.1538217-1-linux@rasmusvillemoes.dk> <236995f6-30ee-8047-624c-08d0a1552dc1@rasmusvillemoes.dk> <7e9d3337-eb7b-a2c8-a5ef-037d6a9765d7@rasmusvillemoes.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 22, 2021 at 2:36 PM Florent Revest wrote:
>
> On Thu, Apr 22, 2021 at 12:09 PM Rasmus Villemoes wrote:
> >
> > On 22/04/2021 11.23, Florent Revest wrote:
> > > On Thu, Apr 22, 2021 at 9:13 AM Rasmus Villemoes wrote:
> > >>
> > >> On 22/04/2021 05.32, Andrii Nakryiko wrote:
> > >>> On Wed, Apr 21, 2021 at 6:19 PM Rasmus Villemoes wrote:
> > >>>>
> > >>>> The comment is wrong. snprintf(buf, 16, "") and snprintf(buf, 16,
> > >>>> "%s", "") etc. will certainly put '\0' in buf[0]. The only case where
> > >>>> snprintf() does not guarantee a nul-terminated string is when it is
> > >>>> given a buffer size of 0 (which of course prevents it from writing
> > >>>> anything at all to the buffer).
> > >>>>
> > >>>> Remove it before it gets cargo-culted elsewhere.
> > >>>>
> > >>>> Signed-off-by: Rasmus Villemoes
> > >>>> ---
> > >>>>  kernel/trace/bpf_trace.c | 3 ---
> > >>>>  1 file changed, 3 deletions(-)
> > >>>>
> > >>>
> > >>> The change looks good to me, but please rebase it on top of the
> > >>> bpf-next tree. This is not a bug, so it doesn't have to go into the
> > >>> bpf tree. As it is right now, it doesn't apply cleanly onto bpf-next.
> > >
> > > FWIW the idea of the patch also looks good to me :)
> > >
> > >> Thanks for the pointer. Looking in next-20210420, it seems to me that
> > >>
> > >> commit d9c9e4db186ab4d81f84e6f22b225d333b9424e3
> > >> Author: Florent Revest
> > >> Date:   Mon Apr 19 17:52:38 2021 +0200
> > >>
> > >>     bpf: Factorize bpf_trace_printk and bpf_seq_printf
> > >>
> > >> is buggy. In particular, these two snippets:
> > >>
> > >> +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                        \
> > >> +       (mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                     \
> > >> +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64)  \
> > >> +         ? (u64)args[arg_nb]                                       \
> > >> +         : (u32)args[arg_nb])
> > >>
> > >>
> > >> +       ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > >> +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> > >>
> > >> Regardless of the casts done in that macro, the type of the resulting
> > >> expression is that resulting from C promotion rules. And (foo ? (u64)bla
> > >> : (u32)blib) has type u64, which is thus the type the compiler uses when
> > >> building the vararg list being passed into snprintf(). C simply doesn't
> > >> allow you to change types at run-time in this way.
> > >>
> > >> It probably works fine on x86-64, which passes the first six or so
> > >> arguments in registers, va_start() puts those registers into the va_list
> > >> opaque structure, and when it comes time to do a va_arg(ap, int), just the
> > >> lower 32 bits are used. It is broken on i386 and other architectures
> > >> where arguments are passed on the stack (and for x86-64 as well had
> > >> there been a few more arguments) and va_arg(ap, int) is essentially ({
> > >> int res = *(int *)ap; ap += 4; res; }) [or maybe it's -= 4 because stack
> > >> direction etc., that's not really relevant here].
> > >>
> > >> Rasmus
> > >
> > > Thank you Rasmus :)
> >
> > I think you were lucky (or unlucky, depending on how you look at it)
> > with your test case
> >
> > +       num_ret = BPF_SNPRINTF(num_out, sizeof(num_out),
> > +                              "%d %u %x %li %llu %lX",
> > +                              -8, 9, 150, -424242, 1337, 0xDABBAD00);
> >
> > because it just so happens that the eventual snprintf() call uses three
> > arguments for itself, so the first three 32-bit arguments end up being
> > passed via registers, while the 64 bit arguments are passed via the
> > stack. Can I get you to test what would happen if you interchanged
> > these, i.e. changed the test case to do
> >
> > +       num_ret = BPF_SNPRINTF(num_out, sizeof(num_out),
> > +                              "%li %llu %lX %d %u %x",
> > +                              -424242, 1337, 0xDABBAD00, -8, 9, 150);
> >
> > (or just add a few more expects-a-32-bit argument format specifiers and
> > corresponding arguments). My guess is that up until formatting -8 it
> > goes well, but when vsnprintf() is to grab the argument corresponding to
> > %u, it will get the 0xffffffff from the upper half of (u64)-8.
>
> I will need to come up with a repro and let you know yes :)
>
> > > It seems that we went off track in
> > > https://lore.kernel.org/bpf/CAEf4BzZVEGM4esi-Rz67_xX_RTDrgxViy0gHfpeauECR5bmRNA@mail.gmail.com/
> > > and we do need something like "88a5c690b6 bpf: fix bpf_trace_printk on
> > > 32 bit archs". Thinking about it again, it's clearer now why the
> > > __BPF_TP_EMIT macro emits 2^3=8 different __trace_printk() indeed.
> >
> > Isn't it 3^3 = 27, or has that been reduced in -next compared to Linus'
> > master? Doesn't matter much, just curious.
> >
> > > In the case of bpf_trace_printk with a maximum of 3 args, it's
> > > relatively cheap; but for bpf_seq_printf and bpf_snprintf which accept
> > > up to 12 arguments, that would be 2^12=4096 calls.
> >
> > Yeah, that doesn't scale at all.
> >
> > > Until now
> > > bpf_seq_printf has just ignored this problem and just considered
> > > everything as u64, I wonder if that'd be the best approach for these
> > > two helpers anyway.
> > >
> >
> > [wild handwaving ahead]
> >
> > One possibility, if one is willing to get hands dirty and dig into ABI
> > details on various arches, is to create a
> >
> > struct fake_va_list {
> >         union {
> >                 va_list ap; /* opaque, compiler-provided */
> >                 arch_va_list _ap; /* arch-provided, must match layout of ap */
> >         };
> >         void *stack;
> > };
> >
> > Then do
> >
> > struct fake_va_list fva;
> > u64 buf[24]; /* or whatever you want to support, can be different in
> >                 different functions */
> >
> > fake_va_init(&fva, buf);
> > /* various C code, parsing format string etc. */
> > if (arg[i] is really 32 bits)
> >         fake_va_push(&fva, (u32)arg[i]);
> > else
> >         fake_va_push(&fva, (u64)arg[i]);
> > /* etc. */
> > ...
> > vsnprintf(out, size, fmt, fva.ap);
> >
> > On arches like x86-64, where va_list is really a typedef for a
> > one-element array of
> >
> > struct __va_list_tag {
> >         unsigned int gp_offset;
> >         unsigned int fp_offset;
> >         void * overflow_arg_area;
> >         void * reg_save_area;
> > };
> >
> >
> > fake_va_init() would make the va_list look like the reg_save_area is
> > already used (i.e., set gp_offset to 48), and initialize both
> > ->_ap.overflow_arg_area and ->stack to point at the given buffer.
> > fake_va_push() would use and update stack appropriately. For 32 bit x86,
> > va_list is really just a pointer, so fake_va_init would essentially just
> > do "fva->_ap = fva->stack = buf", and fake_va_push() would again just
> > need to manipulate ->stack.
> >
> > It's not pretty, but I don't think it necessarily requires too much
> > arch-specific work (fake_va_push() could be common, perhaps just with an
> > arch define to say whether 64 bit arguments need ->stack to first be
> > up-aligned to an 8 byte boundary).
> >
> > Rasmus
>
> Creative! :D I think these arch-specific structures would be a hard
> sell though haha.
>
> I was having a stroll through lib/vsprintf.c and noticed bstr_printf:
>
>  * This function like C99 vsnprintf, but the difference is that vsnprintf gets
>  * arguments from stack, and bstr_printf gets arguments from @bin_buf which is
>  * a binary buffer that generated by vbin_printf.
>
> Maybe it would be easier to just build our argument buffer similarly
> to what vbin_printf does.

I've been experimenting with this idea and it is quite promising :) It
also makes the code much cleaner, I find. I'll send a series asap.

BPF maintainers: should we fix forward, or do you prefer reverting the
snprintf series and then re-applying another snprintf series without
the regression in bpf_trace_printk that mangles some argument types?
(bpf_seq_printf has always been like that, so no regression there.)
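
To make the "build our own argument buffer" idea a bit more concrete, here is a
minimal userspace sketch. It is not the code from the upcoming series, and it
is only loosely in the spirit of vbin_printf(): it does no format-string
parsing and no alignment handling. The names pack_args(), unpack_arg() and
enum arg_size are hypothetical and exist only for this illustration. The point
is just that each argument is stored with its real width, so the reader
consumes exactly what the writer produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical width tags, standing in for the per-argument modifiers. */
enum arg_size { ARG_32, ARG_64 };

/*
 * Append each raw 64-bit argument to bin_buf using its real width, so
 * that a reader walking the buffer consumes exactly what was written.
 * Returns the number of bytes used, or -1 if bin_buf is too small.
 */
int pack_args(uint8_t *bin_buf, size_t cap, const uint64_t *args,
              const enum arg_size *sizes, size_t nargs)
{
	size_t off = 0;

	assert(bin_buf && args && sizes);

	for (size_t i = 0; i < nargs; i++) {
		if (sizes[i] == ARG_64) {
			uint64_t v = args[i];

			if (cap - off < sizeof(v))
				return -1;
			memcpy(bin_buf + off, &v, sizeof(v));
			off += sizeof(v);
		} else {
			/* Keep only the low 32 bits of the raw value. */
			uint32_t v = (uint32_t)args[i];

			if (cap - off < sizeof(v))
				return -1;
			memcpy(bin_buf + off, &v, sizeof(v));
			off += sizeof(v);
		}
	}
	return (int)off;
}

/*
 * Fetch the next argument at *off with the same width it was packed
 * with, zero-extended to 64 bits, and advance *off past it.
 */
uint64_t unpack_arg(const uint8_t *bin_buf, size_t *off, enum arg_size size)
{
	if (size == ARG_64) {
		uint64_t v;

		memcpy(&v, bin_buf + *off, sizeof(v));
		*off += sizeof(v);
		return v;
	} else {
		uint32_t v;

		memcpy(&v, bin_buf + *off, sizeof(v));
		*off += sizeof(v);
		return v;
	}
}
```

With -8 tagged as a 32-bit argument, the reader sees 0xfffffff8 for that slot
and the next argument starts right after it, instead of the following %u
picking up the 0xffffffff upper half of (u64)-8 as in the scenario Rasmus
describes above.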