Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp620195pxy; Wed, 28 Apr 2021 10:36:51 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwfw/9F2bmk/TxdceXZZvo8thAUGn5meS8JiX31yzdM+RxIDNo46sQEYDcE3GN8hQTMb6tA X-Received: by 2002:a05:6402:1357:: with SMTP id y23mr11087352edw.235.1619631411079; Wed, 28 Apr 2021 10:36:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619631411; cv=none; d=google.com; s=arc-20160816; b=JnmJXI5JbeSL+O8KMYaQ2mHct8PKm5EAMS4tocPYxXD6o90W1f177revzyRVp5ui/4 FxOAzUIXJKL2kVVLl9rFRLy50aUiMIlR6Gg7RqMuYAjMxJNSJMhrnQZBOcPtRY3mxbX+ BgAk3i0mcuVYPJRG3ezhtlyGuCvDRTceEzUnZbE36QCWYLc2Tw5etWuozG9XghFKOeER oGmvu/VW9/uxGSjvWHDOgbfgvBSF/3O9tQoS4/uT8OGCgAF3rNTNLHmJAQZ7SpaSf8Vj OKeKG/4+Pty3fEtFCMjStWbp7Fr41ZGztdIAbdg7Z+Xc3OtJ6+jG2ogT9Csaxk3AzqX9 bxSg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=Iabn/7TldLlLe4ct1XbBE3tIHi0bNFX+HoKTspy+Zd8=; b=OEooRGg7oVFZud3iXSbb/mj/BHRQbN52KQiOrfJuQDdSHASSp5qU7Hsl+uI5/MPGZF RRUhF7kISmY334RD+mHIdsD0JT2inmyTu9BnPQ27jb4ZyG8s1p3ZJxwmj5JZk4TNBKvg iVZUtDWD+QjIRdauq/wB/wxtsUP85Q1WcsHx4kvnVMDGJ6C6opu0psps61C6rl05sUd0 0mzvWIcDkDcPWgUA+ZktFd61FgDLu/36m+KiLbQAcGZ38kEXCgn+h2JDo/Dwd6iX0AN5 R+mCyqRYnz0qRD9gh9mm+3792Zk21+FYPxN5ZJ/R/xeMxiINwOkU0jdeo6LVd5irlo4T Opvw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=W8J8yxp2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id i14si738497eje.207.2021.04.28.10.36.24; Wed, 28 Apr 2021 10:36:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=W8J8yxp2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231542AbhD1PDn (ORCPT + 99 others); Wed, 28 Apr 2021 11:03:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239509AbhD1PD1 (ORCPT ); Wed, 28 Apr 2021 11:03:27 -0400 Received: from mail-io1-xd31.google.com (mail-io1-xd31.google.com [IPv6:2607:f8b0:4864:20::d31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DC890C061374 for ; Wed, 28 Apr 2021 07:59:53 -0700 (PDT) Received: by mail-io1-xd31.google.com with SMTP id p8so10324324iol.11 for ; Wed, 28 Apr 2021 07:59:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Iabn/7TldLlLe4ct1XbBE3tIHi0bNFX+HoKTspy+Zd8=; b=W8J8yxp2ijkdH07hhcmJoTch4+ds+RpR3uJrFyC0n+ijzmDDAkR7Wa2g/1NMRtci5F XGfLuh8mdCRY1S3icexBaEMHztAzbCBe+wx+rqRERgmLQBZGxhCp2gvvUlj8IFXHI0Iy TwtKi5pO89ycJaUT+bcI/pkLkI/z/p05oaD3A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Iabn/7TldLlLe4ct1XbBE3tIHi0bNFX+HoKTspy+Zd8=; b=HX/vbMuc4N2T2pC9IYrLwDRlgsrvuejoKDfawMGjXlO3bP1en6x7ZkUGNMjH5TE48e cPPCryEiZSgc8b6VlZKnvMqRhDhfDwhvXw3euwa3gau5MXqEqzTEkgCUGe1Co9YXKvr6 1ObEvPYUXjDt1jJbpKk1Fo+xQUuQhroB/Ph2L0egvZiSpaLzozwZNLQG1Dy3OED4bWK7 PzON71lSFeXg1fOdlqmZmuhjuub8z+UTKq3CgxDmp0eZuR9mb+oDNjvTfiBbN0M3yKLz ozsfTDSzQ4e+KbDVlrNqMz9IJXvXQUnUWSWt69hJdMmxpi24SIGcIh0Kn0Oa6LYp/e6e YNbg== X-Gm-Message-State: AOAM533seDr/da6aNMaBeZ6funuKXuVLiMfSXL1GSrEGz1ZR2KcGbvu6 gHIxG9nXVpx0+04j9hQQ0NL2+gHvRBpMTCmTbhjltQ== X-Received: by 2002:a05:6638:2515:: with SMTP id v21mr27686360jat.110.1619621993352; Wed, 28 Apr 2021 07:59:53 -0700 (PDT) MIME-Version: 1.0 References: <20210419155243.1632274-1-revest@chromium.org> <20210419155243.1632274-7-revest@chromium.org> <2db39f1c-cedd-b9e7-2a15-aef203f068eb@rasmusvillemoes.dk> In-Reply-To: From: Florent Revest Date: Wed, 28 Apr 2021 16:59:42 +0200 Message-ID: Subject: Re: [PATCH bpf-next v5 6/6] selftests/bpf: Add a series of tests for bpf_snprintf To: Andrii Nakryiko Cc: Rasmus Villemoes , bpf , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Yonghong Song , KP Singh , Brendan Jackman , open list Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Apr 27, 2021 at 8:03 PM Andrii Nakryiko wrote: > > On Tue, Apr 27, 2021 at 2:51 AM Florent Revest wrote: > > > > On Tue, Apr 27, 2021 at 8:35 AM Rasmus Villemoes > > wrote: > > > u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 }; > > > - enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS]; > > > + u32 *bin_args; > > > static char buf[BPF_TRACE_PRINTK_SIZE]; > > > unsigned long flags; > > > int ret; > > > > > > - ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod, > > > - MAX_TRACE_PRINTK_VARARGS); > > > + ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args, > > > + MAX_TRACE_PRINTK_VARARGS); > > > if (ret < 0) > > > return ret; > > > > > > - ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod), > > > - BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod)); > > > - /* snprintf() will not append null for zero-length strings */ > > > - if (ret == 0) > > > - buf[0] = '\0'; > > > + ret = bstr_printf(buf, sizeof(buf), fmt, bin_args); > > > > > > raw_spin_lock_irqsave(&trace_printk_lock, flags); > > > trace_bpf_trace_printk(buf); > > > raw_spin_unlock_irqrestore(&trace_printk_lock, flags); > > > > > > Why isn't the write to buf[] protected by that spinlock? Or put another > > > way, what protects buf[] from concurrent writes? > > > > You're right, that is a bug, I missed that buf was static and thought > > it was just on the stack. That snprintf call should be after the > > raw_spin_lock_irqsave. I'll send a patch. Thank you Rasmus. (before my > > snprintf series, there was a vsprintf after the raw_spin_lock_irqsave) Solved now > Can you please also clean up unnecessary ()s you added in at least a > few places. Thanks. Alexei said he took care of this .:) > > > Probably the test cases are not run in parallel, but this is the kind of > > > thing that would give those symptoms. > > > > I think it's a separate issue from what Andrii reported though because > > the flaky test exercises the bpf_snprintf helper and this buf spinlock > > bug you just found only affects the bpf_trace_printk helper. > > > > That being said, it does smell a little bit like a concurrency issue > > too, indeed. The bpf_snprintf test program is a raw_tp/sys_enter so it > > attaches to all syscall entries and most likely gets executed many > > more times than necessary and probably on parallel CPUs. The "pad_out" > > buffer they write to is unique and not locked so maybe the test's > > userspace reads pad_out while another CPU is writing on it and if the > > string output goes through a stage where it is " 4 0000" before > > being " 4 000", we might read at the wrong time. That being said, I > > would find it weird that this happens as much as 50% of the time and > > always specifically on that test case. > > > > Andrii could you maybe try changing the prog type to > > "tp/syscalls/sys_enter_nanosleep" on the machine where you can > > reproduce this bug ? > > Yes, it helps. I can't repro it easily anymore. Good, so it does sound like a concurrency issue indeed > I think the right fix, though, should be to filter by tid, not change the tracepoint. Agreed, I'll send a patch for that today. :) > I think what's happening is we see the string right before bstr_printf > does zero-termination with end[-1] = '\0'; So in some cases we see > truncated string, in others we see untruncated one. Makes sense but I still wonder why it happens so often (50% of the time is really a lot) and why it is consistently that one test case that fails and not the "overflow" case for example. But I'm confident that this is not a bug in the helper now and that the tid filter will fix the test.