Received: by 2002:a05:6a10:9848:0:0:0:0 with SMTP id x8csp181169pxf; Wed, 10 Mar 2021 03:52:53 -0800 (PST) X-Google-Smtp-Source: ABdhPJxUSB0/nISMZDc2ChccouRkZUgy8qIYjhiScmyAWzCHnwWubgMYQo8E63jveAt5VHf4DbQX X-Received: by 2002:a05:6402:22b5:: with SMTP id cx21mr2819465edb.27.1615377173622; Wed, 10 Mar 2021 03:52:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1615377173; cv=none; d=google.com; s=arc-20160816; b=qOqix6A2xtHHkVpxrLICs1m+Tf/lDPsga6T6wYaHlAm8DR06PLcN1+AdZ1qcnd7LB7 6hXIWrSwvSMEjtdsZk1IDUQ4k7LPanwYAnJd9u34ya9epdD7cgsWBsySuMSgiIyC9OaA lOn4SJLbXIKWtUzJj156oWKqGHvx1qurqzV3NTIUn8cpplW7RodceK8oej/trMW7rDBH BI+2h3D0uxbUM5yodYcJukMvFDUBJk+C4KDH3cPD/HNFeY6l7KhH8dRC5Etxjg9qs8Ze oFgUMRYMre1jm024Lknxz5otN4+zwkERou5+5N+mSsCNrMLAa21xbufUM3G2uQfekQ+F ohpg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=OUZP8m10BICXg+bVPGfgJD/AJgcaU4lrh7k/xffSUxE=; b=GT6vOPyH6TU+HP/HryTQI8+6I62/ZdzlC37aHK2K0Gz9ird1T2zkcKpKi7WogZxPVm wRnlRzPdnFf88UQPP09nvscU/EOF1wnCHuZEbELGaBh0zEHIIpiDnyN0++XBqsL5HotL shyZmJJYwWc0i+q9GQDZ044X95Pt0sdMgTWS7YlXFMmRWgrGVLU3mHeE8zXM8lTWnhVj x0KQW+kKnnzHc+BH6XwURNgX83+C4qARnFQru51+9Mr7KyQ/OiJZwfVYeiKbCiX1cSIT TC08OUelu4ibblYkTH6s/3wqjtwTfvTgMhv4uQFPiBfiqA8MCxYMaBysWfynrUkCQMcK dVPg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=mbtgLm0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id w24si11679291eds.559.2021.03.10.03.52.31; Wed, 10 Mar 2021 03:52:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=mbtgLm0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232010AbhCJLt0 (ORCPT + 99 others); Wed, 10 Mar 2021 06:49:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230122AbhCJLs4 (ORCPT ); Wed, 10 Mar 2021 06:48:56 -0500 Received: from mail-il1-x130.google.com (mail-il1-x130.google.com [IPv6:2607:f8b0:4864:20::130]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6898EC06174A for ; Wed, 10 Mar 2021 03:48:56 -0800 (PST) Received: by mail-il1-x130.google.com with SMTP id s1so15237582ilh.12 for ; Wed, 10 Mar 2021 03:48:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=OUZP8m10BICXg+bVPGfgJD/AJgcaU4lrh7k/xffSUxE=; b=mbtgLm0pM+7qiwMv60dRKDKZJqODnkpoqbEPoXqEqIaq5AD66TqKZs3XHEKiTdSPBd E6J3u/bTHxuSW6PPuPCqsU3aDrPURWPqHsrbCWzVBicwhtV/KwmL3qrhpGDRfhjaaiDp yiEtfTMyHG8ISa0s1zbE7YNeG/nMKcc0hSOlA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=OUZP8m10BICXg+bVPGfgJD/AJgcaU4lrh7k/xffSUxE=; b=q0iX1JRbyM7FBWnPWYNCwovWzJAGZTnL71c/Z+y5Brlz3/Os7NdWHEcqmj/2NVT84f LV9zDRj807/wxDTf1eozGsvTrcZBOLXCCC4KrEQHtjJ2IIAKLyWTy8/7DYuMtUYe5fnt TWYNBbl/MN+YKi3yw9Wht8d/oaVg8iwdW1a8B3e3HkXw6E5XWSYZwYNNjZJyG+d9zLxk U9zEwjekH6v6F0qOSXI8FOJl68wT5ACn2Z03D9zkOIQQuqMI0pnb0Qr9zjRPm3yXMwqD tpPBgRgmXDY/U002dUYwSpEjG+dnC6TRhT8W1LP2Zjgshw4RHqUnWmGxSZ0Yjpbt2iDi GnHA== X-Gm-Message-State: AOAM530czemoi2JTNnWPIFfVQTYz49VS1+jRqm8UCSAK50UskVflQQoQ /XgtZs5I8bvpEZ3+9bPW71XmelBnAeYvAgcre0qwpw== X-Received: by 2002:a05:6e02:12b4:: with SMTP id f20mr2231220ilr.220.1615376935917; Wed, 10 Mar 2021 03:48:55 -0800 (PST) MIME-Version: 1.0 References: <20210310015455.1095207-1-revest@chromium.org> In-Reply-To: From: Florent Revest Date: Wed, 10 Mar 2021 12:48:45 +0100 Message-ID: Subject: Re: [BUG] One-liner array initialization with two pointers in BPF results in NULLs To: Yonghong Song Cc: bpf , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , KP Singh , Brendan Jackman , open list Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Mar 10, 2021 at 6:16 AM Yonghong Song wrote: > On 3/9/21 7:43 PM, Yonghong Song wrote: > > On 3/9/21 5:54 PM, Florent Revest wrote: > >> I noticed that initializing an array of pointers using this syntax: > >> __u64 array[] = { (__u64)&var1, (__u64)&var2 }; > >> (which is a fairly common operation with macros such as BPF_SEQ_PRINTF) > >> always results in array[0] and array[1] being NULL. > >> > >> Interestingly, if the array is only initialized with one pointer, ex: > >> __u64 array[] = { (__u64)&var1 }; > >> Then array[0] will not be NULL. > >> > >> Or if the array is initialized field by field, ex: > >> __u64 array[2]; > >> array[0] = (__u64)&var1; > >> array[1] = (__u64)&var2; > >> Then array[0] and array[1] will not be NULL either. > >> > >> I'm assuming that this should have something to do with relocations > >> and might be a bug in clang or in libbpf but because I don't know much > >> about these, I thought that reporting could be a good first step. :) > > > > Thanks for reporting. What you guess is correct, this is due to > > relocations :-( > > > > The compiler notoriously tend to put complex initial values into > > rodata section. For example, for > > __u64 array[] = { (__u64)&var1, (__u64)&var2 }; > > the compiler will put > > { (__u64)&var1, (__u64)&var2 } > > into rodata section. > > > > But &var1 and &var2 themselves need relocation since they are > > address of static variables which will sit inside .data section. > > > > So in the elf file, you will see the following relocations: > > > > RELOCATION RECORDS FOR [.rodata]: > > OFFSET TYPE VALUE > > 0000000000000018 R_BPF_64_64 .data > > 0000000000000020 R_BPF_64_64 .data Right :) Thank you for the explanations Yonghong! > > Currently, libbpf does not handle relocation inside .rodata > > section, so they content remains 0. Just for my own edification, why is .rodata relocation not yet handled in libbpf ? Is it because of a read-only mapping that makes it more difficult ? > > That is why you see the issue with pointer as NULL. > > > > With array size of 1, compiler does not bother to put it into > > rodata section. > > > > I *guess* that it works in the macro due to some kind of heuristics, > > e.g., nested blocks, etc, and llvm did not promote the array init value > > to rodata. I will double check whether llvm can complete prevent > > such transformation. > > > > Maybe in the future libbpf is able to handle relocations for > > rodata section too. But for the time being, please just consider to use > > either macro, or the explicit array assignment. > > Digging into the compiler, the compiler tries to make *const* initial > value into rodata section if the initial value size > 64, so in > this case, macro does not work either. I think this is how you > discovered the issue. Indeed, I was using a macro similar to BPF_SEQ_PRINTF and this is how I found the bug. > The llvm does not provide target hooks to > influence this transformation. Oh, that is unfortunate :) Thanks for looking into it! I feel that the real fix would be in libbpf anyway and the rest is just workarounds. > So, there are two workarounds, > (1). __u64 param_working[2]; > param_working[0] = (__u64)str1; > param_working[1] = (__u64)str2; > (2). BPF_SEQ_PRINTF(seq, "%s ", str1); > BPF_SEQ_PRINTF(seq, "%s", str2); (2) is a bit impractical for my actual usecase. I am implementing a bpf_snprintf helper (patch series Coming Soon TM) and I wanted to keep the selftest short with a few BPF_SNPRINTF() calls that exercise most format specifiers. > In practice, if you have at least one non-const format argument, > you should be fine. But if all format arguments are constant, then > none of them should be strings. Just for context, this does not only happen for strings but also for all sorts of pointers, for example, when I try to do address lookup of global __ksym variables, which is important for my selftest. > Maybe we could change marco > unsigned long long ___param[] = { args }; > to declare an array explicitly and then have a loop to > assign each array element? I think this would be a good workaround for now, indeed. :) I'll look into it today and send it as part of my bpf_snprintf series. Thanks!