From: Andrii Nakryiko
Date: Tue, 28 Nov 2023 08:02:30 -0800
Subject: Re: [PATCH ipsec-next v1 6/7] bpf: selftests: test_tunnel: Disable CO-RE relocations
To: Yonghong Song
Cc: Daniel Xu, Eduard Zingerman, Alexei
 Starovoitov, Shuah Khan, Daniel Borkmann, Andrii Nakryiko,
 Alexei Starovoitov, Steffen Klassert, antony.antony@secunet.com,
 Mykola Lysenko, Martin KaFai Lau, Song Liu, John Fastabend, KP Singh,
 Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
 "open list:KERNEL SELFTEST FRAMEWORK", LKML, devel@linux-ipsec.org,
 Network Development
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 27, 2023 at 8:06 PM Yonghong Song wrote:
>
>
> On 11/27/23 7:01 PM, Daniel Xu wrote:
> > On Mon, Nov 27, 2023 at 02:45:11PM -0600, Daniel Xu wrote:
> >> On Sun, Nov 26, 2023 at 09:53:04PM -0800, Yonghong Song wrote:
> >>> On 11/27/23 12:44 AM, Yonghong Song wrote:
> >>>> On 11/26/23 8:52 PM, Eduard Zingerman wrote:
> >>>>> On Sun, 2023-11-26 at 18:04 -0600, Daniel Xu wrote:
> >>>>> [...]
> >>>>>>> Tbh I'm not sure. This test passes with preserve_static_offset
> >>>>>>> because it suppresses preserve_access_index. In general clang
> >>>>>>> translates bitfield access to a set of IR statements like:
> >>>>>>>
> >>>>>>> C:
> >>>>>>>     struct foo {
> >>>>>>>       unsigned _;
> >>>>>>>       unsigned a:1;
> >>>>>>>       ...
> >>>>>>>     };
> >>>>>>>     ... foo->a ...
> >>>>>>>
> >>>>>>> IR:
> >>>>>>>     %a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
> >>>>>>>     %bf.load = load i8, ptr %a, align 4
> >>>>>>>     %bf.clear = and i8 %bf.load, 1
> >>>>>>>     %bf.cast = zext i8 %bf.clear to i32
> >>>>>>>
> >>>>>>> With preserve_static_offset the getelementptr+load are replaced by a
> >>>>>>> single statement which is preserved as-is till code generation,
> >>>>>>> thus load with align 4 is preserved.
> >>>>>>>
> >>>>>>> On the other hand, I'm not sure that clang guarantees that load or
> >>>>>>> stores used for bitfield access would be always aligned according to
> >>>>>>> verifier expectations.
> >>>>>>>
> >>>>>>> I think we should check if there are some clang knobs that prevent
> >>>>>>> generation of unaligned memory access. I'll take a look.
> >>>>>> Is there a reason to prefer fixing in compiler? I'm not opposed to it,
> >>>>>> but the downside to compiler fix is it takes years to propagate and
> >>>>>> sprinkles ifdefs into the code.
> >>>>>>
> >>>>>> Would it be possible to have an analogue of BPF_CORE_READ_BITFIELD()?
> >>>>> Well, the contraption below passes verification, tunnel selftest
> >>>>> appears to work. I might have messed up some shifts in the macro,
> >>>>> though.
> >>>> I didn't test it. But from high level it should work.
> >>>>
> >>>>> Still, if clang would peek unlucky BYTE_{OFFSET,SIZE} for a particular
> >>>>> field access might be unaligned.
> >>>> clang should pick a sensible BYTE_SIZE/BYTE_OFFSET to meet
> >>>> alignment requirement. This is also required for BPF_CORE_READ_BITFIELD.
> >>>>
> >>>>> ---
> >>>>>
> >>>>> diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> >>>>> b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> >>>>> index 3065a716544d..41cd913ac7ff 100644
> >>>>> --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> >>>>> +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> >>>>> @@ -9,6 +9,7 @@
> >>>>>   #include "vmlinux.h"
> >>>>>   #include
> >>>>>   #include
> >>>>> +#include
> >>>>>   #include "bpf_kfuncs.h"
> >>>>>   #include "bpf_tracing_net.h"
> >>>>> @@ -144,6 +145,38 @@ int ip6gretap_get_tunnel(struct __sk_buff *skb)
> >>>>>       return TC_ACT_OK;
> >>>>>   }
> >>>>> +#define BPF_CORE_WRITE_BITFIELD(s, field, new_val) ({ \
> >>>>> +	void *p = (void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
> >>>>> +	unsigned byte_size = __CORE_RELO(s, field, BYTE_SIZE); \
> >>>>> +	unsigned lshift = __CORE_RELO(s, field, LSHIFT_U64); \
> >>>>> +	unsigned rshift = __CORE_RELO(s, field, RSHIFT_U64); \
> >>>>> +	unsigned bit_size = (rshift - lshift); \
> >>>>> +	unsigned long long nval, val, hi, lo; \
> >>>>> +	\
> >>>>> +	asm volatile("" : "=r"(p) : "0"(p)); \
> >>>> Use asm volatile("" : "+r"(p)) ?
> >>>>
> >>>>> +	\
> >>>>> +	switch (byte_size) { \
> >>>>> +	case 1: val = *(unsigned char *)p; break; \
> >>>>> +	case 2: val = *(unsigned short *)p; break; \
> >>>>> +	case 4: val = *(unsigned int *)p; break; \
> >>>>> +	case 8: val = *(unsigned long long *)p; break; \
> >>>>> +	} \
> >>>>> +	hi = val >> (bit_size + rshift); \
> >>>>> +	hi <<= bit_size + rshift; \
> >>>>> +	lo = val << (bit_size + lshift); \
> >>>>> +	lo >>= bit_size + lshift; \
> >>>>> +	nval = new_val; \
> >>>>> +	nval <<= lshift; \
> >>>>> +	nval >>= rshift; \
> >>>>> +	val = hi | nval | lo; \
> >>>>> +	switch (byte_size) { \
> >>>>> +	case 1: *(unsigned char *)p = val; break; \
> >>>>> +	case 2: *(unsigned short *)p = val; break; \
> >>>>> +	case 4: *(unsigned int *)p = val; break; \
> >>>>> +	case 8: *(unsigned long long *)p = val; break; \
> >>>>> +	} \
> >>>>> +})
> >>>> I think this should be put in libbpf public header files but not sure
> >>>> where to put it. bpf_core_read.h although it is core write?
> >>>>
> >>>> But on the other hand, this is a uapi struct bitfield write,
> >>>> strictly speaking, CORE write is really unnecessary here. It
> >>>> would be great if we can relieve users from dealing with
> >>>> such unnecessary CORE writes. In that sense, for this particular
> >>>> case, I would prefer rewriting the code by using byte-level
> >>>> stores...
> >>> or preserve_static_offset to clearly mean to undo bitfield CORE ...
> >> Ok, I will do byte-level rewrite for next revision.
> > [...]
> >
> > This patch seems to work: https://pastes.dxuuu.xyz/0glrf9 .
> >
> > But I don't think it's very pretty. Also I'm seeing on the internet that
> > people are saying the exact layout of bitfields is compiler dependent.
>
> Any reference for this (exact layout of bitfields is compiler dependent)?
>
> > So I am wondering if these byte sized writes are correct.
> > For that matter, I am wondering how the GCC generated bitfield accesses
> > line up with clang generated BPF bytecode. Or why uapi contains a bitfield.
>
> One thing for sure is memory layout of bitfields should be the same
> for both clang and gcc as it is determined by C standard. Register
> representation and how to manipulate could be different for different
> compilers.
>
> >
> > WDYT, should I send up v2 with this or should I do one of the other
> > approaches in this thread?
>
> Daniel, look at your patch, since we need to do CORE_READ for
> those bitfields any way, I think Eduard's patch with
> BPF_CORE_WRITE_BITFIELD does make sense and it also makes code
> easy to understand. Could you take Eduard's patch for now?
> Whether and where to put BPF_CORE_WRITE_BITFIELD macros
> can be decided later.

bpf_core_read.h name is... let's say "historical" and was never meant
to limit stuff there to read-only or anything like that. Think about
it as just bpf_core.h where all the CO-RE-related stuff goes. So
please put BPF_CORE_WRITE_BITFIELD there.

>
> >
> > I am ok with any of the approaches.
> >
> > Thanks,
> > Daniel
> >
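
For anyone following the shift discussion in the quoted macro, here is a
minimal host-side sketch of the masked read-modify-write that a bitfield
write boils down to. It is not the macro from the patch and not libbpf API:
bitfield_replace() and bitfield_extract() are hypothetical names made up for
illustration. It assumes lshift and rshift follow the same convention as the
LSHIFT_U64/RSHIFT_U64 relocations on the read side, i.e.
(unit << lshift) >> rshift yields the unsigned field value. CO-RE relocation
and the byte_size load/store switch are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (assumption, not part of libbpf): replace one
 * bitfield inside an already-loaded storage unit that has been
 * zero-extended to 64 bits.
 */
static inline uint64_t bitfield_replace(uint64_t unit, unsigned int lshift,
					unsigned int rshift, uint64_t new_val)
{
	/* The field is 64 - rshift bits wide, so rshift is at most 63. */
	assert(lshift <= rshift && rshift < 64);

	/* Distance of the field's lowest bit from bit 0 of the unit. */
	unsigned int rpad = rshift - lshift;
	/* Ones exactly over the field: 64 - rshift bits starting at rpad. */
	uint64_t mask = (~0ULL << rshift) >> lshift;

	/* Keep the bits outside the field, drop in the truncated new value. */
	return (unit & ~mask) | ((new_val << rpad) & mask);
}

/* Read side under the same assumed convention (unsigned fields only). */
static inline uint64_t bitfield_extract(uint64_t unit, unsigned int lshift,
					unsigned int rshift)
{
	return (unit << lshift) >> rshift;
}
```

As a worked case, a 3-bit field at bit offset 5 of a one-byte unit
corresponds to lshift = 56 and rshift = 61; writing 2 into a unit holding
0xFF leaves 0x5F, and reading it back with the same shifts returns 2.
Building the mask from the two shifts keeps the neighbouring bits intact
without computing a separate bit_size.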