Date: Tue, 28 Nov 2023 10:17:51 -0600
From: Daniel Xu
To: Yonghong Song
Cc: Eduard Zingerman, Alexei Starovoitov, Shuah Khan, Daniel Borkmann,
 Andrii Nakryiko, Alexei Starovoitov, Steffen Klassert,
 antony.antony@secunet.com, Mykola Lysenko, Martin KaFai Lau, Song Liu,
 John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, bpf,
 "open list:KERNEL SELFTEST FRAMEWORK", LKML, devel@linux-ipsec.org,
 Network Development
Subject: Re: [PATCH ipsec-next v1 6/7] bpf: selftests: test_tunnel: Disable CO-RE relocations
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 28, 2023 at 10:13:50AM -0600, Daniel Xu wrote:
> On Mon, Nov 27, 2023 at 08:06:01PM -0800, Yonghong Song wrote:
> >
> > On 11/27/23 7:01 PM, Daniel Xu wrote:
> > > On Mon, Nov 27, 2023 at 02:45:11PM -0600, Daniel Xu wrote:
> > > > On Sun, Nov 26, 2023 at 09:53:04PM -0800, Yonghong Song wrote:
> > > > > On 11/27/23 12:44 AM, Yonghong Song wrote:
> > > > > > On 11/26/23 8:52 PM, Eduard Zingerman wrote:
> > > > > > > On Sun, 2023-11-26 at 18:04 -0600, Daniel Xu wrote:
> > > > > > > [...]
> > > > > > > > > Tbh I'm not sure. This test passes with preserve_static_offset
> > > > > > > > > because it suppresses preserve_access_index. In general clang
> > > > > > > > > translates bitfield access to a set of IR statements like:
> > > > > > > > >
> > > > > > > > >   C:
> > > > > > > > >     struct foo {
> > > > > > > > >       unsigned _;
> > > > > > > > >       unsigned a:1;
> > > > > > > > >       ...
> > > > > > > > >     };
> > > > > > > > >     ... foo->a ...
> > > > > > > > >
> > > > > > > > >   IR:
> > > > > > > > >     %a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
> > > > > > > > >     %bf.load = load i8, ptr %a, align 4
> > > > > > > > >     %bf.clear = and i8 %bf.load, 1
> > > > > > > > >     %bf.cast = zext i8 %bf.clear to i32
> > > > > > > > >
> > > > > > > > > With preserve_static_offset the getelementptr+load are replaced by a
> > > > > > > > > single statement which is preserved as-is till code generation,
> > > > > > > > > thus load with align 4 is preserved.
> > > > > > > > >
> > > > > > > > > On the other hand, I'm not sure that clang guarantees that load or
> > > > > > > > > stores used for bitfield access would be always aligned according to
> > > > > > > > > verifier expectations.
> > > > > > > > >
> > > > > > > > > I think we should check if there are some clang knobs that prevent
> > > > > > > > > generation of unaligned memory access. I'll take a look.
> > > > > > > > Is there a reason to prefer fixing in compiler? I'm not opposed to it,
> > > > > > > > but the downside to compiler fix is it takes years to propagate and
> > > > > > > > sprinkles ifdefs into the code.
> > > > > > > >
> > > > > > > > Would it be possible to have an analogue of BPF_CORE_READ_BITFIELD()?
> > > > > > > Well, the contraption below passes verification, tunnel selftest
> > > > > > > appears to work. I might have messed up some shifts in the macro,
> > > > > > > though.
> > > > > > I didn't test it. But from high level it should work.
> > > > > >
> > > > > > > Still, if clang would peek unlucky BYTE_{OFFSET,SIZE} for a particular
> > > > > > > field access might be unaligned.
> > > > > > clang should pick a sensible BYTE_SIZE/BYTE_OFFSET to meet
> > > > > > alignment requirement. This is also required for BPF_CORE_READ_BITFIELD.
> > > > > >
> > > > > > > ---
> > > > > > >
> > > > > > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > index 3065a716544d..41cd913ac7ff 100644
> > > > > > > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > @@ -9,6 +9,7 @@
> > > > > > >  #include "vmlinux.h"
> > > > > > >  #include
> > > > > > >  #include
> > > > > > > +#include
> > > > > > >  #include "bpf_kfuncs.h"
> > > > > > >  #include "bpf_tracing_net.h"
> > > > > > >
> > > > > > > @@ -144,6 +145,38 @@ int ip6gretap_get_tunnel(struct __sk_buff *skb)
> > > > > > >      return TC_ACT_OK;
> > > > > > >  }
> > > > > > >
> > > > > > > +#define BPF_CORE_WRITE_BITFIELD(s, field, new_val) ({ \
> > > > > > > +    void *p = (void *)s + __CORE_RELO(s, field, BYTE_OFFSET); \
> > > > > > > +    unsigned byte_size = __CORE_RELO(s, field, BYTE_SIZE); \
> > > > > > > +    unsigned lshift = __CORE_RELO(s, field, LSHIFT_U64); \
> > > > > > > +    unsigned rshift = __CORE_RELO(s, field, RSHIFT_U64); \
> > > > > > > +    unsigned bit_size = (rshift - lshift); \
> > > > > > > +    unsigned long long nval, val, hi, lo; \
> > > > > > > + \
> > > > > > > +    asm volatile("" : "=r"(p) : "0"(p)); \
> > > > > > Use asm volatile("" : "+r"(p)) ?
> > > > > >
> > > > > > > + \
> > > > > > > +    switch (byte_size) { \
> > > > > > > +    case 1: val = *(unsigned char *)p; break; \
> > > > > > > +    case 2: val = *(unsigned short *)p; break; \
> > > > > > > +    case 4: val = *(unsigned int *)p; break; \
> > > > > > > +    case 8: val = *(unsigned long long *)p; break; \
> > > > > > > +    } \
> > > > > > > +    hi = val >> (bit_size + rshift); \
> > > > > > > +    hi <<= bit_size + rshift; \
> > > > > > > +    lo = val << (bit_size + lshift); \
> > > > > > > +    lo >>= bit_size + lshift; \
> > > > > > > +    nval = new_val; \
> > > > > > > +    nval <<= lshift; \
> > > > > > > +    nval >>= rshift; \
> > > > > > > +    val = hi | nval | lo; \
> > > > > > > +    switch (byte_size) { \
> > > > > > > +    case 1: *(unsigned char *)p      = val; break; \
> > > > > > > +    case 2: *(unsigned short *)p     = val; break; \
> > > > > > > +    case 4: *(unsigned int *)p       = val; break; \
> > > > > > > +    case 8: *(unsigned long long *)p = val; break; \
> > > > > > > +    } \
> > > > > > > +})
> > > > > > I think this should be put in libbpf public header files but not sure
> > > > > > where to put it. bpf_core_read.h although it is core write?
> > > > > >
> > > > > > But on the other hand, this is a uapi struct bitfield write,
> > > > > > strictly speaking, CORE write is really unnecessary here. It
> > > > > > would be great if we can relieve users from dealing with
> > > > > > such unnecessary CORE writes. In that sense, for this particular
> > > > > > case, I would prefer rewriting the code by using byte-level
> > > > > > stores...
> > > > > or preserve_static_offset to clearly mean to undo bitfield CORE ...
> > > > Ok, I will do byte-level rewrite for next revision.
> > > [...]
> > >
> > > This patch seems to work: https://pastes.dxuuu.xyz/0glrf9 .
> > >
> > > But I don't think it's very pretty. Also I'm seeing on the internet that
> > > people are saying the exact layout of bitfields is compiler dependent.
> >
> > Any reference for this (exact layout of bitfields is compiler dependent)?
> >
> > > So I am wondering if these byte sized writes are correct. For that
> > > matter, I am wondering how the GCC generated bitfield accesses line up
> > > with clang generated BPF bytecode. Or why uapi contains a bitfield.
> >
> > One thing for sure is memory layout of bitfields should be the same
> > for both clang and gcc as it is determined by C standard. Register
> > representation and how to manipulate could be different for different
> > compilers.
>
> I was reading this thread:
> https://github.com/Lora-net/LoRaMac-node/issues/697. It's obviously not
> authoritative, but they sure sound confident!
>
> I think I've also heard it before a long time ago when I was working on
> adding bitfield support to bpftrace.

Wikipedia [0] also claims this:

  The layout of bit fields in a C struct is implementation-defined. For
  behavior that remains predictable across compilers, it may be preferable
  to emulate bit fields with a primitive and bit operators:

[0]: https://en.wikipedia.org/wiki/Bit_field#C_programming_language
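
For illustration only, here is a rough sketch of what "emulate bit fields with a primitive and bit operators" could look like in plain C. The layout (a 1-bit flag at bit 0 and a 3-bit mode at bits 1..3 of one byte) and the helper names field_mask/field_get/field_set are made up for this example; they are not taken from the kernel, libbpf, or the patch under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, chosen only for this example. */
enum {
	FLAG_SHIFT = 0, FLAG_WIDTH = 1,	/* bit 0      */
	MODE_SHIFT = 1, MODE_WIDTH = 3,	/* bits 1..3  */
};

/* Mask covering `width` bits starting at bit `shift` of a byte. */
static inline uint8_t field_mask(unsigned shift, unsigned width)
{
	assert(width > 0 && shift + width <= 8);
	return (uint8_t)(((1u << width) - 1u) << shift);
}

/* Extract the field value, right-aligned. */
static inline uint8_t field_get(uint8_t byte, unsigned shift, unsigned width)
{
	return (uint8_t)((byte & field_mask(shift, width)) >> shift);
}

/*
 * Return `byte` with the field replaced by `val`. Bits outside the
 * field are left untouched; bits of `val` that do not fit are dropped.
 */
static inline uint8_t field_set(uint8_t byte, unsigned shift, unsigned width,
				unsigned val)
{
	uint8_t mask = field_mask(shift, width);

	return (uint8_t)((byte & ~mask) | ((val << shift) & mask));
}
```

Because the shifts and widths are spelled out explicitly, the position of each field within the byte does not depend on how a particular compiler allocates bit-field members. A caller would do something like `b = field_set(b, MODE_SHIFT, MODE_WIDTH, 5);` followed by a single byte-sized store.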