Received: by 2002:a05:7412:cfc7:b0:fc:a2b0:25d7 with SMTP id by7csp1504065rdb; Mon, 19 Feb 2024 19:45:35 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCVHPz38qEK1TiJTPml5ijlSmT4sotIe4Wf08xq+wzEZbKUG2BaaISlDoIT8+ZCGJPcFVeKPZB6iuq6DdL4rv1h1R9kHl3oTA0JVr0q0LQ== X-Google-Smtp-Source: AGHT+IGEd0tmSHvuDaWBVkRf05LowgH0EzgjEQ1EzMCkdbA3VyIrzwnNtezl1qTQ2kuyScj2FPwM X-Received: by 2002:adf:e19e:0:b0:33d:1d7a:fc7e with SMTP id az30-20020adfe19e000000b0033d1d7afc7emr11807469wrb.25.1708400735614; Mon, 19 Feb 2024 19:45:35 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708400735; cv=pass; d=google.com; s=arc-20160816; b=VjcC6r9euwWBIg+WF0o3+RpHQExSCZ7qirtrW04m57TEbnOBMPQvVfg2Y05qmGJzlC +5bW8Mt9/aaSdYAsz+OqHvxKV8tdy4I1CgR6r+OgH8yggi16WIdfz52H6yRGeqeH9Tdl 25tZdbbkMZTEbzvwdSaRJ00ycsTwOEXVmo6Nhqmz6XygjR1vNnByUn26MtJiEefXlvo+ bUl/NqYg+A5zE/+f0XiNvtpW4ELksHJpSn2tk/yeTlW36B2HDj8jdPMDm93rNSpIsG5m f+jVB9cyQuhxhQlMKHO2DYt785LI3wTxJXvlNlotsYTOIDtEVLnQ2UpFEXEerFWs52h6 ioZw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:date:message-id:dkim-signature; bh=h1njW/efThCJFEqzUeXvd51T+cMl3qjxmrwE87+THD0=; fh=TsOvyMQKTSbPbDsJHcAjMRccwDNl6EiB6R/yoKo/aNY=; b=UD6pte3AU2IlzLZAFID4AQ9aJ0XdDGHx+GOvMmhlDMaz5jrBtK3jIwlD7UPaMXSx1w EkZzRK1TKSLkK3o0NHjOt1nLmXSqTum4ZvqYk5Vm0cOTl0y2xG9bBX8wWNwJGC5oeVB6 YCBQQD5UUU8P9PvAQZZ1C1eT9VK9PTtfQW5fwJLzQDV3r30Od3NjF2HzoTy7V2xmsRp/ D5NC5AYuoaAldiwsQnzOz/IbRavZFIF7Rwz8tn/41XnL8JJFGY89ZltiPgtgyqfgEj8+ zTeX4Qw707nSxFngogrksT6VHzP6LunB6M7byGbPoqxUUhgVRMflf88rLGlf4efwJoLN mRHw==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=fail header.i=@embeddedor.com header.s=default header.b=QMKgEzCE; arc=pass (i=1 spf=pass spfdomain=embeddedor.com dkim=pass dkdomain=embeddedor.com); spf=pass (google.com: domain of linux-kernel+bounces-72284-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-72284-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id m25-20020a50d7d9000000b00564accf2b6esi637547edj.406.2024.02.19.19.45.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 19 Feb 2024 19:45:35 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-72284-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=fail header.i=@embeddedor.com header.s=default header.b=QMKgEzCE; arc=pass (i=1 spf=pass spfdomain=embeddedor.com dkim=pass dkdomain=embeddedor.com); spf=pass (google.com: domain of linux-kernel+bounces-72284-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-72284-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 055B71F2224D for ; Tue, 20 Feb 2024 03:45:35 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C5F555644E; Tue, 20 Feb 2024 03:45:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=embeddedor.com header.i=@embeddedor.com header.b="QMKgEzCE" Received: from omta036.useast.a.cloudfilter.net (omta036.useast.a.cloudfilter.net [44.202.169.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CBF1C4C610 for ; Tue, 20 Feb 2024 03:45:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=44.202.169.35 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708400717; cv=none; b=Qbj8m8yCMDBU+ql1xemr+x91EMXQa1wQ1a5LOmCebuOsbCM9u+RylmOanbstGAehVdA10JelXTbRn3Zew4ot9zarwcW0NqpZyBf8ZmzBC8vaLAuS3EJiWOAUr+C8tWv6Boa54HRl9qEBASmQWhdHcf5/7eIsESBG7V/xLgzMA3g= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708400717; c=relaxed/simple; bh=OcPCIZy7nXuy5ziO0susXqNVG47xVgOrCSXAibeuhFE=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=bk6ByOuImAE8CH5p9pjko3JvYz9NYXOSrPbCf66/y3pOE54BDO0F1wcXt416dp8PHd8tx7Tr8+VOstOmbV6Ygyh2g7RpMQtFEesRjJsHQ1TizpMmyd7rWhgD5viOpv5IDmn4JBsU3cjyyzBCcti2bWtM1d0D0Q2WfdOHme4jNzA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=embeddedor.com; spf=pass smtp.mailfrom=embeddedor.com; dkim=pass (2048-bit key) header.d=embeddedor.com header.i=@embeddedor.com header.b=QMKgEzCE; arc=none smtp.client-ip=44.202.169.35 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=embeddedor.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=embeddedor.com Received: from eig-obgw-5010a.ext.cloudfilter.net ([10.0.29.199]) by cmsmtp with ESMTPS id cH1Pr1Cmw8uLRcH48rjKZp; Tue, 20 Feb 2024 03:45:08 +0000 Received: from gator4166.hostgator.com ([108.167.133.22]) by cmsmtp with ESMTPS id cH47ra6GujzOqcH48r5WWB; Tue, 20 Feb 2024 03:45:08 +0000 X-Authority-Analysis: v=2.4 cv=VdxUP0p9 c=1 sm=1 tr=0 ts=65d42044 a=1YbLdUo/zbTtOZ3uB5T3HA==:117 a=VhncohosazJxI00KdYJ/5A==:17 a=IkcTkHD0fZMA:10 a=k7vzHIieQBIA:10 a=wYkD_t78qR0A:10 a=VwQbUJbxAAAA:8 a=7CQSdrXTAAAA:8 a=3_uRt0xjAAAA:8 a=cm27Pg_UAAAA:8 a=hWMQpYRtAAAA:8 a=FOH2dFAWAAAA:8 a=pGLkceISAAAA:8 a=1XWaLZrsAAAA:8 a=yjU-xTemAAAA:8 a=eDVy6FjPjR8uDmFmMbsA:9 a=QEXdDO2ut3YA:10 a=AjGcO6oz07-iQ99wixmX:22 a=a-qgeE7W1pNrGK8U0ZQC:22 a=z1SuboXgGPGzQ8_2mWib:22 a=xmb-EsYY8bH0VWELuYED:22 a=KCsI-UfzjElwHeZNREa_:22 a=i3VuKzQdj-NEYjvDI-p3:22 a=SwQY0DHxSCHDbjv2szoi:22 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=embeddedor.com; s=default; h=Content-Transfer-Encoding:Content-Type: In-Reply-To:From:References:Cc:To:Subject:MIME-Version:Date:Message-ID:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help: List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=h1njW/efThCJFEqzUeXvd51T+cMl3qjxmrwE87+THD0=; b=QMKgEzCE0INN/Ug+cnk4Rb96bG TutBBXmI/CslUDUjom0yyYT6CfL/lzWkvLUeNKnAOtao0VaZRNONBhTYS4lAcXQHLJPUTo9XbkXN/ aiABwQOHhgKbTVx/ZtwOOCUmvZh6xPtuGV/mPU2XNxGt9AnSt/8Yd0Ed8ATy7Lqyh5lmX0EV2jWcL gz8Ets25CCkNXomnGH/kcd22rQ2TpLcTkVpJ6VEpsfJGdeAwA9NHbsdUbzUTCl909843x7HmS2TtM V9HmTy41ddkY+2IDEy+CEbyziMrPtRy5cHiYhbYy1P3vPp934o4ZePWX34CzfxH5o1I/PHBSrtoA2 R+wYPAVA==; Received: from [201.172.172.225] (port=37524 helo=[192.168.15.10]) by gator4166.hostgator.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96.2) (envelope-from ) id 1rcH43-001sze-2s; Mon, 19 Feb 2024 21:45:03 -0600 Message-ID: Date: Mon, 19 Feb 2024 21:45:00 -0600 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v3] bpf: Replace bpf_lpm_trie_key 0-length array with flexible array Content-Language: en-US To: Kees Cook , Daniel Borkmann Cc: Mark Rutland , Alexei Starovoitov , Andrii Nakryiko , Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan , Haowen Bai , bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, Yonghong Song , Jonathan Corbet , "David S. Miller" , Jakub Kicinski , Jesper Dangaard Brouer , Joanne Koong , Yafang Shao , Kui-Feng Lee , Anton Protopopov , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, netdev@vger.kernel.org, linux-hardening@vger.kernel.org References: <20240219234121.make.373-kees@kernel.org> From: "Gustavo A. R. Silva" In-Reply-To: <20240219234121.make.373-kees@kernel.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - gator4166.hostgator.com X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - embeddedor.com X-BWhitelist: no X-Source-IP: 201.172.172.225 X-Source-L: No X-Exim-ID: 1rcH43-001sze-2s X-Source: X-Source-Args: X-Source-Dir: X-Source-Sender: ([192.168.15.10]) [201.172.172.225]:37524 X-Source-Auth: gustavo@embeddedor.com X-Email-Count: 4 X-Org: HG=hgshared;ORG=hostgator; X-Source-Cap: Z3V6aWRpbmU7Z3V6aWRpbmU7Z2F0b3I0MTY2Lmhvc3RnYXRvci5jb20= X-Local-Domain: yes X-CMAE-Envelope: MS4xfJjMP6JKIhCvBqjC0j+UgpXA+3tx+/QSHEQcjLA8frJ5G5k5hi68bb8z5pcKy2i1LE4MVQexTk2/RBQgQ9rJ0BakUITy1Q4E8fslUepvMupTWmbNZEd8 CDlCxwYjfN2A/qhANlcGYHUkwvbXYDTetIMN4jgxNNynY36k9oDBiLptSAOdKvXxD0tzMeFUpeA3Tze/wzPUR2dsG9+sXCLBFSiJmB+1bUSNDqrPdw1aEWX7 On 2/19/24 17:41, Kees Cook wrote: > Replace deprecated 0-length array in struct bpf_lpm_trie_key with > flexible array. Found with GCC 13: > > ../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=] > 207 | *(__be16 *)&key->data[i]); > | ^~~~~~~~~~~~~ > ../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16' > 102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) > | ^ > ../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu' > 97 | #define be16_to_cpu __be16_to_cpu > | ^~~~~~~~~~~~~ > ../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu' > 206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i] > ^ > | ^~~~~~~~~~~ > In file included from ../include/linux/bpf.h:7: > ../include/uapi/linux/bpf.h:82:17: note: while referencing 'data' > 82 | __u8 data[0]; /* Arbitrary size */ > | ^~~~ > > And found at run-time under CONFIG_FORTIFY_SOURCE: > > UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49 > index 0 is out of range for type '__u8 [*]' > > Changing struct bpf_lpm_trie_key is difficult since has been used by > userspace. For example, in Cilium: > > struct egress_gw_policy_key { > struct bpf_lpm_trie_key lpm_key; > __u32 saddr; > __u32 daddr; > }; > > While direct references to the "data" member haven't been found, there > are static initializers what include the final member. For example, > the "{}" here: > > struct egress_gw_policy_key in_key = { > .lpm_key = { 32 + 24, {} }, > .saddr = CLIENT_IP, > .daddr = EXTERNAL_SVC_IP & 0Xffffff, > }; > > To avoid the build time and run time warnings seen with a 0-sized > trailing array for struct bpf_lpm_trie_key, introduce a new struct > that correctly uses a flexible array for the trailing bytes, > struct bpf_lpm_trie_key_u8. As part of this, include the "header" > portion (which is just the "prefixlen" member), so it can be used > by anything building a bpf_lpr_trie_key that has trailing members that > aren't a u8 flexible array (like the self-test[1]), which is named > struct bpf_lpm_trie_key_hdr. Yep, this `_hdr` tagged struct adds flexibility (no pun intended :p) to flexible structures currently nested in the middle of other structs. We can just use the header members grouped together in the `_hdr` struct, and avoid embedding the flexible-array member. :) > > Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out, > and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment > to the UAPI header directing folks to the two new options. > > Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1] > Reported-by: Mark Rutland > Closes: https://paste.debian.net/hidden/ca500597/ > Signed-off-by: Kees Cook Acked-by: Gustavo A. R. Silva Thanks! -- Gustavo > --- > v3- create a new pair of structs -- leave old struct alone > v2- https://lore.kernel.org/lkml/20240216235536.it.234-kees@kernel.org/ > v1- https://lore.kernel.org/lkml/20230204183241.never.481-kees@kernel.org/ > Cc: Daniel Borkmann > Cc: Alexei Starovoitov > Cc: Andrii Nakryiko > Cc: Martin KaFai Lau > Cc: Song Liu > Cc: Yonghong Song > Cc: John Fastabend > Cc: KP Singh > Cc: Stanislav Fomichev > Cc: Hao Luo > Cc: Jiri Olsa > Cc: Mykola Lysenko > Cc: Shuah Khan > Cc: Haowen Bai > Cc: bpf@vger.kernel.org > Cc: linux-kselftest@vger.kernel.org > --- > Documentation/bpf/map_lpm_trie.rst | 2 +- > include/uapi/linux/bpf.h | 14 ++++++++++++- > kernel/bpf/lpm_trie.c | 20 +++++++++---------- > samples/bpf/map_perf_test_user.c | 2 +- > samples/bpf/xdp_router_ipv4_user.c | 2 +- > tools/include/uapi/linux/bpf.h | 14 ++++++++++++- > .../selftests/bpf/progs/map_ptr_kern.c | 2 +- > tools/testing/selftests/bpf/test_lpm_map.c | 18 ++++++++--------- > 8 files changed, 49 insertions(+), 25 deletions(-) > > diff --git a/Documentation/bpf/map_lpm_trie.rst b/Documentation/bpf/map_lpm_trie.rst > index 74d64a30f500..f9cd579496c9 100644 > --- a/Documentation/bpf/map_lpm_trie.rst > +++ b/Documentation/bpf/map_lpm_trie.rst > @@ -17,7 +17,7 @@ significant byte. > > LPM tries may be created with a maximum prefix length that is a multiple > of 8, in the range from 8 to 2048. The key used for lookup and update > -operations is a ``struct bpf_lpm_trie_key``, extended by > +operations is a ``struct bpf_lpm_trie_key_u8``, extended by > ``max_prefixlen/8`` bytes. > > - For IPv4 addresses the data length is 4 bytes > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index 754e68ca8744..c5a46d12076e 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -77,12 +77,24 @@ struct bpf_insn { > __s32 imm; /* signed immediate constant */ > }; > > -/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */ > +/* Deprecated: use struct bpf_lpm_trie_key_u8 (when the "data" member is needed for > + * byte access) or struct bpf_lpm_trie_key_hdr (when using an alternative type for > + * the trailing flexible array member) instead. > + */ > struct bpf_lpm_trie_key { > __u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */ > __u8 data[0]; /* Arbitrary size */ > }; > > +/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry, with trailing byte array. */ > +struct bpf_lpm_trie_key_u8 { > + __struct_group(bpf_lpm_trie_key_hdr, hdr, /* no attrs */, > + /* up to 32 for AF_INET, 128 for AF_INET6 */ > + __u32 prefixlen; > + ); > + __u8 data[]; /* Arbitrary size */ > +}; > + > struct bpf_cgroup_storage_key { > __u64 cgroup_inode_id; /* cgroup inode id */ > __u32 attach_type; /* program attach type (enum bpf_attach_type) */ > diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c > index b32be680da6c..050fe1ebf0f7 100644 > --- a/kernel/bpf/lpm_trie.c > +++ b/kernel/bpf/lpm_trie.c > @@ -164,13 +164,13 @@ static inline int extract_bit(const u8 *data, size_t index) > */ > static size_t longest_prefix_match(const struct lpm_trie *trie, > const struct lpm_trie_node *node, > - const struct bpf_lpm_trie_key *key) > + const struct bpf_lpm_trie_key_u8 *key) > { > u32 limit = min(node->prefixlen, key->prefixlen); > u32 prefixlen = 0, i = 0; > > BUILD_BUG_ON(offsetof(struct lpm_trie_node, data) % sizeof(u32)); > - BUILD_BUG_ON(offsetof(struct bpf_lpm_trie_key, data) % sizeof(u32)); > + BUILD_BUG_ON(offsetof(struct bpf_lpm_trie_key_u8, data) % sizeof(u32)); > > #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) && defined(CONFIG_64BIT) > > @@ -229,7 +229,7 @@ static void *trie_lookup_elem(struct bpf_map *map, void *_key) > { > struct lpm_trie *trie = container_of(map, struct lpm_trie, map); > struct lpm_trie_node *node, *found = NULL; > - struct bpf_lpm_trie_key *key = _key; > + struct bpf_lpm_trie_key_u8 *key = _key; > > if (key->prefixlen > trie->max_prefixlen) > return NULL; > @@ -309,7 +309,7 @@ static long trie_update_elem(struct bpf_map *map, > struct lpm_trie *trie = container_of(map, struct lpm_trie, map); > struct lpm_trie_node *node, *im_node = NULL, *new_node = NULL; > struct lpm_trie_node __rcu **slot; > - struct bpf_lpm_trie_key *key = _key; > + struct bpf_lpm_trie_key_u8 *key = _key; > unsigned long irq_flags; > unsigned int next_bit; > size_t matchlen = 0; > @@ -437,7 +437,7 @@ static long trie_update_elem(struct bpf_map *map, > static long trie_delete_elem(struct bpf_map *map, void *_key) > { > struct lpm_trie *trie = container_of(map, struct lpm_trie, map); > - struct bpf_lpm_trie_key *key = _key; > + struct bpf_lpm_trie_key_u8 *key = _key; > struct lpm_trie_node __rcu **trim, **trim2; > struct lpm_trie_node *node, *parent; > unsigned long irq_flags; > @@ -536,7 +536,7 @@ static long trie_delete_elem(struct bpf_map *map, void *_key) > sizeof(struct lpm_trie_node)) > #define LPM_VAL_SIZE_MIN 1 > > -#define LPM_KEY_SIZE(X) (sizeof(struct bpf_lpm_trie_key) + (X)) > +#define LPM_KEY_SIZE(X) (sizeof(struct bpf_lpm_trie_key_u8) + (X)) > #define LPM_KEY_SIZE_MAX LPM_KEY_SIZE(LPM_DATA_SIZE_MAX) > #define LPM_KEY_SIZE_MIN LPM_KEY_SIZE(LPM_DATA_SIZE_MIN) > > @@ -565,7 +565,7 @@ static struct bpf_map *trie_alloc(union bpf_attr *attr) > /* copy mandatory map attributes */ > bpf_map_init_from_attr(&trie->map, attr); > trie->data_size = attr->key_size - > - offsetof(struct bpf_lpm_trie_key, data); > + offsetof(struct bpf_lpm_trie_key_u8, data); > trie->max_prefixlen = trie->data_size * 8; > > spin_lock_init(&trie->lock); > @@ -616,7 +616,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key) > { > struct lpm_trie_node *node, *next_node = NULL, *parent, *search_root; > struct lpm_trie *trie = container_of(map, struct lpm_trie, map); > - struct bpf_lpm_trie_key *key = _key, *next_key = _next_key; > + struct bpf_lpm_trie_key_u8 *key = _key, *next_key = _next_key; > struct lpm_trie_node **node_stack = NULL; > int err = 0, stack_ptr = -1; > unsigned int next_bit; > @@ -703,7 +703,7 @@ static int trie_get_next_key(struct bpf_map *map, void *_key, void *_next_key) > } > do_copy: > next_key->prefixlen = next_node->prefixlen; > - memcpy((void *)next_key + offsetof(struct bpf_lpm_trie_key, data), > + memcpy((void *)next_key + offsetof(struct bpf_lpm_trie_key_u8, data), > next_node->data, trie->data_size); > free_stack: > kfree(node_stack); > @@ -715,7 +715,7 @@ static int trie_check_btf(const struct bpf_map *map, > const struct btf_type *key_type, > const struct btf_type *value_type) > { > - /* Keys must have struct bpf_lpm_trie_key embedded. */ > + /* Keys must have struct bpf_lpm_trie_key_u8 embedded. */ > return BTF_INFO_KIND(key_type->info) != BTF_KIND_STRUCT ? > -EINVAL : 0; > } > diff --git a/samples/bpf/map_perf_test_user.c b/samples/bpf/map_perf_test_user.c > index d2fbcf963cdf..07ff471ed6ae 100644 > --- a/samples/bpf/map_perf_test_user.c > +++ b/samples/bpf/map_perf_test_user.c > @@ -370,7 +370,7 @@ static void run_perf_test(int tasks) > > static void fill_lpm_trie(void) > { > - struct bpf_lpm_trie_key *key; > + struct bpf_lpm_trie_key_u8 *key; > unsigned long value = 0; > unsigned int i; > int r; > diff --git a/samples/bpf/xdp_router_ipv4_user.c b/samples/bpf/xdp_router_ipv4_user.c > index 9d41db09c480..266fdd0b025d 100644 > --- a/samples/bpf/xdp_router_ipv4_user.c > +++ b/samples/bpf/xdp_router_ipv4_user.c > @@ -91,7 +91,7 @@ static int recv_msg(struct sockaddr_nl sock_addr, int sock) > static void read_route(struct nlmsghdr *nh, int nll) > { > char dsts[24], gws[24], ifs[16], dsts_len[24], metrics[24]; > - struct bpf_lpm_trie_key *prefix_key; > + struct bpf_lpm_trie_key_u8 *prefix_key; > struct rtattr *rt_attr; > struct rtmsg *rt_msg; > int rtm_family; > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h > index 7f24d898efbb..c55244bf1a20 100644 > --- a/tools/include/uapi/linux/bpf.h > +++ b/tools/include/uapi/linux/bpf.h > @@ -77,12 +77,24 @@ struct bpf_insn { > __s32 imm; /* signed immediate constant */ > }; > > -/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry */ > +/* Deprecated: use struct bpf_lpm_trie_key_u8 (when the "data" member is needed for > + * byte access) or struct bpf_lpm_trie_key_hdr (when using an alternative type for > + * the trailing flexible array member) instead. > + */ > struct bpf_lpm_trie_key { > __u32 prefixlen; /* up to 32 for AF_INET, 128 for AF_INET6 */ > __u8 data[0]; /* Arbitrary size */ > }; > > +/* Key of an a BPF_MAP_TYPE_LPM_TRIE entry, with trailing byte array. */ > +struct bpf_lpm_trie_key_u8 { > + __struct_group(bpf_lpm_trie_key_hdr, hdr, /* no attrs */, > + /* up to 32 for AF_INET, 128 for AF_INET6 */ > + __u32 prefixlen; > + ); > + __u8 data[]; /* Arbitrary size */ > +}; > + > struct bpf_cgroup_storage_key { > __u64 cgroup_inode_id; /* cgroup inode id */ > __u32 attach_type; /* program attach type (enum bpf_attach_type) */ > diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c > index 3325da17ec81..efaf622c28dd 100644 > --- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c > +++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c > @@ -316,7 +316,7 @@ struct lpm_trie { > } __attribute__((preserve_access_index)); > > struct lpm_key { > - struct bpf_lpm_trie_key trie_key; > + struct bpf_lpm_trie_key_hdr trie_key; > __u32 data; > }; > > diff --git a/tools/testing/selftests/bpf/test_lpm_map.c b/tools/testing/selftests/bpf/test_lpm_map.c > index c028d621c744..d98c72dc563e 100644 > --- a/tools/testing/selftests/bpf/test_lpm_map.c > +++ b/tools/testing/selftests/bpf/test_lpm_map.c > @@ -211,7 +211,7 @@ static void test_lpm_map(int keysize) > volatile size_t n_matches, n_matches_after_delete; > size_t i, j, n_nodes, n_lookups; > struct tlpm_node *t, *list = NULL; > - struct bpf_lpm_trie_key *key; > + struct bpf_lpm_trie_key_u8 *key; > uint8_t *data, *value; > int r, map; > > @@ -331,8 +331,8 @@ static void test_lpm_map(int keysize) > static void test_lpm_ipaddr(void) > { > LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_NO_PREALLOC); > - struct bpf_lpm_trie_key *key_ipv4; > - struct bpf_lpm_trie_key *key_ipv6; > + struct bpf_lpm_trie_key_u8 *key_ipv4; > + struct bpf_lpm_trie_key_u8 *key_ipv6; > size_t key_size_ipv4; > size_t key_size_ipv6; > int map_fd_ipv4; > @@ -423,7 +423,7 @@ static void test_lpm_ipaddr(void) > static void test_lpm_delete(void) > { > LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_NO_PREALLOC); > - struct bpf_lpm_trie_key *key; > + struct bpf_lpm_trie_key_u8 *key; > size_t key_size; > int map_fd; > __u64 value; > @@ -532,7 +532,7 @@ static void test_lpm_delete(void) > static void test_lpm_get_next_key(void) > { > LIBBPF_OPTS(bpf_map_create_opts, opts, .map_flags = BPF_F_NO_PREALLOC); > - struct bpf_lpm_trie_key *key_p, *next_key_p; > + struct bpf_lpm_trie_key_u8 *key_p, *next_key_p; > size_t key_size; > __u32 value = 0; > int map_fd; > @@ -693,9 +693,9 @@ static void *lpm_test_command(void *arg) > { > int i, j, ret, iter, key_size; > struct lpm_mt_test_info *info = arg; > - struct bpf_lpm_trie_key *key_p; > + struct bpf_lpm_trie_key_u8 *key_p; > > - key_size = sizeof(struct bpf_lpm_trie_key) + sizeof(__u32); > + key_size = sizeof(*key_p) + sizeof(__u32); > key_p = alloca(key_size); > for (iter = 0; iter < info->iter; iter++) > for (i = 0; i < MAX_TEST_KEYS; i++) { > @@ -717,7 +717,7 @@ static void *lpm_test_command(void *arg) > ret = bpf_map_lookup_elem(info->map_fd, key_p, &value); > assert(ret == 0 || errno == ENOENT); > } else { > - struct bpf_lpm_trie_key *next_key_p = alloca(key_size); > + struct bpf_lpm_trie_key_u8 *next_key_p = alloca(key_size); > ret = bpf_map_get_next_key(info->map_fd, key_p, next_key_p); > assert(ret == 0 || errno == ENOENT || errno == ENOMEM); > } > @@ -752,7 +752,7 @@ static void test_lpm_multi_thread(void) > > /* create a trie */ > value_size = sizeof(__u32); > - key_size = sizeof(struct bpf_lpm_trie_key) + value_size; > + key_size = sizeof(struct bpf_lpm_trie_key_hdr) + value_size; > map_fd = bpf_map_create(BPF_MAP_TYPE_LPM_TRIE, NULL, key_size, value_size, 100, &opts); > > /* create 4 threads to test update, delete, lookup and get_next_key */