Add support for creating PFCP filters in switchdev mode. Add pfcp module
that allows to create a PFCP-type netdev. The netdev then can be passed to
tc when creating a filter to indicate that PFCP filter should be created.
To add a PFCP filter, a special netdev must be created and passed to tc
command:
ip link add pfcp0 type pfcp
tc filter add dev eth0 ingress prio 1 flower pfcp_opts \
1:12ab/ff:fffffffffffffff0 skip_hw action mirred egress redirect \
dev pfcp0
Changes in iproute2 [1] are required to use pfcp_opts in tc.
ICE COMMS package is required as it contains PFCP profiles.
Part of this patchset modifies IP_TUNNEL_*_OPTs, which were previously
stored in a __be16. All possible values have already been used, making
it impossible to add new ones.
* 1-3: add new bitmap_{read,write}(), which is used later in the IP
tunnel flags code (from Alexander's ARM64 MTE series[2]);
* 4-14: some bitmap code preparations also used later in IP tunnels;
* 15-17: convert IP tunnel flags from __be16 to a bitmap;
* 18-21: add PFCP module and support for it in ice.
[1] https://lore.kernel.org/netdev/[email protected]
[2] https://lore.kernel.org/linux-kernel/[email protected]/
---
From v4[3]:
* rebase on top of 6.8-rc1;
* collect all the dependencies together in one series (did I get it
right? :s);
* no functional changes.
v3: https://lore.kernel.org/intel-wired-lan/[email protected]
v2: https://lore.kernel.org/intel-wired-lan/[email protected]
v1: https://lore.kernel.org/intel-wired-lan/[email protected]
[3] https://lore.kernel.org/netdev/[email protected]
---
Alexander Lobakin (14):
bitops: add missing prototype check
bitops: make BYTES_TO_BITS() treewide-available
bitops: let the compiler optimize {__,}assign_bit()
linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros
s390/cio: rename bitmap_size() -> idset_bitmap_size()
fs/ntfs3: add prefix to bitmap_size() and use BITS_TO_U64()
btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
tools: move alignment-related macros to new <linux/align.h>
bitmap: introduce generic optimized bitmap_size()
bitmap: make bitmap_{get,set}_value8() use bitmap_{read,write}()
lib/bitmap: add compile-time test for __assign_bit() optimization
ip_tunnel: use a separate struct to store tunnel params in the kernel
ip_tunnel: convert __be16 tunnel flags to bitmaps
lib/bitmap: add tests for IP tunnel flags conversion helpers
Alexander Potapenko (2):
lib/test_bitmap: add tests for bitmap_{read,write}()
lib/test_bitmap: use pr_info() for non-error messages
Marcin Szycik (2):
ice: refactor ICE_TC_FLWR_FIELD_ENC_OPTS
ice: Add support for PFCP hardware offload in switchdev
Michal Swiatkowski (1):
pfcp: always set pfcp metadata
Syed Nayyar Waris (1):
lib/bitmap: add bitmap_{read,write}()
Wojciech Drewek (1):
pfcp: add PFCP module
drivers/net/Kconfig | 13 +
drivers/net/Makefile | 1 +
.../net/ethernet/intel/ice/ice_flex_type.h | 4 +-
.../ethernet/intel/ice/ice_protocol_type.h | 12 +
drivers/net/ethernet/intel/ice/ice_switch.h | 2 +
drivers/net/ethernet/intel/ice/ice_tc_lib.h | 8 +-
.../ethernet/mellanox/mlx5/core/en/tc_tun.h | 2 +-
.../ethernet/mellanox/mlxsw/spectrum_ipip.h | 2 +-
fs/ntfs3/ntfs_fs.h | 4 +-
include/linux/bitmap.h | 93 ++++--
include/linux/bitops.h | 23 +-
include/linux/cpumask.h | 2 +-
include/linux/linkmode.h | 27 +-
include/linux/netdevice.h | 7 +-
include/net/dst_metadata.h | 10 +-
include/net/flow_dissector.h | 2 +-
include/net/gre.h | 70 ++--
include/net/ip6_tunnel.h | 4 +-
include/net/ip_tunnels.h | 139 ++++++--
include/net/pfcp.h | 90 ++++++
include/net/udp_tunnel.h | 4 +-
include/uapi/linux/if_tunnel.h | 36 +++
include/uapi/linux/pkt_cls.h | 14 +
tools/include/linux/align.h | 12 +
tools/include/linux/bitmap.h | 9 +-
tools/include/linux/bitops.h | 2 +
tools/include/linux/mm.h | 5 +-
drivers/md/dm-clone-metadata.c | 5 -
drivers/net/bareudp.c | 19 +-
drivers/net/ethernet/intel/ice/ice_ddp.c | 9 +
drivers/net/ethernet/intel/ice/ice_switch.c | 85 +++++
drivers/net/ethernet/intel/ice/ice_tc_lib.c | 68 +++-
.../mellanox/mlx5/core/en/tc_tun_encap.c | 6 +-
.../mellanox/mlx5/core/en/tc_tun_geneve.c | 12 +-
.../mellanox/mlx5/core/en/tc_tun_gre.c | 8 +-
.../mellanox/mlx5/core/en/tc_tun_vxlan.c | 9 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 16 +-
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 56 ++--
.../ethernet/mellanox/mlxsw/spectrum_span.c | 10 +-
.../ethernet/netronome/nfp/flower/action.c | 27 +-
drivers/net/geneve.c | 44 ++-
drivers/net/pfcp.c | 302 +++++++++++++++++
drivers/net/vxlan/vxlan_core.c | 14 +-
drivers/s390/cio/idset.c | 12 +-
fs/btrfs/free-space-cache.c | 8 +-
fs/ntfs3/bitmap.c | 4 +-
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 +-
fs/ntfs3/super.c | 2 +-
kernel/trace/trace_probe.c | 2 -
lib/math/prime_numbers.c | 2 -
lib/test_bitmap.c | 303 ++++++++++++++++--
net/bridge/br_vlan_tunnel.c | 9 +-
net/core/filter.c | 26 +-
net/core/flow_dissector.c | 20 +-
net/ipv4/fou_bpf.c | 2 +-
net/ipv4/gre_demux.c | 2 +-
net/ipv4/ip_gre.c | 144 +++++----
net/ipv4/ip_tunnel.c | 109 +++++--
net/ipv4/ip_tunnel_core.c | 82 +++--
net/ipv4/ip_vti.c | 41 ++-
net/ipv4/ipip.c | 33 +-
net/ipv4/ipmr.c | 2 +-
net/ipv4/udp_tunnel_core.c | 5 +-
net/ipv6/addrconf.c | 3 +-
net/ipv6/ip6_gre.c | 85 ++---
net/ipv6/ip6_tunnel.c | 14 +-
net/ipv6/sit.c | 38 ++-
net/netfilter/ipvs/ip_vs_core.c | 6 +-
net/netfilter/ipvs/ip_vs_xmit.c | 20 +-
net/netfilter/nft_tunnel.c | 44 +--
net/openvswitch/flow_netlink.c | 61 ++--
net/psample/psample.c | 26 +-
net/sched/act_tunnel_key.c | 36 +--
net/sched/cls_flower.c | 134 +++++++-
tools/perf/util/probe-finder.c | 4 +-
76 files changed, 1965 insertions(+), 614 deletions(-)
create mode 100644 include/net/pfcp.h
create mode 100644 tools/include/linux/align.h
create mode 100644 drivers/net/pfcp.c
--
2.43.0
From: Syed Nayyar Waris <[email protected]>
The two new functions allow reading/writing values of length up to
BITS_PER_LONG bits at arbitrary position in the bitmap.
The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris with a number of changes and simplifications:
- instead of using roundup(), which adds an unnecessary dependency
on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
- indentation is reduced by not using else-clauses (suggested by
checkpatch for bitmap_get_value());
- bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
and bitmap_write();
- some redundant computations are omitted.
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Syed Nayyar Waris <[email protected]>
Signed-off-by: William Breathitt Gray <[email protected]>
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov <[email protected]>
Co-developed-by: Alexander Potapenko <[email protected]>
Signed-off-by: Alexander Potapenko <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitmap.h | 77 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99451431e4d6..7ca0379be8c1 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -79,6 +79,10 @@ struct device;
* bitmap_to_arr64(buf, src, nbits) Copy nbits from buf to u64[] dst
* bitmap_get_value8(map, start) Get 8bit value from map at start
* bitmap_set_value8(map, value, start) Set 8bit value to map at start
+ * bitmap_read(map, start, nbits) Read an nbits-sized value from
+ * map at start
+ * bitmap_write(map, value, start, nbits) Write an nbits-sized value to
+ * map at start
*
* Note, bitmap_zero() and bitmap_fill() operate over the region of
* unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -636,6 +640,79 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
map[index] |= value << offset;
}
+/**
+ * bitmap_read - read a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
+ *
+ * Returns: value of @nbits bits located at the @start bit offset within the
+ * @map memory region. For @nbits = 0 and @nbits > BITS_PER_LONG the return
+ * value is undefined.
+ */
+static inline unsigned long bitmap_read(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+ size_t index = BIT_WORD(start);
+ unsigned long offset = start % BITS_PER_LONG;
+ unsigned long space = BITS_PER_LONG - offset;
+ unsigned long value_low, value_high;
+
+ if (unlikely(!nbits || nbits > BITS_PER_LONG))
+ return 0;
+
+ if (space >= nbits)
+ return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
+
+ value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+ value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+ return (value_low >> offset) | (value_high << space);
+}
+
+/**
+ * bitmap_write - write n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value to write, clamped to nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
+ *
+ * bitmap_write() behaves as-if implemented as @nbits calls of __assign_bit(),
+ * i.e. bits beyond @nbits are ignored:
+ *
+ * for (bit = 0; bit < nbits; bit++)
+ * __assign_bit(start + bit, bitmap, val & BIT(bit));
+ *
+ * For @nbits == 0 and @nbits > BITS_PER_LONG no writes are performed.
+ */
+static inline void bitmap_write(unsigned long *map, unsigned long value,
+ unsigned long start, unsigned long nbits)
+{
+ size_t index;
+ unsigned long offset;
+ unsigned long space;
+ unsigned long mask;
+ bool fit;
+
+ if (unlikely(!nbits || nbits > BITS_PER_LONG))
+ return;
+
+ mask = BITMAP_LAST_WORD_MASK(nbits);
+ value &= mask;
+ offset = start % BITS_PER_LONG;
+ space = BITS_PER_LONG - offset;
+ fit = space >= nbits;
+ index = BIT_WORD(start);
+
+ map[index] &= (fit ? (~(mask << offset)) : ~BITMAP_FIRST_WORD_MASK(start));
+ map[index] |= value << offset;
+ if (fit)
+ return;
+
+ map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
+ map[index + 1] |= (value >> space);
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __LINUX_BITMAP_H */
--
2.43.0
Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
a new bitop, test_bit_acquire(), with proper wrapping in order to try to
optimize it at compile-time, but missed the list of bitops used for
checking their prototypes a bit below.
The functions added have consistent prototypes, so that no more changes
are required and no functional changes take place.
Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Cc: [email protected] # 6.0+
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..f7f5a783da2a 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -80,6 +80,7 @@ __check_bitop_pr(__test_and_set_bit);
__check_bitop_pr(__test_and_clear_bit);
__check_bitop_pr(__test_and_change_bit);
__check_bitop_pr(test_bit);
+__check_bitop_pr(test_bit_acquire);
#undef __check_bitop_pr
--
2.43.0
bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is
NTFS-specific, as it aligns to the sizeof(u64), not the sizeof(long)
(although it uses ideologically right ALIGN() instead of division).
Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_' to not mix
it with the legacy module) and use generic BITS_TO_U64() while at it.
Suggested-by: Yury Norov <[email protected]> # BITS_TO_U64()
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
fs/ntfs3/ntfs_fs.h | 4 ++--
fs/ntfs3/bitmap.c | 4 ++--
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 ++++++-----
fs/ntfs3/super.c | 2 +-
5 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index f6706143d14b..16b84d605cd2 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -961,9 +961,9 @@ static inline bool run_is_empty(struct runs_tree *run)
}
/* NTFS uses quad aligned bitmaps. */
-static inline size_t bitmap_size(size_t bits)
+static inline size_t ntfs3_bitmap_size(size_t bits)
{
- return ALIGN((bits + 7) >> 3, 8);
+ return BITS_TO_U64(bits) * sizeof(u64);
}
#define _100ns2seconds 10000000
diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
index 63f14a0232f6..a19a73ed630b 100644
--- a/fs/ntfs3/bitmap.c
+++ b/fs/ntfs3/bitmap.c
@@ -654,7 +654,7 @@ int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
wnd->total_zeroes = nbits;
wnd->extent_max = MINUS_ONE_T;
wnd->zone_bit = wnd->zone_end = 0;
- wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+ wnd->nwnd = bytes_to_block(sb, ntfs3_bitmap_size(nbits));
wnd->bits_last = nbits & (wbits - 1);
if (!wnd->bits_last)
wnd->bits_last = wbits;
@@ -1347,7 +1347,7 @@ int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
return -EINVAL;
/* Align to 8 byte boundary. */
- new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+ new_wnd = bytes_to_block(sb, ntfs3_bitmap_size(new_bits));
new_last = new_bits & (wbits - 1);
if (!new_last)
new_last = wbits;
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index fbfe21dbb425..e18de9c4c2fa 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -522,7 +522,7 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
ni->mi.dirty = true;
/* Step 2: Resize $MFT::BITMAP. */
- new_bitmap_bytes = bitmap_size(new_mft_total);
+ new_bitmap_bytes = ntfs3_bitmap_size(new_mft_total);
err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index cf92b2433f7a..e0cef8f4e414 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1456,8 +1456,8 @@ static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);
- err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
- in->name_len, &bitmap, NULL, NULL);
+ err = ni_insert_resident(ni, ntfs3_bitmap_size(1), ATTR_BITMAP,
+ in->name, in->name_len, &bitmap, NULL, NULL);
if (err)
goto out2;
@@ -1518,8 +1518,9 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
if (bmp) {
/* Increase bitmap. */
err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
- &indx->bitmap_run, bitmap_size(bit + 1),
- NULL, true, NULL);
+ &indx->bitmap_run,
+ ntfs3_bitmap_size(bit + 1), NULL, true,
+ NULL);
if (err)
goto out1;
}
@@ -2092,7 +2093,7 @@ static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
if (in->name == I30_NAME)
ni->vfs_inode.i_size = new_data;
- bpb = bitmap_size(bit);
+ bpb = ntfs3_bitmap_size(bit);
if (bpb * 8 == nbits)
return 0;
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index 9153dffde950..0248db1e5c01 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1331,7 +1331,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
/* Check bitmap boundary. */
tt = sbi->used.bitmap.nbits;
- if (inode->i_size < bitmap_size(tt)) {
+ if (inode->i_size < ntfs3_bitmap_size(tt)) {
ntfs_err(sb, "$Bitmap is corrupted.");
err = -EINVAL;
goto put_inode_out;
--
2.43.0
bitmap_set_bits() does not start with the FS' prefix and may collide
with a new generic helper one day. It operates with the FS-specific
types, so there's no change those two could do the same thing.
Just add the prefix to exclude such possible conflict.
Reviewed-by: Przemek Kitszel <[email protected]>
Acked-by: David Sterba <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
fs/btrfs/free-space-cache.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index d372c7ce0e6b..47c979c3e67e 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1913,9 +1913,9 @@ static inline void bitmap_clear_bits(struct btrfs_free_space_ctl *ctl,
ctl->free_space -= bytes;
}
-static void bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
- struct btrfs_free_space *info, u64 offset,
- u64 bytes)
+static void btrfs_bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
+ struct btrfs_free_space *info, u64 offset,
+ u64 bytes)
{
unsigned long start, count, end;
int extent_delta = 1;
@@ -2251,7 +2251,7 @@ static u64 add_bytes_to_bitmap(struct btrfs_free_space_ctl *ctl,
bytes_to_set = min(end - offset, bytes);
- bitmap_set_bits(ctl, info, offset, bytes_to_set);
+ btrfs_bitmap_set_bits(ctl, info, offset, bytes_to_set);
return bytes_to_set;
--
2.43.0
Now that we have generic bitmap_read() and bitmap_write(), which are
inline and try to take care of non-bound-crossing and aligned cases
to keep them optimized, collapse bitmap_{get,set}_value8() into
simple wrappers around the former ones.
bloat-o-meter shows no difference in vmlinux and -2 bytes for
gpio-pca953x.ko, which says the optimization didn't suffer due to
that change. The converted helpers have the value width embedded
and always compile-time constant and that helps a lot.
Suggested-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitmap.h | 38 +++++---------------------------------
1 file changed, 5 insertions(+), 33 deletions(-)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 9a6a27a7f675..f80e116b8f60 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -609,39 +609,6 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask)
bitmap_from_arr64(dst, &mask, 64);
}
-/**
- * bitmap_get_value8 - get an 8-bit value within a memory region
- * @map: address to the bitmap memory region
- * @start: bit offset of the 8-bit value; must be a multiple of 8
- *
- * Returns the 8-bit value located at the @start bit offset within the @src
- * memory region.
- */
-static inline unsigned long bitmap_get_value8(const unsigned long *map,
- unsigned long start)
-{
- const size_t index = BIT_WORD(start);
- const unsigned long offset = start % BITS_PER_LONG;
-
- return (map[index] >> offset) & 0xFF;
-}
-
-/**
- * bitmap_set_value8 - set an 8-bit value within a memory region
- * @map: address to the bitmap memory region
- * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
- * @start: bit offset of the 8-bit value; must be a multiple of 8
- */
-static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
- unsigned long start)
-{
- const size_t index = BIT_WORD(start);
- const unsigned long offset = start % BITS_PER_LONG;
-
- map[index] &= ~(0xFFUL << offset);
- map[index] |= value << offset;
-}
-
/**
* bitmap_read - read a value of n-bits from the memory region
* @map: address to the bitmap memory region
@@ -715,6 +682,11 @@ static inline void bitmap_write(unsigned long *map, unsigned long value,
map[index + 1] |= (value >> space);
}
+#define bitmap_get_value8(map, start) \
+ bitmap_read(map, start, BITS_PER_BYTE)
+#define bitmap_set_value8(map, value, start) \
+ bitmap_write(map, value, start, BITS_PER_BYTE)
+
#endif /* __ASSEMBLY__ */
#endif /* __LINUX_BITMAP_H */
--
2.43.0
Commit dc34d5036692 ("lib: test_bitmap: add compile-time
optimization/evaluations assertions") initially missed __assign_bit(),
which led to that quite a time passed before I realized it doesn't get
optimized at compilation time. Now that it does, add test for that just
to make sure nothing will break one day.
To make things more interesting, use bitmap_complement() and
bitmap_full(), thus checking their compile-time evaluation as well. And
remove the misleading comment mentioning the workaround removed recently
in favor of adding the whole file to GCov exceptions.
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index a6e92cf5266a..4ee1f8ceb51d 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -1204,14 +1204,7 @@ static void __init test_bitmap_const_eval(void)
* in runtime.
*/
- /*
- * Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }`.
- * Clang on s390 optimizes bitops at compile-time as intended, but at
- * the same time stops treating @bitmap and @bitopvar as compile-time
- * constants after regular test_bit() is executed, thus triggering the
- * build bugs below. So, call const_test_bit() there directly until
- * the compiler is fixed.
- */
+ /* Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }` */
bitmap_clear(bitmap, 0, BITS_PER_LONG);
if (!test_bit(7, bitmap))
bitmap_set(bitmap, 5, 2);
@@ -1243,6 +1236,15 @@ static void __init test_bitmap_const_eval(void)
/* ~BIT(25) */
BUILD_BUG_ON(!__builtin_constant_p(~var));
BUILD_BUG_ON(~var != ~BIT(25));
+
+ /* ~BIT(25) | BIT(25) == ~0UL */
+ bitmap_complement(&var, &var, BITS_PER_LONG);
+ __assign_bit(25, &var, true);
+
+ /* !(~(~0UL)) == 1 */
+ res = bitmap_full(&var, BITS_PER_LONG);
+ BUILD_BUG_ON(!__builtin_constant_p(res));
+ BUILD_BUG_ON(!res);
}
/*
--
2.43.0
Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure
to store params inside the kernel, IPv4 tunnel code uses the same
ip_tunnel_parm which is being used to talk with the userspace.
This makes it difficult to alter or add any fields or use a
different format for whatever data.
Define struct ip_tunnel_parm_kern, a 1:1 copy of ip_tunnel_parm for
now, and use it throughout the code. Define the pieces, where the copy
user <-> kernel happens, as standalone functions, and copy the data
there field-by-field, so that the kernel-side structure could be easily
modified later on and the users wouldn't have to care about this.
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
.../ethernet/mellanox/mlxsw/spectrum_ipip.h | 2 +-
include/linux/netdevice.h | 7 ++-
include/net/ip_tunnels.h | 25 ++++++--
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 30 +++++----
.../ethernet/mellanox/mlxsw/spectrum_span.c | 4 +-
net/ipv4/ip_gre.c | 17 ++---
net/ipv4/ip_tunnel.c | 63 +++++++++++++++----
net/ipv4/ip_tunnel_core.c | 2 +-
net/ipv4/ip_vti.c | 12 ++--
net/ipv4/ipip.c | 12 ++--
net/ipv4/ipmr.c | 2 +-
net/ipv6/addrconf.c | 3 +-
net/ipv6/sit.c | 33 +++++-----
13 files changed, 137 insertions(+), 75 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
index a35f009da561..a66173779641 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
@@ -9,7 +9,7 @@
#include <linux/if_tunnel.h>
#include <net/ip6_tunnel.h>
-struct ip_tunnel_parm
+struct ip_tunnel_parm_kern
mlxsw_sp_ipip_netdev_parms4(const struct net_device *ol_dev);
struct __ip6_tnl_parm
mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev);
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 118c40258d07..d88b4760c518 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -59,7 +59,7 @@ struct ethtool_ops;
struct kernel_hwtstamp_config;
struct phy_device;
struct dsa_port;
-struct ip_tunnel_parm;
+struct ip_tunnel_parm_kern;
struct macsec_context;
struct macsec_ops;
struct netdev_name_node;
@@ -1411,7 +1411,7 @@ struct netdev_net_notifier {
* queue id bound to an AF_XDP socket. The flags field specifies if
* only RX, only Tx, or both should be woken up using the flags
* XDP_WAKEUP_RX and XDP_WAKEUP_TX.
- * int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm *p,
+ * int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm_kern *p,
* int cmd);
* Add, change, delete or get information on an IPv4 tunnel.
* struct net_device *(*ndo_get_peer_dev)(struct net_device *dev);
@@ -1667,7 +1667,8 @@ struct net_device_ops {
int (*ndo_xsk_wakeup)(struct net_device *dev,
u32 queue_id, u32 flags);
int (*ndo_tunnel_ctl)(struct net_device *dev,
- struct ip_tunnel_parm *p, int cmd);
+ struct ip_tunnel_parm_kern *p,
+ int cmd);
struct net_device * (*ndo_get_peer_dev)(struct net_device *dev);
int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx,
struct net_device_path *path);
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 2d746f4c9a0a..847ffebaa09b 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -110,6 +110,17 @@ struct ip_tunnel_prl_entry {
struct metadata_dst;
+/* Kernel-side copy of ip_tunnel_parm */
+struct ip_tunnel_parm_kern {
+ char name[IFNAMSIZ];
+ int link;
+ __be16 i_flags;
+ __be16 o_flags;
+ __be32 i_key;
+ __be32 o_key;
+ struct iphdr iph;
+};
+
struct ip_tunnel {
struct ip_tunnel __rcu *next;
struct hlist_node hash_node;
@@ -136,7 +147,7 @@ struct ip_tunnel {
struct dst_cache dst_cache;
- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;
int mlink;
int encap_hlen; /* Encap header length (FOU,GUE) */
@@ -290,7 +301,11 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
const struct iphdr *tnl_params, const u8 protocol);
void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
const u8 proto, int tunnel_hlen);
-int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd);
+int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd);
+bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,
+ const void __user *data);
+bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp);
int ip_tunnel_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
void __user *data, int cmd);
int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict);
@@ -306,16 +321,16 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
const struct tnl_ptk_info *tpi, struct metadata_dst *tun_dst,
bool log_ecn_error);
int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark);
+ struct ip_tunnel_parm_kern *p, __u32 fwmark);
int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark);
+ struct ip_tunnel_parm_kern *p, __u32 fwmark);
void ip_tunnel_setup(struct net_device *dev, unsigned int net_id);
bool ip_tunnel_netlink_encap_parms(struct nlattr *data[],
struct ip_tunnel_encap *encap);
void ip_tunnel_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms);
+ struct ip_tunnel_parm_kern *parms);
extern const struct header_ops ip_tunnel_header_ops;
__be16 ip_tunnel_parse_protocol(const struct sk_buff *skb);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
index 3340b4a694c3..d67df358a52f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
@@ -8,7 +8,7 @@
#include "spectrum_ipip.h"
#include "reg.h"
-struct ip_tunnel_parm
+struct ip_tunnel_parm_kern
mlxsw_sp_ipip_netdev_parms4(const struct net_device *ol_dev)
{
struct ip_tunnel *tun = netdev_priv(ol_dev);
@@ -24,7 +24,8 @@ mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev)
return tun->parms;
}
-static bool mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm *parms)
+static bool
+mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm_kern *parms)
{
return !!(parms->i_flags & TUNNEL_KEY);
}
@@ -34,7 +35,8 @@ static bool mlxsw_sp_ipip_parms6_has_ikey(const struct __ip6_tnl_parm *parms)
return !!(parms->i_flags & TUNNEL_KEY);
}
-static bool mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm *parms)
+static bool
+mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm_kern *parms)
{
return !!(parms->o_flags & TUNNEL_KEY);
}
@@ -44,7 +46,7 @@ static bool mlxsw_sp_ipip_parms6_has_okey(const struct __ip6_tnl_parm *parms)
return !!(parms->o_flags & TUNNEL_KEY);
}
-static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm *parms)
+static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm_kern *parms)
{
return mlxsw_sp_ipip_parms4_has_ikey(parms) ?
be32_to_cpu(parms->i_key) : 0;
@@ -56,7 +58,7 @@ static u32 mlxsw_sp_ipip_parms6_ikey(const struct __ip6_tnl_parm *parms)
be32_to_cpu(parms->i_key) : 0;
}
-static u32 mlxsw_sp_ipip_parms4_okey(const struct ip_tunnel_parm *parms)
+static u32 mlxsw_sp_ipip_parms4_okey(const struct ip_tunnel_parm_kern *parms)
{
return mlxsw_sp_ipip_parms4_has_okey(parms) ?
be32_to_cpu(parms->o_key) : 0;
@@ -69,7 +71,7 @@ static u32 mlxsw_sp_ipip_parms6_okey(const struct __ip6_tnl_parm *parms)
}
static union mlxsw_sp_l3addr
-mlxsw_sp_ipip_parms4_saddr(const struct ip_tunnel_parm *parms)
+mlxsw_sp_ipip_parms4_saddr(const struct ip_tunnel_parm_kern *parms)
{
return (union mlxsw_sp_l3addr) { .addr4 = parms->iph.saddr };
}
@@ -81,7 +83,7 @@ mlxsw_sp_ipip_parms6_saddr(const struct __ip6_tnl_parm *parms)
}
static union mlxsw_sp_l3addr
-mlxsw_sp_ipip_parms4_daddr(const struct ip_tunnel_parm *parms)
+mlxsw_sp_ipip_parms4_daddr(const struct ip_tunnel_parm_kern *parms)
{
return (union mlxsw_sp_l3addr) { .addr4 = parms->iph.daddr };
}
@@ -96,7 +98,7 @@ union mlxsw_sp_l3addr
mlxsw_sp_ipip_netdev_saddr(enum mlxsw_sp_l3proto proto,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms4;
+ struct ip_tunnel_parm_kern parms4;
struct __ip6_tnl_parm parms6;
switch (proto) {
@@ -115,7 +117,9 @@ mlxsw_sp_ipip_netdev_saddr(enum mlxsw_sp_l3proto proto,
static __be32 mlxsw_sp_ipip_netdev_daddr4(const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms4 = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms4;
+
+ parms4 = mlxsw_sp_ipip_netdev_parms4(ol_dev);
return mlxsw_sp_ipip_parms4_daddr(&parms4).addr4;
}
@@ -124,7 +128,7 @@ static union mlxsw_sp_l3addr
mlxsw_sp_ipip_netdev_daddr(enum mlxsw_sp_l3proto proto,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms4;
+ struct ip_tunnel_parm_kern parms4;
struct __ip6_tnl_parm parms6;
switch (proto) {
@@ -150,7 +154,7 @@ bool mlxsw_sp_l3addr_is_zero(union mlxsw_sp_l3addr addr)
static struct mlxsw_sp_ipip_parms
mlxsw_sp_ipip_netdev_parms_init_gre4(const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
return (struct mlxsw_sp_ipip_parms) {
.proto = MLXSW_SP_L3_PROTO_IPV4,
@@ -187,8 +191,8 @@ mlxsw_sp_ipip_decap_config_gre4(struct mlxsw_sp *mlxsw_sp,
{
u16 rif_index = mlxsw_sp_ipip_lb_rif_index(ipip_entry->ol_lb);
u16 ul_rif_id = mlxsw_sp_ipip_lb_ul_rif_id(ipip_entry->ol_lb);
+ struct ip_tunnel_parm_kern parms;
char rtdp_pl[MLXSW_REG_RTDP_LEN];
- struct ip_tunnel_parm parms;
unsigned int type_check;
bool has_ikey;
u32 daddr4;
@@ -252,7 +256,7 @@ static struct mlxsw_sp_rif_ipip_lb_config
mlxsw_sp_ipip_ol_loopback_config_gre4(struct mlxsw_sp *mlxsw_sp,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
enum mlxsw_reg_ritr_loopback_ipip_type lb_ipipt;
lb_ipipt = mlxsw_sp_ipip_parms4_has_okey(&parms) ?
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index af50ff9e5f26..f0cbc6b9b041 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -413,8 +413,8 @@ mlxsw_sp_span_gretap4_route(const struct net_device *to_dev,
__be32 *saddrp, __be32 *daddrp)
{
struct ip_tunnel *tun = netdev_priv(to_dev);
+ struct ip_tunnel_parm_kern parms;
struct net_device *dev = NULL;
- struct ip_tunnel_parm parms;
struct rtable *rt = NULL;
struct flowi4 fl4;
@@ -451,7 +451,7 @@ mlxsw_sp_span_entry_gretap4_parms(struct mlxsw_sp *mlxsw_sp,
const struct net_device *to_dev,
struct mlxsw_sp_span_parms *sparmsp)
{
- struct ip_tunnel_parm tparm = mlxsw_sp_ipip_netdev_parms4(to_dev);
+ struct ip_tunnel_parm_kern tparm = mlxsw_sp_ipip_netdev_parms4(to_dev);
union mlxsw_sp_l3addr saddr = { .addr4 = tparm.iph.saddr };
union mlxsw_sp_l3addr daddr = { .addr4 = tparm.iph.daddr };
bool inherit_tos = tparm.iph.tos & 0x1;
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 5169c3c72cff..04862874dd43 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -785,7 +785,8 @@ static void ipgre_link_update(struct net_device *dev, bool set_mtu)
}
}
-static int ipgre_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p,
+static int ipgre_tunnel_ctl(struct net_device *dev,
+ struct ip_tunnel_parm_kern *p,
int cmd)
{
int err;
@@ -1129,7 +1130,7 @@ static int erspan_validate(struct nlattr *tb[], struct nlattr *data[],
static int ipgre_netlink_parms(struct net_device *dev,
struct nlattr *data[],
struct nlattr *tb[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1196,7 +1197,7 @@ static int ipgre_netlink_parms(struct net_device *dev,
static int erspan_netlink_parms(struct net_device *dev,
struct nlattr *data[],
struct nlattr *tb[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1355,7 +1356,7 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;
int err;
@@ -1373,7 +1374,7 @@ static int erspan_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;
int err;
@@ -1392,8 +1393,8 @@ static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;
int err;
err = ipgre_newlink_encap_setup(dev, data);
@@ -1421,8 +1422,8 @@ static int erspan_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;
int err;
err = ipgre_newlink_encap_setup(dev, data);
@@ -1494,7 +1495,7 @@ static size_t ipgre_get_size(const struct net_device *dev)
static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm *p = &t->parms;
+ struct ip_tunnel_parm_kern *p = &t->parms;
__be16 o_flags = p->o_flags;
if (nla_put_u32(skb, IFLA_GRE_LINK, p->link) ||
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index beeae624c412..658f46208c8d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -56,7 +56,7 @@ static unsigned int ip_tunnel_hash(__be32 key, __be32 remote)
IP_TNL_HASH_BITS);
}
-static bool ip_tunnel_key_match(const struct ip_tunnel_parm *p,
+static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
__be16 flags, __be32 key)
{
if (p->i_flags & TUNNEL_KEY) {
@@ -172,7 +172,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
EXPORT_SYMBOL_GPL(ip_tunnel_lookup);
static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
unsigned int h;
__be32 remote;
@@ -207,7 +207,7 @@ static void ip_tunnel_del(struct ip_tunnel_net *itn, struct ip_tunnel *t)
}
static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
int type)
{
__be32 remote = parms->iph.daddr;
@@ -231,7 +231,7 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
static struct net_device *__ip_tunnel_create(struct net *net,
const struct rtnl_link_ops *ops,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
int err;
struct ip_tunnel *tunnel;
@@ -327,7 +327,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)
static struct ip_tunnel *ip_tunnel_create(struct net *net,
struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
struct ip_tunnel *nt;
struct net_device *dev;
@@ -845,7 +845,7 @@ EXPORT_SYMBOL_GPL(ip_tunnel_xmit);
static void ip_tunnel_update(struct ip_tunnel_net *itn,
struct ip_tunnel *t,
struct net_device *dev,
- struct ip_tunnel_parm *p,
+ struct ip_tunnel_parm_kern *p,
bool set_mtu,
__u32 fwmark)
{
@@ -877,7 +877,8 @@ static void ip_tunnel_update(struct ip_tunnel_net *itn,
netdev_state_change(dev);
}
-int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd)
{
int err = 0;
struct ip_tunnel *t = netdev_priv(dev);
@@ -979,16 +980,52 @@ int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
}
EXPORT_SYMBOL_GPL(ip_tunnel_ctl);
+bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,
+ const void __user *data)
+{
+ struct ip_tunnel_parm p;
+
+ if (copy_from_user(&p, data, sizeof(p)))
+ return false;
+
+ strscpy(kp->name, p.name, sizeof(kp->name));
+ kp->link = p.link;
+ kp->i_flags = p.i_flags;
+ kp->o_flags = p.o_flags;
+ kp->i_key = p.i_key;
+ kp->o_key = p.o_key;
+ memcpy(&kp->iph, &p.iph, min(sizeof(kp->iph), sizeof(p.iph)));
+
+ return true;
+}
+EXPORT_SYMBOL_GPL(ip_tunnel_parm_from_user);
+
+bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp)
+{
+ struct ip_tunnel_parm p;
+
+ strscpy(p.name, kp->name, sizeof(p.name));
+ p.link = kp->link;
+ p.i_flags = kp->i_flags;
+ p.o_flags = kp->o_flags;
+ p.i_key = kp->i_key;
+ p.o_key = kp->o_key;
+ memcpy(&p.iph, &kp->iph, min(sizeof(p.iph), sizeof(kp->iph)));
+
+ return !copy_to_user(data, &p, sizeof(p));
+}
+EXPORT_SYMBOL_GPL(ip_tunnel_parm_to_user);
+
int ip_tunnel_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
void __user *data, int cmd)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
int err;
- if (copy_from_user(&p, data, sizeof(p)))
+ if (!ip_tunnel_parm_from_user(&p, data))
return -EFAULT;
err = dev->netdev_ops->ndo_tunnel_ctl(dev, &p, cmd);
- if (!err && copy_to_user(data, &p, sizeof(p)))
+ if (!err && !ip_tunnel_parm_to_user(data, &p))
return -EFAULT;
return err;
}
@@ -1067,7 +1104,7 @@ int ip_tunnel_init_net(struct net *net, unsigned int ip_tnl_net_id,
struct rtnl_link_ops *ops, char *devname)
{
struct ip_tunnel_net *itn = net_generic(net, ip_tnl_net_id);
- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;
unsigned int i;
itn->rtnl_link_ops = ops;
@@ -1147,7 +1184,7 @@ void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id,
EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets);
int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark)
+ struct ip_tunnel_parm_kern *p, __u32 fwmark)
{
struct ip_tunnel *nt;
struct net *net = dev_net(dev);
@@ -1201,7 +1238,7 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
EXPORT_SYMBOL_GPL(ip_tunnel_newlink);
int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark)
+ struct ip_tunnel_parm_kern *p, __u32 fwmark)
{
struct ip_tunnel *t;
struct ip_tunnel *tunnel = netdev_priv(dev);
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 586b1b3e35b8..f89e3bdef123 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -1116,7 +1116,7 @@ bool ip_tunnel_netlink_encap_parms(struct nlattr *data[],
EXPORT_SYMBOL_GPL(ip_tunnel_netlink_encap_parms);
void ip_tunnel_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
if (data[IFLA_IPTUN_LINK])
parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 9ab9b3ebe0cd..569a54d4bb89 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -167,7 +167,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
struct flowi *fl)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parms = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parms = &tunnel->parms;
struct dst_entry *dst = skb_dst(skb);
struct net_device *tdev; /* Device to other host */
int pkt_len = skb->len;
@@ -373,7 +373,7 @@ static int vti4_err(struct sk_buff *skb, u32 info)
}
static int
-vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
int err = 0;
@@ -529,7 +529,7 @@ static int vti_tunnel_validate(struct nlattr *tb[], struct nlattr *data[],
}
static void vti_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));
@@ -564,7 +564,7 @@ static int vti_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;
__u32 fwmark = 0;
vti_netlink_parms(data, &parms, &fwmark);
@@ -576,8 +576,8 @@ static int vti_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;
vti_netlink_parms(data, &p, &fwmark);
return ip_tunnel_changelink(dev, tb, &p, fwmark);
@@ -604,7 +604,7 @@ static size_t vti_get_size(const struct net_device *dev)
static int vti_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm *p = &t->parms;
+ struct ip_tunnel_parm_kern *p = &t->parms;
if (nla_put_u32(skb, IFLA_VTI_LINK, p->link) ||
nla_put_be32(skb, IFLA_VTI_IKEY, p->i_key) ||
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 27b8f83c6ea2..0dd2d3b55c75 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -330,7 +330,7 @@ static bool ipip_tunnel_ioctl_verify_protocol(u8 ipproto)
}
static int
-ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
if (p->iph.version != 4 ||
@@ -405,8 +405,8 @@ static int ipip_tunnel_validate(struct nlattr *tb[], struct nlattr *data[],
}
static void ipip_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms, bool *collect_md,
- __u32 *fwmark)
+ struct ip_tunnel_parm_kern *parms,
+ bool *collect_md, __u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));
@@ -432,8 +432,8 @@ static int ipip_newlink(struct net *src_net, struct net_device *dev,
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;
if (ip_tunnel_netlink_encap_parms(data, &ipencap)) {
@@ -452,8 +452,8 @@ static int ipip_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
bool collect_md;
__u32 fwmark = t->fwmark;
@@ -510,7 +510,7 @@ static size_t ipip_get_size(const struct net_device *dev)
static int ipip_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parm = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parm = &tunnel->parms;
if (nla_put_u32(skb, IFLA_IPTUN_LINK, parm->link) ||
nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, parm->iph.saddr) ||
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 9d6f59531b3a..9b8253c00159 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -441,7 +441,7 @@ static bool ipmr_init_vif_indev(const struct net_device *dev)
static struct net_device *ipmr_new_tunnel(struct net *net, struct vifctl *v)
{
struct net_device *tunnel_dev, *new_dev;
- struct ip_tunnel_parm p = { };
+ struct ip_tunnel_parm_kern p = { };
int err;
tunnel_dev = __dev_get_by_name(net, "tunl0");
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 733ace18806c..5334fdc388ac 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -63,6 +63,7 @@
#include <linux/string.h>
#include <linux/hash.h>
+#include <net/ip_tunnels.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/snmp.h>
@@ -2861,7 +2862,7 @@ void addrconf_prefix_rcv(struct net_device *dev, u8 *opt, int len, bool sllao)
static int addrconf_set_sit_dstaddr(struct net *net, struct net_device *dev,
struct in6_ifreq *ireq)
{
- struct ip_tunnel_parm p = { };
+ struct ip_tunnel_parm_kern p = { };
int err;
if (!(ipv6_addr_type(&ireq->ifr6_addr) & IPV6_ADDR_COMPATv4))
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index cc24cefdb85c..19af34e4c13c 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -132,8 +132,8 @@ static struct ip_tunnel *ipip6_tunnel_lookup(struct net *net,
return NULL;
}
-static struct ip_tunnel __rcu **__ipip6_bucket(struct sit_net *sitn,
- struct ip_tunnel_parm *parms)
+static struct ip_tunnel __rcu **
+__ipip6_bucket(struct sit_net *sitn, struct ip_tunnel_parm_kern *parms)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
@@ -226,7 +226,8 @@ static int ipip6_tunnel_create(struct net_device *dev)
}
static struct ip_tunnel *ipip6_tunnel_locate(struct net *net,
- struct ip_tunnel_parm *parms, int create)
+ struct ip_tunnel_parm_kern *parms,
+ int create)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
@@ -1135,7 +1136,8 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
dev->needed_headroom = t_hlen + hlen;
}
-static void ipip6_tunnel_update(struct ip_tunnel *t, struct ip_tunnel_parm *p,
+static void ipip6_tunnel_update(struct ip_tunnel *t,
+ struct ip_tunnel_parm_kern *p,
__u32 fwmark)
{
struct net *net = t->net;
@@ -1196,11 +1198,11 @@ static int
ipip6_tunnel_get6rd(struct net_device *dev, struct ip_tunnel_parm __user *data)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
struct ip_tunnel_6rd ip6rd;
- struct ip_tunnel_parm p;
if (dev == dev_to_sit_net(dev)->fb_tunnel_dev) {
- if (copy_from_user(&p, data, sizeof(p)))
+ if (!ip_tunnel_parm_from_user(&p, data))
return -EFAULT;
t = ipip6_tunnel_locate(t->net, &p, 0);
}
@@ -1251,7 +1253,7 @@ static bool ipip6_valid_ip_proto(u8 ipproto)
}
static int
-__ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm *p)
+__ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm_kern *p)
{
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
return -EPERM;
@@ -1268,7 +1270,7 @@ __ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm *p)
}
static int
-ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1281,7 +1283,7 @@ ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm *p)
}
static int
-ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
int err;
@@ -1297,7 +1299,7 @@ ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm *p)
}
static int
-ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
int err;
@@ -1328,7 +1330,7 @@ ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm *p)
}
static int
-ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1348,7 +1350,8 @@ ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm *p)
}
static int
-ipip6_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+ipip6_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd)
{
switch (cmd) {
case SIOCGETTUNNEL:
@@ -1494,7 +1497,7 @@ static int ipip6_validate(struct nlattr *tb[], struct nlattr *data[],
}
static void ipip6_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));
@@ -1603,8 +1606,8 @@ static int ipip6_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
struct net *net = t->net;
struct sit_net *sitn = net_generic(net, sit_net_id);
#ifdef CONFIG_IPV6_SIT_6RD
@@ -1691,7 +1694,7 @@ static size_t ipip6_get_size(const struct net_device *dev)
static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parm = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parm = &tunnel->parms;
if (nla_put_u32(skb, IFLA_IPTUN_LINK, parm->link) ||
nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, parm->iph.saddr) ||
--
2.43.0
Now that there are helpers for converting IP tunnel flags between the
old __be16 format and the bitmap format, make sure they work as expected
by adding a couple of tests to the bitmap testing suite. The helpers are
all inline, so no dependencies on the related CONFIG_* (or a standalone
module) are needed.
Cover three possible cases:
1. No bits past BIT(15) are set, VTI/SIT bits are not set. This
conversion is almost a direct assignment.
2. No bits past BIT(15) are set, but VTI/SIT bit is set. During the
conversion, it must be transformed into BIT(16) in the bitmap,
but still compatible with the __be16 format.
3. The bitmap has bits past BIT(15) set (not the VTI/SIT one). The
result will be truncated.
Note that currently __IP_TUNNEL_FLAG_NUM is 17 (incl. special),
which means that the result of this case is currently
semi-false-positive. When BIT(17) is finally here, it will be
adjusted accordingly.
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 105 insertions(+)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4ee1f8ceb51d..270afc0cba5c 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -14,6 +14,8 @@
#include <linux/string.h>
#include <linux/uaccess.h>
+#include <net/ip_tunnels.h>
+
#include "../tools/testing/selftests/kselftest_module.h"
#define EXP1_IN_BITS (sizeof(exp1) * 8)
@@ -1409,6 +1411,108 @@ static void __init test_bitmap_write_perf(void)
#undef TEST_BIT_LEN
+struct ip_tunnel_flags_test {
+ const u16 *src_bits;
+ const u16 *exp_bits;
+ u8 src_num;
+ u8 exp_num;
+ __be16 exp_val;
+ bool exp_comp:1;
+};
+
+#define IP_TUNNEL_FLAGS_TEST(src, comp, eval, exp) { \
+ .src_bits = (src), \
+ .src_num = ARRAY_SIZE(src), \
+ .exp_comp = (comp), \
+ .exp_val = (eval), \
+ .exp_bits = (exp), \
+ .exp_num = ARRAY_SIZE(exp), \
+}
+
+/* These are __be16-compatible and can be compared as is */
+static const u16 ip_tunnel_flags_1[] __initconst = {
+ IP_TUNNEL_KEY_BIT,
+ IP_TUNNEL_STRICT_BIT,
+ IP_TUNNEL_ERSPAN_OPT_BIT,
+};
+
+/*
+ * Due to the previous flags design limitation, setting either
+ * ``IP_TUNNEL_CSUM_BIT`` (on Big Endian) or ``IP_TUNNEL_DONT_FRAGMENT_BIT``
+ * (on Little) also sets VTI/ISATAP bit. In the bitmap implementation, they
+ * correspond to ``BIT(16)``, which is bigger than ``U16_MAX``, but still is
+ * backward-compatible.
+ */
+#ifdef __BIG_ENDIAN
+#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_CSUM_BIT
+#else
+#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_DONT_FRAGMENT_BIT
+#endif
+
+static const u16 ip_tunnel_flags_2_src[] __initconst = {
+ IP_TUNNEL_CONFLICT_BIT,
+};
+
+static const u16 ip_tunnel_flags_2_exp[] __initconst = {
+ IP_TUNNEL_CONFLICT_BIT,
+ IP_TUNNEL_SIT_ISATAP_BIT,
+};
+
+/* Bits 17 and higher are not compatible with __be16 flags */
+static const u16 ip_tunnel_flags_3_src[] __initconst = {
+ IP_TUNNEL_VXLAN_OPT_BIT,
+ 17,
+ 18,
+ 20,
+};
+
+static const u16 ip_tunnel_flags_3_exp[] __initconst = {
+ IP_TUNNEL_VXLAN_OPT_BIT,
+};
+
+static const struct ip_tunnel_flags_test ip_tunnel_flags_test[] __initconst = {
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_1, true,
+ cpu_to_be16(BIT(IP_TUNNEL_KEY_BIT) |
+ BIT(IP_TUNNEL_STRICT_BIT) |
+ BIT(IP_TUNNEL_ERSPAN_OPT_BIT)),
+ ip_tunnel_flags_1),
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_2_src, true, VTI_ISVTI,
+ ip_tunnel_flags_2_exp),
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_3_src,
+ /*
+ * This must be set to ``false`` once
+ * ``__IP_TUNNEL_FLAG_NUM`` goes above 17.
+ */
+ true,
+ cpu_to_be16(BIT(IP_TUNNEL_VXLAN_OPT_BIT)),
+ ip_tunnel_flags_3_exp),
+};
+
+static void __init test_ip_tunnel_flags(void)
+{
+ for (u32 i = 0; i < ARRAY_SIZE(ip_tunnel_flags_test); i++) {
+ typeof(*ip_tunnel_flags_test) *test = &ip_tunnel_flags_test[i];
+ IP_TUNNEL_DECLARE_FLAGS(src) = { };
+ IP_TUNNEL_DECLARE_FLAGS(exp) = { };
+ IP_TUNNEL_DECLARE_FLAGS(out);
+
+ for (u32 j = 0; j < test->src_num; j++)
+ __set_bit(test->src_bits[j], src);
+
+ for (u32 j = 0; j < test->exp_num; j++)
+ __set_bit(test->exp_bits[j], exp);
+
+ ip_tunnel_flags_from_be16(out, test->exp_val);
+
+ expect_eq_uint(test->exp_comp,
+ ip_tunnel_flags_is_be16_compat(src));
+ expect_eq_uint((__force u16)test->exp_val,
+ (__force u16)ip_tunnel_flags_to_be16(src));
+
+ __ipt_flag_op(expect_eq_bitmap, exp, out);
+ }
+}
+
static void __init selftest(void)
{
test_zero_clear();
@@ -1428,6 +1532,7 @@ static void __init selftest(void)
test_bitmap_read_write();
test_bitmap_read_perf();
test_bitmap_write_perf();
+ test_ip_tunnel_flags();
test_find_nth_bit();
test_for_each_set_bit();
--
2.43.0
From: Wojciech Drewek <[email protected]>
Packet Forwarding Control Protocol (PFCP) is a 3GPP Protocol
used between the control plane and the user plane function.
It is specified in TS 29.244[1].
Note that this module is not designed to support this Protocol
in the kernel space. There is no support for parsing any PFCP messages.
There is no API that could be used by any userspace daemon.
Basically it does not support PFCP. This protocol is sophisticated
and there is no need for implementing it in the kernel. The purpose
of this module is to allow users to setup software and hardware offload
of PFCP packets using tc tool.
When user requests to create a PFCP device, a new socket is created.
The socket is set up with port number 8805 which is specific for
PFCP [29.244 4.2.2]. This allow to receive PFCP request messages,
response messages use other ports.
Note that only one PFCP netdev can be created.
Only IPv4 is supported at this time.
[1] https://portal.3gpp.org/desktopmodules/Specifications/SpecificationDetails.aspx?specificationId=3111
Signed-off-by: Wojciech Drewek <[email protected]>
Signed-off-by: Marcin Szycik <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/Kconfig | 13 +++
drivers/net/Makefile | 1 +
include/net/pfcp.h | 17 ++++
drivers/net/pfcp.c | 223 +++++++++++++++++++++++++++++++++++++++++++
4 files changed, 254 insertions(+)
create mode 100644 include/net/pfcp.h
create mode 100644 drivers/net/pfcp.c
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 8ca0bc223b30..172d84e39129 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -290,6 +290,19 @@ config GTP
To compile this drivers as a module, choose M here: the module
will be called gtp.
+config PFCP
+ tristate "Packet Forwarding Control Protocol (PFCP)"
+ depends on INET
+ select NET_UDP_TUNNEL
+ help
+ This allows one to create PFCP virtual interfaces that allows to
+ set up software and hardware offload of PFCP packets.
+ Note that this module does not support PFCP protocol in the kernel space.
+ There is no support for parsing any PFCP messages.
+
+ To compile this drivers as a module, choose M here: the module
+ will be called pfcp.
+
config AMT
tristate "Automatic Multicast Tunneling (AMT)"
depends on INET && IP_MULTICAST
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 7cab36f94782..9c053673d6b2 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -38,6 +38,7 @@ obj-$(CONFIG_GENEVE) += geneve.o
obj-$(CONFIG_BAREUDP) += bareudp.o
obj-$(CONFIG_GTP) += gtp.o
obj-$(CONFIG_NLMON) += nlmon.o
+obj-$(CONFIG_PFCP) += pfcp.o
obj-$(CONFIG_NET_VRF) += vrf.o
obj-$(CONFIG_VSOCKMON) += vsockmon.o
obj-$(CONFIG_MHI_NET) += mhi_net.o
diff --git a/include/net/pfcp.h b/include/net/pfcp.h
new file mode 100644
index 000000000000..3f9ebf27a8ff
--- /dev/null
+++ b/include/net/pfcp.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PFCP_H_
+#define _PFCP_H_
+
+#include <linux/netdevice.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+#define PFCP_PORT 8805
+
+static inline bool netif_is_pfcp(const struct net_device *dev)
+{
+ return dev->rtnl_link_ops &&
+ !strcmp(dev->rtnl_link_ops->kind, "pfcp");
+}
+
+#endif
diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c
new file mode 100644
index 000000000000..3f1ee0ae7111
--- /dev/null
+++ b/drivers/net/pfcp.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * PFCP according to 3GPP TS 29.244
+ *
+ * Copyright (C) 2022, Intel Corporation.
+ */
+
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/rculist.h>
+#include <linux/skbuff.h>
+#include <linux/types.h>
+
+#include <net/udp.h>
+#include <net/udp_tunnel.h>
+#include <net/pfcp.h>
+
+struct pfcp_dev {
+ struct list_head list;
+
+ struct socket *sock;
+ struct net_device *dev;
+ struct net *net;
+};
+
+static unsigned int pfcp_net_id __read_mostly;
+
+struct pfcp_net {
+ struct list_head pfcp_dev_list;
+};
+
+static void pfcp_del_sock(struct pfcp_dev *pfcp)
+{
+ udp_tunnel_sock_release(pfcp->sock);
+ pfcp->sock = NULL;
+}
+
+static void pfcp_dev_uninit(struct net_device *dev)
+{
+ struct pfcp_dev *pfcp = netdev_priv(dev);
+
+ pfcp_del_sock(pfcp);
+}
+
+static int pfcp_dev_init(struct net_device *dev)
+{
+ struct pfcp_dev *pfcp = netdev_priv(dev);
+
+ pfcp->dev = dev;
+
+ return 0;
+}
+
+static const struct net_device_ops pfcp_netdev_ops = {
+ .ndo_init = pfcp_dev_init,
+ .ndo_uninit = pfcp_dev_uninit,
+ .ndo_get_stats64 = dev_get_tstats64,
+};
+
+static const struct device_type pfcp_type = {
+ .name = "pfcp",
+};
+
+static void pfcp_link_setup(struct net_device *dev)
+{
+ dev->netdev_ops = &pfcp_netdev_ops;
+ dev->needs_free_netdev = true;
+ SET_NETDEV_DEVTYPE(dev, &pfcp_type);
+
+ dev->hard_header_len = 0;
+ dev->addr_len = 0;
+
+ dev->type = ARPHRD_NONE;
+ dev->flags = IFF_POINTOPOINT | IFF_NOARP | IFF_MULTICAST;
+ dev->priv_flags |= IFF_NO_QUEUE;
+
+ netif_keep_dst(dev);
+}
+
+static struct socket *pfcp_create_sock(struct pfcp_dev *pfcp)
+{
+ struct udp_tunnel_sock_cfg tuncfg = {};
+ struct udp_port_cfg udp_conf = {
+ .local_ip.s_addr = htonl(INADDR_ANY),
+ .family = AF_INET,
+ };
+ struct net *net = pfcp->net;
+ struct socket *sock;
+ int err;
+
+ udp_conf.local_udp_port = htons(PFCP_PORT);
+
+ err = udp_sock_create(net, &udp_conf, &sock);
+ if (err)
+ return ERR_PTR(err);
+
+ setup_udp_tunnel_sock(net, sock, &tuncfg);
+
+ return sock;
+}
+
+static int pfcp_add_sock(struct pfcp_dev *pfcp)
+{
+ pfcp->sock = pfcp_create_sock(pfcp);
+
+ return PTR_ERR_OR_ZERO(pfcp->sock);
+}
+
+static int pfcp_newlink(struct net *net, struct net_device *dev,
+ struct nlattr *tb[], struct nlattr *data[],
+ struct netlink_ext_ack *extack)
+{
+ struct pfcp_dev *pfcp = netdev_priv(dev);
+ struct pfcp_net *pn;
+ int err;
+
+ pfcp->net = net;
+
+ err = pfcp_add_sock(pfcp);
+ if (err) {
+ netdev_dbg(dev, "failed to add pfcp socket %d\n", err);
+ goto exit_err;
+ }
+
+ err = register_netdevice(dev);
+ if (err) {
+ netdev_dbg(dev, "failed to register pfcp netdev %d\n", err);
+ goto exit_del_pfcp_sock;
+ }
+
+ pn = net_generic(dev_net(dev), pfcp_net_id);
+ list_add_rcu(&pfcp->list, &pn->pfcp_dev_list);
+
+ netdev_dbg(dev, "registered new PFCP interface\n");
+
+ return 0;
+
+exit_del_pfcp_sock:
+ pfcp_del_sock(pfcp);
+exit_err:
+ pfcp->net = NULL;
+ return err;
+}
+
+static void pfcp_dellink(struct net_device *dev, struct list_head *head)
+{
+ struct pfcp_dev *pfcp = netdev_priv(dev);
+
+ list_del_rcu(&pfcp->list);
+ unregister_netdevice_queue(dev, head);
+}
+
+static struct rtnl_link_ops pfcp_link_ops __read_mostly = {
+ .kind = "pfcp",
+ .priv_size = sizeof(struct pfcp_dev),
+ .setup = pfcp_link_setup,
+ .newlink = pfcp_newlink,
+ .dellink = pfcp_dellink,
+};
+
+static int __net_init pfcp_net_init(struct net *net)
+{
+ struct pfcp_net *pn = net_generic(net, pfcp_net_id);
+
+ INIT_LIST_HEAD(&pn->pfcp_dev_list);
+ return 0;
+}
+
+static void __net_exit pfcp_net_exit(struct net *net)
+{
+ struct pfcp_net *pn = net_generic(net, pfcp_net_id);
+ struct pfcp_dev *pfcp;
+ LIST_HEAD(list);
+
+ rtnl_lock();
+ list_for_each_entry(pfcp, &pn->pfcp_dev_list, list)
+ pfcp_dellink(pfcp->dev, &list);
+
+ unregister_netdevice_many(&list);
+ rtnl_unlock();
+}
+
+static struct pernet_operations pfcp_net_ops = {
+ .init = pfcp_net_init,
+ .exit = pfcp_net_exit,
+ .id = &pfcp_net_id,
+ .size = sizeof(struct pfcp_net),
+};
+
+static int __init pfcp_init(void)
+{
+ int err;
+
+ err = register_pernet_subsys(&pfcp_net_ops);
+ if (err)
+ goto exit_err;
+
+ err = rtnl_link_register(&pfcp_link_ops);
+ if (err)
+ goto exit_unregister_subsys;
+ return 0;
+
+exit_unregister_subsys:
+ unregister_pernet_subsys(&pfcp_net_ops);
+exit_err:
+ pr_err("loading PFCP module failed: err %d\n", err);
+ return err;
+}
+late_initcall(pfcp_init);
+
+static void __exit pfcp_exit(void)
+{
+ rtnl_link_unregister(&pfcp_link_ops);
+ unregister_pernet_subsys(&pfcp_net_ops);
+
+ pr_info("PFCP module unloaded\n");
+}
+module_exit(pfcp_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Wojciech Drewek <[email protected]>");
+MODULE_DESCRIPTION("Interface driver for PFCP encapsulated traffic");
+MODULE_ALIAS_RTNL_LINK("pfcp");
--
2.43.0
From: Michal Swiatkowski <[email protected]>
In PFCP receive path set metadata needed by flower code to do correct
classification based on this metadata.
Signed-off-by: Michal Swiatkowski <[email protected]>
Signed-off-by: Marcin Szycik <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/net/ip_tunnels.h | 3 +
include/net/pfcp.h | 73 ++++++++++++++++++++++
include/uapi/linux/if_tunnel.h | 3 +
include/uapi/linux/pkt_cls.h | 14 +++++
drivers/net/pfcp.c | 81 ++++++++++++++++++++++++-
lib/test_bitmap.c | 7 +--
net/sched/cls_flower.c | 107 +++++++++++++++++++++++++++++++++
7 files changed, 281 insertions(+), 7 deletions(-)
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 1f04a379291f..74abfc5f6579 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -216,6 +216,7 @@ static inline void ip_tunnel_set_options_present(unsigned long *flags)
__set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_PFCP_OPT_BIT, present);
ip_tunnel_flags_or(flags, flags, present);
}
@@ -228,6 +229,7 @@ static inline void ip_tunnel_clear_options_present(unsigned long *flags)
__set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_PFCP_OPT_BIT, present);
__ipt_flag_op(bitmap_andnot, flags, flags, present);
}
@@ -240,6 +242,7 @@ static inline bool ip_tunnel_is_options_present(const unsigned long *flags)
__set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
__set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_PFCP_OPT_BIT, present);
return ip_tunnel_flags_intersect(flags, present);
}
diff --git a/include/net/pfcp.h b/include/net/pfcp.h
index 3f9ebf27a8ff..af14f970b80e 100644
--- a/include/net/pfcp.h
+++ b/include/net/pfcp.h
@@ -2,12 +2,85 @@
#ifndef _PFCP_H_
#define _PFCP_H_
+#include <uapi/linux/if_ether.h>
+#include <net/dst_metadata.h>
#include <linux/netdevice.h>
+#include <uapi/linux/ipv6.h>
+#include <net/udp_tunnel.h>
+#include <uapi/linux/udp.h>
+#include <uapi/linux/ip.h>
#include <linux/string.h>
#include <linux/types.h>
+#include <linux/bits.h>
#define PFCP_PORT 8805
+/* PFCP protocol header */
+struct pfcphdr {
+ u8 flags;
+ u8 message_type;
+ __be16 message_length;
+};
+
+/* PFCP header flags */
+#define PFCP_SEID_FLAG BIT(0)
+#define PFCP_MP_FLAG BIT(1)
+
+#define PFCP_VERSION_MASK GENMASK(4, 0)
+
+#define PFCP_HLEN (sizeof(struct udphdr) + sizeof(struct pfcphdr))
+
+/* PFCP node related messages */
+struct pfcphdr_node {
+ u8 seq_number[3];
+ u8 reserved;
+};
+
+/* PFCP session related messages */
+struct pfcphdr_session {
+ __be64 seid;
+ u8 seq_number[3];
+#ifdef __LITTLE_ENDIAN_BITFIELD
+ u8 message_priority:4,
+ reserved:4;
+#elif defined(__BIG_ENDIAN_BITFIELD)
+ u8 reserved:4,
+ message_priprity:4;
+#else
+#error "Please fix <asm/byteorder>"
+#endif
+};
+
+struct pfcp_metadata {
+ u8 type;
+ __be64 seid;
+} __packed;
+
+enum {
+ PFCP_TYPE_NODE = 0,
+ PFCP_TYPE_SESSION = 1,
+};
+
+#define PFCP_HEADROOM (sizeof(struct iphdr) + sizeof(struct udphdr) + \
+ sizeof(struct pfcphdr) + sizeof(struct ethhdr))
+#define PFCP6_HEADROOM (sizeof(struct ipv6hdr) + sizeof(struct udphdr) + \
+ sizeof(struct pfcphdr) + sizeof(struct ethhdr))
+
+static inline struct pfcphdr *pfcp_hdr(struct sk_buff *skb)
+{
+ return (struct pfcphdr *)(udp_hdr(skb) + 1);
+}
+
+static inline struct pfcphdr_node *pfcp_hdr_node(struct sk_buff *skb)
+{
+ return (struct pfcphdr_node *)(pfcp_hdr(skb) + 1);
+}
+
+static inline struct pfcphdr_session *pfcp_hdr_session(struct sk_buff *skb)
+{
+ return (struct pfcphdr_session *)(pfcp_hdr(skb) + 1);
+}
+
static inline bool netif_is_pfcp(const struct net_device *dev)
{
return dev->rtnl_link_ops &&
diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 838927dd73a4..e1a246dd8c62 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -212,6 +212,9 @@ enum {
IP_TUNNEL_VTI_BIT,
IP_TUNNEL_SIT_ISATAP_BIT = IP_TUNNEL_VTI_BIT,
+ /* Flags starting from here are not available via the old UAPI */
+ IP_TUNNEL_PFCP_OPT_BIT, /* OPTIONS_PRESENT */
+
__IP_TUNNEL_FLAG_NUM,
};
diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
index ea277039f89d..229fc925ec3a 100644
--- a/include/uapi/linux/pkt_cls.h
+++ b/include/uapi/linux/pkt_cls.h
@@ -587,6 +587,10 @@ enum {
* TCA_FLOWER_KEY_ENC_OPT_GTP_
* attributes
*/
+ TCA_FLOWER_KEY_ENC_OPTS_PFCP, /* Nested
+ * TCA_FLOWER_KEY_ENC_IPT_PFCP
+ * attributes
+ */
__TCA_FLOWER_KEY_ENC_OPTS_MAX,
};
@@ -636,6 +640,16 @@ enum {
#define TCA_FLOWER_KEY_ENC_OPT_GTP_MAX \
(__TCA_FLOWER_KEY_ENC_OPT_GTP_MAX - 1)
+enum {
+ TCA_FLOWER_KEY_ENC_OPT_PFCP_UNSPEC,
+ TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE, /* u8 */
+ TCA_FLOWER_KEY_ENC_OPT_PFCP_SEID, /* be64 */
+ __TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX,
+};
+
+#define TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX \
+ (__TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX - 1)
+
enum {
TCA_FLOWER_KEY_MPLS_OPTS_UNSPEC,
TCA_FLOWER_KEY_MPLS_OPTS_LSE,
diff --git a/drivers/net/pfcp.c b/drivers/net/pfcp.c
index 3f1ee0ae7111..cc5b28c5f99f 100644
--- a/drivers/net/pfcp.c
+++ b/drivers/net/pfcp.c
@@ -21,6 +21,8 @@ struct pfcp_dev {
struct socket *sock;
struct net_device *dev;
struct net *net;
+
+ struct gro_cells gro_cells;
};
static unsigned int pfcp_net_id __read_mostly;
@@ -29,6 +31,78 @@ struct pfcp_net {
struct list_head pfcp_dev_list;
};
+static void
+pfcp_session_recv(struct pfcp_dev *pfcp, struct sk_buff *skb,
+ struct pfcp_metadata *md)
+{
+ struct pfcphdr_session *unparsed = pfcp_hdr_session(skb);
+
+ md->seid = unparsed->seid;
+ md->type = PFCP_TYPE_SESSION;
+}
+
+static void
+pfcp_node_recv(struct pfcp_dev *pfcp, struct sk_buff *skb,
+ struct pfcp_metadata *md)
+{
+ md->type = PFCP_TYPE_NODE;
+}
+
+static int pfcp_encap_recv(struct sock *sk, struct sk_buff *skb)
+{
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+ struct metadata_dst *tun_dst;
+ struct pfcp_metadata *md;
+ struct pfcphdr *unparsed;
+ struct pfcp_dev *pfcp;
+
+ if (unlikely(!pskb_may_pull(skb, PFCP_HLEN)))
+ goto drop;
+
+ pfcp = rcu_dereference_sk_user_data(sk);
+ if (unlikely(!pfcp))
+ goto drop;
+
+ unparsed = pfcp_hdr(skb);
+
+ ip_tunnel_flags_zero(flags);
+ tun_dst = udp_tun_rx_dst(skb, sk->sk_family, flags, 0,
+ sizeof(*md));
+ if (unlikely(!tun_dst))
+ goto drop;
+
+ md = ip_tunnel_info_opts(&tun_dst->u.tun_info);
+ if (unlikely(!md))
+ goto drop;
+
+ if (unparsed->flags & PFCP_SEID_FLAG)
+ pfcp_session_recv(pfcp, skb, md);
+ else
+ pfcp_node_recv(pfcp, skb, md);
+
+ __set_bit(IP_TUNNEL_PFCP_OPT_BIT, flags);
+ ip_tunnel_info_opts_set(&tun_dst->u.tun_info, md, sizeof(*md),
+ flags);
+
+ if (unlikely(iptunnel_pull_header(skb, PFCP_HLEN, skb->protocol,
+ !net_eq(sock_net(sk),
+ dev_net(pfcp->dev)))))
+ goto drop;
+
+ skb_dst_set(skb, (struct dst_entry *)tun_dst);
+
+ skb_reset_network_header(skb);
+ skb_reset_mac_header(skb);
+ skb->dev = pfcp->dev;
+
+ gro_cells_receive(&pfcp->gro_cells, skb);
+
+ return 0;
+drop:
+ kfree_skb(skb);
+ return 0;
+}
+
static void pfcp_del_sock(struct pfcp_dev *pfcp)
{
udp_tunnel_sock_release(pfcp->sock);
@@ -39,6 +113,7 @@ static void pfcp_dev_uninit(struct net_device *dev)
{
struct pfcp_dev *pfcp = netdev_priv(dev);
+ gro_cells_destroy(&pfcp->gro_cells);
pfcp_del_sock(pfcp);
}
@@ -48,7 +123,7 @@ static int pfcp_dev_init(struct net_device *dev)
pfcp->dev = dev;
- return 0;
+ return gro_cells_init(&pfcp->gro_cells, dev);
}
static const struct net_device_ops pfcp_netdev_ops = {
@@ -94,6 +169,10 @@ static struct socket *pfcp_create_sock(struct pfcp_dev *pfcp)
if (err)
return ERR_PTR(err);
+ tuncfg.sk_user_data = pfcp;
+ tuncfg.encap_rcv = pfcp_encap_recv;
+ tuncfg.encap_type = 1;
+
setup_udp_tunnel_sock(net, sock, &tuncfg);
return sock;
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 270afc0cba5c..1639d27d128b 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -1478,12 +1478,7 @@ static const struct ip_tunnel_flags_test ip_tunnel_flags_test[] __initconst = {
ip_tunnel_flags_1),
IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_2_src, true, VTI_ISVTI,
ip_tunnel_flags_2_exp),
- IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_3_src,
- /*
- * This must be set to ``false`` once
- * ``__IP_TUNNEL_FLAG_NUM`` goes above 17.
- */
- true,
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_3_src, false,
cpu_to_be16(BIT(IP_TUNNEL_VXLAN_OPT_BIT)),
ip_tunnel_flags_3_exp),
};
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index 8326919cbf67..8f2806117a18 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -28,6 +28,7 @@
#include <net/vxlan.h>
#include <net/erspan.h>
#include <net/gtp.h>
+#include <net/pfcp.h>
#include <net/tc_wrapper.h>
#include <net/dst.h>
@@ -741,6 +742,7 @@ enc_opts_policy[TCA_FLOWER_KEY_ENC_OPTS_MAX + 1] = {
[TCA_FLOWER_KEY_ENC_OPTS_VXLAN] = { .type = NLA_NESTED },
[TCA_FLOWER_KEY_ENC_OPTS_ERSPAN] = { .type = NLA_NESTED },
[TCA_FLOWER_KEY_ENC_OPTS_GTP] = { .type = NLA_NESTED },
+ [TCA_FLOWER_KEY_ENC_OPTS_PFCP] = { .type = NLA_NESTED },
};
static const struct nla_policy
@@ -770,6 +772,12 @@ gtp_opt_policy[TCA_FLOWER_KEY_ENC_OPT_GTP_MAX + 1] = {
[TCA_FLOWER_KEY_ENC_OPT_GTP_QFI] = { .type = NLA_U8 },
};
+static const struct nla_policy
+pfcp_opt_policy[TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX + 1] = {
+ [TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE] = { .type = NLA_U8 },
+ [TCA_FLOWER_KEY_ENC_OPT_PFCP_SEID] = { .type = NLA_U64 },
+};
+
static const struct nla_policy
mpls_stack_entry_policy[TCA_FLOWER_KEY_MPLS_OPT_LSE_MAX + 1] = {
[TCA_FLOWER_KEY_MPLS_OPT_LSE_DEPTH] = { .type = NLA_U8 },
@@ -1419,6 +1427,44 @@ static int fl_set_gtp_opt(const struct nlattr *nla, struct fl_flow_key *key,
return sizeof(*sinfo);
}
+static int fl_set_pfcp_opt(const struct nlattr *nla, struct fl_flow_key *key,
+ int depth, int option_len,
+ struct netlink_ext_ack *extack)
+{
+ struct nlattr *tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX + 1];
+ struct pfcp_metadata *md;
+ int err;
+
+ md = (struct pfcp_metadata *)&key->enc_opts.data[key->enc_opts.len];
+ memset(md, 0xff, sizeof(*md));
+
+ if (!depth)
+ return sizeof(*md);
+
+ if (nla_type(nla) != TCA_FLOWER_KEY_ENC_OPTS_PFCP) {
+ NL_SET_ERR_MSG_MOD(extack, "Non-pfcp option type for mask");
+ return -EINVAL;
+ }
+
+ err = nla_parse_nested(tb, TCA_FLOWER_KEY_ENC_OPT_PFCP_MAX, nla,
+ pfcp_opt_policy, extack);
+ if (err < 0)
+ return err;
+
+ if (!option_len && !tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE]) {
+ NL_SET_ERR_MSG_MOD(extack, "Missing tunnel key pfcp option type");
+ return -EINVAL;
+ }
+
+ if (tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE])
+ md->type = nla_get_u8(tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE]);
+
+ if (tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_SEID])
+ md->seid = nla_get_be64(tb[TCA_FLOWER_KEY_ENC_OPT_PFCP_SEID]);
+
+ return sizeof(*md);
+}
+
static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
struct fl_flow_key *mask,
struct netlink_ext_ack *extack)
@@ -1576,6 +1622,36 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
break;
+ case TCA_FLOWER_KEY_ENC_OPTS_PFCP:
+ if (key->enc_opts.dst_opt_type) {
+ NL_SET_ERR_MSG_MOD(extack, "Duplicate type for pfcp options");
+ return -EINVAL;
+ }
+ option_len = 0;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_PFCP_OPT_BIT;
+ option_len = fl_set_pfcp_opt(nla_opt_key, key,
+ key_depth, option_len,
+ extack);
+ if (option_len < 0)
+ return option_len;
+
+ key->enc_opts.len += option_len;
+ /* At the same time we need to parse through the mask
+ * in order to verify exact and mask attribute lengths.
+ */
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_PFCP_OPT_BIT;
+ option_len = fl_set_pfcp_opt(nla_opt_msk, mask,
+ msk_depth, option_len,
+ extack);
+ if (option_len < 0)
+ return option_len;
+
+ mask->enc_opts.len += option_len;
+ if (key->enc_opts.len != mask->enc_opts.len) {
+ NL_SET_ERR_MSG_MOD(extack, "Key and mask miss aligned");
+ return -EINVAL;
+ }
+ break;
default:
NL_SET_ERR_MSG(extack, "Unknown tunnel option type");
return -EINVAL;
@@ -3115,6 +3191,32 @@ static int fl_dump_key_gtp_opt(struct sk_buff *skb,
return -EMSGSIZE;
}
+static int fl_dump_key_pfcp_opt(struct sk_buff *skb,
+ struct flow_dissector_key_enc_opts *enc_opts)
+{
+ struct pfcp_metadata *md;
+ struct nlattr *nest;
+
+ nest = nla_nest_start_noflag(skb, TCA_FLOWER_KEY_ENC_OPTS_PFCP);
+ if (!nest)
+ goto nla_put_failure;
+
+ md = (struct pfcp_metadata *)&enc_opts->data[0];
+ if (nla_put_u8(skb, TCA_FLOWER_KEY_ENC_OPT_PFCP_TYPE, md->type))
+ goto nla_put_failure;
+
+ if (nla_put_be64(skb, TCA_FLOWER_KEY_ENC_OPT_PFCP_SEID,
+ md->seid, 0))
+ goto nla_put_failure;
+
+ nla_nest_end(skb, nest);
+ return 0;
+
+nla_put_failure:
+ nla_nest_cancel(skb, nest);
+ return -EMSGSIZE;
+}
+
static int fl_dump_key_ct(struct sk_buff *skb,
struct flow_dissector_key_ct *key,
struct flow_dissector_key_ct *mask)
@@ -3220,6 +3322,11 @@ static int fl_dump_key_options(struct sk_buff *skb, int enc_opt_type,
if (err)
goto nla_put_failure;
break;
+ case IP_TUNNEL_PFCP_OPT_BIT:
+ err = fl_dump_key_pfcp_opt(skb, enc_opts);
+ if (err)
+ goto nla_put_failure;
+ break;
default:
goto nla_put_failure;
}
--
2.43.0
Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT
have been defined as __be16. Now all of those 16 bits are occupied
and there's no more free space for new flags.
It can't be simply switched to a bigger container with no
adjustments to the values, since it's an explicit Endian storage,
and on LE systems (__be16)0x0001 equals to
(__be64)0x0001000000000000.
We could probably define new 64-bit flags depending on the
Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on
LE, but that would introduce an Endianness dependency and spawn a
ton of Sparse warnings. To mitigate them, all of those places which
were adjusted with this change would be touched anyway, so why not
define stuff properly if there's no choice.
Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the
value already coded and a fistful of <16 <-> bitmap> converters and
helpers. The two flags which have a different bit position are
SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as
__cpu_to_be16(), but as (__force __be16), i.e. had different
positions on LE and BE. Now they both have strongly defined places.
Change all __be16 fields which were used to store those flags, to
IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) ->
unsigned long[1] for now, and replace all TUNNEL_* occurrences to
their bitmap counterparts. Use the converters in the places which talk
to the userspace, hardware (NFP) or other hosts (GRE header). The rest
must explicitly use the new flags only. This must be done at once,
otherwise there will be too many conversions throughout the code in
the intermediate commits.
Finally, disable the old __be16 flags for use in the kernel code
(except for the two 'irregular' flags mentioned above), to prevent
any accidental (mis)use of them. For the userspace, nothing is
changed, only additions were made.
Most noticeable bloat-o-meter difference (.text):
vmlinux: 307/-1 (306)
gre.ko: 62/0 (62)
ip_gre.ko: 941/-217 (724) [*]
ip_tunnel.ko: 390/-900 (-510) [**]
ip_vti.ko: 138/0 (138)
ip6_gre.ko: 534/-18 (516) [*]
ip6_tunnel.ko: 118/-10 (108)
[*] gre_flags_to_tnl_flags() grew, but still is inlined
[**] ip_tunnel_find() got uninlined, hence such decrease
The average code size increase in non-extreme case is 100-200 bytes
per module, mostly due to sizeof(long) > sizeof(__be16), as
%__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers
are able to expand the majority of bitmap_*() calls here into direct
operations on scalars.
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
.../ethernet/mellanox/mlx5/core/en/tc_tun.h | 2 +-
include/net/dst_metadata.h | 10 +-
include/net/flow_dissector.h | 2 +-
include/net/gre.h | 70 +++++-----
include/net/ip6_tunnel.h | 4 +-
include/net/ip_tunnels.h | 119 +++++++++++++---
include/net/udp_tunnel.h | 4 +-
include/uapi/linux/if_tunnel.h | 33 +++++
drivers/net/bareudp.c | 19 ++-
.../mellanox/mlx5/core/en/tc_tun_encap.c | 6 +-
.../mellanox/mlx5/core/en/tc_tun_geneve.c | 12 +-
.../mellanox/mlx5/core/en/tc_tun_gre.c | 8 +-
.../mellanox/mlx5/core/en/tc_tun_vxlan.c | 9 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 16 ++-
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 26 ++--
.../ethernet/mellanox/mlxsw/spectrum_span.c | 6 +-
.../ethernet/netronome/nfp/flower/action.c | 27 +++-
drivers/net/geneve.c | 44 +++---
drivers/net/vxlan/vxlan_core.c | 14 +-
net/bridge/br_vlan_tunnel.c | 9 +-
net/core/filter.c | 26 ++--
net/core/flow_dissector.c | 20 ++-
net/ipv4/fou_bpf.c | 2 +-
net/ipv4/gre_demux.c | 2 +-
net/ipv4/ip_gre.c | 127 +++++++++++-------
net/ipv4/ip_tunnel.c | 54 ++++----
net/ipv4/ip_tunnel_core.c | 80 ++++++-----
net/ipv4/ip_vti.c | 29 ++--
net/ipv4/ipip.c | 21 ++-
net/ipv4/udp_tunnel_core.c | 5 +-
net/ipv6/ip6_gre.c | 85 +++++++-----
net/ipv6/ip6_tunnel.c | 14 +-
net/ipv6/sit.c | 5 +-
net/netfilter/ipvs/ip_vs_core.c | 6 +-
net/netfilter/ipvs/ip_vs_xmit.c | 20 +--
net/netfilter/nft_tunnel.c | 44 +++---
net/openvswitch/flow_netlink.c | 61 +++++----
net/psample/psample.c | 26 ++--
net/sched/act_tunnel_key.c | 36 ++---
net/sched/cls_flower.c | 27 ++--
40 files changed, 715 insertions(+), 415 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
index 92065568bb19..6873c1201803 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
@@ -117,7 +117,7 @@ bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,
bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b,
- __be16 tun_flags);
+ u32 tun_type);
#endif /* CONFIG_MLX5_ESWITCH */
#endif //__MLX5_EN_TC_TUNNEL_H__
diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index 1b7fae4c6b24..4160731dcb6e 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -198,7 +198,7 @@ static inline struct metadata_dst *__ip_tun_set_dst(__be32 saddr,
__be32 daddr,
__u8 tos, __u8 ttl,
__be16 tp_dst,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -215,7 +215,7 @@ static inline struct metadata_dst *__ip_tun_set_dst(__be32 saddr,
}
static inline struct metadata_dst *ip_tun_rx_dst(struct sk_buff *skb,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -230,7 +230,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad
__u8 tos, __u8 ttl,
__be16 tp_dst,
__be32 label,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -243,7 +243,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad
info = &tun_dst->u.tun_info;
info->mode = IP_TUNNEL_INFO_IPV6;
- info->key.tun_flags = flags;
+ ip_tunnel_flags_copy(info->key.tun_flags, flags);
info->key.tun_id = tunnel_id;
info->key.tp_src = 0;
info->key.tp_dst = tp_dst;
@@ -259,7 +259,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad
}
static inline struct metadata_dst *ipv6_tun_rx_dst(struct sk_buff *skb,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 1a7131d6cb0e..9ab376d1a677 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -97,7 +97,7 @@ struct flow_dissector_key_enc_opts {
* here but seems difficult to #include
*/
u8 len;
- __be16 dst_opt_type;
+ u32 dst_opt_type;
};
struct flow_dissector_key_keyid {
diff --git a/include/net/gre.h b/include/net/gre.h
index 4e209708b754..ccd293203284 100644
--- a/include/net/gre.h
+++ b/include/net/gre.h
@@ -49,67 +49,61 @@ static inline bool netif_is_ip6gretap(const struct net_device *dev)
!strcmp(dev->rtnl_link_ops->kind, "ip6gretap");
}
-static inline int gre_calc_hlen(__be16 o_flags)
+static inline int gre_calc_hlen(const unsigned long *o_flags)
{
int addend = 4;
- if (o_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, o_flags))
addend += 4;
- if (o_flags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, o_flags))
addend += 4;
- if (o_flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, o_flags))
addend += 4;
return addend;
}
-static inline __be16 gre_flags_to_tnl_flags(__be16 flags)
+static inline void gre_flags_to_tnl_flags(unsigned long *dst, __be16 flags)
{
- __be16 tflags = 0;
-
- if (flags & GRE_CSUM)
- tflags |= TUNNEL_CSUM;
- if (flags & GRE_ROUTING)
- tflags |= TUNNEL_ROUTING;
- if (flags & GRE_KEY)
- tflags |= TUNNEL_KEY;
- if (flags & GRE_SEQ)
- tflags |= TUNNEL_SEQ;
- if (flags & GRE_STRICT)
- tflags |= TUNNEL_STRICT;
- if (flags & GRE_REC)
- tflags |= TUNNEL_REC;
- if (flags & GRE_VERSION)
- tflags |= TUNNEL_VERSION;
-
- return tflags;
+ IP_TUNNEL_DECLARE_FLAGS(res) = { };
+
+ __assign_bit(IP_TUNNEL_CSUM_BIT, res, flags & GRE_CSUM);
+ __assign_bit(IP_TUNNEL_ROUTING_BIT, res, flags & GRE_ROUTING);
+ __assign_bit(IP_TUNNEL_KEY_BIT, res, flags & GRE_KEY);
+ __assign_bit(IP_TUNNEL_SEQ_BIT, res, flags & GRE_SEQ);
+ __assign_bit(IP_TUNNEL_STRICT_BIT, res, flags & GRE_STRICT);
+ __assign_bit(IP_TUNNEL_REC_BIT, res, flags & GRE_REC);
+ __assign_bit(IP_TUNNEL_VERSION_BIT, res, flags & GRE_VERSION);
+
+ ip_tunnel_flags_copy(dst, res);
}
-static inline __be16 gre_tnl_flags_to_gre_flags(__be16 tflags)
+static inline __be16 gre_tnl_flags_to_gre_flags(const unsigned long *tflags)
{
__be16 flags = 0;
- if (tflags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tflags))
flags |= GRE_CSUM;
- if (tflags & TUNNEL_ROUTING)
+ if (test_bit(IP_TUNNEL_ROUTING_BIT, tflags))
flags |= GRE_ROUTING;
- if (tflags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, tflags))
flags |= GRE_KEY;
- if (tflags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tflags))
flags |= GRE_SEQ;
- if (tflags & TUNNEL_STRICT)
+ if (test_bit(IP_TUNNEL_STRICT_BIT, tflags))
flags |= GRE_STRICT;
- if (tflags & TUNNEL_REC)
+ if (test_bit(IP_TUNNEL_REC_BIT, tflags))
flags |= GRE_REC;
- if (tflags & TUNNEL_VERSION)
+ if (test_bit(IP_TUNNEL_VERSION_BIT, tflags))
flags |= GRE_VERSION;
return flags;
}
static inline void gre_build_header(struct sk_buff *skb, int hdr_len,
- __be16 flags, __be16 proto,
+ const unsigned long *flags, __be16 proto,
__be32 key, __be32 seq)
{
+ IP_TUNNEL_DECLARE_FLAGS(cond) = { };
struct gre_base_hdr *greh;
skb_push(skb, hdr_len);
@@ -120,18 +114,22 @@ static inline void gre_build_header(struct sk_buff *skb, int hdr_len,
greh->flags = gre_tnl_flags_to_gre_flags(flags);
greh->protocol = proto;
- if (flags & (TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_SEQ)) {
+ __set_bit(IP_TUNNEL_KEY_BIT, cond);
+ __set_bit(IP_TUNNEL_CSUM_BIT, cond);
+ __set_bit(IP_TUNNEL_SEQ_BIT, cond);
+
+ if (ip_tunnel_flags_intersect(flags, cond)) {
__be32 *ptr = (__be32 *)(((u8 *)greh) + hdr_len - 4);
- if (flags & TUNNEL_SEQ) {
+ if (test_bit(IP_TUNNEL_SEQ_BIT, flags)) {
*ptr = seq;
ptr--;
}
- if (flags & TUNNEL_KEY) {
+ if (test_bit(IP_TUNNEL_KEY_BIT, flags)) {
*ptr = key;
ptr--;
}
- if (flags & TUNNEL_CSUM &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, flags) &&
!(skb_shinfo(skb)->gso_type &
(SKB_GSO_GRE | SKB_GSO_GRE_CSUM))) {
*ptr = 0;
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 74b369bddf49..399592405c72 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -30,8 +30,8 @@ struct __ip6_tnl_parm {
struct in6_addr laddr; /* local tunnel end-point address */
struct in6_addr raddr; /* remote tunnel end-point address */
- __be16 i_flags;
- __be16 o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(i_flags);
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
__be32 i_key;
__be32 o_key;
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 847ffebaa09b..1f04a379291f 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -36,6 +36,24 @@
(sizeof_field(struct ip_tunnel_key, u) - \
sizeof_field(struct ip_tunnel_key, u.ipv4))
+#define __ipt_flag_op(op, ...) \
+ op(__VA_ARGS__, __IP_TUNNEL_FLAG_NUM)
+
+#define IP_TUNNEL_DECLARE_FLAGS(...) \
+ __ipt_flag_op(DECLARE_BITMAP, __VA_ARGS__)
+
+#define ip_tunnel_flags_zero(...) __ipt_flag_op(bitmap_zero, __VA_ARGS__)
+#define ip_tunnel_flags_copy(...) __ipt_flag_op(bitmap_copy, __VA_ARGS__)
+#define ip_tunnel_flags_and(...) __ipt_flag_op(bitmap_and, __VA_ARGS__)
+#define ip_tunnel_flags_or(...) __ipt_flag_op(bitmap_or, __VA_ARGS__)
+
+#define ip_tunnel_flags_empty(...) \
+ __ipt_flag_op(bitmap_empty, __VA_ARGS__)
+#define ip_tunnel_flags_intersect(...) \
+ __ipt_flag_op(bitmap_intersects, __VA_ARGS__)
+#define ip_tunnel_flags_subset(...) \
+ __ipt_flag_op(bitmap_subset, __VA_ARGS__)
+
struct ip_tunnel_key {
__be64 tun_id;
union {
@@ -48,11 +66,11 @@ struct ip_tunnel_key {
struct in6_addr dst;
} ipv6;
} u;
- __be16 tun_flags;
- u8 tos; /* TOS for IPv4, TC for IPv6 */
- u8 ttl; /* TTL for IPv4, HL for IPv6 */
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags);
__be32 label; /* Flow Label for IPv6 */
u32 nhid;
+ u8 tos; /* TOS for IPv4, TC for IPv6 */
+ u8 ttl; /* TTL for IPv4, HL for IPv6 */
__be16 tp_src;
__be16 tp_dst;
__u8 flow_flags;
@@ -110,14 +128,14 @@ struct ip_tunnel_prl_entry {
struct metadata_dst;
-/* Kernel-side copy of ip_tunnel_parm */
+/* Kernel-side variant of ip_tunnel_parm */
struct ip_tunnel_parm_kern {
char name[IFNAMSIZ];
- int link;
- __be16 i_flags;
- __be16 o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(i_flags);
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
__be32 i_key;
__be32 o_key;
+ int link;
struct iphdr iph;
};
@@ -168,7 +186,7 @@ struct ip_tunnel {
};
struct tnl_ptk_info {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 proto;
__be32 key;
__be32 seq;
@@ -190,11 +208,77 @@ struct ip_tunnel_net {
int type;
};
+static inline void ip_tunnel_set_options_present(unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ ip_tunnel_flags_or(flags, flags, present);
+}
+
+static inline void ip_tunnel_clear_options_present(unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ __ipt_flag_op(bitmap_andnot, flags, flags, present);
+}
+
+static inline bool ip_tunnel_is_options_present(const unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ return ip_tunnel_flags_intersect(flags, present);
+}
+
+static inline bool ip_tunnel_flags_is_be16_compat(const unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(supp) = { };
+
+ bitmap_set(supp, 0, BITS_PER_TYPE(__be16));
+ __set_bit(IP_TUNNEL_VTI_BIT, supp);
+
+ return ip_tunnel_flags_subset(flags, supp);
+}
+
+static inline void ip_tunnel_flags_from_be16(unsigned long *dst, __be16 flags)
+{
+ ip_tunnel_flags_zero(dst);
+
+ bitmap_write(dst, be16_to_cpu(flags), 0, BITS_PER_TYPE(__be16));
+ __assign_bit(IP_TUNNEL_VTI_BIT, dst, flags & VTI_ISVTI);
+}
+
+static inline __be16 ip_tunnel_flags_to_be16(const unsigned long *flags)
+{
+ __be16 ret;
+
+ ret = cpu_to_be16(bitmap_read(flags, 0, BITS_PER_TYPE(__be16)));
+ if (test_bit(IP_TUNNEL_VTI_BIT, flags))
+ ret |= VTI_ISVTI;
+
+ return ret;
+}
+
static inline void ip_tunnel_key_init(struct ip_tunnel_key *key,
__be32 saddr, __be32 daddr,
u8 tos, u8 ttl, __be32 label,
__be16 tp_src, __be16 tp_dst,
- __be64 tun_id, __be16 tun_flags)
+ __be64 tun_id,
+ const unsigned long *tun_flags)
{
key->tun_id = tun_id;
key->u.ipv4.src = saddr;
@@ -204,7 +288,7 @@ static inline void ip_tunnel_key_init(struct ip_tunnel_key *key,
key->tos = tos;
key->ttl = ttl;
key->label = label;
- key->tun_flags = tun_flags;
+ ip_tunnel_flags_copy(key->tun_flags, tun_flags);
/* For the tunnel types on the top of IPsec, the tp_src and tp_dst of
* the upper tunnel are used.
@@ -225,12 +309,8 @@ ip_tunnel_dst_cache_usable(const struct sk_buff *skb,
{
if (skb->mark)
return false;
- if (!info)
- return true;
- if (info->key.tun_flags & TUNNEL_NOCACHE)
- return false;
- return true;
+ return !info || !test_bit(IP_TUNNEL_NOCACHE_BIT, info->key.tun_flags);
}
static inline unsigned short ip_tunnel_info_af(const struct ip_tunnel_info
@@ -312,7 +392,7 @@ int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict);
int ip_tunnel_change_mtu(struct net_device *dev, int new_mtu);
struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
- int link, __be16 flags,
+ int link, const unsigned long *flags,
__be32 remote, __be32 local,
__be32 key);
@@ -528,12 +608,13 @@ static inline void ip_tunnel_info_opts_get(void *to,
static inline void ip_tunnel_info_opts_set(struct ip_tunnel_info *info,
const void *from, int len,
- __be16 flags)
+ const unsigned long *flags)
{
info->options_len = len;
if (len > 0) {
memcpy(ip_tunnel_info_opts(info), from, len);
- info->key.tun_flags |= flags;
+ ip_tunnel_flags_or(info->key.tun_flags, info->key.tun_flags,
+ flags);
}
}
@@ -577,7 +658,7 @@ static inline void ip_tunnel_info_opts_get(void *to,
static inline void ip_tunnel_info_opts_set(struct ip_tunnel_info *info,
const void *from, int len,
- __be16 flags)
+ const unsigned long *flags)
{
info->options_len = 0;
}
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index d716214fe03d..a93dc51f6323 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -179,8 +179,8 @@ struct dst_entry *udp_tunnel6_dst_lookup(struct sk_buff *skb,
struct dst_cache *dst_cache);
struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
- __be16 flags, __be64 tunnel_id,
- int md_size);
+ const unsigned long *flags,
+ __be64 tunnel_id, int md_size);
#ifdef CONFIG_INET
static inline int udp_tunnel_handle_offloads(struct sk_buff *skb, bool udp_csum)
diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 102119628ff5..838927dd73a4 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -161,6 +161,14 @@ enum {
#define IFLA_VTI_MAX (__IFLA_VTI_MAX - 1)
+#ifndef __KERNEL__
+/* Historically, tunnel flags have been defined as __be16 and now there are
+ * no free bits left. It is strongly advised to switch the already existing
+ * userspace code to the new *_BIT definitions from down below, as __be16
+ * can't be simply cast to a wider type on LE systems. All new flags and
+ * code must use *_BIT only.
+ */
+
#define TUNNEL_CSUM __cpu_to_be16(0x01)
#define TUNNEL_ROUTING __cpu_to_be16(0x02)
#define TUNNEL_KEY __cpu_to_be16(0x04)
@@ -181,5 +189,30 @@ enum {
#define TUNNEL_OPTIONS_PRESENT \
(TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT | TUNNEL_ERSPAN_OPT | \
TUNNEL_GTP_OPT)
+#endif
+
+enum {
+ IP_TUNNEL_CSUM_BIT = 0U,
+ IP_TUNNEL_ROUTING_BIT,
+ IP_TUNNEL_KEY_BIT,
+ IP_TUNNEL_SEQ_BIT,
+ IP_TUNNEL_STRICT_BIT,
+ IP_TUNNEL_REC_BIT,
+ IP_TUNNEL_VERSION_BIT,
+ IP_TUNNEL_NO_KEY_BIT,
+ IP_TUNNEL_DONT_FRAGMENT_BIT,
+ IP_TUNNEL_OAM_BIT,
+ IP_TUNNEL_CRIT_OPT_BIT,
+ IP_TUNNEL_GENEVE_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_VXLAN_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_NOCACHE_BIT,
+ IP_TUNNEL_ERSPAN_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_GTP_OPT_BIT, /* OPTIONS_PRESENT */
+
+ IP_TUNNEL_VTI_BIT,
+ IP_TUNNEL_SIT_ISATAP_BIT = IP_TUNNEL_VTI_BIT,
+
+ __IP_TUNNEL_FLAG_NUM,
+};
#endif /* _UAPI_IF_TUNNEL_H_ */
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 31377bb1cc97..8f862bbe9968 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -61,6 +61,7 @@ struct bareudp_dev {
static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
{
struct metadata_dst *tun_dst = NULL;
+ IP_TUNNEL_DECLARE_FLAGS(key) = { };
struct bareudp_dev *bareudp;
unsigned short family;
unsigned int len;
@@ -137,7 +138,10 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
bareudp->dev->stats.rx_dropped++;
goto drop;
}
- tun_dst = udp_tun_rx_dst(skb, family, TUNNEL_KEY, 0, 0);
+
+ __set_bit(IP_TUNNEL_KEY_BIT, key);
+
+ tun_dst = udp_tun_rx_dst(skb, family, key, 0, 0);
if (!tun_dst) {
bareudp->dev->stats.rx_dropped++;
goto drop;
@@ -291,10 +295,10 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct bareudp_dev *bareudp,
const struct ip_tunnel_info *info)
{
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
bool xnet = !net_eq(bareudp->net, dev_net(bareudp->dev));
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct socket *sock = rcu_dereference(bareudp->sock);
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
const struct ip_tunnel_key *key = &info->key;
struct rtable *rt;
__be16 sport, df;
@@ -322,7 +326,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
ttl = key->ttl;
- df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
+ df = test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) ?
+ htons(IP_DF) : 0;
skb_scrub_packet(skb, xnet);
err = -ENOSPC;
@@ -344,7 +349,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel_xmit_skb(rt, sock->sk, skb, saddr, info->key.u.ipv4.dst,
tos, ttl, df, sport, bareudp->port,
!net_eq(bareudp->net, dev_net(bareudp->dev)),
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
free_dst:
@@ -356,10 +362,10 @@ static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct bareudp_dev *bareudp,
const struct ip_tunnel_info *info)
{
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
bool xnet = !net_eq(bareudp->net, dev_net(bareudp->dev));
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct socket *sock = rcu_dereference(bareudp->sock);
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
const struct ip_tunnel_key *key = &info->key;
struct dst_entry *dst = NULL;
struct in6_addr saddr, daddr;
@@ -408,7 +414,8 @@ static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel6_xmit_skb(dst, sock->sk, skb, dev,
&saddr, &daddr, prio, ttl,
info->key.label, sport, bareudp->port,
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
free_dst:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index f1d1e1542e81..878cbdbf5ec8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -587,7 +587,7 @@ bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,
bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b,
- __be16 tun_flags)
+ u32 tun_type)
{
struct ip_tunnel_info *a_info;
struct ip_tunnel_info *b_info;
@@ -596,8 +596,8 @@ bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
if (!mlx5e_tc_tun_encap_info_equal_generic(a, b))
return false;
- a_has_opts = !!(a->ip_tun_key->tun_flags & tun_flags);
- b_has_opts = !!(b->ip_tun_key->tun_flags & tun_flags);
+ a_has_opts = test_bit(tun_type, a->ip_tun_key->tun_flags);
+ b_has_opts = test_bit(tun_type, b->ip_tun_key->tun_flags);
/* keys are equal when both don't have any options attached */
if (!a_has_opts && !b_has_opts)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
index 2bcd10b6d653..bf969212cc77 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
@@ -106,12 +106,13 @@ static int mlx5e_gen_ip_tunnel_header_geneve(char buf[],
memset(geneveh, 0, sizeof(*geneveh));
geneveh->ver = MLX5E_GENEVE_VER;
geneveh->opt_len = tun_info->options_len / 4;
- geneveh->oam = !!(tun_info->key.tun_flags & TUNNEL_OAM);
- geneveh->critical = !!(tun_info->key.tun_flags & TUNNEL_CRIT_OPT);
+ geneveh->oam = test_bit(IP_TUNNEL_OAM_BIT, tun_info->key.tun_flags);
+ geneveh->critical = test_bit(IP_TUNNEL_CRIT_OPT_BIT,
+ tun_info->key.tun_flags);
mlx5e_tunnel_id_to_vni(tun_info->key.tun_id, geneveh->vni);
geneveh->proto_type = htons(ETH_P_TEB);
- if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_info->key.tun_flags)) {
if (!geneveh->opt_len)
return -EOPNOTSUPP;
ip_tunnel_info_opts_get(geneveh->options, tun_info);
@@ -188,7 +189,7 @@ static int mlx5e_tc_tun_parse_geneve_options(struct mlx5e_priv *priv,
/* make sure that we're talking about GENEVE options */
- if (enc_opts.key->dst_opt_type != TUNNEL_GENEVE_OPT) {
+ if (enc_opts.key->dst_opt_type != IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG_MOD(extack,
"Matching on GENEVE options: option type is not GENEVE");
netdev_warn(priv->netdev,
@@ -337,7 +338,8 @@ static int mlx5e_tc_tun_parse_geneve(struct mlx5e_priv *priv,
static bool mlx5e_tc_tun_encap_info_equal_geneve(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b)
{
- return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_GENEVE_OPT);
+ return mlx5e_tc_tun_encap_info_equal_options(a, b,
+ IP_TUNNEL_GENEVE_OPT_BIT);
}
struct mlx5e_tc_tunnel geneve_tunnel = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
index ada14f0574dc..579eda89fc76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
@@ -31,12 +31,16 @@ static int mlx5e_gen_ip_tunnel_header_gretap(char buf[],
const struct ip_tunnel_key *tun_key = &e->tun_info->key;
struct gre_base_hdr *greh = (struct gre_base_hdr *)(buf);
__be32 tun_id = tunnel_id_to_key32(tun_key->tun_id);
+ IP_TUNNEL_DECLARE_FLAGS(unsupp) = { };
int hdr_len;
*ip_proto = IPPROTO_GRE;
/* the HW does not calculate GRE csum or sequences */
- if (tun_key->tun_flags & (TUNNEL_CSUM | TUNNEL_SEQ))
+ __set_bit(IP_TUNNEL_CSUM_BIT, unsupp);
+ __set_bit(IP_TUNNEL_SEQ_BIT, unsupp);
+
+ if (ip_tunnel_flags_intersect(tun_key->tun_flags, unsupp))
return -EOPNOTSUPP;
greh->protocol = htons(ETH_P_TEB);
@@ -44,7 +48,7 @@ static int mlx5e_gen_ip_tunnel_header_gretap(char buf[],
/* GRE key */
hdr_len = mlx5e_tc_tun_calc_hlen_gretap(e);
greh->flags = gre_tnl_flags_to_gre_flags(tun_key->tun_flags);
- if (tun_key->tun_flags & TUNNEL_KEY) {
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags)) {
__be32 *ptr = (__be32 *)(((u8 *)greh) + hdr_len - 4);
*ptr = tun_id;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
index a184d739d5f8..e4e487c8431b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
@@ -90,7 +90,7 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
const struct vxlan_metadata *md;
struct vxlanhdr *vxh;
- if ((tun_key->tun_flags & TUNNEL_VXLAN_OPT) &&
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_key->tun_flags) &&
e->tun_info->options_len != sizeof(*md))
return -EOPNOTSUPP;
vxh = (struct vxlanhdr *)((char *)udp + sizeof(struct udphdr));
@@ -99,7 +99,7 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
udp->dest = tun_key->tp_dst;
vxh->vx_flags = VXLAN_HF_VNI;
vxh->vx_vni = vxlan_vni_field(tun_id);
- if (tun_key->tun_flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_key->tun_flags)) {
md = ip_tunnel_info_opts(e->tun_info);
vxlan_build_gbp_hdr(vxh, md);
}
@@ -125,7 +125,7 @@ static int mlx5e_tc_tun_parse_vxlan_gbp_option(struct mlx5e_priv *priv,
return -EOPNOTSUPP;
}
- if (enc_opts.key->dst_opt_type != TUNNEL_VXLAN_OPT) {
+ if (enc_opts.key->dst_opt_type != IP_TUNNEL_VXLAN_OPT_BIT) {
NL_SET_ERR_MSG_MOD(extack, "Wrong VxLAN option type: not GBP");
return -EOPNOTSUPP;
}
@@ -208,7 +208,8 @@ static int mlx5e_tc_tun_parse_vxlan(struct mlx5e_priv *priv,
static bool mlx5e_tc_tun_encap_info_equal_vxlan(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b)
{
- return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_VXLAN_OPT);
+ return mlx5e_tc_tun_encap_info_equal_options(a, b,
+ IP_TUNNEL_VXLAN_OPT_BIT);
}
static int mlx5e_tc_tun_get_remote_ifindex(struct net_device *mirred_dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 9fb2c057bd78..85fbdd3da657 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5464,6 +5464,7 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct tunnel_match_enc_opts enc_opts = {};
struct mlx5_rep_uplink_priv *uplink_priv;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct mlx5e_rep_priv *uplink_rpriv;
struct metadata_dst *tun_dst;
struct tunnel_match_key key;
@@ -5471,6 +5472,8 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
struct net_device *dev;
int err;
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+
enc_opts_id = tunnel_id & ENC_OPTS_BITS_MASK;
tun_id = tunnel_id >> ENC_OPTS_BITS;
@@ -5503,14 +5506,14 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
case FLOW_DISSECTOR_KEY_IPV4_ADDRS:
tun_dst = __ip_tun_set_dst(key.enc_ipv4.src, key.enc_ipv4.dst,
key.enc_ip.tos, key.enc_ip.ttl,
- key.enc_tp.dst, TUNNEL_KEY,
+ key.enc_tp.dst, flags,
key32_to_tunnel_id(key.enc_key_id.keyid),
enc_opts.key.len);
break;
case FLOW_DISSECTOR_KEY_IPV6_ADDRS:
tun_dst = __ipv6_tun_set_dst(&key.enc_ipv6.src, &key.enc_ipv6.dst,
key.enc_ip.tos, key.enc_ip.ttl,
- key.enc_tp.dst, 0, TUNNEL_KEY,
+ key.enc_tp.dst, 0, flags,
key32_to_tunnel_id(key.enc_key_id.keyid),
enc_opts.key.len);
break;
@@ -5528,11 +5531,16 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
tun_dst->u.tun_info.key.tp_src = key.enc_tp.src;
- if (enc_opts.key.len)
+ if (enc_opts.key.len) {
+ ip_tunnel_flags_zero(flags);
+ if (enc_opts.key.dst_opt_type)
+ __set_bit(enc_opts.key.dst_opt_type, flags);
+
ip_tunnel_info_opts_set(&tun_dst->u.tun_info,
enc_opts.key.data,
enc_opts.key.len,
- enc_opts.key.dst_opt_type);
+ flags);
+ }
skb_dst_set(skb, (struct dst_entry *)tun_dst);
dev = dev_get_by_index(&init_net, key.filter_ifindex);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
index d67df358a52f..d761a1235994 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
@@ -27,23 +27,23 @@ mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev)
static bool
mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm_kern *parms)
{
- return !!(parms->i_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags);
}
static bool mlxsw_sp_ipip_parms6_has_ikey(const struct __ip6_tnl_parm *parms)
{
- return !!(parms->i_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags);
}
static bool
mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm_kern *parms)
{
- return !!(parms->o_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->o_flags);
}
static bool mlxsw_sp_ipip_parms6_has_okey(const struct __ip6_tnl_parm *parms)
{
- return !!(parms->o_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->o_flags);
}
static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm_kern *parms)
@@ -242,12 +242,15 @@ static bool mlxsw_sp_ipip_can_offload_gre4(const struct mlxsw_sp *mlxsw_sp,
const struct net_device *ol_dev)
{
struct ip_tunnel *tunnel = netdev_priv(ol_dev);
- __be16 okflags = TUNNEL_KEY; /* We can't offload any other features. */
bool inherit_ttl = tunnel->parms.iph.ttl == 0;
bool inherit_tos = tunnel->parms.iph.tos & 0x1;
+ IP_TUNNEL_DECLARE_FLAGS(okflags) = { };
- return (tunnel->parms.i_flags & ~okflags) == 0 &&
- (tunnel->parms.o_flags & ~okflags) == 0 &&
+ /* We can't offload any other features. */
+ __set_bit(IP_TUNNEL_KEY_BIT, okflags);
+
+ return ip_tunnel_flags_subset(tunnel->parms.i_flags, okflags) &&
+ ip_tunnel_flags_subset(tunnel->parms.o_flags, okflags) &&
inherit_ttl && inherit_tos &&
mlxsw_sp_ipip_tunnel_complete(MLXSW_SP_L3_PROTO_IPV4, ol_dev);
}
@@ -443,10 +446,13 @@ static bool mlxsw_sp_ipip_can_offload_gre6(const struct mlxsw_sp *mlxsw_sp,
struct __ip6_tnl_parm tparm = mlxsw_sp_ipip_netdev_parms6(ol_dev);
bool inherit_tos = tparm.flags & IP6_TNL_F_USE_ORIG_TCLASS;
bool inherit_ttl = tparm.hop_limit == 0;
- __be16 okflags = TUNNEL_KEY; /* We can't offload any other features. */
+ IP_TUNNEL_DECLARE_FLAGS(okflags) = { };
+
+ /* We can't offload any other features. */
+ __set_bit(IP_TUNNEL_KEY_BIT, okflags);
- return (tparm.i_flags & ~okflags) == 0 &&
- (tparm.o_flags & ~okflags) == 0 &&
+ return ip_tunnel_flags_subset(tparm.i_flags, okflags) &&
+ ip_tunnel_flags_subset(tparm.o_flags, okflags) &&
inherit_ttl && inherit_tos &&
mlxsw_sp_ipip_tunnel_complete(MLXSW_SP_L3_PROTO_IPV6, ol_dev);
}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index f0cbc6b9b041..3de69b2eb5c4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -461,7 +461,8 @@ mlxsw_sp_span_entry_gretap4_parms(struct mlxsw_sp *mlxsw_sp,
if (!(to_dev->flags & IFF_UP) ||
/* Reject tunnels with GRE keys, checksums, etc. */
- tparm.i_flags || tparm.o_flags ||
+ !ip_tunnel_flags_empty(tparm.i_flags) ||
+ !ip_tunnel_flags_empty(tparm.o_flags) ||
/* Require a fixed TTL and a TOS copied from the mirrored packet. */
inherit_ttl || !inherit_tos ||
/* A destination address may not be "any". */
@@ -565,7 +566,8 @@ mlxsw_sp_span_entry_gretap6_parms(struct mlxsw_sp *mlxsw_sp,
if (!(to_dev->flags & IFF_UP) ||
/* Reject tunnels with GRE keys, checksums, etc. */
- tparm.i_flags || tparm.o_flags ||
+ !ip_tunnel_flags_empty(tparm.i_flags) ||
+ !ip_tunnel_flags_empty(tparm.o_flags) ||
/* Require a fixed TTL and a TOS copied from the mirrored packet. */
inherit_ttl || !inherit_tos ||
/* A destination address may not be "any". */
diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 2b383d92d7f5..db6e060ad882 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -396,6 +396,17 @@ nfp_fl_push_geneve_options(struct nfp_fl_payload *nfp_fl, int *list_len,
return 0;
}
+#define NFP_FL_CHECK(flag) ({ \
+ IP_TUNNEL_DECLARE_FLAGS(__check) = { }; \
+ __be16 __res; \
+ \
+ __set_bit(IP_TUNNEL_##flag##_BIT, __check); \
+ __res = ip_tunnel_flags_to_be16(__check); \
+ \
+ BUILD_BUG_ON(__builtin_constant_p(__res) && \
+ NFP_FL_TUNNEL_##flag != __res); \
+})
+
static int
nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
const struct flow_action_entry *act,
@@ -410,6 +421,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
u32 tmp_set_ip_tun_type_index = 0;
/* Currently support one pre-tunnel so index is always 0. */
int pretun_idx = 0;
+ __be16 tun_flags;
if (!IS_ENABLED(CONFIG_IPV6) && ipv6)
return -EOPNOTSUPP;
@@ -417,9 +429,10 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
if (ipv6 && !(priv->flower_ext_feats & NFP_FL_FEATS_IPV6_TUN))
return -EOPNOTSUPP;
- BUILD_BUG_ON(NFP_FL_TUNNEL_CSUM != TUNNEL_CSUM ||
- NFP_FL_TUNNEL_KEY != TUNNEL_KEY ||
- NFP_FL_TUNNEL_GENEVE_OPT != TUNNEL_GENEVE_OPT);
+ NFP_FL_CHECK(CSUM);
+ NFP_FL_CHECK(KEY);
+ NFP_FL_CHECK(GENEVE_OPT);
+
if (ip_tun->options_len &&
(tun_type != NFP_FL_TUNNEL_GENEVE ||
!(priv->flower_ext_feats & NFP_FL_FEATS_GENEVE_OPT))) {
@@ -427,7 +440,9 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
return -EOPNOTSUPP;
}
- if (ip_tun->key.tun_flags & ~NFP_FL_SUPPORTED_UDP_TUN_FLAGS) {
+ tun_flags = ip_tunnel_flags_to_be16(ip_tun->key.tun_flags);
+ if (!ip_tunnel_flags_is_be16_compat(ip_tun->key.tun_flags) ||
+ (tun_flags & ~NFP_FL_SUPPORTED_UDP_TUN_FLAGS)) {
NL_SET_ERR_MSG_MOD(extack,
"unsupported offload: loaded firmware does not support tunnel flag offload");
return -EOPNOTSUPP;
@@ -442,7 +457,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
FIELD_PREP(NFP_FL_PRE_TUN_INDEX, pretun_idx);
set_tun->tun_type_index = cpu_to_be32(tmp_set_ip_tun_type_index);
- if (ip_tun->key.tun_flags & NFP_FL_TUNNEL_KEY)
+ if (tun_flags & NFP_FL_TUNNEL_KEY)
set_tun->tun_id = ip_tun->key.tun_id;
if (ip_tun->key.ttl) {
@@ -486,7 +501,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
}
set_tun->tos = ip_tun->key.tos;
- set_tun->tun_flags = ip_tun->key.tun_flags;
+ set_tun->tun_flags = tun_flags;
if (tun_type == NFP_FL_TUNNEL_GENEVE) {
set_tun->tun_proto = htons(ETH_P_TEB);
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 32c51c244153..e2b770a6748f 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -225,10 +225,11 @@ static void geneve_rx(struct geneve_dev *geneve, struct geneve_sock *gs,
void *oiph;
if (ip_tunnel_collect_metadata() || gs->collect_md) {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
- flags = TUNNEL_KEY | (gnvh->oam ? TUNNEL_OAM : 0) |
- (gnvh->critical ? TUNNEL_CRIT_OPT : 0);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __assign_bit(IP_TUNNEL_OAM_BIT, flags, gnvh->oam);
+ __assign_bit(IP_TUNNEL_CRIT_OPT_BIT, flags, gnvh->critical);
tun_dst = udp_tun_rx_dst(skb, geneve_get_sk_family(gs), flags,
vni_to_tunnel_id(gnvh->vni),
@@ -238,9 +239,11 @@ static void geneve_rx(struct geneve_dev *geneve, struct geneve_sock *gs,
goto drop;
}
/* Update tunnel dst according to Geneve options. */
+ ip_tunnel_flags_zero(flags);
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, flags);
ip_tunnel_info_opts_set(&tun_dst->u.tun_info,
gnvh->options, gnvh->opt_len * 4,
- TUNNEL_GENEVE_OPT);
+ flags);
} else {
/* Drop packets w/ critical options,
* since we don't support any...
@@ -738,14 +741,15 @@ static void geneve_build_header(struct genevehdr *geneveh,
{
geneveh->ver = GENEVE_VER;
geneveh->opt_len = info->options_len / 4;
- geneveh->oam = !!(info->key.tun_flags & TUNNEL_OAM);
- geneveh->critical = !!(info->key.tun_flags & TUNNEL_CRIT_OPT);
+ geneveh->oam = test_bit(IP_TUNNEL_OAM_BIT, info->key.tun_flags);
+ geneveh->critical = test_bit(IP_TUNNEL_CRIT_OPT_BIT,
+ info->key.tun_flags);
geneveh->rsvd1 = 0;
tunnel_id_to_vni(info->key.tun_id, geneveh->vni);
geneveh->proto_type = inner_proto;
geneveh->rsvd2 = 0;
- if (info->key.tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags))
ip_tunnel_info_opts_get(geneveh->options, info);
}
@@ -754,7 +758,7 @@ static int geneve_build_skb(struct dst_entry *dst, struct sk_buff *skb,
bool xnet, int ip_hdr_len,
bool inner_proto_inherit)
{
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
struct genevehdr *gnvh;
__be16 inner_proto;
int min_headroom;
@@ -871,7 +875,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
if (geneve->cfg.collect_md) {
ttl = key->ttl;
- df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
+ df = test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) ?
+ htons(IP_DF) : 0;
} else {
if (geneve->cfg.ttl_inherit)
ttl = ip_tunnel_get_ttl(ip_hdr(skb), skb);
@@ -903,7 +908,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, saddr, info->key.u.ipv4.dst,
tos, ttl, df, sport, geneve->cfg.info.key.tp_dst,
!net_eq(geneve->net, dev_net(geneve->dev)),
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
}
@@ -991,7 +997,8 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
&saddr, &key->u.ipv6.dst, prio, ttl,
info->key.label, sport, geneve->cfg.info.key.tp_dst,
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
}
#endif
@@ -1290,7 +1297,8 @@ static struct geneve_dev *geneve_find_dev(struct geneve_net *gn,
static bool is_tnl_info_zero(const struct ip_tunnel_info *info)
{
- return !(info->key.tun_id || info->key.tun_flags || info->key.tos ||
+ return !(info->key.tun_id || info->key.tos ||
+ !ip_tunnel_flags_empty(info->key.tun_flags) ||
info->key.ttl || info->key.label || info->key.tp_src ||
memchr_inv(&info->key.u, 0, sizeof(info->key.u)));
}
@@ -1428,7 +1436,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
"Remote IPv6 address cannot be Multicast");
return -EINVAL;
}
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
cfg->use_udp6_rx_checksums = true;
#else
NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_REMOTE6],
@@ -1503,7 +1511,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
goto change_notsup;
}
if (nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
}
if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]) {
@@ -1513,7 +1521,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
goto change_notsup;
}
if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
- info->key.tun_flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
#else
NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX],
"IPv6 support not enabled in the kernel");
@@ -1746,7 +1754,8 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
info->key.u.ipv4.dst))
goto nla_put_failure;
if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
- !!(info->key.tun_flags & TUNNEL_CSUM)))
+ test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags)))
goto nla_put_failure;
#if IS_ENABLED(CONFIG_IPV6)
@@ -1755,7 +1764,8 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
&info->key.u.ipv6.dst))
goto nla_put_failure;
if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
- !(info->key.tun_flags & TUNNEL_CSUM)))
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags)))
goto nla_put_failure;
#endif
}
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 16106e088c63..f27082ed8ede 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1584,7 +1584,8 @@ static void vxlan_parse_gbp_hdr(struct vxlanhdr *unparsed,
tun_dst = (struct metadata_dst *)skb_dst(skb);
if (tun_dst) {
- tun_dst->u.tun_info.key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT,
+ tun_dst->u.tun_info.key.tun_flags);
tun_dst->u.tun_info.options_len = sizeof(*md);
}
if (gbp->dont_learn)
@@ -1716,9 +1717,11 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
goto drop;
if (vxlan_collect_metadata(vs)) {
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tun_dst;
- tun_dst = udp_tun_rx_dst(skb, vxlan_get_sk_family(vs), TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tun_dst = udp_tun_rx_dst(skb, vxlan_get_sk_family(vs), flags,
key32_to_tunnel_id(vni), sizeof(*md));
if (!tun_dst)
@@ -2403,14 +2406,14 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
vni = tunnel_id_to_key32(info->key.tun_id);
ifindex = 0;
dst_cache = &info->dst_cache;
- if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
if (info->options_len < sizeof(*md))
goto drop;
md = ip_tunnel_info_opts(info);
}
ttl = info->key.ttl;
tos = info->key.tos;
- udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
+ udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
}
src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
vxlan->cfg.port_max, true);
@@ -2451,7 +2454,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
old_iph->frag_off & htons(IP_DF)))
df = htons(IP_DF);
}
- } else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
+ } else if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT,
+ info->key.tun_flags)) {
df = htons(IP_DF);
}
diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c
index 81833ca7a2c7..a966a6ec8263 100644
--- a/net/bridge/br_vlan_tunnel.c
+++ b/net/bridge/br_vlan_tunnel.c
@@ -65,13 +65,14 @@ static int __vlan_tunnel_info_add(struct net_bridge_vlan_group *vg,
{
struct metadata_dst *metadata = rtnl_dereference(vlan->tinfo.tunnel_dst);
__be64 key = key32_to_tunnel_id(cpu_to_be32(tun_id));
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
int err;
if (metadata)
return -EEXIST;
- metadata = __ip_tun_set_dst(0, 0, 0, 0, 0, TUNNEL_KEY,
- key, 0);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ metadata = __ip_tun_set_dst(0, 0, 0, 0, 0, flags, key, 0);
if (!metadata)
return -EINVAL;
@@ -185,6 +186,7 @@ void br_handle_ingress_vlan_tunnel(struct sk_buff *skb,
int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
struct net_bridge_vlan *vlan)
{
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tunnel_dst;
__be64 tunnel_id;
int err;
@@ -202,7 +204,8 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
return err;
if (BR_INPUT_SKB_CB(skb)->backup_nhid) {
- tunnel_dst = __ip_tun_set_dst(0, 0, 0, 0, 0, TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tunnel_dst = __ip_tun_set_dst(0, 0, 0, 0, 0, flags,
tunnel_id, 0);
if (!tunnel_dst)
return -ENOMEM;
diff --git a/net/core/filter.c b/net/core/filter.c
index 358870408a51..7898ee7643bc 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4662,7 +4662,7 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
to->tunnel_tos = info->key.tos;
to->tunnel_ttl = info->key.ttl;
if (flags & BPF_F_TUNINFO_FLAGS)
- to->tunnel_flags = info->key.tun_flags;
+ to->tunnel_flags = ip_tunnel_flags_to_be16(info->key.tun_flags);
else
to->tunnel_ext = 0;
@@ -4705,7 +4705,7 @@ BPF_CALL_3(bpf_skb_get_tunnel_opt, struct sk_buff *, skb, u8 *, to, u32, size)
int err;
if (unlikely(!info ||
- !(info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))) {
+ !ip_tunnel_is_options_present(info->key.tun_flags))) {
err = -ENOENT;
goto err_clear;
}
@@ -4775,15 +4775,15 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
memset(info, 0, sizeof(*info));
info->mode = IP_TUNNEL_INFO_TX;
- info->key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
- if (flags & BPF_F_DONT_FRAGMENT)
- info->key.tun_flags |= TUNNEL_DONT_FRAGMENT;
- if (flags & BPF_F_ZERO_CSUM_TX)
- info->key.tun_flags &= ~TUNNEL_CSUM;
- if (flags & BPF_F_SEQ_NUMBER)
- info->key.tun_flags |= TUNNEL_SEQ;
- if (flags & BPF_F_NO_TUNNEL_KEY)
- info->key.tun_flags &= ~TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_NOCACHE_BIT, info->key.tun_flags);
+ __assign_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, info->key.tun_flags,
+ flags & BPF_F_DONT_FRAGMENT);
+ __assign_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags,
+ !(flags & BPF_F_ZERO_CSUM_TX));
+ __assign_bit(IP_TUNNEL_SEQ_BIT, info->key.tun_flags,
+ flags & BPF_F_SEQ_NUMBER);
+ __assign_bit(IP_TUNNEL_KEY_BIT, info->key.tun_flags,
+ !(flags & BPF_F_NO_TUNNEL_KEY));
info->key.tun_id = cpu_to_be64(from->tunnel_id);
info->key.tos = from->tunnel_tos;
@@ -4821,13 +4821,15 @@ BPF_CALL_3(bpf_skb_set_tunnel_opt, struct sk_buff *, skb,
{
struct ip_tunnel_info *info = skb_tunnel_info(skb);
const struct metadata_dst *md = this_cpu_ptr(md_dst);
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
if (unlikely(info != &md->u.tun_info || (size & (sizeof(u32) - 1))))
return -EINVAL;
if (unlikely(size > IP_TUNNEL_OPTS_MAX))
return -ENOMEM;
- ip_tunnel_info_opts_set(info, from, size, TUNNEL_OPTIONS_PRESENT);
+ ip_tunnel_set_options_present(present);
+ ip_tunnel_info_opts_set(info, from, size, present);
return 0;
}
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 272f09251343..f82e9a7d3b37 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -455,17 +455,25 @@ skb_flow_dissect_tunnel_info(const struct sk_buff *skb,
if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_ENC_OPTS)) {
struct flow_dissector_key_enc_opts *enc_opt;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+ u32 val;
enc_opt = skb_flow_dissector_target(flow_dissector,
FLOW_DISSECTOR_KEY_ENC_OPTS,
target_container);
- if (info->options_len) {
- enc_opt->len = info->options_len;
- ip_tunnel_info_opts_get(enc_opt->data, info);
- enc_opt->dst_opt_type = info->key.tun_flags &
- TUNNEL_OPTIONS_PRESENT;
- }
+ if (!info->options_len)
+ return;
+
+ enc_opt->len = info->options_len;
+ ip_tunnel_info_opts_get(enc_opt->data, info);
+
+ ip_tunnel_set_options_present(flags);
+ ip_tunnel_flags_and(flags, info->key.tun_flags, flags);
+
+ val = find_next_bit(flags, __IP_TUNNEL_FLAG_NUM,
+ IP_TUNNEL_GENEVE_OPT_BIT);
+ enc_opt->dst_opt_type = val < __IP_TUNNEL_FLAG_NUM ? val : 0;
}
}
EXPORT_SYMBOL(skb_flow_dissect_tunnel_info);
diff --git a/net/ipv4/fou_bpf.c b/net/ipv4/fou_bpf.c
index 4da03bf45c9b..a06b2e3f3d2a 100644
--- a/net/ipv4/fou_bpf.c
+++ b/net/ipv4/fou_bpf.c
@@ -64,7 +64,7 @@ __bpf_kfunc int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx,
info->encap.type = TUNNEL_ENCAP_NONE;
}
- if (info->key.tun_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags))
info->encap.flags |= TUNNEL_ENCAP_FLAG_CSUM;
info->encap.sport = encap->sport;
diff --git a/net/ipv4/gre_demux.c b/net/ipv4/gre_demux.c
index cbb2b4bb0dfa..01765891f82b 100644
--- a/net/ipv4/gre_demux.c
+++ b/net/ipv4/gre_demux.c
@@ -73,7 +73,7 @@ int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
if (unlikely(greh->flags & (GRE_VERSION | GRE_ROUTING)))
return -EINVAL;
- tpi->flags = gre_flags_to_tnl_flags(greh->flags);
+ gre_flags_to_tnl_flags(tpi->flags, greh->flags);
hdr_len = gre_calc_hlen(tpi->flags);
if (!pskb_may_pull(skb, nhs + hdr_len))
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 04862874dd43..1b78881f027d 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -265,6 +265,7 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
struct net *net = dev_net(skb->dev);
struct metadata_dst *tun_dst = NULL;
struct erspan_base_hdr *ershdr;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct ip_tunnel_net *itn;
struct ip_tunnel *tunnel;
const struct iphdr *iph;
@@ -272,18 +273,20 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
int ver;
int len;
+ ip_tunnel_flags_copy(flags, tpi->flags);
+
itn = net_generic(net, erspan_net_id);
iph = ip_hdr(skb);
if (is_erspan_type1(gre_hdr_len)) {
ver = 0;
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex,
- tpi->flags | TUNNEL_NO_KEY,
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, 0);
} else {
ershdr = (struct erspan_base_hdr *)(skb->data + gre_hdr_len);
ver = ershdr->ver;
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex,
- tpi->flags | TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, tpi->key);
}
@@ -307,10 +310,9 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
struct ip_tunnel_info *info;
unsigned char *gh;
__be64 tun_id;
- __be16 flags;
- tpi->flags |= TUNNEL_KEY;
- flags = tpi->flags;
+ __set_bit(IP_TUNNEL_KEY_BIT, tpi->flags);
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);
tun_dst = ip_tun_rx_dst(skb, flags,
@@ -333,7 +335,8 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
ERSPAN_V2_MDSIZE);
info = &tun_dst->u.tun_info;
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ info->key.tun_flags);
info->options_len = sizeof(*md);
}
@@ -376,10 +379,13 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
tnl_params = &tunnel->parms.iph;
if (tunnel->collect_md || tnl_params->daddr == 0) {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
__be64 tun_id;
- flags = tpi->flags & (TUNNEL_CSUM | TUNNEL_KEY);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ ip_tunnel_flags_and(flags, tpi->flags, flags);
+
tun_id = key32_to_tunnel_id(tpi->key);
tun_dst = ip_tun_rx_dst(skb, flags, tun_id, 0);
if (!tun_dst)
@@ -459,12 +465,15 @@ static void __gre_xmit(struct sk_buff *skb, struct net_device *dev,
__be16 proto)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- __be16 flags = tunnel->parms.o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+
+ ip_tunnel_flags_copy(flags, tunnel->parms.o_flags);
/* Push GRE header. */
gre_build_header(skb, tunnel->tun_hlen,
flags, proto, tunnel->parms.o_key,
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
ip_tunnel_xmit(skb, dev, tnl_params, tnl_params->protocol);
}
@@ -478,10 +487,10 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
__be16 proto)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
int tunnel_hlen;
- __be16 flags;
tun_info = skb_tunnel_info(skb);
if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
@@ -495,14 +504,19 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
goto err_free_skb;
/* Push Tunnel header. */
- if (gre_handle_offloads(skb, !!(tun_info->key.tun_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto err_free_skb;
- flags = tun_info->key.tun_flags &
- (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ ip_tunnel_flags_and(flags, tun_info->key.tun_flags, flags);
+
gre_build_header(skb, tunnel_hlen, flags, proto,
tunnel_id_to_key32(tun_info->key.tun_id),
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);
@@ -516,6 +530,7 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
struct erspan_metadata *md;
@@ -531,7 +546,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
goto err_free_skb;
key = &tun_info->key;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
+ if (!test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_info->key.tun_flags))
goto err_free_skb;
if (tun_info->options_len < sizeof(*md))
goto err_free_skb;
@@ -584,8 +599,9 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
goto err_free_skb;
}
- gre_build_header(skb, 8, TUNNEL_SEQ,
- proto, 0, htonl(atomic_fetch_inc(&tunnel->o_seqno)));
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ gre_build_header(skb, 8, flags, proto, 0,
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)));
ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);
@@ -659,7 +675,8 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
tnl_params = &tunnel->parms.iph;
}
- if (gre_handle_offloads(skb, !!(tunnel->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto free_skb;
__gre_xmit(skb, dev, tnl_params, skb->protocol);
@@ -701,7 +718,7 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
/* Push ERSPAN header */
if (tunnel->erspan_ver == 0) {
proto = htons(ETH_P_ERSPAN);
- tunnel->parms.o_flags &= ~TUNNEL_SEQ;
+ __clear_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags);
} else if (tunnel->erspan_ver == 1) {
erspan_build_header(skb, ntohl(tunnel->parms.o_key),
tunnel->index,
@@ -716,7 +733,7 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
goto free_skb;
}
- tunnel->parms.o_flags &= ~TUNNEL_KEY;
+ __clear_bit(IP_TUNNEL_KEY_BIT, tunnel->parms.o_flags);
__gre_xmit(skb, dev, &tunnel->parms.iph, proto);
return NETDEV_TX_OK;
@@ -739,7 +756,8 @@ static netdev_tx_t gre_tap_xmit(struct sk_buff *skb,
return NETDEV_TX_OK;
}
- if (gre_handle_offloads(skb, !!(tunnel->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto free_skb;
if (skb_cow_head(skb, dev->needed_headroom))
@@ -757,7 +775,6 @@ static netdev_tx_t gre_tap_xmit(struct sk_buff *skb,
static void ipgre_link_update(struct net_device *dev, bool set_mtu)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- __be16 flags;
int len;
len = tunnel->tun_hlen;
@@ -773,10 +790,9 @@ static void ipgre_link_update(struct net_device *dev, bool set_mtu)
if (set_mtu)
dev->mtu = max_t(int, dev->mtu - len, 68);
- flags = tunnel->parms.o_flags;
-
- if (flags & TUNNEL_SEQ ||
- (flags & TUNNEL_CSUM && tunnel->encap.type != TUNNEL_ENCAP_NONE)) {
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags) ||
+ (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.o_flags) &&
+ tunnel->encap.type != TUNNEL_ENCAP_NONE)) {
dev->features &= ~NETIF_F_GSO_SOFTWARE;
dev->hw_features &= ~NETIF_F_GSO_SOFTWARE;
} else {
@@ -789,17 +805,25 @@ static int ipgre_tunnel_ctl(struct net_device *dev,
struct ip_tunnel_parm_kern *p,
int cmd)
{
+ __be16 i_flags, o_flags;
int err;
+ if (!ip_tunnel_flags_is_be16_compat(p->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(p->o_flags))
+ return -EOVERFLOW;
+
+ i_flags = ip_tunnel_flags_to_be16(p->i_flags);
+ o_flags = ip_tunnel_flags_to_be16(p->o_flags);
+
if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
if (p->iph.version != 4 || p->iph.protocol != IPPROTO_GRE ||
p->iph.ihl != 5 || (p->iph.frag_off & htons(~IP_DF)) ||
- ((p->i_flags | p->o_flags) & (GRE_VERSION | GRE_ROUTING)))
+ ((i_flags | o_flags) & (GRE_VERSION | GRE_ROUTING)))
return -EINVAL;
}
- p->i_flags = gre_flags_to_tnl_flags(p->i_flags);
- p->o_flags = gre_flags_to_tnl_flags(p->o_flags);
+ gre_flags_to_tnl_flags(p->i_flags, i_flags);
+ gre_flags_to_tnl_flags(p->o_flags, o_flags);
err = ip_tunnel_ctl(dev, p, cmd);
if (err)
@@ -808,15 +832,18 @@ static int ipgre_tunnel_ctl(struct net_device *dev,
if (cmd == SIOCCHGTUNNEL) {
struct ip_tunnel *t = netdev_priv(dev);
- t->parms.i_flags = p->i_flags;
- t->parms.o_flags = p->o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p->i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p->o_flags);
if (strcmp(dev->rtnl_link_ops->kind, "erspan"))
ipgre_link_update(dev, true);
}
- p->i_flags = gre_tnl_flags_to_gre_flags(p->i_flags);
- p->o_flags = gre_tnl_flags_to_gre_flags(p->o_flags);
+ i_flags = gre_tnl_flags_to_gre_flags(p->i_flags);
+ ip_tunnel_flags_from_be16(p->i_flags, i_flags);
+ o_flags = gre_tnl_flags_to_gre_flags(p->o_flags);
+ ip_tunnel_flags_from_be16(p->o_flags, o_flags);
+
return 0;
}
@@ -956,7 +983,6 @@ static void ipgre_tunnel_setup(struct net_device *dev)
static void __gre_tunnel_init(struct net_device *dev)
{
struct ip_tunnel *tunnel;
- __be16 flags;
tunnel = netdev_priv(dev);
tunnel->tun_hlen = gre_calc_hlen(tunnel->parms.o_flags);
@@ -968,14 +994,13 @@ static void __gre_tunnel_init(struct net_device *dev)
dev->features |= GRE_FEATURES | NETIF_F_LLTX;
dev->hw_features |= GRE_FEATURES;
- flags = tunnel->parms.o_flags;
-
/* TCP offload with GRE SEQ is not supported, nor can we support 2
* levels of outer headers requiring an update.
*/
- if (flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags))
return;
- if (flags & TUNNEL_CSUM && tunnel->encap.type != TUNNEL_ENCAP_NONE)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.o_flags) &&
+ tunnel->encap.type != TUNNEL_ENCAP_NONE)
return;
dev->features |= NETIF_F_GSO_SOFTWARE;
@@ -1146,10 +1171,12 @@ static int ipgre_netlink_parms(struct net_device *dev,
parms->link = nla_get_u32(data[IFLA_GRE_LINK]);
if (data[IFLA_GRE_IFLAGS])
- parms->i_flags = gre_flags_to_tnl_flags(nla_get_be16(data[IFLA_GRE_IFLAGS]));
+ gre_flags_to_tnl_flags(parms->i_flags,
+ nla_get_be16(data[IFLA_GRE_IFLAGS]));
if (data[IFLA_GRE_OFLAGS])
- parms->o_flags = gre_flags_to_tnl_flags(nla_get_be16(data[IFLA_GRE_OFLAGS]));
+ gre_flags_to_tnl_flags(parms->o_flags,
+ nla_get_be16(data[IFLA_GRE_OFLAGS]));
if (data[IFLA_GRE_IKEY])
parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
@@ -1409,8 +1436,8 @@ static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[],
if (err < 0)
return err;
- t->parms.i_flags = p.i_flags;
- t->parms.o_flags = p.o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p.i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p.o_flags);
ipgre_link_update(dev, !tb[IFLA_MTU]);
@@ -1438,8 +1465,8 @@ static int erspan_changelink(struct net_device *dev, struct nlattr *tb[],
if (err < 0)
return err;
- t->parms.i_flags = p.i_flags;
- t->parms.o_flags = p.o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p.i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p.o_flags);
return 0;
}
@@ -1496,7 +1523,9 @@ static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
struct ip_tunnel_parm_kern *p = &t->parms;
- __be16 o_flags = p->o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
+
+ ip_tunnel_flags_copy(o_flags, p->o_flags);
if (nla_put_u32(skb, IFLA_GRE_LINK, p->link) ||
nla_put_be16(skb, IFLA_GRE_IFLAGS,
@@ -1544,7 +1573,7 @@ static int erspan_fill_info(struct sk_buff *skb, const struct net_device *dev)
if (t->erspan_ver <= 2) {
if (t->erspan_ver != 0 && !t->collect_md)
- t->parms.o_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, t->parms.o_flags);
if (nla_put_u8(skb, IFLA_GRE_ERSPAN_VER, t->erspan_ver))
goto nla_put_failure;
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 658f46208c8d..15e5a5464d99 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -57,16 +57,12 @@ static unsigned int ip_tunnel_hash(__be32 key, __be32 remote)
}
static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
- __be16 flags, __be32 key)
+ const unsigned long *flags, __be32 key)
{
- if (p->i_flags & TUNNEL_KEY) {
- if (flags & TUNNEL_KEY)
- return key == p->i_key;
- else
- /* key expected, none present */
- return false;
- } else
- return !(flags & TUNNEL_KEY);
+ if (!test_bit(IP_TUNNEL_KEY_BIT, flags))
+ return !test_bit(IP_TUNNEL_KEY_BIT, p->i_flags);
+
+ return test_bit(IP_TUNNEL_KEY_BIT, p->i_flags) && p->i_key == key;
}
/* Fallback tunnel: no source, no destination, no key, no options
@@ -81,7 +77,7 @@ static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
Given src, dst and key, find appropriate for input tunnel.
*/
struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
- int link, __be16 flags,
+ int link, const unsigned long *flags,
__be32 remote, __be32 local,
__be32 key)
{
@@ -144,7 +140,8 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
}
hlist_for_each_entry_rcu(t, head, hash_node) {
- if ((!(flags & TUNNEL_NO_KEY) && t->parms.i_key != key) ||
+ if ((!test_bit(IP_TUNNEL_NO_KEY_BIT, flags) &&
+ t->parms.i_key != key) ||
t->parms.iph.saddr != 0 ||
t->parms.iph.daddr != 0 ||
!(t->dev->flags & IFF_UP))
@@ -183,7 +180,8 @@ static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn,
else
remote = 0;
- if (!(parms->i_flags & TUNNEL_KEY) && (parms->i_flags & VTI_ISVTI))
+ if (!test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags) &&
+ test_bit(IP_TUNNEL_VTI_BIT, parms->i_flags))
i_key = 0;
h = ip_tunnel_hash(i_key, remote);
@@ -212,12 +210,14 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be32 key = parms->i_key;
- __be16 flags = parms->i_flags;
int link = parms->link;
struct ip_tunnel *t = NULL;
struct hlist_head *head = ip_bucket(itn, parms);
+ ip_tunnel_flags_copy(flags, parms->i_flags);
+
hlist_for_each_entry_rcu(t, head, hash_node) {
if (local == t->parms.iph.saddr &&
remote == t->parms.iph.daddr &&
@@ -387,15 +387,15 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
}
#endif
- if ((!(tpi->flags&TUNNEL_CSUM) && (tunnel->parms.i_flags&TUNNEL_CSUM)) ||
- ((tpi->flags&TUNNEL_CSUM) && !(tunnel->parms.i_flags&TUNNEL_CSUM))) {
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.i_flags) !=
+ test_bit(IP_TUNNEL_CSUM_BIT, tpi->flags)) {
DEV_STATS_INC(tunnel->dev, rx_crc_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
goto drop;
}
- if (tunnel->parms.i_flags&TUNNEL_SEQ) {
- if (!(tpi->flags&TUNNEL_SEQ) ||
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.i_flags)) {
+ if (!test_bit(IP_TUNNEL_SEQ_BIT, tpi->flags) ||
(tunnel->i_seqno && (s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) {
DEV_STATS_INC(tunnel->dev, rx_fifo_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
@@ -612,7 +612,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
goto tx_error;
}
- if (key->tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags))
df = htons(IP_DF);
if (tnl_update_pmtu(dev, skb, rt, df, inner_iph, tunnel_hlen,
key->u.ipv4.dst, true)) {
@@ -902,10 +902,10 @@ int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
goto done;
if (p->iph.ttl)
p->iph.frag_off |= htons(IP_DF);
- if (!(p->i_flags & VTI_ISVTI)) {
- if (!(p->i_flags & TUNNEL_KEY))
+ if (!test_bit(IP_TUNNEL_VTI_BIT, p->i_flags)) {
+ if (!test_bit(IP_TUNNEL_KEY_BIT, p->i_flags))
p->i_key = 0;
- if (!(p->o_flags & TUNNEL_KEY))
+ if (!test_bit(IP_TUNNEL_KEY_BIT, p->o_flags))
p->o_key = 0;
}
@@ -990,8 +990,8 @@ bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,
strscpy(kp->name, p.name, sizeof(kp->name));
kp->link = p.link;
- kp->i_flags = p.i_flags;
- kp->o_flags = p.o_flags;
+ ip_tunnel_flags_from_be16(kp->i_flags, p.i_flags);
+ ip_tunnel_flags_from_be16(kp->o_flags, p.o_flags);
kp->i_key = p.i_key;
kp->o_key = p.o_key;
memcpy(&kp->iph, &p.iph, min(sizeof(kp->iph), sizeof(p.iph)));
@@ -1004,10 +1004,14 @@ bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp)
{
struct ip_tunnel_parm p;
+ if (!ip_tunnel_flags_is_be16_compat(kp->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(kp->o_flags))
+ return false;
+
strscpy(p.name, kp->name, sizeof(p.name));
p.link = kp->link;
- p.i_flags = kp->i_flags;
- p.o_flags = kp->o_flags;
+ p.i_flags = ip_tunnel_flags_to_be16(kp->i_flags);
+ p.o_flags = ip_tunnel_flags_to_be16(kp->o_flags);
p.i_key = kp->i_key;
p.o_key = kp->o_key;
memcpy(&p.iph, &kp->iph, min(sizeof(p.iph), sizeof(kp->iph)));
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index f89e3bdef123..36aec14e2c9f 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -125,6 +125,7 @@ EXPORT_SYMBOL_GPL(__iptunnel_pull_header);
struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
gfp_t flags)
{
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags) = { };
struct metadata_dst *res;
struct ip_tunnel_info *dst, *src;
@@ -144,10 +145,10 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
sizeof(struct in6_addr));
else
dst->key.u.ipv4.dst = src->key.u.ipv4.src;
- dst->key.tun_flags = src->key.tun_flags;
+ ip_tunnel_flags_copy(dst->key.tun_flags, src->key.tun_flags);
dst->mode = src->mode | IP_TUNNEL_INFO_TX;
ip_tunnel_info_opts_set(dst, ip_tunnel_info_opts(src),
- src->options_len, 0);
+ src->options_len, tun_flags);
return res;
}
@@ -497,7 +498,7 @@ static int ip_tun_parse_opts_geneve(struct nlattr *attr,
opt->opt_class = nla_get_be16(attr);
attr = tb[LWTUNNEL_IP_OPT_GENEVE_TYPE];
opt->type = nla_get_u8(attr);
- info->key.tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags);
}
return sizeof(struct geneve_opt) + data_len;
@@ -525,7 +526,7 @@ static int ip_tun_parse_opts_vxlan(struct nlattr *attr,
attr = tb[LWTUNNEL_IP_OPT_VXLAN_GBP];
md->gbp = nla_get_u32(attr);
md->gbp &= VXLAN_GBP_MASK;
- info->key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags);
}
return sizeof(struct vxlan_metadata);
@@ -574,7 +575,7 @@ static int ip_tun_parse_opts_erspan(struct nlattr *attr,
set_hwid(&md->u.md2, nla_get_u8(attr));
}
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags);
}
return sizeof(struct erspan_metadata);
@@ -585,7 +586,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
{
int err, rem, opt_len, opts_len = 0;
struct nlattr *nla;
- __be16 type = 0;
+ u32 type = 0;
if (!attr)
return 0;
@@ -598,7 +599,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
nla_for_each_attr(nla, nla_data(attr), nla_len(attr), rem) {
switch (nla_type(nla)) {
case LWTUNNEL_IP_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT)
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT)
return -EINVAL;
opt_len = ip_tun_parse_opts_geneve(nla, info, opts_len,
extack);
@@ -607,7 +608,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
opts_len += opt_len;
if (opts_len > IP_TUNNEL_OPTS_MAX)
return -EINVAL;
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
case LWTUNNEL_IP_OPTS_VXLAN:
if (type)
@@ -617,7 +618,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case LWTUNNEL_IP_OPTS_ERSPAN:
if (type)
@@ -627,7 +628,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
default:
return -EINVAL;
@@ -705,10 +706,16 @@ static int ip_tun_build_state(struct net *net, struct nlattr *attr,
if (tb[LWTUNNEL_IP_TOS])
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP_TOS]);
- if (tb[LWTUNNEL_IP_FLAGS])
- tun_info->key.tun_flags |=
- (nla_get_be16(tb[LWTUNNEL_IP_FLAGS]) &
- ~TUNNEL_OPTIONS_PRESENT);
+ if (tb[LWTUNNEL_IP_FLAGS]) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+
+ ip_tunnel_flags_from_be16(flags,
+ nla_get_be16(tb[LWTUNNEL_IP_FLAGS]));
+ ip_tunnel_clear_options_present(flags);
+
+ ip_tunnel_flags_or(tun_info->key.tun_flags,
+ tun_info->key.tun_flags, flags);
+ }
tun_info->mode = IP_TUNNEL_INFO_TX;
tun_info->options_len = opt_len;
@@ -812,18 +819,18 @@ static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
struct nlattr *nest;
int err = 0;
- if (!(tun_info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))
+ if (!ip_tunnel_is_options_present(tun_info->key.tun_flags))
return 0;
nest = nla_nest_start_noflag(skb, type);
if (!nest)
return -ENOMEM;
- if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_geneve(skb, tun_info);
- else if (tun_info->key.tun_flags & TUNNEL_VXLAN_OPT)
+ else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_vxlan(skb, tun_info);
- else if (tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT)
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_erspan(skb, tun_info);
if (err) {
@@ -846,7 +853,8 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in_addr(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) ||
nla_put_u8(skb, LWTUNNEL_IP_TOS, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags) ||
+ nla_put_be16(skb, LWTUNNEL_IP_FLAGS,
+ ip_tunnel_flags_to_be16(tun_info->key.tun_flags)) ||
ip_tun_fill_encap_opts(skb, LWTUNNEL_IP_OPTS, tun_info))
return -ENOMEM;
@@ -857,11 +865,11 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
{
int opt_len;
- if (!(info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))
+ if (!ip_tunnel_is_options_present(info->key.tun_flags))
return 0;
opt_len = nla_total_size(0); /* LWTUNNEL_IP_OPTS */
- if (info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags)) {
struct geneve_opt *opt;
int offset = 0;
@@ -874,10 +882,10 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
/* OPT_GENEVE_DATA */
offset += sizeof(*opt) + opt->length * 4;
}
- } else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_VXLAN */
+ nla_total_size(4); /* OPT_VXLAN_GBP */
- } else if (info->key.tun_flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags)) {
struct erspan_metadata *md = ip_tunnel_info_opts(info);
opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_ERSPAN */
@@ -984,10 +992,17 @@ static int ip6_tun_build_state(struct net *net, struct nlattr *attr,
if (tb[LWTUNNEL_IP6_TC])
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP6_TC]);
- if (tb[LWTUNNEL_IP6_FLAGS])
- tun_info->key.tun_flags |=
- (nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]) &
- ~TUNNEL_OPTIONS_PRESENT);
+ if (tb[LWTUNNEL_IP6_FLAGS]) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+ __be16 data;
+
+ data = nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
+ ip_tunnel_flags_from_be16(flags, data);
+ ip_tunnel_clear_options_present(flags);
+
+ ip_tunnel_flags_or(tun_info->key.tun_flags,
+ tun_info->key.tun_flags, flags);
+ }
tun_info->mode = IP_TUNNEL_INFO_TX | IP_TUNNEL_INFO_IPV6;
tun_info->options_len = opt_len;
@@ -1008,7 +1023,8 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in6_addr(skb, LWTUNNEL_IP6_SRC, &tun_info->key.u.ipv6.src) ||
nla_put_u8(skb, LWTUNNEL_IP6_TC, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags) ||
+ nla_put_be16(skb, LWTUNNEL_IP6_FLAGS,
+ ip_tunnel_flags_to_be16(tun_info->key.tun_flags)) ||
ip_tun_fill_encap_opts(skb, LWTUNNEL_IP6_OPTS, tun_info))
return -ENOMEM;
@@ -1139,8 +1155,12 @@ void ip_tunnel_netlink_parms(struct nlattr *data[],
if (!data[IFLA_IPTUN_PMTUDISC] || nla_get_u8(data[IFLA_IPTUN_PMTUDISC]))
parms->iph.frag_off = htons(IP_DF);
- if (data[IFLA_IPTUN_FLAGS])
- parms->i_flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]);
+ if (data[IFLA_IPTUN_FLAGS]) {
+ __be16 flags;
+
+ flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]);
+ ip_tunnel_flags_from_be16(parms->i_flags, flags);
+ }
if (data[IFLA_IPTUN_PROTO])
parms->iph.protocol = nla_get_u8(data[IFLA_IPTUN_PROTO]);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 569a54d4bb89..6faab684f70c 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -51,8 +51,11 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
const struct iphdr *iph = ip_hdr(skb);
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, 0);
if (tunnel) {
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
@@ -322,8 +325,11 @@ static int vti4_err(struct sk_buff *skb, u32 info)
const struct iphdr *iph = (const struct iphdr *)skb->data;
int protocol = iph->protocol;
struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->daddr, iph->saddr, 0);
if (!tunnel)
return -1;
@@ -375,6 +381,7 @@ static int vti4_err(struct sk_buff *skb, u32 info)
static int
vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
int err = 0;
if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
@@ -383,20 +390,26 @@ vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
return -EINVAL;
}
- if (!(p->i_flags & GRE_KEY))
+ if (!ip_tunnel_flags_is_be16_compat(p->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(p->o_flags))
+ return -EOVERFLOW;
+
+ if (!(ip_tunnel_flags_to_be16(p->i_flags) & GRE_KEY))
p->i_key = 0;
- if (!(p->o_flags & GRE_KEY))
+ if (!(ip_tunnel_flags_to_be16(p->o_flags) & GRE_KEY))
p->o_key = 0;
- p->i_flags = VTI_ISVTI;
+ __set_bit(IP_TUNNEL_VTI_BIT, flags);
+ ip_tunnel_flags_copy(p->i_flags, flags);
err = ip_tunnel_ctl(dev, p, cmd);
if (err)
return err;
if (cmd != SIOCDELTUNNEL) {
- p->i_flags |= GRE_KEY;
- p->o_flags |= GRE_KEY;
+ ip_tunnel_flags_from_be16(flags, GRE_KEY);
+ ip_tunnel_flags_or(p->i_flags, p->i_flags, flags);
+ ip_tunnel_flags_or(p->o_flags, p->o_flags, flags);
}
return 0;
}
@@ -539,7 +552,7 @@ static void vti_netlink_parms(struct nlattr *data[],
if (!data)
return;
- parms->i_flags = VTI_ISVTI;
+ __set_bit(IP_TUNNEL_VTI_BIT, parms->i_flags);
if (data[IFLA_VTI_LINK])
parms->link = nla_get_u32(data[IFLA_VTI_LINK]);
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 0dd2d3b55c75..11cb4d8a93d7 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -130,13 +130,16 @@ static int ipip_err(struct sk_buff *skb, u32 info)
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, ipip_net_id);
const struct iphdr *iph = (const struct iphdr *)skb->data;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
const int type = icmp_hdr(skb)->type;
const int code = icmp_hdr(skb)->code;
struct ip_tunnel *t;
int err = 0;
- t = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
- iph->daddr, iph->saddr, 0);
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
+ t = ip_tunnel_lookup(itn, skb->dev->ifindex, flags, iph->daddr,
+ iph->saddr, 0);
if (!t) {
err = -ENOENT;
goto out;
@@ -213,13 +216,16 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
{
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, ipip_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tun_dst = NULL;
struct ip_tunnel *tunnel;
const struct iphdr *iph;
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
iph = ip_hdr(skb);
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
- iph->saddr, iph->daddr, 0);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags, iph->saddr,
+ iph->daddr, 0);
if (tunnel) {
const struct tnl_ptk_info *tpi;
@@ -238,7 +244,9 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
if (iptunnel_pull_header(skb, 0, tpi->proto, false))
goto drop;
if (tunnel->collect_md) {
- tun_dst = ip_tun_rx_dst(skb, 0, 0, 0);
+ ip_tunnel_flags_zero(flags);
+
+ tun_dst = ip_tun_rx_dst(skb, flags, 0, 0);
if (!tun_dst)
return 0;
ip_tunnel_md_udp_encap(skb, &tun_dst->u.tun_info);
@@ -340,7 +348,8 @@ ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
}
p->i_key = p->o_key = 0;
- p->i_flags = p->o_flags = 0;
+ ip_tunnel_flags_zero(p->i_flags);
+ ip_tunnel_flags_zero(p->o_flags);
return ip_tunnel_ctl(dev, p, cmd);
}
diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index a87defb2b167..3a7878a165e6 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -183,7 +183,8 @@ void udp_tunnel_sock_release(struct socket *sock)
EXPORT_SYMBOL_GPL(udp_tunnel_sock_release);
struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
- __be16 flags, __be64 tunnel_id, int md_size)
+ const unsigned long *flags,
+ __be64 tunnel_id, int md_size)
{
struct metadata_dst *tun_dst;
struct ip_tunnel_info *info;
@@ -199,7 +200,7 @@ struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
info->key.tp_src = udp_hdr(skb)->source;
info->key.tp_dst = udp_hdr(skb)->dest;
if (udp_hdr(skb)->check)
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
return tun_dst;
}
EXPORT_SYMBOL_GPL(udp_tun_rx_dst);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 070d87abf7c0..e1c126f4a61c 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -496,11 +496,11 @@ static int ip6gre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi)
tpi->proto);
if (tunnel) {
if (tunnel->parms.collect_md) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct metadata_dst *tun_dst;
__be64 tun_id;
- __be16 flags;
- flags = tpi->flags;
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);
tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id, 0);
@@ -548,14 +548,14 @@ static int ip6erspan_rcv(struct sk_buff *skb,
if (tunnel->parms.collect_md) {
struct erspan_metadata *pkt_md, *md;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct metadata_dst *tun_dst;
struct ip_tunnel_info *info;
unsigned char *gh;
__be64 tun_id;
- __be16 flags;
- tpi->flags |= TUNNEL_KEY;
- flags = tpi->flags;
+ __set_bit(IP_TUNNEL_KEY_BIT, tpi->flags);
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);
tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id,
@@ -577,7 +577,8 @@ static int ip6erspan_rcv(struct sk_buff *skb,
md2 = &md->u.md2;
memcpy(md2, pkt_md, ver == 1 ? ERSPAN_V1_MDSIZE :
ERSPAN_V2_MDSIZE);
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ info->key.tun_flags);
info->options_len = sizeof(*md);
ip6_tnl_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error);
@@ -745,8 +746,8 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
__u32 *pmtu, __be16 proto)
{
struct ip6_tnl *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 protocol;
- __be16 flags;
if (dev->type == ARPHRD_ETHER)
IPCB(skb)->flags = 0;
@@ -778,8 +779,11 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
fl6->fl6_gre_key = tunnel_id_to_key32(key->tun_id);
dsfield = key->tos;
- flags = key->tun_flags &
- (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ);
+ ip_tunnel_flags_zero(flags);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ ip_tunnel_flags_and(flags, flags, key->tun_flags);
tun_hlen = gre_calc_hlen(flags);
if (skb_cow_head(skb, dev->needed_headroom ?: tun_hlen + tunnel->encap_hlen))
@@ -788,19 +792,21 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
gre_build_header(skb, tun_hlen,
flags, protocol,
tunnel_id_to_key32(tun_info->key.tun_id),
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno))
- : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) :
+ 0);
} else {
if (skb_cow_head(skb, dev->needed_headroom ?: tunnel->hlen))
return -ENOMEM;
- flags = tunnel->parms.o_flags;
+ ip_tunnel_flags_copy(flags, tunnel->parms.o_flags);
gre_build_header(skb, tunnel->tun_hlen, flags,
protocol, tunnel->parms.o_key,
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno))
- : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) :
+ 0);
}
return ip6_tnl_xmit(skb, dev, dsfield, fl6, encap_limit, pmtu,
@@ -822,7 +828,8 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_ipv4(skb, dev, &fl6,
&dsfield, &encap_limit);
- err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
+ err = gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags));
if (err)
return -1;
@@ -856,7 +863,8 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_ipv6(skb, dev, &fl6, &dsfield, &encap_limit))
return -1;
- if (gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags)))
return -1;
err = __gre6_xmit(skb, dev, dsfield, &fl6, encap_limit,
@@ -883,7 +891,8 @@ static int ip6gre_xmit_other(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_other(skb, dev, &fl6, &dsfield, &encap_limit))
return -1;
- err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
+ err = gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags));
if (err)
return err;
err = __gre6_xmit(skb, dev, dsfield, &fl6, encap_limit, &mtu, skb->protocol);
@@ -936,6 +945,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
struct ip_tunnel_info *tun_info = NULL;
struct ip6_tnl *t = netdev_priv(dev);
struct dst_entry *dst = skb_dst(skb);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
bool truncate = false;
int encap_limit = -1;
__u8 dsfield = false;
@@ -979,7 +989,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
if (skb_cow_head(skb, dev->needed_headroom ?: t->hlen))
goto tx_err;
- t->parms.o_flags &= ~TUNNEL_KEY;
+ __clear_bit(IP_TUNNEL_KEY_BIT, t->parms.o_flags);
IPCB(skb)->flags = 0;
/* For collect_md mode, derive fl6 from the tunnel key,
@@ -1004,7 +1014,8 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
fl6.fl6_gre_key = tunnel_id_to_key32(key->tun_id);
dsfield = key->tos;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
+ if (!test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_info->key.tun_flags))
goto tx_err;
if (tun_info->options_len < sizeof(*md))
goto tx_err;
@@ -1065,7 +1076,9 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
}
/* Push GRE header. */
- gre_build_header(skb, 8, TUNNEL_SEQ, proto, 0, htonl(atomic_fetch_inc(&t->o_seqno)));
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ gre_build_header(skb, 8, flags, proto, 0,
+ htonl(atomic_fetch_inc(&t->o_seqno)));
/* TooBig packet may have updated dst->dev's mtu */
if (!t->parms.collect_md && dst && dst_mtu(dst) > dst->dev->mtu)
@@ -1208,8 +1221,8 @@ static void ip6gre_tnl_copy_tnl_parm(struct ip6_tnl *t,
t->parms.proto = p->proto;
t->parms.i_key = p->i_key;
t->parms.o_key = p->o_key;
- t->parms.i_flags = p->i_flags;
- t->parms.o_flags = p->o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p->i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p->o_flags);
t->parms.fwmark = p->fwmark;
t->parms.erspan_ver = p->erspan_ver;
t->parms.index = p->index;
@@ -1238,8 +1251,8 @@ static void ip6gre_tnl_parm_from_user(struct __ip6_tnl_parm *p,
p->link = u->link;
p->i_key = u->i_key;
p->o_key = u->o_key;
- p->i_flags = gre_flags_to_tnl_flags(u->i_flags);
- p->o_flags = gre_flags_to_tnl_flags(u->o_flags);
+ gre_flags_to_tnl_flags(p->i_flags, u->i_flags);
+ gre_flags_to_tnl_flags(p->o_flags, u->o_flags);
memcpy(p->name, u->name, sizeof(u->name));
}
@@ -1391,7 +1404,7 @@ static int ip6gre_header(struct sk_buff *skb, struct net_device *dev,
ipv6h->daddr = t->parms.raddr;
p = (__be16 *)(ipv6h + 1);
- p[0] = t->parms.o_flags;
+ p[0] = ip_tunnel_flags_to_be16(t->parms.o_flags);
p[1] = htons(type);
/*
@@ -1455,19 +1468,17 @@ static void ip6gre_tunnel_setup(struct net_device *dev)
static void ip6gre_tnl_init_features(struct net_device *dev)
{
struct ip6_tnl *nt = netdev_priv(dev);
- __be16 flags;
dev->features |= GRE6_FEATURES | NETIF_F_LLTX;
dev->hw_features |= GRE6_FEATURES;
- flags = nt->parms.o_flags;
-
/* TCP offload with GRE SEQ is not supported, nor can we support 2
* levels of outer headers requiring an update.
*/
- if (flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, nt->parms.o_flags))
return;
- if (flags & TUNNEL_CSUM && nt->encap.type != TUNNEL_ENCAP_NONE)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, nt->parms.o_flags) &&
+ nt->encap.type != TUNNEL_ENCAP_NONE)
return;
dev->features |= NETIF_F_GSO_SOFTWARE;
@@ -1793,12 +1804,12 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
parms->link = nla_get_u32(data[IFLA_GRE_LINK]);
if (data[IFLA_GRE_IFLAGS])
- parms->i_flags = gre_flags_to_tnl_flags(
- nla_get_be16(data[IFLA_GRE_IFLAGS]));
+ gre_flags_to_tnl_flags(parms->i_flags,
+ nla_get_be16(data[IFLA_GRE_IFLAGS]));
if (data[IFLA_GRE_OFLAGS])
- parms->o_flags = gre_flags_to_tnl_flags(
- nla_get_be16(data[IFLA_GRE_OFLAGS]));
+ gre_flags_to_tnl_flags(parms->o_flags,
+ nla_get_be16(data[IFLA_GRE_OFLAGS]));
if (data[IFLA_GRE_IKEY])
parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
@@ -2144,11 +2155,13 @@ static int ip6gre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip6_tnl *t = netdev_priv(dev);
struct __ip6_tnl_parm *p = &t->parms;
- __be16 o_flags = p->o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
+
+ ip_tunnel_flags_copy(o_flags, p->o_flags);
if (p->erspan_ver == 1 || p->erspan_ver == 2) {
if (!p->collect_md)
- o_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, o_flags);
if (nla_put_u8(skb, IFLA_GRE_ERSPAN_VER, p->erspan_ver))
goto nla_put_failure;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 46c19bd48990..10c71b181474 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -799,17 +799,15 @@ static int __ip6_tnl_rcv(struct ip6_tnl *tunnel, struct sk_buff *skb,
const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
int err;
- if ((!(tpi->flags & TUNNEL_CSUM) &&
- (tunnel->parms.i_flags & TUNNEL_CSUM)) ||
- ((tpi->flags & TUNNEL_CSUM) &&
- !(tunnel->parms.i_flags & TUNNEL_CSUM))) {
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.i_flags) !=
+ test_bit(IP_TUNNEL_CSUM_BIT, tpi->flags)) {
DEV_STATS_INC(tunnel->dev, rx_crc_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
goto drop;
}
- if (tunnel->parms.i_flags & TUNNEL_SEQ) {
- if (!(tpi->flags & TUNNEL_SEQ) ||
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.i_flags)) {
+ if (!test_bit(IP_TUNNEL_SEQ_BIT, tpi->flags) ||
(tunnel->i_seqno &&
(s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) {
DEV_STATS_INC(tunnel->dev, rx_fifo_errors);
@@ -932,7 +930,9 @@ static int ipxip6_rcv(struct sk_buff *skb, u8 ipproto,
if (iptunnel_pull_header(skb, 0, tpi->proto, false))
goto drop;
if (t->parms.collect_md) {
- tun_dst = ipv6_tun_rx_dst(skb, 0, 0, 0);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+
+ tun_dst = ipv6_tun_rx_dst(skb, flags, 0, 0);
if (!tun_dst)
goto drop;
}
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 19af34e4c13c..b8babe787a91 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -207,7 +207,7 @@ static int ipip6_tunnel_create(struct net_device *dev)
__dev_addr_set(dev, &t->parms.iph.saddr, 4);
memcpy(dev->broadcast, &t->parms.iph.daddr, 4);
- if ((__force u16)t->parms.i_flags & SIT_ISATAP)
+ if (test_bit(IP_TUNNEL_SIT_ISATAP_BIT, t->parms.i_flags))
dev->priv_flags |= IFF_ISATAP;
dev->rtnl_link_ops = &sit_link_ops;
@@ -1704,7 +1704,8 @@ static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
nla_put_u8(skb, IFLA_IPTUN_PMTUDISC,
!!(parm->iph.frag_off & htons(IP_DF))) ||
nla_put_u8(skb, IFLA_IPTUN_PROTO, parm->iph.protocol) ||
- nla_put_be16(skb, IFLA_IPTUN_FLAGS, parm->i_flags) ||
+ nla_put_be16(skb, IFLA_IPTUN_FLAGS,
+ ip_tunnel_flags_to_be16(parm->i_flags)) ||
nla_put_u32(skb, IFLA_IPTUN_FWMARK, tunnel->fwmark))
goto nla_put_failure;
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index a2c16b501087..c7a8a08b7308 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1550,6 +1550,7 @@ static int ipvs_gre_decap(struct netns_ipvs *ipvs, struct sk_buff *skb,
if (!dest)
goto unk;
if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 type;
/* Only support version 0 and C (csum) */
@@ -1560,7 +1561,10 @@ static int ipvs_gre_decap(struct netns_ipvs *ipvs, struct sk_buff *skb,
if (type != htons(ETH_P_IP))
goto unk;
*proto = IPPROTO_IPIP;
- return gre_calc_hlen(gre_flags_to_tnl_flags(greh->flags));
+
+ gre_flags_to_tnl_flags(flags, greh->flags);
+
+ return gre_calc_hlen(flags);
}
unk:
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 65e0259178da..39b5fd6bbf65 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -390,10 +390,10 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
skb->ip_summed == CHECKSUM_PARTIAL)
mtu -= GUE_PLEN_REMCSUM + GUE_LEN_PRIV;
} else if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
if (dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
mtu -= gre_calc_hlen(tflags);
}
if (mtu < 68) {
@@ -553,10 +553,10 @@ __ip_vs_get_out_rt_v6(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
skb->ip_summed == CHECKSUM_PARTIAL)
mtu -= GUE_PLEN_REMCSUM + GUE_LEN_PRIV;
} else if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
if (dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
mtu -= gre_calc_hlen(tflags);
}
if (mtu < IPV6_MIN_MTU) {
@@ -1082,11 +1082,11 @@ ipvs_gre_encap(struct net *net, struct sk_buff *skb,
{
__be16 proto = *next_protocol == IPPROTO_IPIP ?
htons(ETH_P_IP) : htons(ETH_P_IPV6);
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t hdrlen;
if (cp->dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
hdrlen = gre_calc_hlen(tflags);
gre_build_header(skb, hdrlen, tflags, proto, 0, 0);
@@ -1165,11 +1165,11 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
max_headroom += sizeof(struct udphdr) + gue_hdrlen;
} else if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t gre_hdrlen;
- __be16 tflags = 0;
if (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
gre_hdrlen = gre_calc_hlen(tflags);
max_headroom += gre_hdrlen;
@@ -1310,11 +1310,11 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
max_headroom += sizeof(struct udphdr) + gue_hdrlen;
} else if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t gre_hdrlen;
- __be16 tflags = 0;
if (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
gre_hdrlen = gre_calc_hlen(tflags);
max_headroom += gre_hdrlen;
diff --git a/net/netfilter/nft_tunnel.c b/net/netfilter/nft_tunnel.c
index 9f21953c7433..9653561aa5c0 100644
--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -174,8 +174,8 @@ struct nft_tunnel_opts {
struct erspan_metadata erspan;
u8 data[IP_TUNNEL_OPTS_MAX];
} u;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
u32 len;
- __be16 flags;
};
struct nft_tunnel_obj {
@@ -271,7 +271,8 @@ static int nft_tunnel_obj_vxlan_init(const struct nlattr *attr,
opts->u.vxlan.gbp = ntohl(nla_get_be32(tb[NFTA_TUNNEL_KEY_VXLAN_GBP]));
opts->len = sizeof(struct vxlan_metadata);
- opts->flags = TUNNEL_VXLAN_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, opts->flags);
return 0;
}
@@ -325,7 +326,8 @@ static int nft_tunnel_obj_erspan_init(const struct nlattr *attr,
opts->u.erspan.version = version;
opts->len = sizeof(struct erspan_metadata);
- opts->flags = TUNNEL_ERSPAN_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, opts->flags);
return 0;
}
@@ -366,7 +368,8 @@ static int nft_tunnel_obj_geneve_init(const struct nlattr *attr,
opt->length = data_len / 4;
opt->opt_class = nla_get_be16(tb[NFTA_TUNNEL_KEY_GENEVE_CLASS]);
opt->type = nla_get_u8(tb[NFTA_TUNNEL_KEY_GENEVE_TYPE]);
- opts->flags = TUNNEL_GENEVE_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, opts->flags);
return 0;
}
@@ -385,8 +388,8 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
struct nft_tunnel_opts *opts)
{
struct nlattr *nla;
- __be16 type = 0;
int err, rem;
+ u32 type = 0;
err = nla_validate_nested_deprecated(attr, NFTA_TUNNEL_KEY_OPTS_MAX,
nft_tunnel_opts_policy, NULL);
@@ -401,7 +404,7 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
err = nft_tunnel_obj_vxlan_init(nla, opts);
if (err)
return err;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case NFTA_TUNNEL_KEY_OPTS_ERSPAN:
if (type)
@@ -409,15 +412,15 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
err = nft_tunnel_obj_erspan_init(nla, opts);
if (err)
return err;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
case NFTA_TUNNEL_KEY_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT)
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT)
return -EINVAL;
err = nft_tunnel_obj_geneve_init(nla, opts);
if (err)
return err;
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
default:
return -EOPNOTSUPP;
@@ -454,7 +457,9 @@ static int nft_tunnel_obj_init(const struct nft_ctx *ctx,
memset(&info, 0, sizeof(info));
info.mode = IP_TUNNEL_INFO_TX;
info.key.tun_id = key32_to_tunnel_id(nla_get_be32(tb[NFTA_TUNNEL_KEY_ID]));
- info.key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
+ __set_bit(IP_TUNNEL_KEY_BIT, info.key.tun_flags);
+ __set_bit(IP_TUNNEL_CSUM_BIT, info.key.tun_flags);
+ __set_bit(IP_TUNNEL_NOCACHE_BIT, info.key.tun_flags);
if (tb[NFTA_TUNNEL_KEY_IP]) {
err = nft_tunnel_obj_ip_init(ctx, tb[NFTA_TUNNEL_KEY_IP], &info);
@@ -483,11 +488,12 @@ static int nft_tunnel_obj_init(const struct nft_ctx *ctx,
return -EOPNOTSUPP;
if (tun_flags & NFT_TUNNEL_F_ZERO_CSUM_TX)
- info.key.tun_flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, info.key.tun_flags);
if (tun_flags & NFT_TUNNEL_F_DONT_FRAGMENT)
- info.key.tun_flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT,
+ info.key.tun_flags);
if (tun_flags & NFT_TUNNEL_F_SEQ_NUMBER)
- info.key.tun_flags |= TUNNEL_SEQ;
+ __set_bit(IP_TUNNEL_SEQ_BIT, info.key.tun_flags);
}
if (tb[NFTA_TUNNEL_KEY_TOS])
info.key.tos = nla_get_u8(tb[NFTA_TUNNEL_KEY_TOS]);
@@ -583,7 +589,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
if (!nest)
return -1;
- if (opts->flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, opts->flags)) {
inner = nla_nest_start_noflag(skb, NFTA_TUNNEL_KEY_OPTS_VXLAN);
if (!inner)
goto failure;
@@ -591,7 +597,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
htonl(opts->u.vxlan.gbp)))
goto inner_failure;
nla_nest_end(skb, inner);
- } else if (opts->flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, opts->flags)) {
inner = nla_nest_start_noflag(skb, NFTA_TUNNEL_KEY_OPTS_ERSPAN);
if (!inner)
goto failure;
@@ -613,7 +619,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
break;
}
nla_nest_end(skb, inner);
- } else if (opts->flags & TUNNEL_GENEVE_OPT) {
+ } else if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, opts->flags)) {
struct geneve_opt *opt;
int offset = 0;
@@ -658,11 +664,11 @@ static int nft_tunnel_flags_dump(struct sk_buff *skb,
{
u32 flags = 0;
- if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_DONT_FRAGMENT;
- if (!(info->key.tun_flags & TUNNEL_CSUM))
+ if (!test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_ZERO_CSUM_TX;
- if (info->key.tun_flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_SEQ_NUMBER;
if (nla_put_be32(skb, NFTA_TUNNEL_KEY_FLAGS, htonl(flags)) < 0)
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 88965e2068ac..6a30a594d38a 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -151,6 +151,13 @@ static void update_range(struct sw_flow_match *match,
sizeof((match)->key->field)); \
} while (0)
+#define SW_FLOW_KEY_BITMAP_COPY(match, field, value_p, nbits, is_mask) ({ \
+ update_range(match, offsetof(struct sw_flow_key, field), \
+ bitmap_size(nbits), is_mask); \
+ bitmap_copy(is_mask ? (match)->mask->key.field : (match)->key->field, \
+ value_p, nbits); \
+})
+
static bool match_validate(const struct sw_flow_match *match,
u64 key_attrs, u64 mask_attrs, bool log)
{
@@ -669,8 +676,8 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
bool log)
{
bool ttl = false, ipv4 = false, ipv6 = false;
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags) = { };
bool info_bridge_mode = false;
- __be16 tun_flags = 0;
int opts_type = 0;
struct nlattr *a;
int rem;
@@ -696,7 +703,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
case OVS_TUNNEL_KEY_ATTR_ID:
SW_FLOW_KEY_PUT(match, tun_key.tun_id,
nla_get_be64(a), is_mask);
- tun_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_IPV4_SRC:
SW_FLOW_KEY_PUT(match, tun_key.u.ipv4.src,
@@ -728,10 +735,10 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
ttl = true;
break;
case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT:
- tun_flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_CSUM:
- tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_TP_SRC:
SW_FLOW_KEY_PUT(match, tun_key.tp_src,
@@ -742,7 +749,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
nla_get_be16(a), is_mask);
break;
case OVS_TUNNEL_KEY_ATTR_OAM:
- tun_flags |= TUNNEL_OAM;
+ __set_bit(IP_TUNNEL_OAM_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS:
if (opts_type) {
@@ -754,7 +761,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;
- tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS:
@@ -767,7 +774,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;
- tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_PAD:
@@ -783,7 +790,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;
- tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_IPV4_INFO_BRIDGE:
@@ -797,7 +804,8 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
}
}
- SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags, is_mask);
+ SW_FLOW_KEY_BITMAP_COPY(match, tun_key.tun_flags, tun_flags,
+ __IP_TUNNEL_FLAG_NUM, is_mask);
if (is_mask)
SW_FLOW_KEY_MEMSET_FIELD(match, tun_proto, 0xff, true);
else
@@ -822,13 +830,15 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
}
if (ipv4) {
if (info_bridge_mode) {
+ __clear_bit(IP_TUNNEL_KEY_BIT, tun_flags);
+
if (match->key->tun_key.u.ipv4.src ||
match->key->tun_key.u.ipv4.dst ||
match->key->tun_key.tp_src ||
match->key->tun_key.tp_dst ||
match->key->tun_key.ttl ||
match->key->tun_key.tos ||
- tun_flags & ~TUNNEL_KEY) {
+ !ip_tunnel_flags_empty(tun_flags)) {
OVS_NLERR(log, "IPv4 tun info is not correct");
return -EINVAL;
}
@@ -873,7 +883,7 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
const void *tun_opts, int swkey_tun_opts_len,
unsigned short tun_proto, u8 mode)
{
- if (output->tun_flags & TUNNEL_KEY &&
+ if (test_bit(IP_TUNNEL_KEY_BIT, output->tun_flags) &&
nla_put_be64(skb, OVS_TUNNEL_KEY_ATTR_ID, output->tun_id,
OVS_TUNNEL_KEY_ATTR_PAD))
return -EMSGSIZE;
@@ -909,10 +919,10 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
return -EMSGSIZE;
if (nla_put_u8(skb, OVS_TUNNEL_KEY_ATTR_TTL, output->ttl))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_CSUM) &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_CSUM))
return -EMSGSIZE;
if (output->tp_src &&
@@ -921,18 +931,20 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
if (output->tp_dst &&
nla_put_be16(skb, OVS_TUNNEL_KEY_ATTR_TP_DST, output->tp_dst))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_OAM) &&
+ if (test_bit(IP_TUNNEL_OAM_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_OAM))
return -EMSGSIZE;
if (swkey_tun_opts_len) {
- if (output->tun_flags & TUNNEL_GENEVE_OPT &&
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, output->tun_flags) &&
nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
swkey_tun_opts_len, tun_opts))
return -EMSGSIZE;
- else if (output->tun_flags & TUNNEL_VXLAN_OPT &&
+ else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT,
+ output->tun_flags) &&
vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
return -EMSGSIZE;
- else if (output->tun_flags & TUNNEL_ERSPAN_OPT &&
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ output->tun_flags) &&
nla_put(skb, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
swkey_tun_opts_len, tun_opts))
return -EMSGSIZE;
@@ -2028,7 +2040,7 @@ static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
if ((swkey->tun_proto || is_mask)) {
const void *opts = NULL;
- if (output->tun_key.tun_flags & TUNNEL_OPTIONS_PRESENT)
+ if (ip_tunnel_is_options_present(output->tun_key.tun_flags))
opts = TUN_METADATA_OPTS(output, swkey->tun_opts_len);
if (ip_tun_to_nlattr(skb, &output->tun_key, opts,
@@ -2744,7 +2756,8 @@ static int validate_geneve_opts(struct sw_flow_key *key)
opts_len -= len;
}
- key->tun_key.tun_flags |= crit_opt ? TUNNEL_CRIT_OPT : 0;
+ if (crit_opt)
+ __set_bit(IP_TUNNEL_CRIT_OPT_BIT, key->tun_key.tun_flags);
return 0;
}
@@ -2752,6 +2765,7 @@ static int validate_geneve_opts(struct sw_flow_key *key)
static int validate_and_copy_set_tun(const struct nlattr *attr,
struct sw_flow_actions **sfa, bool log)
{
+ IP_TUNNEL_DECLARE_FLAGS(dst_opt_type) = { };
struct sw_flow_match match;
struct sw_flow_key key;
struct metadata_dst *tun_dst;
@@ -2759,9 +2773,7 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
struct ovs_tunnel_info *ovs_tun;
struct nlattr *a;
int err = 0, start, opts_type;
- __be16 dst_opt_type;
- dst_opt_type = 0;
ovs_match_init(&match, &key, true, NULL);
opts_type = ip_tun_from_nlattr(nla_data(attr), &match, false, log);
if (opts_type < 0)
@@ -2773,13 +2785,14 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
err = validate_geneve_opts(&key);
if (err < 0)
return err;
- dst_opt_type = TUNNEL_GENEVE_OPT;
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, dst_opt_type);
break;
case OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS:
- dst_opt_type = TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, dst_opt_type);
break;
case OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS:
- dst_opt_type = TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, dst_opt_type);
break;
}
}
diff --git a/net/psample/psample.c b/net/psample/psample.c
index ddd211a151d0..a5d9b8446f77 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -221,7 +221,7 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
const struct ip_tunnel_key *tun_key = &tun_info->key;
int tun_opts_len = tun_info->options_len;
- if (tun_key->tun_flags & TUNNEL_KEY &&
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags) &&
nla_put_be64(skb, PSAMPLE_TUNNEL_KEY_ATTR_ID, tun_key->tun_id,
PSAMPLE_TUNNEL_KEY_ATTR_PAD))
return -EMSGSIZE;
@@ -257,10 +257,10 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
return -EMSGSIZE;
if (nla_put_u8(skb, PSAMPLE_TUNNEL_KEY_ATTR_TTL, tun_key->ttl))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_DONT_FRAGMENT))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_CSUM) &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_CSUM))
return -EMSGSIZE;
if (tun_key->tp_src &&
@@ -269,15 +269,16 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
if (tun_key->tp_dst &&
nla_put_be16(skb, PSAMPLE_TUNNEL_KEY_ATTR_TP_DST, tun_key->tp_dst))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_OAM) &&
+ if (test_bit(IP_TUNNEL_OAM_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_OAM))
return -EMSGSIZE;
if (tun_opts_len) {
- if (tun_key->tun_flags & TUNNEL_GENEVE_OPT &&
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_key->tun_flags) &&
nla_put(skb, PSAMPLE_TUNNEL_KEY_ATTR_GENEVE_OPTS,
tun_opts_len, tun_opts))
return -EMSGSIZE;
- else if (tun_key->tun_flags & TUNNEL_ERSPAN_OPT &&
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_key->tun_flags) &&
nla_put(skb, PSAMPLE_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
tun_opts_len, tun_opts))
return -EMSGSIZE;
@@ -314,7 +315,7 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
int tun_opts_len = tun_info->options_len;
int sum = nla_total_size(0); /* PSAMPLE_ATTR_TUNNEL */
- if (tun_key->tun_flags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags))
sum += nla_total_size_64bit(sizeof(u64));
if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
@@ -337,20 +338,21 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
if (tun_key->tos)
sum += nla_total_size(sizeof(u8));
sum += nla_total_size(sizeof(u8)); /* TTL */
- if (tun_key->tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
- if (tun_key->tun_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
if (tun_key->tp_src)
sum += nla_total_size(sizeof(u16));
if (tun_key->tp_dst)
sum += nla_total_size(sizeof(u16));
- if (tun_key->tun_flags & TUNNEL_OAM)
+ if (test_bit(IP_TUNNEL_OAM_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
if (tun_opts_len) {
- if (tun_key->tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_key->tun_flags))
sum += nla_total_size(tun_opts_len);
- else if (tun_key->tun_flags & TUNNEL_ERSPAN_OPT)
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_key->tun_flags))
sum += nla_total_size(tun_opts_len);
}
diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 300b08aa8283..8b052ea1ee09 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -230,7 +230,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
nla_for_each_attr(attr, head, len, rem) {
switch (nla_type(attr)) {
case TCA_TUNNEL_KEY_ENC_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT) {
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG(extack, "Duplicate type for geneve options");
return -EINVAL;
}
@@ -247,7 +247,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
dst_len -= opt_len;
dst += opt_len;
}
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
case TCA_TUNNEL_KEY_ENC_OPTS_VXLAN:
if (type) {
@@ -259,7 +259,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case TCA_TUNNEL_KEY_ENC_OPTS_ERSPAN:
if (type) {
@@ -271,7 +271,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
}
}
@@ -302,7 +302,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
switch (nla_type(nla_data(nla))) {
case TCA_TUNNEL_KEY_ENC_OPTS_GENEVE:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -310,7 +310,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
#endif
case TCA_TUNNEL_KEY_ENC_OPTS_VXLAN:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -318,7 +318,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
#endif
case TCA_TUNNEL_KEY_ENC_OPTS_ERSPAN:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -363,6 +363,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
bool bind = act_flags & TCA_ACT_FLAGS_BIND;
struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1];
struct tcf_tunnel_key_params *params_new;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *metadata = NULL;
struct tcf_chain *goto_ch = NULL;
struct tc_tunnel_key *parm;
@@ -371,7 +372,6 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
__be16 dst_port = 0;
__be64 key_id = 0;
int opts_len = 0;
- __be16 flags = 0;
u8 tos, ttl;
int ret = 0;
u32 index;
@@ -412,16 +412,16 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
key32 = nla_get_be32(tb[TCA_TUNNEL_KEY_ENC_KEY_ID]);
key_id = key32_to_tunnel_id(key32);
- flags = TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
}
- flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
if (tb[TCA_TUNNEL_KEY_NO_CSUM] &&
nla_get_u8(tb[TCA_TUNNEL_KEY_NO_CSUM]))
- flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, flags);
if (nla_get_flag(tb[TCA_TUNNEL_KEY_NO_FRAG]))
- flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, flags);
if (tb[TCA_TUNNEL_KEY_ENC_DST_PORT])
dst_port = nla_get_be16(tb[TCA_TUNNEL_KEY_ENC_DST_PORT]);
@@ -663,15 +663,15 @@ static int tunnel_key_opts_dump(struct sk_buff *skb,
if (!start)
return -EMSGSIZE;
- if (info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_geneve_opts_dump(skb, info);
if (err)
goto err_out;
- } else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_vxlan_opts_dump(skb, info);
if (err)
goto err_out;
- } else if (info->key.tun_flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_erspan_opts_dump(skb, info);
if (err)
goto err_out;
@@ -741,7 +741,7 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
struct ip_tunnel_key *key = &info->key;
__be32 key_id = tunnel_id_to_key32(key->tun_id);
- if (((key->tun_flags & TUNNEL_KEY) &&
+ if ((test_bit(IP_TUNNEL_KEY_BIT, key->tun_flags) &&
nla_put_be32(skb, TCA_TUNNEL_KEY_ENC_KEY_ID, key_id)) ||
tunnel_key_dump_addresses(skb,
¶ms->tcft_enc_metadata->u.tun_info) ||
@@ -749,8 +749,8 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
nla_put_be16(skb, TCA_TUNNEL_KEY_ENC_DST_PORT,
key->tp_dst)) ||
nla_put_u8(skb, TCA_TUNNEL_KEY_NO_CSUM,
- !(key->tun_flags & TUNNEL_CSUM)) ||
- ((key->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ !test_bit(IP_TUNNEL_CSUM_BIT, key->tun_flags)) ||
+ (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) &&
nla_put_flag(skb, TCA_TUNNEL_KEY_NO_FRAG)) ||
tunnel_key_opts_dump(skb, info))
goto nla_put_failure;
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index efb9d2811b73..8326919cbf67 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1454,12 +1454,13 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
switch (nla_type(nla_opt_key)) {
case TCA_FLOWER_KEY_ENC_OPTS_GENEVE:
if (key->enc_opts.dst_opt_type &&
- key->enc_opts.dst_opt_type != TUNNEL_GENEVE_OPT) {
+ key->enc_opts.dst_opt_type !=
+ IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG(extack, "Duplicate type for geneve options");
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_GENEVE_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_GENEVE_OPT_BIT;
option_len = fl_set_geneve_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1470,7 +1471,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_GENEVE_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_GENEVE_OPT_BIT;
option_len = fl_set_geneve_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1489,7 +1490,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_VXLAN_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_VXLAN_OPT_BIT;
option_len = fl_set_vxlan_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1500,7 +1501,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_VXLAN_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_VXLAN_OPT_BIT;
option_len = fl_set_vxlan_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1519,7 +1520,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_ERSPAN_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_ERSPAN_OPT_BIT;
option_len = fl_set_erspan_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1530,7 +1531,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_ERSPAN_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_ERSPAN_OPT_BIT;
option_len = fl_set_erspan_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1550,7 +1551,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_GTP_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_GTP_OPT_BIT;
option_len = fl_set_gtp_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1561,7 +1562,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_GTP_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_GTP_OPT_BIT;
option_len = fl_set_gtp_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -3199,22 +3200,22 @@ static int fl_dump_key_options(struct sk_buff *skb, int enc_opt_type,
goto nla_put_failure;
switch (enc_opts->dst_opt_type) {
- case TUNNEL_GENEVE_OPT:
+ case IP_TUNNEL_GENEVE_OPT_BIT:
err = fl_dump_key_geneve_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_VXLAN_OPT:
+ case IP_TUNNEL_VXLAN_OPT_BIT:
err = fl_dump_key_vxlan_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_ERSPAN_OPT:
+ case IP_TUNNEL_ERSPAN_OPT_BIT:
err = fl_dump_key_erspan_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_GTP_OPT:
+ case IP_TUNNEL_GTP_OPT_BIT:
err = fl_dump_key_gtp_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
--
2.43.0
From: Marcin Szycik <[email protected]>
FLOW_DISSECTOR_KEY_ENC_OPTS can be used for multiple headers, but currently
it is treated as GTP-exclusive in ice. Rename ICE_TC_FLWR_FIELD_ENC_OPTS to
ICE_TC_FLWR_FIELD_GTP_OPTS and check for tunnel type earlier. After this
refactor, it is easier to add new headers using FLOW_DISSECTOR_KEY_ENC_OPTS
- instead of checking tunnel type in ice_tc_count_lkups() and
ice_tc_fill_tunnel_outer(), it needs to be checked only once, in
ice_parse_tunnel_attr().
Signed-off-by: Marcin Szycik <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/ethernet/intel/ice/ice_tc_lib.h | 2 +-
drivers/net/ethernet/intel/ice/ice_tc_lib.c | 10 +++++-----
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.h b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
index 65d387163a46..5d188ad7517a 100644
--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
@@ -22,7 +22,7 @@
#define ICE_TC_FLWR_FIELD_ENC_SRC_L4_PORT BIT(15)
#define ICE_TC_FLWR_FIELD_ENC_DST_MAC BIT(16)
#define ICE_TC_FLWR_FIELD_ETH_TYPE_ID BIT(17)
-#define ICE_TC_FLWR_FIELD_ENC_OPTS BIT(18)
+#define ICE_TC_FLWR_FIELD_GTP_OPTS BIT(18)
#define ICE_TC_FLWR_FIELD_CVLAN BIT(19)
#define ICE_TC_FLWR_FIELD_PPPOE_SESSID BIT(20)
#define ICE_TC_FLWR_FIELD_PPP_PROTO BIT(21)
diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.c b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
index b890410a2bc0..80797db9f2b9 100644
--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
@@ -35,7 +35,7 @@ ice_tc_count_lkups(u32 flags, struct ice_tc_flower_lyr_2_4_hdrs *headers,
if (flags & ICE_TC_FLWR_FIELD_ENC_DST_MAC)
lkups_cnt++;
- if (flags & ICE_TC_FLWR_FIELD_ENC_OPTS)
+ if (flags & ICE_TC_FLWR_FIELD_GTP_OPTS)
lkups_cnt++;
if (flags & (ICE_TC_FLWR_FIELD_ENC_SRC_IPV4 |
@@ -219,8 +219,7 @@ ice_tc_fill_tunnel_outer(u32 flags, struct ice_tc_flower_fltr *fltr,
i++;
}
- if (flags & ICE_TC_FLWR_FIELD_ENC_OPTS &&
- (fltr->tunnel_type == TNL_GTPU || fltr->tunnel_type == TNL_GTPC)) {
+ if (flags & ICE_TC_FLWR_FIELD_GTP_OPTS) {
list[i].type = ice_proto_type_from_tunnel(fltr->tunnel_type);
if (fltr->gtp_pdu_info_masks.pdu_type) {
@@ -1401,7 +1400,8 @@ ice_parse_tunnel_attr(struct net_device *dev, struct flow_rule *rule,
}
}
- if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_OPTS)) {
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_OPTS) &&
+ (fltr->tunnel_type == TNL_GTPU || fltr->tunnel_type == TNL_GTPC)) {
struct flow_match_enc_opts match;
flow_rule_match_enc_opts(rule, &match);
@@ -1412,7 +1412,7 @@ ice_parse_tunnel_attr(struct net_device *dev, struct flow_rule *rule,
memcpy(&fltr->gtp_pdu_info_masks, &match.mask->data[0],
sizeof(struct gtp_pdu_session_info));
- fltr->flags |= ICE_TC_FLWR_FIELD_ENC_OPTS;
+ fltr->flags |= ICE_TC_FLWR_FIELD_GTP_OPTS;
}
return 0;
--
2.43.0
bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is not
"generic", i.e. it's not just `nbits -> size of bitmap in bytes`
converter as it would be expected from its name.
Add the prefix 'idset_' used throughout the file where the function
resides.
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/s390/cio/idset.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
index 45f9c0736be4..0a1105a483bf 100644
--- a/drivers/s390/cio/idset.c
+++ b/drivers/s390/cio/idset.c
@@ -16,7 +16,7 @@ struct idset {
unsigned long bitmap[];
};
-static inline unsigned long bitmap_size(int num_ssid, int num_id)
+static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
{
return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
}
@@ -25,11 +25,12 @@ static struct idset *idset_new(int num_ssid, int num_id)
{
struct idset *set;
- set = vmalloc(sizeof(struct idset) + bitmap_size(num_ssid, num_id));
+ set = vmalloc(sizeof(struct idset) +
+ idset_bitmap_size(num_ssid, num_id));
if (set) {
set->num_ssid = num_ssid;
set->num_id = num_id;
- memset(set->bitmap, 0, bitmap_size(num_ssid, num_id));
+ memset(set->bitmap, 0, idset_bitmap_size(num_ssid, num_id));
}
return set;
}
@@ -41,7 +42,8 @@ void idset_free(struct idset *set)
void idset_fill(struct idset *set)
{
- memset(set->bitmap, 0xff, bitmap_size(set->num_ssid, set->num_id));
+ memset(set->bitmap, 0xff,
+ idset_bitmap_size(set->num_ssid, set->num_id));
}
static inline void idset_add(struct idset *set, int ssid, int id)
--
2.43.0
Avoid open-coding that simple expression each time by moving
BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export
it to the rest of the kernel.
Simplify the macro while at it. `BITS_PER_LONG / sizeof(long)` always
equals to %BITS_PER_BYTE, regardless of the target architecture.
Do the same for the tools ecosystem as well (incl. its version of
bitops.h). The previous implementation had its implicit type of long,
while the new one is int, so adjust the format literal accordingly in
the perf code.
Suggested-by: Andy Shevchenko <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 2 ++
tools/include/linux/bitops.h | 2 ++
kernel/trace/trace_probe.c | 2 --
tools/perf/util/probe-finder.c | 4 +---
4 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index f7f5a783da2a..e0cd09eb91cd 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -21,6 +21,8 @@
#define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(char))
+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/tools/include/linux/bitops.h b/tools/include/linux/bitops.h
index f18683b95ea6..bc6600466e7b 100644
--- a/tools/include/linux/bitops.h
+++ b/tools/include/linux/bitops.h
@@ -20,6 +20,8 @@
#define BITS_TO_U32(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(char))
+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 4dc74d73fc1d..2b743c1e37db 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1053,8 +1053,6 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
return ret;
}
-#define BYTES_TO_BITS(nb) ((BITS_PER_LONG * (nb)) / sizeof(long))
-
/* Bitfield type needs to be parsed into a fetch function */
static int __parse_bitfield_probe_arg(const char *bf,
const struct fetch_type *t,
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index c8923375e30d..630e16c54ed5 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -186,8 +186,6 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,
return ret2;
}
-#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_LONG / sizeof(long))
-
static int convert_variable_type(Dwarf_Die *vr_die,
struct probe_trace_arg *tvar,
const char *cast, bool user_access)
@@ -217,7 +215,7 @@ static int convert_variable_type(Dwarf_Die *vr_die,
total = dwarf_bytesize(vr_die);
if (boffs < 0 || total < 0)
return -ENOENT;
- ret = snprintf(buf, 16, "b%d@%d/%zd", bsize, boffs,
+ ret = snprintf(buf, 16, "b%d@%d/%d", bsize, boffs,
BYTES_TO_BITS(total));
goto formatted;
}
--
2.43.0
From: Alexander Potapenko <[email protected]>
Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap, including those spanning into the adjacent unsigned
longs.
Two new performance tests, test_bitmap_read_perf() and
test_bitmap_write_perf(), can be used to assess future performance
improvements of bitmap_read() and bitmap_write():
[ 0.431119][ T1] test_bitmap: Time spent in test_bitmap_read_perf: 615253
[ 0.433197][ T1] test_bitmap: Time spent in test_bitmap_write_perf: 916313
(numbers from a Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine running
QEMU).
Signed-off-by: Alexander Potapenko <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 179 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 172 insertions(+), 7 deletions(-)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 65f22c2578b0..46c015468077 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -60,18 +60,17 @@ static const unsigned long exp3_1_0[] __initconst = {
};
static bool __init
-__check_eq_uint(const char *srcfile, unsigned int line,
- const unsigned int exp_uint, unsigned int x)
+__check_eq_ulong(const char *srcfile, unsigned int line,
+ const unsigned long exp_ulong, unsigned long x)
{
- if (exp_uint != x) {
- pr_err("[%s:%u] expected %u, got %u\n",
- srcfile, line, exp_uint, x);
+ if (exp_ulong != x) {
+ pr_err("[%s:%u] expected %lu, got %lu\n",
+ srcfile, line, exp_ulong, x);
return false;
}
return true;
}
-
static bool __init
__check_eq_bitmap(const char *srcfile, unsigned int line,
const unsigned long *exp_bmap, const unsigned long *bmap,
@@ -185,7 +184,8 @@ __check_eq_str(const char *srcfile, unsigned int line,
result; \
})
-#define expect_eq_uint(...) __expect_eq(uint, ##__VA_ARGS__)
+#define expect_eq_ulong(...) __expect_eq(ulong, ##__VA_ARGS__)
+#define expect_eq_uint(x, y) expect_eq_ulong((unsigned int)(x), (unsigned int)(y))
#define expect_eq_bitmap(...) __expect_eq(bitmap, ##__VA_ARGS__)
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
@@ -1245,6 +1245,168 @@ static void __init test_bitmap_const_eval(void)
BUILD_BUG_ON(~var != ~BIT(25));
}
+/*
+ * Test bitmap should be big enough to include the cases when start is not in
+ * the first word, and start+nbits lands in the following word.
+ */
+#define TEST_BIT_LEN (1000)
+
+/*
+ * Helper function to test bitmap_write() overwriting the chosen byte pattern.
+ */
+static void __init test_bitmap_write_helper(const char *pattern)
+{
+ DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+ DECLARE_BITMAP(exp_bitmap, TEST_BIT_LEN);
+ DECLARE_BITMAP(pat_bitmap, TEST_BIT_LEN);
+ unsigned long w, r, bit;
+ int i, n, nbits;
+
+ /*
+ * Only parse the pattern once and store the result in the intermediate
+ * bitmap.
+ */
+ bitmap_parselist(pattern, pat_bitmap, TEST_BIT_LEN);
+
+ /*
+ * Check that writing a single bit does not accidentally touch the
+ * adjacent bits.
+ */
+ for (i = 0; i < TEST_BIT_LEN; i++) {
+ bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+ bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+ for (bit = 0; bit <= 1; bit++) {
+ bitmap_write(bitmap, bit, i, 1);
+ __assign_bit(i, exp_bitmap, bit);
+ expect_eq_bitmap(exp_bitmap, bitmap,
+ TEST_BIT_LEN);
+ }
+ }
+
+ /* Ensure writing 0 bits does not change anything. */
+ bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+ bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+ for (i = 0; i < TEST_BIT_LEN; i++) {
+ bitmap_write(bitmap, ~0UL, i, 0);
+ expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+ }
+
+ for (nbits = BITS_PER_LONG; nbits >= 1; nbits--) {
+ w = IS_ENABLED(CONFIG_64BIT) ? 0xdeadbeefdeadbeefUL
+ : 0xdeadbeefUL;
+ w >>= (BITS_PER_LONG - nbits);
+ for (i = 0; i <= TEST_BIT_LEN - nbits; i++) {
+ bitmap_copy(bitmap, pat_bitmap, TEST_BIT_LEN);
+ bitmap_copy(exp_bitmap, pat_bitmap, TEST_BIT_LEN);
+ for (n = 0; n < nbits; n++)
+ __assign_bit(i + n, exp_bitmap, w & BIT(n));
+ bitmap_write(bitmap, w, i, nbits);
+ expect_eq_bitmap(exp_bitmap, bitmap, TEST_BIT_LEN);
+ r = bitmap_read(bitmap, i, nbits);
+ expect_eq_ulong(r, w);
+ }
+ }
+}
+
+static void __init test_bitmap_read_write(void)
+{
+ unsigned char *pattern[3] = {"", "all:1/2", "all"};
+ DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+ unsigned long zero_bits = 0, bits_per_long = BITS_PER_LONG;
+ unsigned long val;
+ int i, pi;
+
+ /*
+ * Reading/writing zero bits should not crash the kernel.
+ * READ_ONCE() prevents constant folding.
+ */
+ bitmap_write(NULL, 0, 0, READ_ONCE(zero_bits));
+ /* Return value of bitmap_read() is undefined here. */
+ bitmap_read(NULL, 0, READ_ONCE(zero_bits));
+
+ /*
+ * Reading/writing more than BITS_PER_LONG bits should not crash the
+ * kernel. READ_ONCE() prevents constant folding.
+ */
+ bitmap_write(NULL, 0, 0, READ_ONCE(bits_per_long) + 1);
+ /* Return value of bitmap_read() is undefined here. */
+ bitmap_read(NULL, 0, READ_ONCE(bits_per_long) + 1);
+
+ /*
+ * Ensure that bitmap_read() reads the same value that was previously
+ * written, and two consequent values are correctly merged.
+ * The resulting bit pattern is asymmetric to rule out possible issues
+ * with bit numeration order.
+ */
+ for (i = 0; i < TEST_BIT_LEN - 7; i++) {
+ bitmap_zero(bitmap, TEST_BIT_LEN);
+
+ bitmap_write(bitmap, 0b10101UL, i, 5);
+ val = bitmap_read(bitmap, i, 5);
+ expect_eq_ulong(0b10101UL, val);
+
+ bitmap_write(bitmap, 0b101UL, i + 5, 3);
+ val = bitmap_read(bitmap, i + 5, 3);
+ expect_eq_ulong(0b101UL, val);
+
+ val = bitmap_read(bitmap, i, 8);
+ expect_eq_ulong(0b10110101UL, val);
+ }
+
+ for (pi = 0; pi < ARRAY_SIZE(pattern); pi++)
+ test_bitmap_write_helper(pattern[pi]);
+}
+
+static void __init test_bitmap_read_perf(void)
+{
+ DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+ unsigned int cnt, nbits, i;
+ unsigned long val;
+ ktime_t time;
+
+ bitmap_fill(bitmap, TEST_BIT_LEN);
+ time = ktime_get();
+ for (cnt = 0; cnt < 5; cnt++) {
+ for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+ for (i = 0; i < TEST_BIT_LEN; i++) {
+ if (i + nbits > TEST_BIT_LEN)
+ break;
+ /*
+ * Prevent the compiler from optimizing away the
+ * bitmap_read() by using its value.
+ */
+ WRITE_ONCE(val, bitmap_read(bitmap, i, nbits));
+ }
+ }
+ }
+ time = ktime_get() - time;
+ pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+static void __init test_bitmap_write_perf(void)
+{
+ DECLARE_BITMAP(bitmap, TEST_BIT_LEN);
+ unsigned int cnt, nbits, i;
+ unsigned long val = 0xfeedface;
+ ktime_t time;
+
+ bitmap_zero(bitmap, TEST_BIT_LEN);
+ time = ktime_get();
+ for (cnt = 0; cnt < 5; cnt++) {
+ for (nbits = 1; nbits <= BITS_PER_LONG; nbits++) {
+ for (i = 0; i < TEST_BIT_LEN; i++) {
+ if (i + nbits > TEST_BIT_LEN)
+ break;
+ bitmap_write(bitmap, val, i, nbits);
+ }
+ }
+ }
+ time = ktime_get() - time;
+ pr_err("Time spent in %s:\t%llu\n", __func__, time);
+}
+
+#undef TEST_BIT_LEN
+
static void __init selftest(void)
{
test_zero_clear();
@@ -1261,6 +1423,9 @@ static void __init selftest(void)
test_bitmap_cut();
test_bitmap_print_buf();
test_bitmap_const_eval();
+ test_bitmap_read_write();
+ test_bitmap_read_perf();
+ test_bitmap_write_perf();
test_find_nth_bit();
test_for_each_set_bit();
--
2.43.0
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the compilers are able to expand inline
bitmap operations to compile-time initializers when possible.
However, during the round of replacement if-__set-else-__clear with
__assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes
difference in object code size for one module (even one function),
where the pattern:
DECLARE_BITMAP(foo) = { }; // on the stack, zeroed
if (a)
__set_bit(const_bit_num, foo);
if (b)
__set_bit(another_const_bit_num, foo);
...
is heavily used, although there should be no difference: the bitmap is
zeroed, so the second half of __assign_bit() should be compiled-out as
a no-op.
I either missed the fact that __assign_bit() has bitmap pointer marked
as `volatile` (as we usually do for bitops) or was hoping that the
compilers would at least try to look past the `volatile` for
__always_inline functions. Anyhow, due to that attribute, the compilers
were always compiling the whole expression and no mentioned compile-time
optimizations were working.
Convert __assign_bit() to a macro since it's a very simple if-else and
all of the checks are performed inside __set_bit() and __clear_bit(),
thus that wrapper has to be as transparent as possible. After that
change, despite it showing only -20 bytes change for vmlinux (due to
that it's still relatively unpopular), no drastic code size changes
happen when replacing if-set-else-clear for onstack bitmaps with
__assign_bit(), meaning the compiler now expands them to the actual
operations will all the expected optimizations.
Atomic assign_bit() is less affected due to its nature, but let's
convert it to a macro as well to keep the code consistent and not
leave a place for possible suboptimal codegen. Moreover, with certain
kernel configuration it actually gives some saves (x86):
do_ip_setsockopt 4154 4099 -55
Suggested-by: Yury Norov <[email protected]> # assign_bit(), too
Cc: Andy Shevchenko <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 20 ++++----------------
1 file changed, 4 insertions(+), 16 deletions(-)
diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index e0cd09eb91cd..b25dc8742124 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -275,23 +275,11 @@ static inline unsigned long fns(unsigned long word, unsigned int n)
* @addr: the address to start counting from
* @value: the value to assign
*/
-static __always_inline void assign_bit(long nr, volatile unsigned long *addr,
- bool value)
-{
- if (value)
- set_bit(nr, addr);
- else
- clear_bit(nr, addr);
-}
+#define assign_bit(nr, addr, value) \
+ ((value) ? set_bit((nr), (addr)) : clear_bit((nr), (addr)))
-static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
- bool value)
-{
- if (value)
- __set_bit(nr, addr);
- else
- __clear_bit(nr, addr);
-}
+#define __assign_bit(nr, addr, value) \
+ ((value) ? __set_bit((nr), (addr)) : __clear_bit((nr), (addr)))
/**
* __ptr_set_bit - Set bit in a pointer's value
--
2.43.0
From: Marcin Szycik <[email protected]>
Add support for creating PFCP filters in switchdev mode. Add support
for parsing PFCP-specific tc options: S flag and SEID.
To create a PFCP filter, a special netdev must be created and passed
to tc command:
ip link add pfcp0 type pfcp
tc filter add dev eth0 ingress prio 1 flower pfcp_opts \
1:123/ff:fffffffffffffff0 skip_hw action mirred egress redirect \
dev pfcp0
Changes in iproute2 [1] are required to be able to use pfcp_opts in tc.
ICE COMMS package is required to create a filter as it contains PFCP
profiles.
Link: https://lore.kernel.org/netdev/[email protected] [1]
Signed-off-by: Marcin Szycik <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
.../net/ethernet/intel/ice/ice_flex_type.h | 4 +-
.../ethernet/intel/ice/ice_protocol_type.h | 12 +++
drivers/net/ethernet/intel/ice/ice_switch.h | 2 +
drivers/net/ethernet/intel/ice/ice_tc_lib.h | 6 ++
drivers/net/ethernet/intel/ice/ice_ddp.c | 9 ++
drivers/net/ethernet/intel/ice/ice_switch.c | 85 +++++++++++++++++++
drivers/net/ethernet/intel/ice/ice_tc_lib.c | 58 +++++++++++--
7 files changed, 169 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_flex_type.h b/drivers/net/ethernet/intel/ice/ice_flex_type.h
index d427a79d001a..817beca591e0 100644
--- a/drivers/net/ethernet/intel/ice/ice_flex_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_flex_type.h
@@ -93,6 +93,7 @@ enum ice_tunnel_type {
TNL_GRETAP,
TNL_GTPC,
TNL_GTPU,
+ TNL_PFCP,
__TNL_TYPE_CNT,
TNL_LAST = 0xFF,
TNL_ALL = 0xFF,
@@ -358,7 +359,8 @@ enum ice_prof_type {
ICE_PROF_TUN_GRE = 0x4,
ICE_PROF_TUN_GTPU = 0x8,
ICE_PROF_TUN_GTPC = 0x10,
- ICE_PROF_TUN_ALL = 0x1E,
+ ICE_PROF_TUN_PFCP = 0x20,
+ ICE_PROF_TUN_ALL = 0x3E,
ICE_PROF_ALL = 0xFF,
};
diff --git a/drivers/net/ethernet/intel/ice/ice_protocol_type.h b/drivers/net/ethernet/intel/ice/ice_protocol_type.h
index f6f27361c3cf..755a9c55267c 100644
--- a/drivers/net/ethernet/intel/ice/ice_protocol_type.h
+++ b/drivers/net/ethernet/intel/ice/ice_protocol_type.h
@@ -43,6 +43,7 @@ enum ice_protocol_type {
ICE_NVGRE,
ICE_GTP,
ICE_GTP_NO_PAY,
+ ICE_PFCP,
ICE_PPPOE,
ICE_L2TPV3,
ICE_VLAN_EX,
@@ -61,6 +62,7 @@ enum ice_sw_tunnel_type {
ICE_SW_TUN_NVGRE,
ICE_SW_TUN_GTPU,
ICE_SW_TUN_GTPC,
+ ICE_SW_TUN_PFCP,
ICE_ALL_TUNNELS /* All tunnel types including NVGRE */
};
@@ -202,6 +204,15 @@ struct ice_udp_gtp_hdr {
u8 rsvrd;
};
+struct ice_pfcp_hdr {
+ u8 flags;
+ u8 msg_type;
+ __be16 length;
+ __be64 seid;
+ __be32 seq;
+ u8 spare;
+} __packed __aligned(__alignof__(u16));
+
struct ice_pppoe_hdr {
u8 rsrvd_ver_type;
u8 rsrvd_code;
@@ -418,6 +429,7 @@ union ice_prot_hdr {
struct ice_udp_tnl_hdr tnl_hdr;
struct ice_nvgre_hdr nvgre_hdr;
struct ice_udp_gtp_hdr gtp_hdr;
+ struct ice_pfcp_hdr pfcp_hdr;
struct ice_pppoe_hdr pppoe_hdr;
struct ice_l2tpv3_sess_hdr l2tpv3_sess_hdr;
struct ice_hw_metadata metadata;
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.h b/drivers/net/ethernet/intel/ice/ice_switch.h
index db7e501b7e0a..30091a80e7fa 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.h
+++ b/drivers/net/ethernet/intel/ice/ice_switch.h
@@ -21,6 +21,8 @@
#define ICE_PROFID_IPV6_GTPC_NO_TEID 45
#define ICE_PROFID_IPV6_GTPU_TEID 46
#define ICE_PROFID_IPV6_GTPU_IPV6_TCP_INNER 70
+#define ICE_PROFID_IPV4_PFCP_NODE 79
+#define ICE_PROFID_IPV6_PFCP_SESSION 82
#define ICE_SW_RULE_VSI_LIST_SIZE(s, n) struct_size((s), vsi, (n))
#define ICE_SW_RULE_RX_TX_HDR_SIZE(s, l) struct_size((s), hdr_data, (l))
diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.h b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
index 5d188ad7517a..d84f153517ec 100644
--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.h
+++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.h
@@ -4,6 +4,9 @@
#ifndef _ICE_TC_LIB_H_
#define _ICE_TC_LIB_H_
+#include <linux/bits.h>
+#include <net/pfcp.h>
+
#define ICE_TC_FLWR_FIELD_DST_MAC BIT(0)
#define ICE_TC_FLWR_FIELD_SRC_MAC BIT(1)
#define ICE_TC_FLWR_FIELD_VLAN BIT(2)
@@ -34,6 +37,7 @@
#define ICE_TC_FLWR_FIELD_VLAN_PRIO BIT(27)
#define ICE_TC_FLWR_FIELD_CVLAN_PRIO BIT(28)
#define ICE_TC_FLWR_FIELD_VLAN_TPID BIT(29)
+#define ICE_TC_FLWR_FIELD_PFCP_OPTS BIT(30)
#define ICE_TC_FLOWER_MASK_32 0xFFFFFFFF
@@ -161,6 +165,8 @@ struct ice_tc_flower_fltr {
__be32 tenant_id;
struct gtp_pdu_session_info gtp_pdu_info_keys;
struct gtp_pdu_session_info gtp_pdu_info_masks;
+ struct pfcp_metadata pfcp_meta_keys;
+ struct pfcp_metadata pfcp_meta_masks;
u32 flags;
u8 tunnel_type;
struct ice_tc_flower_action action;
diff --git a/drivers/net/ethernet/intel/ice/ice_ddp.c b/drivers/net/ethernet/intel/ice/ice_ddp.c
index 8b7504a9df31..57278ec71aac 100644
--- a/drivers/net/ethernet/intel/ice/ice_ddp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ddp.c
@@ -721,6 +721,12 @@ static bool ice_is_gtp_c_profile(u16 prof_idx)
}
}
+static bool ice_is_pfcp_profile(u16 prof_idx)
+{
+ return prof_idx >= ICE_PROFID_IPV4_PFCP_NODE &&
+ prof_idx <= ICE_PROFID_IPV6_PFCP_SESSION;
+}
+
/**
* ice_get_sw_prof_type - determine switch profile type
* @hw: pointer to the HW structure
@@ -738,6 +744,9 @@ static enum ice_prof_type ice_get_sw_prof_type(struct ice_hw *hw,
if (ice_is_gtp_u_profile(prof_idx))
return ICE_PROF_TUN_GTPU;
+ if (ice_is_pfcp_profile(prof_idx))
+ return ICE_PROF_TUN_PFCP;
+
for (i = 0; i < hw->blk[ICE_BLK_SW].es.fvw; i++) {
/* UDP tunnel will have UDP_OF protocol ID and VNI offset */
if (fv->ew[i].prot_id == (u8)ICE_PROT_UDP_OF &&
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index f84bab80ca42..0138fc832a23 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -42,6 +42,7 @@ enum {
ICE_PKT_KMALLOC = BIT(9),
ICE_PKT_PPPOE = BIT(10),
ICE_PKT_L2TPV3 = BIT(11),
+ ICE_PKT_PFCP = BIT(12),
};
struct ice_dummy_pkt_offsets {
@@ -1110,6 +1111,77 @@ ICE_DECLARE_PKT_TEMPLATE(ipv6_gtp) = {
0x00, 0x00,
};
+ICE_DECLARE_PKT_OFFSETS(pfcp_session_ipv4) = {
+ { ICE_MAC_OFOS, 0 },
+ { ICE_ETYPE_OL, 12 },
+ { ICE_IPV4_OFOS, 14 },
+ { ICE_UDP_ILOS, 34 },
+ { ICE_PFCP, 42 },
+ { ICE_PROTOCOL_LAST, 0 },
+};
+
+ICE_DECLARE_PKT_TEMPLATE(pfcp_session_ipv4) = {
+ 0x00, 0x00, 0x00, 0x00, /* ICE_MAC_OFOS 0 */
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x08, 0x00, /* ICE_ETYPE_OL 12 */
+
+ 0x45, 0x00, 0x00, 0x2c, /* ICE_IPV4_OFOS 14 */
+ 0x00, 0x01, 0x00, 0x00,
+ 0x00, 0x11, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x00, 0x00, 0x22, 0x65, /* ICE_UDP_ILOS 34 */
+ 0x00, 0x18, 0x00, 0x00,
+
+ 0x21, 0x01, 0x00, 0x0c, /* ICE_PFCP 42 */
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x00, 0x00, /* 2 bytes for 4 byte alignment */
+};
+
+ICE_DECLARE_PKT_OFFSETS(pfcp_session_ipv6) = {
+ { ICE_MAC_OFOS, 0 },
+ { ICE_ETYPE_OL, 12 },
+ { ICE_IPV6_OFOS, 14 },
+ { ICE_UDP_ILOS, 54 },
+ { ICE_PFCP, 62 },
+ { ICE_PROTOCOL_LAST, 0 },
+};
+
+ICE_DECLARE_PKT_TEMPLATE(pfcp_session_ipv6) = {
+ 0x00, 0x00, 0x00, 0x00, /* ICE_MAC_OFOS 0 */
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x86, 0xdd, /* ICE_ETYPE_OL 12 */
+
+ 0x60, 0x00, 0x00, 0x00, /* ICE_IPV6_OFOS 14 */
+ 0x00, 0x10, 0x11, 0x00, /* Next header UDP */
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x00, 0x00, 0x22, 0x65, /* ICE_UDP_ILOS 54 */
+ 0x00, 0x18, 0x00, 0x00,
+
+ 0x21, 0x01, 0x00, 0x0c, /* ICE_PFCP 62 */
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00,
+
+ 0x00, 0x00, /* 2 bytes for 4 byte alignment */
+};
+
ICE_DECLARE_PKT_OFFSETS(pppoe_ipv4_tcp) = {
{ ICE_MAC_OFOS, 0 },
{ ICE_ETYPE_OL, 12 },
@@ -1343,6 +1415,8 @@ static const struct ice_dummy_pkt_profile ice_dummy_pkt_profiles[] = {
ICE_PKT_PROFILE(ipv4_gtpu_ipv4_tcp, ICE_PKT_TUN_GTPU),
ICE_PKT_PROFILE(ipv6_gtp, ICE_PKT_TUN_GTPC | ICE_PKT_OUTER_IPV6),
ICE_PKT_PROFILE(ipv4_gtpu_ipv4, ICE_PKT_TUN_GTPC),
+ ICE_PKT_PROFILE(pfcp_session_ipv6, ICE_PKT_PFCP | ICE_PKT_OUTER_IPV6),
+ ICE_PKT_PROFILE(pfcp_session_ipv4, ICE_PKT_PFCP),
ICE_PKT_PROFILE(pppoe_ipv6_udp, ICE_PKT_PPPOE | ICE_PKT_OUTER_IPV6 |
ICE_PKT_INNER_UDP),
ICE_PKT_PROFILE(pppoe_ipv6_tcp, ICE_PKT_PPPOE | ICE_PKT_OUTER_IPV6),
@@ -4526,6 +4600,7 @@ static const struct ice_prot_ext_tbl_entry ice_prot_ext[ICE_PROTOCOL_LAST] = {
ICE_PROTOCOL_ENTRY(ICE_NVGRE, 0, 2, 4, 6),
ICE_PROTOCOL_ENTRY(ICE_GTP, 8, 10, 12, 14, 16, 18, 20, 22),
ICE_PROTOCOL_ENTRY(ICE_GTP_NO_PAY, 8, 10, 12, 14),
+ ICE_PROTOCOL_ENTRY(ICE_PFCP, 8, 10, 12, 14, 16, 18, 20, 22),
ICE_PROTOCOL_ENTRY(ICE_PPPOE, 0, 2, 4, 6),
ICE_PROTOCOL_ENTRY(ICE_L2TPV3, 0, 2, 4, 6, 8, 10),
ICE_PROTOCOL_ENTRY(ICE_VLAN_EX, 2, 0),
@@ -4559,6 +4634,7 @@ static struct ice_protocol_entry ice_prot_id_tbl[ICE_PROTOCOL_LAST] = {
{ ICE_NVGRE, ICE_GRE_OF_HW },
{ ICE_GTP, ICE_UDP_OF_HW },
{ ICE_GTP_NO_PAY, ICE_UDP_ILOS_HW },
+ { ICE_PFCP, ICE_UDP_ILOS_HW },
{ ICE_PPPOE, ICE_PPPOE_HW },
{ ICE_L2TPV3, ICE_L2TPV3_HW },
{ ICE_VLAN_EX, ICE_VLAN_OF_HW },
@@ -5266,6 +5342,9 @@ ice_get_compat_fv_bitmap(struct ice_hw *hw, struct ice_adv_rule_info *rinfo,
case ICE_SW_TUN_GTPC:
prof_type = ICE_PROF_TUN_GTPC;
break;
+ case ICE_SW_TUN_PFCP:
+ prof_type = ICE_PROF_TUN_PFCP;
+ break;
case ICE_SW_TUN_AND_NON_TUN:
default:
prof_type = ICE_PROF_ALL;
@@ -5548,6 +5627,9 @@ ice_find_dummy_packet(struct ice_adv_lkup_elem *lkups, u16 lkups_cnt,
case ICE_SW_TUN_VXLAN:
match |= ICE_PKT_TUN_UDP;
break;
+ case ICE_SW_TUN_PFCP:
+ match |= ICE_PKT_PFCP;
+ break;
default:
break;
}
@@ -5688,6 +5770,9 @@ ice_fill_adv_dummy_packet(struct ice_adv_lkup_elem *lkups, u16 lkups_cnt,
case ICE_GTP:
len = sizeof(struct ice_udp_gtp_hdr);
break;
+ case ICE_PFCP:
+ len = sizeof(struct ice_pfcp_hdr);
+ break;
case ICE_PPPOE:
len = sizeof(struct ice_pppoe_hdr);
break;
diff --git a/drivers/net/ethernet/intel/ice/ice_tc_lib.c b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
index 80797db9f2b9..2f2fce285ecd 100644
--- a/drivers/net/ethernet/intel/ice/ice_tc_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_tc_lib.c
@@ -38,6 +38,9 @@ ice_tc_count_lkups(u32 flags, struct ice_tc_flower_lyr_2_4_hdrs *headers,
if (flags & ICE_TC_FLWR_FIELD_GTP_OPTS)
lkups_cnt++;
+ if (flags & ICE_TC_FLWR_FIELD_PFCP_OPTS)
+ lkups_cnt++;
+
if (flags & (ICE_TC_FLWR_FIELD_ENC_SRC_IPV4 |
ICE_TC_FLWR_FIELD_ENC_DEST_IPV4 |
ICE_TC_FLWR_FIELD_ENC_SRC_IPV6 |
@@ -138,6 +141,8 @@ ice_proto_type_from_tunnel(enum ice_tunnel_type type)
return ICE_GTP;
case TNL_GTPC:
return ICE_GTP_NO_PAY;
+ case TNL_PFCP:
+ return ICE_PFCP;
default:
return 0;
}
@@ -157,6 +162,8 @@ ice_sw_type_from_tunnel(enum ice_tunnel_type type)
return ICE_SW_TUN_GTPU;
case TNL_GTPC:
return ICE_SW_TUN_GTPC;
+ case TNL_PFCP:
+ return ICE_SW_TUN_PFCP;
default:
return ICE_NON_TUN;
}
@@ -236,6 +243,22 @@ ice_tc_fill_tunnel_outer(u32 flags, struct ice_tc_flower_fltr *fltr,
i++;
}
+ if (flags & ICE_TC_FLWR_FIELD_PFCP_OPTS) {
+ struct ice_pfcp_hdr *hdr_h, *hdr_m;
+
+ hdr_h = &list[i].h_u.pfcp_hdr;
+ hdr_m = &list[i].m_u.pfcp_hdr;
+ list[i].type = ICE_PFCP;
+
+ hdr_h->flags = fltr->pfcp_meta_keys.type;
+ hdr_m->flags = fltr->pfcp_meta_masks.type & 0x01;
+
+ hdr_h->seid = fltr->pfcp_meta_keys.seid;
+ hdr_m->seid = fltr->pfcp_meta_masks.seid;
+
+ i++;
+ }
+
if (flags & (ICE_TC_FLWR_FIELD_ENC_SRC_IPV4 |
ICE_TC_FLWR_FIELD_ENC_DEST_IPV4)) {
list[i].type = ice_proto_type_from_ipv4(false);
@@ -366,8 +389,11 @@ ice_tc_fill_rules(struct ice_hw *hw, u32 flags,
if (tc_fltr->tunnel_type != TNL_LAST) {
i = ice_tc_fill_tunnel_outer(flags, tc_fltr, list, i);
- headers = &tc_fltr->inner_headers;
- inner = true;
+ /* PFCP is considered non-tunneled - don't swap headers. */
+ if (tc_fltr->tunnel_type != TNL_PFCP) {
+ headers = &tc_fltr->inner_headers;
+ inner = true;
+ }
}
if (flags & ICE_TC_FLWR_FIELD_ETH_TYPE_ID) {
@@ -621,6 +647,8 @@ static int ice_tc_tun_get_type(struct net_device *tunnel_dev)
*/
if (netif_is_gtp(tunnel_dev))
return TNL_GTPU;
+ if (netif_is_pfcp(tunnel_dev))
+ return TNL_PFCP;
return TNL_LAST;
}
@@ -1415,6 +1443,20 @@ ice_parse_tunnel_attr(struct net_device *dev, struct flow_rule *rule,
fltr->flags |= ICE_TC_FLWR_FIELD_GTP_OPTS;
}
+ if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ENC_OPTS) &&
+ fltr->tunnel_type == TNL_PFCP) {
+ struct flow_match_enc_opts match;
+
+ flow_rule_match_enc_opts(rule, &match);
+
+ memcpy(&fltr->pfcp_meta_keys, match.key->data,
+ sizeof(struct pfcp_metadata));
+ memcpy(&fltr->pfcp_meta_masks, match.mask->data,
+ sizeof(struct pfcp_metadata));
+
+ fltr->flags |= ICE_TC_FLWR_FIELD_PFCP_OPTS;
+ }
+
return 0;
}
@@ -1473,10 +1515,14 @@ ice_parse_cls_flower(struct net_device *filter_dev, struct ice_vsi *vsi,
return err;
}
- /* header pointers should point to the inner headers, outer
- * header were already set by ice_parse_tunnel_attr
- */
- headers = &fltr->inner_headers;
+ /* PFCP is considered non-tunneled - don't swap headers. */
+ if (fltr->tunnel_type != TNL_PFCP) {
+ /* Header pointers should point to the inner headers,
+ * outer header were already set by
+ * ice_parse_tunnel_attr().
+ */
+ headers = &fltr->inner_headers;
+ }
} else if (dissector->used_keys &
(BIT_ULL(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) |
BIT_ULL(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) |
--
2.43.0
Currently, tools have *ALIGN*() macros scattered across the unrelated
headers, as there are only 3 of them and they were added separately
each time on an as-needed basis.
Anyway, let's make it more consistent with the kernel headers and allow
using those macros outside of the mentioned headers. Create
<linux/align.h> inside the tools/ folder and include it where needed.
Signed-off-by: Alexander Lobakin <[email protected]>
---
tools/include/linux/align.h | 12 ++++++++++++
tools/include/linux/bitmap.h | 2 +-
tools/include/linux/mm.h | 5 +----
3 files changed, 14 insertions(+), 5 deletions(-)
create mode 100644 tools/include/linux/align.h
diff --git a/tools/include/linux/align.h b/tools/include/linux/align.h
new file mode 100644
index 000000000000..14e34ace80dd
--- /dev/null
+++ b/tools/include/linux/align.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _TOOLS_LINUX_ALIGN_H
+#define _TOOLS_LINUX_ALIGN_H
+
+#include <uapi/linux/const.h>
+
+#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
+#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
+#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
+
+#endif /* _TOOLS_LINUX_ALIGN_H */
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index f3566ea0f932..8c6852dba04f 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -3,6 +3,7 @@
#define _TOOLS_LINUX_BITMAP_H
#include <string.h>
+#include <linux/align.h>
#include <linux/bitops.h>
#include <linux/find.h>
#include <stdlib.h>
@@ -126,7 +127,6 @@ static inline bool bitmap_and(unsigned long *dst, const unsigned long *src1,
#define BITMAP_MEM_ALIGNMENT (8 * sizeof(unsigned long))
#endif
#define BITMAP_MEM_MASK (BITMAP_MEM_ALIGNMENT - 1)
-#define IS_ALIGNED(x, a) (((x) & ((typeof(x))(a) - 1)) == 0)
static inline bool bitmap_equal(const unsigned long *src1,
const unsigned long *src2, unsigned int nbits)
diff --git a/tools/include/linux/mm.h b/tools/include/linux/mm.h
index f3c82ab5b14c..7a6b98f4e579 100644
--- a/tools/include/linux/mm.h
+++ b/tools/include/linux/mm.h
@@ -2,8 +2,8 @@
#ifndef _TOOLS_LINUX_MM_H
#define _TOOLS_LINUX_MM_H
+#include <linux/align.h>
#include <linux/mmzone.h>
-#include <uapi/linux/const.h>
#define PAGE_SHIFT 12
#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT)
@@ -11,9 +11,6 @@
#define PHYS_ADDR_MAX (~(phys_addr_t)0)
-#define ALIGN(x, a) __ALIGN_KERNEL((x), (a))
-#define ALIGN_DOWN(x, a) __ALIGN_KERNEL((x) - ((a) - 1), (a))
-
#define PAGE_ALIGN(addr) ALIGN(addr, PAGE_SIZE)
#define __va(x) ((void *)((unsigned long)(x)))
--
2.43.0
From: Alexander Potapenko <[email protected]>
pr_err() messages may be treated as errors by some log readers, so let
us only use them for test failures. For non-error messages, replace them
with pr_info().
Suggested-by: Alexander Lobakin <[email protected]>
Signed-off-by: Alexander Potapenko <[email protected]>
Acked-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 46c015468077..a6e92cf5266a 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -507,7 +507,7 @@ static void __init test_bitmap_parselist(void)
}
if (ptest.flags & PARSE_TIME)
- pr_err("parselist: %d: input is '%s' OK, Time: %llu\n",
+ pr_info("parselist: %d: input is '%s' OK, Time: %llu\n",
i, ptest.in, time);
#undef ptest
@@ -546,7 +546,7 @@ static void __init test_bitmap_printlist(void)
goto out;
}
- pr_err("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
+ pr_info("bitmap_print_to_pagebuf: input is '%s', Time: %llu\n", buf, time);
out:
kfree(buf);
kfree(bmap);
@@ -624,7 +624,7 @@ static void __init test_bitmap_parse(void)
}
if (test.flags & PARSE_TIME)
- pr_err("parse: %d: input is '%s' OK, Time: %llu\n",
+ pr_info("parse: %d: input is '%s' OK, Time: %llu\n",
i, test.in, time);
}
}
@@ -1380,7 +1380,7 @@ static void __init test_bitmap_read_perf(void)
}
}
time = ktime_get() - time;
- pr_err("Time spent in %s:\t%llu\n", __func__, time);
+ pr_info("Time spent in %s:\t%llu\n", __func__, time);
}
static void __init test_bitmap_write_perf(void)
@@ -1402,7 +1402,7 @@ static void __init test_bitmap_write_perf(void)
}
}
time = ktime_get() - time;
- pr_err("Time spent in %s:\t%llu\n", __func__, time);
+ pr_info("Time spent in %s:\t%llu\n", __func__, time);
}
#undef TEST_BIT_LEN
--
2.43.0
Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the non-atomic bitops are macros which can
be expanded by the compilers into compile-time expressions, which will
result in better optimized object code. Unfortunately, turned out that
passing `volatile` to those macros discards any possibility of
optimization, as the compilers then don't even try to look whether
the passed bitmap is known at compilation time. In addition to that,
the mentioned linkmode helpers are marked with `inline`, not
`__always_inline`, meaning that it's not guaranteed some compiler won't
uninline them for no reason, which will also effectively prevent them
from being optimized (it's a well-known thing the compilers sometimes
uninline `2 + 2`).
Convert linkmode_*_bit() from inlines to macros. Their calling
convention are 1:1 with the corresponding bitops, so that it's not even
needed to enumerate and map the arguments, only the names. No changes in
vmlinux' object code (compiled by LLVM for x86_64) whatsoever, but that
doesn't necessarily means the change is meaningless.
Reviewed-by: Przemek Kitszel <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/linkmode.h | 27 ++++-----------------------
1 file changed, 4 insertions(+), 23 deletions(-)
diff --git a/include/linux/linkmode.h b/include/linux/linkmode.h
index 287f590ed56b..d94bfd9ac8cc 100644
--- a/include/linux/linkmode.h
+++ b/include/linux/linkmode.h
@@ -43,29 +43,10 @@ static inline int linkmode_andnot(unsigned long *dst, const unsigned long *src1,
return bitmap_andnot(dst, src1, src2, __ETHTOOL_LINK_MODE_MASK_NBITS);
}
-static inline void linkmode_set_bit(int nr, volatile unsigned long *addr)
-{
- __set_bit(nr, addr);
-}
-
-static inline void linkmode_clear_bit(int nr, volatile unsigned long *addr)
-{
- __clear_bit(nr, addr);
-}
-
-static inline void linkmode_mod_bit(int nr, volatile unsigned long *addr,
- int set)
-{
- if (set)
- linkmode_set_bit(nr, addr);
- else
- linkmode_clear_bit(nr, addr);
-}
-
-static inline int linkmode_test_bit(int nr, const volatile unsigned long *addr)
-{
- return test_bit(nr, addr);
-}
+#define linkmode_test_bit test_bit
+#define linkmode_set_bit __set_bit
+#define linkmode_clear_bit __clear_bit
+#define linkmode_mod_bit __assign_bit
static inline void linkmode_set_bit_array(const int *array, int array_size,
unsigned long *addr)
--
2.43.0
The number of times yet another open coded
`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
Some generic helper is long overdue.
Add one, bitmap_size(), but with one detail.
BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
divident and divisor are compile-time constants or when the divisor
is not a pow-of-2. When it is however, the compilers sometimes tend
to generate suboptimal code (GCC 13):
48 83 c0 3f add $0x3f,%rax
48 c1 e8 06 shr $0x6,%rax
48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx
%BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
full division of `nbits + 63` by it and then multiplication by 8.
Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:
8d 50 3f lea 0x3f(%rax),%edx
c1 ea 03 shr $0x3,%edx
81 e2 f8 ff ff 1f and $0x1ffffff8,%edx
Now it shifts `nbits + 63` by 3 positions (IOW performs fast division
by 8) and then masks bits[2:0]. bloat-o-meter:
add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)
Clang does it better and generates the same code before/after starting
from -O1, except that with the ALIGN() approach it uses %edx and thus
still saves some bytes:
add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)
Note that we can't expand DIV_ROUND_UP() by adding a check and using
this approach there, as it's used in array declarations where
expressions are not allowed.
Add this helper to tools/ as well.
Reviewed-by: Przemek Kitszel <[email protected]>
Acked-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitmap.h | 8 +++++---
include/linux/cpumask.h | 2 +-
tools/include/linux/bitmap.h | 7 ++++---
drivers/md/dm-clone-metadata.c | 5 -----
drivers/s390/cio/idset.c | 2 +-
lib/math/prime_numbers.c | 2 --
6 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 7ca0379be8c1..9a6a27a7f675 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -218,9 +218,11 @@ void bitmap_fold(unsigned long *dst, const unsigned long *orig,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = 0;
@@ -230,7 +232,7 @@ static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = ~0UL;
@@ -241,7 +243,7 @@ static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);
if (small_const_nbits(nbits))
*dst = *src;
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index cfb545841a2c..a2064c2a9441 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -839,7 +839,7 @@ static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
*/
static inline unsigned int cpumask_size(void)
{
- return BITS_TO_LONGS(large_cpumask_bits) * sizeof(long);
+ return bitmap_size(large_cpumask_bits);
}
/*
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index 8c6852dba04f..210c13b1b857 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -26,13 +26,14 @@ bool __bitmap_intersects(const unsigned long *bitmap1,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
if (small_const_nbits(nbits))
*dst = 0UL;
else {
- int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memset(dst, 0, len);
+ memset(dst, 0, bitmap_size(nbits));
}
}
@@ -84,7 +85,7 @@ static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
*/
static inline unsigned long *bitmap_zalloc(int nbits)
{
- return calloc(1, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
+ return calloc(1, bitmap_size(nbits));
}
/*
diff --git a/drivers/md/dm-clone-metadata.c b/drivers/md/dm-clone-metadata.c
index c43d55672bce..47c1fa7aad8b 100644
--- a/drivers/md/dm-clone-metadata.c
+++ b/drivers/md/dm-clone-metadata.c
@@ -465,11 +465,6 @@ static void __destroy_persistent_data_structures(struct dm_clone_metadata *cmd)
/*---------------------------------------------------------------------------*/
-static size_t bitmap_size(unsigned long nr_bits)
-{
- return BITS_TO_LONGS(nr_bits) * sizeof(long);
-}
-
static int __dirty_map_init(struct dirty_map *dmap, unsigned long nr_words,
unsigned long nr_regions)
{
diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
index 0a1105a483bf..e5f28370a903 100644
--- a/drivers/s390/cio/idset.c
+++ b/drivers/s390/cio/idset.c
@@ -18,7 +18,7 @@ struct idset {
static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
{
- return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
+ return bitmap_size(size_mul(num_ssid, num_id));
}
static struct idset *idset_new(int num_ssid, int num_id)
diff --git a/lib/math/prime_numbers.c b/lib/math/prime_numbers.c
index d42cebf7407f..d3b64b10da1c 100644
--- a/lib/math/prime_numbers.c
+++ b/lib/math/prime_numbers.c
@@ -6,8 +6,6 @@
#include <linux/prime_numbers.h>
#include <linux/slab.h>
-#define bitmap_size(nbits) (BITS_TO_LONGS(nbits) * sizeof(unsigned long))
-
struct primes {
struct rcu_head rcu;
unsigned long last, sz;
--
2.43.0
On Thu, Feb 1, 2024, at 13:21, Alexander Lobakin wrote:
> From: Syed Nayyar Waris <[email protected]>
>
> The two new functions allow reading/writing values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
>
> The code was taken from "bitops: Introduce the for_each_set_clump macro"
> by Syed Nayyar Waris with a number of changes and simplifications:
> - instead of using roundup(), which adds an unnecessary dependency
> on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
> - indentation is reduced by not using else-clauses (suggested by
> checkpatch for bitmap_get_value());
> - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
> and bitmap_write();
> - some redundant computations are omitted.
These functions feel like they should not be inline but are
better off in lib/bitmap.c given their length.
As far as I can tell, the header ends up being included
indirectly almost everywhere, so just parsing these functions
likey adds not just dependencies but also compile time.
Arnd
On Thu, Feb 1, 2024 at 2:23 PM Arnd Bergmann <[email protected]> wrote:
>
> On Thu, Feb 1, 2024, at 13:21, Alexander Lobakin wrote:
> > From: Syed Nayyar Waris <[email protected]>
> >
> > The two new functions allow reading/writing values of length up to
> > BITS_PER_LONG bits at arbitrary position in the bitmap.
> >
> > The code was taken from "bitops: Introduce the for_each_set_clump macro"
> > by Syed Nayyar Waris with a number of changes and simplifications:
> > - instead of using roundup(), which adds an unnecessary dependency
> > on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
> > - indentation is reduced by not using else-clauses (suggested by
> > checkpatch for bitmap_get_value());
> > - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
> > and bitmap_write();
> > - some redundant computations are omitted.
>
> These functions feel like they should not be inline but are
> better off in lib/bitmap.c given their length.
>
> As far as I can tell, the header ends up being included
> indirectly almost everywhere, so just parsing these functions
> likey adds not just dependencies but also compile time.
>
> Arnd
Removing particular functions from a header to reduce compilation time
does not really scale.
Do we know this case has a noticeable impact on the compilation time?
If yes, maybe we need to tackle this problem in a different way (e.g.
reduce the number of dependencies on it)?
On Thu, Feb 1, 2024, at 14:45, Alexander Potapenko wrote:
> On Thu, Feb 1, 2024 at 2:23 PM Arnd Bergmann <[email protected]> wrote:
>> On Thu, Feb 1, 2024, at 13:21, Alexander Lobakin wrote:
>>
>> As far as I can tell, the header ends up being included
>> indirectly almost everywhere, so just parsing these functions
>> likey adds not just dependencies but also compile time.
>>
>
> Removing particular functions from a header to reduce compilation time
> does not really scale.
> Do we know this case has a noticeable impact on the compilation time?
> If yes, maybe we need to tackle this problem in a different way (e.g.
> reduce the number of dependencies on it)?
Cleaning up the header dependencies is definitely possible in
theory, and there are other places we could start this, but
it's also a multi-year effort that several people have tried
without much success.
All I'm asking here is to not make it worse by adding this
one without need. If the function is not normally inlined
anyway, there is no benefit to having it in the header.
Arnd
From: Arnd Bergmann <[email protected]>
Date: Thu, 01 Feb 2024 14:23:33 +0100
> On Thu, Feb 1, 2024, at 13:21, Alexander Lobakin wrote:
>> From: Syed Nayyar Waris <[email protected]>
>>
>> The two new functions allow reading/writing values of length up to
>> BITS_PER_LONG bits at arbitrary position in the bitmap.
>>
>> The code was taken from "bitops: Introduce the for_each_set_clump macro"
>> by Syed Nayyar Waris with a number of changes and simplifications:
>> - instead of using roundup(), which adds an unnecessary dependency
>> on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
>> - indentation is reduced by not using else-clauses (suggested by
>> checkpatch for bitmap_get_value());
>> - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
>> and bitmap_write();
>> - some redundant computations are omitted.
>
> These functions feel like they should not be inline but are
> better off in lib/bitmap.c given their length.
When their arguments are compile-time constants, they got optimized
well. They're also used on hotpath, so making them external could hurt
performance + taking the first sentence into account, making them
external will hurt the performance even more, 'cause they won't be then
as optimized by the compiler as they are now.
>
> As far as I can tell, the header ends up being included
> indirectly almost everywhere, so just parsing these functions
> likey adds not just dependencies but also compile time.
>
> Arnd
Thanks,
Olek
On 2/1/24 13:22, Alexander Lobakin wrote:
> Currently, tools have *ALIGN*() macros scattered across the unrelated
> headers, as there are only 3 of them and they were added separately
> each time on an as-needed basis.
> Anyway, let's make it more consistent with the kernel headers and allow
> using those macros outside of the mentioned headers. Create
> <linux/align.h> inside the tools/ folder and include it where needed.
>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> tools/include/linux/align.h | 12 ++++++++++++
> tools/include/linux/bitmap.h | 2 +-
> tools/include/linux/mm.h | 5 +----
> 3 files changed, 14 insertions(+), 5 deletions(-)
> create mode 100644 tools/include/linux/align.h
>
Reviewed-by: Przemek Kitszel <[email protected]>
On 2/1/24 13:22, Alexander Lobakin wrote:
> Now that we have generic bitmap_read() and bitmap_write(), which are
> inline and try to take care of non-bound-crossing and aligned cases
> to keep them optimized, collapse bitmap_{get,set}_value8() into
> simple wrappers around the former ones.
> bloat-o-meter shows no difference in vmlinux and -2 bytes for
> gpio-pca953x.ko, which says the optimization didn't suffer due to
> that change. The converted helpers have the value width embedded
> and always compile-time constant and that helps a lot.
>
> Suggested-by: Yury Norov <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> include/linux/bitmap.h | 38 +++++---------------------------------
> 1 file changed, 5 insertions(+), 33 deletions(-)
>
Hah, nice!
Reviewed-by: Przemek Kitszel <[email protected]>
From: Alexander Lobakin <[email protected]>
Date: Thu, 1 Feb 2024 13:21:55 +0100
> Add support for creating PFCP filters in switchdev mode. Add pfcp module
> that allows to create a PFCP-type netdev. The netdev then can be passed to
> tc when creating a filter to indicate that PFCP filter should be created.
I believe folks agreed that bitmap_{read,write}() should stay inline,
ping then?
[...]
Thanks,
Olek
On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
> > Add support for creating PFCP filters in switchdev mode. Add pfcp module
> > that allows to create a PFCP-type netdev. The netdev then can be passed to
> > tc when creating a filter to indicate that PFCP filter should be created.
>
> I believe folks agreed that bitmap_{read,write}() should stay inline,
> ping then?
It's probably fine, IMHO. I mean, I think we agree that the rarely used
inlines should not sit in a header included by half of the kernel (not
an exaggeration). But IMHO a better fix would be to move out whatever
cpumask.h xarray.h and other common headers depend on to a cut-down
version rather than making your helpers not inline.
So I think all we need for now is for people to ack the respective
patches? Looks like cio and ntfs and missing acks, so are some of
the bitops core patches.
On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
> > Add support for creating PFCP filters in switchdev mode. Add pfcp module
> > that allows to create a PFCP-type netdev. The netdev then can be passed to
> > tc when creating a filter to indicate that PFCP filter should be created.
>
> I believe folks agreed that bitmap_{read,write}() should stay inline,
> ping then?
Well, Dave dropped this from PW, again. Can you ping people to give you
the acks and repost? What's your plan?
From: Jakub Kicinski <[email protected]>
Date: Wed, 7 Feb 2024 07:05:35 -0800
> On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
>>> Add support for creating PFCP filters in switchdev mode. Add pfcp module
>>> that allows to create a PFCP-type netdev. The netdev then can be passed to
>>> tc when creating a filter to indicate that PFCP filter should be created.
>>
>> I believe folks agreed that bitmap_{read,write}() should stay inline,
>> ping then?
>
> Well, Dave dropped this from PW, again. Can you ping people to give you
Why was it dropped? :D
> the acks and repost? What's your plan?
Ufff, I thought people read their emails...
Yury, Konstantin, s390 folks? Could you please give some missing acks? I
don't want to ping everyone privately :z
Thanks,
Olek
On 01.02.2024 13:22, Alexander Lobakin wrote:
> bitmap_size() is a pretty generic name and one may want to use it for
> a generic bitmap API function. At the same time, its logic is not
> "generic", i.e. it's not just `nbits -> size of bitmap in bytes`
> converter as it would be expected from its name.
> Add the prefix 'idset_' used throughout the file where the function
> resides.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Apologies for the delay.
Acked-by: Peter Oberparleiter <[email protected]>
--
Peter Oberparleiter
Linux on IBM Z Development - IBM Germany R&D
On Thu, Feb 01, 2024 at 01:21:57PM +0100, Alexander Lobakin wrote:
> From: Alexander Potapenko <[email protected]>
>
> Add basic tests ensuring that values can be added at arbitrary positions
> of the bitmap, including those spanning into the adjacent unsigned
> longs.
>
> Two new performance tests, test_bitmap_read_perf() and
> test_bitmap_write_perf(), can be used to assess future performance
> improvements of bitmap_read() and bitmap_write():
>
> [ 0.431119][ T1] test_bitmap: Time spent in test_bitmap_read_perf: 615253
> [ 0.433197][ T1] test_bitmap: Time spent in test_bitmap_write_perf: 916313
>
> (numbers from a Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz machine running
> QEMU).
>
> Signed-off-by: Alexander Potapenko <[email protected]>
> Reviewed-by: Andy Shevchenko <[email protected]>
> Acked-by: Yury Norov <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:21:58PM +0100, Alexander Lobakin wrote:
> From: Alexander Potapenko <[email protected]>
>
> pr_err() messages may be treated as errors by some log readers, so let
> us only use them for test failures. For non-error messages, replace them
> with pr_info().
>
> Suggested-by: Alexander Lobakin <[email protected]>
> Signed-off-by: Alexander Potapenko <[email protected]>
> Acked-by: Yury Norov <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:21:59PM +0100, Alexander Lobakin wrote:
> Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
> a new bitop, test_bit_acquire(), with proper wrapping in order to try to
> optimize it at compile-time, but missed the list of bitops used for
> checking their prototypes a bit below.
> The functions added have consistent prototypes, so that no more changes
> are required and no functional changes take place.
>
> Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
> Cc: [email protected] # 6.0+
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Acked-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:00PM +0100, Alexander Lobakin wrote:
> Avoid open-coding that simple expression each time by moving
> BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export
> it to the rest of the kernel.
> Simplify the macro while at it. `BITS_PER_LONG / sizeof(long)` always
> equals to %BITS_PER_BYTE, regardless of the target architecture.
> Do the same for the tools ecosystem as well (incl. its version of
> bitops.h). The previous implementation had its implicit type of long,
> while the new one is int, so adjust the format literal accordingly in
> the perf code.
>
> Suggested-by: Andy Shevchenko <[email protected]>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Acked-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:01PM +0100, Alexander Lobakin wrote:
> Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
> on compile-time constants"), the compilers are able to expand inline
> bitmap operations to compile-time initializers when possible.
> However, during the round of replacement if-__set-else-__clear with
> __assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes
> difference in object code size for one module (even one function),
> where the pattern:
>
> DECLARE_BITMAP(foo) = { }; // on the stack, zeroed
>
> if (a)
> __set_bit(const_bit_num, foo);
> if (b)
> __set_bit(another_const_bit_num, foo);
> ...
>
> is heavily used, although there should be no difference: the bitmap is
> zeroed, so the second half of __assign_bit() should be compiled-out as
> a no-op.
> I either missed the fact that __assign_bit() has bitmap pointer marked
> as `volatile` (as we usually do for bitops) or was hoping that the
> compilers would at least try to look past the `volatile` for
> __always_inline functions. Anyhow, due to that attribute, the compilers
> were always compiling the whole expression and no mentioned compile-time
> optimizations were working.
>
> Convert __assign_bit() to a macro since it's a very simple if-else and
> all of the checks are performed inside __set_bit() and __clear_bit(),
> thus that wrapper has to be as transparent as possible. After that
> change, despite it showing only -20 bytes change for vmlinux (due to
> that it's still relatively unpopular), no drastic code size changes
> happen when replacing if-set-else-clear for onstack bitmaps with
> __assign_bit(), meaning the compiler now expands them to the actual
> operations will all the expected optimizations.
>
> Atomic assign_bit() is less affected due to its nature, but let's
> convert it to a macro as well to keep the code consistent and not
> leave a place for possible suboptimal codegen. Moreover, with certain
> kernel configuration it actually gives some saves (x86):
>
> do_ip_setsockopt 4154 4099 -55
>
> Suggested-by: Yury Norov <[email protected]> # assign_bit(), too
> Cc: Andy Shevchenko <[email protected]>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Acked-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:02PM +0100, Alexander Lobakin wrote:
> Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
> on compile-time constants"), the non-atomic bitops are macros which can
> be expanded by the compilers into compile-time expressions, which will
> result in better optimized object code. Unfortunately, turned out that
> passing `volatile` to those macros discards any possibility of
> optimization, as the compilers then don't even try to look whether
> the passed bitmap is known at compilation time. In addition to that,
> the mentioned linkmode helpers are marked with `inline`, not
> `__always_inline`, meaning that it's not guaranteed some compiler won't
> uninline them for no reason, which will also effectively prevent them
> from being optimized (it's a well-known thing the compilers sometimes
> uninline `2 + 2`).
> Convert linkmode_*_bit() from inlines to macros. Their calling
> convention are 1:1 with the corresponding bitops, so that it's not even
> needed to enumerate and map the arguments, only the names. No changes in
> vmlinux' object code (compiled by LLVM for x86_64) whatsoever, but that
> doesn't necessarily means the change is meaningless.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Acked-by: Jakub Kicinski <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Acked-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:04PM +0100, Alexander Lobakin wrote:
> bitmap_size() is a pretty generic name and one may want to use it for
> a generic bitmap API function. At the same time, its logic is
> NTFS-specific, as it aligns to the sizeof(u64), not the sizeof(long)
> (although it uses ideologically right ALIGN() instead of division).
> Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_' to not mix
> it with the legacy module) and use generic BITS_TO_U64() while at it.
>
> Suggested-by: Yury Norov <[email protected]> # BITS_TO_U64()
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:05PM +0100, Alexander Lobakin wrote:
> bitmap_set_bits() does not start with the FS' prefix and may collide
> with a new generic helper one day. It operates with the FS-specific
> types, so there's no change those two could do the same thing.
> Just add the prefix to exclude such possible conflict.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Acked-by: David Sterba <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
On Wed, Feb 28, 2024 at 08:28:31AM -0800, Yury Norov wrote:
> On Thu, Feb 01, 2024 at 01:22:06PM +0100, Alexander Lobakin wrote:
> > Currently, tools have *ALIGN*() macros scattered across the unrelated
> > headers, as there are only 3 of them and they were added separately
> > each time on an as-needed basis.
> > Anyway, let's make it more consistent with the kernel headers and allow
> > using those macros outside of the mentioned headers. Create
> > <linux/align.h> inside the tools/ folder and include it where needed.
> >
> > Signed-off-by: Alexander Lobakin <[email protected]>
>
> Reviewed-by: Yury Norov <[email protected]>
Sorry, please read this as:
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:07PM +0100, Alexander Lobakin wrote:
> The number of times yet another open coded
> `BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
> Some generic helper is long overdue.
>
> Add one, bitmap_size(), but with one detail.
> BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
> divident and divisor are compile-time constants or when the divisor
> is not a pow-of-2. When it is however, the compilers sometimes tend
> to generate suboptimal code (GCC 13):
>
> 48 83 c0 3f add $0x3f,%rax
> 48 c1 e8 06 shr $0x6,%rax
> 48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx
>
> %BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
> full division of `nbits + 63` by it and then multiplication by 8.
> Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:
>
> 8d 50 3f lea 0x3f(%rax),%edx
> c1 ea 03 shr $0x3,%edx
> 81 e2 f8 ff ff 1f and $0x1ffffff8,%edx
>
> Now it shifts `nbits + 63` by 3 positions (IOW performs fast division
> by 8) and then masks bits[2:0]. bloat-o-meter:
>
> add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)
>
> Clang does it better and generates the same code before/after starting
> from -O1, except that with the ALIGN() approach it uses %edx and thus
> still saves some bytes:
>
> add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)
>
> Note that we can't expand DIV_ROUND_UP() by adding a check and using
> this approach there, as it's used in array declarations where
> expressions are not allowed.
> Add this helper to tools/ as well.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Acked-by: Yury Norov <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:08PM +0100, Alexander Lobakin wrote:
> Now that we have generic bitmap_read() and bitmap_write(), which are
> inline and try to take care of non-bound-crossing and aligned cases
> to keep them optimized, collapse bitmap_{get,set}_value8() into
> simple wrappers around the former ones.
> bloat-o-meter shows no difference in vmlinux and -2 bytes for
> gpio-pca953x.ko, which says the optimization didn't suffer due to
> that change. The converted helpers have the value width embedded
> and always compile-time constant and that helps a lot.
>
> Suggested-by: Yury Norov <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:09PM +0100, Alexander Lobakin wrote:
> Commit dc34d5036692 ("lib: test_bitmap: add compile-time
> optimization/evaluations assertions") initially missed __assign_bit(),
> which led to that quite a time passed before I realized it doesn't get
> optimized at compilation time. Now that it does, add test for that just
> to make sure nothing will break one day.
> To make things more interesting, use bitmap_complement() and
> bitmap_full(), thus checking their compile-time evaluation as well. And
> remove the misleading comment mentioning the workaround removed recently
> in favor of adding the whole file to GCov exceptions.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 03:02:50PM +0100, Arnd Bergmann wrote:
> On Thu, Feb 1, 2024, at 14:45, Alexander Potapenko wrote:
> > On Thu, Feb 1, 2024 at 2:23 PM Arnd Bergmann <[email protected]> wrote:
> >> On Thu, Feb 1, 2024, at 13:21, Alexander Lobakin wrote:
> >>
> >> As far as I can tell, the header ends up being included
> >> indirectly almost everywhere, so just parsing these functions
> >> likey adds not just dependencies but also compile time.
> >>
> >
> > Removing particular functions from a header to reduce compilation time
> > does not really scale.
> > Do we know this case has a noticeable impact on the compilation time?
> > If yes, maybe we need to tackle this problem in a different way (e.g.
> > reduce the number of dependencies on it)?
>
> Cleaning up the header dependencies is definitely possible in
> theory, and there are other places we could start this, but
> it's also a multi-year effort that several people have tried
> without much success.
>
> All I'm asking here is to not make it worse by adding this
> one without need. If the function is not normally inlined
> anyway, there is no benefit to having it in the header.
>
> Arnd
Hi Arnd,
I think Alexander has shown that the functions are normally inlined.
If for some target that doesn't hold, we'd use __always_inline.
They are very lightweight by nature - one or at max two word fetches
followed by some shifting. We spent quite some cycles making sure
that the generated code looks efficient, at least not worse than the
existing bitmap_{get,set}_value8(), which is a special case of the
bitmap_{read,write}.
I agree that bitmap header is overwhelmed (like many other kernel
headers), and I'm working on unloading it.
I checked allyesconfig build time before and after this patch, and
I found no difference for me. So if you're concerned about compilation
time, this patch doesn't make things worse in this department.
With all that, Alexander, can you please double-check that the
functions get inlined, and if so:
Signed-off-by: Yury Norov <[email protected]>
On Thu, Feb 01, 2024 at 01:22:12PM +0100, Alexander Lobakin wrote:
> Now that there are helpers for converting IP tunnel flags between the
> old __be16 format and the bitmap format, make sure they work as expected
> by adding a couple of tests to the bitmap testing suite. The helpers are
> all inline, so no dependencies on the related CONFIG_* (or a standalone
> module) are needed.
>
> Cover three possible cases:
>
> 1. No bits past BIT(15) are set, VTI/SIT bits are not set. This
> conversion is almost a direct assignment.
> 2. No bits past BIT(15) are set, but VTI/SIT bit is set. During the
> conversion, it must be transformed into BIT(16) in the bitmap,
> but still compatible with the __be16 format.
> 3. The bitmap has bits past BIT(15) set (not the VTI/SIT one). The
> result will be truncated.
> Note that currently __IP_TUNNEL_FLAG_NUM is 17 (incl. special),
> which means that the result of this case is currently
> semi-false-positive. When BIT(17) is finally here, it will be
> adjusted accordingly.
>
> Signed-off-by: Alexander Lobakin <[email protected]>
So why testing IP tunnels stuff in lib/test_bitmap? I think it should
go with the rest of networking code.
> ---
> lib/test_bitmap.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 105 insertions(+)
>
> diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
> index 4ee1f8ceb51d..270afc0cba5c 100644
> --- a/lib/test_bitmap.c
> +++ b/lib/test_bitmap.c
> @@ -14,6 +14,8 @@
> #include <linux/string.h>
> #include <linux/uaccess.h>
>
> +#include <net/ip_tunnels.h>
> +
> #include "../tools/testing/selftests/kselftest_module.h"
>
> #define EXP1_IN_BITS (sizeof(exp1) * 8)
> @@ -1409,6 +1411,108 @@ static void __init test_bitmap_write_perf(void)
>
> #undef TEST_BIT_LEN
>
> +struct ip_tunnel_flags_test {
> + const u16 *src_bits;
> + const u16 *exp_bits;
> + u8 src_num;
> + u8 exp_num;
> + __be16 exp_val;
> + bool exp_comp:1;
> +};
> +
> +#define IP_TUNNEL_FLAGS_TEST(src, comp, eval, exp) { \
> + .src_bits = (src), \
> + .src_num = ARRAY_SIZE(src), \
> + .exp_comp = (comp), \
> + .exp_val = (eval), \
> + .exp_bits = (exp), \
> + .exp_num = ARRAY_SIZE(exp), \
> +}
> +
> +/* These are __be16-compatible and can be compared as is */
> +static const u16 ip_tunnel_flags_1[] __initconst = {
> + IP_TUNNEL_KEY_BIT,
> + IP_TUNNEL_STRICT_BIT,
> + IP_TUNNEL_ERSPAN_OPT_BIT,
> +};
> +
> +/*
> + * Due to the previous flags design limitation, setting either
> + * ``IP_TUNNEL_CSUM_BIT`` (on Big Endian) or ``IP_TUNNEL_DONT_FRAGMENT_BIT``
> + * (on Little) also sets VTI/ISATAP bit. In the bitmap implementation, they
> + * correspond to ``BIT(16)``, which is bigger than ``U16_MAX``, but still is
> + * backward-compatible.
> + */
> +#ifdef __BIG_ENDIAN
> +#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_CSUM_BIT
> +#else
> +#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_DONT_FRAGMENT_BIT
> +#endif
> +
> +static const u16 ip_tunnel_flags_2_src[] __initconst = {
> + IP_TUNNEL_CONFLICT_BIT,
> +};
> +
> +static const u16 ip_tunnel_flags_2_exp[] __initconst = {
> + IP_TUNNEL_CONFLICT_BIT,
> + IP_TUNNEL_SIT_ISATAP_BIT,
> +};
> +
> +/* Bits 17 and higher are not compatible with __be16 flags */
> +static const u16 ip_tunnel_flags_3_src[] __initconst = {
> + IP_TUNNEL_VXLAN_OPT_BIT,
> + 17,
> + 18,
> + 20,
> +};
> +
> +static const u16 ip_tunnel_flags_3_exp[] __initconst = {
> + IP_TUNNEL_VXLAN_OPT_BIT,
> +};
> +
> +static const struct ip_tunnel_flags_test ip_tunnel_flags_test[] __initconst = {
> + IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_1, true,
> + cpu_to_be16(BIT(IP_TUNNEL_KEY_BIT) |
> + BIT(IP_TUNNEL_STRICT_BIT) |
> + BIT(IP_TUNNEL_ERSPAN_OPT_BIT)),
> + ip_tunnel_flags_1),
> + IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_2_src, true, VTI_ISVTI,
> + ip_tunnel_flags_2_exp),
> + IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_3_src,
> + /*
> + * This must be set to ``false`` once
> + * ``__IP_TUNNEL_FLAG_NUM`` goes above 17.
> + */
> + true,
> + cpu_to_be16(BIT(IP_TUNNEL_VXLAN_OPT_BIT)),
> + ip_tunnel_flags_3_exp),
> +};
> +
> +static void __init test_ip_tunnel_flags(void)
> +{
> + for (u32 i = 0; i < ARRAY_SIZE(ip_tunnel_flags_test); i++) {
> + typeof(*ip_tunnel_flags_test) *test = &ip_tunnel_flags_test[i];
> + IP_TUNNEL_DECLARE_FLAGS(src) = { };
> + IP_TUNNEL_DECLARE_FLAGS(exp) = { };
> + IP_TUNNEL_DECLARE_FLAGS(out);
> +
> + for (u32 j = 0; j < test->src_num; j++)
> + __set_bit(test->src_bits[j], src);
> +
> + for (u32 j = 0; j < test->exp_num; j++)
> + __set_bit(test->exp_bits[j], exp);
> +
> + ip_tunnel_flags_from_be16(out, test->exp_val);
> +
> + expect_eq_uint(test->exp_comp,
> + ip_tunnel_flags_is_be16_compat(src));
> + expect_eq_uint((__force u16)test->exp_val,
> + (__force u16)ip_tunnel_flags_to_be16(src));
> +
> + __ipt_flag_op(expect_eq_bitmap, exp, out);
> + }
> +}
> +
> static void __init selftest(void)
> {
> test_zero_clear();
> @@ -1428,6 +1532,7 @@ static void __init selftest(void)
> test_bitmap_read_write();
> test_bitmap_read_perf();
> test_bitmap_write_perf();
> + test_ip_tunnel_flags();
>
> test_find_nth_bit();
> test_for_each_set_bit();
> --
> 2.43.0
On Mon, Feb 12, 2024 at 12:35:38PM +0100, Alexander Lobakin wrote:
> From: Jakub Kicinski <[email protected]>
> Date: Wed, 7 Feb 2024 07:05:35 -0800
>
> > On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
> >>> Add support for creating PFCP filters in switchdev mode. Add pfcp module
> >>> that allows to create a PFCP-type netdev. The netdev then can be passed to
> >>> tc when creating a filter to indicate that PFCP filter should be created.
> >>
> >> I believe folks agreed that bitmap_{read,write}() should stay inline,
> >> ping then?
> >
> > Well, Dave dropped this from PW, again. Can you ping people to give you
>
> Why was it dropped? :D
>
> > the acks and repost? What's your plan?
>
> Ufff, I thought people read their emails...
>
> Yury, Konstantin, s390 folks? Could you please give some missing acks? I
> don't want to ping everyone privately :z
Hi Alexander, Jakub,
I reviewed the series again and added my SOBs for bitmap-related
patches, and Acks or RBs for the rest, where appropriate.
Regarding the patch #17, I don't think that network-related tests
should be hosted in lib/test-bitmap. This is not a critical issue,
but Alexander, can you find a better place for the code?
The rest of the series is OK for me. I think Jakub wants to pull this
as a whole in his -net branch? If so please go ahead, if not - I can
pull bitmap-related part in bitmap-for-next.
Thanks,
Yury
On Thu, Feb 01, 2024 at 01:22:06PM +0100, Alexander Lobakin wrote:
> Currently, tools have *ALIGN*() macros scattered across the unrelated
> headers, as there are only 3 of them and they were added separately
> each time on an as-needed basis.
> Anyway, let's make it more consistent with the kernel headers and allow
> using those macros outside of the mentioned headers. Create
> <linux/align.h> inside the tools/ folder and include it where needed.
>
> Signed-off-by: Alexander Lobakin <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
From: Yury Norov <[email protected]>
Date: Wed, 28 Feb 2024 08:38:49 -0800
> On Thu, Feb 01, 2024 at 01:22:12PM +0100, Alexander Lobakin wrote:
>> Now that there are helpers for converting IP tunnel flags between the
>> old __be16 format and the bitmap format, make sure they work as expected
>> by adding a couple of tests to the bitmap testing suite. The helpers are
>> all inline, so no dependencies on the related CONFIG_* (or a standalone
>> module) are needed.
>>
>> Cover three possible cases:
>>
>> 1. No bits past BIT(15) are set, VTI/SIT bits are not set. This
>> conversion is almost a direct assignment.
>> 2. No bits past BIT(15) are set, but VTI/SIT bit is set. During the
>> conversion, it must be transformed into BIT(16) in the bitmap,
>> but still compatible with the __be16 format.
>> 3. The bitmap has bits past BIT(15) set (not the VTI/SIT one). The
>> result will be truncated.
>> Note that currently __IP_TUNNEL_FLAG_NUM is 17 (incl. special),
>> which means that the result of this case is currently
>> semi-false-positive. When BIT(17) is finally here, it will be
>> adjusted accordingly.
>>
>> Signed-off-by: Alexander Lobakin <[email protected]>
>
> So why testing IP tunnels stuff in lib/test_bitmap? I think it should
> go with the rest of networking code.
When I was creating these tests, there was no networking-specific KUnit,
but now we have one, so it makes sense to move it.
Thanks,
Olek
On Mon, 2024-02-12 at 12:35 +0100, Alexander Lobakin wrote:
> From: Jakub Kicinski <[email protected]>
> Date: Wed, 7 Feb 2024 07:05:35 -0800
>
> > On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
> > > > Add support for creating PFCP filters in switchdev mode. Add pfcp module
> > > > that allows to create a PFCP-type netdev. The netdev then can be passed to
> > > > tc when creating a filter to indicate that PFCP filter should be created.
> > >
> > > I believe folks agreed that bitmap_{read,write}() should stay inline,
> > > ping then?
> >
> > Well, Dave dropped this from PW, again. Can you ping people to give you
>
> Why was it dropped? :D
>
> > the acks and repost? What's your plan?
>
> Ufff, I thought people read their emails...
>
> Yury, Konstantin, s390 folks? Could you please give some missing acks? I
> don't want to ping everyone privately :z
>
> Thanks,
> Olek
>
I do see an Acked-by from Peter Oberparleiter for the s390/cio bit, so
as far as I can tell the s390 part has all Acks necessary, no?
Thanks,
Niklas
From: Niklas Schnelle <[email protected]>
Date: Tue, 02 Apr 2024 12:59:19 +0200
> On Mon, 2024-02-12 at 12:35 +0100, Alexander Lobakin wrote:
>> From: Jakub Kicinski <[email protected]>
>> Date: Wed, 7 Feb 2024 07:05:35 -0800
>>
>>> On Tue, 6 Feb 2024 13:46:44 +0100 Alexander Lobakin wrote:
>>>>> Add support for creating PFCP filters in switchdev mode. Add pfcp module
>>>>> that allows to create a PFCP-type netdev. The netdev then can be passed to
>>>>> tc when creating a filter to indicate that PFCP filter should be created.
>>>>
>>>> I believe folks agreed that bitmap_{read,write}() should stay inline,
>>>> ping then?
>>>
>>> Well, Dave dropped this from PW, again. Can you ping people to give you
>>
>> Why was it dropped? :D
>>
>>> the acks and repost? What's your plan?
>>
>> Ufff, I thought people read their emails...
>>
>> Yury, Konstantin, s390 folks? Could you please give some missing acks? I
>> don't want to ping everyone privately :z
>>
>> Thanks,
>> Olek
>>
>
> I do see an Acked-by from Peter Oberparleiter for the s390/cio bit, so
> as far as I can tell the s390 part has all Acks necessary, no?
The series was already taken to net-next :D
>
> Thanks,
> Niklas
Thanks,
Olek