2023-10-09 15:13:45

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 00/14] ip_tunnel: convert __be16 tunnel flags to bitmaps

Derived from the PFCP support series[0] as this grew bigger (2 -> 14
commits) and involved more core bitmap changes. Only commits 10 and 11
are from the mentioned tree, the rest is new. PFCP itself still depends
on this series.

IP tunnels have their flags defined as `__be16`, including UAPI, and
after GTP was accepted, there are no more free bits left. UAPI (incl.
direct usage of one of the user structs) and explicit Endianness only
complicate things.
Since it would either way end up with hundreds of locs due to all that,
pick bitmaps right from the start to store the flags in the most native
and scalable format with rich API. I don't think it's worth trying to
praise luck and pick smth like u32 only to redo everything in x years :)
More details regarding the IP tunnel flags is in 11 and 14.

The rest is just a good bunch of prereqs and tests: a couple of new
helpers and extensions to the old ones, a few optimizations to partially
mitigate IP tunnel object code growth due to __be16 -> long, and
decouping one UAPI struct used throughout the whole kernel into the
userspace and the kernel space counterparts to eliminate the dependency.

[0] https://lore.kernel.org/netdev/[email protected]

Alexander Lobakin (14):
bitops: add missing prototype check
bitops: make BYTES_TO_BITS() treewide-available
bitops: let the compiler optimize __assign_bit()
linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros
s390/cio: rename bitmap_size() -> idset_bitmap_size()
fs/ntfs3: rename bitmap_size() -> ntfs3_bitmap_size()
btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
bitmap: introduce generic optimized bitmap_size()
bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()
ip_tunnel: use a separate struct to store tunnel params in the kernel
ip_tunnel: convert __be16 tunnel flags to bitmaps
lib/bitmap: add compile-time test for __assign_bit() optimization
lib/bitmap: add tests for bitmap_{get,set}_bits()
lib/bitmap: add tests for IP tunnel flags conversion helpers

drivers/md/dm-clone-metadata.c | 5 -
drivers/net/bareudp.c | 19 +-
.../ethernet/mellanox/mlx5/core/en/tc_tun.h | 2 +-
.../mellanox/mlx5/core/en/tc_tun_encap.c | 6 +-
.../mellanox/mlx5/core/en/tc_tun_geneve.c | 12 +-
.../mellanox/mlx5/core/en/tc_tun_gre.c | 8 +-
.../mellanox/mlx5/core/en/tc_tun_vxlan.c | 9 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 16 +-
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 56 +++--
.../ethernet/mellanox/mlxsw/spectrum_ipip.h | 2 +-
.../ethernet/mellanox/mlxsw/spectrum_span.c | 10 +-
.../ethernet/netronome/nfp/flower/action.c | 27 ++-
drivers/net/geneve.c | 44 ++--
drivers/net/vxlan/vxlan_core.c | 14 +-
drivers/s390/cio/idset.c | 10 +-
fs/btrfs/free-space-cache.c | 8 +-
fs/ntfs3/bitmap.c | 4 +-
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 +-
fs/ntfs3/ntfs_fs.h | 2 +-
fs/ntfs3/super.c | 2 +-
include/linux/bitmap.h | 59 ++++--
include/linux/bitops.h | 13 +-
include/linux/cpumask.h | 2 +-
include/linux/linkmode.h | 27 +--
include/linux/netdevice.h | 7 +-
include/net/dst_metadata.h | 10 +-
include/net/flow_dissector.h | 2 +-
include/net/gre.h | 70 +++---
include/net/ip6_tunnel.h | 4 +-
include/net/ip_tunnels.h | 136 ++++++++++--
include/net/udp_tunnel.h | 4 +-
include/uapi/linux/if_tunnel.h | 33 +++
kernel/trace/trace_probe.c | 2 -
lib/math/prime_numbers.c | 2 -
lib/test_bitmap.c | 200 +++++++++++++++++-
net/bridge/br_vlan_tunnel.c | 9 +-
net/core/filter.c | 26 +--
net/core/flow_dissector.c | 20 +-
net/ipv4/fou_bpf.c | 2 +-
net/ipv4/gre_demux.c | 2 +-
net/ipv4/ip_gre.c | 144 ++++++++-----
net/ipv4/ip_tunnel.c | 109 +++++++---
net/ipv4/ip_tunnel_core.c | 82 ++++---
net/ipv4/ip_vti.c | 41 ++--
net/ipv4/ipip.c | 33 +--
net/ipv4/ipmr.c | 2 +-
net/ipv4/udp_tunnel_core.c | 5 +-
net/ipv6/addrconf.c | 3 +-
net/ipv6/ip6_gre.c | 85 ++++----
net/ipv6/ip6_tunnel.c | 14 +-
net/ipv6/sit.c | 38 ++--
net/netfilter/ipvs/ip_vs_core.c | 6 +-
net/netfilter/ipvs/ip_vs_xmit.c | 20 +-
net/netfilter/nft_tunnel.c | 44 ++--
net/openvswitch/flow_netlink.c | 61 +++---
net/psample/psample.c | 26 +--
net/sched/act_tunnel_key.c | 36 ++--
net/sched/cls_flower.c | 27 +--
tools/include/linux/bitmap.h | 8 +-
tools/include/linux/bitops.h | 2 +
tools/perf/util/probe-finder.c | 2 -
62 files changed, 1116 insertions(+), 571 deletions(-)

---
Not sure whether it's fine to have that all in one series, but OTOH
there's not much stuff I could split (like, 3 commits), it either
depends directly (new helpers etc.) or will just generate suboptimal
code w/o some of the commits.

I'm also thinking of which tree this would ideally be taken through.
The main subject is networking, but most of the commits are generic.
My idea is to push this via Yury / bitmaps and then ask the netdev
maintainers to pull his tree before they take PFCP (dependent on this
one).

Speaking of bitmap_{read,write}() from [1] vs bitmap_{get,set}_bits()
from #09: they don't really conflict, because the former are
generic-generic and support bound crossing, while the latter require
the width to be a pow-2 and the offset to be a multiple of the width
in order to preserve the optimization level as close to the current
bitmap_{get,set}_value8() as possible...

Old pfcp -> bitmap changelog:

As for former commits (now 10 and 11), almost all of the changes were
suggested by Andy, notably: stop violating bitmap API, use
__assign_bit() where appropriate, and add more tests to make sure
everything works as expected. Apart from that, add simple wrappers for
bitmap_*() used in the IP tunnel code to avoid manually specifying
``__IP_TUNNEL_FLAG_NUM`` each time.

[1] https://lore.kernel.org/lkml/[email protected]
--
2.41.0


2023-10-09 15:13:48

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 01/14] bitops: add missing prototype check

Commit 8238b4579866 ("wait_on_bit: add an acquire memory barrier") added
a new bitop, test_bit_acquire(), with proper wrapping to try optimize it
at compile-time, but missed the list of bitops used for checking their
prototypes a bit below.
The functions added have consistent prototypes, so that no more changes
are required and no functional changes take place.

Fixes: 8238b4579866 ("wait_on_bit: add an acquire memory barrier")
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..f7f5a783da2a 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -80,6 +80,7 @@ __check_bitop_pr(__test_and_set_bit);
__check_bitop_pr(__test_and_clear_bit);
__check_bitop_pr(__test_and_change_bit);
__check_bitop_pr(test_bit);
+__check_bitop_pr(test_bit_acquire);

#undef __check_bitop_pr

--
2.41.0

2023-10-09 15:13:49

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 02/14] bitops: make BYTES_TO_BITS() treewide-available

Avoid open-coding that simple expression each time by moving
BYTES_TO_BITS() from the probes code to <linux/bitops.h> to export
it to the rest of the kernel.
Simplify the macro while at it. `BITS_PER_LONG / sizeof(long)` always
equals to %BITS_PER_BYTE, regardless of the target architecture.
Do the same for the tools ecosystem as well (incl. its version of
bitops.h).

Suggested-by: Andy Shevchenko <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 2 ++
kernel/trace/trace_probe.c | 2 --
tools/include/linux/bitops.h | 2 ++
tools/perf/util/probe-finder.c | 2 --
4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index f7f5a783da2a..e0cd09eb91cd 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -21,6 +21,8 @@
#define BITS_TO_U32(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) __KERNEL_DIV_ROUND_UP(nr, BITS_PER_TYPE(char))

+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 4dc74d73fc1d..2b743c1e37db 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -1053,8 +1053,6 @@ parse_probe_arg(char *arg, const struct fetch_type *type,
return ret;
}

-#define BYTES_TO_BITS(nb) ((BITS_PER_LONG * (nb)) / sizeof(long))
-
/* Bitfield type needs to be parsed into a fetch function */
static int __parse_bitfield_probe_arg(const char *bf,
const struct fetch_type *t,
diff --git a/tools/include/linux/bitops.h b/tools/include/linux/bitops.h
index f18683b95ea6..bc6600466e7b 100644
--- a/tools/include/linux/bitops.h
+++ b/tools/include/linux/bitops.h
@@ -20,6 +20,8 @@
#define BITS_TO_U32(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(u32))
#define BITS_TO_BYTES(nr) DIV_ROUND_UP(nr, BITS_PER_TYPE(char))

+#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_BYTE)
+
extern unsigned int __sw_hweight8(unsigned int w);
extern unsigned int __sw_hweight16(unsigned int w);
extern unsigned int __sw_hweight32(unsigned int w);
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f171360b0ef4..35f66c12ad8a 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -304,8 +304,6 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,
return ret2;
}

-#define BYTES_TO_BITS(nb) ((nb) * BITS_PER_LONG / sizeof(long))
-
static int convert_variable_type(Dwarf_Die *vr_die,
struct probe_trace_arg *tvar,
const char *cast, bool user_access)
--
2.41.0

2023-10-09 15:13:51

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 03/14] bitops: let the compiler optimize __assign_bit()

Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the compilers are able to expand inline
bitmap operations to compile-time initializers when possible.
However, during the round of replacement if-__set-else-__clear with
__assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes
difference in object code size for one module (even one function),
where the pattern:

DECLARE_BITMAP(foo) = { }; // on the stack, zeroed

if (a)
__set_bit(const_bit_num, foo);
if (b)
__set_bit(another_const_bit_num, foo);
...

is heavily used, although there should be no difference: the bitmap is
zeroed, so the second half of __assign_bit() should be compiled-out as
a no-op.
I either missed the fact that __assign_bit() has bitmap pointer marked
as `volatile` (as we usually do for bitmaps) or was hoping that the
compilers would at least try to look past the `volatile` for
__always_inline functions. Anyhow, due to that attribute, the compilers
were always compiling the whole expression and no mentioned compile-time
optimizations were working.

Convert __assign_bit() to a macro since it's a very simple if-else and
all of the checks are performed inside __set_bit() and __clear_bit(),
thus that wrapper has to be as transparent as possible. After that
change, despite it showing only -20 bytes change for vmlinux (due to
that it's still relatively unpopular), no drastic code size changes
happen when replacing if-set-else-clear for onstack bitmaps with
__assign_bit(), meaning the compiler now expands them to the actual
operations will all the expected optimizations.

Cc: Andy Shevchenko <[email protected]>
Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitops.h | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index e0cd09eb91cd..f98f4fd1047f 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -284,14 +284,8 @@ static __always_inline void assign_bit(long nr, volatile unsigned long *addr,
clear_bit(nr, addr);
}

-static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
- bool value)
-{
- if (value)
- __set_bit(nr, addr);
- else
- __clear_bit(nr, addr);
-}
+#define __assign_bit(nr, addr, value) \
+ ((value) ? __set_bit(nr, addr) : __clear_bit(nr, addr))

/**
* __ptr_set_bit - Set bit in a pointer's value
--
2.41.0

2023-10-09 15:14:14

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 06/14] fs/ntfs3: rename bitmap_size() -> ntfs3_bitmap_size()

bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is
NTFS-specific, as it aligns to the sizeof(u64), not the sizeof(long)
(although it uses ideologically right ALIGN() instead of division).
Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_' to not mix
it with the legacy module).

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
fs/ntfs3/bitmap.c | 4 ++--
fs/ntfs3/fsntfs.c | 2 +-
fs/ntfs3/index.c | 11 ++++++-----
fs/ntfs3/ntfs_fs.h | 2 +-
fs/ntfs3/super.c | 2 +-
5 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
index 107e808e06ea..a2e18f13e93a 100644
--- a/fs/ntfs3/bitmap.c
+++ b/fs/ntfs3/bitmap.c
@@ -653,7 +653,7 @@ int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
wnd->total_zeroes = nbits;
wnd->extent_max = MINUS_ONE_T;
wnd->zone_bit = wnd->zone_end = 0;
- wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
+ wnd->nwnd = bytes_to_block(sb, ntfs3_bitmap_size(nbits));
wnd->bits_last = nbits & (wbits - 1);
if (!wnd->bits_last)
wnd->bits_last = wbits;
@@ -1345,7 +1345,7 @@ int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
return -EINVAL;

/* Align to 8 byte boundary. */
- new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
+ new_wnd = bytes_to_block(sb, ntfs3_bitmap_size(new_bits));
new_last = new_bits & (wbits - 1);
if (!new_last)
new_last = wbits;
diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
index 33afee0f5559..7a14d2347f27 100644
--- a/fs/ntfs3/fsntfs.c
+++ b/fs/ntfs3/fsntfs.c
@@ -522,7 +522,7 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
ni->mi.dirty = true;

/* Step 2: Resize $MFT::BITMAP. */
- new_bitmap_bytes = bitmap_size(new_mft_total);
+ new_bitmap_bytes = ntfs3_bitmap_size(new_mft_total);

err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
index 124c6e822623..ab53a4b6ddf8 100644
--- a/fs/ntfs3/index.c
+++ b/fs/ntfs3/index.c
@@ -1453,8 +1453,8 @@ static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,

alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);

- err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
- in->name_len, &bitmap, NULL, NULL);
+ err = ni_insert_resident(ni, ntfs3_bitmap_size(1), ATTR_BITMAP,
+ in->name, in->name_len, &bitmap, NULL, NULL);
if (err)
goto out2;

@@ -1515,8 +1515,9 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
if (bmp) {
/* Increase bitmap. */
err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
- &indx->bitmap_run, bitmap_size(bit + 1),
- NULL, true, NULL);
+ &indx->bitmap_run,
+ ntfs3_bitmap_size(bit + 1), NULL, true,
+ NULL);
if (err)
goto out1;
}
@@ -2089,7 +2090,7 @@ static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
if (in->name == I30_NAME)
ni->vfs_inode.i_size = new_data;

- bpb = bitmap_size(bit);
+ bpb = ntfs3_bitmap_size(bit);
if (bpb * 8 == nbits)
return 0;

diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index 629403ede6e5..93333156aac6 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -961,7 +961,7 @@ static inline bool run_is_empty(struct runs_tree *run)
}

/* NTFS uses quad aligned bitmaps. */
-static inline size_t bitmap_size(size_t bits)
+static inline size_t ntfs3_bitmap_size(size_t bits)
{
return ALIGN((bits + 7) >> 3, 8);
}
diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
index cfec5e0c7f66..b1fb6efe7084 100644
--- a/fs/ntfs3/super.c
+++ b/fs/ntfs3/super.c
@@ -1285,7 +1285,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)

/* Check bitmap boundary. */
tt = sbi->used.bitmap.nbits;
- if (inode->i_size < bitmap_size(tt)) {
+ if (inode->i_size < ntfs3_bitmap_size(tt)) {
ntfs_err(sb, "$Bitmap is corrupted.");
err = -EINVAL;
goto put_inode_out;
--
2.41.0

2023-10-09 15:14:21

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 04/14] linkmode: convert linkmode_{test,set,clear,mod}_bit() to macros

Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
on compile-time constants"), the non-atomic bitops are macros which can
be expanded by the compilers into compile-time expressions, which will
result in better optimized object code. Unfortunately, turned out that
passing `volatile` to those macros discards any possibility of
optimization, as the compilers then don't even try to look whether
the passed bitmap is known at compilation time. In addition to that,
the mentioned linkmode helpers are marked with `inline`, not
`__always_inline`, meaning that it's not guaranteed some compiler won't
uninline them for no reason, which will also effectively prevent them
from being optimized (it's a well-known thing the compilers sometimes
uninline `2 + 2`).
Convert linkmode_*_bit() from inlines to macros. Their calling
convention are 1:1 with the corresponding bitops, so that it's not even
needed to enumerate and map the arguments, only the names. No changes in
vmlinux' object code (compiled by LLVM for x86_64) whatsoever, but that
doesn't necessarily means the change is meaningless.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/linkmode.h | 27 ++++-----------------------
1 file changed, 4 insertions(+), 23 deletions(-)

diff --git a/include/linux/linkmode.h b/include/linux/linkmode.h
index 15e0e0209da4..f231e2edbfa5 100644
--- a/include/linux/linkmode.h
+++ b/include/linux/linkmode.h
@@ -38,10 +38,10 @@ static inline int linkmode_andnot(unsigned long *dst, const unsigned long *src1,
return bitmap_andnot(dst, src1, src2, __ETHTOOL_LINK_MODE_MASK_NBITS);
}

-static inline void linkmode_set_bit(int nr, volatile unsigned long *addr)
-{
- __set_bit(nr, addr);
-}
+#define linkmode_test_bit test_bit
+#define linkmode_set_bit __set_bit
+#define linkmode_clear_bit __clear_bit
+#define linkmode_mod_bit __assign_bit

static inline void linkmode_set_bit_array(const int *array, int array_size,
unsigned long *addr)
@@ -52,25 +52,6 @@ static inline void linkmode_set_bit_array(const int *array, int array_size,
linkmode_set_bit(array[i], addr);
}

-static inline void linkmode_clear_bit(int nr, volatile unsigned long *addr)
-{
- __clear_bit(nr, addr);
-}
-
-static inline void linkmode_mod_bit(int nr, volatile unsigned long *addr,
- int set)
-{
- if (set)
- linkmode_set_bit(nr, addr);
- else
- linkmode_clear_bit(nr, addr);
-}
-
-static inline int linkmode_test_bit(int nr, const volatile unsigned long *addr)
-{
- return test_bit(nr, addr);
-}
-
static inline int linkmode_equal(const unsigned long *src1,
const unsigned long *src2)
{
--
2.41.0

2023-10-09 15:14:27

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 08/14] bitmap: introduce generic optimized bitmap_size()

The number of times yet another open coded
`BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
Some generic helper is long overdue.

Add one, bitmap_size(), but with one detail.
BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
divident and divisor are compile-time constants or when the divisor
is not a pow-of-2. When it is however, the compilers sometimes tend
to generate suboptimal code (GCC 13):

48 83 c0 3f add $0x3f,%rax
48 c1 e8 06 shr $0x6,%rax
48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx

%BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
full division of `nbits + 63` by it and then multiplication by 8.
Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:

8d 50 3f lea 0x3f(%rax),%edx
c1 ea 03 shr $0x3,%edx
81 e2 f8 ff ff 1f and $0x1ffffff8,%edx

Now it divides `nbits + 63` by 8 and then masks bits[2:0].
bloat-o-meter:

add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)

Clang does it better and generates the same code before/after starting
from -O1, except that with the ALIGN() approach it uses %edx and thus
still saves some bytes:

add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)

Note that we can't expand DIV_ROUND_UP() by adding a check and using
this approach there, as it's used in array declarations where
expressions are not allowed.
Add this helper to tools/ as well.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/md/dm-clone-metadata.c | 5 -----
include/linux/bitmap.h | 8 +++++---
include/linux/cpumask.h | 2 +-
lib/math/prime_numbers.c | 2 --
tools/include/linux/bitmap.h | 8 +++++---
5 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/md/dm-clone-metadata.c b/drivers/md/dm-clone-metadata.c
index c43d55672bce..47c1fa7aad8b 100644
--- a/drivers/md/dm-clone-metadata.c
+++ b/drivers/md/dm-clone-metadata.c
@@ -465,11 +465,6 @@ static void __destroy_persistent_data_structures(struct dm_clone_metadata *cmd)

/*---------------------------------------------------------------------------*/

-static size_t bitmap_size(unsigned long nr_bits)
-{
- return BITS_TO_LONGS(nr_bits) * sizeof(long);
-}
-
static int __dirty_map_init(struct dirty_map *dmap, unsigned long nr_words,
unsigned long nr_regions)
{
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1ef..63e422f8ba3d 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -237,9 +237,11 @@ extern int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);

if (small_const_nbits(nbits))
*dst = 0;
@@ -249,7 +251,7 @@ static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)

static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);

if (small_const_nbits(nbits))
*dst = ~0UL;
@@ -260,7 +262,7 @@ static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
unsigned int nbits)
{
- unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
+ unsigned int len = bitmap_size(nbits);

if (small_const_nbits(nbits))
*dst = *src;
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index f10fb87d49db..dbdbf1451cad 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -821,7 +821,7 @@ static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
*/
static inline unsigned int cpumask_size(void)
{
- return BITS_TO_LONGS(large_cpumask_bits) * sizeof(long);
+ return bitmap_size(large_cpumask_bits);
}

/*
diff --git a/lib/math/prime_numbers.c b/lib/math/prime_numbers.c
index d42cebf7407f..d3b64b10da1c 100644
--- a/lib/math/prime_numbers.c
+++ b/lib/math/prime_numbers.c
@@ -6,8 +6,6 @@
#include <linux/prime_numbers.h>
#include <linux/slab.h>

-#define bitmap_size(nbits) (BITS_TO_LONGS(nbits) * sizeof(unsigned long))
-
struct primes {
struct rcu_head rcu;
unsigned long last, sz;
diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
index f3566ea0f932..81a2299ace15 100644
--- a/tools/include/linux/bitmap.h
+++ b/tools/include/linux/bitmap.h
@@ -2,6 +2,7 @@
#ifndef _TOOLS_LINUX_BITMAP_H
#define _TOOLS_LINUX_BITMAP_H

+#include <linux/align.h>
#include <string.h>
#include <linux/bitops.h>
#include <linux/find.h>
@@ -25,13 +26,14 @@ bool __bitmap_intersects(const unsigned long *bitmap1,
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

+#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
+
static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
{
if (small_const_nbits(nbits))
*dst = 0UL;
else {
- int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
- memset(dst, 0, len);
+ memset(dst, 0, bitmap_size(nbits));
}
}

@@ -83,7 +85,7 @@ static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
*/
static inline unsigned long *bitmap_zalloc(int nbits)
{
- return calloc(1, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
+ return calloc(1, bitmap_size(nbits));
}

/*
--
2.41.0

2023-10-09 15:14:33

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 05/14] s390/cio: rename bitmap_size() -> idset_bitmap_size()

bitmap_size() is a pretty generic name and one may want to use it for
a generic bitmap API function. At the same time, its logic is not
"generic", i.e. it's not just `nbits -> size of bitmap in bytes`
converter as it would be expected from its name.
Add the prefix 'idset_' used throughout the file where the function
resides.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
idset_new() really wants its vmalloc() + memset() pair to be replaced
with vzalloc().
---
drivers/s390/cio/idset.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
index 45f9c0736be4..0a1105a483bf 100644
--- a/drivers/s390/cio/idset.c
+++ b/drivers/s390/cio/idset.c
@@ -16,7 +16,7 @@ struct idset {
unsigned long bitmap[];
};

-static inline unsigned long bitmap_size(int num_ssid, int num_id)
+static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
{
return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
}
@@ -25,11 +25,12 @@ static struct idset *idset_new(int num_ssid, int num_id)
{
struct idset *set;

- set = vmalloc(sizeof(struct idset) + bitmap_size(num_ssid, num_id));
+ set = vmalloc(sizeof(struct idset) +
+ idset_bitmap_size(num_ssid, num_id));
if (set) {
set->num_ssid = num_ssid;
set->num_id = num_id;
- memset(set->bitmap, 0, bitmap_size(num_ssid, num_id));
+ memset(set->bitmap, 0, idset_bitmap_size(num_ssid, num_id));
}
return set;
}
@@ -41,7 +42,8 @@ void idset_free(struct idset *set)

void idset_fill(struct idset *set)
{
- memset(set->bitmap, 0xff, bitmap_size(set->num_ssid, set->num_id));
+ memset(set->bitmap, 0xff,
+ idset_bitmap_size(set->num_ssid, set->num_id));
}

static inline void idset_add(struct idset *set, int ssid, int id)
--
2.41.0

2023-10-09 15:14:39

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 10/14] ip_tunnel: use a separate struct to store tunnel params in the kernel

Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure
to store params inside the kernel, IPv4 tunnel code uses the same
ip_tunnel_parm which is being used to talk with the userspace.
This makes it difficult to alter or add any fields or use a
different format for whatever data.
Define struct ip_tunnel_parm_kern, a 1:1 copy of ip_tunnel_parm for
now, and use it throughout the code. Define the pieces, where the copy
user <-> kernel happens, as standalone functions, and copy the data
there field-by-field, so that the kernel-side structure could be easily
modified later on and the users wouldn't have to care about this.

Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 30 +++++----
.../ethernet/mellanox/mlxsw/spectrum_ipip.h | 2 +-
.../ethernet/mellanox/mlxsw/spectrum_span.c | 4 +-
include/linux/netdevice.h | 7 ++-
include/net/ip_tunnels.h | 25 ++++++--
net/ipv4/ip_gre.c | 17 ++---
net/ipv4/ip_tunnel.c | 63 +++++++++++++++----
net/ipv4/ip_tunnel_core.c | 2 +-
net/ipv4/ip_vti.c | 12 ++--
net/ipv4/ipip.c | 12 ++--
net/ipv4/ipmr.c | 2 +-
net/ipv6/addrconf.c | 3 +-
net/ipv6/sit.c | 33 +++++-----
13 files changed, 137 insertions(+), 75 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
index 3340b4a694c3..d67df358a52f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
@@ -8,7 +8,7 @@
#include "spectrum_ipip.h"
#include "reg.h"

-struct ip_tunnel_parm
+struct ip_tunnel_parm_kern
mlxsw_sp_ipip_netdev_parms4(const struct net_device *ol_dev)
{
struct ip_tunnel *tun = netdev_priv(ol_dev);
@@ -24,7 +24,8 @@ mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev)
return tun->parms;
}

-static bool mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm *parms)
+static bool
+mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm_kern *parms)
{
return !!(parms->i_flags & TUNNEL_KEY);
}
@@ -34,7 +35,8 @@ static bool mlxsw_sp_ipip_parms6_has_ikey(const struct __ip6_tnl_parm *parms)
return !!(parms->i_flags & TUNNEL_KEY);
}

-static bool mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm *parms)
+static bool
+mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm_kern *parms)
{
return !!(parms->o_flags & TUNNEL_KEY);
}
@@ -44,7 +46,7 @@ static bool mlxsw_sp_ipip_parms6_has_okey(const struct __ip6_tnl_parm *parms)
return !!(parms->o_flags & TUNNEL_KEY);
}

-static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm *parms)
+static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm_kern *parms)
{
return mlxsw_sp_ipip_parms4_has_ikey(parms) ?
be32_to_cpu(parms->i_key) : 0;
@@ -56,7 +58,7 @@ static u32 mlxsw_sp_ipip_parms6_ikey(const struct __ip6_tnl_parm *parms)
be32_to_cpu(parms->i_key) : 0;
}

-static u32 mlxsw_sp_ipip_parms4_okey(const struct ip_tunnel_parm *parms)
+static u32 mlxsw_sp_ipip_parms4_okey(const struct ip_tunnel_parm_kern *parms)
{
return mlxsw_sp_ipip_parms4_has_okey(parms) ?
be32_to_cpu(parms->o_key) : 0;
@@ -69,7 +71,7 @@ static u32 mlxsw_sp_ipip_parms6_okey(const struct __ip6_tnl_parm *parms)
}

static union mlxsw_sp_l3addr
-mlxsw_sp_ipip_parms4_saddr(const struct ip_tunnel_parm *parms)
+mlxsw_sp_ipip_parms4_saddr(const struct ip_tunnel_parm_kern *parms)
{
return (union mlxsw_sp_l3addr) { .addr4 = parms->iph.saddr };
}
@@ -81,7 +83,7 @@ mlxsw_sp_ipip_parms6_saddr(const struct __ip6_tnl_parm *parms)
}

static union mlxsw_sp_l3addr
-mlxsw_sp_ipip_parms4_daddr(const struct ip_tunnel_parm *parms)
+mlxsw_sp_ipip_parms4_daddr(const struct ip_tunnel_parm_kern *parms)
{
return (union mlxsw_sp_l3addr) { .addr4 = parms->iph.daddr };
}
@@ -96,7 +98,7 @@ union mlxsw_sp_l3addr
mlxsw_sp_ipip_netdev_saddr(enum mlxsw_sp_l3proto proto,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms4;
+ struct ip_tunnel_parm_kern parms4;
struct __ip6_tnl_parm parms6;

switch (proto) {
@@ -115,7 +117,9 @@ mlxsw_sp_ipip_netdev_saddr(enum mlxsw_sp_l3proto proto,
static __be32 mlxsw_sp_ipip_netdev_daddr4(const struct net_device *ol_dev)
{

- struct ip_tunnel_parm parms4 = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms4;
+
+ parms4 = mlxsw_sp_ipip_netdev_parms4(ol_dev);

return mlxsw_sp_ipip_parms4_daddr(&parms4).addr4;
}
@@ -124,7 +128,7 @@ static union mlxsw_sp_l3addr
mlxsw_sp_ipip_netdev_daddr(enum mlxsw_sp_l3proto proto,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms4;
+ struct ip_tunnel_parm_kern parms4;
struct __ip6_tnl_parm parms6;

switch (proto) {
@@ -150,7 +154,7 @@ bool mlxsw_sp_l3addr_is_zero(union mlxsw_sp_l3addr addr)
static struct mlxsw_sp_ipip_parms
mlxsw_sp_ipip_netdev_parms_init_gre4(const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);

return (struct mlxsw_sp_ipip_parms) {
.proto = MLXSW_SP_L3_PROTO_IPV4,
@@ -187,8 +191,8 @@ mlxsw_sp_ipip_decap_config_gre4(struct mlxsw_sp *mlxsw_sp,
{
u16 rif_index = mlxsw_sp_ipip_lb_rif_index(ipip_entry->ol_lb);
u16 ul_rif_id = mlxsw_sp_ipip_lb_ul_rif_id(ipip_entry->ol_lb);
+ struct ip_tunnel_parm_kern parms;
char rtdp_pl[MLXSW_REG_RTDP_LEN];
- struct ip_tunnel_parm parms;
unsigned int type_check;
bool has_ikey;
u32 daddr4;
@@ -252,7 +256,7 @@ static struct mlxsw_sp_rif_ipip_lb_config
mlxsw_sp_ipip_ol_loopback_config_gre4(struct mlxsw_sp *mlxsw_sp,
const struct net_device *ol_dev)
{
- struct ip_tunnel_parm parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
+ struct ip_tunnel_parm_kern parms = mlxsw_sp_ipip_netdev_parms4(ol_dev);
enum mlxsw_reg_ritr_loopback_ipip_type lb_ipipt;

lb_ipipt = mlxsw_sp_ipip_parms4_has_okey(&parms) ?
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
index a35f009da561..a66173779641 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.h
@@ -9,7 +9,7 @@
#include <linux/if_tunnel.h>
#include <net/ip6_tunnel.h>

-struct ip_tunnel_parm
+struct ip_tunnel_parm_kern
mlxsw_sp_ipip_netdev_parms4(const struct net_device *ol_dev);
struct __ip6_tnl_parm
mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index b3472fb94617..ee08184bd60f 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -413,8 +413,8 @@ mlxsw_sp_span_gretap4_route(const struct net_device *to_dev,
__be32 *saddrp, __be32 *daddrp)
{
struct ip_tunnel *tun = netdev_priv(to_dev);
+ struct ip_tunnel_parm_kern parms;
struct net_device *dev = NULL;
- struct ip_tunnel_parm parms;
struct rtable *rt = NULL;
struct flowi4 fl4;

@@ -451,7 +451,7 @@ mlxsw_sp_span_entry_gretap4_parms(struct mlxsw_sp *mlxsw_sp,
const struct net_device *to_dev,
struct mlxsw_sp_span_parms *sparmsp)
{
- struct ip_tunnel_parm tparm = mlxsw_sp_ipip_netdev_parms4(to_dev);
+ struct ip_tunnel_parm_kern tparm = mlxsw_sp_ipip_netdev_parms4(to_dev);
union mlxsw_sp_l3addr saddr = { .addr4 = tparm.iph.saddr };
union mlxsw_sp_l3addr daddr = { .addr4 = tparm.iph.daddr };
bool inherit_tos = tparm.iph.tos & 0x1;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 0896aaa91dd7..34e01d29b449 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -59,7 +59,7 @@ struct ethtool_ops;
struct kernel_hwtstamp_config;
struct phy_device;
struct dsa_port;
-struct ip_tunnel_parm;
+struct ip_tunnel_parm_kern;
struct macsec_context;
struct macsec_ops;
struct netdev_name_node;
@@ -1382,7 +1382,7 @@ struct netdev_net_notifier {
* queue id bound to an AF_XDP socket. The flags field specifies if
* only RX, only Tx, or both should be woken up using the flags
* XDP_WAKEUP_RX and XDP_WAKEUP_TX.
- * int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm *p,
+ * int (*ndo_tunnel_ctl)(struct net_device *dev, struct ip_tunnel_parm_kern *p,
* int cmd);
* Add, change, delete or get information on an IPv4 tunnel.
* struct net_device *(*ndo_get_peer_dev)(struct net_device *dev);
@@ -1633,7 +1633,8 @@ struct net_device_ops {
int (*ndo_xsk_wakeup)(struct net_device *dev,
u32 queue_id, u32 flags);
int (*ndo_tunnel_ctl)(struct net_device *dev,
- struct ip_tunnel_parm *p, int cmd);
+ struct ip_tunnel_parm_kern *p,
+ int cmd);
struct net_device * (*ndo_get_peer_dev)(struct net_device *dev);
int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx,
struct net_device_path *path);
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index f346b4efbc30..6da667b743e1 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -110,6 +110,17 @@ struct ip_tunnel_prl_entry {

struct metadata_dst;

+/* Kernel-side copy of ip_tunnel_parm */
+struct ip_tunnel_parm_kern {
+ char name[IFNAMSIZ];
+ int link;
+ __be16 i_flags;
+ __be16 o_flags;
+ __be32 i_key;
+ __be32 o_key;
+ struct iphdr iph;
+};
+
struct ip_tunnel {
struct ip_tunnel __rcu *next;
struct hlist_node hash_node;
@@ -136,7 +147,7 @@ struct ip_tunnel {

struct dst_cache dst_cache;

- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;

int mlink;
int encap_hlen; /* Encap header length (FOU,GUE) */
@@ -290,7 +301,11 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
const struct iphdr *tnl_params, const u8 protocol);
void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
const u8 proto, int tunnel_hlen);
-int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd);
+int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd);
+bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,
+ const void __user *data);
+bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp);
int ip_tunnel_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
void __user *data, int cmd);
int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict);
@@ -306,16 +321,16 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
const struct tnl_ptk_info *tpi, struct metadata_dst *tun_dst,
bool log_ecn_error);
int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark);
+ struct ip_tunnel_parm_kern *p, __u32 fwmark);
int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark);
+ struct ip_tunnel_parm_kern *p, __u32 fwmark);
void ip_tunnel_setup(struct net_device *dev, unsigned int net_id);

bool ip_tunnel_netlink_encap_parms(struct nlattr *data[],
struct ip_tunnel_encap *encap);

void ip_tunnel_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms);
+ struct ip_tunnel_parm_kern *parms);

extern const struct header_ops ip_tunnel_header_ops;
__be16 ip_tunnel_parse_protocol(const struct sk_buff *skb);
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index 22a26d1d29a0..b65318a55ae8 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -782,7 +782,8 @@ static void ipgre_link_update(struct net_device *dev, bool set_mtu)
}
}

-static int ipgre_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p,
+static int ipgre_tunnel_ctl(struct net_device *dev,
+ struct ip_tunnel_parm_kern *p,
int cmd)
{
int err;
@@ -1126,7 +1127,7 @@ static int erspan_validate(struct nlattr *tb[], struct nlattr *data[],
static int ipgre_netlink_parms(struct net_device *dev,
struct nlattr *data[],
struct nlattr *tb[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1193,7 +1194,7 @@ static int ipgre_netlink_parms(struct net_device *dev,
static int erspan_netlink_parms(struct net_device *dev,
struct nlattr *data[],
struct nlattr *tb[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
struct ip_tunnel *t = netdev_priv(dev);
@@ -1352,7 +1353,7 @@ static int ipgre_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;
int err;

@@ -1370,7 +1371,7 @@ static int erspan_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;
int err;

@@ -1389,8 +1390,8 @@ static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;
int err;

err = ipgre_newlink_encap_setup(dev, data);
@@ -1418,8 +1419,8 @@ static int erspan_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;
int err;

err = ipgre_newlink_encap_setup(dev, data);
@@ -1491,7 +1492,7 @@ static size_t ipgre_get_size(const struct net_device *dev)
static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm *p = &t->parms;
+ struct ip_tunnel_parm_kern *p = &t->parms;
__be16 o_flags = p->o_flags;

if (nla_put_u32(skb, IFLA_GRE_LINK, p->link) ||
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index beeae624c412..658f46208c8d 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -56,7 +56,7 @@ static unsigned int ip_tunnel_hash(__be32 key, __be32 remote)
IP_TNL_HASH_BITS);
}

-static bool ip_tunnel_key_match(const struct ip_tunnel_parm *p,
+static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
__be16 flags, __be32 key)
{
if (p->i_flags & TUNNEL_KEY) {
@@ -172,7 +172,7 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
EXPORT_SYMBOL_GPL(ip_tunnel_lookup);

static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
unsigned int h;
__be32 remote;
@@ -207,7 +207,7 @@ static void ip_tunnel_del(struct ip_tunnel_net *itn, struct ip_tunnel *t)
}

static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
int type)
{
__be32 remote = parms->iph.daddr;
@@ -231,7 +231,7 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,

static struct net_device *__ip_tunnel_create(struct net *net,
const struct rtnl_link_ops *ops,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
int err;
struct ip_tunnel *tunnel;
@@ -327,7 +327,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)

static struct ip_tunnel *ip_tunnel_create(struct net *net,
struct ip_tunnel_net *itn,
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
struct ip_tunnel *nt;
struct net_device *dev;
@@ -845,7 +845,7 @@ EXPORT_SYMBOL_GPL(ip_tunnel_xmit);
static void ip_tunnel_update(struct ip_tunnel_net *itn,
struct ip_tunnel *t,
struct net_device *dev,
- struct ip_tunnel_parm *p,
+ struct ip_tunnel_parm_kern *p,
bool set_mtu,
__u32 fwmark)
{
@@ -877,7 +877,8 @@ static void ip_tunnel_update(struct ip_tunnel_net *itn,
netdev_state_change(dev);
}

-int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd)
{
int err = 0;
struct ip_tunnel *t = netdev_priv(dev);
@@ -979,16 +980,52 @@ int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
}
EXPORT_SYMBOL_GPL(ip_tunnel_ctl);

+bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,
+ const void __user *data)
+{
+ struct ip_tunnel_parm p;
+
+ if (copy_from_user(&p, data, sizeof(p)))
+ return false;
+
+ strscpy(kp->name, p.name, sizeof(kp->name));
+ kp->link = p.link;
+ kp->i_flags = p.i_flags;
+ kp->o_flags = p.o_flags;
+ kp->i_key = p.i_key;
+ kp->o_key = p.o_key;
+ memcpy(&kp->iph, &p.iph, min(sizeof(kp->iph), sizeof(p.iph)));
+
+ return true;
+}
+EXPORT_SYMBOL_GPL(ip_tunnel_parm_from_user);
+
+bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp)
+{
+ struct ip_tunnel_parm p;
+
+ strscpy(p.name, kp->name, sizeof(p.name));
+ p.link = kp->link;
+ p.i_flags = kp->i_flags;
+ p.o_flags = kp->o_flags;
+ p.i_key = kp->i_key;
+ p.o_key = kp->o_key;
+ memcpy(&p.iph, &kp->iph, min(sizeof(p.iph), sizeof(kp->iph)));
+
+ return !copy_to_user(data, &p, sizeof(p));
+}
+EXPORT_SYMBOL_GPL(ip_tunnel_parm_to_user);
+
int ip_tunnel_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
void __user *data, int cmd)
{
- struct ip_tunnel_parm p;
+ struct ip_tunnel_parm_kern p;
int err;

- if (copy_from_user(&p, data, sizeof(p)))
+ if (!ip_tunnel_parm_from_user(&p, data))
return -EFAULT;
err = dev->netdev_ops->ndo_tunnel_ctl(dev, &p, cmd);
- if (!err && copy_to_user(data, &p, sizeof(p)))
+ if (!err && !ip_tunnel_parm_to_user(data, &p))
return -EFAULT;
return err;
}
@@ -1067,7 +1104,7 @@ int ip_tunnel_init_net(struct net *net, unsigned int ip_tnl_net_id,
struct rtnl_link_ops *ops, char *devname)
{
struct ip_tunnel_net *itn = net_generic(net, ip_tnl_net_id);
- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;
unsigned int i;

itn->rtnl_link_ops = ops;
@@ -1147,7 +1184,7 @@ void ip_tunnel_delete_nets(struct list_head *net_list, unsigned int id,
EXPORT_SYMBOL_GPL(ip_tunnel_delete_nets);

int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark)
+ struct ip_tunnel_parm_kern *p, __u32 fwmark)
{
struct ip_tunnel *nt;
struct net *net = dev_net(dev);
@@ -1201,7 +1238,7 @@ int ip_tunnel_newlink(struct net_device *dev, struct nlattr *tb[],
EXPORT_SYMBOL_GPL(ip_tunnel_newlink);

int ip_tunnel_changelink(struct net_device *dev, struct nlattr *tb[],
- struct ip_tunnel_parm *p, __u32 fwmark)
+ struct ip_tunnel_parm_kern *p, __u32 fwmark)
{
struct ip_tunnel *t;
struct ip_tunnel *tunnel = netdev_priv(dev);
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index 586b1b3e35b8..f89e3bdef123 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -1116,7 +1116,7 @@ bool ip_tunnel_netlink_encap_parms(struct nlattr *data[],
EXPORT_SYMBOL_GPL(ip_tunnel_netlink_encap_parms);

void ip_tunnel_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms)
+ struct ip_tunnel_parm_kern *parms)
{
if (data[IFLA_IPTUN_LINK])
parms->link = nla_get_u32(data[IFLA_IPTUN_LINK]);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index d1e7d0ceb7ed..e1718c088433 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -167,7 +167,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
struct flowi *fl)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parms = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parms = &tunnel->parms;
struct dst_entry *dst = skb_dst(skb);
struct net_device *tdev; /* Device to other host */
int pkt_len = skb->len;
@@ -373,7 +373,7 @@ static int vti4_err(struct sk_buff *skb, u32 info)
}

static int
-vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
int err = 0;

@@ -529,7 +529,7 @@ static int vti_tunnel_validate(struct nlattr *tb[], struct nlattr *data[],
}

static void vti_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));
@@ -564,7 +564,7 @@ static int vti_newlink(struct net *src_net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct ip_tunnel_parm parms;
+ struct ip_tunnel_parm_kern parms;
__u32 fwmark = 0;

vti_netlink_parms(data, &parms, &fwmark);
@@ -576,8 +576,8 @@ static int vti_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = t->fwmark;
- struct ip_tunnel_parm p;

vti_netlink_parms(data, &p, &fwmark);
return ip_tunnel_changelink(dev, tb, &p, fwmark);
@@ -604,7 +604,7 @@ static size_t vti_get_size(const struct net_device *dev)
static int vti_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm *p = &t->parms;
+ struct ip_tunnel_parm_kern *p = &t->parms;

if (nla_put_u32(skb, IFLA_VTI_LINK, p->link) ||
nla_put_be32(skb, IFLA_VTI_IKEY, p->i_key) ||
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 27b8f83c6ea2..0dd2d3b55c75 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -330,7 +330,7 @@ static bool ipip_tunnel_ioctl_verify_protocol(u8 ipproto)
}

static int
-ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
if (p->iph.version != 4 ||
@@ -405,8 +405,8 @@ static int ipip_tunnel_validate(struct nlattr *tb[], struct nlattr *data[],
}

static void ipip_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms, bool *collect_md,
- __u32 *fwmark)
+ struct ip_tunnel_parm_kern *parms,
+ bool *collect_md, __u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));

@@ -432,8 +432,8 @@ static int ipip_newlink(struct net *src_net, struct net_device *dev,
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
__u32 fwmark = 0;

if (ip_tunnel_netlink_encap_parms(data, &ipencap)) {
@@ -452,8 +452,8 @@ static int ipip_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
bool collect_md;
__u32 fwmark = t->fwmark;

@@ -510,7 +510,7 @@ static size_t ipip_get_size(const struct net_device *dev)
static int ipip_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parm = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parm = &tunnel->parms;

if (nla_put_u32(skb, IFLA_IPTUN_LINK, parm->link) ||
nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, parm->iph.saddr) ||
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 9e222a57bc2b..799d1bccf38d 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -441,7 +441,7 @@ static bool ipmr_init_vif_indev(const struct net_device *dev)
static struct net_device *ipmr_new_tunnel(struct net *net, struct vifctl *v)
{
struct net_device *tunnel_dev, *new_dev;
- struct ip_tunnel_parm p = { };
+ struct ip_tunnel_parm_kern p = { };
int err;

tunnel_dev = __dev_get_by_name(net, "tunl0");
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 0b6ee962c84e..2bd0cf2b5a90 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -63,6 +63,7 @@
#include <linux/string.h>
#include <linux/hash.h>

+#include <net/ip_tunnels.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/snmp.h>
@@ -2857,7 +2858,7 @@ void addrconf_prefix_rcv(struct net_device *dev, u8 *opt, int len, bool sllao)
static int addrconf_set_sit_dstaddr(struct net *net, struct net_device *dev,
struct in6_ifreq *ireq)
{
- struct ip_tunnel_parm p = { };
+ struct ip_tunnel_parm_kern p = { };
int err;

if (!(ipv6_addr_type(&ireq->ifr6_addr) & IPV6_ADDR_COMPATv4))
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index cc24cefdb85c..19af34e4c13c 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -132,8 +132,8 @@ static struct ip_tunnel *ipip6_tunnel_lookup(struct net *net,
return NULL;
}

-static struct ip_tunnel __rcu **__ipip6_bucket(struct sit_net *sitn,
- struct ip_tunnel_parm *parms)
+static struct ip_tunnel __rcu **
+__ipip6_bucket(struct sit_net *sitn, struct ip_tunnel_parm_kern *parms)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
@@ -226,7 +226,8 @@ static int ipip6_tunnel_create(struct net_device *dev)
}

static struct ip_tunnel *ipip6_tunnel_locate(struct net *net,
- struct ip_tunnel_parm *parms, int create)
+ struct ip_tunnel_parm_kern *parms,
+ int create)
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
@@ -1135,7 +1136,8 @@ static void ipip6_tunnel_bind_dev(struct net_device *dev)
dev->needed_headroom = t_hlen + hlen;
}

-static void ipip6_tunnel_update(struct ip_tunnel *t, struct ip_tunnel_parm *p,
+static void ipip6_tunnel_update(struct ip_tunnel *t,
+ struct ip_tunnel_parm_kern *p,
__u32 fwmark)
{
struct net *net = t->net;
@@ -1196,11 +1198,11 @@ static int
ipip6_tunnel_get6rd(struct net_device *dev, struct ip_tunnel_parm __user *data)
{
struct ip_tunnel *t = netdev_priv(dev);
+ struct ip_tunnel_parm_kern p;
struct ip_tunnel_6rd ip6rd;
- struct ip_tunnel_parm p;

if (dev == dev_to_sit_net(dev)->fb_tunnel_dev) {
- if (copy_from_user(&p, data, sizeof(p)))
+ if (!ip_tunnel_parm_from_user(&p, data))
return -EFAULT;
t = ipip6_tunnel_locate(t->net, &p, 0);
}
@@ -1251,7 +1253,7 @@ static bool ipip6_valid_ip_proto(u8 ipproto)
}

static int
-__ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm *p)
+__ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm_kern *p)
{
if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
return -EPERM;
@@ -1268,7 +1270,7 @@ __ipip6_tunnel_ioctl_validate(struct net *net, struct ip_tunnel_parm *p)
}

static int
-ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);

@@ -1281,7 +1283,7 @@ ipip6_tunnel_get(struct net_device *dev, struct ip_tunnel_parm *p)
}

static int
-ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
int err;
@@ -1297,7 +1299,7 @@ ipip6_tunnel_add(struct net_device *dev, struct ip_tunnel_parm *p)
}

static int
-ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);
int err;
@@ -1328,7 +1330,7 @@ ipip6_tunnel_change(struct net_device *dev, struct ip_tunnel_parm *p)
}

static int
-ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm *p)
+ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm_kern *p)
{
struct ip_tunnel *t = netdev_priv(dev);

@@ -1348,7 +1350,8 @@ ipip6_tunnel_del(struct net_device *dev, struct ip_tunnel_parm *p)
}

static int
-ipip6_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm *p, int cmd)
+ipip6_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
+ int cmd)
{
switch (cmd) {
case SIOCGETTUNNEL:
@@ -1494,7 +1497,7 @@ static int ipip6_validate(struct nlattr *tb[], struct nlattr *data[],
}

static void ipip6_netlink_parms(struct nlattr *data[],
- struct ip_tunnel_parm *parms,
+ struct ip_tunnel_parm_kern *parms,
__u32 *fwmark)
{
memset(parms, 0, sizeof(*parms));
@@ -1603,8 +1606,8 @@ static int ipip6_changelink(struct net_device *dev, struct nlattr *tb[],
struct netlink_ext_ack *extack)
{
struct ip_tunnel *t = netdev_priv(dev);
- struct ip_tunnel_parm p;
struct ip_tunnel_encap ipencap;
+ struct ip_tunnel_parm_kern p;
struct net *net = t->net;
struct sit_net *sitn = net_generic(net, sit_net_id);
#ifdef CONFIG_IPV6_SIT_6RD
@@ -1691,7 +1694,7 @@ static size_t ipip6_get_size(const struct net_device *dev)
static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- struct ip_tunnel_parm *parm = &tunnel->parms;
+ struct ip_tunnel_parm_kern *parm = &tunnel->parms;

if (nla_put_u32(skb, IFLA_IPTUN_LINK, parm->link) ||
nla_put_in_addr(skb, IFLA_IPTUN_LOCAL, parm->iph.saddr) ||
--
2.41.0

2023-10-09 15:15:54

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 14/14] lib/bitmap: add tests for IP tunnel flags conversion helpers

Now that there are helpers for converting IP tunnel flags between the
old __be16 format and the bitmap format, make sure they work as expected
by adding a couple of tests to the bitmap testing suite. The helpers are
all inline, so no dependencies on the related CONFIG_* (or a standalone
module) are needed.

Cover three possible cases:

1. No bits past BIT(15) are set, VTI/SIT bits are not set. This
conversion is almost a direct assignment.
2. No bits past BIT(15) are set, but VTI/SIT bit is set. During the
conversion, it must be transformed into BIT(16) in the bitmap,
but still compatible with the __be16 format.
3. The bitmap has bits past BIT(15) set (not the VTI/SIT one). The
result will be truncated.
Note that currently __IP_TUNNEL_FLAG_NUM is 17 (incl. special),
which means that the result of this case is currently
semi-false-positive. When BIT(17) is finally here, it will be
adjusted accordingly.

Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 105 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6037b66fd30a..23da19e1ff7e 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -14,6 +14,8 @@
#include <linux/string.h>
#include <linux/uaccess.h>

+#include <net/ip_tunnels.h>
+
#include "../tools/testing/selftests/kselftest_module.h"

#define EXP1_IN_BITS (sizeof(exp1) * 8)
@@ -1300,6 +1302,108 @@ static void __init test_bitmap_const_eval(void)
BUILD_BUG_ON(!res);
}

+struct ip_tunnel_flags_test {
+ const u16 *src_bits;
+ const u16 *exp_bits;
+ u8 src_num;
+ u8 exp_num;
+ __be16 exp_val;
+ bool exp_comp:1;
+};
+
+#define IP_TUNNEL_FLAGS_TEST(src, comp, eval, exp) { \
+ .src_bits = (src), \
+ .src_num = ARRAY_SIZE(src), \
+ .exp_comp = (comp), \
+ .exp_val = (eval), \
+ .exp_bits = (exp), \
+ .exp_num = ARRAY_SIZE(exp), \
+}
+
+/* These are __be16-compatible and can be compared as is */
+static const u16 ip_tunnel_flags_1[] __initconst = {
+ IP_TUNNEL_KEY_BIT,
+ IP_TUNNEL_STRICT_BIT,
+ IP_TUNNEL_ERSPAN_OPT_BIT,
+};
+
+/*
+ * Due to the previous flags design limitation, setting either
+ * ``IP_TUNNEL_CSUM_BIT`` (on Big Endian) or ``IP_TUNNEL_DONT_FRAGMENT_BIT``
+ * (on Little) also sets VTI/ISATAP bit. In the bitmap implementation, they
+ * correspond to ``BIT(16)``, which is bigger than ``U16_MAX``, but still is
+ * backward-compatible.
+ */
+#ifdef __BIG_ENDIAN
+#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_CSUM_BIT
+#else
+#define IP_TUNNEL_CONFLICT_BIT IP_TUNNEL_DONT_FRAGMENT_BIT
+#endif
+
+static const u16 ip_tunnel_flags_2_src[] __initconst = {
+ IP_TUNNEL_CONFLICT_BIT,
+};
+
+static const u16 ip_tunnel_flags_2_exp[] __initconst = {
+ IP_TUNNEL_CONFLICT_BIT,
+ IP_TUNNEL_SIT_ISATAP_BIT,
+};
+
+/* Bits 17 and higher are not compatible with __be16 flags */
+static const u16 ip_tunnel_flags_3_src[] __initconst = {
+ IP_TUNNEL_VXLAN_OPT_BIT,
+ 17,
+ 18,
+ 20,
+};
+
+static const u16 ip_tunnel_flags_3_exp[] __initconst = {
+ IP_TUNNEL_VXLAN_OPT_BIT,
+};
+
+static const struct ip_tunnel_flags_test ip_tunnel_flags_test[] __initconst = {
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_1, true,
+ cpu_to_be16(BIT(IP_TUNNEL_KEY_BIT) |
+ BIT(IP_TUNNEL_STRICT_BIT) |
+ BIT(IP_TUNNEL_ERSPAN_OPT_BIT)),
+ ip_tunnel_flags_1),
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_2_src, true, VTI_ISVTI,
+ ip_tunnel_flags_2_exp),
+ IP_TUNNEL_FLAGS_TEST(ip_tunnel_flags_3_src,
+ /*
+ * This must be set to ``false`` once
+ * ``__IP_TUNNEL_FLAG_NUM`` goes above 17.
+ */
+ true,
+ cpu_to_be16(BIT(IP_TUNNEL_VXLAN_OPT_BIT)),
+ ip_tunnel_flags_3_exp),
+};
+
+static void __init test_ip_tunnel_flags(void)
+{
+ for (u32 i = 0; i < ARRAY_SIZE(ip_tunnel_flags_test); i++) {
+ typeof(*ip_tunnel_flags_test) *test = &ip_tunnel_flags_test[i];
+ IP_TUNNEL_DECLARE_FLAGS(src) = { };
+ IP_TUNNEL_DECLARE_FLAGS(exp) = { };
+ IP_TUNNEL_DECLARE_FLAGS(out);
+
+ for (u32 j = 0; j < test->src_num; j++)
+ __set_bit(test->src_bits[j], src);
+
+ for (u32 j = 0; j < test->exp_num; j++)
+ __set_bit(test->exp_bits[j], exp);
+
+ ip_tunnel_flags_from_be16(out, test->exp_val);
+
+ expect_eq_uint(test->exp_comp,
+ ip_tunnel_flags_is_be16_compat(src));
+ expect_eq_uint((__force u16)test->exp_val,
+ (__force u16)ip_tunnel_flags_to_be16(src));
+
+ __ipt_flag_op(expect_eq_bitmap, exp, out);
+ }
+}
+
static void __init selftest(void)
{
test_zero_clear();
@@ -1316,6 +1420,7 @@ static void __init selftest(void)
test_bitmap_print_buf();
test_bitmap_getset_bits();
test_bitmap_const_eval();
+ test_ip_tunnel_flags();

test_find_nth_bit();
test_for_each_set_bit();
--
2.41.0

2023-10-09 15:16:34

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 07/14] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()

bitmap_set_bits() does not start with the FS' prefix and may collide
with a new generic helper one day. It operates with the FS-specific
types, so there's no change those two could do the same thing.
Just add the prefix to exclude such possible conflict.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
fs/btrfs/free-space-cache.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 27fad70451aa..94249b5ee447 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -1909,9 +1909,9 @@ static inline void bitmap_clear_bits(struct btrfs_free_space_ctl *ctl,
ctl->free_space -= bytes;
}

-static void bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
- struct btrfs_free_space *info, u64 offset,
- u64 bytes)
+static void btrfs_bitmap_set_bits(struct btrfs_free_space_ctl *ctl,
+ struct btrfs_free_space *info, u64 offset,
+ u64 bytes)
{
unsigned long start, count, end;
int extent_delta = 1;
@@ -2247,7 +2247,7 @@ static u64 add_bytes_to_bitmap(struct btrfs_free_space_ctl *ctl,

bytes_to_set = min(end - offset, bytes);

- bitmap_set_bits(ctl, info, offset, bytes_to_set);
+ btrfs_bitmap_set_bits(ctl, info, offset, bytes_to_set);

return bytes_to_set;

--
2.41.0

2023-10-09 15:16:50

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 12/14] lib/bitmap: add compile-time test for __assign_bit() optimization

Commit dc34d5036692 ("lib: test_bitmap: add compile-time
optimization/evaluations assertions") initially missed __assign_bit(),
which led to that quite a time passed before I realized it doesn't get
optimized at compilation time. Now that it does, add test for that just
to make sure nothing will break one day.
To make things more interesting, use bitmap_complement() and
bitmap_full(), thus checking their compile-time evaluation as well. And
remove the misleading comment mentioning the workaround removed recently
in favor of adding the whole file to GCov exceptions.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index f2ea9f30c7c5..cbf1b9611616 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -1181,14 +1181,7 @@ static void __init test_bitmap_const_eval(void)
* in runtime.
*/

- /*
- * Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }`.
- * Clang on s390 optimizes bitops at compile-time as intended, but at
- * the same time stops treating @bitmap and @bitopvar as compile-time
- * constants after regular test_bit() is executed, thus triggering the
- * build bugs below. So, call const_test_bit() there directly until
- * the compiler is fixed.
- */
+ /* Equals to `unsigned long bitmap[1] = { GENMASK(6, 5), }` */
bitmap_clear(bitmap, 0, BITS_PER_LONG);
if (!test_bit(7, bitmap))
bitmap_set(bitmap, 5, 2);
@@ -1220,6 +1213,15 @@ static void __init test_bitmap_const_eval(void)
/* ~BIT(25) */
BUILD_BUG_ON(!__builtin_constant_p(~var));
BUILD_BUG_ON(~var != ~BIT(25));
+
+ /* ~BIT(25) | BIT(25) == ~0UL */
+ bitmap_complement(&var, &var, BITS_PER_LONG);
+ __assign_bit(25, &var, true);
+
+ /* !(~(~0UL)) == 1 */
+ res = bitmap_full(&var, BITS_PER_LONG);
+ BUILD_BUG_ON(!__builtin_constant_p(res));
+ BUILD_BUG_ON(!res);
}

static void __init selftest(void)
--
2.41.0

2023-10-09 15:17:38

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 11/14] ip_tunnel: convert __be16 tunnel flags to bitmaps

Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT
have been defined as __be16. Now all of those 16 bits are occupied
and there's no more free space for new flags.
It can't be simply switched to a bigger container with no
adjustments to the values, since it's an explicit Endian storage,
and on LE systems (__be16)0x0001 equals to
(__be64)0x0001000000000000.
We could probably define new 64-bit flags depending on the
Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on
LE, but that would introduce an Endianness dependency and spawn a
ton of Sparse warnings. To mitigate them, all of those places which
were adjusted with this change would be touched anyway, so why not
define stuff properly if there's no choice.

Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the
value already coded and a fistful of <16 <-> bitmap> converters and
helpers. The two flags which have a different bit position are
SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as
__cpu_to_be16(), but as (__force __be16), i.e. had different
positions on LE and BE. Now they both have strongly defined places.
Change all __be16 fields which were used to store those flags, to
IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) ->
unsigned long[1] for now, and replace all TUNNEL_* occurrences to
their bitmap counterparts. Use the converters in the places which talk
to the userspace, hardware (NFP) or other hosts (GRE header). The rest
must explicitly use the new flags only. This must be done at once,
otherwise there will be too many conversions throughout the code in
the intermediate commits.
Finally, disable the old __be16 flags for use in the kernel code
(except for the two 'irregular' flags mentioned above), to prevent
any accidental (mis)use of them. For the userspace, nothing is
changed, only additions were made.

Most noticeable bloat-o-meter difference (.text):

vmlinux: 307/-1 (306)
gre.ko: 62/0 (62)
ip_gre.ko: 941/-217 (724) [*]
ip_tunnel.ko: 390/-900 (-510) [**]
ip_vti.ko: 138/0 (138)
ip6_gre.ko: 534/-18 (516) [*]
ip6_tunnel.ko: 118/-10 (108)

[*] gre_flags_to_tnl_flags() grew, but still is inlined
[**] ip_tunnel_find() got uninlined, hence such decrease

The average code size increase in non-extreme case is 100-200 bytes
per module, mostly due to sizeof(long) > sizeof(__be16), as
%__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers
are able to expand the majority of bitmap_*() calls here into direct
operations on scalars.

Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
drivers/net/bareudp.c | 19 ++-
.../ethernet/mellanox/mlx5/core/en/tc_tun.h | 2 +-
.../mellanox/mlx5/core/en/tc_tun_encap.c | 6 +-
.../mellanox/mlx5/core/en/tc_tun_geneve.c | 12 +-
.../mellanox/mlx5/core/en/tc_tun_gre.c | 8 +-
.../mellanox/mlx5/core/en/tc_tun_vxlan.c | 9 +-
.../net/ethernet/mellanox/mlx5/core/en_tc.c | 16 ++-
.../ethernet/mellanox/mlxsw/spectrum_ipip.c | 26 ++--
.../ethernet/mellanox/mlxsw/spectrum_span.c | 6 +-
.../ethernet/netronome/nfp/flower/action.c | 27 +++-
drivers/net/geneve.c | 44 +++---
drivers/net/vxlan/vxlan_core.c | 14 +-
include/net/dst_metadata.h | 10 +-
include/net/flow_dissector.h | 2 +-
include/net/gre.h | 70 +++++-----
include/net/ip6_tunnel.h | 4 +-
include/net/ip_tunnels.h | 119 +++++++++++++---
include/net/udp_tunnel.h | 4 +-
include/uapi/linux/if_tunnel.h | 33 +++++
net/bridge/br_vlan_tunnel.c | 9 +-
net/core/filter.c | 26 ++--
net/core/flow_dissector.c | 20 ++-
net/ipv4/fou_bpf.c | 2 +-
net/ipv4/gre_demux.c | 2 +-
net/ipv4/ip_gre.c | 127 +++++++++++-------
net/ipv4/ip_tunnel.c | 54 ++++----
net/ipv4/ip_tunnel_core.c | 80 ++++++-----
net/ipv4/ip_vti.c | 29 ++--
net/ipv4/ipip.c | 21 ++-
net/ipv4/udp_tunnel_core.c | 5 +-
net/ipv6/ip6_gre.c | 85 +++++++-----
net/ipv6/ip6_tunnel.c | 14 +-
net/ipv6/sit.c | 5 +-
net/netfilter/ipvs/ip_vs_core.c | 6 +-
net/netfilter/ipvs/ip_vs_xmit.c | 20 +--
net/netfilter/nft_tunnel.c | 44 +++---
net/openvswitch/flow_netlink.c | 61 +++++----
net/psample/psample.c | 26 ++--
net/sched/act_tunnel_key.c | 36 ++---
net/sched/cls_flower.c | 27 ++--
40 files changed, 715 insertions(+), 415 deletions(-)

diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 683203f87ae2..f5b001c7f48c 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -61,6 +61,7 @@ struct bareudp_dev {
static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
{
struct metadata_dst *tun_dst = NULL;
+ IP_TUNNEL_DECLARE_FLAGS(key) = { };
struct bareudp_dev *bareudp;
unsigned short family;
unsigned int len;
@@ -137,7 +138,10 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
bareudp->dev->stats.rx_dropped++;
goto drop;
}
- tun_dst = udp_tun_rx_dst(skb, family, TUNNEL_KEY, 0, 0);
+
+ __set_bit(IP_TUNNEL_KEY_BIT, key);
+
+ tun_dst = udp_tun_rx_dst(skb, family, key, 0, 0);
if (!tun_dst) {
bareudp->dev->stats.rx_dropped++;
goto drop;
@@ -291,10 +295,10 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct bareudp_dev *bareudp,
const struct ip_tunnel_info *info)
{
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
bool xnet = !net_eq(bareudp->net, dev_net(bareudp->dev));
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct socket *sock = rcu_dereference(bareudp->sock);
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
const struct ip_tunnel_key *key = &info->key;
struct rtable *rt;
__be16 sport, df;
@@ -320,7 +324,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
true);
tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
ttl = key->ttl;
- df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
+ df = test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) ?
+ htons(IP_DF) : 0;
skb_scrub_packet(skb, xnet);

err = -ENOSPC;
@@ -342,7 +347,8 @@ static int bareudp_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel_xmit_skb(rt, sock->sk, skb, saddr, info->key.u.ipv4.dst,
tos, ttl, df, sport, bareudp->port,
!net_eq(bareudp->net, dev_net(bareudp->dev)),
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;

free_dst:
@@ -354,10 +360,10 @@ static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct bareudp_dev *bareudp,
const struct ip_tunnel_info *info)
{
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
bool xnet = !net_eq(bareudp->net, dev_net(bareudp->dev));
bool use_cache = ip_tunnel_dst_cache_usable(skb, info);
struct socket *sock = rcu_dereference(bareudp->sock);
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
const struct ip_tunnel_key *key = &info->key;
struct dst_entry *dst = NULL;
struct in6_addr saddr, daddr;
@@ -404,7 +410,8 @@ static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel6_xmit_skb(dst, sock->sk, skb, dev,
&saddr, &daddr, prio, ttl,
info->key.label, sport, bareudp->port,
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;

free_dst:
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
index 92065568bb19..6873c1201803 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h
@@ -117,7 +117,7 @@ bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,

bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b,
- __be16 tun_flags);
+ u32 tun_type);
#endif /* CONFIG_MLX5_ESWITCH */

#endif //__MLX5_EN_TC_TUNNEL_H__
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
index 1730f6a716ee..32b9c15e6c88 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.c
@@ -586,7 +586,7 @@ bool mlx5e_tc_tun_encap_info_equal_generic(struct mlx5e_encap_key *a,

bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b,
- __be16 tun_flags)
+ u32 tun_type)
{
struct ip_tunnel_info *a_info;
struct ip_tunnel_info *b_info;
@@ -595,8 +595,8 @@ bool mlx5e_tc_tun_encap_info_equal_options(struct mlx5e_encap_key *a,
if (!mlx5e_tc_tun_encap_info_equal_generic(a, b))
return false;

- a_has_opts = !!(a->ip_tun_key->tun_flags & tun_flags);
- b_has_opts = !!(b->ip_tun_key->tun_flags & tun_flags);
+ a_has_opts = test_bit(tun_type, a->ip_tun_key->tun_flags);
+ b_has_opts = test_bit(tun_type, b->ip_tun_key->tun_flags);

/* keys are equal when both don't have any options attached */
if (!a_has_opts && !b_has_opts)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
index 2bcd10b6d653..bf969212cc77 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.c
@@ -106,12 +106,13 @@ static int mlx5e_gen_ip_tunnel_header_geneve(char buf[],
memset(geneveh, 0, sizeof(*geneveh));
geneveh->ver = MLX5E_GENEVE_VER;
geneveh->opt_len = tun_info->options_len / 4;
- geneveh->oam = !!(tun_info->key.tun_flags & TUNNEL_OAM);
- geneveh->critical = !!(tun_info->key.tun_flags & TUNNEL_CRIT_OPT);
+ geneveh->oam = test_bit(IP_TUNNEL_OAM_BIT, tun_info->key.tun_flags);
+ geneveh->critical = test_bit(IP_TUNNEL_CRIT_OPT_BIT,
+ tun_info->key.tun_flags);
mlx5e_tunnel_id_to_vni(tun_info->key.tun_id, geneveh->vni);
geneveh->proto_type = htons(ETH_P_TEB);

- if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_info->key.tun_flags)) {
if (!geneveh->opt_len)
return -EOPNOTSUPP;
ip_tunnel_info_opts_get(geneveh->options, tun_info);
@@ -188,7 +189,7 @@ static int mlx5e_tc_tun_parse_geneve_options(struct mlx5e_priv *priv,

/* make sure that we're talking about GENEVE options */

- if (enc_opts.key->dst_opt_type != TUNNEL_GENEVE_OPT) {
+ if (enc_opts.key->dst_opt_type != IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG_MOD(extack,
"Matching on GENEVE options: option type is not GENEVE");
netdev_warn(priv->netdev,
@@ -337,7 +338,8 @@ static int mlx5e_tc_tun_parse_geneve(struct mlx5e_priv *priv,
static bool mlx5e_tc_tun_encap_info_equal_geneve(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b)
{
- return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_GENEVE_OPT);
+ return mlx5e_tc_tun_encap_info_equal_options(a, b,
+ IP_TUNNEL_GENEVE_OPT_BIT);
}

struct mlx5e_tc_tunnel geneve_tunnel = {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
index ada14f0574dc..579eda89fc76 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.c
@@ -31,12 +31,16 @@ static int mlx5e_gen_ip_tunnel_header_gretap(char buf[],
const struct ip_tunnel_key *tun_key = &e->tun_info->key;
struct gre_base_hdr *greh = (struct gre_base_hdr *)(buf);
__be32 tun_id = tunnel_id_to_key32(tun_key->tun_id);
+ IP_TUNNEL_DECLARE_FLAGS(unsupp) = { };
int hdr_len;

*ip_proto = IPPROTO_GRE;

/* the HW does not calculate GRE csum or sequences */
- if (tun_key->tun_flags & (TUNNEL_CSUM | TUNNEL_SEQ))
+ __set_bit(IP_TUNNEL_CSUM_BIT, unsupp);
+ __set_bit(IP_TUNNEL_SEQ_BIT, unsupp);
+
+ if (ip_tunnel_flags_intersect(tun_key->tun_flags, unsupp))
return -EOPNOTSUPP;

greh->protocol = htons(ETH_P_TEB);
@@ -44,7 +48,7 @@ static int mlx5e_gen_ip_tunnel_header_gretap(char buf[],
/* GRE key */
hdr_len = mlx5e_tc_tun_calc_hlen_gretap(e);
greh->flags = gre_tnl_flags_to_gre_flags(tun_key->tun_flags);
- if (tun_key->tun_flags & TUNNEL_KEY) {
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags)) {
__be32 *ptr = (__be32 *)(((u8 *)greh) + hdr_len - 4);
*ptr = tun_id;
}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
index a184d739d5f8..e4e487c8431b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c
@@ -90,7 +90,7 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
const struct vxlan_metadata *md;
struct vxlanhdr *vxh;

- if ((tun_key->tun_flags & TUNNEL_VXLAN_OPT) &&
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_key->tun_flags) &&
e->tun_info->options_len != sizeof(*md))
return -EOPNOTSUPP;
vxh = (struct vxlanhdr *)((char *)udp + sizeof(struct udphdr));
@@ -99,7 +99,7 @@ static int mlx5e_gen_ip_tunnel_header_vxlan(char buf[],
udp->dest = tun_key->tp_dst;
vxh->vx_flags = VXLAN_HF_VNI;
vxh->vx_vni = vxlan_vni_field(tun_id);
- if (tun_key->tun_flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_key->tun_flags)) {
md = ip_tunnel_info_opts(e->tun_info);
vxlan_build_gbp_hdr(vxh, md);
}
@@ -125,7 +125,7 @@ static int mlx5e_tc_tun_parse_vxlan_gbp_option(struct mlx5e_priv *priv,
return -EOPNOTSUPP;
}

- if (enc_opts.key->dst_opt_type != TUNNEL_VXLAN_OPT) {
+ if (enc_opts.key->dst_opt_type != IP_TUNNEL_VXLAN_OPT_BIT) {
NL_SET_ERR_MSG_MOD(extack, "Wrong VxLAN option type: not GBP");
return -EOPNOTSUPP;
}
@@ -208,7 +208,8 @@ static int mlx5e_tc_tun_parse_vxlan(struct mlx5e_priv *priv,
static bool mlx5e_tc_tun_encap_info_equal_vxlan(struct mlx5e_encap_key *a,
struct mlx5e_encap_key *b)
{
- return mlx5e_tc_tun_encap_info_equal_options(a, b, TUNNEL_VXLAN_OPT);
+ return mlx5e_tc_tun_encap_info_equal_options(a, b,
+ IP_TUNNEL_VXLAN_OPT_BIT);
}

static int mlx5e_tc_tun_get_remote_ifindex(struct net_device *mirred_dev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index c24828b688ac..82dea59dc80d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -5453,6 +5453,7 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
struct tunnel_match_enc_opts enc_opts = {};
struct mlx5_rep_uplink_priv *uplink_priv;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct mlx5e_rep_priv *uplink_rpriv;
struct metadata_dst *tun_dst;
struct tunnel_match_key key;
@@ -5460,6 +5461,8 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
struct net_device *dev;
int err;

+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+
enc_opts_id = tunnel_id & ENC_OPTS_BITS_MASK;
tun_id = tunnel_id >> ENC_OPTS_BITS;

@@ -5492,14 +5495,14 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb
case FLOW_DISSECTOR_KEY_IPV4_ADDRS:
tun_dst = __ip_tun_set_dst(key.enc_ipv4.src, key.enc_ipv4.dst,
key.enc_ip.tos, key.enc_ip.ttl,
- key.enc_tp.dst, TUNNEL_KEY,
+ key.enc_tp.dst, flags,
key32_to_tunnel_id(key.enc_key_id.keyid),
enc_opts.key.len);
break;
case FLOW_DISSECTOR_KEY_IPV6_ADDRS:
tun_dst = __ipv6_tun_set_dst(&key.enc_ipv6.src, &key.enc_ipv6.dst,
key.enc_ip.tos, key.enc_ip.ttl,
- key.enc_tp.dst, 0, TUNNEL_KEY,
+ key.enc_tp.dst, 0, flags,
key32_to_tunnel_id(key.enc_key_id.keyid),
enc_opts.key.len);
break;
@@ -5517,11 +5520,16 @@ static bool mlx5e_tc_restore_tunnel(struct mlx5e_priv *priv, struct sk_buff *skb

tun_dst->u.tun_info.key.tp_src = key.enc_tp.src;

- if (enc_opts.key.len)
+ if (enc_opts.key.len) {
+ ip_tunnel_flags_zero(flags);
+ if (enc_opts.key.dst_opt_type)
+ __set_bit(enc_opts.key.dst_opt_type, flags);
+
ip_tunnel_info_opts_set(&tun_dst->u.tun_info,
enc_opts.key.data,
enc_opts.key.len,
- enc_opts.key.dst_opt_type);
+ flags);
+ }

skb_dst_set(skb, (struct dst_entry *)tun_dst);
dev = dev_get_by_index(&init_net, key.filter_ifindex);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
index d67df358a52f..d761a1235994 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ipip.c
@@ -27,23 +27,23 @@ mlxsw_sp_ipip_netdev_parms6(const struct net_device *ol_dev)
static bool
mlxsw_sp_ipip_parms4_has_ikey(const struct ip_tunnel_parm_kern *parms)
{
- return !!(parms->i_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags);
}

static bool mlxsw_sp_ipip_parms6_has_ikey(const struct __ip6_tnl_parm *parms)
{
- return !!(parms->i_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags);
}

static bool
mlxsw_sp_ipip_parms4_has_okey(const struct ip_tunnel_parm_kern *parms)
{
- return !!(parms->o_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->o_flags);
}

static bool mlxsw_sp_ipip_parms6_has_okey(const struct __ip6_tnl_parm *parms)
{
- return !!(parms->o_flags & TUNNEL_KEY);
+ return test_bit(IP_TUNNEL_KEY_BIT, parms->o_flags);
}

static u32 mlxsw_sp_ipip_parms4_ikey(const struct ip_tunnel_parm_kern *parms)
@@ -242,12 +242,15 @@ static bool mlxsw_sp_ipip_can_offload_gre4(const struct mlxsw_sp *mlxsw_sp,
const struct net_device *ol_dev)
{
struct ip_tunnel *tunnel = netdev_priv(ol_dev);
- __be16 okflags = TUNNEL_KEY; /* We can't offload any other features. */
bool inherit_ttl = tunnel->parms.iph.ttl == 0;
bool inherit_tos = tunnel->parms.iph.tos & 0x1;
+ IP_TUNNEL_DECLARE_FLAGS(okflags) = { };

- return (tunnel->parms.i_flags & ~okflags) == 0 &&
- (tunnel->parms.o_flags & ~okflags) == 0 &&
+ /* We can't offload any other features. */
+ __set_bit(IP_TUNNEL_KEY_BIT, okflags);
+
+ return ip_tunnel_flags_subset(tunnel->parms.i_flags, okflags) &&
+ ip_tunnel_flags_subset(tunnel->parms.o_flags, okflags) &&
inherit_ttl && inherit_tos &&
mlxsw_sp_ipip_tunnel_complete(MLXSW_SP_L3_PROTO_IPV4, ol_dev);
}
@@ -443,10 +446,13 @@ static bool mlxsw_sp_ipip_can_offload_gre6(const struct mlxsw_sp *mlxsw_sp,
struct __ip6_tnl_parm tparm = mlxsw_sp_ipip_netdev_parms6(ol_dev);
bool inherit_tos = tparm.flags & IP6_TNL_F_USE_ORIG_TCLASS;
bool inherit_ttl = tparm.hop_limit == 0;
- __be16 okflags = TUNNEL_KEY; /* We can't offload any other features. */
+ IP_TUNNEL_DECLARE_FLAGS(okflags) = { };
+
+ /* We can't offload any other features. */
+ __set_bit(IP_TUNNEL_KEY_BIT, okflags);

- return (tparm.i_flags & ~okflags) == 0 &&
- (tparm.o_flags & ~okflags) == 0 &&
+ return ip_tunnel_flags_subset(tparm.i_flags, okflags) &&
+ ip_tunnel_flags_subset(tparm.o_flags, okflags) &&
inherit_ttl && inherit_tos &&
mlxsw_sp_ipip_tunnel_complete(MLXSW_SP_L3_PROTO_IPV6, ol_dev);
}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
index ee08184bd60f..8dd10ac2f226 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -461,7 +461,8 @@ mlxsw_sp_span_entry_gretap4_parms(struct mlxsw_sp *mlxsw_sp,

if (!(to_dev->flags & IFF_UP) ||
/* Reject tunnels with GRE keys, checksums, etc. */
- tparm.i_flags || tparm.o_flags ||
+ !ip_tunnel_flags_empty(tparm.i_flags) ||
+ !ip_tunnel_flags_empty(tparm.o_flags) ||
/* Require a fixed TTL and a TOS copied from the mirrored packet. */
inherit_ttl || !inherit_tos ||
/* A destination address may not be "any". */
@@ -565,7 +566,8 @@ mlxsw_sp_span_entry_gretap6_parms(struct mlxsw_sp *mlxsw_sp,

if (!(to_dev->flags & IFF_UP) ||
/* Reject tunnels with GRE keys, checksums, etc. */
- tparm.i_flags || tparm.o_flags ||
+ !ip_tunnel_flags_empty(tparm.i_flags) ||
+ !ip_tunnel_flags_empty(tparm.o_flags) ||
/* Require a fixed TTL and a TOS copied from the mirrored packet. */
inherit_ttl || !inherit_tos ||
/* A destination address may not be "any". */
diff --git a/drivers/net/ethernet/netronome/nfp/flower/action.c b/drivers/net/ethernet/netronome/nfp/flower/action.c
index 2b383d92d7f5..db6e060ad882 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/action.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/action.c
@@ -396,6 +396,17 @@ nfp_fl_push_geneve_options(struct nfp_fl_payload *nfp_fl, int *list_len,
return 0;
}

+#define NFP_FL_CHECK(flag) ({ \
+ IP_TUNNEL_DECLARE_FLAGS(__check) = { }; \
+ __be16 __res; \
+ \
+ __set_bit(IP_TUNNEL_##flag##_BIT, __check); \
+ __res = ip_tunnel_flags_to_be16(__check); \
+ \
+ BUILD_BUG_ON(__builtin_constant_p(__res) && \
+ NFP_FL_TUNNEL_##flag != __res); \
+})
+
static int
nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
const struct flow_action_entry *act,
@@ -410,6 +421,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
u32 tmp_set_ip_tun_type_index = 0;
/* Currently support one pre-tunnel so index is always 0. */
int pretun_idx = 0;
+ __be16 tun_flags;

if (!IS_ENABLED(CONFIG_IPV6) && ipv6)
return -EOPNOTSUPP;
@@ -417,9 +429,10 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
if (ipv6 && !(priv->flower_ext_feats & NFP_FL_FEATS_IPV6_TUN))
return -EOPNOTSUPP;

- BUILD_BUG_ON(NFP_FL_TUNNEL_CSUM != TUNNEL_CSUM ||
- NFP_FL_TUNNEL_KEY != TUNNEL_KEY ||
- NFP_FL_TUNNEL_GENEVE_OPT != TUNNEL_GENEVE_OPT);
+ NFP_FL_CHECK(CSUM);
+ NFP_FL_CHECK(KEY);
+ NFP_FL_CHECK(GENEVE_OPT);
+
if (ip_tun->options_len &&
(tun_type != NFP_FL_TUNNEL_GENEVE ||
!(priv->flower_ext_feats & NFP_FL_FEATS_GENEVE_OPT))) {
@@ -427,7 +440,9 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
return -EOPNOTSUPP;
}

- if (ip_tun->key.tun_flags & ~NFP_FL_SUPPORTED_UDP_TUN_FLAGS) {
+ tun_flags = ip_tunnel_flags_to_be16(ip_tun->key.tun_flags);
+ if (!ip_tunnel_flags_is_be16_compat(ip_tun->key.tun_flags) ||
+ (tun_flags & ~NFP_FL_SUPPORTED_UDP_TUN_FLAGS)) {
NL_SET_ERR_MSG_MOD(extack,
"unsupported offload: loaded firmware does not support tunnel flag offload");
return -EOPNOTSUPP;
@@ -442,7 +457,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
FIELD_PREP(NFP_FL_PRE_TUN_INDEX, pretun_idx);

set_tun->tun_type_index = cpu_to_be32(tmp_set_ip_tun_type_index);
- if (ip_tun->key.tun_flags & NFP_FL_TUNNEL_KEY)
+ if (tun_flags & NFP_FL_TUNNEL_KEY)
set_tun->tun_id = ip_tun->key.tun_id;

if (ip_tun->key.ttl) {
@@ -486,7 +501,7 @@ nfp_fl_set_tun(struct nfp_app *app, struct nfp_fl_set_tun *set_tun,
}

set_tun->tos = ip_tun->key.tos;
- set_tun->tun_flags = ip_tun->key.tun_flags;
+ set_tun->tun_flags = tun_flags;

if (tun_type == NFP_FL_TUNNEL_GENEVE) {
set_tun->tun_proto = htons(ETH_P_TEB);
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 78f9d588f712..f17a30c3bda5 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -225,10 +225,11 @@ static void geneve_rx(struct geneve_dev *geneve, struct geneve_sock *gs,
void *oiph;

if (ip_tunnel_collect_metadata() || gs->collect_md) {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };

- flags = TUNNEL_KEY | (gnvh->oam ? TUNNEL_OAM : 0) |
- (gnvh->critical ? TUNNEL_CRIT_OPT : 0);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __assign_bit(IP_TUNNEL_OAM_BIT, flags, gnvh->oam);
+ __assign_bit(IP_TUNNEL_CRIT_OPT_BIT, flags, gnvh->critical);

tun_dst = udp_tun_rx_dst(skb, geneve_get_sk_family(gs), flags,
vni_to_tunnel_id(gnvh->vni),
@@ -238,9 +239,11 @@ static void geneve_rx(struct geneve_dev *geneve, struct geneve_sock *gs,
goto drop;
}
/* Update tunnel dst according to Geneve options. */
+ ip_tunnel_flags_zero(flags);
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, flags);
ip_tunnel_info_opts_set(&tun_dst->u.tun_info,
gnvh->options, gnvh->opt_len * 4,
- TUNNEL_GENEVE_OPT);
+ flags);
} else {
/* Drop packets w/ critical options,
* since we don't support any...
@@ -738,14 +741,15 @@ static void geneve_build_header(struct genevehdr *geneveh,
{
geneveh->ver = GENEVE_VER;
geneveh->opt_len = info->options_len / 4;
- geneveh->oam = !!(info->key.tun_flags & TUNNEL_OAM);
- geneveh->critical = !!(info->key.tun_flags & TUNNEL_CRIT_OPT);
+ geneveh->oam = test_bit(IP_TUNNEL_OAM_BIT, info->key.tun_flags);
+ geneveh->critical = test_bit(IP_TUNNEL_CRIT_OPT_BIT,
+ info->key.tun_flags);
geneveh->rsvd1 = 0;
tunnel_id_to_vni(info->key.tun_id, geneveh->vni);
geneveh->proto_type = inner_proto;
geneveh->rsvd2 = 0;

- if (info->key.tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags))
ip_tunnel_info_opts_get(geneveh->options, info);
}

@@ -754,7 +758,7 @@ static int geneve_build_skb(struct dst_entry *dst, struct sk_buff *skb,
bool xnet, int ip_hdr_len,
bool inner_proto_inherit)
{
- bool udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
+ bool udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
struct genevehdr *gnvh;
__be16 inner_proto;
int min_headroom;
@@ -958,7 +962,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
ttl = key->ttl;

- df = key->tun_flags & TUNNEL_DONT_FRAGMENT ? htons(IP_DF) : 0;
+ df = test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) ?
+ htons(IP_DF) : 0;
} else {
tos = ip_tunnel_ecn_encap(full_tos, ip_hdr(skb), skb);
if (geneve->cfg.ttl_inherit)
@@ -991,7 +996,8 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel_xmit_skb(rt, gs4->sock->sk, skb, fl4.saddr, fl4.daddr,
tos, ttl, df, sport, geneve->cfg.info.key.tp_dst,
!net_eq(geneve->net, dev_net(geneve->dev)),
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
}

@@ -1071,7 +1077,8 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
udp_tunnel6_xmit_skb(dst, gs6->sock->sk, skb, dev,
&fl6.saddr, &fl6.daddr, prio, ttl,
info->key.label, sport, geneve->cfg.info.key.tp_dst,
- !(info->key.tun_flags & TUNNEL_CSUM));
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags));
return 0;
}
#endif
@@ -1351,7 +1358,8 @@ static struct geneve_dev *geneve_find_dev(struct geneve_net *gn,

static bool is_tnl_info_zero(const struct ip_tunnel_info *info)
{
- return !(info->key.tun_id || info->key.tun_flags || info->key.tos ||
+ return !(info->key.tun_id || info->key.tos ||
+ !ip_tunnel_flags_empty(info->key.tun_flags) ||
info->key.ttl || info->key.label || info->key.tp_src ||
memchr_inv(&info->key.u, 0, sizeof(info->key.u)));
}
@@ -1489,7 +1497,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
"Remote IPv6 address cannot be Multicast");
return -EINVAL;
}
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
cfg->use_udp6_rx_checksums = true;
#else
NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_REMOTE6],
@@ -1564,7 +1572,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
goto change_notsup;
}
if (nla_get_u8(data[IFLA_GENEVE_UDP_CSUM]))
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
}

if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]) {
@@ -1574,7 +1582,7 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
goto change_notsup;
}
if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
- info->key.tun_flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
#else
NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX],
"IPv6 support not enabled in the kernel");
@@ -1807,7 +1815,8 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
info->key.u.ipv4.dst))
goto nla_put_failure;
if (nla_put_u8(skb, IFLA_GENEVE_UDP_CSUM,
- !!(info->key.tun_flags & TUNNEL_CSUM)))
+ test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags)))
goto nla_put_failure;

#if IS_ENABLED(CONFIG_IPV6)
@@ -1816,7 +1825,8 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
&info->key.u.ipv6.dst))
goto nla_put_failure;
if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_TX,
- !(info->key.tun_flags & TUNNEL_CSUM)))
+ !test_bit(IP_TUNNEL_CSUM_BIT,
+ info->key.tun_flags)))
goto nla_put_failure;
#endif
}
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index e463f59e95c2..e2d740f0137a 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -1584,7 +1584,8 @@ static void vxlan_parse_gbp_hdr(struct vxlanhdr *unparsed,

tun_dst = (struct metadata_dst *)skb_dst(skb);
if (tun_dst) {
- tun_dst->u.tun_info.key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT,
+ tun_dst->u.tun_info.key.tun_flags);
tun_dst->u.tun_info.options_len = sizeof(*md);
}
if (gbp->dont_learn)
@@ -1716,9 +1717,11 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
goto drop;

if (vxlan_collect_metadata(vs)) {
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tun_dst;

- tun_dst = udp_tun_rx_dst(skb, vxlan_get_sk_family(vs), TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tun_dst = udp_tun_rx_dst(skb, vxlan_get_sk_family(vs), flags,
key32_to_tunnel_id(vni), sizeof(*md));

if (!tun_dst)
@@ -2497,7 +2500,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
vni = tunnel_id_to_key32(info->key.tun_id);
ifindex = 0;
dst_cache = &info->dst_cache;
- if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
if (info->options_len < sizeof(*md))
goto drop;
md = ip_tunnel_info_opts(info);
@@ -2507,7 +2510,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
#if IS_ENABLED(CONFIG_IPV6)
label = info->key.label;
#endif
- udp_sum = !!(info->key.tun_flags & TUNNEL_CSUM);
+ udp_sum = test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
}
src_port = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
vxlan->cfg.port_max, true);
@@ -2549,7 +2552,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
old_iph->frag_off & htons(IP_DF)))
df = htons(IP_DF);
}
- } else if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT) {
+ } else if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT,
+ info->key.tun_flags)) {
df = htons(IP_DF);
}

diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
index 1b7fae4c6b24..4160731dcb6e 100644
--- a/include/net/dst_metadata.h
+++ b/include/net/dst_metadata.h
@@ -198,7 +198,7 @@ static inline struct metadata_dst *__ip_tun_set_dst(__be32 saddr,
__be32 daddr,
__u8 tos, __u8 ttl,
__be16 tp_dst,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -215,7 +215,7 @@ static inline struct metadata_dst *__ip_tun_set_dst(__be32 saddr,
}

static inline struct metadata_dst *ip_tun_rx_dst(struct sk_buff *skb,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -230,7 +230,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad
__u8 tos, __u8 ttl,
__be16 tp_dst,
__be32 label,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
@@ -243,7 +243,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad

info = &tun_dst->u.tun_info;
info->mode = IP_TUNNEL_INFO_IPV6;
- info->key.tun_flags = flags;
+ ip_tunnel_flags_copy(info->key.tun_flags, flags);
info->key.tun_id = tunnel_id;
info->key.tp_src = 0;
info->key.tp_dst = tp_dst;
@@ -259,7 +259,7 @@ static inline struct metadata_dst *__ipv6_tun_set_dst(const struct in6_addr *sad
}

static inline struct metadata_dst *ipv6_tun_rx_dst(struct sk_buff *skb,
- __be16 flags,
+ const unsigned long *flags,
__be64 tunnel_id,
int md_size)
{
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 1a7131d6cb0e..9ab376d1a677 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -97,7 +97,7 @@ struct flow_dissector_key_enc_opts {
* here but seems difficult to #include
*/
u8 len;
- __be16 dst_opt_type;
+ u32 dst_opt_type;
};

struct flow_dissector_key_keyid {
diff --git a/include/net/gre.h b/include/net/gre.h
index 4e209708b754..ccd293203284 100644
--- a/include/net/gre.h
+++ b/include/net/gre.h
@@ -49,67 +49,61 @@ static inline bool netif_is_ip6gretap(const struct net_device *dev)
!strcmp(dev->rtnl_link_ops->kind, "ip6gretap");
}

-static inline int gre_calc_hlen(__be16 o_flags)
+static inline int gre_calc_hlen(const unsigned long *o_flags)
{
int addend = 4;

- if (o_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, o_flags))
addend += 4;
- if (o_flags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, o_flags))
addend += 4;
- if (o_flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, o_flags))
addend += 4;
return addend;
}

-static inline __be16 gre_flags_to_tnl_flags(__be16 flags)
+static inline void gre_flags_to_tnl_flags(unsigned long *dst, __be16 flags)
{
- __be16 tflags = 0;
-
- if (flags & GRE_CSUM)
- tflags |= TUNNEL_CSUM;
- if (flags & GRE_ROUTING)
- tflags |= TUNNEL_ROUTING;
- if (flags & GRE_KEY)
- tflags |= TUNNEL_KEY;
- if (flags & GRE_SEQ)
- tflags |= TUNNEL_SEQ;
- if (flags & GRE_STRICT)
- tflags |= TUNNEL_STRICT;
- if (flags & GRE_REC)
- tflags |= TUNNEL_REC;
- if (flags & GRE_VERSION)
- tflags |= TUNNEL_VERSION;
-
- return tflags;
+ IP_TUNNEL_DECLARE_FLAGS(res) = { };
+
+ __assign_bit(IP_TUNNEL_CSUM_BIT, res, flags & GRE_CSUM);
+ __assign_bit(IP_TUNNEL_ROUTING_BIT, res, flags & GRE_ROUTING);
+ __assign_bit(IP_TUNNEL_KEY_BIT, res, flags & GRE_KEY);
+ __assign_bit(IP_TUNNEL_SEQ_BIT, res, flags & GRE_SEQ);
+ __assign_bit(IP_TUNNEL_STRICT_BIT, res, flags & GRE_STRICT);
+ __assign_bit(IP_TUNNEL_REC_BIT, res, flags & GRE_REC);
+ __assign_bit(IP_TUNNEL_VERSION_BIT, res, flags & GRE_VERSION);
+
+ ip_tunnel_flags_copy(dst, res);
}

-static inline __be16 gre_tnl_flags_to_gre_flags(__be16 tflags)
+static inline __be16 gre_tnl_flags_to_gre_flags(const unsigned long *tflags)
{
__be16 flags = 0;

- if (tflags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tflags))
flags |= GRE_CSUM;
- if (tflags & TUNNEL_ROUTING)
+ if (test_bit(IP_TUNNEL_ROUTING_BIT, tflags))
flags |= GRE_ROUTING;
- if (tflags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, tflags))
flags |= GRE_KEY;
- if (tflags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tflags))
flags |= GRE_SEQ;
- if (tflags & TUNNEL_STRICT)
+ if (test_bit(IP_TUNNEL_STRICT_BIT, tflags))
flags |= GRE_STRICT;
- if (tflags & TUNNEL_REC)
+ if (test_bit(IP_TUNNEL_REC_BIT, tflags))
flags |= GRE_REC;
- if (tflags & TUNNEL_VERSION)
+ if (test_bit(IP_TUNNEL_VERSION_BIT, tflags))
flags |= GRE_VERSION;

return flags;
}

static inline void gre_build_header(struct sk_buff *skb, int hdr_len,
- __be16 flags, __be16 proto,
+ const unsigned long *flags, __be16 proto,
__be32 key, __be32 seq)
{
+ IP_TUNNEL_DECLARE_FLAGS(cond) = { };
struct gre_base_hdr *greh;

skb_push(skb, hdr_len);
@@ -120,18 +114,22 @@ static inline void gre_build_header(struct sk_buff *skb, int hdr_len,
greh->flags = gre_tnl_flags_to_gre_flags(flags);
greh->protocol = proto;

- if (flags & (TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_SEQ)) {
+ __set_bit(IP_TUNNEL_KEY_BIT, cond);
+ __set_bit(IP_TUNNEL_CSUM_BIT, cond);
+ __set_bit(IP_TUNNEL_SEQ_BIT, cond);
+
+ if (ip_tunnel_flags_intersect(flags, cond)) {
__be32 *ptr = (__be32 *)(((u8 *)greh) + hdr_len - 4);

- if (flags & TUNNEL_SEQ) {
+ if (test_bit(IP_TUNNEL_SEQ_BIT, flags)) {
*ptr = seq;
ptr--;
}
- if (flags & TUNNEL_KEY) {
+ if (test_bit(IP_TUNNEL_KEY_BIT, flags)) {
*ptr = key;
ptr--;
}
- if (flags & TUNNEL_CSUM &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, flags) &&
!(skb_shinfo(skb)->gso_type &
(SKB_GSO_GRE | SKB_GSO_GRE_CSUM))) {
*ptr = 0;
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index 74b369bddf49..399592405c72 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -30,8 +30,8 @@ struct __ip6_tnl_parm {
struct in6_addr laddr; /* local tunnel end-point address */
struct in6_addr raddr; /* remote tunnel end-point address */

- __be16 i_flags;
- __be16 o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(i_flags);
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
__be32 i_key;
__be32 o_key;

diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index 6da667b743e1..bb2b00b65562 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -36,6 +36,24 @@
(sizeof_field(struct ip_tunnel_key, u) - \
sizeof_field(struct ip_tunnel_key, u.ipv4))

+#define __ipt_flag_op(op, ...) \
+ op(__VA_ARGS__, __IP_TUNNEL_FLAG_NUM)
+
+#define IP_TUNNEL_DECLARE_FLAGS(...) \
+ __ipt_flag_op(DECLARE_BITMAP, __VA_ARGS__)
+
+#define ip_tunnel_flags_zero(...) __ipt_flag_op(bitmap_zero, __VA_ARGS__)
+#define ip_tunnel_flags_copy(...) __ipt_flag_op(bitmap_copy, __VA_ARGS__)
+#define ip_tunnel_flags_and(...) __ipt_flag_op(bitmap_and, __VA_ARGS__)
+#define ip_tunnel_flags_or(...) __ipt_flag_op(bitmap_or, __VA_ARGS__)
+
+#define ip_tunnel_flags_empty(...) \
+ __ipt_flag_op(bitmap_empty, __VA_ARGS__)
+#define ip_tunnel_flags_intersect(...) \
+ __ipt_flag_op(bitmap_intersects, __VA_ARGS__)
+#define ip_tunnel_flags_subset(...) \
+ __ipt_flag_op(bitmap_subset, __VA_ARGS__)
+
struct ip_tunnel_key {
__be64 tun_id;
union {
@@ -48,11 +66,11 @@ struct ip_tunnel_key {
struct in6_addr dst;
} ipv6;
} u;
- __be16 tun_flags;
- u8 tos; /* TOS for IPv4, TC for IPv6 */
- u8 ttl; /* TTL for IPv4, HL for IPv6 */
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags);
__be32 label; /* Flow Label for IPv6 */
u32 nhid;
+ u8 tos; /* TOS for IPv4, TC for IPv6 */
+ u8 ttl; /* TTL for IPv4, HL for IPv6 */
__be16 tp_src;
__be16 tp_dst;
__u8 flow_flags;
@@ -110,14 +128,14 @@ struct ip_tunnel_prl_entry {

struct metadata_dst;

-/* Kernel-side copy of ip_tunnel_parm */
+/* Kernel-side variant of ip_tunnel_parm */
struct ip_tunnel_parm_kern {
char name[IFNAMSIZ];
- int link;
- __be16 i_flags;
- __be16 o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(i_flags);
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
__be32 i_key;
__be32 o_key;
+ int link;
struct iphdr iph;
};

@@ -168,7 +186,7 @@ struct ip_tunnel {
};

struct tnl_ptk_info {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 proto;
__be32 key;
__be32 seq;
@@ -190,11 +208,77 @@ struct ip_tunnel_net {
int type;
};

+static inline void ip_tunnel_set_options_present(unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ ip_tunnel_flags_or(flags, flags, present);
+}
+
+static inline void ip_tunnel_clear_options_present(unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ __ipt_flag_op(bitmap_andnot, flags, flags, present);
+}
+
+static inline bool ip_tunnel_is_options_present(const unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, present);
+ __set_bit(IP_TUNNEL_GTP_OPT_BIT, present);
+
+ return ip_tunnel_flags_intersect(flags, present);
+}
+
+static inline bool ip_tunnel_flags_is_be16_compat(const unsigned long *flags)
+{
+ IP_TUNNEL_DECLARE_FLAGS(supp) = { };
+
+ bitmap_set(supp, 0, BITS_PER_TYPE(__be16));
+ __set_bit(IP_TUNNEL_VTI_BIT, supp);
+
+ return ip_tunnel_flags_subset(flags, supp);
+}
+
+static inline void ip_tunnel_flags_from_be16(unsigned long *dst, __be16 flags)
+{
+ ip_tunnel_flags_zero(dst);
+
+ bitmap_set_bits(dst, 0, be16_to_cpu(flags), BITS_PER_TYPE(__be16));
+ __assign_bit(IP_TUNNEL_VTI_BIT, dst, flags & VTI_ISVTI);
+}
+
+static inline __be16 ip_tunnel_flags_to_be16(const unsigned long *flags)
+{
+ __be16 ret;
+
+ ret = cpu_to_be16(bitmap_get_bits(flags, 0, BITS_PER_TYPE(__be16)));
+ if (test_bit(IP_TUNNEL_VTI_BIT, flags))
+ ret |= VTI_ISVTI;
+
+ return ret;
+}
+
static inline void ip_tunnel_key_init(struct ip_tunnel_key *key,
__be32 saddr, __be32 daddr,
u8 tos, u8 ttl, __be32 label,
__be16 tp_src, __be16 tp_dst,
- __be64 tun_id, __be16 tun_flags)
+ __be64 tun_id,
+ const unsigned long *tun_flags)
{
key->tun_id = tun_id;
key->u.ipv4.src = saddr;
@@ -204,7 +288,7 @@ static inline void ip_tunnel_key_init(struct ip_tunnel_key *key,
key->tos = tos;
key->ttl = ttl;
key->label = label;
- key->tun_flags = tun_flags;
+ ip_tunnel_flags_copy(key->tun_flags, tun_flags);

/* For the tunnel types on the top of IPsec, the tp_src and tp_dst of
* the upper tunnel are used.
@@ -225,12 +309,8 @@ ip_tunnel_dst_cache_usable(const struct sk_buff *skb,
{
if (skb->mark)
return false;
- if (!info)
- return true;
- if (info->key.tun_flags & TUNNEL_NOCACHE)
- return false;

- return true;
+ return !info || !test_bit(IP_TUNNEL_NOCACHE_BIT, info->key.tun_flags);
}

static inline unsigned short ip_tunnel_info_af(const struct ip_tunnel_info
@@ -312,7 +392,7 @@ int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict);
int ip_tunnel_change_mtu(struct net_device *dev, int new_mtu);

struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
- int link, __be16 flags,
+ int link, const unsigned long *flags,
__be32 remote, __be32 local,
__be32 key);

@@ -517,12 +597,13 @@ static inline void ip_tunnel_info_opts_get(void *to,

static inline void ip_tunnel_info_opts_set(struct ip_tunnel_info *info,
const void *from, int len,
- __be16 flags)
+ const unsigned long *flags)
{
info->options_len = len;
if (len > 0) {
memcpy(ip_tunnel_info_opts(info), from, len);
- info->key.tun_flags |= flags;
+ ip_tunnel_flags_or(info->key.tun_flags, info->key.tun_flags,
+ flags);
}
}

@@ -566,7 +647,7 @@ static inline void ip_tunnel_info_opts_get(void *to,

static inline void ip_tunnel_info_opts_set(struct ip_tunnel_info *info,
const void *from, int len,
- __be16 flags)
+ const unsigned long *flags)
{
info->options_len = 0;
}
diff --git a/include/net/udp_tunnel.h b/include/net/udp_tunnel.h
index 0ca9b7a11baf..ff41b18651c4 100644
--- a/include/net/udp_tunnel.h
+++ b/include/net/udp_tunnel.h
@@ -162,8 +162,8 @@ int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,
void udp_tunnel_sock_release(struct socket *sock);

struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
- __be16 flags, __be64 tunnel_id,
- int md_size);
+ const unsigned long *flags,
+ __be64 tunnel_id, int md_size);

#ifdef CONFIG_INET
static inline int udp_tunnel_handle_offloads(struct sk_buff *skb, bool udp_csum)
diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
index 102119628ff5..838927dd73a4 100644
--- a/include/uapi/linux/if_tunnel.h
+++ b/include/uapi/linux/if_tunnel.h
@@ -161,6 +161,14 @@ enum {

#define IFLA_VTI_MAX (__IFLA_VTI_MAX - 1)

+#ifndef __KERNEL__
+/* Historically, tunnel flags have been defined as __be16 and now there are
+ * no free bits left. It is strongly advised to switch the already existing
+ * userspace code to the new *_BIT definitions from down below, as __be16
+ * can't be simply cast to a wider type on LE systems. All new flags and
+ * code must use *_BIT only.
+ */
+
#define TUNNEL_CSUM __cpu_to_be16(0x01)
#define TUNNEL_ROUTING __cpu_to_be16(0x02)
#define TUNNEL_KEY __cpu_to_be16(0x04)
@@ -181,5 +189,30 @@ enum {
#define TUNNEL_OPTIONS_PRESENT \
(TUNNEL_GENEVE_OPT | TUNNEL_VXLAN_OPT | TUNNEL_ERSPAN_OPT | \
TUNNEL_GTP_OPT)
+#endif
+
+enum {
+ IP_TUNNEL_CSUM_BIT = 0U,
+ IP_TUNNEL_ROUTING_BIT,
+ IP_TUNNEL_KEY_BIT,
+ IP_TUNNEL_SEQ_BIT,
+ IP_TUNNEL_STRICT_BIT,
+ IP_TUNNEL_REC_BIT,
+ IP_TUNNEL_VERSION_BIT,
+ IP_TUNNEL_NO_KEY_BIT,
+ IP_TUNNEL_DONT_FRAGMENT_BIT,
+ IP_TUNNEL_OAM_BIT,
+ IP_TUNNEL_CRIT_OPT_BIT,
+ IP_TUNNEL_GENEVE_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_VXLAN_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_NOCACHE_BIT,
+ IP_TUNNEL_ERSPAN_OPT_BIT, /* OPTIONS_PRESENT */
+ IP_TUNNEL_GTP_OPT_BIT, /* OPTIONS_PRESENT */
+
+ IP_TUNNEL_VTI_BIT,
+ IP_TUNNEL_SIT_ISATAP_BIT = IP_TUNNEL_VTI_BIT,
+
+ __IP_TUNNEL_FLAG_NUM,
+};

#endif /* _UAPI_IF_TUNNEL_H_ */
diff --git a/net/bridge/br_vlan_tunnel.c b/net/bridge/br_vlan_tunnel.c
index 81833ca7a2c7..a966a6ec8263 100644
--- a/net/bridge/br_vlan_tunnel.c
+++ b/net/bridge/br_vlan_tunnel.c
@@ -65,13 +65,14 @@ static int __vlan_tunnel_info_add(struct net_bridge_vlan_group *vg,
{
struct metadata_dst *metadata = rtnl_dereference(vlan->tinfo.tunnel_dst);
__be64 key = key32_to_tunnel_id(cpu_to_be32(tun_id));
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
int err;

if (metadata)
return -EEXIST;

- metadata = __ip_tun_set_dst(0, 0, 0, 0, 0, TUNNEL_KEY,
- key, 0);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ metadata = __ip_tun_set_dst(0, 0, 0, 0, 0, flags, key, 0);
if (!metadata)
return -EINVAL;

@@ -185,6 +186,7 @@ void br_handle_ingress_vlan_tunnel(struct sk_buff *skb,
int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
struct net_bridge_vlan *vlan)
{
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tunnel_dst;
__be64 tunnel_id;
int err;
@@ -202,7 +204,8 @@ int br_handle_egress_vlan_tunnel(struct sk_buff *skb,
return err;

if (BR_INPUT_SKB_CB(skb)->backup_nhid) {
- tunnel_dst = __ip_tun_set_dst(0, 0, 0, 0, 0, TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tunnel_dst = __ip_tun_set_dst(0, 0, 0, 0, 0, flags,
tunnel_id, 0);
if (!tunnel_dst)
return -ENOMEM;
diff --git a/net/core/filter.c b/net/core/filter.c
index a094694899c9..037bf6925bd4 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4583,7 +4583,7 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
to->tunnel_tos = info->key.tos;
to->tunnel_ttl = info->key.ttl;
if (flags & BPF_F_TUNINFO_FLAGS)
- to->tunnel_flags = info->key.tun_flags;
+ to->tunnel_flags = ip_tunnel_flags_to_be16(info->key.tun_flags);
else
to->tunnel_ext = 0;

@@ -4626,7 +4626,7 @@ BPF_CALL_3(bpf_skb_get_tunnel_opt, struct sk_buff *, skb, u8 *, to, u32, size)
int err;

if (unlikely(!info ||
- !(info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))) {
+ !ip_tunnel_is_options_present(info->key.tun_flags))) {
err = -ENOENT;
goto err_clear;
}
@@ -4696,15 +4696,15 @@ BPF_CALL_4(bpf_skb_set_tunnel_key, struct sk_buff *, skb,
memset(info, 0, sizeof(*info));
info->mode = IP_TUNNEL_INFO_TX;

- info->key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
- if (flags & BPF_F_DONT_FRAGMENT)
- info->key.tun_flags |= TUNNEL_DONT_FRAGMENT;
- if (flags & BPF_F_ZERO_CSUM_TX)
- info->key.tun_flags &= ~TUNNEL_CSUM;
- if (flags & BPF_F_SEQ_NUMBER)
- info->key.tun_flags |= TUNNEL_SEQ;
- if (flags & BPF_F_NO_TUNNEL_KEY)
- info->key.tun_flags &= ~TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_NOCACHE_BIT, info->key.tun_flags);
+ __assign_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, info->key.tun_flags,
+ flags & BPF_F_DONT_FRAGMENT);
+ __assign_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags,
+ !(flags & BPF_F_ZERO_CSUM_TX));
+ __assign_bit(IP_TUNNEL_SEQ_BIT, info->key.tun_flags,
+ flags & BPF_F_SEQ_NUMBER);
+ __assign_bit(IP_TUNNEL_KEY_BIT, info->key.tun_flags,
+ !(flags & BPF_F_NO_TUNNEL_KEY));

info->key.tun_id = cpu_to_be64(from->tunnel_id);
info->key.tos = from->tunnel_tos;
@@ -4742,13 +4742,15 @@ BPF_CALL_3(bpf_skb_set_tunnel_opt, struct sk_buff *, skb,
{
struct ip_tunnel_info *info = skb_tunnel_info(skb);
const struct metadata_dst *md = this_cpu_ptr(md_dst);
+ IP_TUNNEL_DECLARE_FLAGS(present) = { };

if (unlikely(info != &md->u.tun_info || (size & (sizeof(u32) - 1))))
return -EINVAL;
if (unlikely(size > IP_TUNNEL_OPTS_MAX))
return -ENOMEM;

- ip_tunnel_info_opts_set(info, from, size, TUNNEL_OPTIONS_PRESENT);
+ ip_tunnel_set_options_present(present);
+ ip_tunnel_info_opts_set(info, from, size, present);

return 0;
}
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index b3b3af0e7844..cb75aabea554 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -455,17 +455,25 @@ skb_flow_dissect_tunnel_info(const struct sk_buff *skb,

if (dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_ENC_OPTS)) {
struct flow_dissector_key_enc_opts *enc_opt;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+ u32 val;

enc_opt = skb_flow_dissector_target(flow_dissector,
FLOW_DISSECTOR_KEY_ENC_OPTS,
target_container);

- if (info->options_len) {
- enc_opt->len = info->options_len;
- ip_tunnel_info_opts_get(enc_opt->data, info);
- enc_opt->dst_opt_type = info->key.tun_flags &
- TUNNEL_OPTIONS_PRESENT;
- }
+ if (!info->options_len)
+ return;
+
+ enc_opt->len = info->options_len;
+ ip_tunnel_info_opts_get(enc_opt->data, info);
+
+ ip_tunnel_set_options_present(flags);
+ ip_tunnel_flags_and(flags, info->key.tun_flags, flags);
+
+ val = find_next_bit(flags, __IP_TUNNEL_FLAG_NUM,
+ IP_TUNNEL_GENEVE_OPT_BIT);
+ enc_opt->dst_opt_type = val < __IP_TUNNEL_FLAG_NUM ? val : 0;
}
}
EXPORT_SYMBOL(skb_flow_dissect_tunnel_info);
diff --git a/net/ipv4/fou_bpf.c b/net/ipv4/fou_bpf.c
index 3760a14b6b57..219554388921 100644
--- a/net/ipv4/fou_bpf.c
+++ b/net/ipv4/fou_bpf.c
@@ -66,7 +66,7 @@ __bpf_kfunc int bpf_skb_set_fou_encap(struct __sk_buff *skb_ctx,
info->encap.type = TUNNEL_ENCAP_NONE;
}

- if (info->key.tun_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags))
info->encap.flags |= TUNNEL_ENCAP_FLAG_CSUM;

info->encap.sport = encap->sport;
diff --git a/net/ipv4/gre_demux.c b/net/ipv4/gre_demux.c
index cbb2b4bb0dfa..01765891f82b 100644
--- a/net/ipv4/gre_demux.c
+++ b/net/ipv4/gre_demux.c
@@ -73,7 +73,7 @@ int gre_parse_header(struct sk_buff *skb, struct tnl_ptk_info *tpi,
if (unlikely(greh->flags & (GRE_VERSION | GRE_ROUTING)))
return -EINVAL;

- tpi->flags = gre_flags_to_tnl_flags(greh->flags);
+ gre_flags_to_tnl_flags(tpi->flags, greh->flags);
hdr_len = gre_calc_hlen(tpi->flags);

if (!pskb_may_pull(skb, nhs + hdr_len))
diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index b65318a55ae8..060446fb6483 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -265,6 +265,7 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
struct net *net = dev_net(skb->dev);
struct metadata_dst *tun_dst = NULL;
struct erspan_base_hdr *ershdr;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct ip_tunnel_net *itn;
struct ip_tunnel *tunnel;
const struct iphdr *iph;
@@ -272,18 +273,20 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
int ver;
int len;

+ ip_tunnel_flags_copy(flags, tpi->flags);
+
itn = net_generic(net, erspan_net_id);
iph = ip_hdr(skb);
if (is_erspan_type1(gre_hdr_len)) {
ver = 0;
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex,
- tpi->flags | TUNNEL_NO_KEY,
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, 0);
} else {
ershdr = (struct erspan_base_hdr *)(skb->data + gre_hdr_len);
ver = ershdr->ver;
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex,
- tpi->flags | TUNNEL_KEY,
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, tpi->key);
}

@@ -307,10 +310,9 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
struct ip_tunnel_info *info;
unsigned char *gh;
__be64 tun_id;
- __be16 flags;

- tpi->flags |= TUNNEL_KEY;
- flags = tpi->flags;
+ __set_bit(IP_TUNNEL_KEY_BIT, tpi->flags);
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);

tun_dst = ip_tun_rx_dst(skb, flags,
@@ -333,7 +335,8 @@ static int erspan_rcv(struct sk_buff *skb, struct tnl_ptk_info *tpi,
ERSPAN_V2_MDSIZE);

info = &tun_dst->u.tun_info;
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ info->key.tun_flags);
info->options_len = sizeof(*md);
}

@@ -376,10 +379,13 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,

tnl_params = &tunnel->parms.iph;
if (tunnel->collect_md || tnl_params->daddr == 0) {
- __be16 flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
__be64 tun_id;

- flags = tpi->flags & (TUNNEL_CSUM | TUNNEL_KEY);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ ip_tunnel_flags_and(flags, tpi->flags, flags);
+
tun_id = key32_to_tunnel_id(tpi->key);
tun_dst = ip_tun_rx_dst(skb, flags, tun_id, 0);
if (!tun_dst)
@@ -459,12 +465,15 @@ static void __gre_xmit(struct sk_buff *skb, struct net_device *dev,
__be16 proto)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- __be16 flags = tunnel->parms.o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+
+ ip_tunnel_flags_copy(flags, tunnel->parms.o_flags);

/* Push GRE header. */
gre_build_header(skb, tunnel->tun_hlen,
flags, proto, tunnel->parms.o_key,
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);

ip_tunnel_xmit(skb, dev, tnl_params, tnl_params->protocol);
}
@@ -478,10 +487,10 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
__be16 proto)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
int tunnel_hlen;
- __be16 flags;

tun_info = skb_tunnel_info(skb);
if (unlikely(!tun_info || !(tun_info->mode & IP_TUNNEL_INFO_TX) ||
@@ -495,14 +504,19 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
goto err_free_skb;

/* Push Tunnel header. */
- if (gre_handle_offloads(skb, !!(tun_info->key.tun_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto err_free_skb;

- flags = tun_info->key.tun_flags &
- (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ ip_tunnel_flags_and(flags, tun_info->key.tun_flags, flags);
+
gre_build_header(skb, tunnel_hlen, flags, proto,
tunnel_id_to_key32(tun_info->key.tun_id),
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) : 0);

ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);

@@ -516,6 +530,7 @@ static void gre_fb_xmit(struct sk_buff *skb, struct net_device *dev,
static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct ip_tunnel_info *tun_info;
const struct ip_tunnel_key *key;
struct erspan_metadata *md;
@@ -531,7 +546,7 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
goto err_free_skb;

key = &tun_info->key;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
+ if (!test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_info->key.tun_flags))
goto err_free_skb;
if (tun_info->options_len < sizeof(*md))
goto err_free_skb;
@@ -584,8 +599,9 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev)
goto err_free_skb;
}

- gre_build_header(skb, 8, TUNNEL_SEQ,
- proto, 0, htonl(atomic_fetch_inc(&tunnel->o_seqno)));
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ gre_build_header(skb, 8, flags, proto, 0,
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)));

ip_md_tunnel_xmit(skb, dev, IPPROTO_GRE, tunnel_hlen);

@@ -656,7 +672,8 @@ static netdev_tx_t ipgre_xmit(struct sk_buff *skb,
tnl_params = &tunnel->parms.iph;
}

- if (gre_handle_offloads(skb, !!(tunnel->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto free_skb;

__gre_xmit(skb, dev, tnl_params, skb->protocol);
@@ -698,7 +715,7 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
/* Push ERSPAN header */
if (tunnel->erspan_ver == 0) {
proto = htons(ETH_P_ERSPAN);
- tunnel->parms.o_flags &= ~TUNNEL_SEQ;
+ __clear_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags);
} else if (tunnel->erspan_ver == 1) {
erspan_build_header(skb, ntohl(tunnel->parms.o_key),
tunnel->index,
@@ -713,7 +730,7 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
goto free_skb;
}

- tunnel->parms.o_flags &= ~TUNNEL_KEY;
+ __clear_bit(IP_TUNNEL_KEY_BIT, tunnel->parms.o_flags);
__gre_xmit(skb, dev, &tunnel->parms.iph, proto);
return NETDEV_TX_OK;

@@ -736,7 +753,8 @@ static netdev_tx_t gre_tap_xmit(struct sk_buff *skb,
return NETDEV_TX_OK;
}

- if (gre_handle_offloads(skb, !!(tunnel->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ tunnel->parms.o_flags)))
goto free_skb;

if (skb_cow_head(skb, dev->needed_headroom))
@@ -754,7 +772,6 @@ static netdev_tx_t gre_tap_xmit(struct sk_buff *skb,
static void ipgre_link_update(struct net_device *dev, bool set_mtu)
{
struct ip_tunnel *tunnel = netdev_priv(dev);
- __be16 flags;
int len;

len = tunnel->tun_hlen;
@@ -770,10 +787,9 @@ static void ipgre_link_update(struct net_device *dev, bool set_mtu)
if (set_mtu)
dev->mtu = max_t(int, dev->mtu - len, 68);

- flags = tunnel->parms.o_flags;
-
- if (flags & TUNNEL_SEQ ||
- (flags & TUNNEL_CSUM && tunnel->encap.type != TUNNEL_ENCAP_NONE)) {
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags) ||
+ (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.o_flags) &&
+ tunnel->encap.type != TUNNEL_ENCAP_NONE)) {
dev->features &= ~NETIF_F_GSO_SOFTWARE;
dev->hw_features &= ~NETIF_F_GSO_SOFTWARE;
} else {
@@ -786,17 +802,25 @@ static int ipgre_tunnel_ctl(struct net_device *dev,
struct ip_tunnel_parm_kern *p,
int cmd)
{
+ __be16 i_flags, o_flags;
int err;

+ if (!ip_tunnel_flags_is_be16_compat(p->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(p->o_flags))
+ return -EOVERFLOW;
+
+ i_flags = ip_tunnel_flags_to_be16(p->i_flags);
+ o_flags = ip_tunnel_flags_to_be16(p->o_flags);
+
if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
if (p->iph.version != 4 || p->iph.protocol != IPPROTO_GRE ||
p->iph.ihl != 5 || (p->iph.frag_off & htons(~IP_DF)) ||
- ((p->i_flags | p->o_flags) & (GRE_VERSION | GRE_ROUTING)))
+ ((i_flags | o_flags) & (GRE_VERSION | GRE_ROUTING)))
return -EINVAL;
}

- p->i_flags = gre_flags_to_tnl_flags(p->i_flags);
- p->o_flags = gre_flags_to_tnl_flags(p->o_flags);
+ gre_flags_to_tnl_flags(p->i_flags, i_flags);
+ gre_flags_to_tnl_flags(p->o_flags, o_flags);

err = ip_tunnel_ctl(dev, p, cmd);
if (err)
@@ -805,15 +829,18 @@ static int ipgre_tunnel_ctl(struct net_device *dev,
if (cmd == SIOCCHGTUNNEL) {
struct ip_tunnel *t = netdev_priv(dev);

- t->parms.i_flags = p->i_flags;
- t->parms.o_flags = p->o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p->i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p->o_flags);

if (strcmp(dev->rtnl_link_ops->kind, "erspan"))
ipgre_link_update(dev, true);
}

- p->i_flags = gre_tnl_flags_to_gre_flags(p->i_flags);
- p->o_flags = gre_tnl_flags_to_gre_flags(p->o_flags);
+ i_flags = gre_tnl_flags_to_gre_flags(p->i_flags);
+ ip_tunnel_flags_from_be16(p->i_flags, i_flags);
+ o_flags = gre_tnl_flags_to_gre_flags(p->o_flags);
+ ip_tunnel_flags_from_be16(p->o_flags, o_flags);
+
return 0;
}

@@ -953,7 +980,6 @@ static void ipgre_tunnel_setup(struct net_device *dev)
static void __gre_tunnel_init(struct net_device *dev)
{
struct ip_tunnel *tunnel;
- __be16 flags;

tunnel = netdev_priv(dev);
tunnel->tun_hlen = gre_calc_hlen(tunnel->parms.o_flags);
@@ -965,14 +991,13 @@ static void __gre_tunnel_init(struct net_device *dev)
dev->features |= GRE_FEATURES | NETIF_F_LLTX;
dev->hw_features |= GRE_FEATURES;

- flags = tunnel->parms.o_flags;
-
/* TCP offload with GRE SEQ is not supported, nor can we support 2
* levels of outer headers requiring an update.
*/
- if (flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.o_flags))
return;
- if (flags & TUNNEL_CSUM && tunnel->encap.type != TUNNEL_ENCAP_NONE)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.o_flags) &&
+ tunnel->encap.type != TUNNEL_ENCAP_NONE)
return;

dev->features |= NETIF_F_GSO_SOFTWARE;
@@ -1143,10 +1168,12 @@ static int ipgre_netlink_parms(struct net_device *dev,
parms->link = nla_get_u32(data[IFLA_GRE_LINK]);

if (data[IFLA_GRE_IFLAGS])
- parms->i_flags = gre_flags_to_tnl_flags(nla_get_be16(data[IFLA_GRE_IFLAGS]));
+ gre_flags_to_tnl_flags(parms->i_flags,
+ nla_get_be16(data[IFLA_GRE_IFLAGS]));

if (data[IFLA_GRE_OFLAGS])
- parms->o_flags = gre_flags_to_tnl_flags(nla_get_be16(data[IFLA_GRE_OFLAGS]));
+ gre_flags_to_tnl_flags(parms->o_flags,
+ nla_get_be16(data[IFLA_GRE_OFLAGS]));

if (data[IFLA_GRE_IKEY])
parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
@@ -1406,8 +1433,8 @@ static int ipgre_changelink(struct net_device *dev, struct nlattr *tb[],
if (err < 0)
return err;

- t->parms.i_flags = p.i_flags;
- t->parms.o_flags = p.o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p.i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p.o_flags);

ipgre_link_update(dev, !tb[IFLA_MTU]);

@@ -1435,8 +1462,8 @@ static int erspan_changelink(struct net_device *dev, struct nlattr *tb[],
if (err < 0)
return err;

- t->parms.i_flags = p.i_flags;
- t->parms.o_flags = p.o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p.i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p.o_flags);

return 0;
}
@@ -1493,7 +1520,9 @@ static int ipgre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip_tunnel *t = netdev_priv(dev);
struct ip_tunnel_parm_kern *p = &t->parms;
- __be16 o_flags = p->o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
+
+ ip_tunnel_flags_copy(o_flags, p->o_flags);

if (nla_put_u32(skb, IFLA_GRE_LINK, p->link) ||
nla_put_be16(skb, IFLA_GRE_IFLAGS,
@@ -1541,7 +1570,7 @@ static int erspan_fill_info(struct sk_buff *skb, const struct net_device *dev)

if (t->erspan_ver <= 2) {
if (t->erspan_ver != 0 && !t->collect_md)
- t->parms.o_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, t->parms.o_flags);

if (nla_put_u8(skb, IFLA_GRE_ERSPAN_VER, t->erspan_ver))
goto nla_put_failure;
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 658f46208c8d..15e5a5464d99 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -57,16 +57,12 @@ static unsigned int ip_tunnel_hash(__be32 key, __be32 remote)
}

static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
- __be16 flags, __be32 key)
+ const unsigned long *flags, __be32 key)
{
- if (p->i_flags & TUNNEL_KEY) {
- if (flags & TUNNEL_KEY)
- return key == p->i_key;
- else
- /* key expected, none present */
- return false;
- } else
- return !(flags & TUNNEL_KEY);
+ if (!test_bit(IP_TUNNEL_KEY_BIT, flags))
+ return !test_bit(IP_TUNNEL_KEY_BIT, p->i_flags);
+
+ return test_bit(IP_TUNNEL_KEY_BIT, p->i_flags) && p->i_key == key;
}

/* Fallback tunnel: no source, no destination, no key, no options
@@ -81,7 +77,7 @@ static bool ip_tunnel_key_match(const struct ip_tunnel_parm_kern *p,
Given src, dst and key, find appropriate for input tunnel.
*/
struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
- int link, __be16 flags,
+ int link, const unsigned long *flags,
__be32 remote, __be32 local,
__be32 key)
{
@@ -144,7 +140,8 @@ struct ip_tunnel *ip_tunnel_lookup(struct ip_tunnel_net *itn,
}

hlist_for_each_entry_rcu(t, head, hash_node) {
- if ((!(flags & TUNNEL_NO_KEY) && t->parms.i_key != key) ||
+ if ((!test_bit(IP_TUNNEL_NO_KEY_BIT, flags) &&
+ t->parms.i_key != key) ||
t->parms.iph.saddr != 0 ||
t->parms.iph.daddr != 0 ||
!(t->dev->flags & IFF_UP))
@@ -183,7 +180,8 @@ static struct hlist_head *ip_bucket(struct ip_tunnel_net *itn,
else
remote = 0;

- if (!(parms->i_flags & TUNNEL_KEY) && (parms->i_flags & VTI_ISVTI))
+ if (!test_bit(IP_TUNNEL_KEY_BIT, parms->i_flags) &&
+ test_bit(IP_TUNNEL_VTI_BIT, parms->i_flags))
i_key = 0;

h = ip_tunnel_hash(i_key, remote);
@@ -212,12 +210,14 @@ static struct ip_tunnel *ip_tunnel_find(struct ip_tunnel_net *itn,
{
__be32 remote = parms->iph.daddr;
__be32 local = parms->iph.saddr;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be32 key = parms->i_key;
- __be16 flags = parms->i_flags;
int link = parms->link;
struct ip_tunnel *t = NULL;
struct hlist_head *head = ip_bucket(itn, parms);

+ ip_tunnel_flags_copy(flags, parms->i_flags);
+
hlist_for_each_entry_rcu(t, head, hash_node) {
if (local == t->parms.iph.saddr &&
remote == t->parms.iph.daddr &&
@@ -387,15 +387,15 @@ int ip_tunnel_rcv(struct ip_tunnel *tunnel, struct sk_buff *skb,
}
#endif

- if ((!(tpi->flags&TUNNEL_CSUM) && (tunnel->parms.i_flags&TUNNEL_CSUM)) ||
- ((tpi->flags&TUNNEL_CSUM) && !(tunnel->parms.i_flags&TUNNEL_CSUM))) {
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.i_flags) !=
+ test_bit(IP_TUNNEL_CSUM_BIT, tpi->flags)) {
DEV_STATS_INC(tunnel->dev, rx_crc_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
goto drop;
}

- if (tunnel->parms.i_flags&TUNNEL_SEQ) {
- if (!(tpi->flags&TUNNEL_SEQ) ||
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.i_flags)) {
+ if (!test_bit(IP_TUNNEL_SEQ_BIT, tpi->flags) ||
(tunnel->i_seqno && (s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) {
DEV_STATS_INC(tunnel->dev, rx_fifo_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
@@ -612,7 +612,7 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
goto tx_error;
}

- if (key->tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags))
df = htons(IP_DF);
if (tnl_update_pmtu(dev, skb, rt, df, inner_iph, tunnel_hlen,
key->u.ipv4.dst, true)) {
@@ -902,10 +902,10 @@ int ip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p,
goto done;
if (p->iph.ttl)
p->iph.frag_off |= htons(IP_DF);
- if (!(p->i_flags & VTI_ISVTI)) {
- if (!(p->i_flags & TUNNEL_KEY))
+ if (!test_bit(IP_TUNNEL_VTI_BIT, p->i_flags)) {
+ if (!test_bit(IP_TUNNEL_KEY_BIT, p->i_flags))
p->i_key = 0;
- if (!(p->o_flags & TUNNEL_KEY))
+ if (!test_bit(IP_TUNNEL_KEY_BIT, p->o_flags))
p->o_key = 0;
}

@@ -990,8 +990,8 @@ bool ip_tunnel_parm_from_user(struct ip_tunnel_parm_kern *kp,

strscpy(kp->name, p.name, sizeof(kp->name));
kp->link = p.link;
- kp->i_flags = p.i_flags;
- kp->o_flags = p.o_flags;
+ ip_tunnel_flags_from_be16(kp->i_flags, p.i_flags);
+ ip_tunnel_flags_from_be16(kp->o_flags, p.o_flags);
kp->i_key = p.i_key;
kp->o_key = p.o_key;
memcpy(&kp->iph, &p.iph, min(sizeof(kp->iph), sizeof(p.iph)));
@@ -1004,10 +1004,14 @@ bool ip_tunnel_parm_to_user(void __user *data, struct ip_tunnel_parm_kern *kp)
{
struct ip_tunnel_parm p;

+ if (!ip_tunnel_flags_is_be16_compat(kp->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(kp->o_flags))
+ return false;
+
strscpy(p.name, kp->name, sizeof(p.name));
p.link = kp->link;
- p.i_flags = kp->i_flags;
- p.o_flags = kp->o_flags;
+ p.i_flags = ip_tunnel_flags_to_be16(kp->i_flags);
+ p.o_flags = ip_tunnel_flags_to_be16(kp->o_flags);
p.i_key = kp->i_key;
p.o_key = kp->o_key;
memcpy(&p.iph, &kp->iph, min(sizeof(p.iph), sizeof(kp->iph)));
diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c
index f89e3bdef123..36aec14e2c9f 100644
--- a/net/ipv4/ip_tunnel_core.c
+++ b/net/ipv4/ip_tunnel_core.c
@@ -125,6 +125,7 @@ EXPORT_SYMBOL_GPL(__iptunnel_pull_header);
struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
gfp_t flags)
{
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags) = { };
struct metadata_dst *res;
struct ip_tunnel_info *dst, *src;

@@ -144,10 +145,10 @@ struct metadata_dst *iptunnel_metadata_reply(struct metadata_dst *md,
sizeof(struct in6_addr));
else
dst->key.u.ipv4.dst = src->key.u.ipv4.src;
- dst->key.tun_flags = src->key.tun_flags;
+ ip_tunnel_flags_copy(dst->key.tun_flags, src->key.tun_flags);
dst->mode = src->mode | IP_TUNNEL_INFO_TX;
ip_tunnel_info_opts_set(dst, ip_tunnel_info_opts(src),
- src->options_len, 0);
+ src->options_len, tun_flags);

return res;
}
@@ -497,7 +498,7 @@ static int ip_tun_parse_opts_geneve(struct nlattr *attr,
opt->opt_class = nla_get_be16(attr);
attr = tb[LWTUNNEL_IP_OPT_GENEVE_TYPE];
opt->type = nla_get_u8(attr);
- info->key.tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags);
}

return sizeof(struct geneve_opt) + data_len;
@@ -525,7 +526,7 @@ static int ip_tun_parse_opts_vxlan(struct nlattr *attr,
attr = tb[LWTUNNEL_IP_OPT_VXLAN_GBP];
md->gbp = nla_get_u32(attr);
md->gbp &= VXLAN_GBP_MASK;
- info->key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags);
}

return sizeof(struct vxlan_metadata);
@@ -574,7 +575,7 @@ static int ip_tun_parse_opts_erspan(struct nlattr *attr,
set_hwid(&md->u.md2, nla_get_u8(attr));
}

- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags);
}

return sizeof(struct erspan_metadata);
@@ -585,7 +586,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
{
int err, rem, opt_len, opts_len = 0;
struct nlattr *nla;
- __be16 type = 0;
+ u32 type = 0;

if (!attr)
return 0;
@@ -598,7 +599,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
nla_for_each_attr(nla, nla_data(attr), nla_len(attr), rem) {
switch (nla_type(nla)) {
case LWTUNNEL_IP_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT)
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT)
return -EINVAL;
opt_len = ip_tun_parse_opts_geneve(nla, info, opts_len,
extack);
@@ -607,7 +608,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
opts_len += opt_len;
if (opts_len > IP_TUNNEL_OPTS_MAX)
return -EINVAL;
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
case LWTUNNEL_IP_OPTS_VXLAN:
if (type)
@@ -617,7 +618,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case LWTUNNEL_IP_OPTS_ERSPAN:
if (type)
@@ -627,7 +628,7 @@ static int ip_tun_parse_opts(struct nlattr *attr, struct ip_tunnel_info *info,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
default:
return -EINVAL;
@@ -705,10 +706,16 @@ static int ip_tun_build_state(struct net *net, struct nlattr *attr,
if (tb[LWTUNNEL_IP_TOS])
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP_TOS]);

- if (tb[LWTUNNEL_IP_FLAGS])
- tun_info->key.tun_flags |=
- (nla_get_be16(tb[LWTUNNEL_IP_FLAGS]) &
- ~TUNNEL_OPTIONS_PRESENT);
+ if (tb[LWTUNNEL_IP_FLAGS]) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+
+ ip_tunnel_flags_from_be16(flags,
+ nla_get_be16(tb[LWTUNNEL_IP_FLAGS]));
+ ip_tunnel_clear_options_present(flags);
+
+ ip_tunnel_flags_or(tun_info->key.tun_flags,
+ tun_info->key.tun_flags, flags);
+ }

tun_info->mode = IP_TUNNEL_INFO_TX;
tun_info->options_len = opt_len;
@@ -812,18 +819,18 @@ static int ip_tun_fill_encap_opts(struct sk_buff *skb, int type,
struct nlattr *nest;
int err = 0;

- if (!(tun_info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))
+ if (!ip_tunnel_is_options_present(tun_info->key.tun_flags))
return 0;

nest = nla_nest_start_noflag(skb, type);
if (!nest)
return -ENOMEM;

- if (tun_info->key.tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_geneve(skb, tun_info);
- else if (tun_info->key.tun_flags & TUNNEL_VXLAN_OPT)
+ else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_vxlan(skb, tun_info);
- else if (tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT)
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_info->key.tun_flags))
err = ip_tun_fill_encap_opts_erspan(skb, tun_info);

if (err) {
@@ -846,7 +853,8 @@ static int ip_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in_addr(skb, LWTUNNEL_IP_SRC, tun_info->key.u.ipv4.src) ||
nla_put_u8(skb, LWTUNNEL_IP_TOS, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP_TTL, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP_FLAGS, tun_info->key.tun_flags) ||
+ nla_put_be16(skb, LWTUNNEL_IP_FLAGS,
+ ip_tunnel_flags_to_be16(tun_info->key.tun_flags)) ||
ip_tun_fill_encap_opts(skb, LWTUNNEL_IP_OPTS, tun_info))
return -ENOMEM;

@@ -857,11 +865,11 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
{
int opt_len;

- if (!(info->key.tun_flags & TUNNEL_OPTIONS_PRESENT))
+ if (!ip_tunnel_is_options_present(info->key.tun_flags))
return 0;

opt_len = nla_total_size(0); /* LWTUNNEL_IP_OPTS */
- if (info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags)) {
struct geneve_opt *opt;
int offset = 0;

@@ -874,10 +882,10 @@ static int ip_tun_opts_nlsize(struct ip_tunnel_info *info)
/* OPT_GENEVE_DATA */
offset += sizeof(*opt) + opt->length * 4;
}
- } else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_VXLAN */
+ nla_total_size(4); /* OPT_VXLAN_GBP */
- } else if (info->key.tun_flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags)) {
struct erspan_metadata *md = ip_tunnel_info_opts(info);

opt_len += nla_total_size(0) /* LWTUNNEL_IP_OPTS_ERSPAN */
@@ -984,10 +992,17 @@ static int ip6_tun_build_state(struct net *net, struct nlattr *attr,
if (tb[LWTUNNEL_IP6_TC])
tun_info->key.tos = nla_get_u8(tb[LWTUNNEL_IP6_TC]);

- if (tb[LWTUNNEL_IP6_FLAGS])
- tun_info->key.tun_flags |=
- (nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]) &
- ~TUNNEL_OPTIONS_PRESENT);
+ if (tb[LWTUNNEL_IP6_FLAGS]) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
+ __be16 data;
+
+ data = nla_get_be16(tb[LWTUNNEL_IP6_FLAGS]);
+ ip_tunnel_flags_from_be16(flags, data);
+ ip_tunnel_clear_options_present(flags);
+
+ ip_tunnel_flags_or(tun_info->key.tun_flags,
+ tun_info->key.tun_flags, flags);
+ }

tun_info->mode = IP_TUNNEL_INFO_TX | IP_TUNNEL_INFO_IPV6;
tun_info->options_len = opt_len;
@@ -1008,7 +1023,8 @@ static int ip6_tun_fill_encap_info(struct sk_buff *skb,
nla_put_in6_addr(skb, LWTUNNEL_IP6_SRC, &tun_info->key.u.ipv6.src) ||
nla_put_u8(skb, LWTUNNEL_IP6_TC, tun_info->key.tos) ||
nla_put_u8(skb, LWTUNNEL_IP6_HOPLIMIT, tun_info->key.ttl) ||
- nla_put_be16(skb, LWTUNNEL_IP6_FLAGS, tun_info->key.tun_flags) ||
+ nla_put_be16(skb, LWTUNNEL_IP6_FLAGS,
+ ip_tunnel_flags_to_be16(tun_info->key.tun_flags)) ||
ip_tun_fill_encap_opts(skb, LWTUNNEL_IP6_OPTS, tun_info))
return -ENOMEM;

@@ -1139,8 +1155,12 @@ void ip_tunnel_netlink_parms(struct nlattr *data[],
if (!data[IFLA_IPTUN_PMTUDISC] || nla_get_u8(data[IFLA_IPTUN_PMTUDISC]))
parms->iph.frag_off = htons(IP_DF);

- if (data[IFLA_IPTUN_FLAGS])
- parms->i_flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]);
+ if (data[IFLA_IPTUN_FLAGS]) {
+ __be16 flags;
+
+ flags = nla_get_be16(data[IFLA_IPTUN_FLAGS]);
+ ip_tunnel_flags_from_be16(parms->i_flags, flags);
+ }

if (data[IFLA_IPTUN_PROTO])
parms->iph.protocol = nla_get_u8(data[IFLA_IPTUN_PROTO]);
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index e1718c088433..833953a44748 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -51,8 +51,11 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
const struct iphdr *iph = ip_hdr(skb);
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };

- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->saddr, iph->daddr, 0);
if (tunnel) {
if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
@@ -322,8 +325,11 @@ static int vti4_err(struct sk_buff *skb, u32 info)
const struct iphdr *iph = (const struct iphdr *)skb->data;
int protocol = iph->protocol;
struct ip_tunnel_net *itn = net_generic(net, vti_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);

- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags,
iph->daddr, iph->saddr, 0);
if (!tunnel)
return -1;
@@ -375,6 +381,7 @@ static int vti4_err(struct sk_buff *skb, u32 info)
static int
vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
{
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
int err = 0;

if (cmd == SIOCADDTUNNEL || cmd == SIOCCHGTUNNEL) {
@@ -383,20 +390,26 @@ vti_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
return -EINVAL;
}

- if (!(p->i_flags & GRE_KEY))
+ if (!ip_tunnel_flags_is_be16_compat(p->i_flags) ||
+ !ip_tunnel_flags_is_be16_compat(p->o_flags))
+ return -EOVERFLOW;
+
+ if (!(ip_tunnel_flags_to_be16(p->i_flags) & GRE_KEY))
p->i_key = 0;
- if (!(p->o_flags & GRE_KEY))
+ if (!(ip_tunnel_flags_to_be16(p->o_flags) & GRE_KEY))
p->o_key = 0;

- p->i_flags = VTI_ISVTI;
+ __set_bit(IP_TUNNEL_VTI_BIT, flags);
+ ip_tunnel_flags_copy(p->i_flags, flags);

err = ip_tunnel_ctl(dev, p, cmd);
if (err)
return err;

if (cmd != SIOCDELTUNNEL) {
- p->i_flags |= GRE_KEY;
- p->o_flags |= GRE_KEY;
+ ip_tunnel_flags_from_be16(flags, GRE_KEY);
+ ip_tunnel_flags_or(p->i_flags, p->i_flags, flags);
+ ip_tunnel_flags_or(p->o_flags, p->o_flags, flags);
}
return 0;
}
@@ -539,7 +552,7 @@ static void vti_netlink_parms(struct nlattr *data[],
if (!data)
return;

- parms->i_flags = VTI_ISVTI;
+ __set_bit(IP_TUNNEL_VTI_BIT, parms->i_flags);

if (data[IFLA_VTI_LINK])
parms->link = nla_get_u32(data[IFLA_VTI_LINK]);
diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c
index 0dd2d3b55c75..11cb4d8a93d7 100644
--- a/net/ipv4/ipip.c
+++ b/net/ipv4/ipip.c
@@ -130,13 +130,16 @@ static int ipip_err(struct sk_buff *skb, u32 info)
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, ipip_net_id);
const struct iphdr *iph = (const struct iphdr *)skb->data;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
const int type = icmp_hdr(skb)->type;
const int code = icmp_hdr(skb)->code;
struct ip_tunnel *t;
int err = 0;

- t = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
- iph->daddr, iph->saddr, 0);
+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
+ t = ip_tunnel_lookup(itn, skb->dev->ifindex, flags, iph->daddr,
+ iph->saddr, 0);
if (!t) {
err = -ENOENT;
goto out;
@@ -213,13 +216,16 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
{
struct net *net = dev_net(skb->dev);
struct ip_tunnel_net *itn = net_generic(net, ipip_net_id);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *tun_dst = NULL;
struct ip_tunnel *tunnel;
const struct iphdr *iph;

+ __set_bit(IP_TUNNEL_NO_KEY_BIT, flags);
+
iph = ip_hdr(skb);
- tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
- iph->saddr, iph->daddr, 0);
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, flags, iph->saddr,
+ iph->daddr, 0);
if (tunnel) {
const struct tnl_ptk_info *tpi;

@@ -238,7 +244,9 @@ static int ipip_tunnel_rcv(struct sk_buff *skb, u8 ipproto)
if (iptunnel_pull_header(skb, 0, tpi->proto, false))
goto drop;
if (tunnel->collect_md) {
- tun_dst = ip_tun_rx_dst(skb, 0, 0, 0);
+ ip_tunnel_flags_zero(flags);
+
+ tun_dst = ip_tun_rx_dst(skb, flags, 0, 0);
if (!tun_dst)
return 0;
ip_tunnel_md_udp_encap(skb, &tun_dst->u.tun_info);
@@ -340,7 +348,8 @@ ipip_tunnel_ctl(struct net_device *dev, struct ip_tunnel_parm_kern *p, int cmd)
}

p->i_key = p->o_key = 0;
- p->i_flags = p->o_flags = 0;
+ ip_tunnel_flags_zero(p->i_flags);
+ ip_tunnel_flags_zero(p->o_flags);
return ip_tunnel_ctl(dev, p, cmd);
}

diff --git a/net/ipv4/udp_tunnel_core.c b/net/ipv4/udp_tunnel_core.c
index 9b18f371af0d..3d56304e119f 100644
--- a/net/ipv4/udp_tunnel_core.c
+++ b/net/ipv4/udp_tunnel_core.c
@@ -183,7 +183,8 @@ void udp_tunnel_sock_release(struct socket *sock)
EXPORT_SYMBOL_GPL(udp_tunnel_sock_release);

struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
- __be16 flags, __be64 tunnel_id, int md_size)
+ const unsigned long *flags,
+ __be64 tunnel_id, int md_size)
{
struct metadata_dst *tun_dst;
struct ip_tunnel_info *info;
@@ -199,7 +200,7 @@ struct metadata_dst *udp_tun_rx_dst(struct sk_buff *skb, unsigned short family,
info->key.tp_src = udp_hdr(skb)->source;
info->key.tp_dst = udp_hdr(skb)->dest;
if (udp_hdr(skb)->check)
- info->key.tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags);
return tun_dst;
}
EXPORT_SYMBOL_GPL(udp_tun_rx_dst);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 070d87abf7c0..e1c126f4a61c 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -496,11 +496,11 @@ static int ip6gre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi)
tpi->proto);
if (tunnel) {
if (tunnel->parms.collect_md) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct metadata_dst *tun_dst;
__be64 tun_id;
- __be16 flags;

- flags = tpi->flags;
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);

tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id, 0);
@@ -548,14 +548,14 @@ static int ip6erspan_rcv(struct sk_buff *skb,

if (tunnel->parms.collect_md) {
struct erspan_metadata *pkt_md, *md;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
struct metadata_dst *tun_dst;
struct ip_tunnel_info *info;
unsigned char *gh;
__be64 tun_id;
- __be16 flags;

- tpi->flags |= TUNNEL_KEY;
- flags = tpi->flags;
+ __set_bit(IP_TUNNEL_KEY_BIT, tpi->flags);
+ ip_tunnel_flags_copy(flags, tpi->flags);
tun_id = key32_to_tunnel_id(tpi->key);

tun_dst = ipv6_tun_rx_dst(skb, flags, tun_id,
@@ -577,7 +577,8 @@ static int ip6erspan_rcv(struct sk_buff *skb,
md2 = &md->u.md2;
memcpy(md2, pkt_md, ver == 1 ? ERSPAN_V1_MDSIZE :
ERSPAN_V2_MDSIZE);
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ info->key.tun_flags);
info->options_len = sizeof(*md);

ip6_tnl_rcv(tunnel, skb, tpi, tun_dst, log_ecn_error);
@@ -745,8 +746,8 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
__u32 *pmtu, __be16 proto)
{
struct ip6_tnl *tunnel = netdev_priv(dev);
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 protocol;
- __be16 flags;

if (dev->type == ARPHRD_ETHER)
IPCB(skb)->flags = 0;
@@ -778,8 +779,11 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
fl6->fl6_gre_key = tunnel_id_to_key32(key->tun_id);

dsfield = key->tos;
- flags = key->tun_flags &
- (TUNNEL_CSUM | TUNNEL_KEY | TUNNEL_SEQ);
+ ip_tunnel_flags_zero(flags);
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ ip_tunnel_flags_and(flags, flags, key->tun_flags);
tun_hlen = gre_calc_hlen(flags);

if (skb_cow_head(skb, dev->needed_headroom ?: tun_hlen + tunnel->encap_hlen))
@@ -788,19 +792,21 @@ static netdev_tx_t __gre6_xmit(struct sk_buff *skb,
gre_build_header(skb, tun_hlen,
flags, protocol,
tunnel_id_to_key32(tun_info->key.tun_id),
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno))
- : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) :
+ 0);

} else {
if (skb_cow_head(skb, dev->needed_headroom ?: tunnel->hlen))
return -ENOMEM;

- flags = tunnel->parms.o_flags;
+ ip_tunnel_flags_copy(flags, tunnel->parms.o_flags);

gre_build_header(skb, tunnel->tun_hlen, flags,
protocol, tunnel->parms.o_key,
- (flags & TUNNEL_SEQ) ? htonl(atomic_fetch_inc(&tunnel->o_seqno))
- : 0);
+ test_bit(IP_TUNNEL_SEQ_BIT, flags) ?
+ htonl(atomic_fetch_inc(&tunnel->o_seqno)) :
+ 0);
}

return ip6_tnl_xmit(skb, dev, dsfield, fl6, encap_limit, pmtu,
@@ -822,7 +828,8 @@ static inline int ip6gre_xmit_ipv4(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_ipv4(skb, dev, &fl6,
&dsfield, &encap_limit);

- err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
+ err = gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags));
if (err)
return -1;

@@ -856,7 +863,8 @@ static inline int ip6gre_xmit_ipv6(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_ipv6(skb, dev, &fl6, &dsfield, &encap_limit))
return -1;

- if (gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM)))
+ if (gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags)))
return -1;

err = __gre6_xmit(skb, dev, dsfield, &fl6, encap_limit,
@@ -883,7 +891,8 @@ static int ip6gre_xmit_other(struct sk_buff *skb, struct net_device *dev)
prepare_ip6gre_xmit_other(skb, dev, &fl6, &dsfield, &encap_limit))
return -1;

- err = gre_handle_offloads(skb, !!(t->parms.o_flags & TUNNEL_CSUM));
+ err = gre_handle_offloads(skb, test_bit(IP_TUNNEL_CSUM_BIT,
+ t->parms.o_flags));
if (err)
return err;
err = __gre6_xmit(skb, dev, dsfield, &fl6, encap_limit, &mtu, skb->protocol);
@@ -936,6 +945,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
struct ip_tunnel_info *tun_info = NULL;
struct ip6_tnl *t = netdev_priv(dev);
struct dst_entry *dst = skb_dst(skb);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
bool truncate = false;
int encap_limit = -1;
__u8 dsfield = false;
@@ -979,7 +989,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
if (skb_cow_head(skb, dev->needed_headroom ?: t->hlen))
goto tx_err;

- t->parms.o_flags &= ~TUNNEL_KEY;
+ __clear_bit(IP_TUNNEL_KEY_BIT, t->parms.o_flags);
IPCB(skb)->flags = 0;

/* For collect_md mode, derive fl6 from the tunnel key,
@@ -1004,7 +1014,8 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
fl6.fl6_gre_key = tunnel_id_to_key32(key->tun_id);

dsfield = key->tos;
- if (!(tun_info->key.tun_flags & TUNNEL_ERSPAN_OPT))
+ if (!test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_info->key.tun_flags))
goto tx_err;
if (tun_info->options_len < sizeof(*md))
goto tx_err;
@@ -1065,7 +1076,9 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
}

/* Push GRE header. */
- gre_build_header(skb, 8, TUNNEL_SEQ, proto, 0, htonl(atomic_fetch_inc(&t->o_seqno)));
+ __set_bit(IP_TUNNEL_SEQ_BIT, flags);
+ gre_build_header(skb, 8, flags, proto, 0,
+ htonl(atomic_fetch_inc(&t->o_seqno)));

/* TooBig packet may have updated dst->dev's mtu */
if (!t->parms.collect_md && dst && dst_mtu(dst) > dst->dev->mtu)
@@ -1208,8 +1221,8 @@ static void ip6gre_tnl_copy_tnl_parm(struct ip6_tnl *t,
t->parms.proto = p->proto;
t->parms.i_key = p->i_key;
t->parms.o_key = p->o_key;
- t->parms.i_flags = p->i_flags;
- t->parms.o_flags = p->o_flags;
+ ip_tunnel_flags_copy(t->parms.i_flags, p->i_flags);
+ ip_tunnel_flags_copy(t->parms.o_flags, p->o_flags);
t->parms.fwmark = p->fwmark;
t->parms.erspan_ver = p->erspan_ver;
t->parms.index = p->index;
@@ -1238,8 +1251,8 @@ static void ip6gre_tnl_parm_from_user(struct __ip6_tnl_parm *p,
p->link = u->link;
p->i_key = u->i_key;
p->o_key = u->o_key;
- p->i_flags = gre_flags_to_tnl_flags(u->i_flags);
- p->o_flags = gre_flags_to_tnl_flags(u->o_flags);
+ gre_flags_to_tnl_flags(p->i_flags, u->i_flags);
+ gre_flags_to_tnl_flags(p->o_flags, u->o_flags);
memcpy(p->name, u->name, sizeof(u->name));
}

@@ -1391,7 +1404,7 @@ static int ip6gre_header(struct sk_buff *skb, struct net_device *dev,
ipv6h->daddr = t->parms.raddr;

p = (__be16 *)(ipv6h + 1);
- p[0] = t->parms.o_flags;
+ p[0] = ip_tunnel_flags_to_be16(t->parms.o_flags);
p[1] = htons(type);

/*
@@ -1455,19 +1468,17 @@ static void ip6gre_tunnel_setup(struct net_device *dev)
static void ip6gre_tnl_init_features(struct net_device *dev)
{
struct ip6_tnl *nt = netdev_priv(dev);
- __be16 flags;

dev->features |= GRE6_FEATURES | NETIF_F_LLTX;
dev->hw_features |= GRE6_FEATURES;

- flags = nt->parms.o_flags;
-
/* TCP offload with GRE SEQ is not supported, nor can we support 2
* levels of outer headers requiring an update.
*/
- if (flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, nt->parms.o_flags))
return;
- if (flags & TUNNEL_CSUM && nt->encap.type != TUNNEL_ENCAP_NONE)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, nt->parms.o_flags) &&
+ nt->encap.type != TUNNEL_ENCAP_NONE)
return;

dev->features |= NETIF_F_GSO_SOFTWARE;
@@ -1793,12 +1804,12 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
parms->link = nla_get_u32(data[IFLA_GRE_LINK]);

if (data[IFLA_GRE_IFLAGS])
- parms->i_flags = gre_flags_to_tnl_flags(
- nla_get_be16(data[IFLA_GRE_IFLAGS]));
+ gre_flags_to_tnl_flags(parms->i_flags,
+ nla_get_be16(data[IFLA_GRE_IFLAGS]));

if (data[IFLA_GRE_OFLAGS])
- parms->o_flags = gre_flags_to_tnl_flags(
- nla_get_be16(data[IFLA_GRE_OFLAGS]));
+ gre_flags_to_tnl_flags(parms->o_flags,
+ nla_get_be16(data[IFLA_GRE_OFLAGS]));

if (data[IFLA_GRE_IKEY])
parms->i_key = nla_get_be32(data[IFLA_GRE_IKEY]);
@@ -2144,11 +2155,13 @@ static int ip6gre_fill_info(struct sk_buff *skb, const struct net_device *dev)
{
struct ip6_tnl *t = netdev_priv(dev);
struct __ip6_tnl_parm *p = &t->parms;
- __be16 o_flags = p->o_flags;
+ IP_TUNNEL_DECLARE_FLAGS(o_flags);
+
+ ip_tunnel_flags_copy(o_flags, p->o_flags);

if (p->erspan_ver == 1 || p->erspan_ver == 2) {
if (!p->collect_md)
- o_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, o_flags);

if (nla_put_u8(skb, IFLA_GRE_ERSPAN_VER, p->erspan_ver))
goto nla_put_failure;
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 5e80e517f071..df9dc01fddd5 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -799,17 +799,15 @@ static int __ip6_tnl_rcv(struct ip6_tnl *tunnel, struct sk_buff *skb,
const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
int err;

- if ((!(tpi->flags & TUNNEL_CSUM) &&
- (tunnel->parms.i_flags & TUNNEL_CSUM)) ||
- ((tpi->flags & TUNNEL_CSUM) &&
- !(tunnel->parms.i_flags & TUNNEL_CSUM))) {
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tunnel->parms.i_flags) !=
+ test_bit(IP_TUNNEL_CSUM_BIT, tpi->flags)) {
DEV_STATS_INC(tunnel->dev, rx_crc_errors);
DEV_STATS_INC(tunnel->dev, rx_errors);
goto drop;
}

- if (tunnel->parms.i_flags & TUNNEL_SEQ) {
- if (!(tpi->flags & TUNNEL_SEQ) ||
+ if (test_bit(IP_TUNNEL_SEQ_BIT, tunnel->parms.i_flags)) {
+ if (!test_bit(IP_TUNNEL_SEQ_BIT, tpi->flags) ||
(tunnel->i_seqno &&
(s32)(ntohl(tpi->seq) - tunnel->i_seqno) < 0)) {
DEV_STATS_INC(tunnel->dev, rx_fifo_errors);
@@ -932,7 +930,9 @@ static int ipxip6_rcv(struct sk_buff *skb, u8 ipproto,
if (iptunnel_pull_header(skb, 0, tpi->proto, false))
goto drop;
if (t->parms.collect_md) {
- tun_dst = ipv6_tun_rx_dst(skb, 0, 0, 0);
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
+
+ tun_dst = ipv6_tun_rx_dst(skb, flags, 0, 0);
if (!tun_dst)
goto drop;
}
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index 19af34e4c13c..b8babe787a91 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -207,7 +207,7 @@ static int ipip6_tunnel_create(struct net_device *dev)
__dev_addr_set(dev, &t->parms.iph.saddr, 4);
memcpy(dev->broadcast, &t->parms.iph.daddr, 4);

- if ((__force u16)t->parms.i_flags & SIT_ISATAP)
+ if (test_bit(IP_TUNNEL_SIT_ISATAP_BIT, t->parms.i_flags))
dev->priv_flags |= IFF_ISATAP;

dev->rtnl_link_ops = &sit_link_ops;
@@ -1704,7 +1704,8 @@ static int ipip6_fill_info(struct sk_buff *skb, const struct net_device *dev)
nla_put_u8(skb, IFLA_IPTUN_PMTUDISC,
!!(parm->iph.frag_off & htons(IP_DF))) ||
nla_put_u8(skb, IFLA_IPTUN_PROTO, parm->iph.protocol) ||
- nla_put_be16(skb, IFLA_IPTUN_FLAGS, parm->i_flags) ||
+ nla_put_be16(skb, IFLA_IPTUN_FLAGS,
+ ip_tunnel_flags_to_be16(parm->i_flags)) ||
nla_put_u32(skb, IFLA_IPTUN_FWMARK, tunnel->fwmark))
goto nla_put_failure;

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 3230506ae3ff..6eff7a72a95b 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1550,6 +1550,7 @@ static int ipvs_gre_decap(struct netns_ipvs *ipvs, struct sk_buff *skb,
if (!dest)
goto unk;
if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(flags);
__be16 type;

/* Only support version 0 and C (csum) */
@@ -1560,7 +1561,10 @@ static int ipvs_gre_decap(struct netns_ipvs *ipvs, struct sk_buff *skb,
if (type != htons(ETH_P_IP))
goto unk;
*proto = IPPROTO_IPIP;
- return gre_calc_hlen(gre_flags_to_tnl_flags(greh->flags));
+
+ gre_flags_to_tnl_flags(flags, greh->flags);
+
+ return gre_calc_hlen(flags);
}

unk:
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 9193e109e6b3..1180fa945cbd 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -390,10 +390,10 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
skb->ip_summed == CHECKSUM_PARTIAL)
mtu -= GUE_PLEN_REMCSUM + GUE_LEN_PRIV;
} else if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };

if (dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
mtu -= gre_calc_hlen(tflags);
}
if (mtu < 68) {
@@ -553,10 +553,10 @@ __ip_vs_get_out_rt_v6(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
skb->ip_summed == CHECKSUM_PARTIAL)
mtu -= GUE_PLEN_REMCSUM + GUE_LEN_PRIV;
} else if (dest->tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };

if (dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
mtu -= gre_calc_hlen(tflags);
}
if (mtu < IPV6_MIN_MTU) {
@@ -1082,11 +1082,11 @@ ipvs_gre_encap(struct net *net, struct sk_buff *skb,
{
__be16 proto = *next_protocol == IPPROTO_IPIP ?
htons(ETH_P_IP) : htons(ETH_P_IPV6);
- __be16 tflags = 0;
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t hdrlen;

if (cp->dest->tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);

hdrlen = gre_calc_hlen(tflags);
gre_build_header(skb, hdrlen, tflags, proto, 0, 0);
@@ -1165,11 +1165,11 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,

max_headroom += sizeof(struct udphdr) + gue_hdrlen;
} else if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t gre_hdrlen;
- __be16 tflags = 0;

if (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
gre_hdrlen = gre_calc_hlen(tflags);

max_headroom += gre_hdrlen;
@@ -1310,11 +1310,11 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,

max_headroom += sizeof(struct udphdr) + gue_hdrlen;
} else if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GRE) {
+ IP_TUNNEL_DECLARE_FLAGS(tflags) = { };
size_t gre_hdrlen;
- __be16 tflags = 0;

if (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM)
- tflags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tflags);
gre_hdrlen = gre_calc_hlen(tflags);

max_headroom += gre_hdrlen;
diff --git a/net/netfilter/nft_tunnel.c b/net/netfilter/nft_tunnel.c
index 9f21953c7433..9653561aa5c0 100644
--- a/net/netfilter/nft_tunnel.c
+++ b/net/netfilter/nft_tunnel.c
@@ -174,8 +174,8 @@ struct nft_tunnel_opts {
struct erspan_metadata erspan;
u8 data[IP_TUNNEL_OPTS_MAX];
} u;
+ IP_TUNNEL_DECLARE_FLAGS(flags);
u32 len;
- __be16 flags;
};

struct nft_tunnel_obj {
@@ -271,7 +271,8 @@ static int nft_tunnel_obj_vxlan_init(const struct nlattr *attr,
opts->u.vxlan.gbp = ntohl(nla_get_be32(tb[NFTA_TUNNEL_KEY_VXLAN_GBP]));

opts->len = sizeof(struct vxlan_metadata);
- opts->flags = TUNNEL_VXLAN_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, opts->flags);

return 0;
}
@@ -325,7 +326,8 @@ static int nft_tunnel_obj_erspan_init(const struct nlattr *attr,
opts->u.erspan.version = version;

opts->len = sizeof(struct erspan_metadata);
- opts->flags = TUNNEL_ERSPAN_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, opts->flags);

return 0;
}
@@ -366,7 +368,8 @@ static int nft_tunnel_obj_geneve_init(const struct nlattr *attr,
opt->length = data_len / 4;
opt->opt_class = nla_get_be16(tb[NFTA_TUNNEL_KEY_GENEVE_CLASS]);
opt->type = nla_get_u8(tb[NFTA_TUNNEL_KEY_GENEVE_TYPE]);
- opts->flags = TUNNEL_GENEVE_OPT;
+ ip_tunnel_flags_zero(opts->flags);
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, opts->flags);

return 0;
}
@@ -385,8 +388,8 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
struct nft_tunnel_opts *opts)
{
struct nlattr *nla;
- __be16 type = 0;
int err, rem;
+ u32 type = 0;

err = nla_validate_nested_deprecated(attr, NFTA_TUNNEL_KEY_OPTS_MAX,
nft_tunnel_opts_policy, NULL);
@@ -401,7 +404,7 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
err = nft_tunnel_obj_vxlan_init(nla, opts);
if (err)
return err;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case NFTA_TUNNEL_KEY_OPTS_ERSPAN:
if (type)
@@ -409,15 +412,15 @@ static int nft_tunnel_obj_opts_init(const struct nft_ctx *ctx,
err = nft_tunnel_obj_erspan_init(nla, opts);
if (err)
return err;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
case NFTA_TUNNEL_KEY_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT)
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT)
return -EINVAL;
err = nft_tunnel_obj_geneve_init(nla, opts);
if (err)
return err;
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
default:
return -EOPNOTSUPP;
@@ -454,7 +457,9 @@ static int nft_tunnel_obj_init(const struct nft_ctx *ctx,
memset(&info, 0, sizeof(info));
info.mode = IP_TUNNEL_INFO_TX;
info.key.tun_id = key32_to_tunnel_id(nla_get_be32(tb[NFTA_TUNNEL_KEY_ID]));
- info.key.tun_flags = TUNNEL_KEY | TUNNEL_CSUM | TUNNEL_NOCACHE;
+ __set_bit(IP_TUNNEL_KEY_BIT, info.key.tun_flags);
+ __set_bit(IP_TUNNEL_CSUM_BIT, info.key.tun_flags);
+ __set_bit(IP_TUNNEL_NOCACHE_BIT, info.key.tun_flags);

if (tb[NFTA_TUNNEL_KEY_IP]) {
err = nft_tunnel_obj_ip_init(ctx, tb[NFTA_TUNNEL_KEY_IP], &info);
@@ -483,11 +488,12 @@ static int nft_tunnel_obj_init(const struct nft_ctx *ctx,
return -EOPNOTSUPP;

if (tun_flags & NFT_TUNNEL_F_ZERO_CSUM_TX)
- info.key.tun_flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, info.key.tun_flags);
if (tun_flags & NFT_TUNNEL_F_DONT_FRAGMENT)
- info.key.tun_flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT,
+ info.key.tun_flags);
if (tun_flags & NFT_TUNNEL_F_SEQ_NUMBER)
- info.key.tun_flags |= TUNNEL_SEQ;
+ __set_bit(IP_TUNNEL_SEQ_BIT, info.key.tun_flags);
}
if (tb[NFTA_TUNNEL_KEY_TOS])
info.key.tos = nla_get_u8(tb[NFTA_TUNNEL_KEY_TOS]);
@@ -583,7 +589,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
if (!nest)
return -1;

- if (opts->flags & TUNNEL_VXLAN_OPT) {
+ if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, opts->flags)) {
inner = nla_nest_start_noflag(skb, NFTA_TUNNEL_KEY_OPTS_VXLAN);
if (!inner)
goto failure;
@@ -591,7 +597,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
htonl(opts->u.vxlan.gbp)))
goto inner_failure;
nla_nest_end(skb, inner);
- } else if (opts->flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, opts->flags)) {
inner = nla_nest_start_noflag(skb, NFTA_TUNNEL_KEY_OPTS_ERSPAN);
if (!inner)
goto failure;
@@ -613,7 +619,7 @@ static int nft_tunnel_opts_dump(struct sk_buff *skb,
break;
}
nla_nest_end(skb, inner);
- } else if (opts->flags & TUNNEL_GENEVE_OPT) {
+ } else if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, opts->flags)) {
struct geneve_opt *opt;
int offset = 0;

@@ -658,11 +664,11 @@ static int nft_tunnel_flags_dump(struct sk_buff *skb,
{
u32 flags = 0;

- if (info->key.tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_DONT_FRAGMENT;
- if (!(info->key.tun_flags & TUNNEL_CSUM))
+ if (!test_bit(IP_TUNNEL_CSUM_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_ZERO_CSUM_TX;
- if (info->key.tun_flags & TUNNEL_SEQ)
+ if (test_bit(IP_TUNNEL_SEQ_BIT, info->key.tun_flags))
flags |= NFT_TUNNEL_F_SEQ_NUMBER;

if (nla_put_be32(skb, NFTA_TUNNEL_KEY_FLAGS, htonl(flags)) < 0)
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 88965e2068ac..6a30a594d38a 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -151,6 +151,13 @@ static void update_range(struct sw_flow_match *match,
sizeof((match)->key->field)); \
} while (0)

+#define SW_FLOW_KEY_BITMAP_COPY(match, field, value_p, nbits, is_mask) ({ \
+ update_range(match, offsetof(struct sw_flow_key, field), \
+ bitmap_size(nbits), is_mask); \
+ bitmap_copy(is_mask ? (match)->mask->key.field : (match)->key->field, \
+ value_p, nbits); \
+})
+
static bool match_validate(const struct sw_flow_match *match,
u64 key_attrs, u64 mask_attrs, bool log)
{
@@ -669,8 +676,8 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
bool log)
{
bool ttl = false, ipv4 = false, ipv6 = false;
+ IP_TUNNEL_DECLARE_FLAGS(tun_flags) = { };
bool info_bridge_mode = false;
- __be16 tun_flags = 0;
int opts_type = 0;
struct nlattr *a;
int rem;
@@ -696,7 +703,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
case OVS_TUNNEL_KEY_ATTR_ID:
SW_FLOW_KEY_PUT(match, tun_key.tun_id,
nla_get_be64(a), is_mask);
- tun_flags |= TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_IPV4_SRC:
SW_FLOW_KEY_PUT(match, tun_key.u.ipv4.src,
@@ -728,10 +735,10 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
ttl = true;
break;
case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT:
- tun_flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_CSUM:
- tun_flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_TP_SRC:
SW_FLOW_KEY_PUT(match, tun_key.tp_src,
@@ -742,7 +749,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
nla_get_be16(a), is_mask);
break;
case OVS_TUNNEL_KEY_ATTR_OAM:
- tun_flags |= TUNNEL_OAM;
+ __set_bit(IP_TUNNEL_OAM_BIT, tun_flags);
break;
case OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS:
if (opts_type) {
@@ -754,7 +761,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;

- tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS:
@@ -767,7 +774,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;

- tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_PAD:
@@ -783,7 +790,7 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
if (err)
return err;

- tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, tun_flags);
opts_type = type;
break;
case OVS_TUNNEL_KEY_ATTR_IPV4_INFO_BRIDGE:
@@ -797,7 +804,8 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
}
}

- SW_FLOW_KEY_PUT(match, tun_key.tun_flags, tun_flags, is_mask);
+ SW_FLOW_KEY_BITMAP_COPY(match, tun_key.tun_flags, tun_flags,
+ __IP_TUNNEL_FLAG_NUM, is_mask);
if (is_mask)
SW_FLOW_KEY_MEMSET_FIELD(match, tun_proto, 0xff, true);
else
@@ -822,13 +830,15 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
}
if (ipv4) {
if (info_bridge_mode) {
+ __clear_bit(IP_TUNNEL_KEY_BIT, tun_flags);
+
if (match->key->tun_key.u.ipv4.src ||
match->key->tun_key.u.ipv4.dst ||
match->key->tun_key.tp_src ||
match->key->tun_key.tp_dst ||
match->key->tun_key.ttl ||
match->key->tun_key.tos ||
- tun_flags & ~TUNNEL_KEY) {
+ !ip_tunnel_flags_empty(tun_flags)) {
OVS_NLERR(log, "IPv4 tun info is not correct");
return -EINVAL;
}
@@ -873,7 +883,7 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
const void *tun_opts, int swkey_tun_opts_len,
unsigned short tun_proto, u8 mode)
{
- if (output->tun_flags & TUNNEL_KEY &&
+ if (test_bit(IP_TUNNEL_KEY_BIT, output->tun_flags) &&
nla_put_be64(skb, OVS_TUNNEL_KEY_ATTR_ID, output->tun_id,
OVS_TUNNEL_KEY_ATTR_PAD))
return -EMSGSIZE;
@@ -909,10 +919,10 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
return -EMSGSIZE;
if (nla_put_u8(skb, OVS_TUNNEL_KEY_ATTR_TTL, output->ttl))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_CSUM) &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_CSUM))
return -EMSGSIZE;
if (output->tp_src &&
@@ -921,18 +931,20 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
if (output->tp_dst &&
nla_put_be16(skb, OVS_TUNNEL_KEY_ATTR_TP_DST, output->tp_dst))
return -EMSGSIZE;
- if ((output->tun_flags & TUNNEL_OAM) &&
+ if (test_bit(IP_TUNNEL_OAM_BIT, output->tun_flags) &&
nla_put_flag(skb, OVS_TUNNEL_KEY_ATTR_OAM))
return -EMSGSIZE;
if (swkey_tun_opts_len) {
- if (output->tun_flags & TUNNEL_GENEVE_OPT &&
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, output->tun_flags) &&
nla_put(skb, OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS,
swkey_tun_opts_len, tun_opts))
return -EMSGSIZE;
- else if (output->tun_flags & TUNNEL_VXLAN_OPT &&
+ else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT,
+ output->tun_flags) &&
vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
return -EMSGSIZE;
- else if (output->tun_flags & TUNNEL_ERSPAN_OPT &&
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ output->tun_flags) &&
nla_put(skb, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
swkey_tun_opts_len, tun_opts))
return -EMSGSIZE;
@@ -2028,7 +2040,7 @@ static int __ovs_nla_put_key(const struct sw_flow_key *swkey,
if ((swkey->tun_proto || is_mask)) {
const void *opts = NULL;

- if (output->tun_key.tun_flags & TUNNEL_OPTIONS_PRESENT)
+ if (ip_tunnel_is_options_present(output->tun_key.tun_flags))
opts = TUN_METADATA_OPTS(output, swkey->tun_opts_len);

if (ip_tun_to_nlattr(skb, &output->tun_key, opts,
@@ -2744,7 +2756,8 @@ static int validate_geneve_opts(struct sw_flow_key *key)
opts_len -= len;
}

- key->tun_key.tun_flags |= crit_opt ? TUNNEL_CRIT_OPT : 0;
+ if (crit_opt)
+ __set_bit(IP_TUNNEL_CRIT_OPT_BIT, key->tun_key.tun_flags);

return 0;
}
@@ -2752,6 +2765,7 @@ static int validate_geneve_opts(struct sw_flow_key *key)
static int validate_and_copy_set_tun(const struct nlattr *attr,
struct sw_flow_actions **sfa, bool log)
{
+ IP_TUNNEL_DECLARE_FLAGS(dst_opt_type) = { };
struct sw_flow_match match;
struct sw_flow_key key;
struct metadata_dst *tun_dst;
@@ -2759,9 +2773,7 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
struct ovs_tunnel_info *ovs_tun;
struct nlattr *a;
int err = 0, start, opts_type;
- __be16 dst_opt_type;

- dst_opt_type = 0;
ovs_match_init(&match, &key, true, NULL);
opts_type = ip_tun_from_nlattr(nla_data(attr), &match, false, log);
if (opts_type < 0)
@@ -2773,13 +2785,14 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
err = validate_geneve_opts(&key);
if (err < 0)
return err;
- dst_opt_type = TUNNEL_GENEVE_OPT;
+
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, dst_opt_type);
break;
case OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS:
- dst_opt_type = TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, dst_opt_type);
break;
case OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS:
- dst_opt_type = TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, dst_opt_type);
break;
}
}
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 81a794e36f53..3fc90dfc1fdd 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -220,7 +220,7 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
const struct ip_tunnel_key *tun_key = &tun_info->key;
int tun_opts_len = tun_info->options_len;

- if (tun_key->tun_flags & TUNNEL_KEY &&
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags) &&
nla_put_be64(skb, PSAMPLE_TUNNEL_KEY_ATTR_ID, tun_key->tun_id,
PSAMPLE_TUNNEL_KEY_ATTR_PAD))
return -EMSGSIZE;
@@ -256,10 +256,10 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
return -EMSGSIZE;
if (nla_put_u8(skb, PSAMPLE_TUNNEL_KEY_ATTR_TTL, tun_key->ttl))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_DONT_FRAGMENT))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_CSUM) &&
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_CSUM))
return -EMSGSIZE;
if (tun_key->tp_src &&
@@ -268,15 +268,16 @@ static int __psample_ip_tun_to_nlattr(struct sk_buff *skb,
if (tun_key->tp_dst &&
nla_put_be16(skb, PSAMPLE_TUNNEL_KEY_ATTR_TP_DST, tun_key->tp_dst))
return -EMSGSIZE;
- if ((tun_key->tun_flags & TUNNEL_OAM) &&
+ if (test_bit(IP_TUNNEL_OAM_BIT, tun_key->tun_flags) &&
nla_put_flag(skb, PSAMPLE_TUNNEL_KEY_ATTR_OAM))
return -EMSGSIZE;
if (tun_opts_len) {
- if (tun_key->tun_flags & TUNNEL_GENEVE_OPT &&
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_key->tun_flags) &&
nla_put(skb, PSAMPLE_TUNNEL_KEY_ATTR_GENEVE_OPTS,
tun_opts_len, tun_opts))
return -EMSGSIZE;
- else if (tun_key->tun_flags & TUNNEL_ERSPAN_OPT &&
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_key->tun_flags) &&
nla_put(skb, PSAMPLE_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
tun_opts_len, tun_opts))
return -EMSGSIZE;
@@ -313,7 +314,7 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
int tun_opts_len = tun_info->options_len;
int sum = nla_total_size(0); /* PSAMPLE_ATTR_TUNNEL */

- if (tun_key->tun_flags & TUNNEL_KEY)
+ if (test_bit(IP_TUNNEL_KEY_BIT, tun_key->tun_flags))
sum += nla_total_size_64bit(sizeof(u64));

if (tun_info->mode & IP_TUNNEL_INFO_BRIDGE)
@@ -336,20 +337,21 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
if (tun_key->tos)
sum += nla_total_size(sizeof(u8));
sum += nla_total_size(sizeof(u8)); /* TTL */
- if (tun_key->tun_flags & TUNNEL_DONT_FRAGMENT)
+ if (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
- if (tun_key->tun_flags & TUNNEL_CSUM)
+ if (test_bit(IP_TUNNEL_CSUM_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
if (tun_key->tp_src)
sum += nla_total_size(sizeof(u16));
if (tun_key->tp_dst)
sum += nla_total_size(sizeof(u16));
- if (tun_key->tun_flags & TUNNEL_OAM)
+ if (test_bit(IP_TUNNEL_OAM_BIT, tun_key->tun_flags))
sum += nla_total_size(0);
if (tun_opts_len) {
- if (tun_key->tun_flags & TUNNEL_GENEVE_OPT)
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, tun_key->tun_flags))
sum += nla_total_size(tun_opts_len);
- else if (tun_key->tun_flags & TUNNEL_ERSPAN_OPT)
+ else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT,
+ tun_key->tun_flags))
sum += nla_total_size(tun_opts_len);
}

diff --git a/net/sched/act_tunnel_key.c b/net/sched/act_tunnel_key.c
index 0c8aa7e686ea..25469e85922b 100644
--- a/net/sched/act_tunnel_key.c
+++ b/net/sched/act_tunnel_key.c
@@ -230,7 +230,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
nla_for_each_attr(attr, head, len, rem) {
switch (nla_type(attr)) {
case TCA_TUNNEL_KEY_ENC_OPTS_GENEVE:
- if (type && type != TUNNEL_GENEVE_OPT) {
+ if (type && type != IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG(extack, "Duplicate type for geneve options");
return -EINVAL;
}
@@ -247,7 +247,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
dst_len -= opt_len;
dst += opt_len;
}
- type = TUNNEL_GENEVE_OPT;
+ type = IP_TUNNEL_GENEVE_OPT_BIT;
break;
case TCA_TUNNEL_KEY_ENC_OPTS_VXLAN:
if (type) {
@@ -259,7 +259,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_VXLAN_OPT;
+ type = IP_TUNNEL_VXLAN_OPT_BIT;
break;
case TCA_TUNNEL_KEY_ENC_OPTS_ERSPAN:
if (type) {
@@ -271,7 +271,7 @@ static int tunnel_key_copy_opts(const struct nlattr *nla, u8 *dst,
if (opt_len < 0)
return opt_len;
opts_len += opt_len;
- type = TUNNEL_ERSPAN_OPT;
+ type = IP_TUNNEL_ERSPAN_OPT_BIT;
break;
}
}
@@ -302,7 +302,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
switch (nla_type(nla_data(nla))) {
case TCA_TUNNEL_KEY_ENC_OPTS_GENEVE:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_GENEVE_OPT;
+ __set_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -310,7 +310,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
#endif
case TCA_TUNNEL_KEY_ENC_OPTS_VXLAN:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_VXLAN_OPT;
+ __set_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -318,7 +318,7 @@ static int tunnel_key_opts_set(struct nlattr *nla, struct ip_tunnel_info *info,
#endif
case TCA_TUNNEL_KEY_ENC_OPTS_ERSPAN:
#if IS_ENABLED(CONFIG_INET)
- info->key.tun_flags |= TUNNEL_ERSPAN_OPT;
+ __set_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags);
return tunnel_key_copy_opts(nla, ip_tunnel_info_opts(info),
opts_len, extack);
#else
@@ -363,6 +363,7 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
bool bind = act_flags & TCA_ACT_FLAGS_BIND;
struct nlattr *tb[TCA_TUNNEL_KEY_MAX + 1];
struct tcf_tunnel_key_params *params_new;
+ IP_TUNNEL_DECLARE_FLAGS(flags) = { };
struct metadata_dst *metadata = NULL;
struct tcf_chain *goto_ch = NULL;
struct tc_tunnel_key *parm;
@@ -371,7 +372,6 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,
__be16 dst_port = 0;
__be64 key_id = 0;
int opts_len = 0;
- __be16 flags = 0;
u8 tos, ttl;
int ret = 0;
u32 index;
@@ -412,16 +412,16 @@ static int tunnel_key_init(struct net *net, struct nlattr *nla,

key32 = nla_get_be32(tb[TCA_TUNNEL_KEY_ENC_KEY_ID]);
key_id = key32_to_tunnel_id(key32);
- flags = TUNNEL_KEY;
+ __set_bit(IP_TUNNEL_KEY_BIT, flags);
}

- flags |= TUNNEL_CSUM;
+ __set_bit(IP_TUNNEL_CSUM_BIT, flags);
if (tb[TCA_TUNNEL_KEY_NO_CSUM] &&
nla_get_u8(tb[TCA_TUNNEL_KEY_NO_CSUM]))
- flags &= ~TUNNEL_CSUM;
+ __clear_bit(IP_TUNNEL_CSUM_BIT, flags);

if (nla_get_flag(tb[TCA_TUNNEL_KEY_NO_FRAG]))
- flags |= TUNNEL_DONT_FRAGMENT;
+ __set_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, flags);

if (tb[TCA_TUNNEL_KEY_ENC_DST_PORT])
dst_port = nla_get_be16(tb[TCA_TUNNEL_KEY_ENC_DST_PORT]);
@@ -663,15 +663,15 @@ static int tunnel_key_opts_dump(struct sk_buff *skb,
if (!start)
return -EMSGSIZE;

- if (info->key.tun_flags & TUNNEL_GENEVE_OPT) {
+ if (test_bit(IP_TUNNEL_GENEVE_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_geneve_opts_dump(skb, info);
if (err)
goto err_out;
- } else if (info->key.tun_flags & TUNNEL_VXLAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_VXLAN_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_vxlan_opts_dump(skb, info);
if (err)
goto err_out;
- } else if (info->key.tun_flags & TUNNEL_ERSPAN_OPT) {
+ } else if (test_bit(IP_TUNNEL_ERSPAN_OPT_BIT, info->key.tun_flags)) {
err = tunnel_key_erspan_opts_dump(skb, info);
if (err)
goto err_out;
@@ -741,7 +741,7 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
struct ip_tunnel_key *key = &info->key;
__be32 key_id = tunnel_id_to_key32(key->tun_id);

- if (((key->tun_flags & TUNNEL_KEY) &&
+ if ((test_bit(IP_TUNNEL_KEY_BIT, key->tun_flags) &&
nla_put_be32(skb, TCA_TUNNEL_KEY_ENC_KEY_ID, key_id)) ||
tunnel_key_dump_addresses(skb,
&params->tcft_enc_metadata->u.tun_info) ||
@@ -749,8 +749,8 @@ static int tunnel_key_dump(struct sk_buff *skb, struct tc_action *a,
nla_put_be16(skb, TCA_TUNNEL_KEY_ENC_DST_PORT,
key->tp_dst)) ||
nla_put_u8(skb, TCA_TUNNEL_KEY_NO_CSUM,
- !(key->tun_flags & TUNNEL_CSUM)) ||
- ((key->tun_flags & TUNNEL_DONT_FRAGMENT) &&
+ !test_bit(IP_TUNNEL_CSUM_BIT, key->tun_flags)) ||
+ (test_bit(IP_TUNNEL_DONT_FRAGMENT_BIT, key->tun_flags) &&
nla_put_flag(skb, TCA_TUNNEL_KEY_NO_FRAG)) ||
tunnel_key_opts_dump(skb, info))
goto nla_put_failure;
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
index e5314a31f75a..8986bf9d1259 100644
--- a/net/sched/cls_flower.c
+++ b/net/sched/cls_flower.c
@@ -1454,12 +1454,13 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
switch (nla_type(nla_opt_key)) {
case TCA_FLOWER_KEY_ENC_OPTS_GENEVE:
if (key->enc_opts.dst_opt_type &&
- key->enc_opts.dst_opt_type != TUNNEL_GENEVE_OPT) {
+ key->enc_opts.dst_opt_type !=
+ IP_TUNNEL_GENEVE_OPT_BIT) {
NL_SET_ERR_MSG(extack, "Duplicate type for geneve options");
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_GENEVE_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_GENEVE_OPT_BIT;
option_len = fl_set_geneve_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1470,7 +1471,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_GENEVE_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_GENEVE_OPT_BIT;
option_len = fl_set_geneve_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1489,7 +1490,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_VXLAN_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_VXLAN_OPT_BIT;
option_len = fl_set_vxlan_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1500,7 +1501,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_VXLAN_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_VXLAN_OPT_BIT;
option_len = fl_set_vxlan_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1519,7 +1520,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_ERSPAN_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_ERSPAN_OPT_BIT;
option_len = fl_set_erspan_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1530,7 +1531,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_ERSPAN_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_ERSPAN_OPT_BIT;
option_len = fl_set_erspan_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -1550,7 +1551,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
return -EINVAL;
}
option_len = 0;
- key->enc_opts.dst_opt_type = TUNNEL_GTP_OPT;
+ key->enc_opts.dst_opt_type = IP_TUNNEL_GTP_OPT_BIT;
option_len = fl_set_gtp_opt(nla_opt_key, key,
key_depth, option_len,
extack);
@@ -1561,7 +1562,7 @@ static int fl_set_enc_opt(struct nlattr **tb, struct fl_flow_key *key,
/* At the same time we need to parse through the mask
* in order to verify exact and mask attribute lengths.
*/
- mask->enc_opts.dst_opt_type = TUNNEL_GTP_OPT;
+ mask->enc_opts.dst_opt_type = IP_TUNNEL_GTP_OPT_BIT;
option_len = fl_set_gtp_opt(nla_opt_msk, mask,
msk_depth, option_len,
extack);
@@ -3177,22 +3178,22 @@ static int fl_dump_key_options(struct sk_buff *skb, int enc_opt_type,
goto nla_put_failure;

switch (enc_opts->dst_opt_type) {
- case TUNNEL_GENEVE_OPT:
+ case IP_TUNNEL_GENEVE_OPT_BIT:
err = fl_dump_key_geneve_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_VXLAN_OPT:
+ case IP_TUNNEL_VXLAN_OPT_BIT:
err = fl_dump_key_vxlan_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_ERSPAN_OPT:
+ case IP_TUNNEL_ERSPAN_OPT_BIT:
err = fl_dump_key_erspan_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
break;
- case TUNNEL_GTP_OPT:
+ case IP_TUNNEL_GTP_OPT_BIT:
err = fl_dump_key_gtp_opt(skb, enc_opts);
if (err)
goto nla_put_failure;
--
2.41.0

2023-10-09 15:30:52

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH 07/14] btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()

On Mon, Oct 09, 2023 at 05:10:19PM +0200, Alexander Lobakin wrote:
> bitmap_set_bits() does not start with the FS' prefix and may collide
> with a new generic helper one day. It operates with the FS-specific
> types, so there's no change those two could do the same thing.
> Just add the prefix to exclude such possible conflict.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>

Acked-by: David Sterba <[email protected]>

We don't have any other code pending that would potentially collide with
this change so I don't care when and via which tree this gets merged. I
can take it by btrfs too.

2023-10-09 16:19:32

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 03/14] bitops: let the compiler optimize __assign_bit()

On Mon, Oct 09, 2023 at 05:10:15PM +0200, Alexander Lobakin wrote:
> Since commit b03fc1173c0c ("bitops: let optimize out non-atomic bitops
> on compile-time constants"), the compilers are able to expand inline
> bitmap operations to compile-time initializers when possible.
> However, during the round of replacement if-__set-else-__clear with
> __assign_bit() as per Andy's advice, bloat-o-meter showed +1024 bytes
> difference in object code size for one module (even one function),
> where the pattern:
>
> DECLARE_BITMAP(foo) = { }; // on the stack, zeroed
>
> if (a)
> __set_bit(const_bit_num, foo);
> if (b)
> __set_bit(another_const_bit_num, foo);
> ...
>
> is heavily used, although there should be no difference: the bitmap is
> zeroed, so the second half of __assign_bit() should be compiled-out as
> a no-op.
> I either missed the fact that __assign_bit() has bitmap pointer marked
> as `volatile` (as we usually do for bitmaps) or was hoping that the

No, we usually don't. Atomic ops on individual bits is a notable exception
for bitmaps, as the comment for generic_test_bit() says, for example:
/*
* Unlike the bitops with the '__' prefix above, this one *is* atomic,
* so `volatile` must always stay here with no cast-aways. See
* `Documentation/atomic_bitops.txt` for the details.
*/

For non-atomic single-bit operations and all multi-bit ops, volatile is
useless, and generic___test_and_set_bit() in the same file casts the
*addr to a plain 'unsigned long *'.

> compilers would at least try to look past the `volatile` for
> __always_inline functions. Anyhow, due to that attribute, the compilers
> were always compiling the whole expression and no mentioned compile-time
> optimizations were working.
>
> Convert __assign_bit() to a macro since it's a very simple if-else and
> all of the checks are performed inside __set_bit() and __clear_bit(),
> thus that wrapper has to be as transparent as possible. After that
> change, despite it showing only -20 bytes change for vmlinux (due to
> that it's still relatively unpopular), no drastic code size changes
> happen when replacing if-set-else-clear for onstack bitmaps with
> __assign_bit(), meaning the compiler now expands them to the actual
> operations will all the expected optimizations.
>
> Cc: Andy Shevchenko <[email protected]>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> include/linux/bitops.h | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/bitops.h b/include/linux/bitops.h
> index e0cd09eb91cd..f98f4fd1047f 100644
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -284,14 +284,8 @@ static __always_inline void assign_bit(long nr, volatile unsigned long *addr,
> clear_bit(nr, addr);
> }
>
> -static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
> - bool value)
> -{
> - if (value)
> - __set_bit(nr, addr);
> - else
> - __clear_bit(nr, addr);
> -}
> +#define __assign_bit(nr, addr, value) \
> + ((value) ? __set_bit(nr, addr) : __clear_bit(nr, addr))

Can you protect nr and addr with braces just as well?
Can you convert the atomic version too, to keep them synchronized ?

>
> /**
> * __ptr_set_bit - Set bit in a pointer's value
> --
> 2.41.0

2023-10-09 16:19:58

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 09/14] bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()

Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a
particular offset. Currently, there are only bitmap_{get,set}_value8()
to do that for 8 bits and that's it.
Instead of introducing a separate pair for u16 and so on, which doesn't
scale well, extend the existing functions to be able to pass the wanted
value width. Make both offset and width arbitrary, but in order to not
over complicate the current logic and keep the helpers as optimized as
the current ones, require the width to be a pow-2 value and the offset
to be a multiple of the width, while the target piece should not cross
a %BITS_PER_LONG boundary and stay within one long.
Avoid adjusting all the already existing callsites by defining oneliner
wrapper macros named after the former functions. bloat-o-meter shows
almost no difference (+1-2 bytes in a couple of places), meaning the
new helpers get optimized just nicely.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
include/linux/bitmap.h | 51 ++++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 63e422f8ba3d..9c010a7fa331 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -6,8 +6,10 @@

#include <linux/align.h>
#include <linux/bitops.h>
+#include <linux/bug.h>
#include <linux/find.h>
#include <linux/limits.h>
+#include <linux/log2.h>
#include <linux/string.h>
#include <linux/types.h>

@@ -569,38 +571,59 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask)
}

/**
- * bitmap_get_value8 - get an 8-bit value within a memory region
+ * bitmap_get_bits - get a 8/16/32/64-bit value within a memory region
* @map: address to the bitmap memory region
- * @start: bit offset of the 8-bit value; must be a multiple of 8
+ * @start: bit offset of the value; must be a multiple of @len
+ * @len: bit width of the value; must be a power of two
*
- * Returns the 8-bit value located at the @start bit offset within the @src
- * memory region.
+ * Return: the 8/16/32/64-bit value located at the @start bit offset within
+ * the @src memory region. Its position (@start + @len) can't cross
+ * a ``BITS_PER_LONG`` boundary.
*/
-static inline unsigned long bitmap_get_value8(const unsigned long *map,
- unsigned long start)
+static inline unsigned long bitmap_get_bits(const unsigned long *map,
+ unsigned long start, size_t len)
{
const size_t index = BIT_WORD(start);
const unsigned long offset = start % BITS_PER_LONG;

- return (map[index] >> offset) & 0xFF;
+ if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len ||
+ offset + len > BITS_PER_LONG))
+ return 0;
+
+ return (map[index] >> offset) & GENMASK(len - 1, 0);
}

/**
- * bitmap_set_value8 - set an 8-bit value within a memory region
+ * bitmap_set_bits - set a 8/16/32/64-bit value within a memory region
* @map: address to the bitmap memory region
- * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
- * @start: bit offset of the 8-bit value; must be a multiple of 8
+ * @start: bit offset of the value; must be a multiple of @len
+ * @value: new value to set
+ * @len: bit width of the value; must be a power of two
+ *
+ * Replaces the 8/16/32/64-bit value located at the @start bit offset within
+ * the @src memory region with the new @value. Its position (@start + @len)
+ * can't cross a ``BITS_PER_LONG`` boundary.
*/
-static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
- unsigned long start)
+static inline void bitmap_set_bits(unsigned long *map, unsigned long start,
+ unsigned long value, size_t len)
{
const size_t index = BIT_WORD(start);
const unsigned long offset = start % BITS_PER_LONG;
+ unsigned long mask = GENMASK(len - 1, 0);

- map[index] &= ~(0xFFUL << offset);
- map[index] |= value << offset;
+ if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len ||
+ offset + len > BITS_PER_LONG))
+ return;
+
+ map[index] &= ~(mask << offset);
+ map[index] |= (value & mask) << offset;
}

+#define bitmap_get_value8(map, start) \
+ bitmap_get_bits(map, start, BITS_PER_BYTE)
+#define bitmap_set_value8(map, value, start) \
+ bitmap_set_bits(map, start, value, BITS_PER_BYTE)
+
#endif /* __ASSEMBLY__ */

#endif /* __LINUX_BITMAP_H */
--
2.41.0

2023-10-09 16:20:01

by Alexander Lobakin

[permalink] [raw]
Subject: [PATCH 13/14] lib/bitmap: add tests for bitmap_{get,set}_bits()

bitmap_{get,set}_value8() is now bitmap_{get,set}_bits(). The former
didn't have any dedicated tests in the bitmap test suite.
Add a pack of test cases with variable offset, width, and value to set
(for _set_bits()), randomly generated with the only limitation:
``offset % 32 + width <= 32``, to make sure the tests won't fail or
trigger kernel warnings on 32-bit platforms. For _get_bits(), compare
the return values with the expected ones, calculated and saved manually.
For _set_bit(), do several modifications of the source bitmap and then
compare against the expected resulting one, also pre-calculated.

Reviewed-by: Przemek Kitszel <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
---
lib/test_bitmap.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 77 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index cbf1b9611616..6037b66fd30a 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -1161,6 +1161,82 @@ static void __init test_bitmap_print_buf(void)
}
}

+struct getset_test {
+ u16 offset;
+ u16 width;
+ union {
+ u32 expect;
+ u32 value;
+ };
+};
+
+#define GETSET_TEST(o, w, v) { \
+ .offset = (o), \
+ .width = (w), \
+ .value = (v), \
+}
+
+static const unsigned long getset_src[] __initconst = {
+ BITMAP_FROM_U64(0x4329c918b472468eULL),
+ BITMAP_FROM_U64(0xb2c20a622474a119ULL),
+ BITMAP_FROM_U64(0x3a08cb5591cea40dULL),
+ BITMAP_FROM_U64(0xc9a7550384e145f8ULL),
+};
+
+static const struct getset_test get_bits_test[] __initconst = {
+ GETSET_TEST(208, 16, 0x84e1),
+ GETSET_TEST(104, 8, 0xa),
+ GETSET_TEST(224, 32, 0xc9a75503),
+ GETSET_TEST(64, 16, 0xa119),
+ GETSET_TEST(169, 1, 0x1),
+ GETSET_TEST(144, 8, 0xce),
+ GETSET_TEST(80, 4, 0x4),
+ GETSET_TEST(24, 4, 0x4),
+};
+
+static const struct getset_test set_bits_test[] __initconst = {
+ GETSET_TEST(56, 4, 0xa),
+ GETSET_TEST(80, 16, 0xb17a),
+ GETSET_TEST(112, 8, 0x1b),
+ GETSET_TEST(0, 32, 0xe8a555f2),
+ GETSET_TEST(16, 2, 0),
+ GETSET_TEST(72, 8, 0x7d),
+ GETSET_TEST(47, 1, 0),
+ GETSET_TEST(160, 16, 0x1622),
+};
+
+static const unsigned long getset_out[] __initconst = {
+ BITMAP_FROM_U64(0x4a294918e8a455f2ULL),
+ BITMAP_FROM_U64(0xb21b0a62b17a7d19ULL),
+ BITMAP_FROM_U64(0x3a08162291cea40dULL),
+ BITMAP_FROM_U64(0xc9a7550384e145f8ULL),
+};
+
+#define GETSET_TEST_BITS BYTES_TO_BITS(sizeof(getset_out))
+
+static void __init test_bitmap_getset_bits(void)
+{
+ DECLARE_BITMAP(out, GETSET_TEST_BITS);
+
+ for (u32 i = 0; i < ARRAY_SIZE(get_bits_test); i++) {
+ const struct getset_test *test = &get_bits_test[i];
+ u32 val;
+
+ val = bitmap_get_bits(getset_src, test->offset, test->width);
+ expect_eq_uint(test->expect, val);
+ }
+
+ bitmap_copy(out, getset_src, GETSET_TEST_BITS);
+
+ for (u32 i = 0; i < ARRAY_SIZE(set_bits_test); i++) {
+ const struct getset_test *test = &set_bits_test[i];
+
+ bitmap_set_bits(out, test->offset, test->value, test->width);
+ }
+
+ expect_eq_bitmap(getset_out, out, GETSET_TEST_BITS);
+}
+
/*
* FIXME: Clang breaks compile-time evaluations when KASAN and GCOV are enabled.
* To workaround it, GCOV is force-disabled in Makefile for this configuration.
@@ -1238,6 +1314,7 @@ static void __init selftest(void)
test_mem_optimisations();
test_bitmap_cut();
test_bitmap_print_buf();
+ test_bitmap_getset_bits();
test_bitmap_const_eval();

test_find_nth_bit();
--
2.41.0

2023-10-09 16:31:46

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 09/14] bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()

+ Alexander Potapenko <[email protected]>

On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote:
> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a
> particular offset. Currently, there are only bitmap_{get,set}_value8()
> to do that for 8 bits and that's it.

And also a series from Alexander Potapenko, which I really hope will
get into the -next really soon. It introduces bitmap_read/write which
can set up to BITS_PER_LONG at once, with no limitations on alignment
of position and length:

https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb

Can you consider building your series on top of it?

> Instead of introducing a separate pair for u16 and so on, which doesn't
> scale well, extend the existing functions to be able to pass the wanted
> value width. Make both offset and width arbitrary, but in order to not
> over complicate the current logic and keep the helpers as optimized as
> the current ones, require the width to be a pow-2 value and the offset
> to be a multiple of the width, while the target piece should not cross
> a %BITS_PER_LONG boundary and stay within one long.
> Avoid adjusting all the already existing callsites by defining oneliner
> wrapper macros named after the former functions. bloat-o-meter shows
> almost no difference (+1-2 bytes in a couple of places), meaning the
> new helpers get optimized just nicely.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> include/linux/bitmap.h | 51 ++++++++++++++++++++++++++++++------------
> 1 file changed, 37 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 63e422f8ba3d..9c010a7fa331 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -6,8 +6,10 @@
>
> #include <linux/align.h>
> #include <linux/bitops.h>
> +#include <linux/bug.h>
> #include <linux/find.h>
> #include <linux/limits.h>
> +#include <linux/log2.h>
> #include <linux/string.h>
> #include <linux/types.h>
>
> @@ -569,38 +571,59 @@ static inline void bitmap_from_u64(unsigned long *dst, u64 mask)
> }
>
> /**
> - * bitmap_get_value8 - get an 8-bit value within a memory region
> + * bitmap_get_bits - get a 8/16/32/64-bit value within a memory region
> * @map: address to the bitmap memory region
> - * @start: bit offset of the 8-bit value; must be a multiple of 8
> + * @start: bit offset of the value; must be a multiple of @len
> + * @len: bit width of the value; must be a power of two
> *
> - * Returns the 8-bit value located at the @start bit offset within the @src
> - * memory region.
> + * Return: the 8/16/32/64-bit value located at the @start bit offset within
> + * the @src memory region. Its position (@start + @len) can't cross
> + * a ``BITS_PER_LONG`` boundary.
> */
> -static inline unsigned long bitmap_get_value8(const unsigned long *map,
> - unsigned long start)
> +static inline unsigned long bitmap_get_bits(const unsigned long *map,
> + unsigned long start, size_t len)
> {
> const size_t index = BIT_WORD(start);
> const unsigned long offset = start % BITS_PER_LONG;
>
> - return (map[index] >> offset) & 0xFF;
> + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len ||
> + offset + len > BITS_PER_LONG))
> + return 0;
> +
> + return (map[index] >> offset) & GENMASK(len - 1, 0);
> }
>
> /**
> - * bitmap_set_value8 - set an 8-bit value within a memory region
> + * bitmap_set_bits - set a 8/16/32/64-bit value within a memory region
> * @map: address to the bitmap memory region
> - * @value: the 8-bit value; values wider than 8 bits may clobber bitmap
> - * @start: bit offset of the 8-bit value; must be a multiple of 8
> + * @start: bit offset of the value; must be a multiple of @len
> + * @value: new value to set
> + * @len: bit width of the value; must be a power of two
> + *
> + * Replaces the 8/16/32/64-bit value located at the @start bit offset within
> + * the @src memory region with the new @value. Its position (@start + @len)
> + * can't cross a ``BITS_PER_LONG`` boundary.
> */
> -static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
> - unsigned long start)
> +static inline void bitmap_set_bits(unsigned long *map, unsigned long start,
> + unsigned long value, size_t len)
> {
> const size_t index = BIT_WORD(start);
> const unsigned long offset = start % BITS_PER_LONG;
> + unsigned long mask = GENMASK(len - 1, 0);
>
> - map[index] &= ~(0xFFUL << offset);
> - map[index] |= value << offset;
> + if (WARN_ON_ONCE(!is_power_of_2(len) || offset % len ||
> + offset + len > BITS_PER_LONG))
> + return;
> +
> + map[index] &= ~(mask << offset);
> + map[index] |= (value & mask) << offset;
> }
>
> +#define bitmap_get_value8(map, start) \
> + bitmap_get_bits(map, start, BITS_PER_BYTE)
> +#define bitmap_set_value8(map, value, start) \
> + bitmap_set_bits(map, start, value, BITS_PER_BYTE)
> +
> #endif /* __ASSEMBLY__ */
>
> #endif /* __LINUX_BITMAP_H */
> --
> 2.41.0

2023-10-09 16:36:18

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 05/14] s390/cio: rename bitmap_size() -> idset_bitmap_size()

On Mon, Oct 09, 2023 at 05:10:17PM +0200, Alexander Lobakin wrote:
> bitmap_size() is a pretty generic name and one may want to use it for
> a generic bitmap API function. At the same time, its logic is not
> "generic", i.e. it's not just `nbits -> size of bitmap in bytes`
> converter as it would be expected from its name.
> Add the prefix 'idset_' used throughout the file where the function
> resides.

At the first glance, this custom implementation just duplicates the
generic one that you introduce in the following patch. If so, why
don't you switch idset to just use generic bitmap_size()?

>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> idset_new() really wants its vmalloc() + memset() pair to be replaced
> with vzalloc().
> ---
> drivers/s390/cio/idset.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/s390/cio/idset.c b/drivers/s390/cio/idset.c
> index 45f9c0736be4..0a1105a483bf 100644
> --- a/drivers/s390/cio/idset.c
> +++ b/drivers/s390/cio/idset.c
> @@ -16,7 +16,7 @@ struct idset {
> unsigned long bitmap[];
> };
>
> -static inline unsigned long bitmap_size(int num_ssid, int num_id)
> +static inline unsigned long idset_bitmap_size(int num_ssid, int num_id)
> {
> return BITS_TO_LONGS(num_ssid * num_id) * sizeof(unsigned long);
> }
> @@ -25,11 +25,12 @@ static struct idset *idset_new(int num_ssid, int num_id)
> {
> struct idset *set;
>
> - set = vmalloc(sizeof(struct idset) + bitmap_size(num_ssid, num_id));
> + set = vmalloc(sizeof(struct idset) +
> + idset_bitmap_size(num_ssid, num_id));
> if (set) {
> set->num_ssid = num_ssid;
> set->num_id = num_id;
> - memset(set->bitmap, 0, bitmap_size(num_ssid, num_id));
> + memset(set->bitmap, 0, idset_bitmap_size(num_ssid, num_id));
> }
> return set;
> }
> @@ -41,7 +42,8 @@ void idset_free(struct idset *set)
>
> void idset_fill(struct idset *set)
> {
> - memset(set->bitmap, 0xff, bitmap_size(set->num_ssid, set->num_id));
> + memset(set->bitmap, 0xff,
> + idset_bitmap_size(set->num_ssid, set->num_id));
> }
>
> static inline void idset_add(struct idset *set, int ssid, int id)
> --
> 2.41.0

2023-10-09 16:51:06

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 06/14] fs/ntfs3: rename bitmap_size() -> ntfs3_bitmap_size()

On Mon, Oct 09, 2023 at 05:10:18PM +0200, Alexander Lobakin wrote:
> bitmap_size() is a pretty generic name and one may want to use it for
> a generic bitmap API function. At the same time, its logic is
> NTFS-specific, as it aligns to the sizeof(u64), not the sizeof(long)
> (although it uses ideologically right ALIGN() instead of division).
> Add the prefix 'ntfs3_' used for that FS (not just 'ntfs_' to not mix
> it with the legacy module).
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> fs/ntfs3/bitmap.c | 4 ++--
> fs/ntfs3/fsntfs.c | 2 +-
> fs/ntfs3/index.c | 11 ++++++-----
> fs/ntfs3/ntfs_fs.h | 2 +-
> fs/ntfs3/super.c | 2 +-
> 5 files changed, 11 insertions(+), 10 deletions(-)
>
> diff --git a/fs/ntfs3/bitmap.c b/fs/ntfs3/bitmap.c
> index 107e808e06ea..a2e18f13e93a 100644
> --- a/fs/ntfs3/bitmap.c
> +++ b/fs/ntfs3/bitmap.c
> @@ -653,7 +653,7 @@ int wnd_init(struct wnd_bitmap *wnd, struct super_block *sb, size_t nbits)
> wnd->total_zeroes = nbits;
> wnd->extent_max = MINUS_ONE_T;
> wnd->zone_bit = wnd->zone_end = 0;
> - wnd->nwnd = bytes_to_block(sb, bitmap_size(nbits));
> + wnd->nwnd = bytes_to_block(sb, ntfs3_bitmap_size(nbits));
> wnd->bits_last = nbits & (wbits - 1);
> if (!wnd->bits_last)
> wnd->bits_last = wbits;
> @@ -1345,7 +1345,7 @@ int wnd_extend(struct wnd_bitmap *wnd, size_t new_bits)
> return -EINVAL;
>
> /* Align to 8 byte boundary. */
> - new_wnd = bytes_to_block(sb, bitmap_size(new_bits));
> + new_wnd = bytes_to_block(sb, ntfs3_bitmap_size(new_bits));
> new_last = new_bits & (wbits - 1);
> if (!new_last)
> new_last = wbits;
> diff --git a/fs/ntfs3/fsntfs.c b/fs/ntfs3/fsntfs.c
> index 33afee0f5559..7a14d2347f27 100644
> --- a/fs/ntfs3/fsntfs.c
> +++ b/fs/ntfs3/fsntfs.c
> @@ -522,7 +522,7 @@ static int ntfs_extend_mft(struct ntfs_sb_info *sbi)
> ni->mi.dirty = true;
>
> /* Step 2: Resize $MFT::BITMAP. */
> - new_bitmap_bytes = bitmap_size(new_mft_total);
> + new_bitmap_bytes = ntfs3_bitmap_size(new_mft_total);
>
> err = attr_set_size(ni, ATTR_BITMAP, NULL, 0, &sbi->mft.bitmap.run,
> new_bitmap_bytes, &new_bitmap_bytes, true, NULL);
> diff --git a/fs/ntfs3/index.c b/fs/ntfs3/index.c
> index 124c6e822623..ab53a4b6ddf8 100644
> --- a/fs/ntfs3/index.c
> +++ b/fs/ntfs3/index.c
> @@ -1453,8 +1453,8 @@ static int indx_create_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
>
> alloc->nres.valid_size = alloc->nres.data_size = cpu_to_le64(data_size);
>
> - err = ni_insert_resident(ni, bitmap_size(1), ATTR_BITMAP, in->name,
> - in->name_len, &bitmap, NULL, NULL);
> + err = ni_insert_resident(ni, ntfs3_bitmap_size(1), ATTR_BITMAP,
> + in->name, in->name_len, &bitmap, NULL, NULL);
> if (err)
> goto out2;
>
> @@ -1515,8 +1515,9 @@ static int indx_add_allocate(struct ntfs_index *indx, struct ntfs_inode *ni,
> if (bmp) {
> /* Increase bitmap. */
> err = attr_set_size(ni, ATTR_BITMAP, in->name, in->name_len,
> - &indx->bitmap_run, bitmap_size(bit + 1),
> - NULL, true, NULL);
> + &indx->bitmap_run,
> + ntfs3_bitmap_size(bit + 1), NULL, true,
> + NULL);
> if (err)
> goto out1;
> }
> @@ -2089,7 +2090,7 @@ static int indx_shrink(struct ntfs_index *indx, struct ntfs_inode *ni,
> if (in->name == I30_NAME)
> ni->vfs_inode.i_size = new_data;
>
> - bpb = bitmap_size(bit);
> + bpb = ntfs3_bitmap_size(bit);
> if (bpb * 8 == nbits)
> return 0;
>
> diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
> index 629403ede6e5..93333156aac6 100644
> --- a/fs/ntfs3/ntfs_fs.h
> +++ b/fs/ntfs3/ntfs_fs.h
> @@ -961,7 +961,7 @@ static inline bool run_is_empty(struct runs_tree *run)
> }
>
> /* NTFS uses quad aligned bitmaps. */
> -static inline size_t bitmap_size(size_t bits)
> +static inline size_t ntfs3_bitmap_size(size_t bits)
> {
> return ALIGN((bits + 7) >> 3, 8);
> }

This looks like duplicating BITS_TO_U64(). If so, why not just switch
to using the macro while you're here?

> diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
> index cfec5e0c7f66..b1fb6efe7084 100644
> --- a/fs/ntfs3/super.c
> +++ b/fs/ntfs3/super.c
> @@ -1285,7 +1285,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
>
> /* Check bitmap boundary. */
> tt = sbi->used.bitmap.nbits;
> - if (inode->i_size < bitmap_size(tt)) {
> + if (inode->i_size < ntfs3_bitmap_size(tt)) {
> ntfs_err(sb, "$Bitmap is corrupted.");
> err = -EINVAL;
> goto put_inode_out;
> --
> 2.41.0

2023-10-09 17:04:37

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 08/14] bitmap: introduce generic optimized bitmap_size()

On Mon, Oct 09, 2023 at 05:10:20PM +0200, Alexander Lobakin wrote:
> The number of times yet another open coded
> `BITS_TO_LONGS(nbits) * sizeof(long)` can be spotted is huge.
> Some generic helper is long overdue.

OK, I see your point. Indeed, opencoding this again and again may be
annoying.

Acked-by: Yury Norov <[email protected]>

> Add one, bitmap_size(), but with one detail.
> BITS_TO_LONGS() uses DIV_ROUND_UP(). The latter works well when both
> divident and divisor are compile-time constants or when the divisor
> is not a pow-of-2. When it is however, the compilers sometimes tend
> to generate suboptimal code (GCC 13):
>
> 48 83 c0 3f add $0x3f,%rax
> 48 c1 e8 06 shr $0x6,%rax
> 48 8d 14 c5 00 00 00 00 lea 0x0(,%rax,8),%rdx
>
> %BITS_PER_LONG is always a pow-2 (either 32 or 64), but GCC still does
> full division of `nbits + 63` by it and then multiplication by 8.
> Instead of BITS_TO_LONGS(), use ALIGN() and then divide by 8. GCC:
>
> 8d 50 3f lea 0x3f(%rax),%edx
> c1 ea 03 shr $0x3,%edx
> 81 e2 f8 ff ff 1f and $0x1ffffff8,%edx
>
> Now it divides `nbits + 63` by 8 and then masks bits[2:0].
> bloat-o-meter:
>
> add/remove: 0/0 grow/shrink: 20/133 up/down: 156/-773 (-617)
>
> Clang does it better and generates the same code before/after starting
> from -O1, except that with the ALIGN() approach it uses %edx and thus
> still saves some bytes:
>
> add/remove: 0/0 grow/shrink: 9/133 up/down: 18/-538 (-520)
>
> Note that we can't expand DIV_ROUND_UP() by adding a check and using
> this approach there, as it's used in array declarations where
> expressions are not allowed.
> Add this helper to tools/ as well.
>
> Reviewed-by: Przemek Kitszel <[email protected]>
> Signed-off-by: Alexander Lobakin <[email protected]>
> ---
> drivers/md/dm-clone-metadata.c | 5 -----
> include/linux/bitmap.h | 8 +++++---
> include/linux/cpumask.h | 2 +-
> lib/math/prime_numbers.c | 2 --
> tools/include/linux/bitmap.h | 8 +++++---
> 5 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/md/dm-clone-metadata.c b/drivers/md/dm-clone-metadata.c
> index c43d55672bce..47c1fa7aad8b 100644
> --- a/drivers/md/dm-clone-metadata.c
> +++ b/drivers/md/dm-clone-metadata.c
> @@ -465,11 +465,6 @@ static void __destroy_persistent_data_structures(struct dm_clone_metadata *cmd)
>
> /*---------------------------------------------------------------------------*/
>
> -static size_t bitmap_size(unsigned long nr_bits)
> -{
> - return BITS_TO_LONGS(nr_bits) * sizeof(long);
> -}
> -
> static int __dirty_map_init(struct dirty_map *dmap, unsigned long nr_words,
> unsigned long nr_regions)
> {
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 03644237e1ef..63e422f8ba3d 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -237,9 +237,11 @@ extern int bitmap_print_list_to_buf(char *buf, const unsigned long *maskp,
> #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
> #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
>
> +#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
> +
> static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> {
> - unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> + unsigned int len = bitmap_size(nbits);
>
> if (small_const_nbits(nbits))
> *dst = 0;
> @@ -249,7 +251,7 @@ static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
>
> static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
> {
> - unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> + unsigned int len = bitmap_size(nbits);
>
> if (small_const_nbits(nbits))
> *dst = ~0UL;
> @@ -260,7 +262,7 @@ static inline void bitmap_fill(unsigned long *dst, unsigned int nbits)
> static inline void bitmap_copy(unsigned long *dst, const unsigned long *src,
> unsigned int nbits)
> {
> - unsigned int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> + unsigned int len = bitmap_size(nbits);
>
> if (small_const_nbits(nbits))
> *dst = *src;
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index f10fb87d49db..dbdbf1451cad 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -821,7 +821,7 @@ static inline int cpulist_parse(const char *buf, struct cpumask *dstp)
> */
> static inline unsigned int cpumask_size(void)
> {
> - return BITS_TO_LONGS(large_cpumask_bits) * sizeof(long);
> + return bitmap_size(large_cpumask_bits);
> }
>
> /*
> diff --git a/lib/math/prime_numbers.c b/lib/math/prime_numbers.c
> index d42cebf7407f..d3b64b10da1c 100644
> --- a/lib/math/prime_numbers.c
> +++ b/lib/math/prime_numbers.c
> @@ -6,8 +6,6 @@
> #include <linux/prime_numbers.h>
> #include <linux/slab.h>
>
> -#define bitmap_size(nbits) (BITS_TO_LONGS(nbits) * sizeof(unsigned long))
> -
> struct primes {
> struct rcu_head rcu;
> unsigned long last, sz;
> diff --git a/tools/include/linux/bitmap.h b/tools/include/linux/bitmap.h
> index f3566ea0f932..81a2299ace15 100644
> --- a/tools/include/linux/bitmap.h
> +++ b/tools/include/linux/bitmap.h
> @@ -2,6 +2,7 @@
> #ifndef _TOOLS_LINUX_BITMAP_H
> #define _TOOLS_LINUX_BITMAP_H
>
> +#include <linux/align.h>
> #include <string.h>
> #include <linux/bitops.h>
> #include <linux/find.h>
> @@ -25,13 +26,14 @@ bool __bitmap_intersects(const unsigned long *bitmap1,
> #define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
> #define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))
>
> +#define bitmap_size(nbits) (ALIGN(nbits, BITS_PER_LONG) / BITS_PER_BYTE)
> +
> static inline void bitmap_zero(unsigned long *dst, unsigned int nbits)
> {
> if (small_const_nbits(nbits))
> *dst = 0UL;
> else {
> - int len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
> - memset(dst, 0, len);
> + memset(dst, 0, bitmap_size(nbits));
> }
> }
>
> @@ -83,7 +85,7 @@ static inline void bitmap_or(unsigned long *dst, const unsigned long *src1,
> */
> static inline unsigned long *bitmap_zalloc(int nbits)
> {
> - return calloc(1, BITS_TO_LONGS(nbits) * sizeof(unsigned long));
> + return calloc(1, bitmap_size(nbits));
> }
>
> /*
> --
> 2.41.0

2023-10-11 07:27:50

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH 03/14] bitops: let the compiler optimize __assign_bit()

From: Yury Norov <[email protected]>
Date: Mon, 9 Oct 2023 09:18:40 -0700

> On Mon, Oct 09, 2023 at 05:10:15PM +0200, Alexander Lobakin wrote:

[...]

>> -static __always_inline void __assign_bit(long nr, volatile unsigned long *addr,
>> - bool value)
>> -{
>> - if (value)
>> - __set_bit(nr, addr);
>> - else
>> - __clear_bit(nr, addr);
>> -}
>> +#define __assign_bit(nr, addr, value) \
>> + ((value) ? __set_bit(nr, addr) : __clear_bit(nr, addr))
>
> Can you protect nr and addr with braces just as well?
> Can you convert the atomic version too, to keep them synchronized ?

+ for both. I didn't convert assign_bit() as I thought it wouldn't give
any optimization improvements, but yeah, let the compiler decide.

>
>>
>> /**
>> * __ptr_set_bit - Set bit in a pointer's value
>> --
>> 2.41.0

Thanks,
Olek

2023-10-11 07:30:50

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH 05/14] s390/cio: rename bitmap_size() -> idset_bitmap_size()

From: Yury Norov <[email protected]>
Date: Mon, 9 Oct 2023 09:35:20 -0700

> On Mon, Oct 09, 2023 at 05:10:17PM +0200, Alexander Lobakin wrote:
>> bitmap_size() is a pretty generic name and one may want to use it for
>> a generic bitmap API function. At the same time, its logic is not
>> "generic", i.e. it's not just `nbits -> size of bitmap in bytes`
>> converter as it would be expected from its name.
>> Add the prefix 'idset_' used throughout the file where the function
>> resides.
>
> At the first glance, this custom implementation just duplicates the
> generic one that you introduce in the following patch. If so, why
> don't you switch idset to just use generic bitmap_size()?

I didn't want to introduce any semantic changes, but good point, why not.

>
>>
>> Reviewed-by: Przemek Kitszel <[email protected]>
>> Signed-off-by: Alexander Lobakin <[email protected]>

[...]

Thanks,
Olek

2023-10-11 07:38:45

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH 06/14] fs/ntfs3: rename bitmap_size() -> ntfs3_bitmap_size()

From: Yury Norov <[email protected]>
Date: Mon, 9 Oct 2023 09:50:15 -0700

> On Mon, Oct 09, 2023 at 05:10:18PM +0200, Alexander Lobakin wrote:
>> bitmap_size() is a pretty generic name and one may want to use it for

[...]

>> @@ -961,7 +961,7 @@ static inline bool run_is_empty(struct runs_tree *run)
>> }
>>
>> /* NTFS uses quad aligned bitmaps. */
>> -static inline size_t bitmap_size(size_t bits)
>> +static inline size_t ntfs3_bitmap_size(size_t bits)
>> {
>> return ALIGN((bits + 7) >> 3, 8);
>> }
>
> This looks like duplicating BITS_TO_U64(). If so, why not just switch
> to using the macro while you're here?

I thought that this

return BITS_TO_U64(bits) * sizeof(u64);

would give worse optimization, but actually

add/remove: 0/0 grow/shrink: 0/6 up/down: 0/-32 (-32)

Nice, makes sense to do it.

>
>> diff --git a/fs/ntfs3/super.c b/fs/ntfs3/super.c
>> index cfec5e0c7f66..b1fb6efe7084 100644
>> --- a/fs/ntfs3/super.c
>> +++ b/fs/ntfs3/super.c
>> @@ -1285,7 +1285,7 @@ static int ntfs_fill_super(struct super_block *sb, struct fs_context *fc)
>>
>> /* Check bitmap boundary. */
>> tt = sbi->used.bitmap.nbits;
>> - if (inode->i_size < bitmap_size(tt)) {
>> + if (inode->i_size < ntfs3_bitmap_size(tt)) {
>> ntfs_err(sb, "$Bitmap is corrupted.");
>> err = -EINVAL;
>> goto put_inode_out;
>> --
>> 2.41.0

Thanks,
Olek

2023-10-11 09:35:24

by Alexander Lobakin

[permalink] [raw]
Subject: Re: [PATCH 09/14] bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()

From: Yury Norov <[email protected]>
Date: Mon, 9 Oct 2023 09:31:15 -0700

> + Alexander Potapenko <[email protected]>
>
> On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote:
>> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a
>> particular offset. Currently, there are only bitmap_{get,set}_value8()
>> to do that for 8 bits and that's it.
>
> And also a series from Alexander Potapenko, which I really hope will
> get into the -next really soon. It introduces bitmap_read/write which
> can set up to BITS_PER_LONG at once, with no limitations on alignment
> of position and length:
>
> https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb
>
> Can you consider building your series on top of it?

Yeah, I mentioned in the cover letter that I'm aware of it and in fact
it doesn't conflict much, as the functions I'm adding here get optimized
as much as the original bitmap_{get,set}_value8(), while Alexander's
generic helpers are heavier.
I realize lots of calls will be optimized as well due to the offset and
the width being compile-time constants, but not all of them. The idea of
keeping two pairs of helpers initially came from Andy if I understood
him correctly.
What do you think? I can provide some bloat-o-meter stats after
rebasing. And either way, I see no issue in basing this series on top of
Alex' one.

>
>> Instead of introducing a separate pair for u16 and so on, which doesn't
>> scale well, extend the existing functions to be able to pass the wanted
>> value width. Make both offset and width arbitrary, but in order to not
>> over complicate the current logic and keep the helpers as optimized as
>> the current ones, require the width to be a pow-2 value and the offset
>> to be a multiple of the width, while the target piece should not cross
>> a %BITS_PER_LONG boundary and stay within one long.
>> Avoid adjusting all the already existing callsites by defining oneliner
>> wrapper macros named after the former functions. bloat-o-meter shows
>> almost no difference (+1-2 bytes in a couple of places), meaning the

[...]

Thanks,
Olek

2023-10-11 10:37:14

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 09/14] bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()

On Wed, Oct 11, 2023 at 11:33:25AM +0200, Alexander Lobakin wrote:
> From: Yury Norov <[email protected]>
> Date: Mon, 9 Oct 2023 09:31:15 -0700
>
> > + Alexander Potapenko <[email protected]>
> >
> > On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote:
> >> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a
> >> particular offset. Currently, there are only bitmap_{get,set}_value8()
> >> to do that for 8 bits and that's it.
> >
> > And also a series from Alexander Potapenko, which I really hope will
> > get into the -next really soon. It introduces bitmap_read/write which
> > can set up to BITS_PER_LONG at once, with no limitations on alignment
> > of position and length:
> >
> > https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb
> >
> > Can you consider building your series on top of it?
>
> Yeah, I mentioned in the cover letter that I'm aware of it and in fact
> it doesn't conflict much, as the functions I'm adding here get optimized
> as much as the original bitmap_{get,set}_value8(), while Alexander's
> generic helpers are heavier.
> I realize lots of calls will be optimized as well due to the offset and
> the width being compile-time constants, but not all of them. The idea of
> keeping two pairs of helpers initially came from Andy if I understood
> him correctly.

Just a disclaimer: The idea came before I saw the series by Alexander Potapenko.

> What do you think? I can provide some bloat-o-meter stats after
> rebasing. And either way, I see no issue in basing this series on top of
> Alex' one.

--
With Best Regards,
Andy Shevchenko


2023-10-15 02:21:19

by Yury Norov

[permalink] [raw]
Subject: Re: [PATCH 09/14] bitmap: extend bitmap_{get,set}_value8() to bitmap_{get,set}_bits()

On Wed, Oct 11, 2023 at 11:33:25AM +0200, Alexander Lobakin wrote:
> From: Yury Norov <[email protected]>
> Date: Mon, 9 Oct 2023 09:31:15 -0700
>
> > + Alexander Potapenko <[email protected]>
> >
> > On Mon, Oct 09, 2023 at 05:10:21PM +0200, Alexander Lobakin wrote:
> >> Sometimes there's need to get a 8/16/...-bit piece of a bitmap at a
> >> particular offset. Currently, there are only bitmap_{get,set}_value8()
> >> to do that for 8 bits and that's it.
> >
> > And also a series from Alexander Potapenko, which I really hope will
> > get into the -next really soon. It introduces bitmap_read/write which
> > can set up to BITS_PER_LONG at once, with no limitations on alignment
> > of position and length:
> >
> > https://lore.kernel.org/linux-arm-kernel/ZRXbOoKHHafCWQCW@yury-ThinkPad/T/#mc311037494229647088b3a84b9f0d9b50bf227cb
> >
> > Can you consider building your series on top of it?
>
> Yeah, I mentioned in the cover letter that I'm aware of it and in fact
> it doesn't conflict much, as the functions I'm adding here get optimized
> as much as the original bitmap_{get,set}_value8(), while Alexander's
> generic helpers are heavier.
> I realize lots of calls will be optimized as well due to the offset and
> the width being compile-time constants, but not all of them. The idea of
> keeping two pairs of helpers initially came from Andy if I understood
> him correctly.
> What do you think? I can provide some bloat-o-meter stats after
> rebasing. And either way, I see no issue in basing this series on top of
> Alex' one.

You're right, let's try both and see what how worse is one comparing
to another wrt bloat-o-meter and overall code generation. If the
difference is not that terrible, I'd stick to universal and simpler
for users version.

If the difference is significant, we'd have to keep both. Maybe it's
worth to try merge the aligned case into generic one, but it's not the
purpose of your series, of course.

Thanks,
Yury