Currently, when MTE pages are swapped out, the tags are kept in the
memory, occupying 128 bytes per page. This is especially problematic
for devices that use zram-backed in-memory swap, because tags stored
uncompressed in the heap effectively reduce the available amount of
swap memory.
The RLE-based EA0 algorithm suggested by Evgenii Stepanov and
implemented in this patch series is able to efficiently compress
128-byte tag buffers, resulting in practical compression ratio between
2.5x and 20x. In most cases it is possible to store the compressed data
in 63-bit Xarray values, resulting in no extra memory allocations.
Our measurements show that EA0 provides better compression than existing
kernel compression algorithms (LZ4, LZO, LZ4HC, ZSTD) can offer, because
EA0 specifically targets 128-byte buffers.
To implement compression/decompression, we also extend <linux/bitmap.h>
with methods to set/get bit values at arbitrary places in the map.
We refactor arch/arm64/mm/mteswap.c to support both the compressed
(CONFIG_ARM64_MTE_COMP) and non-compressed case. For the former, in
addition to tag compression, we move tag allocation from kmalloc() to
separate kmem caches, providing greater locality and relaxing the
alignment requirements.
v2:
- as suggested by Yuri Norov, replace the poorly implemented struct
bitq with <linux/bitmap.h>
Alexander Potapenko (5):
lib/bitmap: add bitmap_{set,get}_value_unaligned()
lib/test_bitmap: add tests for bitmap_{set,get}_value_unaligned
arm64: mte: implement CONFIG_ARM64_MTE_COMP
arm64: mte: add a test for MTE tags compression
arm64: mte: add compression support to mteswap.c
arch/arm64/Kconfig | 20 ++
arch/arm64/include/asm/mtecomp.h | 60 +++++
arch/arm64/mm/Makefile | 7 +
arch/arm64/mm/mtecomp.c | 412 +++++++++++++++++++++++++++++++
arch/arm64/mm/mteswap.c | 19 +-
arch/arm64/mm/mteswap.h | 12 +
arch/arm64/mm/mteswap_comp.c | 50 ++++
arch/arm64/mm/mteswap_nocomp.c | 37 +++
arch/arm64/mm/test_mtecomp.c | 175 +++++++++++++
include/linux/bitmap.h | 63 +++++
lib/test_bitmap.c | 34 +++
11 files changed, 878 insertions(+), 11 deletions(-)
create mode 100644 arch/arm64/include/asm/mtecomp.h
create mode 100644 arch/arm64/mm/mtecomp.c
create mode 100644 arch/arm64/mm/mteswap.h
create mode 100644 arch/arm64/mm/mteswap_comp.c
create mode 100644 arch/arm64/mm/mteswap_nocomp.c
create mode 100644 arch/arm64/mm/test_mtecomp.c
--
2.41.0.255.g8b1d071c50-goog
The two new functions allow setting/getting values of length up to
BITS_PER_LONG bits at arbitrary position in the bitmap.
Suggested-by: Yury Norov <[email protected]>
Signed-off-by: Alexander Potapenko <[email protected]>
---
include/linux/bitmap.h | 63 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 63 insertions(+)
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1efb..8e36ce07bafd4 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -77,6 +77,8 @@ struct device;
* bitmap_to_arr64(buf, src, nbits) Copy nbits from buf to u64[] dst
* bitmap_get_value8(map, start) Get 8bit value from map at start
* bitmap_set_value8(map, value, start) Set 8bit value to map at start
+ * bitmap_get_value_unaligned(map, start, nbits) Get value up to BITS_PER_LONG
+ * bitmap_set_value_unaligned(map, value, start, nbits) Set value up to BITS_PER_LONG
*
* Note, bitmap_zero() and bitmap_fill() operate over the region of
* unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -583,6 +585,35 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
return (map[index] >> offset) & 0xFF;
}
+/**
+ * bitmap_get_value_unaligned - get an @nbits-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the value; may not be a multiple of 8
+ * @nbits: number of bits to get
+ *
+ * Returns the @nbits-sized value located at the @start bit offset within the
+ * @map memory region.
+ */
+static inline unsigned long bitmap_get_value_unaligned(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned long offset = start % BITS_PER_LONG;
+ const unsigned long carry = (offset + nbits) % BITS_PER_LONG;
+ unsigned long hi, lo, result;
+
+ if (offset + nbits <= BITS_PER_LONG) {
+ result = map[index] >> (BITS_PER_LONG - offset - nbits);
+ return result & BITMAP_LAST_WORD_MASK(nbits);
+ }
+
+ hi = map[index] & BITMAP_LAST_WORD_MASK(BITS_PER_LONG - offset);
+ lo = map[index + 1] & BITMAP_FIRST_WORD_MASK(BITS_PER_LONG - carry);
+ lo >>= (BITS_PER_LONG - carry);
+ return (hi << carry) | lo;
+}
+
/**
* bitmap_set_value8 - set an 8-bit value within a memory region
* @map: address to the bitmap memory region
@@ -599,6 +630,38 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
map[index] |= value << offset;
}
+/**
+ * bitmap_set_value_unaligned - set an @nbits-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: the value up to BITS_PER_LONG bits (will be clamped to @nbits)
+ * @start: bit offset of the value; may not be a multiple of 8
+ * @nbits: number of bits to set
+ */
+static inline void bitmap_set_value_unaligned(unsigned long *map,
+ unsigned long value,
+ unsigned long start,
+ unsigned long nbits)
+{
+ const size_t index = BIT_WORD(start);
+ const unsigned long offset = start % BITS_PER_LONG;
+ unsigned long mask = BITMAP_LAST_WORD_MASK(nbits);
+ const unsigned long carry = (offset + nbits) % BITS_PER_LONG;
+
+ value &= mask;
+ if (offset + nbits <= BITS_PER_LONG) {
+ value <<= (BITS_PER_LONG - offset - nbits);
+ mask <<= (BITS_PER_LONG - offset - nbits);
+ map[index] &= ~mask;
+ map[index] |= value;
+ return;
+ }
+ map[index] &= ~BITMAP_LAST_WORD_MASK(BITS_PER_LONG - offset);
+ map[index] |= (value >> (carry));
+ value &= BITMAP_LAST_WORD_MASK(carry);
+ map[index + 1] &= ~BITMAP_FIRST_WORD_MASK(BITS_PER_LONG - carry);
+ map[index + 1] |= value << (BITS_PER_LONG - carry);
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __LINUX_BITMAP_H */
--
2.41.0.255.g8b1d071c50-goog
Add basic tests ensuring that values can be added at arbitrary positions
of the bitmap, including those spanning into the adjacent unsigned
longs.
Signed-off-by: Alexander Potapenko <[email protected]>
---
lib/test_bitmap.c | 34 ++++++++++++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 187f5b2db4cf1..8ca61f6e7f26e 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -71,6 +71,17 @@ __check_eq_uint(const char *srcfile, unsigned int line,
return true;
}
+static bool __init
+__check_eq_ulong(const char *srcfile, unsigned int line,
+ const unsigned long exp_ulong, unsigned long x)
+{
+ if (exp_ulong != x) {
+ pr_err("[%s:%u] expected %lu, got %lu\n",
+ srcfile, line, exp_ulong, x);
+ return false;
+ }
+ return true;
+}
static bool __init
__check_eq_bitmap(const char *srcfile, unsigned int line,
@@ -186,6 +197,7 @@ __check_eq_str(const char *srcfile, unsigned int line,
})
#define expect_eq_uint(...) __expect_eq(uint, ##__VA_ARGS__)
+#define expect_eq_ulong(...) __expect_eq(ulong, ##__VA_ARGS__)
#define expect_eq_bitmap(...) __expect_eq(bitmap, ##__VA_ARGS__)
#define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
#define expect_eq_u32_array(...) __expect_eq(u32_array, ##__VA_ARGS__)
@@ -1222,6 +1234,26 @@ static void __init test_bitmap_const_eval(void)
BUILD_BUG_ON(~var != ~BIT(25));
}
+static void __init test_set_get_value_unaligned(void)
+{
+ DECLARE_BITMAP(bitmap, BITS_PER_LONG * 2);
+ unsigned long val;
+ int i;
+
+ for (i = 0; i < BITS_PER_LONG * 2 - 7; i++) {
+ bitmap_zero(bitmap, BITS_PER_LONG * 2);
+ bitmap_set_value_unaligned(bitmap, 0b10101UL, i, 5);
+ val = bitmap_get_value_unaligned(bitmap, i, 5);
+ expect_eq_ulong(0b10101UL, val);
+ bitmap_set_value_unaligned(bitmap, 0b101UL, i + 5, 3);
+ val = bitmap_get_value_unaligned(bitmap, i + 5, 3);
+ expect_eq_ulong(0b101UL, val);
+ val = bitmap_get_value_unaligned(bitmap, i, 8);
+ expect_eq_ulong(0b10101101UL, val);
+ }
+}
+
+
static void __init selftest(void)
{
test_zero_clear();
@@ -1249,6 +1281,8 @@ static void __init selftest(void)
test_for_each_clear_bitrange_from();
test_for_each_set_clump8();
test_for_each_set_bit_wrap();
+
+ test_set_get_value_unaligned();
}
KSTM_MODULE_LOADERS(test_bitmap);
--
2.41.0.255.g8b1d071c50-goog
+Cc: William
On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> The two new functions allow setting/getting values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
A couple of years (?) ago it was a series to achieve something like this with
better (?) code. Why not resurrect that one?
https://www.mail-archive.com/[email protected]/msg2195426.html
--
With Best Regards,
Andy Shevchenko
On Thu, Jul 13, 2023 at 7:29 PM Andy Shevchenko
<[email protected]> wrote:
>
> +Cc: William
>
> On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> > The two new functions allow setting/getting values of length up to
> > BITS_PER_LONG bits at arbitrary position in the bitmap.
>
> A couple of years (?) ago it was a series to achieve something like this with
> better (?) code. Why not resurrect that one?
>
> https://www.mail-archive.com/[email protected]/msg2195426.html
It looks more compact thanks to GENMASK, I can cook something based on
the proposed bitmap_{set,get}_value (and change the names if you
prefer the shorter ones).
But I'd better avoid pulling in the rest of that series without a strong need.
> --
> With Best Regards,
> Andy Shevchenko
>
>
--
Alexander Potapenko
Software Engineer
Google Germany GmbH
Erika-Mann-Straße, 33
80636 München
Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
On Thu, Jul 13, 2023 at 08:05:34PM +0200, Alexander Potapenko wrote:
> On Thu, Jul 13, 2023 at 7:29 PM Andy Shevchenko
> <[email protected]> wrote:
> >
> > +Cc: William
> >
> > On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> > > The two new functions allow setting/getting values of length up to
> > > BITS_PER_LONG bits at arbitrary position in the bitmap.
> >
> > A couple of years (?) ago it was a series to achieve something like this with
> > better (?) code. Why not resurrect that one?
> >
> > https://www.mail-archive.com/[email protected]/msg2195426.html
>
> It looks more compact thanks to GENMASK, I can cook something based on
> the proposed bitmap_{set,get}_value (and change the names if you
> prefer the shorter ones).
> But I'd better avoid pulling in the rest of that series without a strong need.
William, what do you think on this?
I'm personally prefer William's version as not only it was published first
it was carefully designed and got a lot of review already. We just hadn't had
the user for it that time.
--
With Best Regards,
Andy Shevchenko
On Fri, Jul 14, 2023 at 11:04:16AM +0300, Andy Shevchenko wrote:
> On Thu, Jul 13, 2023 at 08:05:34PM +0200, Alexander Potapenko wrote:
> > On Thu, Jul 13, 2023 at 7:29 PM Andy Shevchenko
> > <[email protected]> wrote:
> > >
> > > +Cc: William
> > >
> > > On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> > > > The two new functions allow setting/getting values of length up to
> > > > BITS_PER_LONG bits at arbitrary position in the bitmap.
> > >
> > > A couple of years (?) ago it was a series to achieve something like this with
> > > better (?) code. Why not resurrect that one?
> > >
> > > https://www.mail-archive.com/[email protected]/msg2195426.html
> >
> > It looks more compact thanks to GENMASK, I can cook something based on
> > the proposed bitmap_{set,get}_value (and change the names if you
> > prefer the shorter ones).
> > But I'd better avoid pulling in the rest of that series without a strong need.
>
> William, what do you think on this?
>
> I'm personally prefer William's version as not only it was published first
> it was carefully designed and got a lot of review already. We just hadn't had
> the user for it that time.
>
> --
> With Best Regards,
> Andy Shevchenko
Yes, that version went through several revisions so it's been well
tested and known to work -- as you pointed out it just lacked the users
to warrant merging it into the tree. If it statisfies the use-case
required here now, then I think we should it pick it up rather than
reinvent the solution again.
Also, we probably don't need the "clump" code in there, so perhaps
splitting it out to just the bitmap_{set,get}_value relevant code is
fine.
William Breathitt Gray
On Fri, Jul 14, 2023 at 07:19:15AM -0400, William Breathitt Gray wrote:
> On Fri, Jul 14, 2023 at 11:04:16AM +0300, Andy Shevchenko wrote:
> > On Thu, Jul 13, 2023 at 08:05:34PM +0200, Alexander Potapenko wrote:
> > > On Thu, Jul 13, 2023 at 7:29 PM Andy Shevchenko
> > > <[email protected]> wrote:
> > > > On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> > > > > The two new functions allow setting/getting values of length up to
> > > > > BITS_PER_LONG bits at arbitrary position in the bitmap.
> > > >
> > > > A couple of years (?) ago it was a series to achieve something like this with
> > > > better (?) code. Why not resurrect that one?
> > > >
> > > > https://www.mail-archive.com/[email protected]/msg2195426.html
> > >
> > > It looks more compact thanks to GENMASK, I can cook something based on
> > > the proposed bitmap_{set,get}_value (and change the names if you
> > > prefer the shorter ones).
> > > But I'd better avoid pulling in the rest of that series without a strong need.
> >
> > William, what do you think on this?
> >
> > I'm personally prefer William's version as not only it was published first
> > it was carefully designed and got a lot of review already. We just hadn't had
> > the user for it that time.
>
> Yes, that version went through several revisions so it's been well
> tested and known to work -- as you pointed out it just lacked the users
> to warrant merging it into the tree. If it statisfies the use-case
> required here now, then I think we should it pick it up rather than
> reinvent the solution again.
>
> Also, we probably don't need the "clump" code in there, so perhaps
> splitting it out to just the bitmap_{set,get}_value relevant code is
> fine.
Agree, thank you for your comments!
--
With Best Regards,
Andy Shevchenko
On Fri, Jul 14, 2023 at 1:28 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Fri, Jul 14, 2023 at 07:19:15AM -0400, William Breathitt Gray wrote:
> > On Fri, Jul 14, 2023 at 11:04:16AM +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 13, 2023 at 08:05:34PM +0200, Alexander Potapenko wrote:
> > > > On Thu, Jul 13, 2023 at 7:29 PM Andy Shevchenko
> > > > <[email protected]> wrote:
> > > > > On Thu, Jul 13, 2023 at 02:57:01PM +0200, Alexander Potapenko wrote:
> > > > > > The two new functions allow setting/getting values of length up to
> > > > > > BITS_PER_LONG bits at arbitrary position in the bitmap.
> > > > >
> > > > > A couple of years (?) ago it was a series to achieve something like this with
> > > > > better (?) code. Why not resurrect that one?
> > > > >
> > > > > https://www.mail-archive.com/[email protected]/msg2195426.html
> > > >
> > > > It looks more compact thanks to GENMASK, I can cook something based on
> > > > the proposed bitmap_{set,get}_value (and change the names if you
> > > > prefer the shorter ones).
> > > > But I'd better avoid pulling in the rest of that series without a strong need.
> > >
> > > William, what do you think on this?
> > >
> > > I'm personally prefer William's version as not only it was published first
> > > it was carefully designed and got a lot of review already. We just hadn't had
> > > the user for it that time.
> >
> > Yes, that version went through several revisions so it's been well
> > tested and known to work -- as you pointed out it just lacked the users
> > to warrant merging it into the tree. If it statisfies the use-case
> > required here now, then I think we should it pick it up rather than
> > reinvent the solution again.
> >
> > Also, we probably don't need the "clump" code in there, so perhaps
> > splitting it out to just the bitmap_{set,get}_value relevant code is
> > fine.
>
> Agree, thank you for your comments!
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
So would it be fine if I split off bitmap_set_value() and
bitmap_get_value() from that series and send it (with the appropriate
attribution) instead of my patch 1/5?
We'll probably still need to retain patch 2/5 (with the function names changed).
On Fri, Jul 14, 2023 at 02:07:45PM +0200, Alexander Potapenko wrote:
> On Fri, Jul 14, 2023 at 1:28 PM Andy Shevchenko
> <[email protected]> wrote:
> > On Fri, Jul 14, 2023 at 07:19:15AM -0400, William Breathitt Gray wrote:
> > > On Fri, Jul 14, 2023 at 11:04:16AM +0300, Andy Shevchenko wrote:
> > > > On Thu, Jul 13, 2023 at 08:05:34PM +0200, Alexander Potapenko wrote:
...
> > > > William, what do you think on this?
> > > >
> > > > I'm personally prefer William's version as not only it was published first
> > > > it was carefully designed and got a lot of review already. We just hadn't had
> > > > the user for it that time.
> > >
> > > Yes, that version went through several revisions so it's been well
> > > tested and known to work -- as you pointed out it just lacked the users
> > > to warrant merging it into the tree. If it statisfies the use-case
> > > required here now, then I think we should it pick it up rather than
> > > reinvent the solution again.
> > >
> > > Also, we probably don't need the "clump" code in there, so perhaps
> > > splitting it out to just the bitmap_{set,get}_value relevant code is
> > > fine.
> >
> > Agree, thank you for your comments!
> So would it be fine if I split off bitmap_set_value() and
> bitmap_get_value() from that series and send it (with the appropriate
> attribution) instead of my patch 1/5?
> We'll probably still need to retain patch 2/5 (with the function names
> changed).
Sounds good to me.
--
With Best Regards,
Andy Shevchenko