2023-03-29 03:04:46

by Qun-wei Lin (林群崴)

Subject: [BUG] Userspace MTE error with allocation tag 0 when low on memory

Hi,

We are seeing a large number of MTE errors on Android T with kernel-6.1.

When the system is under memory pressure, MTE often triggers error
reports in userspace.

As in the tombstone below, many of the reports show an allocation tag
of 0:

Build fingerprint: 'alps/vext_k6897v1_64/k6897v1_64:13/TP1A.220624.014/mp2ofp23:userdebug/dev-keys'
Revision: '0'
ABI: 'arm64'
Timestamp: 2023-03-14 06:39:40.344251744+0800
Process uptime: 0s
Cmdline: /vendor/bin/hw/camerahalserver
pid: 988, tid: 1395, name: binder:988_3 >>> /vendor/bin/hw/camerahalserver <<<
uid: 1047
tagged_addr_ctrl: 000000000007fff3 (PR_TAGGED_ADDR_ENABLE, PR_MTE_TCF_SYNC, mask 0xfffe)
signal 11 (SIGSEGV), code 9 (SEGV_MTESERR), fault addr 0x0d000075f1d8d7f0
x0  00000075018d3fb0 x1  00000000c0306201 x2  00000075018d3ae8 x3  000000000000720c
x4  0000000000000000 x5  0000000000000000 x6  00000642000004fe x7  0000054600000630
x8  00000000fffffff2 x9  b34a1094e7e33c3f x10 00000075018d3a80 x11 00000075018d3a50
x12 ffffff80ffffffd0 x13 0000061e0000072c x14 0000000000000004 x15 0000000000000000
x16 00000077f2dfcd78 x17 00000077da3a8ff0 x18 00000075011bc000 x19 0d000075f1d8d898
x20 0d000075f1d8d7f0 x21 0d000075f1d8d910 x22 0000000000000000 x23 00000000fffffff7
x24 00000075018d4000 x25 0000000000000000 x26 00000075018d3ff8 x27 00000000000fc000
x28 00000000000fe000 x29 00000075018d3b20
lr  00000077f2d9f164 sp  00000075018d3ad0 pc  00000077f2d9f134 pst 0000000080001000

backtrace:
#00 pc 000000000005d134 /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+244) (BuildId: 8b5612259e4a42521c430456ec5939c7)
#01 pc 000000000005d448 /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+24) (BuildId: 8b5612259e4a42521c430456ec5939c7)
#02 pc 000000000005dd64 /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+68) (BuildId: 8b5612259e4a42521c430456ec5939c7)
#03 pc 000000000008dba8 /system/lib64/libbinder.so (android::PoolThread::threadLoop()+24) (BuildId: 8b5612259e4a42521c430456ec5939c7)
#04 pc 0000000000013440 /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+416) (BuildId: 10aac5d4a671e4110bc00c9b69d83d8a)
#05 pc 00000000000c14cc /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204) (BuildId: 718ecc04753b519b0f6289a7a2fcf117)
#06 pc 0000000000054930 /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 718ecc04753b519b0f6289a7a2fcf117)

Memory tags around the fault address (0xd000075f1d8d7f0), one tag per 16 bytes:
0x75f1d8cf00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d000: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d100: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d200: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d300: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d400: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d500: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d600: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
=>0x75f1d8d700: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 [0]
0x75f1d8d800: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8d900: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8da00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8db00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8dc00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8dd00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x75f1d8de00: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The same errors also show up in coredumps.

This problem only occurs when ZRAM is enabled, so we suspect an issue
in the swap-in/swap-out path.

Having compared Kernel-5.15 with Kernel-6.1, we found that the order of
swap_free() and set_pte_at() has changed in do_swap_page().

On a swap-in fault, do_swap_page() now calls swap_free() first:
do_swap_page() -> swap_free() -> __swap_entry_free() ->
free_swap_slot() -> swapcache_free_entries() -> swap_entry_free() ->
swap_range_free() -> arch_swap_invalidate_page() ->
mte_invalidate_tags_area() -> mte_invalidate_tags() -> xa_erase()

and then calls set_pte_at():
do_swap_page() -> set_pte_at() -> __set_pte_at() -> mte_sync_tags() ->
mte_sync_page_tags() -> mte_restore_tags() -> xa_load()

This means the swap slot is invalidated before the PTE is mapped, so
the MTE tags stored in the XArray are released before they can be
restored.

After I moved swap_free() to just after set_pte_at(), the problem
disappeared.

We suspect that the following patches, which changed the order, did
not account for MTE tag restoration in the page fault path:
https://lore.kernel.org/all/[email protected]/

Any suggestions are appreciated.

Thank you.


2023-03-29 16:03:55

by Andrey Konovalov

Subject: Re: [BUG] Userspace MTE error with allocation tag 0 when low on memory

On Wed, Mar 29, 2023 at 4:56 AM 'Qun-wei Lin (林群崴)' via kasan-dev
<[email protected]> wrote:
> <snip>

+Peter

2023-03-29 17:08:27

by Catalin Marinas

Subject: Re: [BUG] Userspace MTE error with allocation tag 0 when low on memory

+ Steven Price who added the MTE swap support.

On Wed, Mar 29, 2023 at 02:55:49AM +0000, Qun-wei Lin (林群崴) wrote:
> <snip>

2023-03-30 14:06:07

by Steven Price

Subject: Re: [BUG] Userspace MTE error with allocation tag 0 when low on memory

On 29/03/2023 17:54, Catalin Marinas wrote:
> + Steven Price who added the MTE swap support.
>
> On Wed, Mar 29, 2023 at 02:55:49AM +0000, Qun-wei Lin (林群崴) wrote:

<snip>

>>
>> Having compared Kernel-5.15 with Kernel-6.1, we found that the order of
>> swap_free() and set_pte_at() has changed in do_swap_page().
>>
>> On a swap-in fault, do_swap_page() now calls swap_free() first:
>> do_swap_page() -> swap_free() -> __swap_entry_free() ->
>> free_swap_slot() -> swapcache_free_entries() -> swap_entry_free() ->
>> swap_range_free() -> arch_swap_invalidate_page() ->
>> mte_invalidate_tags_area() -> mte_invalidate_tags() -> xa_erase()
>>
>> and then calls set_pte_at():
>> do_swap_page() -> set_pte_at() -> __set_pte_at() -> mte_sync_tags() ->
>> mte_sync_page_tags() -> mte_restore_tags() -> xa_load()
>>
>> This means the swap slot is invalidated before the PTE is mapped, so
>> the MTE tags stored in the XArray are released before they can be
>> restored.

This analysis looks correct to me. The MTE swap code works on the
assumption that the set_pte_at() will restore the tags to the page
before the swap entry is removed. The reordering which has happened
since has broken this assumption and as you observed can cause the tags
to be unavailable by the time set_pte_at() is called.

>> After I moved swap_free() to just after set_pte_at(), the problem
>> disappeared.
>>
>> We suspect that the following patches, which have changed the order, do
>> not consider the mte tag restoring in page fault flow:
>> https://lore.kernel.org/all/[email protected]/

I'm not sure I entirely follow the reasoning in this patch, so I'm not
sure whether it's safe to just move swap_free() down to below
set_pte_at() or if that reintroduces the information leak.

I also wonder if sparc has a similar issue, as the arch_do_swap_page()
callback is located next to set_pte_at().

>> Any suggestion is appreciated.

The other possibility is to add a(nother) callback for MTE in
arch_do_swap_page() that calls mte_restore_tags() on the page before
the swap_free() call rather than depending on the hook in set_pte_at().

Steve

2023-03-30 17:57:58

by Catalin Marinas

Subject: Re: [BUG] Userspace MTE error with allocation tag 0 when low on memory

On Thu, Mar 30, 2023 at 02:56:50PM +0100, Steven Price wrote:
> > On Wed, Mar 29, 2023 at 02:55:49AM +0000, Qun-wei Lin (林群崴) wrote:
> >> Having compared Kernel-5.15 with Kernel-6.1, we found that the order of
> >> swap_free() and set_pte_at() has changed in do_swap_page().
> >>
> >> On a swap-in fault, do_swap_page() now calls swap_free() first:
> >> do_swap_page() -> swap_free() -> __swap_entry_free() ->
> >> free_swap_slot() -> swapcache_free_entries() -> swap_entry_free() ->
> >> swap_range_free() -> arch_swap_invalidate_page() ->
> >> mte_invalidate_tags_area() -> mte_invalidate_tags() -> xa_erase()
> >>
> >> and then calls set_pte_at():
> >> do_swap_page() -> set_pte_at() -> __set_pte_at() -> mte_sync_tags() ->
> >> mte_sync_page_tags() -> mte_restore_tags() -> xa_load()
> >>
> >> This means the swap slot is invalidated before the PTE is mapped, so
> >> the MTE tags stored in the XArray are released before they can be
> >> restored.
>
> This analysis looks correct to me. The MTE swap code works on the
> assumption that the set_pte_at() will restore the tags to the page
> before the swap entry is removed. The reordering which has happened
> since has broken this assumption and as you observed can cause the tags
> to be unavailable by the time set_pte_at() is called.
>
> >> After I moved swap_free() to just after set_pte_at(), the problem
> >> disappeared.
> >>
> >> We suspect that the following patches, which have changed the order, do
> >> not consider the mte tag restoring in page fault flow:
> >> https://lore.kernel.org/all/[email protected]/
>
> I'm not sure I entirely follow the reasoning in this patch, so I'm not
> sure whether it's safe to just move swap_free() down to below
> set_pte_at() or if that reintroduces the information leak.
>
> I also wonder if sparc has a similar issue, as the arch_do_swap_page()
> callback is located next to set_pte_at().

SPARC has a potential race here, since the page is made visible to the
user before the tags are restored (I raised this before). But even
ignoring this small window, arch_do_swap_page() needs to have the
metadata available.

> >> Any suggestion is appreciated.
>
> The other possibility is to add a(nother) callback for MTE in
> arch_do_swap_page() that calls mte_restore_tags() on the page before
> the swap_free() call rather than depending on the hook in set_pte_at().

I think we should move arch_do_swap_page() earlier, before swap_free(),
and on arm64 copy the tags to pte_page(pte) there. I don't think SPARC
would have any issues with this change (and it also fixes their race).

--
Catalin