2022-03-27 20:55:09

by Iwashima, Kuniyuki

Subject: [PATCH] list: Fix another data-race around ep->rdllist.

syzbot reported another data race around ep->rdllist. ep_poll() calls
list_empty_careful() locklessly to check whether the list is empty
by testing rdllist->prev == rdllist->next.

When the list does not have any nodes, the next and prev arguments of
__list_add() are the same head pointer. Thus the write to head->prev
there races with the lockless list_empty_careful() and needs WRITE_ONCE()
to avoid store-tearing.

Note that the reader side is already fixed in the patch [0].

[0]: https://lore.kernel.org/mm-commits/[email protected]/
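
For illustration only, here is a user-space sketch of the empty-list case
described above. It is not the kernel code: the WRITE_ONCE()/READ_ONCE()
macros are simplified volatile-cast stand-ins (they rely on the GCC/Clang
__typeof__ extension), and the names init_list_head(), list_add_between(),
list_add_tail_sketch(), list_empty_careful_sketch() and
demo_add_to_empty_list() are hypothetical. It runs single-threaded, so it
only shows which fields the writer and the lockless reader both touch; it
does not reproduce the race itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros (assumption: a volatile
 * access is enough to show the intent of a single, untorn access). */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	WRITE_ONCE(head->next, head);
	WRITE_ONCE(head->prev, head);
}

/* Insertion using the store order from the patch: both stores that
 * modify the neighbouring nodes go through WRITE_ONCE(). */
static void list_add_between(struct list_head *new,
			     struct list_head *prev,
			     struct list_head *next)
{
	new->next = next;
	new->prev = prev;
	WRITE_ONCE(prev->next, new);
	WRITE_ONCE(next->prev, new);
}

static void list_add_tail_sketch(struct list_head *new,
				 struct list_head *head)
{
	list_add_between(new, head->prev, head);
}

/* Lockless emptiness check in the spirit of list_empty_careful(). */
static bool list_empty_careful_sketch(struct list_head *head)
{
	struct list_head *next = READ_ONCE(head->next);

	return next == head && next == READ_ONCE(head->prev);
}

static int demo_add_to_empty_list(void)
{
	struct list_head head, node;

	init_list_head(&head);
	assert(list_empty_careful_sketch(&head));

	/* Empty list: prev and next are both &head, so the insertion
	 * stores to head.next and head.prev, the two fields the
	 * lockless reader loads. */
	list_add_tail_sketch(&node, &head);

	assert(head.next == &node && head.prev == &node);
	assert(!list_empty_careful_sketch(&head));
	return 0;
}
```

Calling demo_add_to_empty_list() exercises the path in which head->prev is
written by the insertion and read by the emptiness check.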

BUG: KCSAN: data-race in do_epoll_ctl / do_epoll_wait

write to 0xffff888103e43058 of 8 bytes by task 1799 on cpu 0:
__list_add include/linux/list.h:72 [inline]
list_add_tail include/linux/list.h:102 [inline]
ep_insert fs/eventpoll.c:1542 [inline]
do_epoll_ctl+0x1331/0x1880 fs/eventpoll.c:2141
__do_sys_epoll_ctl fs/eventpoll.c:2192 [inline]
__se_sys_epoll_ctl fs/eventpoll.c:2183 [inline]
__x64_sys_epoll_ctl+0xc2/0xf0 fs/eventpoll.c:2183
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff888103e43058 of 8 bytes by task 1802 on cpu 1:
list_empty_careful include/linux/list.h:329 [inline]
ep_events_available fs/eventpoll.c:381 [inline]
ep_poll fs/eventpoll.c:1797 [inline]
do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234
do_epoll_pwait fs/eventpoll.c:2268 [inline]
__do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
__se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
__x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0xffff888103e43050 -> 0xffff88812d515498

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 1802 Comm: syz-fuzzer Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout")
Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
Fixes: bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.")
Reported-by: [email protected]
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
CC: Soheil Hassas Yeganeh <[email protected]>
CC: Davidlohr Bueso <[email protected]>
CC: Sridhar Samudrala <[email protected]>
CC: Alexander Duyck <[email protected]>
---
include/linux/list.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index dd6c2041d..2eaadc84a 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -69,10 +69,10 @@ static inline void __list_add(struct list_head *new,
 	if (!__list_add_valid(new, prev, next))
 		return;
 
-	next->prev = new;
 	new->next = next;
 	new->prev = prev;
 	WRITE_ONCE(prev->next, new);
+	WRITE_ONCE(next->prev, new);
 }

/**
--
2.30.2


2022-03-28 04:29:35

by Davidlohr Bueso

Subject: Re: [PATCH] list: Fix another data-race around ep->rdllist.

On Sat, 26 Mar 2022, Kuniyuki Iwashima wrote:

>syzbot had reported another race around ep->rdllist. ep_poll() calls
>list_empty_careful() locklessly to check if the list is empty or not
>by testing rdllist->prev == rdllist->next.
>
>When the list does not have any nodes, the next and prev arguments of
>__list_add() are the same head pointer. Thus the write to head->prev
>there is racy with lockless list_empty_careful() and needs WRITE_ONCE()
>to avoid store-tearing.
>
>Note that the reader side is already fixed in the patch [0].
>
>[0]: https://lore.kernel.org/mm-commits/[email protected]/
>
>BUG: KCSAN: data-race in do_epoll_ctl / do_epoll_wait

I think this needs to be part of the same list-fix-a-data-race-around-ep-rdllist.patch

Thanks,
Davidlohr

2022-03-28 06:42:09

by Soheil Hassas Yeganeh

Subject: Re: [PATCH] list: Fix another data-race around ep->rdllist.

On Sat, Mar 26, 2022 at 2:36 AM Kuniyuki Iwashima <[email protected]> wrote:
>
> syzbot had reported another race around ep->rdllist. ep_poll() calls
> list_empty_careful() locklessly to check if the list is empty or not
> by testing rdllist->prev == rdllist->next.
>
> When the list does not have any nodes, the next and prev arguments of
> __list_add() are the same head pointer. Thus the write to head->prev
> there is racy with lockless list_empty_careful() and needs WRITE_ONCE()
> to avoid store-tearing.
>
> Note that the reader side is already fixed in the patch [0].
>
> [0]: https://lore.kernel.org/mm-commits/[email protected]/
>
> BUG: KCSAN: data-race in do_epoll_ctl / do_epoll_wait
>
> write to 0xffff888103e43058 of 8 bytes by task 1799 on cpu 0:
> __list_add include/linux/list.h:72 [inline]
> list_add_tail include/linux/list.h:102 [inline]
> ep_insert fs/eventpoll.c:1542 [inline]
> do_epoll_ctl+0x1331/0x1880 fs/eventpoll.c:2141
> __do_sys_epoll_ctl fs/eventpoll.c:2192 [inline]
> __se_sys_epoll_ctl fs/eventpoll.c:2183 [inline]
> __x64_sys_epoll_ctl+0xc2/0xf0 fs/eventpoll.c:2183
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> read to 0xffff888103e43058 of 8 bytes by task 1802 on cpu 1:
> list_empty_careful include/linux/list.h:329 [inline]
> ep_events_available fs/eventpoll.c:381 [inline]
> ep_poll fs/eventpoll.c:1797 [inline]
> do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234
> do_epoll_pwait fs/eventpoll.c:2268 [inline]
> __do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
> __se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
> __x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> value changed: 0xffff888103e43050 -> 0xffff88812d515498
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 1 PID: 1802 Comm: syz-fuzzer Not tainted 5.17.0-rc8-syzkaller-00003-g56e337f2cf13-dirty #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>
> Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout")
> Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
> Fixes: bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.")
> Reported-by: [email protected]
> Signed-off-by: Kuniyuki Iwashima <[email protected]>
> ---
> CC: Soheil Hassas Yeganeh <[email protected]>
> CC: Davidlohr Bueso <[email protected]>
> CC: Sridhar Samudrala <[email protected]>
> CC: Alexander Duyck <[email protected]>

Acked-by: Soheil Hassas Yeganeh <[email protected]>

Thank you for the fix!

> ---
> include/linux/list.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/list.h b/include/linux/list.h
> index dd6c2041d..2eaadc84a 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -69,10 +69,10 @@ static inline void __list_add(struct list_head *new,
>  	if (!__list_add_valid(new, prev, next))
>  		return;
>
> -	next->prev = new;
>  	new->next = next;
>  	new->prev = prev;
>  	WRITE_ONCE(prev->next, new);
> +	WRITE_ONCE(next->prev, new);
>  }
>
> /**
> --
> 2.30.2
>