2023-07-28 08:53:57

by Tomas Glozar

[permalink] [raw]
Subject: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

From: Tomas Glozar <[email protected]>

Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
patchset notes for details).

This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
BUG (sleeping function called from invalid context) on RT kernels.

preempt_disable was introduced together with lock_sk and rcu_read_lock
in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
in parallel"), probably to match disabled migration of BPF programs, and
is no longer necessary.

Remove preempt_disable to fix BUG in sock_map_update_common on RT.

Signed-off-by: Tomas Glozar <[email protected]>
---
net/core/sock_map.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/net/core/sock_map.c b/net/core/sock_map.c
index 19538d628714..08ab108206bf 100644
--- a/net/core/sock_map.c
+++ b/net/core/sock_map.c
@@ -115,7 +115,6 @@ static void sock_map_sk_acquire(struct sock *sk)
__acquires(&sk->sk_lock.slock)
{
lock_sock(sk);
- preempt_disable();
rcu_read_lock();
}

@@ -123,7 +122,6 @@ static void sock_map_sk_release(struct sock *sk)
__releases(&sk->sk_lock.slock)
{
rcu_read_unlock();
- preempt_enable();
release_sock(sk);
}

--
2.39.3



2023-07-28 12:06:50

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

On Fri, Jul 28, 2023 at 08:44:11AM +0200, [email protected] wrote:
> From: Tomas Glozar <[email protected]>
>
> Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> patchset notes for details).
>
> This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
> BUG (sleeping function called from invalid context) on RT kernels.
>
> preempt_disable was introduced together with lock_sk and rcu_read_lock
> in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
> in parallel"), probably to match disabled migration of BPF programs, and
> is no longer necessary.
>
> Remove preempt_disable to fix BUG in sock_map_update_common on RT.

FYI, I'm not sure it's related but I started to see following splat recently:

[ 189.360689][ T658] =============================
[ 189.361149][ T658] [ BUG: Invalid wait context ]
[ 189.361588][ T658] 6.5.0-rc2+ #589 Tainted: G OE
[ 189.362174][ T658] -----------------------------
[ 189.362660][ T658] test_progs/658 is trying to lock:
[ 189.363176][ T658] ffff8881702652b8 (&psock->link_lock){....}-{3:3}, at: sock_map_update_common+0x1c4/0x340
[ 189.364152][ T658] other info that might help us debug this:
[ 189.364689][ T658] context-{5:5}
[ 189.365021][ T658] 3 locks held by test_progs/658:
[ 189.365508][ T658] #0: ffff888177611a80 (sk_lock-AF_INET){+.+.}-{0:0}, at: sock_map_update_elem_sys+0x82/0x260
[ 189.366503][ T658] #1: ffffffff835a3180 (rcu_read_lock){....}-{1:3}, at: sock_map_update_elem_sys+0x78/0x260
[ 189.367470][ T658] #2: ffff88816cf19240 (&stab->lock){+...}-{2:2}, at: sock_map_update_common+0x12a/0x340
[ 189.368420][ T658] stack backtrace:
[ 189.368806][ T658] CPU: 0 PID: 658 Comm: test_progs Tainted: G OE 6.5.0-rc2+ #589 98af30b3c42d747b51da05f1d0e4899e394be6c9
[ 189.369889][ T658] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
[ 189.370736][ T658] Call Trace:
[ 189.371063][ T658] <TASK>
[ 189.371365][ T658] dump_stack_lvl+0xb2/0x120
[ 189.371798][ T658] __lock_acquire+0x9ad/0x2470
[ 189.372243][ T658] ? lock_acquire+0x104/0x350
[ 189.372680][ T658] lock_acquire+0x104/0x350
[ 189.373104][ T658] ? sock_map_update_common+0x1c4/0x340
[ 189.373615][ T658] ? find_held_lock+0x32/0x90
[ 189.374074][ T658] ? sock_map_update_common+0x12a/0x340
[ 189.374587][ T658] _raw_spin_lock_bh+0x38/0x80
[ 189.375060][ T658] ? sock_map_update_common+0x1c4/0x340
[ 189.375571][ T658] sock_map_update_common+0x1c4/0x340
[ 189.376118][ T658] sock_map_update_elem_sys+0x184/0x260
[ 189.376704][ T658] __sys_bpf+0x181f/0x2840
[ 189.377147][ T658] __x64_sys_bpf+0x1a/0x30
[ 189.377556][ T658] do_syscall_64+0x38/0x90
[ 189.377980][ T658] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 189.378473][ T658] RIP: 0033:0x7fe52f47ab5d

the patch did not help with that

jirka

>
> Signed-off-by: Tomas Glozar <[email protected]>
> ---
> net/core/sock_map.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
> index 19538d628714..08ab108206bf 100644
> --- a/net/core/sock_map.c
> +++ b/net/core/sock_map.c
> @@ -115,7 +115,6 @@ static void sock_map_sk_acquire(struct sock *sk)
> __acquires(&sk->sk_lock.slock)
> {
> lock_sock(sk);
> - preempt_disable();
> rcu_read_lock();
> }
>
> @@ -123,7 +122,6 @@ static void sock_map_sk_release(struct sock *sk)
> __releases(&sk->sk_lock.slock)
> {
> rcu_read_unlock();
> - preempt_enable();
> release_sock(sk);
> }
>
> --
> 2.39.3
>
>

2023-07-28 17:13:01

by Yonghong Song

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire



On 7/28/23 4:48 AM, Jiri Olsa wrote:
> On Fri, Jul 28, 2023 at 08:44:11AM +0200, [email protected] wrote:
>> From: Tomas Glozar <[email protected]>
>>
>> Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
>> allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
>> GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
>> patchset notes for details).
>>
>> This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
>> BUG (sleeping function called from invalid context) on RT kernels.
>>
>> preempt_disable was introduced together with lock_sk and rcu_read_lock
>> in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
>> in parallel"), probably to match disabled migration of BPF programs, and
>> is no longer necessary.
>>
>> Remove preempt_disable to fix BUG in sock_map_update_common on RT.
>
> FYI, I'm not sure it's related but I started to see following splat recently:
>
> [ 189.360689][ T658] =============================
> [ 189.361149][ T658] [ BUG: Invalid wait context ]
> [ 189.361588][ T658] 6.5.0-rc2+ #589 Tainted: G OE
> [ 189.362174][ T658] -----------------------------
> [ 189.362660][ T658] test_progs/658 is trying to lock:
> [ 189.363176][ T658] ffff8881702652b8 (&psock->link_lock){....}-{3:3}, at: sock_map_update_common+0x1c4/0x340
> [ 189.364152][ T658] other info that might help us debug this:
> [ 189.364689][ T658] context-{5:5}
> [ 189.365021][ T658] 3 locks held by test_progs/658:
> [ 189.365508][ T658] #0: ffff888177611a80 (sk_lock-AF_INET){+.+.}-{0:0}, at: sock_map_update_elem_sys+0x82/0x260
> [ 189.366503][ T658] #1: ffffffff835a3180 (rcu_read_lock){....}-{1:3}, at: sock_map_update_elem_sys+0x78/0x260
> [ 189.367470][ T658] #2: ffff88816cf19240 (&stab->lock){+...}-{2:2}, at: sock_map_update_common+0x12a/0x340
> [ 189.368420][ T658] stack backtrace:
> [ 189.368806][ T658] CPU: 0 PID: 658 Comm: test_progs Tainted: G OE 6.5.0-rc2+ #589 98af30b3c42d747b51da05f1d0e4899e394be6c9
> [ 189.369889][ T658] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
> [ 189.370736][ T658] Call Trace:
> [ 189.371063][ T658] <TASK>
> [ 189.371365][ T658] dump_stack_lvl+0xb2/0x120
> [ 189.371798][ T658] __lock_acquire+0x9ad/0x2470
> [ 189.372243][ T658] ? lock_acquire+0x104/0x350
> [ 189.372680][ T658] lock_acquire+0x104/0x350
> [ 189.373104][ T658] ? sock_map_update_common+0x1c4/0x340
> [ 189.373615][ T658] ? find_held_lock+0x32/0x90
> [ 189.374074][ T658] ? sock_map_update_common+0x12a/0x340
> [ 189.374587][ T658] _raw_spin_lock_bh+0x38/0x80
> [ 189.375060][ T658] ? sock_map_update_common+0x1c4/0x340
> [ 189.375571][ T658] sock_map_update_common+0x1c4/0x340
> [ 189.376118][ T658] sock_map_update_elem_sys+0x184/0x260
> [ 189.376704][ T658] __sys_bpf+0x181f/0x2840
> [ 189.377147][ T658] __x64_sys_bpf+0x1a/0x30
> [ 189.377556][ T658] do_syscall_64+0x38/0x90
> [ 189.377980][ T658] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [ 189.378473][ T658] RIP: 0033:0x7fe52f47ab5d
>
> the patch did not help with that

I think the above splat is not related to this patch. In function
sock_map_update_common func we have
raw_spin_lock_bh(&stab->lock);

sock_map_add_link(psock, link, map, &stab->sks[idx]);
spin_lock_bh(&psock->link_lock);
...
spin_unlock_bh(&psock->link_lock);

raw_spin_unlock_bh(&stab->lock);

I think you probably have CONFIG_PROVE_RAW_LOCK_NESTING turned on
in your config.

In the above case, for RT kernel, spin_lock_bh will become
'mutex' and it is sleepable, while raw_spin_lock_bh remains
to be a spin lock. The warning is about potential
locking violation with RT kernel.

To fix the issue, you can convert spin_lock_bh to raw_spin_lock_bh
to silence the warning.

>
> jirka
>
>>
>> Signed-off-by: Tomas Glozar <[email protected]>
>> ---
>> net/core/sock_map.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/net/core/sock_map.c b/net/core/sock_map.c
>> index 19538d628714..08ab108206bf 100644
>> --- a/net/core/sock_map.c
>> +++ b/net/core/sock_map.c
>> @@ -115,7 +115,6 @@ static void sock_map_sk_acquire(struct sock *sk)
>> __acquires(&sk->sk_lock.slock)
>> {
>> lock_sock(sk);
>> - preempt_disable();
>> rcu_read_lock();
>> }
>>
>> @@ -123,7 +122,6 @@ static void sock_map_sk_release(struct sock *sk)
>> __releases(&sk->sk_lock.slock)
>> {
>> rcu_read_unlock();
>> - preempt_enable();
>> release_sock(sk);
>> }
>>
>> --
>> 2.39.3
>>
>>
>

2023-07-31 10:20:30

by Jakub Sitnicki

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire


On Fri, Jul 28, 2023 at 08:44 AM +02, [email protected] wrote:
> From: Tomas Glozar <[email protected]>
>
> Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> patchset notes for details).
>
> This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
> BUG (sleeping function called from invalid context) on RT kernels.
>
> preempt_disable was introduced together with lock_sk and rcu_read_lock
> in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
> in parallel"), probably to match disabled migration of BPF programs, and
> is no longer necessary.
>
> Remove preempt_disable to fix BUG in sock_map_update_common on RT.
>
> Signed-off-by: Tomas Glozar <[email protected]>
> ---

We disable softirq and hold a spin lock when modifying the map/hash in
sock_{map,hash}_update_common so this LGTM:

Reviewed-by: Jakub Sitnicki <[email protected]>

You might want some extra tags:

Link: https://lore.kernel.org/all/[email protected]/
Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel")

2023-08-01 04:56:07

by John Fastabend

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

Jakub Sitnicki wrote:
>
> On Fri, Jul 28, 2023 at 08:44 AM +02, [email protected] wrote:
> > From: Tomas Glozar <[email protected]>
> >
> > Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> > allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> > GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> > patchset notes for details).
> >
> > This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
> > BUG (sleeping function called from invalid context) on RT kernels.
> >
> > preempt_disable was introduced together with lock_sk and rcu_read_lock
> > in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
> > in parallel"), probably to match disabled migration of BPF programs, and
> > is no longer necessary.
> >
> > Remove preempt_disable to fix BUG in sock_map_update_common on RT.
> >
> > Signed-off-by: Tomas Glozar <[email protected]>
> > ---
>
> We disable softirq and hold a spin lock when modifying the map/hash in
> sock_{map,hash}_update_common so this LGTM:
>

Agree.

> Reviewed-by: Jakub Sitnicki <[email protected]>

Reviewed-by: John Fastabend <[email protected]>

>
> You might want some extra tags:
>
> Link: https://lore.kernel.org/all/[email protected]/
> Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel")



2023-08-01 08:17:43

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <[email protected]>:

On Fri, 28 Jul 2023 08:44:11 +0200 you wrote:
> From: Tomas Glozar <[email protected]>
>
> Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> patchset notes for details).
>
> [...]

Here is the summary with links:
- [net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire
https://git.kernel.org/netdev/net/c/13d2618b48f1

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2023-08-01 08:19:14

by Paolo Abeni

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

On Mon, 2023-07-31 at 20:58 -0700, John Fastabend wrote:
> Jakub Sitnicki wrote:
> >
> > On Fri, Jul 28, 2023 at 08:44 AM +02, [email protected] wrote:
> > > From: Tomas Glozar <[email protected]>
> > >
> > > Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> > > allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> > > GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> > > patchset notes for details).
> > >
> > > This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
> > > BUG (sleeping function called from invalid context) on RT kernels.
> > >
> > > preempt_disable was introduced together with lock_sk and rcu_read_lock
> > > in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
> > > in parallel"), probably to match disabled migration of BPF programs, and
> > > is no longer necessary.
> > >
> > > Remove preempt_disable to fix BUG in sock_map_update_common on RT.
> > >
> > > Signed-off-by: Tomas Glozar <[email protected]>
> > > ---
> >
> > We disable softirq and hold a spin lock when modifying the map/hash in
> > sock_{map,hash}_update_common so this LGTM:
> >
>
> Agree.
>
> > Reviewed-by: Jakub Sitnicki <[email protected]>
>
> Reviewed-by: John Fastabend <[email protected]>
>
> >
> > You might want some extra tags:
> >
> > Link: https://lore.kernel.org/all/[email protected]/
> > Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel")

ENOCOFFEE here (which is never an excuse, but at least today is really
true), but I considered you may want this patch via the ebpf tree only
after applying it to net.

Hopefully should not be tragic, but please let me know if you prefer
the change reverted from net and going via the other path.

Thanks!

Paolo


2023-08-01 17:32:13

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH net] bpf: sockmap: Remove preempt_disable in sock_map_sk_acquire

On Tue, Aug 1, 2023 at 12:44 AM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2023-07-31 at 20:58 -0700, John Fastabend wrote:
> > Jakub Sitnicki wrote:
> > >
> > > On Fri, Jul 28, 2023 at 08:44 AM +02, [email protected] wrote:
> > > > From: Tomas Glozar <[email protected]>
> > > >
> > > > Disabling preemption in sock_map_sk_acquire conflicts with GFP_ATOMIC
> > > > allocation later in sk_psock_init_link on PREEMPT_RT kernels, since
> > > > GFP_ATOMIC might sleep on RT (see bpf: Make BPF and PREEMPT_RT co-exist
> > > > patchset notes for details).
> > > >
> > > > This causes calling bpf_map_update_elem on BPF_MAP_TYPE_SOCKMAP maps to
> > > > BUG (sleeping function called from invalid context) on RT kernels.
> > > >
> > > > preempt_disable was introduced together with lock_sk and rcu_read_lock
> > > > in commit 99ba2b5aba24e ("bpf: sockhash, disallow bpf_tcp_close and update
> > > > in parallel"), probably to match disabled migration of BPF programs, and
> > > > is no longer necessary.
> > > >
> > > > Remove preempt_disable to fix BUG in sock_map_update_common on RT.
> > > >
> > > > Signed-off-by: Tomas Glozar <[email protected]>
> > > > ---
> > >
> > > We disable softirq and hold a spin lock when modifying the map/hash in
> > > sock_{map,hash}_update_common so this LGTM:
> > >
> >
> > Agree.
> >
> > > Reviewed-by: Jakub Sitnicki <[email protected]>
> >
> > Reviewed-by: John Fastabend <[email protected]>
> >
> > >
> > > You might want some extra tags:
> > >
> > > Link: https://lore.kernel.org/all/[email protected]/
> > > Fixes: 99ba2b5aba24 ("bpf: sockhash, disallow bpf_tcp_close and update in parallel")
>
> ENOCOFFEE here (which is never an excuse, but at least today is really
> true), but I considered you may want this patch via the ebpf tree only
> after applying it to net.
>
> Hopefully should not be tragic, but please let me know if you prefer
> the change reverted from net and going via the other path.

It's fine. Probably it shouldn't conflict with other sockmap fixes.