2015-11-02 17:12:34

by Shi, Yang

Subject: Re: [PATCH] bpf: convert hashtab lock to raw lock

On 10/31/2015 11:37 AM, Daniel Borkmann wrote:
> On 10/31/2015 02:47 PM, Steven Rostedt wrote:
>> On Fri, 30 Oct 2015 17:03:58 -0700
>> Alexei Starovoitov <[email protected]> wrote:
>>> On Fri, Oct 30, 2015 at 03:16:26PM -0700, Yang Shi wrote:
>>>> When running bpf samples on rt kernel, it reports the below warning:
>>>>
>>>> BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
>>>> in_atomic(): 1, irqs_disabled(): 128, pid: 477, name: ping
>>>> Preemption disabled at:[<ffff80000017db58>] kprobe_perf_func+0x30/0x228
>>> ...
>>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>>> index 83c209d..972b76b 100644
>>>> --- a/kernel/bpf/hashtab.c
>>>> +++ b/kernel/bpf/hashtab.c
>>>> @@ -17,7 +17,7 @@
>>>> struct bpf_htab {
>>>> struct bpf_map map;
>>>> struct hlist_head *buckets;
>>>> - spinlock_t lock;
>>>> + raw_spinlock_t lock;
>>>
>>> How do we address such things in general?
>>> I bet there are tons of places around the kernel that
>>> call spin_lock from atomic.
>>> I'd hate to lose the benefits of lockdep of non-raw spin_lock
>>> just to make rt happy.
>>
>> You won't lose any benefits of lockdep. Lockdep still checks
>> raw_spin_lock(). The only difference between raw_spin_lock and
>> spin_lock is that in -rt spin_lock turns into an rt_mutex() and
>> raw_spin_lock stays a spin lock.
>
> ( Btw, Yang, it would have been nice if your commit description had
> already included such info: not only that you convert it, but also why
> it's okay to do so. )

I think Thomas's document will include all the information about rt spin
lock/raw spin lock, etc.

Alexei & Daniel,

If you think such info is necessary, I definitely could add it into the
commit log in v2.

>
>> The error is that in -rt, you called a mutex and not a spin lock while
>> atomic.
>
> You are right, I think this happens due to the preempt_disable() in the
> trace_call_bpf() handler. So, I think the patch seems okay. Btw, in the
> struct spinlock case the dep_map is union'ed at the same offset as the
> dep_map in raw_spinlock.
>
> It's a bit inconvenient, though, when we add other library code as maps
> in the future, e.g. things like rhashtable, as they would first need to
> be converted to raw_spinlock_t as well, but judging from the git log, it
> looks like common practice.

Yes, converting a sleepable spin lock to a raw spin lock is common
practice in -rt to avoid the scheduling-in-atomic-context bug.

Thanks,
Yang

>
> Thanks,
> Daniel
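
A minimal userspace sketch of the dep_map layout Daniel describes above.
The model_* names and field sizes are hypothetical stand-ins, not the
kernel's definitions. The sketch only assumes that the non-rt struct
spinlock wraps a raw_spinlock in a union together with a padded view that
exposes dep_map, and it checks that both views place dep_map at the same
offset:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the lockdep bookkeeping attached to a lock. */
struct model_lockdep_map {
	const char *name;
};

/* Hypothetical stand-in for raw_spinlock: lock word, then lockdep state. */
struct model_raw_spinlock {
	unsigned int raw_lock;
	struct model_lockdep_map dep_map;
};

/* Bytes that precede dep_map inside the raw lock. */
#define MODEL_LOCK_PADSIZE offsetof(struct model_raw_spinlock, dep_map)

/*
 * Hypothetical stand-in for the non-rt spinlock: a union of the raw lock
 * and a padded view, so that "lock.dep_map" aliases "lock.rlock.dep_map".
 */
struct model_spinlock {
	union {
		struct model_raw_spinlock rlock;
		struct {
			unsigned char padding[MODEL_LOCK_PADSIZE];
			struct model_lockdep_map dep_map;
		};
	};
};

/* Compile-time check that both views agree on where dep_map lives. */
static_assert(offsetof(struct model_spinlock, dep_map) ==
	      offsetof(struct model_raw_spinlock, dep_map),
	      "dep_map must sit at the same offset in both views");

size_t model_raw_dep_map_offset(void)
{
	return offsetof(struct model_raw_spinlock, dep_map);
}

size_t model_spin_dep_map_offset(void)
{
	return offsetof(struct model_spinlock, dep_map);
}
```

Because the offsets match, code that reaches dep_map through either type
sees the same lockdep state, which is consistent with the point made in
the thread that lockdep coverage is not lost by switching to the raw type.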


2015-11-02 17:24:12

by Steven Rostedt

Subject: Re: [PATCH] bpf: convert hashtab lock to raw lock

On Mon, 02 Nov 2015 09:12:29 -0800
"Shi, Yang" <[email protected]> wrote:

> Yes, converting a sleepable spin lock to a raw spin lock is common
> practice in -rt to avoid the scheduling-in-atomic-context bug.

Note, in a lot of cases we don't just convert spin_locks to raw because
of atomic context. There are times we need to change the design so that
the lock is not taken in atomic context (switching preempt_disable() to
a local_lock(), for example).

But bpf is much like ftrace and kprobes where they can be taken almost
anywhere, and they do indeed need to be raw.

-- Steve
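
The distinction discussed in this thread can be modeled in userspace C.
This is a sketch under stated assumptions, not kernel code: all model_*
names are hypothetical, the -rt spin_lock() is reduced to "performs a
might-sleep check before taking the lock", and raw_spin_lock() is reduced
to "disables preemption and never sleeps".

```c
#include <stdbool.h>

/* Hypothetical model state; none of this is a kernel API. */
static int model_preempt_count;	/* > 0 means atomic context */
static int model_warnings;	/* "sleeping function called..." reports */

struct model_lock {
	bool held;
};

static void model_preempt_disable(void) { model_preempt_count++; }
static void model_preempt_enable(void)  { model_preempt_count--; }

/* Models the check behind the "BUG: sleeping function called" report. */
static void model_might_sleep(void)
{
	if (model_preempt_count > 0)
		model_warnings++;
}

/* Models spin_lock() on -rt: backed by a sleeping lock, so it may sleep. */
static void model_rt_spin_lock(struct model_lock *l)
{
	model_might_sleep();
	l->held = true;
}

static void model_rt_spin_unlock(struct model_lock *l)
{
	l->held = false;
}

/* Models raw_spin_lock(): spins with preemption off, never sleeps. */
static void model_raw_spin_lock(struct model_lock *l)
{
	model_preempt_disable();
	l->held = true;
}

static void model_raw_spin_unlock(struct model_lock *l)
{
	l->held = false;
	model_preempt_enable();
}

/*
 * Models a probe handler that runs a map update with preemption already
 * disabled. Returns the number of warnings the lock choice produced.
 */
int model_run_probe(bool use_raw)
{
	struct model_lock lock = { false };

	model_warnings = 0;
	model_preempt_disable();
	if (use_raw) {
		model_raw_spin_lock(&lock);
		model_raw_spin_unlock(&lock);
	} else {
		model_rt_spin_lock(&lock);
		model_rt_spin_unlock(&lock);
	}
	model_preempt_enable();
	return model_warnings;
}
```

In this model, model_run_probe(false) counts one warning, mirroring the
report in the original patch description, while model_run_probe(true)
counts none. The redesign Steve mentions (not taking the lock in atomic
context at all) would correspond to dropping the outer
model_preempt_disable() rather than changing the lock type.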

2015-11-02 17:29:08

by Daniel Borkmann

Subject: Re: [PATCH] bpf: convert hashtab lock to raw lock

On 11/02/2015 06:12 PM, Shi, Yang wrote:
...
> If you think such info is necessary, I definitely could add it into the commit log in v2.

As this is going to be documented anyway (thanks! ;)), and the discussion
of this patch can be found in the archives for those wondering, I'm good:

Acked-by: Daniel Borkmann <[email protected]>

Thanks for the fix, Yang!

I presume this should go to net-next then ...

2015-11-02 17:31:55

by Shi, Yang

Subject: Re: [PATCH] bpf: convert hashtab lock to raw lock

On 11/2/2015 9:24 AM, Steven Rostedt wrote:
> On Mon, 02 Nov 2015 09:12:29 -0800
> "Shi, Yang" <[email protected]> wrote:
>
>> Yes, converting a sleepable spin lock to a raw spin lock is common
>> practice in -rt to avoid the scheduling-in-atomic-context bug.
>
> Note, in a lot of cases we don't just convert spin_locks to raw because
> of atomic context. There are times we need to change the design so that
> the lock is not taken in atomic context (switching preempt_disable() to
> a local_lock(), for example).

Yes, definitely. Understood.

Thanks,
Yang

>
> But bpf is much like ftrace and kprobes where they can be taken almost
> anywhere, and they do indeed need to be raw.
>
> -- Steve
>