2018-02-14 14:18:37

by Jason Wang

Subject: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

Commit 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
made ptr_ring fall back to vmalloc(), which cannot be used with
GFP_ATOMIC and requires GFP_KERNEL. This triggers a WARN because
cpumap calls it with GFP_ATOMIC. Fortunately, cpumap entries can
only be allocated through the syscall path, so GFP_ATOMIC is not
necessary. Fix this by replacing GFP_ATOMIC with GFP_KERNEL.

Reported-by: [email protected]
Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
Cc: Michal Hocko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Jason Wang <[email protected]>
---
kernel/bpf/cpumap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index fbfdada6..a4bb0b3 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
 static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
 						       int map_id)
 {
-	gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
+	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
 	struct bpf_cpu_map_entry *rcpu;
 	int numa, err;

--
2.7.4
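
As a rough illustration of why the flags in the patch above matter, the userspace sketch below models the kind of compatibility check a vmalloc()-capable allocation path can apply to its gfp mask. The `M_*` bit values and the `model_*` function names are assumptions made up for this example and do not match the kernel's real definitions; only the relationship between the two masks is meant to carry over.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only (assumption); the real ones live in the kernel gfp headers. */
#define M_DIRECT_RECLAIM 0x01u	/* caller may sleep to reclaim memory */
#define M_KSWAPD_RECLAIM 0x02u	/* caller may wake background reclaim */
#define M_IO             0x04u
#define M_FS             0x08u
#define M_HIGH           0x10u
#define M_NOWARN         0x20u

/* Model of a sleeping allocation context. */
#define M_GFP_KERNEL (M_DIRECT_RECLAIM | M_KSWAPD_RECLAIM | M_IO | M_FS)
/* Model of an atomic allocation context: no direct reclaim, so no sleeping. */
#define M_GFP_ATOMIC (M_HIGH | M_KSWAPD_RECLAIM)

/*
 * Returns true when the mask contains every bit of the modeled GFP_KERNEL,
 * i.e. when a fallback that may sleep would be acceptable.
 */
static bool model_vmalloc_fallback_ok(unsigned int flags)
{
	return (flags & M_GFP_KERNEL) == M_GFP_KERNEL;
}

/* Hypothetical usage: compare the mask before and after the patch. */
void model_demo(void)
{
	assert(!model_vmalloc_fallback_ok(M_GFP_ATOMIC | M_NOWARN)); /* old mask */
	assert(model_vmalloc_fallback_ok(M_GFP_KERNEL | M_NOWARN));  /* new mask */
}
```

In this model the old mask fails the check because it lacks the direct-reclaim bit, which is the situation the WARN in the report corresponds to.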



2018-02-14 14:21:55

by Jesper Dangaard Brouer

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On Wed, 14 Feb 2018 22:17:34 +0800
Jason Wang <[email protected]> wrote:

> There're several implications after commit 0bf7800f1799 ("ptr_ring:
> try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> allocation of cpumap can only be done through syscall path which means
> GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> with GFP_KERNEL.
>
> Reported-by: [email protected]
> Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
> Cc: Michal Hocko <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Jason Wang <[email protected]>
> ---
> kernel/bpf/cpumap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Jesper Dangaard Brouer <[email protected]>


> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index fbfdada6..a4bb0b3 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
> static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
> int map_id)
> {
> - gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
> + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> struct bpf_cpu_map_entry *rcpu;
> int numa, err;
>

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

2018-02-14 14:38:44

by Daniel Borkmann

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On 02/14/2018 03:17 PM, Jason Wang wrote:
> There're several implications after commit 0bf7800f1799 ("ptr_ring:
> try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> allocation of cpumap can only be done through syscall path which means
> GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> with GFP_KERNEL.
>
> Reported-by: [email protected]
> Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
> Cc: Michal Hocko <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Jason Wang <[email protected]>

Applied to bpf tree, thanks Jason!

2018-02-14 15:08:18

by Michal Hocko

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On Wed 14-02-18 22:17:34, Jason Wang wrote:
> There're several implications after commit 0bf7800f1799 ("ptr_ring:
> try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> allocation of cpumap can only be done through syscall path which means
> GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> with GFP_KERNEL.

map_update_elem() does the following. Unless I am missing something and
the callback does not call cpu_map_update_elem() there, we are in a
non-preemptible context and GFP_WAIT would blow up:

	rcu_read_lock();
	err = map->ops->map_update_elem(map, key, value, attr->flags);
	rcu_read_unlock();

> Reported-by: [email protected]
> Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
> Cc: Michal Hocko <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Jason Wang <[email protected]>
> ---
> kernel/bpf/cpumap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index fbfdada6..a4bb0b3 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
> static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
> int map_id)
> {
> - gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
> + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> struct bpf_cpu_map_entry *rcpu;
> int numa, err;
>
> --
> 2.7.4

--
Michal Hocko
SUSE Labs

2018-02-14 17:37:19

by Jesper Dangaard Brouer

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On Wed, 14 Feb 2018 16:06:40 +0100
Michal Hocko <[email protected]> wrote:

> On Wed 14-02-18 22:17:34, Jason Wang wrote:
> > There're several implications after commit 0bf7800f1799 ("ptr_ring:
> > try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> > can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> > since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> > allocation of cpumap can only be done through syscall path which means
> > GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> > with GFP_KERNEL.
>
> map_update_elem does the following. Unless I am missing something and
> the callback doesn't call cpu_map_update_elem there then we are in a
> non-preemptible context there and GFP_WAIT would blow up.
> rcu_read_lock();
> err = map->ops->map_update_elem(map, key, value, attr->flags);
> rcu_read_unlock();

Nope - you did miss something ;-)

You are looking at the wrong place. Look at kernel/bpf/syscall.c, line 697.

vim +697 kernel/bpf/syscall.c
[...]
	} else if (map->map_type == BPF_MAP_TYPE_CPUMAP) {
		err = map->ops->map_update_elem(map, key, value, attr->flags);
		goto out;
	}

You missed that map type BPF_MAP_TYPE_CPUMAP is special-cased and is
handled outside rcu_read_{lock,unlock} (because it needs to create some
kthreads).

Furthermore, the BPF verifier disallows BPF programs from changing a
BPF_MAP_TYPE_CPUMAP at runtime. Right now, we disallow almost everything
from the BPF side (even reading the value):

vim +2057 kernel/bpf/verifier.c
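
To make the point about the special case concrete, here is a small userspace model of the dispatch described above. It is only a sketch: every `model_*` name, the `MODEL_MAP_*` enum, and the `special_case` switch are hypothetical and exist purely to contrast the two call paths; none of them are kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical map types for the model; only the cpumap case matters here. */
enum model_map_type { MODEL_MAP_HASH, MODEL_MAP_CPUMAP };

static int model_rcu_depth;		/* > 0 means "inside rcu_read_lock()" */
static bool model_slept_in_atomic;	/* set if a sleeping allocation ran under RCU */

static void model_rcu_read_lock(void)   { model_rcu_depth++; }
static void model_rcu_read_unlock(void) { model_rcu_depth--; }

/* Stand-in for a GFP_KERNEL allocation: only legal where sleeping is allowed. */
static void model_sleeping_alloc(void)
{
	if (model_rcu_depth > 0)
		model_slept_in_atomic = true;
}

/* Stand-in for the per-map update callback; only cpumap allocates here. */
static int model_update_cb(enum model_map_type type)
{
	if (type == MODEL_MAP_CPUMAP)
		model_sleeping_alloc();
	return 0;
}

/*
 * special_case == true mirrors the branch quoted above: cpumap updates are
 * dispatched before the RCU read-side section is entered.
 */
int model_map_update_elem(enum model_map_type type, bool special_case)
{
	int err;

	if (special_case && type == MODEL_MAP_CPUMAP)
		return model_update_cb(type);	/* no RCU read lock held */

	model_rcu_read_lock();
	err = model_update_cb(type);
	model_rcu_read_unlock();
	return err;
}
```

With `special_case` set, the cpumap update never runs under the modeled RCU read lock, so the sleeping allocation is fine; with it cleared, the model records exactly the problem Michal was worried about.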


> > Reported-by: [email protected]
> > Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
> > Cc: Michal Hocko <[email protected]>
> > Cc: Daniel Borkmann <[email protected]>
> > Cc: Matthew Wilcox <[email protected]>
> > Cc: Jesper Dangaard Brouer <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Cc: [email protected]
> > Signed-off-by: Jason Wang <[email protected]>
> > ---
> > kernel/bpf/cpumap.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> > index fbfdada6..a4bb0b3 100644
> > --- a/kernel/bpf/cpumap.c
> > +++ b/kernel/bpf/cpumap.c
> > @@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
> > static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
> > int map_id)
> > {
> > - gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
> > + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> > struct bpf_cpu_map_entry *rcpu;
> > int numa, err;
> >
> > --
> > 2.7.4
>



--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer

2018-02-14 17:47:01

by Daniel Borkmann

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On 02/14/2018 06:04 PM, Michael S. Tsirkin wrote:
> On Wed, Feb 14, 2018 at 10:17:34PM +0800, Jason Wang wrote:
>> There're several implications after commit 0bf7800f1799 ("ptr_ring:
>> try vmalloc() when kmalloc() fails") with the using of vmalloc() since
>> can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
>> since cpumap try to call with GFP_ATOMIC. Fortunately, entry
>> allocation of cpumap can only be done through syscall path which means
>> GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
>> with GFP_KERNEL.
>>
>> Reported-by: [email protected]
>> Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
>> Cc: Michal Hocko <[email protected]>
>> Cc: Daniel Borkmann <[email protected]>
>> Cc: Matthew Wilcox <[email protected]>
>> Cc: Jesper Dangaard Brouer <[email protected]>
>> Cc: [email protected]
>> Cc: [email protected]
>> Cc: [email protected]
>> Signed-off-by: Jason Wang <[email protected]>
>
> Frankly I'd start with the revert. The original patch was rushed
> into net without enough justification IMHO, and we just seem to keep
> piling up these things. How about deferring all these ideas
> to net-next?

It's up to you if you think a revert is needed. The below is fine and
small enough in any case for cpumap, imho.

>> kernel/bpf/cpumap.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
>> index fbfdada6..a4bb0b3 100644
>> --- a/kernel/bpf/cpumap.c
>> +++ b/kernel/bpf/cpumap.c
>> @@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
>> static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
>> int map_id)
>> {
>> - gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
>> + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
>> struct bpf_cpu_map_entry *rcpu;
>> int numa, err;
>>
>> --
>> 2.7.4

2018-02-14 20:10:05

by Michael S. Tsirkin

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On Wed, Feb 14, 2018 at 10:17:34PM +0800, Jason Wang wrote:
> There're several implications after commit 0bf7800f1799 ("ptr_ring:
> try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> allocation of cpumap can only be done through syscall path which means
> GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> with GFP_KERNEL.
>
> Reported-by: [email protected]
> Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails")
> Cc: Michal Hocko <[email protected]>
> Cc: Daniel Borkmann <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Jason Wang <[email protected]>

Frankly I'd start with the revert. The original patch was rushed
into net without enough justification IMHO, and we just seem to keep
piling up these things. How about deferring all these ideas
to net-next?

> ---
> kernel/bpf/cpumap.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index fbfdada6..a4bb0b3 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -334,7 +334,7 @@ static int cpu_map_kthread_run(void *data)
> static struct bpf_cpu_map_entry *__cpu_map_entry_alloc(u32 qsize, u32 cpu,
> int map_id)
> {
> - gfp_t gfp = GFP_ATOMIC|__GFP_NOWARN;
> + gfp_t gfp = GFP_KERNEL | __GFP_NOWARN;
> struct bpf_cpu_map_entry *rcpu;
> int numa, err;
>
> --
> 2.7.4

2018-02-14 20:11:15

by Michal Hocko

Subject: Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

On Wed 14-02-18 18:34:51, Jesper Dangaard Brouer wrote:
> On Wed, 14 Feb 2018 16:06:40 +0100
> Michal Hocko <[email protected]> wrote:
>
> > On Wed 14-02-18 22:17:34, Jason Wang wrote:
> > > There're several implications after commit 0bf7800f1799 ("ptr_ring:
> > > try vmalloc() when kmalloc() fails") with the using of vmalloc() since
> > > can't allow GFP_ATOMIC but mandate GFP_KERNEL. This will lead a WARN
> > > since cpumap try to call with GFP_ATOMIC. Fortunately, entry
> > > allocation of cpumap can only be done through syscall path which means
> > > GFP_ATOMIC is not necessary, so fixing this by replacing GFP_ATOMIC
> > > with GFP_KERNEL.
> >
> > map_update_elem does the following. Unless I am missing something and
> > the callback doesn't call cpu_map_update_elem there then we are in a
> > non-preemptible context there and GFP_WAIT would blow up.
> > rcu_read_lock();
> > err = map->ops->map_update_elem(map, key, value, attr->flags);
> > rcu_read_unlock();
>
> Nope - you did miss something ;-)
>
> You are looking at the wrong place. Look at /kernel/bpf/syscall.c line 697.
>
> vim +697 kernel/bpf/syscall.c
> [...]
> } else if (map->map_type == BPF_MAP_TYPE_CPUMAP) {
> err = map->ops->map_update_elem(map, key, value, attr->flags);
> goto out;
> }
>
> You missed that map type BPF_MAP_TYPE_CPUMAP is special cased, and
> is moved outside rcu_read_{lock,unlock} (because it need to create some
> kthreads).
>
> Further more the BPF-verifier disallow BPF programs runtime changing
> the BPF_MAP_TYPE_CPUMAP. Right now, we disallow almost everything from
> the bpf-side (even reading the value):
>
> vim +2057 kernel/bpf/verifier.c

OK, thanks for the clarification. I am not familiar with the code at all,
so I was merely looking at call sites and this one just caught my eye.

--
Michal Hocko
SUSE Labs