2015-06-15 21:25:42

by Larry Finger

[permalink] [raw]
Subject: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

Beginning at commit d52d399, the following INFO splat is logged:

[ 2.816564] ===============================
[ 2.816986] [ INFO: suspicious RCU usage. ]
[ 2.817402] 4.1.0-rc7-next-20150612 #1 Not tainted
[ 2.817881] -------------------------------
[ 2.818297] kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
[ 2.819180]
other info that might help us debug this:

[ 2.819947]
rcu_scheduler_active = 1, debug_locks = 0
[ 2.820578] 3 locks held by systemd/1:
[ 2.820954] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
[ 2.821855] #1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
[ 2.822808] #2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
[ 2.823790]
stack backtrace:
[ 2.824212] CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
[ 2.824932] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
[ 2.825751] 0000000000000001 ffff880224e07838 ffffffff817263a4 ffffffff810ccf2a
[ 2.826560] ffff880224e08000 ffff880224e07868 ffffffff810b6827 0000000000000000
[ 2.827368] ffffffff81a445d3 00000000000004f4 ffff88022682e100 ffff880224e07898
[ 2.828177] Call Trace:
[ 2.828422] [<ffffffff817263a4>] dump_stack+0x4c/0x6e
[ 2.828937] [<ffffffff810ccf2a>] ? console_unlock+0x1ca/0x510
[ 2.829514] [<ffffffff810b6827>] lockdep_rcu_suspicious+0xe7/0x120
[ 2.830139] [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
[ 2.830699] [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
[ 2.831239] [<ffffffff811f3789>] ? create_object+0x39/0x2e0
[ 2.831800] [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
[ 2.832375] [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
[ 2.832973] [<ffffffff811f3789>] create_object+0x39/0x2e0
[ 2.833515] [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
[ 2.834089] [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
[ 2.834761] [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
[ 2.835369] [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630

Additional backtrace lines are truncated. In addition, the above splat is
followed by several "BUG: sleeping function called from invalid context
at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
for its allocations, whereas it should use the value input to pcpu_alloc().

Signed-off-by: Larry Finger <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Catalin Marinas <[email protected]>
To: Tejun Heo <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
include/linux/kmemleak.h | 3 ++-
mm/kmemleak.c | 9 +++++----
mm/kmemleak.c.rej | 19 +++++++++++++++++++
mm/percpu.c | 2 +-
4 files changed, 27 insertions(+), 6 deletions(-)
create mode 100644 mm/kmemleak.c.rej

diff --git a/include/linux/kmemleak.h b/include/linux/kmemleak.h
index e705467..ec4437b 100644
--- a/include/linux/kmemleak.h
+++ b/include/linux/kmemleak.h
@@ -28,7 +28,8 @@
extern void kmemleak_init(void) __ref;
extern void kmemleak_alloc(const void *ptr, size_t size, int min_count,
gfp_t gfp) __ref;
-extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size) __ref;
+extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size,
+ gfp_t gfp) __ref;
extern void kmemleak_free(const void *ptr) __ref;
extern void kmemleak_free_part(const void *ptr, size_t size) __ref;
extern void kmemleak_free_percpu(const void __percpu *ptr) __ref;
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index ca9e5a5..b5f5129 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -930,12 +930,13 @@ EXPORT_SYMBOL_GPL(kmemleak_alloc);
* kmemleak_alloc_percpu - register a newly allocated __percpu object
* @ptr: __percpu pointer to beginning of the object
* @size: size of the object
+ * @gfp: kmalloc() flags used for kmemleak internal memory allocations
*
* This function is called from the kernel percpu allocator when a new object
- * (memory block) is allocated (alloc_percpu). It assumes GFP_KERNEL
- * allocation.
+ * (memory block) is allocated (alloc_percpu).
*/
-void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size)
+void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size,
+ gfp_t gfp)
{
unsigned int cpu;

@@ -948,7 +949,7 @@ void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size)
if (kmemleak_enabled && ptr && !IS_ERR(ptr))
for_each_possible_cpu(cpu)
create_object((unsigned long)per_cpu_ptr(ptr, cpu),
- size, 0, GFP_KERNEL);
+ size, 0, gfp);
else if (kmemleak_early_log)
log_early(KMEMLEAK_ALLOC_PERCPU, ptr, size, 0);
}
diff --git a/mm/kmemleak.c.rej b/mm/kmemleak.c.rej
new file mode 100644
index 0000000..d93c2fe
--- /dev/null
+++ b/mm/kmemleak.c.rej
@@ -0,0 +1,19 @@
+--- mm/kmemleak.c
++++ mm/kmemleak.c
+@@ -930,12 +930,13 @@ EXPORT_SYMBOL_GPL(kmemleak_alloc);
+ * kmemleak_alloc_percpu - register a newly allocated __percpu object
+ * @ptr: __percpu pointer to beginning of the object
+ * @size: size of the object
++ * @gfp: kmalloc() flags used for kmemleak internal memory allocations
+ *
+ * This function is called from the kernel percpu allocator when a new object
+- * (memory block) is allocated (alloc_percpu). It assumes GFP_KERNEL
+- * allocation.
++ * (memory block) is allocated (alloc_percpu).
+ */
+-void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size)
++void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size,
++ gfp_t gfp)
+ {
+ unsigned int cpu;
+
diff --git a/mm/percpu.c b/mm/percpu.c
index dfd0248..2dd7448 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1030,7 +1030,7 @@ area_found:
memset((void *)pcpu_chunk_addr(chunk, cpu, 0) + off, 0, size);

ptr = __addr_to_pcpu_ptr(chunk->base_addr + off);
- kmemleak_alloc_percpu(ptr, size);
+ kmemleak_alloc_percpu(ptr, size, gfp);
return ptr;

fail_unlock:
--
2.1.4


2015-06-16 05:37:22

by Kamalesh Babulal

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

* Larry Finger <[email protected]> [2015-06-15 16:25:18]:

> Beginning at commit d52d399, the following INFO splat is logged:
>

[...]

> ---
> include/linux/kmemleak.h | 3 ++-
> mm/kmemleak.c | 9 +++++----
> mm/kmemleak.c.rej | 19 +++++++++++++++++++
> mm/percpu.c | 2 +-
> 4 files changed, 27 insertions(+), 6 deletions(-)
> create mode 100644 mm/kmemleak.c.rej

Patch creates kmemleak.c.rej file.


Regards,
Kamalesh.

2015-06-16 06:02:21

by Larry Finger

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On 06/16/2015 12:37 AM, Kamalesh Babulal wrote:
> * Larry Finger <[email protected]> [2015-06-15 16:25:18]:
>
>> Beginning at commit d52d399, the following INFO splat is logged:
>>
>
> [...]
>
>> ---
>> include/linux/kmemleak.h | 3 ++-
>> mm/kmemleak.c | 9 +++++----
>> mm/kmemleak.c.rej | 19 +++++++++++++++++++
>> mm/percpu.c | 2 +-
>> 4 files changed, 27 insertions(+), 6 deletions(-)
>> create mode 100644 mm/kmemleak.c.rej
>
> Patch creates kmemleak.c.rej file.

Sorry about that. This one was an RFC. I'll fix that before the final submission.

Larry

2015-06-16 09:20:36

by Catalin Marinas

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On Mon, Jun 15, 2015 at 10:25:18PM +0100, Larry Finger wrote:
> Beginning at commit d52d399, the following INFO splat is logged:
>
> [ 2.816564] ===============================
> [ 2.816986] [ INFO: suspicious RCU usage. ]
> [ 2.817402] 4.1.0-rc7-next-20150612 #1 Not tainted
> [ 2.817881] -------------------------------
> [ 2.818297] kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
> [ 2.819180]
> other info that might help us debug this:
>
> [ 2.819947]
> rcu_scheduler_active = 1, debug_locks = 0
> [ 2.820578] 3 locks held by systemd/1:
> [ 2.820954] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
> [ 2.821855] #1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
> [ 2.822808] #2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
> [ 2.823790]
> stack backtrace:
> [ 2.824212] CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
> [ 2.824932] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
> [ 2.825751] 0000000000000001 ffff880224e07838 ffffffff817263a4 ffffffff810ccf2a
> [ 2.826560] ffff880224e08000 ffff880224e07868 ffffffff810b6827 0000000000000000
> [ 2.827368] ffffffff81a445d3 00000000000004f4 ffff88022682e100 ffff880224e07898
> [ 2.828177] Call Trace:
> [ 2.828422] [<ffffffff817263a4>] dump_stack+0x4c/0x6e
> [ 2.828937] [<ffffffff810ccf2a>] ? console_unlock+0x1ca/0x510
> [ 2.829514] [<ffffffff810b6827>] lockdep_rcu_suspicious+0xe7/0x120
> [ 2.830139] [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
> [ 2.830699] [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
> [ 2.831239] [<ffffffff811f3789>] ? create_object+0x39/0x2e0
> [ 2.831800] [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
> [ 2.832375] [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
> [ 2.832973] [<ffffffff811f3789>] create_object+0x39/0x2e0
> [ 2.833515] [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
> [ 2.834089] [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
> [ 2.834761] [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
> [ 2.835369] [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630
>
> Additional backtrace lines are truncated. In addition, the above splat is
> followed by several "BUG: sleeping function called from invalid context
> at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
> clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
> for its allocations, whereas it should use the value input to pcpu_alloc().
>
> Signed-off-by: Larry Finger <[email protected]>
> Cc: Martin KaFai Lau <[email protected]>
> Cc: Catalin Marinas <[email protected]>
> To: Tejun Heo <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> include/linux/kmemleak.h | 3 ++-
> mm/kmemleak.c | 9 +++++----
> mm/kmemleak.c.rej | 19 +++++++++++++++++++
> mm/percpu.c | 2 +-

As Kamalesh already pointed out, you added the mm/kmemleak.c.rej file to
this patch.

> diff --git a/include/linux/kmemleak.h b/include/linux/kmemleak.h
> index e705467..ec4437b 100644
> --- a/include/linux/kmemleak.h
> +++ b/include/linux/kmemleak.h
> @@ -28,7 +28,8 @@
> extern void kmemleak_init(void) __ref;
> extern void kmemleak_alloc(const void *ptr, size_t size, int min_count,
> gfp_t gfp) __ref;
> -extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size) __ref;
> +extern void kmemleak_alloc_percpu(const void __percpu *ptr, size_t size,
> + gfp_t gfp) __ref;
> extern void kmemleak_free(const void *ptr) __ref;
> extern void kmemleak_free_part(const void *ptr, size_t size) __ref;
> extern void kmemleak_free_percpu(const void __percpu *ptr) __ref;
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index ca9e5a5..b5f5129 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -930,12 +930,13 @@ EXPORT_SYMBOL_GPL(kmemleak_alloc);
> * kmemleak_alloc_percpu - register a newly allocated __percpu object
> * @ptr: __percpu pointer to beginning of the object
> * @size: size of the object
> + * @gfp: kmalloc() flags used for kmemleak internal memory allocations

Nitpick: since this is triggered by percpu_alloc_gfp(), I would just
remove the "kmalloc()" here.

> *
> * This function is called from the kernel percpu allocator when a new object
> - * (memory block) is allocated (alloc_percpu). It assumes GFP_KERNEL
> - * allocation.
> + * (memory block) is allocated (alloc_percpu).
> */
> -void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size)
> +void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size,
> + gfp_t gfp)
> {
> unsigned int cpu;
>
> @@ -948,7 +949,7 @@ void __ref kmemleak_alloc_percpu(const void __percpu *ptr, size_t size)
> if (kmemleak_enabled && ptr && !IS_ERR(ptr))
> for_each_possible_cpu(cpu)
> create_object((unsigned long)per_cpu_ptr(ptr, cpu),
> - size, 0, GFP_KERNEL);
> + size, 0, gfp);
> else if (kmemleak_early_log)
> log_early(KMEMLEAK_ALLOC_PERCPU, ptr, size, 0);
> }
[... mm/kmemleak.c.rej removed ...]
> diff --git a/mm/percpu.c b/mm/percpu.c
> index dfd0248..2dd7448 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -1030,7 +1030,7 @@ area_found:
> memset((void *)pcpu_chunk_addr(chunk, cpu, 0) + off, 0, size);
>
> ptr = __addr_to_pcpu_ptr(chunk->base_addr + off);
> - kmemleak_alloc_percpu(ptr, size);
> + kmemleak_alloc_percpu(ptr, size, gfp);
> return ptr;
>
> fail_unlock:

Apart from the minor comment above (and the kmemleak.c.rej file):

Reviewed-by: Catalin Marinas <[email protected]>

Thanks.

2015-06-16 09:04:42

by Catalin Marinas

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On Tue, Jun 16, 2015 at 09:44:24AM +0100, Catalin Marinas wrote:
> On Mon, Jun 15, 2015 at 10:25:18PM +0100, Larry Finger wrote:
> > Beginning at commit d52d399, the following INFO splat is logged:
> >
> > [ 2.816564] ===============================
> > [ 2.816986] [ INFO: suspicious RCU usage. ]
> > [ 2.817402] 4.1.0-rc7-next-20150612 #1 Not tainted
> > [ 2.817881] -------------------------------
> > [ 2.818297] kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
> > [ 2.819180]
> > other info that might help us debug this:
> >
> > [ 2.819947]
> > rcu_scheduler_active = 1, debug_locks = 0
> > [ 2.820578] 3 locks held by systemd/1:
> > [ 2.820954] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff815f0c8f>] rtnetlink_rcv+0x1f/0x40
> > [ 2.821855] #1: (rcu_read_lock_bh){......}, at: [<ffffffff816a34e2>] ipv6_add_addr+0x62/0x540
> > [ 2.822808] #2: (addrconf_hash_lock){+...+.}, at: [<ffffffff816a3604>] ipv6_add_addr+0x184/0x540
> > [ 2.823790]
> > stack backtrace:
> > [ 2.824212] CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
> > [ 2.824932] Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
> > [ 2.825751] 0000000000000001 ffff880224e07838 ffffffff817263a4 ffffffff810ccf2a
> > [ 2.826560] ffff880224e08000 ffff880224e07868 ffffffff810b6827 0000000000000000
> > [ 2.827368] ffffffff81a445d3 00000000000004f4 ffff88022682e100 ffff880224e07898
> > [ 2.828177] Call Trace:
> > [ 2.828422] [<ffffffff817263a4>] dump_stack+0x4c/0x6e
> > [ 2.828937] [<ffffffff810ccf2a>] ? console_unlock+0x1ca/0x510
> > [ 2.829514] [<ffffffff810b6827>] lockdep_rcu_suspicious+0xe7/0x120
> > [ 2.830139] [<ffffffff8108cf05>] ___might_sleep+0x1d5/0x1f0
> > [ 2.830699] [<ffffffff8108cf6d>] __might_sleep+0x4d/0x90
> > [ 2.831239] [<ffffffff811f3789>] ? create_object+0x39/0x2e0
> > [ 2.831800] [<ffffffff811da427>] kmem_cache_alloc+0x47/0x250
> > [ 2.832375] [<ffffffff813c19ae>] ? find_next_zero_bit+0x1e/0x20
> > [ 2.832973] [<ffffffff811f3789>] create_object+0x39/0x2e0
> > [ 2.833515] [<ffffffff810b7eb6>] ? mark_held_locks+0x66/0x90
> > [ 2.834089] [<ffffffff8172efab>] ? _raw_spin_unlock_irqrestore+0x4b/0x60
> > [ 2.834761] [<ffffffff817193c1>] kmemleak_alloc_percpu+0x61/0xe0
> > [ 2.835369] [<ffffffff811a26f0>] pcpu_alloc+0x370/0x630
> >
> > Additional backtrace lines are truncated. In addition, the above splat is
> > followed by several "BUG: sleeping function called from invalid context
> > at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
> > clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
> > for its allocations, whereas it should use the value input to pcpu_alloc().
> >
> > Signed-off-by: Larry Finger <[email protected]>
> > Cc: Martin KaFai Lau <[email protected]>
> > Cc: Catalin Marinas <[email protected]>
> > To: Tejun Heo <[email protected]>
> > Cc: Christoph Lameter <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
[...]
> Apart from the minor comment above (and the kmemleak.c.rej file):
>
> Reviewed-by: Catalin Marinas <[email protected]>

BTW, it's worth adding:

Cc: <[email protected]> # v3.18+

(or Fixes: 5835d96e9ce4 percpu: implement [__]alloc_percpu_gfp())

--
Catalin

2015-06-16 21:07:50

by Martin KaFai Lau

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On Mon, Jun 15, 2015 at 04:25:18PM -0500, Larry Finger wrote:
> Additional backtrace lines are truncated. In addition, the above splat is
> followed by several "BUG: sleeping function called from invalid context
> at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
> clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
> for its allocations, whereas it should use the value input to pcpu_alloc().
Just a minor nit, 'kmemleak_alloc_percpu() should follow the gfp from
per_alloc()' may be a more accurate title to describe the patch.

Acked-by: Martin KaFai Lau <[email protected]>

2015-06-17 02:43:40

by Larry Finger

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On 06/16/2015 04:07 PM, Martin KaFai Lau wrote:
> On Mon, Jun 15, 2015 at 04:25:18PM -0500, Larry Finger wrote:
>> Additional backtrace lines are truncated. In addition, the above splat is
>> followed by several "BUG: sleeping function called from invalid context
>> at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
>> clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
>> for its allocations, whereas it should use the value input to pcpu_alloc().
> Just a minor nit, 'kmemleak_alloc_percpu() should follow the gfp from
> per_alloc()' may be a more accurate title to describe the patch.

Do you mean that the subject should be changed?

Larry

2015-06-17 04:15:45

by Martin KaFai Lau

[permalink] [raw]
Subject: Re: [RFC NEXT] mm: Fix suspicious RCU usage at kernel/sched/core.c:7318

On Tue, Jun 16, 2015 at 09:43:28PM -0500, Larry Finger wrote:
> On 06/16/2015 04:07 PM, Martin KaFai Lau wrote:
> >On Mon, Jun 15, 2015 at 04:25:18PM -0500, Larry Finger wrote:
> >>Additional backtrace lines are truncated. In addition, the above splat is
> >>followed by several "BUG: sleeping function called from invalid context
> >>at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau, these are the
> >>clue to the fix. Routine kmemleak_alloc_percpu() always uses GFP_KERNEL
> >>for its allocations, whereas it should use the value input to pcpu_alloc().
> >Just a minor nit, 'kmemleak_alloc_percpu() should follow the gfp from
> >per_alloc()' may be a more accurate title to describe the patch.
>
> Do you mean that the subject should be changed?
Yes

A typo in my last response: s/per_alloc/pcpu_alloc/

--Martin