In dev_queue_xmit() net_cls protected with rcu-bh.
[ 270.730026] ===============================
[ 270.730029] [ INFO: suspicious RCU usage. ]
[ 270.730033] 4.2.0-rc3+ #2 Not tainted
[ 270.730036] -------------------------------
[ 270.730040] include/linux/cgroup.h:353 suspicious rcu_dereference_check() usage!
[ 270.730041] other info that might help us debug this:
[ 270.730043] rcu_scheduler_active = 1, debug_locks = 1
[ 270.730045] 2 locks held by dhclient/748:
[ 270.730046] #0: (rcu_read_lock_bh){......}, at: [<ffffffff81682b70>] __dev_queue_xmit+0x50/0x960
[ 270.730085] #1: (&qdisc_tx_lock){+.....}, at: [<ffffffff81682d60>] __dev_queue_xmit+0x240/0x960
[ 270.730090] stack backtrace:
[ 270.730096] CPU: 0 PID: 748 Comm: dhclient Not tainted 4.2.0-rc3+ #2
[ 270.730098] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 270.730100] 0000000000000001 ffff8800bafeba58 ffffffff817ad487 0000000000000007
[ 270.730103] ffff880232a0a780 ffff8800bafeba88 ffffffff810ca4f2 ffff88022fb23e00
[ 270.730105] ffff880232a0a780 ffff8800bafebb68 ffff8800bafebb68 ffff8800bafebaa8
[ 270.730108] Call Trace:
[ 270.730121] [<ffffffff817ad487>] dump_stack+0x4c/0x65
[ 270.730148] [<ffffffff810ca4f2>] lockdep_rcu_suspicious+0xe2/0x120
[ 270.730153] [<ffffffff816a62d2>] task_cls_state+0x92/0xa0
[ 270.730158] [<ffffffffa00b534f>] cls_cgroup_classify+0x4f/0x120 [cls_cgroup]
[ 270.730164] [<ffffffff816aac74>] tc_classify_compat+0x74/0xc0
[ 270.730166] [<ffffffff816ab573>] tc_classify+0x33/0x90
[ 270.730170] [<ffffffffa00bcb0a>] htb_enqueue+0xaa/0x4a0 [sch_htb]
[ 270.730172] [<ffffffff81682e26>] __dev_queue_xmit+0x306/0x960
[ 270.730174] [<ffffffff81682b70>] ? __dev_queue_xmit+0x50/0x960
[ 270.730176] [<ffffffff816834a3>] dev_queue_xmit_sk+0x13/0x20
[ 270.730185] [<ffffffff81787770>] dev_queue_xmit+0x10/0x20
[ 270.730187] [<ffffffff8178b91c>] packet_snd.isra.62+0x54c/0x760
[ 270.730190] [<ffffffff8178be25>] packet_sendmsg+0x2f5/0x3f0
[ 270.730203] [<ffffffff81665245>] ? sock_def_readable+0x5/0x190
[ 270.730210] [<ffffffff817b64bb>] ? _raw_spin_unlock+0x2b/0x40
[ 270.730216] [<ffffffff8173bcbc>] ? unix_dgram_sendmsg+0x5cc/0x640
[ 270.730219] [<ffffffff8165f367>] sock_sendmsg+0x47/0x50
[ 270.730221] [<ffffffff8165f42f>] sock_write_iter+0x7f/0xd0
[ 270.730232] [<ffffffff811fd4c7>] __vfs_write+0xa7/0xf0
[ 270.730234] [<ffffffff811fe5b8>] vfs_write+0xb8/0x190
[ 270.730236] [<ffffffff811fe8c2>] SyS_write+0x52/0xb0
[ 270.730239] [<ffffffff817b6bae>] entry_SYSCALL_64_fastpath+0x12/0x76
Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
net/core/netclassid_cgroup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
index 1f2a126f4ffa..515939034298 100644
--- a/net/core/netclassid_cgroup.c
+++ b/net/core/netclassid_cgroup.c
@@ -23,7 +23,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
struct cgroup_cls_state *task_cls_state(struct task_struct *p)
{
- return css_cls_state(task_css(p, net_cls_cgrp_id));
+ return css_cls_state(task_css_check(p, net_cls_cgrp_id,
+ rcu_read_lock_bh_held()));
}
EXPORT_SYMBOL_GPL(task_cls_state);
From: Konstantin Khlebnikov <[email protected]>
Date: Tue, 21 Jul 2015 19:46:29 +0300
> @@ -23,7 +23,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
>
> struct cgroup_cls_state *task_cls_state(struct task_struct *p)
> {
> - return css_cls_state(task_css(p, net_cls_cgrp_id));
> + return css_cls_state(task_css_check(p, net_cls_cgrp_id,
> + rcu_read_lock_bh_held()));
You've made a serious mess of the indentation here.
First of all, you've changed the correct plain "TAB" before the 'return' line
into a TAB and two SPACE characters.
Secondly, the second line needs to be precisely indented to the exact column
following the openning parenthesis of the task_css_check() call on the
previous line.
In dev_queue_xmit() net_cls protected with rcu-bh.
[ 270.730026] ===============================
[ 270.730029] [ INFO: suspicious RCU usage. ]
[ 270.730033] 4.2.0-rc3+ #2 Not tainted
[ 270.730036] -------------------------------
[ 270.730040] include/linux/cgroup.h:353 suspicious rcu_dereference_check() usage!
[ 270.730041] other info that might help us debug this:
[ 270.730043] rcu_scheduler_active = 1, debug_locks = 1
[ 270.730045] 2 locks held by dhclient/748:
[ 270.730046] #0: (rcu_read_lock_bh){......}, at: [<ffffffff81682b70>] __dev_queue_xmit+0x50/0x960
[ 270.730085] #1: (&qdisc_tx_lock){+.....}, at: [<ffffffff81682d60>] __dev_queue_xmit+0x240/0x960
[ 270.730090] stack backtrace:
[ 270.730096] CPU: 0 PID: 748 Comm: dhclient Not tainted 4.2.0-rc3+ #2
[ 270.730098] Hardware name: OpenStack Foundation OpenStack Nova, BIOS Bochs 01/01/2011
[ 270.730100] 0000000000000001 ffff8800bafeba58 ffffffff817ad487 0000000000000007
[ 270.730103] ffff880232a0a780 ffff8800bafeba88 ffffffff810ca4f2 ffff88022fb23e00
[ 270.730105] ffff880232a0a780 ffff8800bafebb68 ffff8800bafebb68 ffff8800bafebaa8
[ 270.730108] Call Trace:
[ 270.730121] [<ffffffff817ad487>] dump_stack+0x4c/0x65
[ 270.730148] [<ffffffff810ca4f2>] lockdep_rcu_suspicious+0xe2/0x120
[ 270.730153] [<ffffffff816a62d2>] task_cls_state+0x92/0xa0
[ 270.730158] [<ffffffffa00b534f>] cls_cgroup_classify+0x4f/0x120 [cls_cgroup]
[ 270.730164] [<ffffffff816aac74>] tc_classify_compat+0x74/0xc0
[ 270.730166] [<ffffffff816ab573>] tc_classify+0x33/0x90
[ 270.730170] [<ffffffffa00bcb0a>] htb_enqueue+0xaa/0x4a0 [sch_htb]
[ 270.730172] [<ffffffff81682e26>] __dev_queue_xmit+0x306/0x960
[ 270.730174] [<ffffffff81682b70>] ? __dev_queue_xmit+0x50/0x960
[ 270.730176] [<ffffffff816834a3>] dev_queue_xmit_sk+0x13/0x20
[ 270.730185] [<ffffffff81787770>] dev_queue_xmit+0x10/0x20
[ 270.730187] [<ffffffff8178b91c>] packet_snd.isra.62+0x54c/0x760
[ 270.730190] [<ffffffff8178be25>] packet_sendmsg+0x2f5/0x3f0
[ 270.730203] [<ffffffff81665245>] ? sock_def_readable+0x5/0x190
[ 270.730210] [<ffffffff817b64bb>] ? _raw_spin_unlock+0x2b/0x40
[ 270.730216] [<ffffffff8173bcbc>] ? unix_dgram_sendmsg+0x5cc/0x640
[ 270.730219] [<ffffffff8165f367>] sock_sendmsg+0x47/0x50
[ 270.730221] [<ffffffff8165f42f>] sock_write_iter+0x7f/0xd0
[ 270.730232] [<ffffffff811fd4c7>] __vfs_write+0xa7/0xf0
[ 270.730234] [<ffffffff811fe5b8>] vfs_write+0xb8/0x190
[ 270.730236] [<ffffffff811fe8c2>] SyS_write+0x52/0xb0
[ 270.730239] [<ffffffff817b6bae>] entry_SYSCALL_64_fastpath+0x12/0x76
Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
net/core/netclassid_cgroup.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
index 1f2a126f4ffa..6441f47b1a8f 100644
--- a/net/core/netclassid_cgroup.c
+++ b/net/core/netclassid_cgroup.c
@@ -23,7 +23,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
struct cgroup_cls_state *task_cls_state(struct task_struct *p)
{
- return css_cls_state(task_css(p, net_cls_cgrp_id));
+ return css_cls_state(task_css_check(p, net_cls_cgrp_id,
+ rcu_read_lock_bh_held()));
}
EXPORT_SYMBOL_GPL(task_cls_state);
On 07/22/2015 11:23 AM, Konstantin Khlebnikov wrote:
> In dev_queue_xmit() net_cls protected with rcu-bh.
...
> Signed-off-by: Konstantin Khlebnikov <[email protected]>
> ---
> net/core/netclassid_cgroup.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
> index 1f2a126f4ffa..6441f47b1a8f 100644
> --- a/net/core/netclassid_cgroup.c
> +++ b/net/core/netclassid_cgroup.c
> @@ -23,7 +23,8 @@ static inline struct cgroup_cls_state *css_cls_state(struct cgroup_subsys_state
>
> struct cgroup_cls_state *task_cls_state(struct task_struct *p)
> {
> - return css_cls_state(task_css(p, net_cls_cgrp_id));
> + return css_cls_state(task_css_check(p, net_cls_cgrp_id,
> + rcu_read_lock_bh_held()));
Did you also check that after your patch this doesn't trigger on ingress
either? There, this code path could be invoked under rcu_read_lock(). So,
perhaps you need to check for both.
> }
> EXPORT_SYMBOL_GPL(task_cls_state);
On 22.07.2015 14:56, Daniel Borkmann wrote:
> On 07/22/2015 11:23 AM, Konstantin Khlebnikov wrote:
>> In dev_queue_xmit() net_cls protected with rcu-bh.
> ...
>> Signed-off-by: Konstantin Khlebnikov <[email protected]>
>> ---
>> net/core/netclassid_cgroup.c | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
>> index 1f2a126f4ffa..6441f47b1a8f 100644
>> --- a/net/core/netclassid_cgroup.c
>> +++ b/net/core/netclassid_cgroup.c
>> @@ -23,7 +23,8 @@ static inline struct cgroup_cls_state
>> *css_cls_state(struct cgroup_subsys_state
>>
>> struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>> {
>> - return css_cls_state(task_css(p, net_cls_cgrp_id));
>> + return css_cls_state(task_css_check(p, net_cls_cgrp_id,
>> + rcu_read_lock_bh_held()));
>
> Did you also check that after your patch this doesn't trigger on ingress
> either? There, this code path could be invoked under rcu_read_lock(). So,
> perhaps you need to check for both.
I haven't seen warnings with this patch. rcu_read_lock_held() is
checked inside rcu_dereference_check() inside task_css_check().
So, overall check expression is:
lockdep_is_held(&cgroup_mutex) || <- in task_css_set_check
lockdep_is_held(&css_set_rwsem) ||
((task)->flags & PF_EXITING) ||
rcu_read_lock_bh_held() || <- mine
rcu_read_lock_held() <- in rcu_dereference_check
>
>> }
>> EXPORT_SYMBOL_GPL(task_cls_state);
--
Konstantin
On 07/22/2015 02:03 PM, Konstantin Khlebnikov wrote:
> On 22.07.2015 14:56, Daniel Borkmann wrote:
>> On 07/22/2015 11:23 AM, Konstantin Khlebnikov wrote:
>>> In dev_queue_xmit() net_cls protected with rcu-bh.
>> ...
>>> Signed-off-by: Konstantin Khlebnikov <[email protected]>
>>> ---
>>> net/core/netclassid_cgroup.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/net/core/netclassid_cgroup.c b/net/core/netclassid_cgroup.c
>>> index 1f2a126f4ffa..6441f47b1a8f 100644
>>> --- a/net/core/netclassid_cgroup.c
>>> +++ b/net/core/netclassid_cgroup.c
>>> @@ -23,7 +23,8 @@ static inline struct cgroup_cls_state
>>> *css_cls_state(struct cgroup_subsys_state
>>>
>>> struct cgroup_cls_state *task_cls_state(struct task_struct *p)
>>> {
>>> - return css_cls_state(task_css(p, net_cls_cgrp_id));
>>> + return css_cls_state(task_css_check(p, net_cls_cgrp_id,
>>> + rcu_read_lock_bh_held()));
>>
>> Did you also check that after your patch this doesn't trigger on ingress
>> either? There, this code path could be invoked under rcu_read_lock(). So,
>> perhaps you need to check for both.
>
> I haven't seen warnings with this patch. rcu_read_lock_held() is
> checked inside rcu_dereference_check() inside task_css_check().
Thanks, agreed.
From: Konstantin Khlebnikov <[email protected]>
Date: Wed, 22 Jul 2015 12:23:20 +0300
> In dev_queue_xmit() net_cls protected with rcu-bh.
...
> Signed-off-by: Konstantin Khlebnikov <[email protected]>
Applied, thanks.