I find that tcf_gate_act() acquires &gact->tcf_lock without explicitly
disabling BH. Since gact->tcf_lock is also acquired by the timer callback
in softirq context, if tcf_gate_act() is not called with BH disabled or
from softirq context (which I am not sure about, as I cannot find the
corresponding documentation), the following deadlock could occur.
tcf_gate_act()
--> spin_lock(&gact->tcf_lock)
<interrupt>
--> gate_timer_func()
--> spin_lock(&gact->tcf_lock)
Signed-off-by: Chengfeng Ye <[email protected]>
---
net/sched/act_gate.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
index c9a811f4c7ee..b82daf7401a5 100644
--- a/net/sched/act_gate.c
+++ b/net/sched/act_gate.c
@@ -124,25 +124,25 @@ TC_INDIRECT_SCOPE int tcf_gate_act(struct sk_buff *skb,
tcf_lastuse_update(&gact->tcf_tm);
tcf_action_update_bstats(&gact->common, skb);
- spin_lock(&gact->tcf_lock);
+ spin_lock_bh(&gact->tcf_lock);
if (unlikely(gact->current_gate_status & GATE_ACT_PENDING)) {
- spin_unlock(&gact->tcf_lock);
+ spin_unlock_bh(&gact->tcf_lock);
return action;
}
if (!(gact->current_gate_status & GATE_ACT_GATE_OPEN)) {
- spin_unlock(&gact->tcf_lock);
+ spin_unlock_bh(&gact->tcf_lock);
goto drop;
}
if (gact->current_max_octets >= 0) {
gact->current_entry_octets += qdisc_pkt_len(skb);
if (gact->current_entry_octets > gact->current_max_octets) {
- spin_unlock(&gact->tcf_lock);
+ spin_unlock_bh(&gact->tcf_lock);
goto overlimit;
}
}
- spin_unlock(&gact->tcf_lock);
+ spin_unlock_bh(&gact->tcf_lock);
return action;
--
2.17.1
+ Pedro Tammela <[email protected]>
Po Liu <[email protected]>
On Tue, Sep 26, 2023 at 06:26:25PM +0000, Chengfeng Ye wrote:
> I find that tcf_gate_act() acquires &gact->tcf_lock without explicitly
> disabling BH. Since gact->tcf_lock is also acquired by the timer callback
> in softirq context, if tcf_gate_act() is not called with BH disabled or
> from softirq context (which I am not sure about, as I cannot find the
> corresponding documentation), the following deadlock could occur.
>
> tcf_gate_act()
> --> spin_lock(&gact->tcf_lock)
> <interrupt>
> --> gate_timer_func()
> --> spin_lock(&gact->tcf_lock)
>
> Signed-off-by: Chengfeng Ye <[email protected]>
Hi Chengfeng Ye,
thanks for your patch.
As a fix for Networking, this should probably be targeted at the
'net' tree, which should be denoted in the subject.
Subject: [PATCH net] ...
And as a fix this patch should probably have a Fixes tag.
This one seems appropriate to me, but I could be wrong.
Fixes: a51c328df310 ("net: qos: introduce a gate control flow action")
I don't think it is necessary to repost just to address these issues,
but the Networking maintainers may think otherwise.
The code change itself looks good to me.
Reviewed-by: Simon Horman <[email protected]>
> ---
> net/sched/act_gate.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> index c9a811f4c7ee..b82daf7401a5 100644
> --- a/net/sched/act_gate.c
> +++ b/net/sched/act_gate.c
> @@ -124,25 +124,25 @@ TC_INDIRECT_SCOPE int tcf_gate_act(struct sk_buff *skb,
> tcf_lastuse_update(&gact->tcf_tm);
> tcf_action_update_bstats(&gact->common, skb);
>
> - spin_lock(&gact->tcf_lock);
> + spin_lock_bh(&gact->tcf_lock);
> if (unlikely(gact->current_gate_status & GATE_ACT_PENDING)) {
> - spin_unlock(&gact->tcf_lock);
> + spin_unlock_bh(&gact->tcf_lock);
> return action;
> }
>
> if (!(gact->current_gate_status & GATE_ACT_GATE_OPEN)) {
> - spin_unlock(&gact->tcf_lock);
> + spin_unlock_bh(&gact->tcf_lock);
> goto drop;
> }
>
> if (gact->current_max_octets >= 0) {
> gact->current_entry_octets += qdisc_pkt_len(skb);
> if (gact->current_entry_octets > gact->current_max_octets) {
> - spin_unlock(&gact->tcf_lock);
> + spin_unlock_bh(&gact->tcf_lock);
> goto overlimit;
> }
> }
> - spin_unlock(&gact->tcf_lock);
> + spin_unlock_bh(&gact->tcf_lock);
>
> return action;
>
> --
> 2.17.1
>
>
On Tue, Sep 26, 2023 at 11:27 AM Chengfeng Ye <[email protected]> wrote:
>
> I find that tcf_gate_act() acquires &gact->tcf_lock without explicitly
> disabling BH. Since gact->tcf_lock is also acquired by the timer callback
> in softirq context, if tcf_gate_act() is not called with BH disabled or
> from softirq context (which I am not sure about, as I cannot find the
> corresponding documentation), the following deadlock could occur.
Did you find this during code review or did you see a real
lockdep splat? If the latter, please include the full lockdep log.
Thanks.
> Did you find this during code review or did you see a real
> lockdep splat? If the latter, please include the full lockdep log.
No, it was found during static code review.
Thanks,
Chengfeng
On Tue, 26 Sep 2023 18:26:25 +0000 Chengfeng Ye wrote:
> I find that tcf_gate_act() acquires &gact->tcf_lock without explicitly
> disabling BH. Since gact->tcf_lock is also acquired by the timer callback
> in softirq context, if tcf_gate_act() is not called with BH disabled or
> from softirq context (which I am not sure about, as I cannot find the
> corresponding documentation), the following deadlock could occur.
>
> tcf_gate_act()
> --> spin_lock(&gact->tcf_lock)
> <interrupt>
> --> gate_timer_func()
> --> spin_lock(&gact->tcf_lock)
This is a TC action, I don't think it can run without BH being already
disabled, can it?
--
pw-bot: cr
On Thu, Oct 5, 2023 at 5:01 AM Chengfeng Ye <[email protected]> wrote:
>
> Hi Jakub,
>
> Thanks for the reply.
>
> I inspected the code a bit more. It seems that the TC action is called from
> the tcf_proto_ops.classify() callback, which is called from the Qdisc_ops
> enqueue callback.
>
> The Qdisc enqueue callback is in turn reached via
>
> -> __dev_queue_xmit()
> -> __dev_xmit_skb()
> -> dev_qdisc_enqueue()
>
> inside the net core. It seems that __dev_queue_xmit() is
> typically called from BH context (e.g., NET_TX_SOFTIRQ) with BH
> already disabled, but it can sometimes also be called from a work queue
> in process context; one case is br_mrp_test_work_expired() in
> net/bridge/br_mrp.c. Does this indicate that the TC action could also be
> called with BH enabled? I am not a developer, so I am really not sure,
> as the networking code is a bit long and complicated.
net/bridge/br_mrp.c seems to need some love +Cc Horatiu Vultur
<[email protected]>
Maybe that code needs to run in a tasklet?
In any case your patch is incorrect.
cheers,
jamal
Does it mean that dev_queue_xmit() should be executed under BH?
Thanks,
Chengfeng
Hi Jakub,
Thanks for the reply.
I inspected the code a bit more. It seems that the TC action is called from
the tcf_proto_ops.classify() callback, which is called from the Qdisc_ops
enqueue callback.
The Qdisc enqueue callback is in turn reached via
-> __dev_queue_xmit()
-> __dev_xmit_skb()
-> dev_qdisc_enqueue()
inside the net core. It seems that __dev_queue_xmit() is
typically called from BH context (e.g., NET_TX_SOFTIRQ) with BH
already disabled, but it can sometimes also be called from a work queue
in process context; one case is br_mrp_test_work_expired() in
net/bridge/br_mrp.c. Does this indicate that the TC action could also be
called with BH enabled? I am not a developer, so I am really not sure,
as the networking code is a bit long and complicated.
Thanks again,
Chengfeng
On Thu, Oct 05, 2023 at 05:01:07PM +0800, Chengfeng Ye wrote:
> Hi Jakub,
>
> Thanks for the reply.
>
> I inspected the code a bit more. It seems that the TC action is called from
> the tcf_proto_ops.classify() callback, which is called from the Qdisc_ops
> enqueue callback.
>
> The Qdisc enqueue callback is in turn reached via
>
> -> __dev_queue_xmit()
> -> __dev_xmit_skb()
> -> dev_qdisc_enqueue()
>
> inside the net core. It seems that __dev_queue_xmit() is
> typically called from BH context (e.g., NET_TX_SOFTIRQ) with BH
> already disabled, but it can sometimes also be called from a work queue
> in process context; one case is br_mrp_test_work_expired() in
> net/bridge/br_mrp.c. Does this indicate that the TC action could also be
> called with BH enabled? I am not a developer, so I am really not sure,
> as the networking code is a bit long and complicated.
Doesn't __dev_queue_xmit() itself disable BH with rcu_read_lock_bh()?
Thanks.
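
The point about rcu_read_lock_bh() can be illustrated with a toy single-CPU
model in userspace C. This is only a sketch under stated assumptions: every
toy_* name below is hypothetical and none of it is kernel code. It assumes
that a pending timer softirq runs on interrupt exit only when the BH-disable
depth is zero, and that re-enabling BH runs any pending softirq. The model
compares calling the action with BH enabled against calling it inside a
BH-disabled section, as __dev_queue_xmit() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU state. Hypothetical model, not kernel data structures. */
struct toy_cpu {
	int  bh_disable_depth; /* models the BH-disable nesting count */
	bool timer_pending;    /* a timer softirq is waiting to run */
	bool lock_held;        /* models the non-recursive gact->tcf_lock */
	bool deadlocked;       /* softirq found the lock held by this CPU */
	int  timer_runs;       /* times the timer callback completed */
};

/* Models gate_timer_func() taking the lock in softirq context. */
static void toy_run_timer_softirq(struct toy_cpu *cpu)
{
	if (cpu->lock_held) {
		/* A real spin here never ends: the holder is the
		 * context this softirq interrupted. */
		cpu->deadlocked = true;
		return;
	}
	cpu->lock_held = true;
	cpu->timer_runs++;
	cpu->lock_held = false;
	cpu->timer_pending = false;
}

/* Models interrupt exit: pending softirqs run only if BH is enabled. */
static void toy_irq_exit(struct toy_cpu *cpu)
{
	if (cpu->timer_pending && cpu->bh_disable_depth == 0)
		toy_run_timer_softirq(cpu);
}

static void toy_bh_disable(struct toy_cpu *cpu)
{
	cpu->bh_disable_depth++;
}

/* Re-enabling BH at depth zero runs whatever softirq was deferred. */
static void toy_bh_enable(struct toy_cpu *cpu)
{
	assert(cpu->bh_disable_depth > 0);
	if (--cpu->bh_disable_depth == 0 && cpu->timer_pending)
		toy_run_timer_softirq(cpu);
}

/* Models tcf_gate_act() with plain spin_lock(); the timer expires and an
 * interrupt arrives while the lock is held. */
static void toy_gate_act(struct toy_cpu *cpu)
{
	cpu->lock_held = true;
	cpu->timer_pending = true;
	toy_irq_exit(cpu);
	cpu->lock_held = false;
}

/* Models __dev_queue_xmit(): the whole call runs with BH disabled. */
static void toy_dev_queue_xmit(struct toy_cpu *cpu)
{
	toy_bh_disable(cpu);   /* stands in for rcu_read_lock_bh() */
	toy_gate_act(cpu);
	toy_bh_enable(cpu);    /* stands in for rcu_read_unlock_bh() */
}

/* Returns true if the modelled softirq hit the held lock. */
static bool toy_scenario(bool via_xmit, int *timer_runs)
{
	struct toy_cpu cpu = {0};

	if (via_xmit)
		toy_dev_queue_xmit(&cpu);
	else
		toy_gate_act(&cpu);
	*timer_runs = cpu.timer_runs;
	return cpu.deadlocked;
}
```

In this model, toy_scenario(false, ...) reports the deadlock from the patch
description, while toy_scenario(true, ...) does not: the timer is deferred
until BH is re-enabled, by which time the lock has been released.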
You are right, sorry for my negligence.
Thanks,
Chengfeng
The 10/05/2023 07:46, Jamal Hadi Salim wrote:
Hi Jamal,
> On Thu, Oct 5, 2023 at 5:01 AM Chengfeng Ye <[email protected]> wrote:
> >
> > Hi Jakub,
> >
> > Thanks for the reply.
> >
> > I inspected the code a bit more. It seems that the TC action is called from
> > the tcf_proto_ops.classify() callback, which is called from the Qdisc_ops
> > enqueue callback.
> >
> > The Qdisc enqueue callback is in turn reached via
> >
> > -> __dev_queue_xmit()
> > -> __dev_xmit_skb()
> > -> dev_qdisc_enqueue()
> >
> > inside the net core. It seems that __dev_queue_xmit() is
> > typically called from BH context (e.g., NET_TX_SOFTIRQ) with BH
> > already disabled, but it can sometimes also be called from a work queue
> > in process context; one case is br_mrp_test_work_expired() in
> > net/bridge/br_mrp.c. Does this indicate that the TC action could also be
> > called with BH enabled? I am not a developer, so I am really not sure,
> > as the networking code is a bit long and complicated.
>
> net/bridge/br_mrp.c seems to need some love +Cc Horatiu Vultur
> <[email protected]>
> Maybe that code needs to run in a tasklet?
> In any case your patch is incorrect.
I am currently traveling, and it would be a little hard for me to
look at this right now. I can have a look after I am back in the office
around mid-November.
But I was wondering if this is still an issue for MRP: as Cong Wang
pointed out, __dev_queue_xmit() already disables BH.
>
> cheers,
> jamal
>
--
/Horatiu
On Mon, Oct 9, 2023 at 2:36 AM Horatiu Vultur
<[email protected]> wrote:
>
> The 10/05/2023 07:46, Jamal Hadi Salim wrote:
>
> Hi Jamal,
>
> > On Thu, Oct 5, 2023 at 5:01 AM Chengfeng Ye <[email protected]> wrote:
> > >
> > > Hi Jakub,
> > >
> > > Thanks for the reply.
> > >
> > > I inspected the code a bit more. It seems that the TC action is called from
> > > the tcf_proto_ops.classify() callback, which is called from the Qdisc_ops
> > > enqueue callback.
> > >
> > > The Qdisc enqueue callback is in turn reached via
> > >
> > > -> __dev_queue_xmit()
> > > -> __dev_xmit_skb()
> > > -> dev_qdisc_enqueue()
> > >
> > > inside the net core. It seems that __dev_queue_xmit() is
> > > typically called from BH context (e.g., NET_TX_SOFTIRQ) with BH
> > > already disabled, but it can sometimes also be called from a work queue
> > > in process context; one case is br_mrp_test_work_expired() in
> > > net/bridge/br_mrp.c. Does this indicate that the TC action could also be
> > > called with BH enabled? I am not a developer, so I am really not sure,
> > > as the networking code is a bit long and complicated.
> >
> > net/bridge/br_mrp.c seems to need some love +Cc Horatiu Vultur
> > <[email protected]>
> > Maybe that code needs to run in a tasklet?
> > In any case your patch is incorrect.
>
> I am currently traveling, and it would be a little hard for me to
> look at this right now. I can have a look after I am back in the office
> around mid-November.
> But I was wondering if this is still an issue for MRP: as Cong Wang
> pointed out, __dev_queue_xmit() already disables BH.
Yeah, sorry - should have read the code. Cong is right, there's
nothing for you to do.
cheers,
jamal
> >
> > cheers,
> > jamal
> >
>
> --
> /Horatiu