2023-06-27 11:18:44

by Chengfeng Ye

Subject: [PATCH] net/802/garp: fix potential deadlock on &app->lock

As &app->lock is also acquired by the timer garp_join_timer(), which
executes in softirq context, code executing in process context should
disable irqs before acquiring the lock. Otherwise a deadlock could
happen if the process context holds the lock and is then preempted by
the interrupt.

garp_pdu_rcv() is one such function that acquires &app->lock, but I
am not sure whether its callers already invoke it with irqs disabled,
so the patch could be a false positive.

Possible deadlock scenario:
garp_pdu_rcv()
-> spin_lock(&app->lock)
<timer interrupt>
-> garp_join_timer()
-> spin_lock(&app->lock)
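
The interleaving above can be replayed in a small userspace model. This is only a sketch under stated assumptions: it is not kernel code, every toy_* name is hypothetical, and whether the timer can actually preempt the receive path is passed in as a parameter rather than derived from real context rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy, non-recursive lock for a single simulated CPU. Not a kernel API. */
struct toy_lock {
	bool held;
};

/* Take the lock if it is free; report failure where a real spinlock would spin. */
static bool toy_trylock(struct toy_lock *lock)
{
	assert(lock != NULL);
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *lock)
{
	assert(lock != NULL);
	lock->held = false;
}

/*
 * Replay the reported interleaving on one CPU. timer_can_preempt says
 * whether the softirq timer may run while the receive path holds the
 * lock. Returns true if the timer handler would spin forever.
 */
static bool toy_replay_scenario(bool timer_can_preempt)
{
	struct toy_lock app_lock = { .held = false };
	bool stuck = false;

	/* garp_pdu_rcv(): spin_lock(&app->lock) */
	toy_trylock(&app_lock);

	/* <timer interrupt> -> garp_join_timer(): spin_lock(&app->lock) */
	if (timer_can_preempt)
		stuck = !toy_trylock(&app_lock);

	/* The model always gets here; a CPU that is really stuck never would. */
	toy_unlock(&app_lock);
	return stuck;
}
```

In this model toy_replay_scenario(true) reports the self-deadlock and toy_replay_scenario(false) does not, so the report hinges entirely on whether the timer can run on top of garp_pdu_rcv() on the same CPU.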

This flaw was found using an experimental static analysis tool we are
developing for irq-related deadlock.

This tentative patch fixes the potential deadlock with
spin_lock_irqsave(). Or should it be fixed with spin_lock_bh() if it is
a real bug? I am not sure.
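
To make the irqsave-versus-bh question concrete, here is a hedged userspace sketch of what each lock variant keeps off the local CPU while the lock is held. The enum and function names are hypothetical and the table is a simplification (it ignores NMIs, PREEMPT_RT and other CPUs); it is not taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three locking calls discussed here. */
enum toy_lock_variant {
	TOY_SPIN_LOCK,         /* models spin_lock(): masks neither softirqs nor irqs */
	TOY_SPIN_LOCK_BH,      /* models spin_lock_bh(): disables softirq processing */
	TOY_SPIN_LOCK_IRQSAVE, /* models spin_lock_irqsave(): disables local interrupts */
};

/* Context in which the competing lock user runs. */
enum toy_context {
	TOY_CTX_SOFTIRQ,
	TOY_CTX_HARDIRQ,
};

/*
 * Does taking the lock with this variant stop a contender running in
 * ctx from interrupting the lock holder on the same CPU?
 */
static bool toy_variant_excludes(enum toy_lock_variant variant,
				 enum toy_context ctx)
{
	switch (variant) {
	case TOY_SPIN_LOCK:
		return false;
	case TOY_SPIN_LOCK_BH:
		return ctx == TOY_CTX_SOFTIRQ;
	case TOY_SPIN_LOCK_IRQSAVE:
		/* With local irqs off, softirqs cannot start here either. */
		return true;
	}
	return false;
}
```

Under these assumptions, a contender that only runs in softirq context, such as a timer callback, is already excluded by the _bh variant; the irqsave variant also works but masks more than that case requires.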

Signed-off-by: Chengfeng Ye <[email protected]>
---
net/802/garp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/802/garp.c b/net/802/garp.c
index ab24b21fbb49..acc6f2f847a6 100644
--- a/net/802/garp.c
+++ b/net/802/garp.c
@@ -515,6 +515,7 @@ static void garp_pdu_rcv(const struct stp_proto *proto, struct sk_buff *skb,
 	struct garp_port *port;
 	struct garp_applicant *app;
 	const struct garp_pdu_hdr *gp;
+	unsigned long flags;

 	port = rcu_dereference(dev->garp_port);
 	if (!port)
@@ -530,14 +531,14 @@ static void garp_pdu_rcv(const struct stp_proto *proto, struct sk_buff *skb,
 		goto err;
 	skb_pull(skb, sizeof(*gp));

-	spin_lock(&app->lock);
+	spin_lock_irqsave(&app->lock, flags);
 	while (skb->len > 0) {
 		if (garp_pdu_parse_msg(app, skb) < 0)
 			break;
 		if (garp_pdu_parse_end_mark(skb) < 0)
 			break;
 	}
-	spin_unlock(&app->lock);
+	spin_unlock_irqrestore(&app->lock, flags);
 err:
 	kfree_skb(skb);
 }
--
2.17.1



2023-06-27 14:32:55

by Eric Dumazet

Subject: Re: [PATCH] net/802/garp: fix potential deadlock on &app->lock

On Tue, Jun 27, 2023 at 12:52 PM Chengfeng Ye <[email protected]> wrote:
>
> As &app->lock is also acquired by the timer garp_join_timer(), which
> executes in softirq context, code executing in process context should
> disable irqs before acquiring the lock. Otherwise a deadlock could
> happen if the process context holds the lock and is then preempted by
> the interrupt.
>
> garp_pdu_rcv() is one such function that acquires &app->lock, but I
> am not sure whether its callers already invoke it with irqs disabled,
> so the patch could be a false positive.
>
> Possible deadlock scenario:
> garp_pdu_rcv()
> -> spin_lock(&app->lock)
> <timer interrupt>

This cannot happen.

RX handlers are called from BH context, with rcu_read_lock() held.

See deliver_skb() and netif_receive_skb() in net/core/dev.c.
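
The reasoning here rests on softirq handlers not nesting on one CPU: a timer callback that runs in softirq context cannot start on top of code that is itself already in BH context. A minimal userspace sketch of that rule follows, assuming a simplified single-CPU model; the struct and function names are hypothetical and do not come from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-CPU state. Not a kernel structure. */
struct toy_cpu_state {
	bool in_softirq;  /* already running a softirq handler (BH context) */
	bool bh_disabled; /* softirqs masked, e.g. by a _bh lock variant */
};

/*
 * A pending softirq timer can only start on this CPU if no softirq
 * handler is already running there and BHs are not disabled.
 */
static bool toy_timer_can_preempt(const struct toy_cpu_state *cpu)
{
	assert(cpu != NULL);
	return !cpu->in_softirq && !cpu->bh_disabled;
}
```

In this model an RX handler such as garp_pdu_rcv() corresponds to in_softirq being true, so toy_timer_can_preempt() returns false and the reported interleaving is not reachable on that CPU. Only a caller in process context with BHs enabled would return true and need extra protection.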


> -> garp_join_timer()
> -> spin_lock(&app->lock)
>
> This flaw was found using an experimental static analysis tool we are
> developing for irq-related deadlock.
>
> This tentative patch fixes the potential deadlock with
> spin_lock_irqsave(). Or should it be fixed with spin_lock_bh() if it is
> a real bug? I am not sure.

I guess more work is needed on your side :)

2023-06-27 15:06:17

by Chengfeng Ye

Subject: Re: [PATCH] net/802/garp: fix potential deadlock on &app->lock

I should have noticed that the _rcv suffix mostly denotes RX handlers.
Thanks very much for pointing that out!

Best regards,
Chengfeng