LinuxLists.cc - kernel panic in 4.2.3, rb_erase in sch

2015-11-02 14:17:08

Subject: kernel panic in 4.2.3, rb_erase in sch_fq

Hi!

It seems I have been getting this panic for a while (about once per
week) on a loaded pppoe server, but only now was I able to capture the
full panic message. I checked the commit log for sch_fq.c and didn't
see any fixes, so upgrading to a newer kernel probably won't help?

<remote> [237470.633382] general protection fault: 0000 [#1] SMP
<remote> [237470.633832] Modules linked in: netconsole configfs act_skbedit sch_fq cls_fw act_police cls_u32 sch_ingress sch_sfq sch_htb pppoe pppox ppp_generic slhc xt_nat ts_bm xt_string xt_connmark xt_TCPMSS xt_tcpudp xt_mark iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle ip_tables x_tables 8021q garp mrp stp llc
<remote> [237470.637835] CPU: 1 PID: 14035 Comm: accel-pppd Not tainted 4.2.3-build-0087 #3
<remote> [237470.638342] Hardware name: Intel Corporation S2600GZ/S2600GZ, BIOS SE5C600.86B.01.03.0002.062020121504 06/20/2012
<remote> [237470.638859] task: ffff8803ef8b5080 ti: ffff8803ed7e0000 task.ti: ffff8803ed7e0000
<remote> [237470.639370] RIP: 0010:[<ffffffff8125af5d>] [<ffffffff8125af5d>] rb_erase+0x37/0x2c4
<remote> [237470.639960] RSP: 0018:ffff8803ed7e3b88 EFLAGS: 00010286
<remote> [237470.644863] RAX: 0000000000000000 RBX: ffff8804106ab000 RCX: 0000000000000001
<remote> [237470.645366] RDX: ffa2050402210218 RSI: ffff88040cfe2cf0 RDI: ffff8803f50d00e0
<remote> [237470.645872] RBP: ffff8803ed7e3b88 R08: 0000000000000000 R09: ffff88042ee37d50
<remote> [237470.646376] R10: ffffea000fe7a9c0 R11: 0000000094f1b850 R12: 000000000000019e
<remote> [237470.646881] R13: ffff88040cfe2cf0 R14: ffff8803f50d00d0 R15: 0000000000000000
<remote> [237470.647381] FS: 00007fcd5d384700(0000) GS:ffff88042ee20000(0000) knlGS:0000000000000000
<remote> [237470.647889] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<remote> [237470.648209] CR2: 00007fcd003efa90 CR3: 0000000424b6e000 CR4: 00000000000406e0
<remote> [237470.648707] Stack:
<remote> [237470.648990] ffff8803ed7e3bb8 ffffffffa00ef38b ffff8804106ab000 ffff880416079000
<remote> [237470.649791] 0000000000000002 ffff8804160790d8 ffff8803ed7e3bd8 ffffffff8183785c
<remote> [237470.650589] 0000000000000002 ffff8800b021d000 ffff8803ed7e3c18 ffffffffa00d247a
<remote> [237470.651387] Call Trace:
<remote> [237470.651716] [<ffffffffa00ef38b>] fq_reset+0x7a/0xf2 [sch_fq]
<remote> [237470.652084] [<ffffffff8183785c>] qdisc_reset+0x18/0x42
<remote> [237470.652444] [<ffffffffa00d247a>] htb_reset+0x96/0x14d [sch_htb]
<remote> [237470.652780] [<ffffffff8183785c>] qdisc_reset+0x18/0x42
<remote> [237470.653146] [<ffffffff81837e3d>] dev_deactivate_queue.constprop.34+0x43/0x53
<remote> [237470.653726] [<ffffffff81838543>] dev_deactivate_many+0x53/0x206
<remote> [237470.654088] [<ffffffff81819ce2>] __dev_close_many+0x73/0xbf
<remote> [237470.654436] [<ffffffff81819e5d>] __dev_close+0x2c/0x41
<remote> [237470.654784] [<ffffffff818a5569>] ? _raw_spin_unlock_bh+0x15/0x17
<remote> [237470.655106] [<ffffffff8181ff25>] __dev_change_flags+0xa5/0x13c
<remote> [237470.655427] [<ffffffff8181ffdf>] dev_change_flags+0x23/0x59
<remote> [237470.655777] [<ffffffff818a437b>] ? mutex_lock+0x13/0x24
<remote> [237470.656073] [<ffffffff81874197>] devinet_ioctl+0x246/0x533
<remote> [237470.656372] [<ffffffff8187572d>] inet_ioctl+0x8c/0xa6
<remote> [237470.656667] [<ffffffff81807e06>] sock_do_ioctl+0x22/0x40
<remote> [237470.656960] [<ffffffff81808016>] sock_ioctl+0x1f2/0x200
<remote> [237470.657253] [<ffffffff8115f5b5>] do_vfs_ioctl+0x360/0x41a
<remote> [237470.657549] [<ffffffff81151989>] ? vfs_write+0x105/0x164
<remote> [237470.657841] [<ffffffff8115f6a8>] SyS_ioctl+0x39/0x61
<remote> [237470.658134] [<ffffffff818a591b>] entry_SYSCALL_64_fastpath+0x16/0x6e
<remote> [237470.658431] Code: 48 85 c0 75 36 48 8b 0f 48 89 c8 48 83 e0 fc 74 12 48 39 78 10 75 06 48 89 50 10 eb 09 48 89 50 08 eb 03 48 89 16 48 85 d2 74 08 <48> 89 0a e9 83 02 00 00 80 e1 01 e9 c3 00 00 00 48 85 d2 75 2c
<remote> [237470.663930] RIP [<ffffffff8125af5d>] rb_erase+0x37/0x2c4
<remote> [237470.664296] RSP <ffff8803ed7e3b88>
<remote> [237470.664598] ---[ end trace 32ea40a7de450892 ]---
<remote> [237470.673272] Kernel panic - not syncing: Fatal exception in interrupt
<remote> [237470.673577] Kernel Offset: disabled
<remote> [237470.704654] Rebooting in 5 seconds..
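
A note for reading the register dump: RDX holds ffa2050402210218, which
is not a canonical x86-64 address, and a memory access through a
non-canonical pointer raises a general protection fault rather than a
page fault. That would be consistent with rb_erase following a corrupted
tree pointer. A minimal sketch of the canonical check, assuming 48-bit
virtual addresses (4-level paging); the helper name is_canonical48 is
hypothetical and not a kernel function:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, assuming 48-bit virtual addressing:
 * an address is canonical when bits 63..47 are all equal,
 * i.e. the top 17 bits are either all zero or all one.
 */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* keeps bits 63..47 (17 bits) */

	return top == 0 || top == 0x1ffff;
}
```

With this check, the RDX value from the dump is rejected, while RDI
(ffff8803f50d00e0) and the FS base (00007fcd5d384700) both pass.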

2015-11-02 15:25:01

by Eric Dumazet

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote:
> Hi!
>
> Actually seems i was getting this panic for a while (once per week) on
> loaded pppoe server, but just now was able to get full panic message.
> After checking commit logs on sch_fq.c i didnt seen any fixes, so
> probably upgrading to newer kernel wont help?

I do not think we support sch_fq as a HTB leaf.

If you want both HTB and sch_fq, you need to set up a bonding device.

HTB on bond0

sch_fq on the slaves

Sure, the kernel should not crash, but HTB+sch_fq on same net device is
certainly not something that will work anyway.

2015-11-02 15:58:18

by Denys Fedoryshchenko

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On 2015-11-02 17:24, Eric Dumazet wrote:
> On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote:
>> Hi!
>>
>> Actually seems i was getting this panic for a while (once per week) on
>> loaded pppoe server, but just now was able to get full panic message.
>> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> probably upgrading to newer kernel wont help?
>
> I do not think we support sch_fq as a HTB leaf.
>
> If you want both HTB and sch_fq, you need to setup a bonding device.
>
> HTB on bond0
>
> sch_fq on the slaves
>
> Sure, the kernel should not crash, but HTB+sch_fq on same net device is
> certainly not something that will work anyway.
Strange, because apart from ppp, on static devices this scheme works
really well. It is the only solution that can reliably throttle incoming
bandwidth when the bandwidth is heavily overbooked, at least for my use
cases, such as 256k+ flows at 2.5Gbps with several different classes of
traffic, where DRR would simply not have enough classes.

On the latest kernels I had to patch tc to expose a parameter for the
orphan mask in fq, to increase the number of flows for transit traffic.
No other qdisc is able to solve this problem: incoming bandwidth simply
runs 10-20% over the configured limit, but fq does magic.
The only device that worked with similar efficiency in such cases is the
proprietary PacketShaper, but it modifies the TCP window size, so it
can't be called transparent, and it also has stability issues above 1Gbps.

2015-11-02 16:12:07

by Eric Dumazet

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On Mon, 2015-11-02 at 17:58 +0200, Denys Fedoryshchenko wrote:
> On 2015-11-02 17:24, Eric Dumazet wrote:
> > On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote:
> >> Hi!
> >>
> >> Actually seems i was getting this panic for a while (once per week) on
> >> loaded pppoe server, but just now was able to get full panic message.
> >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
> >> probably upgrading to newer kernel wont help?
> >
> > I do not think we support sch_fq as a HTB leaf.
> >
> > If you want both HTB and sch_fq, you need to setup a bonding device.
> >
> > HTB on bond0
> >
> > sch_fq on the slaves
> >
> > Sure, the kernel should not crash, but HTB+sch_fq on same net device is
> > certainly not something that will work anyway.
> Strange, because except ppp, on static devices it works really very well
> in such scheme. It is the only solution that can throttle incoming
> bandwidth, when bandwidth is very overbooked - reliably, for my use
> cases, such as 256k+ flows/2.5Gbps and several different classes of
> traffic, so using DRR will end up in just not enough classes.
>
> On latest kernels i had to patch tc to provide parameter for orphan mask
> in fq, to increase number for flows for transit traffic.
> None of other qdiscs able to solve this problem, incoming bandwidth
> simply flowing 10-20% more than set, but fq is doing magic.
> The only device that was working with similar efficiency for such cases
> - proprietary PacketShaper, but is modifying tcp window size, and can't
> be called transparent, and also has stability issues over 1Gbps.

Ah, I was thinking you needed more like 10Gb traffic ;)

With HTB on bonding, we can use MQ+FQ on the slaves in order to use many
CPUs to serve local traffic.

But yes, if you use HTB+FQ for forwarding, I guess the bonding setup is
not really needed.

2015-11-02 16:21:45

by Denys Fedoryshchenko

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On 2015-11-02 18:12, Eric Dumazet wrote:
> On Mon, 2015-11-02 at 17:58 +0200, Denys Fedoryshchenko wrote:
>> On 2015-11-02 17:24, Eric Dumazet wrote:
>> > On Mon, 2015-11-02 at 16:11 +0200, Denys Fedoryshchenko wrote:
>> >> Hi!
>> >>
>> >> Actually seems i was getting this panic for a while (once per week) on
>> >> loaded pppoe server, but just now was able to get full panic message.
>> >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> >> probably upgrading to newer kernel wont help?
>> >
>> > I do not think we support sch_fq as a HTB leaf.
>> >
>> > If you want both HTB and sch_fq, you need to setup a bonding device.
>> >
>> > HTB on bond0
>> >
>> > sch_fq on the slaves
>> >
>> > Sure, the kernel should not crash, but HTB+sch_fq on same net device is
>> > certainly not something that will work anyway.
>> Strange, because except ppp, on static devices it works really very
>> well
>> in such scheme. It is the only solution that can throttle incoming
>> bandwidth, when bandwidth is very overbooked - reliably, for my use
>> cases, such as 256k+ flows/2.5Gbps and several different classes of
>> traffic, so using DRR will end up in just not enough classes.
>>
>> On latest kernels i had to patch tc to provide parameter for orphan
>> mask
>> in fq, to increase number for flows for transit traffic.
>> None of other qdiscs able to solve this problem, incoming bandwidth
>> simply flowing 10-20% more than set, but fq is doing magic.
>> The only device that was working with similar efficiency for such
>> cases
>> - proprietary PacketShaper, but is modifying tcp window size, and
>> can't
>> be called transparent, and also has stability issues over 1Gbps.
>
> Ah, I was thinking you needed more like 10Gb traffic ;)
>
> with HTB on bonding, we can use MQ+FQ on the slaves in order to use
> many
> cpus to serve local traffic.
>
> But yes, if you use HTB+FQ for forwarding, I guess the bonding setup is
> not really needed.
Well, the country here is very underdeveloped in matters of technology.
10G interfaces appeared at some ISPs only this year.
On the ppp interfaces where the crash is happening there is even less
bandwidth: each user gets at most 1-2Mbps (average usage 128kbps),
across 4.5k interfaces.
But I have some heavier setups there, with around 9k pppoe users
terminated on a single server (that means 9k interfaces) and about
2Gbps of traffic passing through.
If I took a non-FOSS solution, I would have to pay $100k+ for software
licenses, which is unbearable for a local ISP. fq is not critical in
this specific use case, I can use fifo or similar on the ppp interfaces,
but I guess it is better to report a bug :)

2015-11-03 22:06:09

by Cong Wang

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
<[email protected]> wrote:
> Hi!
>
> Actually seems i was getting this panic for a while (once per week) on
> loaded pppoe server, but just now was able to get full panic message.
> After checking commit logs on sch_fq.c i didnt seen any fixes, so probably
> upgrading to newer kernel wont help?

Can you share your `tc qdisc show dev xxxx` with us? And how to reproduce
it? I tried to set up htb+fq and then flip the interface back and forth,
but I don't see any crash.

Thanks.

2015-11-04 04:25:43

by Denys Fedoryshchenko

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On 2015-11-04 00:06, Cong Wang wrote:
> On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
> <[email protected]> wrote:
>> Hi!
>>
>> Actually seems i was getting this panic for a while (once per week) on
>> loaded pppoe server, but just now was able to get full panic message.
>> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> probably
>> upgrading to newer kernel wont help?
>
>
> Can you share your `tc qdisc show dev xxxx` with us? And how to
> reproduce
> it? I tried to setup htb+fq and then flip the interface back and forth
> but I don't
> see any crash.
My guess is that it won't be easy to reproduce. It is happening on a box
with 4.5k interfaces, where interfaces are constantly created and
deleted, and even then the problem may happen once per day, or may not
happen for a week.

Here is the script that is fired after a new ppp interface is detected.
Note that the pppoe processes are independent from the processes that
are "establishing" the shapers.

/sbin/tc qdisc del root
/sbin/tc qdisc add handle 1: root htb default 3

/sbin/tc filter add parent 1:0 protocol ip prio 4 handle 1 fw flowid 1:3
/sbin/tc filter add parent 1:0 protocol ip prio 3 u32 match ip protocol 6 0xff match ip src 10.0.252.8/32 flowid 1:3
/sbin/tc filter add parent 1:0 protocol ip prio 5 u32 match ip protocol 1 0xff flowid 1:0
/sbin/tc filter add parent 1:0 protocol ip prio 5 u32 match ip protocol 6 0xff match ip sport 80 0xffff flowid 1:4
/sbin/tc filter add parent 1:0 protocol ip prio 5 u32 match ip protocol 6 0xff match ip sport 443 0xffff flowid 1:5
/sbin/tc filter add parent 1:0 protocol ip prio 100 u32 match u32 0 0 flowid 1:2

/sbin/tc class add classid 1:1 parent 1:0 htb rate 512Kbit ceil 512Kbit
/sbin/tc class add classid 1:2 parent 1:1 htb rate 32Kbit ceil 512Kbit
/sbin/tc class add classid 1:3 parent 1:0 htb rate 10Mbit ceil 10Mbit
/sbin/tc class add classid 1:4 parent 1:1 htb rate 32Kbit ceil 512Kbit
/sbin/tc class add classid 1:5 parent 1:1 htb rate 32Kbit ceil 512Kbit

/sbin/tc qdisc add parent 1:2 fq limit 300
/sbin/tc qdisc add parent 1:3 pfifo limit 300
/sbin/tc qdisc add parent 1:4 fq limit 300
/sbin/tc qdisc add parent 1:5 fq limit 300

Possible cases that come to my mind (but maybe I missed others):
- The script and tc are still running when the interface is deleted
  (e.g. the interface disappears mid-run)
- The script deletes the root qdisc while there is heavy traffic on the
  interface and a lot of packets are queued
- The ppp interface is destroyed while there is a lot of traffic queued
  on it (this one is a bit of a rare situation)

>
> Thanks.

2015-11-04 04:46:25

by Eric Dumazet

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote:
> On 2015-11-04 00:06, Cong Wang wrote:
> > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
> > <[email protected]> wrote:
> >> Hi!
> >>
> >> Actually seems i was getting this panic for a while (once per week) on
> >> loaded pppoe server, but just now was able to get full panic message.
> >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
> >> probably
> >> upgrading to newer kernel wont help?
> >
> >
> > Can you share your `tc qdisc show dev xxxx` with us? And how to
> > reproduce
> > it? I tried to setup htb+fq and then flip the interface back and forth
> > but I don't
> > see any crash.
> My guess it wont be easy to reproduce, it is happening on box with 4.5k
> interfaces, that constantly create/delete interfaces,
> and even with that this problem may happen once per day, or may not
> happen for 1 week.
>
> Here is script that is being fired after new ppp interface detected. But
> pppoe process are independent from
> process that are "establishing" shapers.

It is probably a generic bug. sch_fq seems OK to me.

Somehow nobody tries to change qdisc hundred times per second ;)

Could you try the following patch?

It seems to 'fix' the issue for me.

diff --git a/net/core/dev.c b/net/core/dev.c
index 8ce3f74cd6b9..bf136103bc7b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2880,6 +2880,12 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 		spin_lock(&q->busylock);

 	spin_lock(root_lock);
+	if (unlikely(q != rcu_dereference_bh(txq->qdisc))) {
+		pr_err_ratelimited("Arg, qdisc changed ! state %lx\n", q->state);
+		kfree_skb(skb);
+		rc = NET_XMIT_DROP;
+		goto end;
+	}
 	if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
 		kfree_skb(skb);
 		rc = NET_XMIT_DROP;
@@ -2913,6 +2919,7 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
 			__qdisc_run(q);
 		}
 	}
+end:
 	spin_unlock(root_lock);
 	if (unlikely(contended))
 		spin_unlock(&q->busylock);

2015-11-04 04:58:44

by Eric Dumazet

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On Tue, 2015-11-03 at 20:46 -0800, Eric Dumazet wrote:
> On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote:
> > On 2015-11-04 00:06, Cong Wang wrote:
> > > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
> > > <[email protected]> wrote:
> > >> Hi!
> > >>
> > >> Actually seems i was getting this panic for a while (once per week) on
> > >> loaded pppoe server, but just now was able to get full panic message.
> > >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
> > >> probably
> > >> upgrading to newer kernel wont help?
> > >
> > >
> > > Can you share your `tc qdisc show dev xxxx` with us? And how to
> > > reproduce
> > > it? I tried to setup htb+fq and then flip the interface back and forth
> > > but I don't
> > > see any crash.
> > My guess it wont be easy to reproduce, it is happening on box with 4.5k
> > interfaces, that constantly create/delete interfaces,
> > and even with that this problem may happen once per day, or may not
> > happen for 1 week.
> >
> > Here is script that is being fired after new ppp interface detected. But
> > pppoe process are independent from
> > process that are "establishing" shapers.
>
>
> It is probably a generic bug. sch_fq seems OK to me.
>
> Somehow nobody tries to change qdisc hundred times per second ;)
>
> Could you try following patch ?
>
> It seems to 'fix' the issue for me.

The following patch would be more appropriate.
The prior one was only meant to 'show' the issue.

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cb5d4ad32946..7f5f3e8a10f5 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -706,9 +706,11 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 	spin_lock_bh(root_lock);

 	/* Prune old scheduler */
-	if (oqdisc && atomic_read(&oqdisc->refcnt) <= 1)
-		qdisc_reset(oqdisc);
-
+	if (oqdisc) {
+		if (atomic_read(&oqdisc->refcnt) <= 1)
+			qdisc_reset(oqdisc);
+		set_bit(__QDISC_STATE_DEACTIVATED, &oqdisc->state);
+	}
 	/* ... and graft new one */
 	if (qdisc == NULL)
 		qdisc = &noop_qdisc;
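
One way to read the two patches: a CPU in __dev_xmit_skb() may still
hold a pointer to the old qdisc after dev_graft_qdisc() has reset and
replaced it, and the second patch appears to close that window by always
marking the old qdisc __QDISC_STATE_DEACTIVATED, so the transmit path
drops the packet instead of enqueueing it. Below is a minimal
single-threaded userspace sketch of that interleaving, not kernel code.
struct toy_qdisc and all toy_* functions are hypothetical names invented
for illustration, and the model simplifies by resetting the old qdisc
unconditionally (the kernel only does so when refcnt <= 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct Qdisc; every name here is hypothetical. */
struct toy_qdisc {
	bool deactivated;	/* models __QDISC_STATE_DEACTIVATED */
	bool was_reset;		/* set once the queue has been torn down */
	int stale_enqueues;	/* packets enqueued after the reset (the bug) */
};

/* Models dev_graft_qdisc() retiring the old qdisc. */
static void toy_graft_away(struct toy_qdisc *old, bool patched)
{
	assert(old != NULL);
	old->was_reset = true;			/* models qdisc_reset(oqdisc) */
	if (patched)
		old->deactivated = true;	/* the set_bit() the patch adds */
}

/*
 * Models __dev_xmit_skb() running with a pointer fetched before the graft.
 * Returns true if the packet was enqueued, false if it was dropped.
 */
static bool toy_xmit(struct toy_qdisc *q)
{
	assert(q != NULL);
	if (q->deactivated)
		return false;		/* models the NET_XMIT_DROP path */
	if (q->was_reset)
		q->stale_enqueues++;	/* enqueue onto an already-reset qdisc */
	return true;
}

/*
 * Runs the interleaving "graft first, stale transmit second" and
 * returns how many packets landed on the retired qdisc.
 */
static int toy_race(bool patched)
{
	struct toy_qdisc old = { false, false, 0 };

	toy_graft_away(&old, patched);
	(void)toy_xmit(&old);
	return old.stale_enqueues;
}
```

Under these assumptions toy_race(false) leaves one packet on the retired
qdisc, while toy_race(true) leaves none, because the transmit path sees
the deactivated flag and drops the packet.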

2015-11-04 05:13:35

by Denys Fedoryshchenko

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

On 2015-11-04 06:58, Eric Dumazet wrote:
> On Tue, 2015-11-03 at 20:46 -0800, Eric Dumazet wrote:
>> On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote:
>> > On 2015-11-04 00:06, Cong Wang wrote:
>> > > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
>> > > <[email protected]> wrote:
>> > >> Hi!
>> > >>
>> > >> Actually seems i was getting this panic for a while (once per week) on
>> > >> loaded pppoe server, but just now was able to get full panic message.
>> > >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> > >> probably
>> > >> upgrading to newer kernel wont help?
>> > >
>> > >
>> > > Can you share your `tc qdisc show dev xxxx` with us? And how to
>> > > reproduce
>> > > it? I tried to setup htb+fq and then flip the interface back and forth
>> > > but I don't
>> > > see any crash.
>> > My guess it wont be easy to reproduce, it is happening on box with 4.5k
>> > interfaces, that constantly create/delete interfaces,
>> > and even with that this problem may happen once per day, or may not
>> > happen for 1 week.
>> >
>> > Here is script that is being fired after new ppp interface detected. But
>> > pppoe process are independent from
>> > process that are "establishing" shapers.
>>
>>
>> It is probably a generic bug. sch_fq seems OK to me.
>>
>> Somehow nobody tries to change qdisc hundred times per second ;)
>>
>> Could you try following patch ?
>>
>> It seems to 'fix' the issue for me.
>
> Following patch would be more appropriate.
> Prior one was meant to 'show' the issue.
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index cb5d4ad32946..7f5f3e8a10f5 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -706,9 +706,11 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue
> *dev_queue,
> spin_lock_bh(root_lock);
>
> /* Prune old scheduler */
> - if (oqdisc && atomic_read(&oqdisc->refcnt) <= 1)
> - qdisc_reset(oqdisc);
> -
> + if (oqdisc) {
> + if (atomic_read(&oqdisc->refcnt) <= 1)
> + qdisc_reset(oqdisc);
> + set_bit(__QDISC_STATE_DEACTIVATED, &oqdisc->state);
> + }
> /* ... and graft new one */
> if (qdisc == NULL)
> qdisc = &noop_qdisc;

Applied, will test it, but this bug might be triggered rarely.
I will try to push it to more pppoe servers in order to stress test them
(and 4.3) more.

2015-11-13 21:41:33

by Denys Fedoryshchenko

Subject: Re: kernel panic in 4.2.3, rb_erase in sch_fq

I can confirm that after the patch this issue never appeared again. So
it may be good to push it to stable etc. :) Thanks a lot Eric, you
saved me again.

I still have some weird panic issues, maybe related to conntrack, but
they are rare even under high load, so I am slowly gathering data, and I
found at least one more person with similar conntrack crashes on the
latest kernels.

On 2015-11-04 06:46, Eric Dumazet wrote:
> On Wed, 2015-11-04 at 06:25 +0200, Denys Fedoryshchenko wrote:
>> On 2015-11-04 00:06, Cong Wang wrote:
>> > On Mon, Nov 2, 2015 at 6:11 AM, Denys Fedoryshchenko
>> > <[email protected]> wrote:
>> >> Hi!
>> >>
>> >> Actually seems i was getting this panic for a while (once per week) on
>> >> loaded pppoe server, but just now was able to get full panic message.
>> >> After checking commit logs on sch_fq.c i didnt seen any fixes, so
>> >> probably
>> >> upgrading to newer kernel wont help?
>> >
>> >
>> > Can you share your `tc qdisc show dev xxxx` with us? And how to
>> > reproduce
>> > it? I tried to setup htb+fq and then flip the interface back and forth
>> > but I don't
>> > see any crash.
>> My guess it wont be easy to reproduce, it is happening on box with
>> 4.5k
>> interfaces, that constantly create/delete interfaces,
>> and even with that this problem may happen once per day, or may not
>> happen for 1 week.
>>
>> Here is script that is being fired after new ppp interface detected.
>> But
>> pppoe process are independent from
>> process that are "establishing" shapers.
>
>
> It is probably a generic bug. sch_fq seems OK to me.
>
> Somehow nobody tries to change qdisc hundred times per second ;)
>
> Could you try following patch ?
>
> It seems to 'fix' the issue for me.
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 8ce3f74cd6b9..bf136103bc7b 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2880,6 +2880,12 @@ static inline int __dev_xmit_skb(struct sk_buff
> *skb, struct Qdisc *q,
> spin_lock(&q->busylock);
>
> spin_lock(root_lock);
> + if (unlikely(q != rcu_dereference_bh(txq->qdisc))) {
> + pr_err_ratelimited("Arg, qdisc changed ! state %lx\n", q->state);
> + kfree_skb(skb);
> + rc = NET_XMIT_DROP;
> + goto end;
> + }
> if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) {
> kfree_skb(skb);
> rc = NET_XMIT_DROP;
> @@ -2913,6 +2919,7 @@ static inline int __dev_xmit_skb(struct sk_buff
> *skb, struct Qdisc *q,
> __qdisc_run(q);
> }
> }
> +end:
> spin_unlock(root_lock);
> if (unlikely(contended))
> spin_unlock(&q->busylock);