When search the first skb to collapse,the condition of overlap to the next one have been
reached,but the start is less than TCP_SKB_CB(skb)->seq at this time, then followed process
will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)->seq).
So this patch add one check (! before(start,TCP_SKB_CB(skb)->seq)) to avoid this ipanic.
Signed-off-by: Chen Jun <[email protected]>
---
net/ipv4/tcp_input.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 9c62257..4c745c5 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4465,7 +4465,8 @@ restart:
* overlaps to the next one.
*/
if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin &&
- (tcp_win_from_space(skb->truesize) > skb->len ||
+ ((tcp_win_from_space(skb->truesize) > skb->len &&
+ !before(start, TCP_SKB_CB(skb)->seq)) ||
before(TCP_SKB_CB(skb)->seq, start))) {
end_of_skbs = false;
break;
--
1.7.4.1
On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote:
> When search the first skb to collapse,the condition of overlap to the next one have been
> reached,but the start is less than TCP_SKB_CB(skb)->seq at this time, then followed process
> will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)->seq).
> So this patch add one check (! before(start,TCP_SKB_CB(skb)->seq)) to avoid this ipanic.
>
> Signed-off-by: Chen Jun <[email protected]>
> ---
> net/ipv4/tcp_input.c | 3 ++-
> 1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 9c62257..4c745c5 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4465,7 +4465,8 @@ restart:
> * overlaps to the next one.
> */
> if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin &&
> - (tcp_win_from_space(skb->truesize) > skb->len ||
> + ((tcp_win_from_space(skb->truesize) > skb->len &&
> + !before(start, TCP_SKB_CB(skb)->seq)) ||
> before(TCP_SKB_CB(skb)->seq, start))) {
> end_of_skbs = false;
> break;
Hmm... I must say I do not understand this patch.
If we find a skb with before(TCP_SKB_CB(skb)->seq, start), then the
final condition will be true.
Let's rewrite your code to equivalent one :
if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin &&
(before(TCP_SKB_CB(skb)->seq, start) ||
tcp_win_from_space(skb->truesize) > skb->len)) {
So it seems your patch would not solve the problem for all
possible skbs (aka not bloated) ?
Please tell us how you trigger this bug, and send the stack trace.
Thanks
On Mon, 2013-06-17 at 01:15 -0700, Eric Dumazet wrote:
> On Mon, 2013-06-17 at 10:18 -0400, Jun Chen wrote:
> > When search the first skb to collapse,the condition of overlap to the next one have been
> > reached,but the start is less than TCP_SKB_CB(skb)->seq at this time, then followed process
> > will trigger the BUG_ON of the offset(start - TCP_SKB_CB(skb)->seq).
> > So this patch add one check (! before(start,TCP_SKB_CB(skb)->seq)) to avoid this ipanic.
> >
> > Signed-off-by: Chen Jun <[email protected]>
> > ---
> > net/ipv4/tcp_input.c | 3 ++-
> > 1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> > index 9c62257..4c745c5 100644
> > --- a/net/ipv4/tcp_input.c
> > +++ b/net/ipv4/tcp_input.c
> > @@ -4465,7 +4465,8 @@ restart:
> > * overlaps to the next one.
> > */
> > if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin &&
> > - (tcp_win_from_space(skb->truesize) > skb->len ||
> > + ((tcp_win_from_space(skb->truesize) > skb->len &&
> > + !before(start, TCP_SKB_CB(skb)->seq)) ||
> > before(TCP_SKB_CB(skb)->seq, start))) {
> > end_of_skbs = false;
> > break;
>
> Hmm... I must say I do not understand this patch.
>
> If we find a skb with before(TCP_SKB_CB(skb)->seq, start), then the
> final condition will be true.
>
> Let's rewrite your code to equivalent one :
>
> if (!tcp_hdr(skb)->syn && !tcp_hdr(skb)->fin &&
> (before(TCP_SKB_CB(skb)->seq, start) ||
> tcp_win_from_space(skb->truesize) > skb->len)) {
>
> So it seems your patch would not solve the problem for all
> possible skbs (aka not bloated) ?
>
> Please tell us how you trigger this bug, and send the stack trace.
>
> Thanks
>
>
hi,
When the condition of tcp_win_from_space(skb->truesize) > skb->len is
true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final
condition will be true. The follow line:
int offset = start - TCP_SKB_CB(skb)->seq;
BUG_ON(offset < 0);
this BUG_ON will be triggered.
Follow line is my error logs:
<2>[ 7736.344508] kernel BUG
at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp_input.c:4845!
<4>[ 7736.344578] invalid opcode: 0000 [#1] PREEMPT SMP
<4>[ 7736.344883] Modules linked in: atomisp lm3559 ov9724 imx1x5
bcm4335(O) cfg80211 bcm_bt_lpm videobuf_vmalloc videobuf_core matrix(C)
<4>[ 7736.345681]
<4>[ 7736.345748] Pid: 5189, comm: TimedEventQueue Tainted: G WC
O 3.4.43-186445-g3ada675 #1 Intel Corporation Merrifield/SALT BAY
<4>[ 7736.346059] EIP: 0060:[<c18ad61d>] EFLAGS: 00010297 CPU: 1
<4>[ 7736.346183] EIP is at tcp_collapse+0x3bd/0x3d0
<4>[ 7736.346250] EAX: ab57d2bb EBX: df428c00 ECX: c97dcd00 EDX:
000010c0
<4>[ 7736.346372] ESI: df4289c0 EDI: fffffadb EBP: edca1d88 ESP:
edca1d60
<4>[ 7736.346441] DS: 007b ES: 007b FS: 00d8 GS: 003b SS: 0068
<4>[ 7736.346560] CR0: 8005003b CR2: 41d310bc CR3: 2d300000 CR4:
001007d0
<4>[ 7736.346629] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3:
00000000
<4>[ 7736.346749] DR6: ffff0ff0 DR7: 00000400
<0>[ 7736.346816] Process TimedEventQueue (pid: 5189, ti=edca0000
task=dc30b660 task.ti=c9a6e000)
<0>[ 7736.346936] Stack:
<4>[ 7736.347002] ffffffff ffffffff fffffadb c97dcd5c 00000001 c97dcd00
00000e32 c97dcd00
<4>[ 7736.347615] c97dcd00 df428180 edca1db0 c18addd0 00000000 ab57c870
ab57f19f c97dcd00
<4>[ 7736.348175] c97dd198 000080c0 c97dcd00 df428180 edca1df0 c18aea27
00000000 c18dc8f8
<0>[ 7736.348788] Call Trace:
<4>[ 7736.348861] [<c18addd0>] tcp_prune_queue+0x120/0x2f0
<4>[ 7736.348984] [<c18aea27>] tcp_data_queue+0x777/0xf00
<4>[ 7736.349055] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
<4>[ 7736.349126] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
<4>[ 7736.349196] [<c18b2e84>] tcp_rcv_established+0x114/0x680
<4>[ 7736.349269] [<c18bb034>] tcp_v4_do_rcv+0x164/0x350
On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote:
> >
> hi,
> When the condition of tcp_win_from_space(skb->truesize) > skb->len is
> true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final
> condition will be true. The follow line:
> int offset = start - TCP_SKB_CB(skb)->seq;
> BUG_ON(offset < 0);
> this BUG_ON will be triggered.
>
Really this should never happen, we must track what's happening here.
Are you using a pristine kernel, without any patches ?
Are you able to reproduce this bug in a short amount of time ?
What kind of driver is in use ? (your stack trace was truncated)
On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote:
> On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote:
> > >
> > hi,
> > When the condition of tcp_win_from_space(skb->truesize) > skb->len is
> > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final
> > condition will be true. The follow line:
> > int offset = start - TCP_SKB_CB(skb)->seq;
> > BUG_ON(offset < 0);
> > this BUG_ON will be triggered.
> >
>
> Really this should never happen, we must track what's happening here.
It's very very rare, but the logic of codes have such a little hole.
>
> Are you using a pristine kernel, without any patches ?
The based kernel version is 3.4.
>
> Are you able to reproduce this bug in a short amount of time ?
I can't reproduce it in short time, this log had just been found once
for long long time tests on many devices .
>
> What kind of driver is in use ? (your stack trace was truncated)
I attach the whole stack traces for you.
<0>[ 7736.348788] Call Trace:
<4>[ 7736.348861] [<c18addd0>] tcp_prune_queue+0x120/0x2f0
<4>[ 7736.348984] [<c18aea27>] tcp_data_queue+0x777/0xf00
<4>[ 7736.349055] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
<4>[ 7736.349126] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
<4>[ 7736.349196] [<c18b2e84>] tcp_rcv_established+0x114/0x680
<4>[ 7736.349269] [<c18bb034>] tcp_v4_do_rcv+0x164/0x350
<4>[ 7736.349396] [<c18de301>] ? nf_nat_fn+0xb1/0x1d0
<4>[ 7736.349470] [<c18bc0c1>] tcp_v4_rcv+0x6f1/0x7a0
<4>[ 7736.349599] [<c1881dad>] ? nf_hook_slow+0x10d/0x150
<4>[ 7736.349673] [<c189b30b>] ip_local_deliver_finish+0x8b/0x200
<4>[ 7736.349796] [<c189b60f>] ip_local_deliver+0x8f/0xa0
<4>[ 7736.349867] [<c189b280>] ? ip_rcv_finish+0x300/0x300
<4>[ 7736.349937] [<c189b05f>] ip_rcv_finish+0xdf/0x300
<4>[ 7736.350062] [<c189b878>] ip_rcv+0x258/0x330
<4>[ 7736.350132] [<c189af80>] ? inet_del_protocol+0x30/0x30
<4>[ 7736.350258] [<c1864175>] __netif_receive_skb+0x325/0x410
<4>[ 7736.350331] [<c1864956>] process_backlog+0x96/0x150
<4>[ 7736.350455] [<c1864ba5>] net_rx_action+0x115/0x210
<4>[ 7736.350525] [<c18b7680>] ? tcp_out_of_resources+0xb0/0xb0
<4>[ 7736.350652] [<c123dc0b>] __do_softirq+0x9b/0x220
<4>[ 7736.350723] [<c123db70>] ? local_bh_enable_ip+0xd0/0xd0
>
>
>
On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote:
> On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote:
> > On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote:
> > > >
> > > hi,
> > > When the condition of tcp_win_from_space(skb->truesize) > skb->len is
> > > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final
> > > condition will be true. The follow line:
> > > int offset = start - TCP_SKB_CB(skb)->seq;
> > > BUG_ON(offset < 0);
> > > this BUG_ON will be triggered.
> > >
> >
> > Really this should never happen, we must track what's happening here.
> It's very very rare, but the logic of codes have such a little hole.
> >
> > Are you using a pristine kernel, without any patches ?
> The based kernel version is 3.4.
> >
> > Are you able to reproduce this bug in a short amount of time ?
> I can't reproduce it in short time, this log had just been found once
> for long long time tests on many devices .
> >
> > What kind of driver is in use ? (your stack trace was truncated)
>
> I attach the whole stack traces for you.
>
> <0>[ 7736.348788] Call Trace:
>
> <4>[ 7736.348861] [<c18addd0>] tcp_prune_queue+0x120/0x2f0
>
> <4>[ 7736.348984] [<c18aea27>] tcp_data_queue+0x777/0xf00
>
> <4>[ 7736.349055] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
>
> <4>[ 7736.349126] [<c18dc8f8>] ? ipt_do_table+0x1f8/0x480
>
> <4>[ 7736.349196] [<c18b2e84>] tcp_rcv_established+0x114/0x680
>
> <4>[ 7736.349269] [<c18bb034>] tcp_v4_do_rcv+0x164/0x350
>
> <4>[ 7736.349396] [<c18de301>] ? nf_nat_fn+0xb1/0x1d0
>
> <4>[ 7736.349470] [<c18bc0c1>] tcp_v4_rcv+0x6f1/0x7a0
>
> <4>[ 7736.349599] [<c1881dad>] ? nf_hook_slow+0x10d/0x150
>
> <4>[ 7736.349673] [<c189b30b>] ip_local_deliver_finish+0x8b/0x200
>
> <4>[ 7736.349796] [<c189b60f>] ip_local_deliver+0x8f/0xa0
>
> <4>[ 7736.349867] [<c189b280>] ? ip_rcv_finish+0x300/0x300
>
> <4>[ 7736.349937] [<c189b05f>] ip_rcv_finish+0xdf/0x300
>
> <4>[ 7736.350062] [<c189b878>] ip_rcv+0x258/0x330
>
> <4>[ 7736.350132] [<c189af80>] ? inet_del_protocol+0x30/0x30
>
> <4>[ 7736.350258] [<c1864175>] __netif_receive_skb+0x325/0x410
>
> <4>[ 7736.350331] [<c1864956>] process_backlog+0x96/0x150
>
> <4>[ 7736.350455] [<c1864ba5>] net_rx_action+0x115/0x210
>
> <4>[ 7736.350525] [<c18b7680>] ? tcp_out_of_resources+0xb0/0xb0
>
> <4>[ 7736.350652] [<c123dc0b>] __do_softirq+0x9b/0x220
>
> <4>[ 7736.350723] [<c123db70>] ? local_bh_enable_ip+0xd0/0xd0
>
Any other suspect messages before this, a memory allocation error for
example ?
I believe we have a bug in tcp_collapse() if one alloc_skb() returns
NULL while we were in the middle of collapsing a big GRO packet.
gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated
skb1: seq=X end_seq=X+4000
skb2: seq=X+4000 end_seq=X+8000
<missing>
grp_skb: seq=X end_seq=X+16000
Next time we call tcp_collapse(), we might split again the GRO packet
and get following incorrect queue :
skb1: seq=X end_seq=X+4000
skb2: seq=X+4000 end_seq=X+8000
skb3: seq=X end_seq=X+4000
skb4: seq=X+4000 end_seq=X+8000
skb5: seq=X+8000 end_seq=X+12000
skb6: seq=X+12000 end_seq=X+16000
I would use the following patch instead, to narrow the problem
If we really find in the ofo queue a skb with a lower seq than the
previous one, we should complain instead of lowering @start, since
this is going to crash later.
receive_queue / ofo_queue should contain monotonically increasing
skb->seq.
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 46271cdc..5507a09 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk)
start = TCP_SKB_CB(skb)->seq;
end = TCP_SKB_CB(skb)->end_seq;
} else {
- if (before(TCP_SKB_CB(skb)->seq, start))
- start = TCP_SKB_CB(skb)->seq;
+ if (before(TCP_SKB_CB(skb)->seq, start)) {
+ pr_err_once("tcp_collapse_ofo_queue() : seq %08x before start %08X\n",
+ TCP_SKB_CB(skb)->seq, start);
+ }
if (after(TCP_SKB_CB(skb)->end_seq, end))
end = TCP_SKB_CB(skb)->end_seq;
}
On Mon, 2013-06-17 at 06:21 -0700, Eric Dumazet wrote:
> On Mon, 2013-06-17 at 14:52 -0400, Jun Chen wrote:
> > On Mon, 2013-06-17 at 03:29 -0700, Eric Dumazet wrote:
> > > On Mon, 2013-06-17 at 13:29 -0400, Jun Chen wrote:
> > > > >
> > > > hi,
> > > > When the condition of tcp_win_from_space(skb->truesize) > skb->len is
> > > > true but the before(start, TCP_SKB_CB(skb)->seq) is also true, the final
> > > > condition will be true. The follow line:
> > > > int offset = start - TCP_SKB_CB(skb)->seq;
> > > > BUG_ON(offset < 0);
> > > > this BUG_ON will be triggered.
> > > >
> > >
> > > Really this should never happen, we must track what's happening here.
> > It's very very rare, but the logic of codes have such a little hole.
> > >
> > > Are you using a pristine kernel, without any patches ?
> > The based kernel version is 3.4.
> > >
> > > Are you able to reproduce this bug in a short amount of time ?
> > I can't reproduce it in short time, this log had just been found once
> > for long long time tests on many devices .
> > >
> > > What kind of driver is in use ? (your stack trace was truncated)
> >
> > I attach the whole stack traces for you.
> >
> Any other suspect messages before this, a memory allocation error for
> example ?
>
> I believe we have a bug in tcp_collapse() if one alloc_skb() returns
> NULL while we were in the middle of collapsing a big GRO packet.
>
> gro_skb needed 3 skb to be rebuilt, and only two skbs could be allocated
>
> skb1: seq=X end_seq=X+4000
> skb2: seq=X+4000 end_seq=X+8000
> <missing>
> grp_skb: seq=X end_seq=X+16000
>
> Next time we call tcp_collapse(), we might split again the GRO packet
> and get following incorrect queue :
>
> skb1: seq=X end_seq=X+4000
> skb2: seq=X+4000 end_seq=X+8000
> skb3: seq=X end_seq=X+4000
> skb4: seq=X+4000 end_seq=X+8000
> skb5: seq=X+8000 end_seq=X+12000
> skb6: seq=X+12000 end_seq=X+16000
>
>
> I would use the following patch instead, to narrow the problem
>
> If we really find in the ofo queue a skb with a lower seq than the
> previous one, we should complain instead of lowering @start, since
> this is going to crash later.
>
> receive_queue / ofo_queue should contain monotonically increasing
> skb->seq.
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 46271cdc..5507a09 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4513,8 +4513,10 @@ static void tcp_collapse_ofo_queue(struct sock *sk)
> start = TCP_SKB_CB(skb)->seq;
> end = TCP_SKB_CB(skb)->end_seq;
> } else {
> - if (before(TCP_SKB_CB(skb)->seq, start))
> - start = TCP_SKB_CB(skb)->seq;
> + if (before(TCP_SKB_CB(skb)->seq, start)) {
> + pr_err_once("tcp_collapse_ofo_queue() : seq %08x before start %08X\n",
> + TCP_SKB_CB(skb)->seq, start);
> + }
> if (after(TCP_SKB_CB(skb)->end_seq, end))
> end = TCP_SKB_CB(skb)->end_seq;
> }
>
>
There are many warning for tcp_recvmsg before this crash. I can't find
other memory warning in the logs, but I'm not sure whether there are
memory issues because of the length limitation of saved logs. I think
this logs will give you more information.
<4>[ 7736.343742] ------------[ cut here ]------------
<4>[ 7736.343759] WARNING:
at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg
+0x3bf/0x910()
<4>[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt
AB57F19F fl 0
<4>[ 7736.343845] Call Trace:
<4>[ 7736.343865] [<c1237032>] warn_slowpath_common+0x72/0xa0
<4>[ 7736.343888] [<c18a955f>] ? tcp_recvmsg+0x3bf/0x910
<4>[ 7736.343902] [<c18a955f>] ? tcp_recvmsg+0x3bf/0x910
<4>[ 7736.343922] [<c1237103>] warn_slowpath_fmt+0x33/0x40
<4>[ 7736.343944] [<c18a955f>] tcp_recvmsg+0x3bf/0x910
<4>[ 7736.343968] [<c18c9bb5>] inet_recvmsg+0x85/0xa0
<4>[ 7736.343992] [<c1852030>] sock_aio_read+0x140/0x160
<4>[ 7736.344016] [<c126b221>] ? set_next_entity+0xc1/0xf0
<4>[ 7736.344039] [<c130d627>] do_sync_read+0xb7/0xf0
<4>[ 7736.344064] [<c130dc6c>] ? rw_verify_area+0x6c/0x120
<4>[ 7736.344077] [<c1349aa8>] ? sys_epoll_wait+0x68/0x360
<4>[ 7736.344098] [<c130e1e9>] vfs_read+0x149/0x160
<4>[ 7736.344120] [<c130f518>] ? fget_light+0x58/0xd0
<4>[ 7736.344142] [<c130e23d>] sys_read+0x3d/0x70
<4>[ 7736.344164] [<c198c361>] syscall_call+0x7/0xb
<4>[ 7736.344187] [<c1980000>] ? perf_cpu_notify+0x45/0x89
<4>[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]---
On Tue, 2013-06-18 at 05:52 -0400, Jun Chen wrote:
> >
> There are many warning for tcp_recvmsg before this crash. I can't find
> other memory warning in the logs, but I'm not sure whether there are
> memory issues because of the length limitation of saved logs. I think
> this logs will give you more information.
>
> <4>[ 7736.343742] ------------[ cut here ]------------
>
> <4>[ 7736.343759] WARNING:
> at /data/buildbot/workdir/jb/kernel/net/ipv4/tcp.c:1496 tcp_recvmsg
> +0x3bf/0x910()
>
> <4>[ 7736.343775] recvmsg bug: copied AB57C870 seq AB57CD95 rcvnxt
> AB57F19F fl 0
>
> <4>[ 7736.343845] Call Trace:
>
> <4>[ 7736.343865] [<c1237032>] warn_slowpath_common+0x72/0xa0
>
> <4>[ 7736.343888] [<c18a955f>] ? tcp_recvmsg+0x3bf/0x910
>
> <4>[ 7736.343902] [<c18a955f>] ? tcp_recvmsg+0x3bf/0x910
>
> <4>[ 7736.343922] [<c1237103>] warn_slowpath_fmt+0x33/0x40
>
> <4>[ 7736.343944] [<c18a955f>] tcp_recvmsg+0x3bf/0x910
>
> <4>[ 7736.343968] [<c18c9bb5>] inet_recvmsg+0x85/0xa0
>
> <4>[ 7736.343992] [<c1852030>] sock_aio_read+0x140/0x160
>
> <4>[ 7736.344016] [<c126b221>] ? set_next_entity+0xc1/0xf0
>
> <4>[ 7736.344039] [<c130d627>] do_sync_read+0xb7/0xf0
>
> <4>[ 7736.344064] [<c130dc6c>] ? rw_verify_area+0x6c/0x120
>
> <4>[ 7736.344077] [<c1349aa8>] ? sys_epoll_wait+0x68/0x360
>
> <4>[ 7736.344098] [<c130e1e9>] vfs_read+0x149/0x160
>
> <4>[ 7736.344120] [<c130f518>] ? fget_light+0x58/0xd0
>
> <4>[ 7736.344142] [<c130e23d>] sys_read+0x3d/0x70
>
> <4>[ 7736.344164] [<c198c361>] syscall_call+0x7/0xb
>
> <4>[ 7736.344187] [<c1980000>] ? perf_cpu_notify+0x45/0x89
>
> <4>[ 7736.344205] ---[ end trace b3c5b245ce7ff5b5 ]---
>
Thats exactly the interesting stuff ;)
This was fixed, or should be fixed if still happening on more recent
kernels.
Basically, once we are in this state, there is nothing we can do to
prevent a crash.
Please try to reproduce the issue using 3.9 or David trees (net-next or
net )
Thanks