Hello -
I have been running the 2.4.15-pre kernels and
have found an interesting oops. I can reproduce
it immediately, and reliably, just by issuing an ssh
command (as a normal user).
Hardware: Pentium III 933 w/512 MB RAM
Red Hat 7.1+ updates
I have 2 eepro 100 cards and am running
an iptables firewall.
The condition exists in 2.4.15-pre1 and -pre2.
I have not seen this before (2.4.14 is fine)
Tonight I compiled 2.4.15-pre2 and ran dbench
for a while, with good results.
Then I tried the simple "ssh <somehost>" command
that locked up -pre1. Sure enough, it locked
up the system again.
Here is the hand-copied, decoded oops:
ksymoops 2.4.3 on i686 2.4.15-pre2. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.15-pre2/ (default)
-m /boot/System.map (specified)
c01b8345
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c01b8345>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010293
eax: 000005dc ebx: df0f7de0 ecx: df0e1000 edx: 0000000e
esi: df96bdec edi: 00000000 ebp: d793b2a0 esp: da64bcf8
ds: 0018 es: 0018 ss: 0018
Process ssh (pid:2028, stackpage=da64b000)
Stack: 00000003 c01b82b0 c01af49b c0294838 00000000 df0e1000 00000003 c01b82b0
       c01af4d2 df0d7de0 00000000 c0294838 df96b690 00000286 0001e9c0 00000286
       df0e1000 de4297a0 00000000 dfd8aee0 c01b7269 00000002 00000003 df0d7de0
Call Trace: [<c01b82b0>] [<c01af49b>] [<c01b82b0>] [<c01af4d2>] [<c01b7269>]
   [<c01b82b0>] [<c01abec8>] [<c01c9d5e>] [<c01c5243>] [<c01c755b>] [<c015fcb6>]
   [<c01c95f7>] [<c01d42c2>] [<c01a3df6>] [<c01a2fce>] [<c01a3a27>] [<c01211b1>]
   [<c01a3a7d>] [<c01a4790>] [<c011d34b>] [<c0106cfb>]
Code: 0f b6 87 c6 02 00 00 31 d2 3c 02 74 0a fe c8 75 0b f6 45 20
>>EIP; c01b8344 <ip_queue_xmit2+94/220> <=====
Trace; c01b82b0 <ip_queue_xmit2+0/220>
Trace; c01af49a <nf_hook_slow+aa/140>
Trace; c01b82b0 <ip_queue_xmit2+0/220>
Trace; c01af4d2 <nf_hook_slow+e2/140>
Trace; c01b7268 <ip_queue_xmit+448/490>
Trace; c01b82b0 <ip_queue_xmit2+0/220>
Trace; c01abec8 <neigh_lookup+18/80>
Trace; c01c9d5e <tcp_v4_send_check+6e/b0>
Trace; c01c5242 <tcp_transmit_skb+552/600>
Trace; c01c755a <tcp_connect+3ba/4b0>
Trace; c015fcb6 <secure_tcp_sequence_number+96/c0>
Trace; c01c95f6 <tcp_v4_connect+2c6/300>
Trace; c01d42c2 <inet_stream_connect+102/230>
Trace; c01a3df6 <sys_connect+56/80>
Trace; c01a2fce <sock_map_fd+ee/170>
Trace; c01a3a26 <sock_create+d6/100>
Trace; c01211b0 <do_munmap+1f0/260>
Trace; c01a3a7c <sys_socket+2c/50>
Trace; c01a4790 <sys_socketcall+e0/200>
Trace; c011d34a <sys_setgroups+5a/80>
Trace; c0106cfa <system_call+32/38>
Code; c01b8344 <ip_queue_xmit2+94/220>
00000000 <_EIP>:
Code; c01b8344 <ip_queue_xmit2+94/220> <=====
0: 0f b6 87 c6 02 00 00 movzbl 0x2c6(%edi),%eax <=====
Code; c01b834a <ip_queue_xmit2+9a/220>
7: 31 d2 xor %edx,%edx
Code; c01b834c <ip_queue_xmit2+9c/220>
9: 3c 02 cmp $0x2,%al
Code; c01b834e <ip_queue_xmit2+9e/220>
b: 74 0a je 17 <_EIP+0x17> c01b835a <ip_queue_xmit2+aa/220>
Code; c01b8350 <ip_queue_xmit2+a0/220>
d: fe c8 dec %al
Code; c01b8352 <ip_queue_xmit2+a2/220>
f: 75 0b jne 1c <_EIP+0x1c> c01b8360 <ip_queue_xmit2+b0/220>
Code; c01b8354 <ip_queue_xmit2+a4/220>
11: f6 45 20 00 testb $0x0,0x20(%ebp)
<0>Kernel panic: Aiee, killing interrupt handler!
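
A note on reading the decoded oops above: the faulting instruction is
"movzbl 0x2c6(%edi),%eax" and the register dump shows edi: 00000000, so the
access lands at a small offset from a zero base. The sketch below is only a
worked version of that arithmetic in plain C. The function names are
hypothetical, and the 4 KiB cutoff is an assumed rule of thumb for "a field
read through a NULL struct pointer", not something ksymoops reports.

```c
#include <stdbool.h>
#include <stdint.h>

/* Effective address of a disp(%base) memory operand on i386:
 * simply base register plus displacement, wrapping at 32 bits. */
uint32_t effective_address(uint32_t base, int32_t disp)
{
    return base + (uint32_t)disp;
}

/* Heuristic (an assumption, not a kernel rule): an address inside the
 * first 4 KiB page usually means a member access through a NULL
 * pointer, with the address being the member's offset. */
bool looks_like_null_deref(uint32_t addr)
{
    return addr < 0x1000u;
}
```

With the values from the oops, effective_address(0x00000000, 0x2c6) is 0x2c6,
which the heuristic flags. The same offset applied to a pointer-looking
register such as ecx (df0e1000) would not be flagged.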
On Fri, 9 Nov 2001, J Sloan wrote:
> I have been running the 2.4.15-pre kernels and
> have found an interesting oops. I can reproduce
> it immediately, and reliably, just by issuing an ssh
> command (as a normal user).
I'm seeing the same thing on my gateway, though I haven't
yet found my serial cable to get the oops translated. I
am back to 2.4.10 for now.
> Hardware: Pentium III 933 w/512 MB RAM
> Red Hat 7.1+ updates
My setup:
K6-200 w/64 MB RAM
Debian Woody
a 3c905B and an RTL-8139
using iptables and transparent proxying (no masq).
> <0>Kernel panic: Aiee, killing interrupt handler!
I haven't decoded the oops, but I'm certainly seeing this
line.
Matthew.
J Sloan ([email protected]) wrote:
> I have been running the 2.4.15-pre kernels and
> have found an interesting oops. I can reproduce
> it immediately, and reliably, just by issuing an ssh
> command (as a normal user).
I'm currently running Linux 2.4.15-pre2 and have no trouble with ssh. I can
safely log in to other hosts, or issue commands like
ssh -l someuser@somehost mutt
or copy files
scp somefile someuser@somehost:
I'm not using OpenSSH 3.0 yet (2.9p2). I'm not running any firewall or
transparent proxying.
PS My apologies that this reply isn't like it should be (no Message-ID to
what it is replying) but I've removed the mail before I could reply...
--
Taking advice on what the GPL means from Microsoft is like taking
Stalin's word on the meaning of the US Constitution. ~(Eben Moglen)
From: Matthew Kirkwood <[email protected]>
Date: Sat, 10 Nov 2001 11:53:11 +0000 (GMT)
> On Fri, 9 Nov 2001, J Sloan wrote:
> > I have been running the 2.4.15-pre kernels and
> > have found an interesting oops. I can reproduce
> > it immediately, and reliably, just by issuing an ssh
> > command (as a normal user).
> I'm seeing the same thing on my gateway, though I haven't
> yet found my serial cable to get the oops translated. I
> am back to 2.4.10 for now.
Just back out the netfilter changes in 2.4.15-pre1, that
is the cause.
Franks a lot,
David S. Miller
[email protected]
From: Sven Vermeulen <[email protected]>
Date: Sat, 10 Nov 2001 13:21:39 +0100
> I'm not using OpenSSH 3.0 yet (2.9p2). I'm not running any firewall or
> transparent proxying.
You will only see the bug if you are using netfilter.
Franks a lot,
David S. Miller
[email protected]
Sven Vermeulen wrote:
> J Sloan ([email protected]) wrote:
> > I have been running the 2.4.15-pre kernels and
> > have found an interesting oops. I can reproduce
> > it immediately, and reliably, just by issuing an ssh
> > command (as a normal user).
>
> I'm currently running Linux 2.4.15-pre2 and have no troubles with ssh. I can
> safely login onto other hosts, or issuing commands like
> ssh -l someuser@somehost mutt
> or copy files
> scp somefile someuser@somehost:
>
> I'm not using OpenSSH 3.0 yet (2.9p2). I'm not running any firewall or
> transparent proxying.
Thanks for the info, this is what I suspected -
only people running iptables appear to be
seeing this problem.
cu
jjs
"David S. Miller" wrote:
> Just back out the netfilter changes in 2.4.15-pre1, that
> is the cause.
Great, will do - hopefully this will be backed
out of -pre3, or else sorted out properly...
cu
jjs
Matthew Kirkwood wrote:
> On Fri, 9 Nov 2001, J Sloan wrote:
>
> > I have been running the 2.4.15-pre kernels and
> > have found an interesting oops. I can reproduce
> > it immediately, and reliably, just by issuing an ssh
> > command (as a normal user).
>
> I'm seeing the same thing on my gateway,
Good to know it's not just me!
> I am back to 2.4.10 for now.
2.4.14 runs fine here, much faster than 2.4.10
> My setup:
>
> K6-200 w/64 MB RAM
> Debian Woody
> a 3c905B and an RTL-8139
Excellent, that rules out CPU specifics, distro
specifics, and ethernet adapter specifics -
> using iptables and transparent proxying (no masq).
Aha, we are both using iptables - (I am using nat)
Rusty, I hope you are reading this.
> > <0>Kernel panic: Aiee, killing interrupt handler!
>
> I haven't decoded the oops, but I'm certainly seeing this
> line.
Good info, thanks for your input.
cu
jjs
diff -u --recursive --new-file v2.4.14/linux/net/ipv4/netfilter/ip_fw_compat.c linux/net/ipv4/netfilter/ip_fw_compat.c
--- v2.4.14/linux/net/ipv4/netfilter/ip_fw_compat.c Fri Apr 27 14:15:01 2001
+++ linux/net/ipv4/netfilter/ip_fw_compat.c Wed Nov 7 14:39:36 2001
@@ -78,11 +78,19 @@
{
int ret = FW_BLOCK;
u_int16_t redirpt;
+ struct sk_buff *nskb;
/* Assume worse case: any hook could change packet */
(*pskb)->nfcache |= NFC_UNKNOWN | NFC_ALTERED;
if ((*pskb)->ip_summed == CHECKSUM_HW)
(*pskb)->ip_summed = CHECKSUM_NONE;
+
+ /* Firewall rules can alter TOS: raw socket may have clone of
+ skb: don't disturb it --RR */
+ nskb = skb_unshare(*pskb, GFP_ATOMIC);
+ if (!nskb)
+ return NF_DROP;
+ *pskb = nskb;
switch (hooknum) {
case NF_IP_PRE_ROUTING:
diff -u --recursive --new-file v2.4.14/linux/net/ipv4/netfilter/ip_nat_core.c linux/net/ipv4/netfilter/ip_nat_core.c
--- v2.4.14/linux/net/ipv4/netfilter/ip_nat_core.c Wed May 16 10:31:27 2001
+++ linux/net/ipv4/netfilter/ip_nat_core.c Wed Nov 7 14:39:36 2001
@@ -734,6 +734,15 @@
synchronize_bh()) can vanish. */
READ_LOCK(&ip_nat_lock);
for (i = 0; i < info->num_manips; i++) {
+ struct sk_buff *nskb;
+ /* raw socket may have clone of skb: don't disturb it --RR */
+ nskb = skb_unshare(*pskb, GFP_ATOMIC);
+ if (!nskb) {
+ READ_UNLOCK(&ip_nat_lock);
+ return NF_DROP;
+ }
+ *pskb = nskb;
+
if (info->manips[i].direction == dir
&& info->manips[i].hooknum == hooknum) {
DEBUGP("Mangling %p: %s to %u.%u.%u.%u %u\n",
diff -u --recursive --new-file v2.4.14/linux/net/ipv4/netfilter/ipt_TCPMSS.c linux/net/ipv4/netfilter/ipt_TCPMSS.c
--- v2.4.14/linux/net/ipv4/netfilter/ipt_TCPMSS.c Tue Oct 9 17:06:53 2001
+++ linux/net/ipv4/netfilter/ipt_TCPMSS.c Wed Nov 7 14:39:36 2001
@@ -48,6 +48,13 @@
u_int16_t tcplen, newtotlen, oldval, newmss;
unsigned int i;
u_int8_t *opt;
+ struct sk_buff *nskb;
+
+ /* raw socket may have clone of skb: don't disturb it --RR */
+ nskb = skb_unshare(*pskb, GFP_ATOMIC);
+ if (!nskb)
+ return NF_DROP;
+ *pskb = nskb;
tcplen = (*pskb)->len - iph->ihl*4;
diff -u --recursive --new-file v2.4.14/linux/net/ipv4/netfilter/ipt_TOS.c linux/net/ipv4/netfilter/ipt_TOS.c
--- v2.4.14/linux/net/ipv4/netfilter/ipt_TOS.c Tue Oct 9 17:06:53 2001
+++ linux/net/ipv4/netfilter/ipt_TOS.c Wed Nov 7 14:39:36 2001
@@ -19,7 +19,14 @@
const struct ipt_tos_target_info *tosinfo = targinfo;
if ((iph->tos & IPTOS_TOS_MASK) != tosinfo->tos) {
+ struct sk_buff *nskb;
u_int16_t diffs[2];
+
+ /* raw socket may have clone of skb: don't disturb it --RR */
+ nskb = skb_unshare(*pskb, GFP_ATOMIC);
+ if (!nskb)
+ return NF_DROP;
+ *pskb = nskb;
diffs[0] = htons(iph->tos) ^ 0xFFFF;
iph->tos = (iph->tos & IPTOS_PREC_MASK) | tosinfo->tos;
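
All four hunks above add the same step before a hook modifies the packet:
call skb_unshare(), drop the packet if it returns NULL, and otherwise store
the result back through *pskb. The following is a userspace model of that
pattern only, not kernel code. struct buf, buf_unshare() and set_tos() are
hypothetical names. The line that leaves the copy's owner pointer unset is an
assumption made for illustration. It would be consistent with edi = 0 in the
oops, but nothing in this thread confirms that this is what the real copy
routine does.

```c
#include <stdlib.h>
#include <string.h>

enum verdict { VERDICT_DROP = 0, VERDICT_ACCEPT = 1 };

/* Hypothetical stand-in for struct sk_buff. */
struct buf {
    int users;              /* number of holders of this buffer */
    void *owner;            /* stands in for the owning socket */
    unsigned char data[20]; /* room for a minimal IPv4 header */
};

/* Model of the unshare step: return the buffer itself when we are the
 * only holder, otherwise hand back a private copy and give up our
 * reference on the shared one. Returns NULL if the copy fails. */
struct buf *buf_unshare(struct buf *b)
{
    struct buf *copy;

    if (b->users == 1)
        return b;           /* not shared: safe to modify in place */

    copy = malloc(sizeof(*copy));
    b->users--;             /* our reference is dropped either way */
    if (copy == NULL)
        return NULL;

    copy->users = 1;
    copy->owner = NULL;     /* ASSUMPTION: owner is not carried over */
    memcpy(copy->data, b->data, sizeof(copy->data));
    return copy;
}

/* Same shape as the patched hooks: unshare, drop on failure, publish
 * the new pointer to the caller, then modify the private buffer. */
enum verdict set_tos(struct buf **pb, unsigned char tos)
{
    struct buf *nb = buf_unshare(*pb);

    if (nb == NULL)
        return VERDICT_DROP;
    *pb = nb;               /* caller now sees the private buffer */
    (*pb)->data[1] = tos;   /* byte 1 of an IPv4 header is the TOS */
    return VERDICT_ACCEPT;
}
```

The point of the model is that after the hook runs, the caller may be holding
a different object than the one it passed in. Any later code that relies on a
field the copy did not inherit (here, owner) would see NULL.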
On Sat, Nov 10, 2001 at 10:48:11AM -0800, J Sloan wrote:
> Sven Vermeulen wrote:
>
> > J Sloan ([email protected]) wrote:
> > > I have been running the 2.4.15-pre kernels and
> > > have found an interesting oops. I can reproduce
> > > it immediately, and reliably, just by issuing an ssh
> > > command (as a normal user).
> >
> > I'm currently running Linux 2.4.15-pre2 and have no troubles with ssh. I can
> > safely login onto other hosts, or issuing commands like
> > ssh -l someuser@somehost mutt
> > or copy files
> > scp somefile someuser@somehost:
> >
> > I'm not using OpenSSH 3.0 yet (2.9p2). I'm not running any firewall or
> > transparent proxying.
>
> Thanks for the info, this is what I suspected -
>
> only people running iptables appear to be
> seeing this problem.
>
I don't know...
I have netfilter compiled in, but I don't have any filter rules yet. This
is on smp too...
Have you been able to tell if you need to use mangling, or nat, or will just
filter rules do the job of reproducing?
Mike
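
For anyone trying to narrow this down along the lines Mike asks about: the
call trace in the original report runs from sys_connect through
tcp_v4_connect and tcp_connect into ip_queue_xmit, so the ssh client itself
may be incidental. A bare outgoing TCP connect might exercise the same path
while netfilter rules are varied. This is a minimal sketch under that
assumption. tcp_connect_once() is a hypothetical helper, and nobody in the
thread has reported running anything like it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open one TCP connection to a dotted-quad address and close it again.
 * Returns 0 if connect() succeeded, -1 on any failure, including an
 * address string that is not valid IPv4. */
int tcp_connect_once(const char *ipv4, unsigned short port)
{
    struct sockaddr_in addr;
    int fd, rc;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4, &addr.sin_addr) != 1)
        return -1;          /* rejected before any socket is created */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* The SYN sent here is what goes down the transmit path. */
    rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

Calling it with the address of the host that ssh was pointed at and port 22
would be the closest equivalent of the reported trigger. Only try this on a
machine you can afford to have lock up.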
I had the exact same problem, and removing the 4 netfilter patches
mailed to this list in an earlier post in this thread has solved the
problem for me.
UP box, preempt kernel patch, APIC enabled, running iptables based
firewall with NAT and filtering enabled...
Al
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Mike Fedyk
> Sent: Saturday, November 10, 2001 6:15 PM
> To: J Sloan
> Cc: Sven Vermeulen; Linux-Kernel Development Mailinglist
> Subject: Re: Networking: repeatable oops in 2.4.15-pre2
>
>
> On Sat, Nov 10, 2001 at 10:48:11AM -0800, J Sloan wrote:
> > Sven Vermeulen wrote:
> >
> > > J Sloan ([email protected]) wrote:
> > > > I have been running the 2.4.15-pre kernels and
> > > > have found an interesting oops. I can reproduce
> > > > > it immediately, and reliably, just by issuing an ssh command (as a
> > > > > normal user).
> > >
> > > I'm currently running Linux 2.4.15-pre2 and have no troubles with
> > > ssh. I can safely login onto other hosts, or issuing commands like
> > > ssh -l someuser@somehost mutt
> > > or copy files
> > > scp somefile someuser@somehost:
> > >
> > > > I'm not using OpenSSH 3.0 yet (2.9p2). I'm not running any firewall
> > > > or transparent proxying.
> >
> > Thanks for the info, this is what I suspected -
> >
> > only people running iptables appear to be
> > seeing this problem.
> >
>
> I don't know...
>
> I have netfilter compiled in, but I don't have any filter
> rules yet. This is on smp too...
>
> Have you been able to tell if you need to use mangling, or
> nat, or will just filter rules do the job of reproducing?
>
> Mike