2000-10-29 20:09:20

by Jeff Garzik

Subject: Re: [patch] NE2000

pavel rabel wrote:
> There are three drivers for n2k cards. One is MCA only, one is PCI only,
> and then the third one (ne.c) is both ISA and PCI. I think the ISA
> driver should be ISA only, as is described in Documentation and in config
> help. So I removed PCI code from ne.c to have an ISA-only driver. It
> gets a bit smaller, although I am not sure whether more code can be
> removed.

This change sounds ok to me, if no one else objects. (I added to the CC
a bit) I saw that code, and was thinking about doing the same thing
myself. ne2k-pci.c definitely has changes which are not included in
ne.c, and it seems silly to duplicate ne2000 PCI support.

Regards,

Jeff


P.S. Pavel, for the future, patches made with "diff -u" are preferred.

--
Jeff Garzik | "Mind if I drive?" -Sam
Building 1024 | "Not if you don't mind me clawing at the
MandrakeSoft | dash and shrieking like a cheerleader."
| -Max


2000-10-29 20:33:21

by Alan

Subject: Re: [patch] NE2000

> This change sounds ok to me, if no one else objects. (I added to the CC
> a bit) I saw that code, and was thinking about doing the same thing
> myself. ne2k-pci.c definitely has changes which are not included in
> ne.c, and it seems silly to duplicate ne2000 PCI support.

Unless there are any cards that need the bug workarounds in ne.c for use
on PCI then I see no problem. I've not heard of any.

2000-10-30 19:30:32

by Jorge Nerin

Subject: Re: [patch] NE2000

Alan Cox wrote:
>
> > This change sounds ok to me, if no one else objects. (I added to the CC
> > a bit) I saw that code, and was thinking about doing the same thing
> > myself. ne2k-pci.c definitely has changes which are not included in
> > ne.c, and it seems silly to duplicate ne2000 PCI support.
>
> Unless there are any cards that need the bug workarounds in ne.c for use
> on PCI then I see no problem. I've not heard of any.
>

OK, I have reported this several times, but it gets ignored. I have a Realtek
8029 (ne2k-pci), and with both drivers, ne and ne2k-pci, I can easily get
it stuck by doing a ping -f to a host on the local net; it sometimes also
happens while copying to or from NFS shares.

rmmod and insmod don't cure the problem. It seems that no interrupts are
delivered from the card, and there are no log messages, so a reboot is
needed to restore network access.

The system is a dual 200 MHz MMX machine with 96 MB of RAM and IDE disks,
with no shared interrupts. As far as I can remember, all kernels from 2.2.x
and 2.3.x up to 2.4.0-testx exhibit this problem.

--
Jorge Nerin
<[email protected]>

2000-10-31 13:54:42

by Petko Manolov

Subject: changed section attributes

Hi there,

I noticed that when I changed to binutils 2.10.91 (Debian woody),
I started to see messages like:

"Warning: Ignoring changed section attributes for .modinfo"

Chasing down the problem, it turned out that the .modinfo section is
first declared as ".section .modinfo", without any attributes.
That happens when linux/module.h is included.
The next declaration is ".section .modinfo,"a",@progbits",
and the assembler complains about that line.

Changing the declaration in linux/module.h to ".modinfo,"a""
fixed the problem, but I noticed that the author said that
"we want .modinfo to not be allocated"

I wonder why?

I have already tried allocating it (.modinfo,"a" in module.h) and
it seems to work.

Any ideas?



Petkan

2000-10-31 14:16:02

by Keith Owens

Subject: Re: changed section attributes

On Tue, 31 Oct 2000 15:54:05 +0200,
Petko Manolov <[email protected]> wrote:
>"Warning: Ignoring changed section attributes for .modinfo"
>
>Changing the declaration in linux/module.h to ".modinfo,"a""
>fixed the problem, but i noticed that the author said that
>"we want .modinfo to not be allocated"

Historically that was the only way of preventing the .modinfo section
from being included in modules when they were loaded into the kernel.
An alternative is to allow .modinfo to be allocated and have modutils
treat it as non-allocated. This feature was added to modutils 2.3.19
on October 22 (bleeding edge toolchains for IA64 are "fun") so anybody
who is annoyed by the warning messages can apply this patch.

Index: 0-test10-pre7.1/include/linux/module.h
--- 0-test10-pre7.1/include/linux/module.h Tue, 31 Oct 2000 08:28:16 +1100 kaos (linux-2.4/W/33_module.h 1.1.2.1.2.1.2.1.2.1.1.1 644)
+++ 0-test10-pre7.1(w)/include/linux/module.h Wed, 01 Nov 2000 01:13:22 +1100 kaos (linux-2.4/W/33_module.h 1.1.2.1.2.1.2.1.2.1.1.1 644)
@@ -218,11 +218,6 @@
MODULE_GENERIC_TABLE(type##_device,name)
/* not put to .modinfo section to avoid section type conflicts */

-/* The attributes of a section are set the first time the section is
- seen; we want .modinfo to not be allocated. */
-
-__asm__(".section .modinfo\n\t.previous");
-
/* Define the module variable, and usage macros. */
extern struct module __this_module;

Index: 0-test10-pre7.1/Documentation/Changes
--- 0-test10-pre7.1/Documentation/Changes Fri, 27 Oct 2000 22:11:48 +1100 kaos (linux-2.4/G/c/25_Changes 1.1.1.4.1.6 644)
+++ 0-test10-pre7.1(w)/Documentation/Changes Wed, 01 Nov 2000 01:13:03 +1100 kaos (linux-2.4/G/c/25_Changes 1.1.1.4.1.6 644)
@@ -52,7 +52,7 @@
o Gnu make 3.77 # make --version
o binutils 2.9.1.0.25 # ld -v
o util-linux 2.10o # kbdrate -v
-o modutils 2.3.18 # insmod -V
+o modutils 2.3.19 # insmod -V
o e2fsprogs 1.19 # tune2fs --version
o pcmcia-cs 3.1.21 # cardmgr -V
o PPP 2.4.0 # pppd --version
@@ -284,7 +284,7 @@

Modutils
--------
-o <ftp://ftp.kernel.org/pub/linux/utils/kernel/modutils/v2.3/modutils-2.3.18.tar.bz2>
+o <ftp://ftp.kernel.org/pub/linux/utils/kernel/modutils/v2.3/modutils-2.3.19.tar.bz2>

Mkinitrd
--------

2000-10-31 14:30:04

by Petko Manolov

Subject: Re: changed section attributes

Keith Owens wrote:
>
> >Changing the declaration in linux/module.h to ".modinfo,"a""
> >fixed the problem, but i noticed that the author said that
> >"we want .modinfo to not be allocated"
>
> Historically that was the only way of preventing the .modinfo section
> from being included in modules when they were loaded into the kernel.
> An alternative is to allow .modinfo to be allocated and have modutils
> treat it as non-allocated. This feature was added to modutils 2.3.19
> on October 22 (bleeding edge toolchains for IA64 are "fun") so anybody
> who is annoyed by the warning messages can apply this patch.

[snip]

> -/* The attributes of a section are set the first time the section is
> - seen; we want .modinfo to not be allocated. */
> -
> -__asm__(".section .modinfo\n\t.previous");
> -
> /* Define the module variable, and usage macros. */
> extern struct module __this_module;


This is exactly what I did (except that I did not remove the comment ;-)

I wonder why the compiler decides to add
".section .modinfo,"a",@progbits".
Maybe this is the thing that should be fixed.


Petkan

2000-10-31 14:34:34

by Keith Owens

Subject: Re: changed section attributes

On Tue, 31 Oct 2000 16:29:16 +0200,
Petko Manolov <[email protected]> wrote:
>I wonder why the compiler decides to add ".section
>.modinfo,"a",@progbits"
>May be this is the thing which should be fixed.

That is just gcc-speak for "section .modinfo is marked as allocated,
type progbits". Read the ELF standard if you want to know more.
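
For illustration, the kind of C construct that leads gcc to emit such a directive can be sketched as follows. The macro and variable names (DEMO_MODINFO, demo_modinfo_author, demo_modinfo_author_string) are hypothetical and only loosely modeled on how module.h places strings in .modinfo; the sketch assumes GCC or a compatible compiler targeting ELF.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for a MODULE_AUTHOR-style macro: it stores a
 * "tag=value" string in a named section. For read-only data placed this
 * way, gcc is expected to emit a ".section .modinfo" directive carrying
 * the "a" (allocated) flag in the generated assembly.
 */
#define DEMO_MODINFO(tag, value) \
	static const char demo_modinfo_##tag[] \
	__attribute__((section(".modinfo"), used)) = #tag "=" value

DEMO_MODINFO(author, "Example Author");

/* Accessor so the string can be inspected from ordinary code. */
const char *demo_modinfo_author_string(void)
{
	return demo_modinfo_author;
}
```

Compiling such a file with "gcc -S" and looking for the .section line in the output is one way to see the attributes the compiler chose.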

2000-10-31 14:41:44

by Petko Manolov

Subject: Re: changed section attributes

Keith Owens wrote:
>
> On Tue, 31 Oct 2000 16:29:16 +0200,
> Petko Manolov <[email protected]> wrote:
> >I wonder why the compiler decides to add ".section
> >.modinfo,"a",@progbits"
> >May be this is the thing which should be fixed.
>
> That is just gcc speak for section .modinfo is marked as allocated,
> type progbits. Read the ELF standard if you want to know more.


I have already read the info pages for as, but the description was too brief.

If this is the default gcc behavior then it seems that changing to the latest
modutils is the only option ;-)

I wonder if Linus will apply your patch.


best,
Petkan

2000-11-03 20:39:50

by Jorge Nerin

Subject: Re: [patch] NE2000

Paul Gortmaker wrote:
>
> Jorge Nerin wrote:
>
> >
> > Ok, I reported it several times, but it gets ignored. I have a Realtek
> > 8029 (ne2k-pci), and with both drivers ne and ne2k-pci I can easily get
> > it stuck by doing a ping -f to a host in the local net, and sometimes it
> > happens doing copies to/from nfs shared resources.
> >
> > rmmod & insmod don't cure the problem, it seems that no interrupts are
> > delivered from the card, and there are no log messages, so a reboot is
> > needed to restore net access.
> >
> > System is dual 2x200mmx 96Mb ide discs no interrupts shared, and as far
> > as I can remember all kernel from 2.2.x, 2.3.x up to 2.4.0-testx exhibit
> > this problem.
>
> Any messages from the driver whatsoever? Does running a non-SMP
> kernel make the problem go away?
>
> Paul.
>

Well, I have tried it with 2.4.0-test10, both SMP and non-SMP, and the
result is a little confusing.

Under SMP a ping -s 50000 -f other_host takes down the network access
with no messages (ne2k-pci), and no possibility of being restored
without a reboot.

Under UP the same command works ok, but after a while the dots stop for
30sec, then ping prints an 'E' and the dots continue. strace revealed
this:

sendmsg(4, {msg_name(16)={sin_family=AF_INET, sin_port=htons(0),
sin_addr=inet_addr("192.168.1.20")}},
msg_iov(1)=[{"\10\0\305~|\23\231\0\v\317\2:\177\236\r\0\10\t\n\v\f\r"...,
50008}], msg_controllen=0, msg_flags=0}, 0x800) = 50008 <30.016855>

ping has been waiting for sendmsg to return for 30 seconds! I don't know if
this could be caused by filling the network buffers and then waiting
while the NIC sends them out. As the packet size increases (the -s
option) the stalls become more frequent. There is still activity on the
network, but I don't know whether it is my packets or the replies.

When ping is stalled it's stuck in sock_wait_for_wmem, and when it's
running it's (usually) in wait_for_packet.

So I think that there could be a small window near sock_wait_for_wmem that
is not SMP-safe, which is affecting me.

The code of sock_wait_for_wmem in 2.4.0-test10 is this:

static long sock_wait_for_wmem(struct sock * sk, long timeo)
{
	DECLARE_WAITQUEUE(wait, current);

	clear_bit(SOCK_ASYNC_NOSPACE, &sk->socket->flags);
	add_wait_queue(sk->sleep, &wait);
	for (;;) {
		if (signal_pending(current))
			break;
		set_bit(SOCK_NOSPACE, &sk->socket->flags);
		set_current_state(TASK_INTERRUPTIBLE);
		if (atomic_read(&sk->wmem_alloc) < sk->sndbuf)
			break;
		if (sk->shutdown & SEND_SHUTDOWN)
			break;
		if (sk->err)
			break;
		timeo = schedule_timeout(timeo);
	}
	__set_current_state(TASK_RUNNING);
	remove_wait_queue(sk->sleep, &wait);
	return timeo;
}

Does anyone see something that is not SMP-safe? Perhaps I'm totally wrong;
this could also be somewhere in the interrupt handling, I don't know.

--
Jorge Nerin
<[email protected]>

2000-11-04 05:29:00

by Andrew Morton

Subject: Re: [patch] NE2000

Jorge Nerin wrote:
>
> ...
> So I think that it could be a little window near sock_wait_for_wmem that
> could be SMP insecure which is affecting me.
>
> The code of sock_wait_for_wmem in 2.4.0-test10 is this:
>
> static long sock_wait_for_wmem(struct sock * sk, long timeo)
> {
> DECLARE_WAITQUEUE(wait, current);
>
> clear_bit(SOCK_ASYNC_NOSPACE, &sk->socket->flags);
> add_wait_queue(sk->sleep, &wait);
> for (;;) {
> if (signal_pending(current))
> break;
> set_bit(SOCK_NOSPACE, &sk->socket->flags);
> set_current_state(TASK_INTERRUPTIBLE);
> if (atomic_read(&sk->wmem_alloc) < sk->sndbuf)
> break;
> if (sk->shutdown & SEND_SHUTDOWN)
> break;
> if (sk->err)
> break;
> timeo = schedule_timeout(timeo);
> }
> __set_current_state(TASK_RUNNING);
> remove_wait_queue(sk->sleep, &wait);
> return timeo;
> }
>
> Does someone see something SMP insecure? Perhaps I'm totally wrong, this
> could also be somewhere in the interrupt handling, don't know.

No, that code is correct, provided (current->state == TASK_RUNNING)
on entry. If it isn't, there's a race window which can cause
lost wakeups. As a check you could add:

if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
BUG();

to the start of this function.
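
As a user-space illustration of what this mask test evaluates to, here is a minimal sketch. The numeric values are assumed to mirror the 2.4-era task state definitions (running is 0, interruptible is 1, uninterruptible is 2); the DEMO_ names and the helper function are hypothetical. In this model the expression as written is true for a running task and false for a sleeping one, which is worth keeping in mind when reading the BUG() reports later in the thread.

```c
#include <assert.h>

/* Assumed values, modeled on the 2.4-era task states; names are hypothetical. */
enum {
	DEMO_TASK_RUNNING = 0,
	DEMO_TASK_INTERRUPTIBLE = 1,
	DEMO_TASK_UNINTERRUPTIBLE = 2
};

/*
 * Evaluates the same expression as the suggested check:
 * returns 1 when neither sleeping bit is set in the state word.
 */
int demo_mask_is_zero(long state)
{
	return (state & (DEMO_TASK_INTERRUPTIBLE | DEMO_TASK_UNINTERRUPTIBLE)) == 0;
}
```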

2000-11-06 08:06:58

by Paul Gortmaker

Subject: Re: ping -f kills ne2k (was:[patch] NE2000)

--- linux~/drivers/net/8390.c Fri Jul 7 03:45:36 2000
+++ linux/drivers/net/8390.c Sat Jul 22 04:56:51 2000
@@ -38,6 +38,7 @@
Paul Gortmaker : add kmod support for auto-loading of the 8390
module by all drivers that require it.
Alan Cox : Spinlocking work, added 'BUG_83C690'
+ Paul Gortmaker : Separate out Tx timeout code from Tx path.

Sources:
The National Semiconductor LAN Databook, and the 3Com 3c503 databook.
@@ -105,6 +106,7 @@
/* Index to functions. */
static void ei_tx_intr(struct net_device *dev);
static void ei_tx_err(struct net_device *dev);
+static void ei_tx_timeout(struct net_device *dev);
static void ei_receive(struct net_device *dev);
static void ei_rx_overrun(struct net_device *dev);

@@ -161,6 +163,13 @@
printk(KERN_EMERG "%s: ei_open passed a non-existent device!\n", dev->name);
return -ENXIO;
}
+
+ /* The card I/O part of the driver (e.g. 3c503) can hook a Tx timeout
+ wrapper that does e.g. media check & then calls ei_tx_timeout. */
+ if (dev->tx_timeout == NULL)
+ dev->tx_timeout = ei_tx_timeout;
+ if (dev->watchdog_timeo <= 0)
+ dev->watchdog_timeo = TX_TIMEOUT;

/*
* Grab the page lock so we own the register set, then call
@@ -200,89 +209,66 @@
}

/**
- * ei_start_xmit - begin packet transmission
- * @skb: packet to be sent
- * @dev: network device to which packet is sent
+ * ei_tx_timeout - handle transmit time out condition
+ * @dev: network device which has apparently fallen asleep
*
- * Sends a packet to an 8390 network device.
+ * Called by kernel when device never acknowledges a transmit has
+ * completed (or failed) - i.e. never posted a Tx related interrupt.
*/
-
-static int ei_start_xmit(struct sk_buff *skb, struct net_device *dev)
+
+void ei_tx_timeout(struct net_device *dev)
{
long e8390_base = dev->base_addr;
struct ei_device *ei_local = (struct ei_device *) dev->priv;
- int length, send_length, output_page;
+ int txsr, isr, tickssofar = jiffies - dev->trans_start;
unsigned long flags;

- /*
- * If it has been too long since the last Tx, we assume the
- * board has died and kick it.
- */
-
- if (netif_queue_stopped(dev)) {
- /* Do timeouts, just like the 8003 driver. */
- int txsr;
- int isr;
- int tickssofar = jiffies - dev->trans_start;
-
- /*
- * Need the page lock. Now see what went wrong. This bit is
- * fast.
- */
-
- spin_lock_irqsave(&ei_local->page_lock, flags);
- txsr = inb(e8390_base+EN0_TSR);
- if (tickssofar < TX_TIMEOUT || (tickssofar < (TX_TIMEOUT+5) && ! (txsr & ENTSR_PTX)))
- {
- spin_unlock_irqrestore(&ei_local->page_lock, flags);
- return 1;
- }
-
- ei_local->stat.tx_errors++;
- isr = inb(e8390_base+EN0_ISR);
- if (!netif_running(dev)) {
- spin_unlock_irqrestore(&ei_local->page_lock, flags);
- printk(KERN_WARNING "%s: xmit on stopped card\n", dev->name);
- return 1;
- }
-
- /*
- * Note that if the Tx posted a TX_ERR interrupt, then the
- * error will have been handled from the interrupt handler
- * and not here. Error statistics are handled there as well.
- */
+ ei_local->stat.tx_errors++;

- printk(KERN_DEBUG "%s: Tx timed out, %s TSR=%#2x, ISR=%#2x, t=%d.\n",
- dev->name, (txsr & ENTSR_ABT) ? "excess collisions." :
- (isr) ? "lost interrupt?" : "cable problem?", txsr, isr, tickssofar);
+ spin_lock_irqsave(&ei_local->page_lock, flags);
+ txsr = inb(e8390_base+EN0_TSR);
+ isr = inb(e8390_base+EN0_ISR);
+ spin_unlock_irqrestore(&ei_local->page_lock, flags);

- if (!isr && !ei_local->stat.tx_packets)
- {
- /* The 8390 probably hasn't gotten on the cable yet. */
- ei_local->interface_num ^= 1; /* Try a different xcvr. */
- }
+ printk(KERN_DEBUG "%s: Tx timed out, %s TSR=%#2x, ISR=%#2x, t=%d.\n",
+ dev->name, (txsr & ENTSR_ABT) ? "excess collisions." :
+ (isr) ? "lost interrupt?" : "cable problem?", txsr, isr, tickssofar);

- /*
- * Play shuffle the locks, a reset on some chips takes a few
- * mS. We very rarely hit this point.
- */
-
- spin_unlock_irqrestore(&ei_local->page_lock, flags);
+ if (!isr && !ei_local->stat.tx_packets)
+ {
+ /* The 8390 probably hasn't gotten on the cable yet. */
+ ei_local->interface_num ^= 1; /* Try a different xcvr. */
+ }

- /* Ugly but a reset can be slow, yet must be protected */
+ /* Ugly but a reset can be slow, yet must be protected */

- disable_irq_nosync(dev->irq);
- spin_lock(&ei_local->page_lock);
+ disable_irq_nosync(dev->irq);
+ spin_lock(&ei_local->page_lock);

- /* Try to restart the card. Perhaps the user has fixed something. */
- ei_reset_8390(dev);
- NS8390_init(dev, 1);
+ /* Try to restart the card. Perhaps the user has fixed something. */
+ ei_reset_8390(dev);
+ NS8390_init(dev, 1);

- spin_unlock(&ei_local->page_lock);
- enable_irq(dev->irq);
- dev->trans_start = jiffies;
- }
+ spin_unlock(&ei_local->page_lock);
+ enable_irq(dev->irq);
+ netif_wake_queue(dev);
+}

+/**
+ * ei_start_xmit - begin packet transmission
+ * @skb: packet to be sent
+ * @dev: network device to which packet is sent
+ *
+ * Sends a packet to an 8390 network device.
+ */
+
+static int ei_start_xmit(struct sk_buff *skb, struct net_device *dev)
+{
+ long e8390_base = dev->base_addr;
+ struct ei_device *ei_local = (struct ei_device *) dev->priv;
+ int length, send_length, output_page;
+ unsigned long flags;
+
length = skb->len;

/* Mask interrupts from the ethercard.
@@ -1147,6 +1133,7 @@
EXPORT_SYMBOL(ei_open);
EXPORT_SYMBOL(ei_close);
EXPORT_SYMBOL(ei_interrupt);
+EXPORT_SYMBOL(ei_tx_timeout);
EXPORT_SYMBOL(ethdev_init);
EXPORT_SYMBOL(NS8390_init);


Attachments:
2400-t5-8390-diff0 (5.49 kB)
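
The hook installed in ei_open by the patch above follows a "use the default unless the card driver already set one" pattern. A minimal user-space sketch of that pattern, under stated assumptions: struct demo_net_device, DEMO_TX_TIMEOUT and all demo_ functions are hypothetical stand-ins, not the kernel's types, and the timeout value is a placeholder.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TX_TIMEOUT 20	/* placeholder tick count, not the driver's value */

struct demo_net_device {
	void (*tx_timeout)(struct demo_net_device *dev);
	int watchdog_timeo;
	int resets;		/* how often the core handler ran */
	int media_checks;	/* how often a card wrapper ran */
};

/* Stand-in for the shared core handler (ei_tx_timeout in the patch). */
void demo_core_tx_timeout(struct demo_net_device *dev)
{
	dev->resets++;
}

/* Stand-in for a card-specific wrapper that does extra work first. */
void demo_card_tx_timeout(struct demo_net_device *dev)
{
	dev->media_checks++;
	demo_core_tx_timeout(dev);
}

/* Mirrors the two defaults added to the open path. */
void demo_open(struct demo_net_device *dev)
{
	if (dev->tx_timeout == NULL)
		dev->tx_timeout = demo_core_tx_timeout;
	if (dev->watchdog_timeo <= 0)
		dev->watchdog_timeo = DEMO_TX_TIMEOUT;
}

/* Usage demonstration: returns 1 if both devices end up configured as expected. */
int demo_check_defaults(void)
{
	struct demo_net_device plain = { NULL, 0, 0, 0 };
	struct demo_net_device card = { demo_card_tx_timeout, 50, 0, 0 };

	demo_open(&plain);
	demo_open(&card);
	plain.tx_timeout(&plain);
	card.tx_timeout(&card);

	return plain.tx_timeout == demo_core_tx_timeout &&
	       plain.watchdog_timeo == DEMO_TX_TIMEOUT &&
	       card.tx_timeout == demo_card_tx_timeout &&
	       card.watchdog_timeo == 50 &&
	       plain.resets == 1 && card.resets == 1 &&
	       card.media_checks == 1 && plain.media_checks == 0;
}
```

The point of the NULL test is that a card driver which has already hooked its own wrapper keeps it, while every other driver gets the shared handler without any change of its own.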

2000-11-06 17:29:35

by Jorge Nerin

Subject: Re: [patch] NE2000

Andrew Morton wrote:
>
> Jorge Nerin wrote:
> >
> > ...
> > So I think that it could be a little window near sock_wait_for_wmem that
> > could be SMP insecure which is affecting me.
> >
> > The code of sock_wait_for_wmem in 2.4.0-test10 is this:
> >
> > static long sock_wait_for_wmem(struct sock * sk, long timeo)
> > {
> > DECLARE_WAITQUEUE(wait, current);
> >
> > clear_bit(SOCK_ASYNC_NOSPACE, &sk->socket->flags);
> > add_wait_queue(sk->sleep, &wait);
> > for (;;) {
> > if (signal_pending(current))
> > break;
> > set_bit(SOCK_NOSPACE, &sk->socket->flags);
> > set_current_state(TASK_INTERRUPTIBLE);
> > if (atomic_read(&sk->wmem_alloc) < sk->sndbuf)
> > break;
> > if (sk->shutdown & SEND_SHUTDOWN)
> > break;
> > if (sk->err)
> > break;
> > timeo = schedule_timeout(timeo);
> > }
> > __set_current_state(TASK_RUNNING);
> > remove_wait_queue(sk->sleep, &wait);
> > return timeo;
> > }
> >
> > Does someone see something SMP insecure? Perhaps I'm totally wrong, this
> > could also be somewhere in the interrupt handling, don't know.
>
> No, that code is correct, provided (current->state == TASK_RUNNING)
> on entry. If it isn't, there's a race window which can cause
> lost wakeups. As a check you could add:
>
> if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
> BUG();
>
> to the start of this function.

OK, added, the function now looks like this:

static long sock_wait_for_wmem(struct sock * sk, long timeo)
{
	DECLARE_WAITQUEUE(wait, current);

	if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
		BUG();

	clear_bit(SOCK_ASYNC_NOSPACE, &sk->socket->flags);
	add_wait_queue(sk->sleep, &wait);
	for (;;) {
		if (signal_pending(current))
			break;
		set_bit(SOCK_NOSPACE, &sk->socket->flags);
		set_current_state(TASK_INTERRUPTIBLE);
		if (atomic_read(&sk->wmem_alloc) < sk->sndbuf)
			break;
		if (sk->shutdown & SEND_SHUTDOWN)
			break;
		if (sk->err)
			break;
		timeo = schedule_timeout(timeo);
	}
	__set_current_state(TASK_RUNNING);
	remove_wait_queue(sk->sleep, &wait);
	return timeo;
}

I had to put it _after_ DECLARE_WAITQUEUE in order to compile; if I put
it before, the compiler tells me that wait is undeclared :-?

Well, I recompiled, rebooted, and got caught by BUG() after some tests.

[root@quartz ~]# ping -f -s 15000 pp_head
PING pp_head.pp.redvip.net (192.168.1.20) from 192.168.1.3 :
15000(15028) bytes of data.
..............invalid operand: 0000
CPU: 0
EIP: 0010:[<c0195b30>]
EFLAGS: 00010296
eax: 0000001a ebx: c2604000 ecx: c021e2e8 edx: c0266440
esi: c26eeba0 edi: c26eeba0 ebp: 7fffffff esp: c2605c88
ds: 0018 es: 0018 ss: 0018
Process ping (pid: 2202, stackpage=c2605000)
Stack: c02047a5 c02049eb 000002d2 7fffffff c26eeba0 c2604000 00000000
c26c4024
01234567 c2604000 00000000 00000000 01234567 c2604000 00000000
00000000
c0195c99 c26eeba0 7fffffff c26c4024 c264d0c0 c26c4010 000005dc
00000000
Call Trace: [<c02047a5>] [<c02049eb>] [<c0195c99>] [<c01aa692>]
[<c01aaa1e>] [<c01c0180>] [<c01c04de>]
[<c01c0180>] [<c01c6f48>] [<c01c6f86>] [<c0192fad>] [<c01c6f48>]
[<c01941b8>] [<c0193ee6>] [<c0165c9a>]
[<c016e602>] [<c01945fc>] [<c0109477>]
Code: 0f 0b 83 c4 0c 8b 87 50 03 00 00 f0 0f ba 70 04 00 8d 4c 24
Violación de segmento
[root@quartz ~]#


Nov 6 12:00:07 quartz kernel: kernel BUG at sock.c:722!
Nov 6 12:00:07 quartz kernel: invalid operand: 0000
Nov 6 12:00:07 quartz kernel: CPU: 0
Nov 6 12:00:07 quartz kernel: EIP: 0010:[sock_wait_for_wmem+104/244]
Nov 6 12:00:07 quartz kernel: EFLAGS: 00010296
Nov 6 12:00:07 quartz kernel: eax: 0000001a ebx: c2604000 ecx:
c021e2e8 edx: c0266440
Nov 6 12:00:07 quartz kernel: esi: c26eeba0 edi: c26eeba0 ebp:
7fffffff esp: c2605c88
Nov 6 12:00:07 quartz kernel: ds: 0018 es: 0018 ss: 0018
Nov 6 12:00:07 quartz kernel: Process ping (pid: 2202,
stackpage=c2605000)
Nov 6 12:00:07 quartz kernel: Stack: c02047a5 c02049eb 000002d2
7fffffff c26eeba0 c2604000 00000000 c26c4024
Nov 6 12:00:07 quartz kernel: 01234567 c2604000 00000000
00000000 01234567 c2604000 00000000 00000000
Nov 6 12:00:07 quartz kernel: c0195c99 c26eeba0 7fffffff
c26c4024 c264d0c0 c26c4010 000005dc 00000000
Nov 6 12:00:07 quartz kernel: Call Trace: [vga_con+2501/10176]
[vga_con+3083/10176] [sock_alloc_send_skb+221/300]
[ip_build_xmit_slow+378/1208] [ip_build_xmit+78/816] [raw_getfrag+0/36]
[raw_sendmsg+642/752]
Nov 6 12:00:07 quartz kernel: [raw_getfrag+0/36]
[inet_sendmsg+0/68] [inet_sendmsg+62/68] [sock_sendmsg+129/164]
[inet_sendmsg+0/68] [sys_sendmsg+380/464] [sys_recvfrom+238/256]
[set_cursor+110/132]
Nov 6 12:00:07 quartz kernel: [write_chan+462/488]
[sys_socketcall+460/484] [system_call+55/64]
Nov 6 12:00:07 quartz kernel: Code: 0f 0b 83 c4 0c 8b 87 50 03 00 00 f0
0f ba 70 04 00 8d 4c 24

Nov 6 12:06:11 quartz kernel: kernel BUG at sock.c:722!
Nov 6 12:06:11 quartz kernel: invalid operand: 0000
Nov 6 12:06:11 quartz kernel: CPU: 1
Nov 6 12:06:11 quartz kernel: EIP: 0010:[sock_wait_for_wmem+104/244]
Nov 6 12:06:11 quartz kernel: EFLAGS: 00010296
Nov 6 12:06:11 quartz kernel: eax: 0000001a ebx: c1f54000 ecx:
c021e2e8 edx: c0266440
Nov 6 12:06:11 quartz kernel: esi: c3cd6100 edi: c3cd6100 ebp:
7fffffff esp: c1f55c88
Nov 6 12:06:11 quartz kernel: ds: 0018 es: 0018 ss: 0018
Nov 6 12:06:11 quartz kernel: Process ping (pid: 2993,
stackpage=c1f55000)
Nov 6 12:06:11 quartz kernel: Stack: c02047a5 c02049eb 000002d2
7fffffff c3cd6100 c1f54000 00000000 c2648824
Nov 6 12:06:11 quartz kernel: 01234567 c1f54000 00000000
00000000 01234567 c1f54000 00000000 00000000
Nov 6 12:06:11 quartz kernel: c0195c99 c3cd6100 7fffffff
c2648824 c260c2c0 c2648810 000005dc 00000000
Nov 6 12:06:11 quartz kernel: Call Trace: [vga_con+2501/10176]
[vga_con+3083/10176] [sock_alloc_send_skb+221/300]
[ip_build_xmit_slow+378/1208] [ip_build_xmit+78/816] [raw_getfrag+0/36]
[raw_sendmsg+642/752]
Nov 6 12:06:11 quartz kernel: [raw_getfrag+0/36]
[inet_sendmsg+0/68] [inet_sendmsg+62/68] [sock_sendmsg+129/164]
[inet_sendmsg+0/68] [sys_sendmsg+380/464] [sys_recvfrom+238/256]
[set_cursor+110/132]
Nov 6 12:06:11 quartz kernel: [write_chan+462/488]
[sys_socketcall+460/484] [system_call+55/64]
Nov 6 12:06:11 quartz kernel: Code: 0f 0b 83 c4 0c 8b 87 50 03 00 00 f0
0f ba 70 04 00 8d 4c 24
Nov 6 12:06:11 quartz kernel: NET: 6 messages suppressed.
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c26984a0
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c2828a80
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c265d560
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c2828a80
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c53bc820
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:11 quartz kernel: NAT: 0 dropping untracked packet c264d780
1 192.168.1.20 -> 192.168.1.3
Nov 6 12:06:41 quartz kernel: kernel BUG at sock.c:722!
Nov 6 12:06:41 quartz kernel: invalid operand: 0000
Nov 6 12:06:41 quartz kernel: CPU: 1
Nov 6 12:06:41 quartz kernel: EIP: 0010:[sock_wait_for_wmem+104/244]
Nov 6 12:06:41 quartz kernel: EFLAGS: 00010296
Nov 6 12:06:41 quartz kernel: eax: 0000001a ebx: c1f54000 ecx:
c021e2e8 edx: c0266440
Nov 6 12:06:41 quartz kernel: esi: c3cd6800 edi: c3cd6800 ebp:
7fffffff esp: c1f55c88
Nov 6 12:06:41 quartz kernel: ds: 0018 es: 0018 ss: 0018
Nov 6 12:06:41 quartz kernel: Process ping (pid: 2994,
stackpage=c1f55000)
Nov 6 12:06:41 quartz kernel: Stack: c02047a5 c02049eb 000002d2
7fffffff c3cd6800 c1f54000 00000000 c2644824
Nov 6 12:06:41 quartz kernel: 01234567 c1f54000 00000000
00000000 01234567 c1f54000 00000000 00000000
Nov 6 12:06:41 quartz kernel: c0195c99 c3cd6800 7fffffff
c2644824 c5b8be60 c2644810 000005dc 00000000
Nov 6 12:06:41 quartz kernel: Call Trace: [vga_con+2501/10176]
[vga_con+3083/10176] [sock_alloc_send_skb+221/300]
[ip_build_xmit_slow+378/1208] [ip_build_xmit+78/816] [raw_getfrag+0/36]
[raw_sendmsg+642/752]
Nov 6 12:06:41 quartz kernel: [raw_getfrag+0/36]
[inet_sendmsg+0/68] [inet_sendmsg+62/68] [sock_sendmsg+129/164]
[inet_sendmsg+0/68] [sys_sendmsg+380/464] [sys_recvfrom+238/256]
[kfree_skbmem+40/140]
Nov 6 12:06:41 quartz kernel: [__kfree_skb+369/376]
[nfsd:__insmod_nfsd_O/lib/modules/2.4.0-test10-ne2k/kernel/fs/nfs+-289558/96]
[nfsd:__insmod_nfsd_O/lib/modules/2.4.0-test10-ne2k/kernel/fs/nfs+-288860/96]
[qdisc_restart+108/376] [net_tx_action+194/300] [sys_socketcall+460/484]
[system_call+55/64]
Nov 6 12:06:41 quartz kernel: Code: 0f 0b 83 c4 0c 8b 87 50 03 00 00 f0
0f ba 70 04 00 8d 4c 24

Sorry, I can't pass it through ksymoops because it doesn't work for me
with later kernels (RH 6.9.5): it says "Fatal Error (re_compile) - Invalid
range end", even after I recompiled it. So I have to give you the output
of sysklogd.

As a side note, after those BUG()s the network is still working.
I have these rules added in my init scripts:
modprobe -k ip_tables
modprobe -k iptable_nat
insmod -k ipt_MASQUERADE
iptables -t nat -A POSTROUTING -o ppp+ -j MASQUERADE
iptables -A FORWARD -i ppp+ -m state --state RELATED,ESTABLISHED -j ACCEPT
iptables -A FORWARD -o ppp+ -j ACCEPT
echo 0 >/proc/sys/net/ipv4/tcp_ecn
Under heavy packet load in these tests I see those NAT messages about
dropped packets.

I can run more tests on request.

--
Jorge Nerin
<[email protected]>

2000-11-06 18:41:19

by Alexey Kuznetsov

Subject: Re: [patch] NE2000

Hello!

> if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
> BUG();

The Puzzle... 8) It is truly impossible.

Alexey

2000-11-06 18:47:19

by Alexey Kuznetsov

Subject: Re: [patch] NE2000

Hello!

> No, that code is correct, provided (current->state == TASK_RUNNING)
> on entry. If it isn't, there's a race window which can cause
> lost wakeups. As a check you could add:
>
> if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
> BUG();

Though it really cannot happen and really happens, as we have seen... 8)

In any case, Andrew, where is the race when we enter in a sleeping state?
The wakeup is not lost; it is simply not required when we are not going
to schedule and we force the task to the running state.

I still do not see how it is possible that a task runs in a sleeping state.
Apparently, a set_current_state is forgotten somewhere. Do you see where? 8)

Alexey

2000-11-06 22:33:55

by Andrew Morton

Subject: Re: [patch] NE2000

[email protected] wrote:
>
> Hello!
>
> > No, that code is correct, provided (current->state == TASK_RUNNING)
> > on entry. If it isn't, there's a race window which can cause
> > lost wakeups. As a check you could add:
> >
> > if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
> > BUG();
>
> Though it really cannot happen and really happens, as we have seen... 8)
>
> In any case, Andrew, where is the race, when we enter in sleeping state?
> Wakeup is not lost, it is just not required when we are not going
> to schedule and force task to running state.

set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(...);
/* window here */
set_current_state(TASK_INTERRUPTIBLE);
schedule();

If there's a wakeup by another CPU (or this CPU in an interrupt) in
that window, current->state will get switched to TASK_RUNNING.

Then it's immediately overwritten and we go to sleep. Lost wakeup.
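
The sequence above can be replayed deterministically in user space. This is only a single-threaded model under stated assumptions: the DEMO_ state values and both function names are hypothetical, and the "waker" is simulated by a flag rather than by a second CPU. The second function adds a re-test of the wait condition after the state is set, as the loop in sock_wait_for_wmem does, to show how that pattern avoids sleeping through the event.

```c
#include <assert.h>

/* Assumed to mirror the 2.4-era values; the DEMO_ names are hypothetical. */
enum { DEMO_TASK_RUNNING = 0, DEMO_TASK_INTERRUPTIBLE = 1 };

/*
 * Bare sequence from the message: returns 1 if the task would enter
 * schedule() while still marked as sleeping.
 */
int demo_bare_would_sleep(int wake_in_window)
{
	int state = DEMO_TASK_INTERRUPTIBLE;	/* first set_current_state() */

	/* add_wait_queue(...) would happen here */
	if (wake_in_window)
		state = DEMO_TASK_RUNNING;	/* waker marks the task runnable */
	state = DEMO_TASK_INTERRUPTIBLE;	/* second set_current_state() overwrites it */
	return state != DEMO_TASK_RUNNING;
}

/*
 * Variant that re-tests the wait condition after setting the state and
 * before scheduling.
 */
int demo_recheck_would_sleep(int wake_in_window)
{
	int condition = 0;
	int state;

	if (wake_in_window)
		condition = 1;			/* waker makes the condition true */
	state = DEMO_TASK_INTERRUPTIBLE;	/* set_current_state() */
	if (condition)
		state = DEMO_TASK_RUNNING;	/* break out instead of sleeping */
	return state != DEMO_TASK_RUNNING;
}
```

In the bare sequence the task sleeps whether or not a wakeup arrived in the window; with the re-test it sleeps only when nothing has happened yet.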

> I still do not see how it is possible that task runs in sleeping state.
> Apparently, set_current_state is forgotten somewhere. Do you see, where? 8)

Nope. Is Jorge running SMP?

2000-11-07 00:11:31

by Jorge Nerin

Subject: Re: ping -f kills ne2k (was:[patch] NE2000)

Paul Gortmaker wrote:
>
> >
> > Well, I have tried it with 2.4.0-test10, both SMP and non-SMP, and the
> > result is a little confusing.
> >
> > Under SMP a ping -s 50000 -f other_host takes down the network access
> > with no messages (ne2k-pci), and no possibility of being restored
> > without a reboot.
> >
> > Under UP the same command works ok, but after a while the dots stop for
> > 30sec, then ping prints an 'E' and the dots continue. strace revealed
> > this:
>
> Another suggestion - if you have your heart set on using ping
> as your network stress tool, you may want to try using multiple
> instances of MTU sized pings versus a single "ping -s 50000".
> In this way you aren't involving any IP frag code and its associated
> bean counting - giving us one less factor to consider.
>
> Oh, and since you get a silent failure, maybe you would be interested
> in testing this patch I was (originally) saving for 2.5.x. -- It adds
> watchdog transmit timeout functionality to 8390.c (which is used by
> the ne2k-pci driver). Last time I updated it was a couple of months
> ago, but nothing has changed since then.
>
> Paul.
>

Tested with ping -f -s 1400 (1400 so as to stay under 1500).
It took about half an hour and more than one million packets, but I
finally took the net down, with 12 concurrent pings.
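
To make the packet sizes in this thread concrete, here is a small sketch that estimates how many IPv4 fragments a given ping -s payload produces. It assumes a 20-byte IPv4 header without options, an 8-byte ICMP header, and that every fragment except the last carries a multiple of 8 data bytes; the function and constant names are hypothetical.

```c
#include <assert.h>

enum { DEMO_IP_HEADER = 20, DEMO_ICMP_HEADER = 8 };

/* Estimated number of IPv4 fragments for "ping -s payload" at the given MTU. */
unsigned demo_ping_fragments(unsigned payload, unsigned mtu)
{
	unsigned data = payload + DEMO_ICMP_HEADER;	/* bytes following the IP header */
	unsigned per_frag;

	if (data + DEMO_IP_HEADER <= mtu)
		return 1;				/* fits in one packet, no fragmentation */

	/* Non-final fragments carry a multiple of 8 data bytes. */
	per_frag = ((mtu - DEMO_IP_HEADER) / 8) * 8;
	return (data + per_frag - 1) / per_frag;	/* round up */
}
```

With a 1500-byte MTU, a 1400-byte payload stays in a single packet, while the 15000-byte and 50000-byte payloads used earlier come out at 11 and 34 fragments respectively under these assumptions.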

To eliminate factors I have deleted all the NAT rules which gave messages
about dropped packets, and unloaded all the iptables modules.

I had to run the test without the check in sock_wait_for_wmem:
if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
BUG();

because, as I said in a previous message, it gave me BUG()s very early in
the tests, while I still had network access.

If someone has a better suggestion for a NIC stress tool I can try it, but
for now I only have two ways of reaching this limit: ping -f with big
packets, and sometimes (only 4 or 5 times) a copy of a large file over
NFS, which is not an easy way to trigger it.

--
Jorge Nerin
<[email protected]>

2000-11-07 02:41:09

by Andrew Morton

Subject: Re: [patch] NE2000

[email protected] wrote:
>
> Hello!
>
> > No, that code is correct, provided (current->state == TASK_RUNNING)
> > on entry. If it isn't, there's a race window which can cause
> > lost wakeups. As a check you could add:
> >
> > if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
> > BUG();
>
> Though it really cannot happen and really happens, as we have seen... 8)
>
> In any case, Andrew, where is the race, when we enter in sleeping state?
> Wakeup is not lost, it is just not required when we are not going
> to schedule and force task to running state.
>
> I still do not see how it is possible that task runs in sleeping state.
> Apparently, set_current_state is forgotten somewhere. Do you see, where? 8)
>

OK, there are a few areas which look fishy.

Calling __lock_sock when we're getting ready to wait
on a different waitqueue looks like a rather risky area.
We have a single task which is on two waitqueues.

Consider the case of tcp_data_wait():

        add_wait_queue(sk->sleep)
        set_current_state(TASK_INTERRUPTIBLE);
        release_sock(sk);
        if (...)        /* Suppose this evaluates to false */
                schedule_timeout();
        lock_sock();

        __lock_sock()
        {
                add_wait_queue_exclusive(sk->lock.wq);
                /* Window 1: What does a wake_up(sk->sleep) do here? */
                current->state = TASK_EXCLUSIVE | TASK_UNINTERRUPTIBLE;
                /* Window 2: Bad things happen here */
                schedule();

If someone does a wakeup(sk->sleep) in Window 2 in
__lock_sock() the wakeup code will think that the
task is sleeping on sk->sleep in state
TASK_EXCLUSIVE|TASK_UNINTERRUPTIBLE,
when in fact it is not. So a wakeup which _should_ have gone to
a different exclusive task actually goes to this one. This is
fantastically hard to hit because of the direction of the
waitqueue scan.

If the wakeup on sk->sleep happens during Window 1
it will be completely lost, but that's OK because
this task is not yet TASK_EXCLUSIVE (providing the
write ordering behaves as we want?)

If a wakeup on sk->lock.wq happens during Window 1
it will be completely lost.

wait_for_connect() and wait_for_tcp_memory() play similar
games with lock_sock() whereby they can appear to be on
two waitqueues at the same time. And again, because
lock_sock() uses TASK_EXCLUSIVE a wake_up on sk->sleep
could choose this task instead of a TASK_EXCLUSIVE task
which is _really_ sleeping on sk->sleep.

Now, this may not be a problem in practise, and in fact the
above may not be bugs because I missed something. But I suggest you
have a think about it. My brain is starting to hurt.

But none of these explain Jorge's problem. How he got to where
he did in !TASK_RUNNING. Plus the possible lock_sock problems
just look too damn hard to hit to explain Jorge's repeatability.

It may be useful to put a Pentium hardware watchpoint onto
current->state. Does kdb support those?

Can sock_fasync() be called when we're on a waitqueue, not in
state TASK_RUNNING and prior to schedule()?

inet_wait_for_connect() is OK.
wait_for_tcp_connect() is OK.
tcp_close() is OK.

Also, are you sure that all occurrences of

current->state = <whatever>;

are still safe on weakly ordered CPUs? (Not that this
would explain Jorge's problem).

hmm.. khttpd tries to do wake-one, but
interruptible_sleep_on_timeout() confounds it.
Bummer.
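
The Window 2 scenario above can be sketched in userspace. This is only a simplified model, not the kernel's actual wakeup code: the state values, struct model_task, model_wake_up() and model_window2_steals_wakeup() are all hypothetical names made up for illustration. It assumes a wake-one scan that walks the queue in order and stops at the first sleeper whose state carries the exclusive bit, and it assumes the task sitting in Window 2 is scanned before the genuine exclusive sleeper.

```c
#include <stddef.h>

/* Simplified stand-ins for the task state bits (values are illustrative). */
enum {
    ST_RUNNING         = 0,
    ST_INTERRUPTIBLE   = 1,
    ST_UNINTERRUPTIBLE = 2,
    ST_EXCLUSIVE       = 32
};

struct model_task {
    const char *name;
    int state;
};

/*
 * Hypothetical wake-one scan: wake each sleeping task on the queue in
 * order, stopping after the first one whose state carries ST_EXCLUSIVE.
 * Returns the number of tasks woken.
 */
int model_wake_up(struct model_task **queue, size_t n)
{
    int woken = 0;

    for (size_t i = 0; i < n; i++) {
        int state = queue[i]->state;

        if (!(state & (ST_INTERRUPTIBLE | ST_UNINTERRUPTIBLE)))
            continue;               /* not sleeping: skip */
        queue[i]->state = ST_RUNNING;
        woken++;
        if (state & ST_EXCLUSIVE)
            break;                  /* wake-one: stop at first exclusive sleeper */
    }
    return woken;
}

/*
 * Task A is on sk->sleep as a non-exclusive waiter, but has already set
 * EXCLUSIVE|UNINTERRUPTIBLE on behalf of sk->lock.wq ("Window 2").
 * Task B is a genuine exclusive sleeper on sk->sleep.
 * Returns 1 if the wakeup is consumed by A and B is left asleep.
 */
int model_window2_steals_wakeup(void)
{
    struct model_task a = { "A", ST_EXCLUSIVE | ST_UNINTERRUPTIBLE };
    struct model_task b = { "B", ST_EXCLUSIVE | ST_INTERRUPTIBLE };
    struct model_task *sk_sleep[] = { &a, &b };

    model_wake_up(sk_sleep, 2);
    return a.state == ST_RUNNING && b.state != ST_RUNNING;
}
```

In this model the scan stops at A, so B never sees the wakeup. If A's state did not carry the exclusive bit, the scan would wake A and continue on to B.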

2000-11-08 16:46:12

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: [patch] NE2000

Hello!

> > In any case, Andrew, where is the race, when we enter in sleeping state?
> > Wakeup is not lost, it is just not required when we are not going
> > to schedule and force task to running state.
>
> set_current_state(TASK_INTERRUPTIBLE);
> add_wait_queue(...);
> /* window here */
> set_current_state(TASK_INTERRUPTIBLE);
> schedule();
>
> If there's a wakeup by another CPU (or this CPU in an interrupt) in
> that window, current->state will get switched to TASK_RUNNING.
>
> Then it's immediately overwritten and we go to sleep. Lost wakeup.

Look into the code again. It looks rather different. Again:

> > Wakeup is not lost, it is just not required when we are not going
> > to schedule and force task to running state.

So it is right, not depending on anything.

Alexey
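
One possible reading of the disagreement above, as a single-threaded userspace sketch. It is an assumption that the real code tests its wait condition after setting the task state and skips schedule() when the condition already holds; struct model, model_wake() and both sleeps_*() functions are hypothetical names, not kernel interfaces.

```c
enum { T_RUNNING = 0, T_INTERRUPTIBLE = 1 };

struct model {
    int state;      /* stands in for current->state */
    int condition;  /* the event the sleeper is waiting for */
};

/* A waker makes the condition true, then marks the task runnable. */
static void model_wake(struct model *m)
{
    m->condition = 1;
    m->state = T_RUNNING;
}

/*
 * The quoted sequence: the state is set again after the window and
 * schedule() is called unconditionally. Returns 1 if the task would go
 * to sleep even though the wakeup has already happened.
 */
int sleeps_without_recheck(void)
{
    struct model m = { T_RUNNING, 0 };

    m.state = T_INTERRUPTIBLE;
    /* add_wait_queue(...) would go here */
    model_wake(&m);                 /* wakeup lands in the window */
    m.state = T_INTERRUPTIBLE;      /* overwrites T_RUNNING */
    return m.state != T_RUNNING;    /* schedule() would block */
}

/*
 * Variant that tests the condition after setting the state and only
 * schedules if it is still false. Returns 1 if the task would sleep.
 */
int sleeps_with_recheck(void)
{
    struct model m = { T_RUNNING, 0 };

    m.state = T_INTERRUPTIBLE;
    model_wake(&m);                 /* same wakeup in the same window */
    m.state = T_INTERRUPTIBLE;
    if (m.condition) {
        m.state = T_RUNNING;        /* force back to running, no schedule() */
        return 0;
    }
    return 1;
}
```

In the first variant the overwritten state sends the task to sleep after the event has fired. In the second the same overwrite is harmless, because the task never reaches schedule().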

2000-11-08 20:32:14

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: [patch] NE2000

Hello!

[ Dave, please, look! I will strain brains this night too.
Indeed, this sounds dubious. ]

No, Andrew, this is surely not related to either of puzzles even if it
is really buggy place. ping does not use either tcp or socket lock. 8)

> Can sock_fasync() be called when we're on a waitqueue, not in
> state TASK_RUNNING and prior to schedule()?

No top level syscalls can be called in either state but running.
That funny BUG() proves the opposite, but this must not happen in any case.
At least, I still cannot reproduce this here. 8)

Alexey




2000-11-09 01:34:03

by David Miller

[permalink] [raw]
Subject: Re: [patch] NE2000

From: [email protected]
Date: Wed, 8 Nov 2000 23:31:28 +0300 (MSK)

[ Dave, please, look! I will strain brains this night too.
Indeed, this sounds dubious. ]

It is true disaster to be on multiple wait queues at once.
There are no doubts.

No, Andrew, this is surely not related to either of puzzles even if it
is really buggy place. ping does not use either tcp or socket lock. 8)

(BTW, this BUG() case sounds like memory corruption, not logic bug in
the code. BUTTT there was hard error in test9, but fixed in test10,
about wakeups. It would set task running state back to TASK_RUNNING
outside of runqueue lock, then add task to runqueue with lock held.
I assume test10 was tried already though.)

Yes, these multiple wait-queue cases must be repaired. BTW, look
at fs/pipe.c:pipe_wait(), whoever wrote this understood, even though
second wait queue hides behind semaphore :-)))

Consider next the case of being on some wait queue, and touching user
space, taking fault and sleeping on disk I/O or low memory. This
issue could have very far reaching consequences.

I will think about this some more.

Later,
David S. Miller
[email protected]

2000-11-09 01:43:02

by David Miller

[permalink] [raw]
Subject: Re: [patch] NE2000

From: [email protected]
Date: Wed, 8 Nov 2000 23:31:28 +0300 (MSK)

[ Dave, please, look! I will strain brains this night too.
Indeed, this sounds dubious. ]

Alexey! Even someone understood all this already, look
to include/net/sock.h SOCK_SLEEP_{PRE,POST} macros :-)

I will compose a patch to fix all this.

Later,
David S. Miller
[email protected]

2000-11-09 18:14:21

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: [patch] NE2000

Hello!

> Alexey! Even someone understood all this already, look
> to include/net/sock.h SOCK_SLEEP_{PRE,POST} macros :-)
>
> I will compose a patch to fix all this.

O! But who was this wiseman? 8)

Alexey

2000-11-09 18:26:31

by Steven Whitehouse

[permalink] [raw]
Subject: Re: [patch] NE2000

Hi,

I have to own up and say that it was me :-) you'll see that DECnet is the
only protocol to use these macros at the moment. I'm sure though that I
only copied what IPv4 was doing at the time, along with the hints I had
from yourself and Dave,

Steve.

>
> Hello!
>
> > Alexey! Even someone understood all this already, look
> > to include/net/sock.h SOCK_SLEEP_{PRE,POST} macros :-)
> >
> > I will compose a patch to fix all this.
>
> O! But who was this wiseman? 8)
>
> Alexey

2000-11-09 19:49:15

by Jorge Nerin

[permalink] [raw]
Subject: Re: ping -f kills ne2k (was:[patch] NE2000)

Jorge Nerin wrote:
>
> Paul Gortmaker wrote:
> >
> > >
> > > Well, I have tried it with 2.4.0-test10, both SMP and non-SMP, and the
> > > result is a little confusing.
> > >
> > > Under SMP a ping -s 50000 -f other_host takes down the network access
> > > with no messages (ne2k-pci), and no possibility of being restored
> > > without a reboot.
> > >
> > > Under UP the same command works ok, but after a while the dots stop for
> > > 30sec, then ping prints an 'E' and the dots continue. strace revealed
> > > this:
> >
> > Another suggestion - if you have your heart set on using ping
> > as your network stress tool, you may want to try using multiple
> > instances of MTU sized pings versus a single "ping -s 50000".
> > In this way you aren't involving any IP frag code and its associated
> > bean counting - giving us one less factor to consider.
> >
> > Oh, and since you get a silent failure, maybe you would be interested
> > in testing this patch I was (originally) saving for 2.5.x. -- It adds
> > watchdog transmit timeout functionality to 8390.c (which is used by
> > the ne2k-pci driver). Last time I updated it was a couple of months
> > ago, but nothing has changed since then.
> >
> > Paul.
> >
>
> Tested with ping -f -s 1400 (1400 in order not to reach 1500)
> It took about half an hour and more than one million packets, but I
> finally took the net down, with 12 concurrent pings.
>
> To eliminate factors I have deleted all the NAT rules which gave messages
> about dropped packets, and unloaded all the iptables modules.
>
> I had to run the test without the check in sock_wait_for_wmem:
>
>         if ((current->state & (TASK_INTERRUPTIBLE|TASK_UNINTERRUPTIBLE)) == 0)
>                 BUG();
>
> Because as I said in a previous msg it gave me BUG()s very early in the
> tests, and I still had network access.
>
> If someone has a better suggestion for a NIC stress tool I can try it, but
> for now I only have two ways of reaching these limits: ping -f with big
> packets, and sometimes (only 4 or 5 times) a copy of a large file over
> NFS, but that's not an easy way to cause this.
>
> --
> Jorge Nerin
> <[email protected]>

Well, now it's kernel 2.4.0-test11-pre1 + 8390nmi under the same
conditions, with about 8 concurrent pings, and this time it took only 202k
packets to take the ne2k-pci down. This time the watchdog says:

Nov 9 16:00:52 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:00:52 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=1792.
Nov 9 16:00:54 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:00:54 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=117.
Nov 9 16:00:56 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:00:56 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=117.
Nov 9 16:00:58 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:00:58 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=117.
Nov 9 16:01:00 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:01:00 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=117.
Nov 9 16:01:02 quartz kernel: NETDEV WATCHDOG: eth0: transmit timed out
Nov 9 16:01:02 quartz kernel: eth0: Tx timed out, lost interrupt? TSR=0x3, ISR=0x3, t=117.

And never comes alive again.

--
Jorge Nerin
<[email protected]>
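
For reading the TSR and ISR values in the log above, a small decoder sketch. The bit names are given as recalled from the ENISR_* and ENTSR_* definitions in drivers/net/8390.h and should be treated as assumptions to check against the header; decode_bits(), decode_isr() and decode_tsr() are hypothetical helpers, not driver functions.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit 0 first. Names are assumed to match drivers/net/8390.h. */
static const char *const isr_bits[8] = {
    "RX", "TX", "RX_ERR", "TX_ERR", "OVER", "COUNTERS", "RDC", "RESET"
};
static const char *const tsr_bits[8] = {
    "PTX", "ND", "COL", "ABT", "CRS", "FU", "CDH", "OWC"
};

/* Write the names of all set bits in value into out, separated by '|'. */
static void decode_bits(unsigned int value, const char *const names[8],
                        char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0)
        return;
    out[0] = '\0';
    for (int bit = 0; bit < 8; bit++) {
        int n;

        if (!(value & (1u << bit)))
            continue;
        n = snprintf(out + used, outlen - used, "%s%s",
                     used ? "|" : "", names[bit]);
        if (n < 0 || (size_t)n >= outlen - used)
            break;                  /* buffer too small: stop here */
        used += (size_t)n;
    }
}

/* Decode an interrupt status register value. */
void decode_isr(unsigned int value, char *out, size_t outlen)
{
    decode_bits(value, isr_bits, out, outlen);
}

/* Decode a transmit status register value. */
void decode_tsr(unsigned int value, char *out, size_t outlen)
{
    decode_bits(value, tsr_bits, out, outlen);
}
```

Under those assumed names, ISR=0x3 decodes to RX|TX (receive and transmit-complete both pending) and TSR=0x3 to PTX|ND (last transmit completed without deferral), which would fit the driver's "lost interrupt?" guess.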

2000-11-10 01:55:41

by Tom Leete

[permalink] [raw]
Subject: Re: [patch] NE2000

"David S. Miller" wrote:
>
> Date: Thu, 09 Nov 2000 21:53:42 +1100
> From: Andrew Morton <[email protected]>
>
> "David S. Miller" wrote:
> > I will compose a patch to fix all this.
>
> I've quickly been through just about all of the kernel wrt
> waitqueues.
>
> My analysis was in error, BEWARE!
>
> Being on multiple wait queues at once is just fine. I verified this
> with Linus tonight.
>
> The problem case is in mixing TASK_EXCLUSIVE and non-TASK_EXCLUSIVE
> sleeps, that is what can actually cause problems.
>
> Everything else is fine. Anyways, the (untested) patch below should
> cure the lock_sock() cases.
>
> --- ./net/ipv4/af_inet.c.~1~ Tue Oct 24 14:26:18 2000
> +++ ./net/ipv4/af_inet.c Wed Nov 8 17:28:47 2000
[...]
> --- ./net/ipv4/tcp.c.~1~ Fri Oct 6 15:45:41 2000
> +++ ./net/ipv4/tcp.c Wed Nov 8 17:35:31 2000

This touches the places where I saw hangs, so I'm testing.
Too soon to have statistics, but with this patch I have
observed no more failures to wake (what I referred to as
"soft hangs").

I have seen a total I/O lockup, but no info escapes to
indicate its source. No NMI wakeup available, maybe I should
rig a pushbutton.

Tom