2001-04-18 00:17:11

by Vibol Hou

Subject:

Hi,

I'm using 2.4.4-pre3 and get this message occasionally when the system is
loaded:

Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.
Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.

The nic is a 3Com 3c905B. Is this a bad thing?

/proc/interrupts:
           CPU0       CPU1
  0:   13167527   12036422    IO-APIC-edge  timer
  1:          0          2    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  4:      22773      19820    IO-APIC-edge
  8:          1          0    IO-APIC-edge  rtc
 15:          1          4    IO-APIC-edge  ide1
 17:   50001929   49606064   IO-APIC-level  eth0
 18:    2459038    2364252   IO-APIC-level  aic7xxx
NMI:          0          0
LOC:   25202946   25202942
ERR:          0

--
Vibol Hou
KhmerConnection
http://khmer.cc


2001-04-18 00:27:15

by Jaquemet Loic

Subject: Re:

Vibol Hou wrote:

> Hi,
>
> I'm using 2.4.4-pre3 and get this message occasionally when the system is
> loaded:
>
> Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.
> Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.
>
> The nic is a 3Com 3c905B. Is this a bad thing?
>
> /proc/interrupts:
>            CPU0       CPU1
>   0:   13167527   12036422    IO-APIC-edge  timer
>   1:          0          2    IO-APIC-edge  keyboard
>   2:          0          0          XT-PIC  cascade
>   4:      22773      19820    IO-APIC-edge
>   8:          1          0    IO-APIC-edge  rtc
>  15:          1          4    IO-APIC-edge  ide1
>  17:   50001929   49606064   IO-APIC-level  eth0
>  18:    2459038    2364252   IO-APIC-level  aic7xxx
> NMI:          0          0
> LOC:   25202946   25202942
> ERR:          0
>
> --
> Vibol Hou
> KhmerConnection
> http://khmer.cc
>

I've got a similar problem with an RTL-8139 (rev 10), using 8139too.c:
Apr 17 22:53:12 skippy kernel: eth1: Too much work at interrupt, IntrStatus=0x0040.

The maintainer of this module wrote that this is an RxFIFO overflow, which
probably has no fix other than buying a new processor :)
But I didn't get these messages on kernels before 2.4.3 (nor on
2.4.3ac7).



2001-04-18 00:32:56

by Jeff Garzik

Subject: Re:

Jaquemet Loic wrote:
> I've got a similar problem with an RTL-8139 (rev 10), using 8139too.c:
> Apr 17 22:53:12 skippy kernel: eth1: Too much work at interrupt, IntrStatus=0x0040.
>
> The maintainer of this module wrote that this is an RxFIFO overflow, which
> probably has no fix other than buying a new processor :)
> But I didn't get these messages on kernels before 2.4.3 (nor on
> 2.4.3ac7).

That's a different issue from the one the original poster is having: these
are two totally different network cards with different characteristics. I
don't remember telling you that status code is an RxFIFO overflow, though :)

The RxFIFO overflow code definitely needs changing -- that's the next
item on the list.

--
Jeff Garzik | "Give a man a fish, and he eats for a day. Teach a
Building 1024 | man to fish, and a US Navy submarine will make sure
MandrakeSoft | he's never hungry again." -- Chris Neufeld

2001-04-20 02:48:33

by Francois Cami

Subject: Re:

Vibol Hou wrote:
>
> Hi,
>
> I'm using 2.4.4-pre3 and get this message occasionally when the system is
> loaded:
>
> Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.
> Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.

I get that one too. The PC is an ASUS P2B-DS with two PII-350s, 384MB RAM
and a 3C905B. I've tried a 3C905C, to no avail.
The e401 status seems to mean that there is too much load on the card for
it to be handled within the 20 (2.2.17) or 32 (2.2.19, 2.4.x) loops of the
interrupt handler (stop/hit me if I'm wrong, please).
I think we should try raising max_interrupt_work (3c59x.c, line 171) to
64, or maybe even higher (Messrs. Donald Becker or Andrew Morton, is this
a Bad Thing?). I haven't had the guts to try it on the production machine
yet =)

> The nic is a 3Com 3c905B. Is this a bad thing?

I heard they work fine...

François Cami
There And Back Again

2001-04-21 01:32:08

by Andrew Morton

Subject: Re:

Francois Cami wrote:
>
> Vibol Hou wrote:
> ...
>
> > Apr 17 16:10:12 omega kernel: eth0: Too much work in interrupt, status e401.
>
> I got that one too, PC is ASUS P2B-DS with two PII-350, 384MB RAM,
> 3C905B.

If you are getting this message only occasionally, and if increasing the
max_interrupt_work module parameter makes it stop, and everything
otherwise keeps working fine, then it's an OK thing to do.

Question is: why is it happening? We're failing to get out
of the interrupt loop after 32 loops. Each loop can reap
up to 16 transmitted packets and 32 received packets.
That's a lot.
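
The arithmetic behind "that's a lot" can be made concrete with a small
model. What follows is only a sketch in Python, not the 3c59x driver code:
the function name and the simple drain loop are assumptions made for
illustration, and only the figures (32 loops, up to 16 transmitted and 32
received packets per loop) come from this thread.

```python
# Hypothetical model of a bounded interrupt loop; NOT the real driver code.
MAX_TX_PER_LOOP = 16  # transmitted packets reaped per loop (figure from the thread)
MAX_RX_PER_LOOP = 32  # received packets reaped per loop (figure from the thread)


def service_interrupt(pending_tx, pending_rx, max_interrupt_work=32):
    """Drain pending packets, giving up once the loop budget is spent.

    Returns (loops_used, too_much_work). too_much_work is True when work
    is still pending after max_interrupt_work loops, which is the case in
    which the driver would log "Too much work in interrupt".
    """
    loops = 0
    while pending_tx > 0 or pending_rx > 0:
        if loops == max_interrupt_work:
            return loops, True
        pending_tx -= min(pending_tx, MAX_TX_PER_LOOP)
        pending_rx -= min(pending_rx, MAX_RX_PER_LOOP)
        loops += 1
    return loops, False


if __name__ == "__main__":
    # Largest backlog the default budget can clear in one interrupt.
    print(service_interrupt(32 * MAX_TX_PER_LOOP, 32 * MAX_RX_PER_LOOP))
    # One extra transmitted packet exhausts the budget.
    print(service_interrupt(32 * MAX_TX_PER_LOOP + 1, 0))
```

In this model a budget of 32 clears at most 32 * 16 = 512 transmitted and
32 * 32 = 1024 received packets per interrupt, so hitting the limit implies
either a very large backlog or packets arriving while the handler is
stalled, which the model deliberately does not simulate.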

My suspicion is that something else in the system is
causing the NIC interrupt routine to get held up for long
periods of time. It has to be another interrupt.

All reporters of this problem (ie: both of them) were using
aic7xxx SCSI. I wonder if that driver can sometimes spend a
long time in its interrupt routine. Many times. Rapidly.

Very odd.

Ah. SMP. Perhaps the other CPU is generating the transmit
load, while some other interrupt source is slowing down *this*
CPU.

Could you test something for me? Try *decreasing* the
value of max_interrupt_work. See if that increases
the frequency of the message. Then, if it does, try to
correlate the occurrence of the message with some other
form of system activity (especially disk I/O).

Thanks.
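
One way to do that correlation is to snapshot /proc/interrupts at
intervals and compare the per-source totals around the time the message
appears. A minimal sketch in Python, assuming the 2.4-era layout shown
earlier in this thread (a CPU header line, then one row per source);
parse_interrupts and interrupt_deltas are hypothetical helper names, not
existing tools, and newer kernels may format the file differently.

```python
# Hypothetical helpers for comparing /proc/interrupts snapshots.
SAMPLE = """\
           CPU0       CPU1
  0:   13167527   12036422    IO-APIC-edge  timer
  4:      22773      19820    IO-APIC-edge
 17:   50001929   49606064   IO-APIC-level  eth0
 18:    2459038    2364252   IO-APIC-level  aic7xxx
NMI:          0          0
ERR:          0
"""


def parse_interrupts(text):
    """Map each interrupt source to its count summed over all CPUs.

    A row is keyed by its device name when one is present (last field
    after the controller type), otherwise by its IRQ label.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for tok in fields[:ncpu]:
            if not tok.isdigit():
                break
            counts.append(int(tok))
        desc = fields[len(counts):]  # controller type, then device name
        name = desc[-1] if len(desc) > 1 else irq.strip()
        totals[name] = sum(counts)
    return totals


def interrupt_deltas(before, after):
    """Interrupts raised per source between two parsed snapshots."""
    return {name: after[name] - before.get(name, 0) for name in after}


if __name__ == "__main__":
    print(parse_interrupts(SAMPLE))
```

Taking one snapshot just before and one just after a "Too much work"
message, and passing both through interrupt_deltas, would show whether the
aic7xxx count jumped in the same window as eth0.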



2001-04-21 12:58:35

by Francois Cami

Subject: Re: [3C905x e401]


Okay, testing will begin Monday (when the machine is under load).
Any advice on which value I should begin with? (20?)

François Cami

