Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756037AbXHHLm0 (ORCPT ); Wed, 8 Aug 2007 07:42:26 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751196AbXHHLmN (ORCPT ); Wed, 8 Aug 2007 07:42:13 -0400
Received: from mx12.go2.pl ([193.17.41.142]:43656 "EHLO poczta.o2.pl"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1750964AbXHHLmL (ORCPT ); Wed, 8 Aug 2007 07:42:11 -0400
Date: Wed, 8 Aug 2007 13:42:43 +0200
From: Jarek Poplawski
To: Marcin Ślusarz
Cc: Ingo Molnar, Thomas Gleixner, Linus Torvalds, Jean-Baptiste Vignaud,
	linux-kernel, shemminger, linux-net, netdev, Andrew Morton, Alan Cox
Subject: Re: 2.6.20->2.6.21 - networking dies after random time
Message-ID: <20070808114243.GC2426@ff.dom.local>
References: <4bacf17f0707300029g5116e70bq4808059dc8b069f1@mail.gmail.com>
	<20070731132037.GC1046@ff.dom.local>
	<4bacf17f0708060000n5a00bb77i74adc3b4b28ac42b@mail.gmail.com>
	<20070806070300.GA4509@elte.hu>
	<4bacf17f0708070046o14403089v8376a4544f72fec3@mail.gmail.com>
	<20070807082321.GB2120@ff.dom.local>
	<4bacf17f0708070237w19d184b3p7f74b53612edb9a6@mail.gmail.com>
	<20070807095246.GB3223@ff.dom.local>
	<20070807121339.GA3946@ff.dom.local>
	<4bacf17f0708080409t116b5c84ye60dff7da51d0fdf@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-2
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <4bacf17f0708080409t116b5c84ye60dff7da51d0fdf@mail.gmail.com>
User-Agent: Mutt/1.4.2.2i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5514
Lines: 145

Read below please:

On Wed, Aug 08, 2007 at 01:09:36PM +0200, Marcin Ślusarz wrote:
> 2007/8/7, Jarek Poplawski:
> > So, the let's try this idea yet: modified Ingo's "x86: activate
> > HARDIRQS_SW_RESEND" patch.
> > (Don't forget about make oldconfig before make.)
> > For testing only.
> >
> > Cheers,
> > Jarek P.
> >
> > PS: alas there was not even time for "compile checking"...
> >
> > ---
> >
> > diff -Nurp 2.6.22.1-/arch/i386/Kconfig 2.6.22.1/arch/i386/Kconfig
> > --- 2.6.22.1-/arch/i386/Kconfig	2007-07-09 01:32:17.000000000 +0200
> > +++ 2.6.22.1/arch/i386/Kconfig	2007-08-07 13:13:03.000000000 +0200
> > @@ -1252,6 +1252,10 @@ config GENERIC_PENDING_IRQ
> >  	depends on GENERIC_HARDIRQS && SMP
> >  	default y
> >
> > +config HARDIRQS_SW_RESEND
> > +	bool
> > +	default y
> > +
> >  config X86_SMP
> >  	bool
> >  	depends on SMP && !X86_VOYAGER
> > diff -Nurp 2.6.22.1-/arch/x86_64/Kconfig 2.6.22.1/arch/x86_64/Kconfig
> > --- 2.6.22.1-/arch/x86_64/Kconfig	2007-07-09 01:32:17.000000000 +0200
> > +++ 2.6.22.1/arch/x86_64/Kconfig	2007-08-07 13:13:03.000000000 +0200
> > @@ -690,6 +690,10 @@ config GENERIC_PENDING_IRQ
> >  	depends on GENERIC_HARDIRQS && SMP
> >  	default y
> >
> > +config HARDIRQS_SW_RESEND
> > +	bool
> > +	default y
> > +
> >  menu "Power management options"
> >
> >  source kernel/power/Kconfig
> > diff -Nurp 2.6.22.1-/kernel/irq/manage.c 2.6.22.1/kernel/irq/manage.c
> > --- 2.6.22.1-/kernel/irq/manage.c	2007-07-09 01:32:17.000000000 +0200
> > +++ 2.6.22.1/kernel/irq/manage.c	2007-08-07 13:13:03.000000000 +0200
> > @@ -169,6 +169,14 @@ void enable_irq(unsigned int irq)
> >  		desc->depth--;
> >  	}
> >  	spin_unlock_irqrestore(&desc->lock, flags);
> > +#ifdef CONFIG_HARDIRQS_SW_RESEND
> > +	/*
> > +	 * Do a bh disable/enable pair to trigger any pending
> > +	 * irq resend logic:
> > +	 */
> > +	local_bh_disable();
> > +	local_bh_enable();
> > +#endif
> >  }
> >  EXPORT_SYMBOL(enable_irq);
> >
> > diff -Nurp 2.6.22.1-/kernel/irq/resend.c 2.6.22.1/kernel/irq/resend.c
> > --- 2.6.22.1-/kernel/irq/resend.c	2007-07-09 01:32:17.000000000 +0200
> > +++ 2.6.22.1/kernel/irq/resend.c	2007-08-07 13:57:54.000000000 +0200
> > @@ -62,16 +62,24 @@ void check_irq_resend(struct irq_desc *d
> >  	 */
> >  	desc->chip->enable(irq);
> >
> > +	/*
> > +	 * Temporary hack to figure out more about the problem, which
> > +	 * is causing the ancient network cards to die.
> > +	 */
> > +
> >  	if ((status & (IRQ_PENDING | IRQ_REPLAY)) == IRQ_PENDING) {
> >  		desc->status = (status & ~IRQ_PENDING) | IRQ_REPLAY;
> >
> > -		if (!desc->chip || !desc->chip->retrigger ||
> > -					!desc->chip->retrigger(irq)) {
> > +		if (desc->handle_irq == handle_edge_irq) {
> > +			if (desc->chip->retrigger)
> > +				desc->chip->retrigger(irq);
> > +			return;
> > +		}
> >  #ifdef CONFIG_HARDIRQS_SW_RESEND
> > -			/* Set it pending and activate the softirq: */
> > -			set_bit(irq, irqs_resend);
> > -			tasklet_schedule(&resend_tasklet);
> > +		WARN_ON_ONCE(1);
> > +		/* Set it pending and activate the softirq: */
> > +		set_bit(irq, irqs_resend);
> > +		tasklet_schedule(&resend_tasklet);
> >  #endif
> > -		}
> >  	}
> >  }
> >
> Works fine with:

Very nice! It's about time this kernel started to behave...

> WARNING: at kernel/irq/resend.c:79 check_irq_resend()
>
> Call Trace:
>  [] check_irq_resend+0xc0/0xd0
>  [] enable_irq+0xed/0xf0
>  [] :8390:ei_start_xmit+0x14d/0x30c
>  [] lock_release_non_nested+0xe5/0x190
>  [] __qdisc_run+0x98/0x1f0
>  [] __qdisc_run+0xae/0x1f0
>  [] dev_hard_start_xmit+0x26e/0x2d0
>  [] __qdisc_run+0xc0/0x1f0
>  [] dev_queue_xmit+0x24f/0x310
>  [] neigh_resolve_output+0xe7/0x290
>  [] dst_output+0x0/0x10
>  [] ip_output+0x19f/0x340
>  [] ip_queue_xmit+0x217/0x430
>  [] tcp_transmit_skb+0x40a/0x7c0
>  [] __tcp_push_pending_frames+0x11b/0x940
>  [] tcp_sendmsg+0x87a/0xc80
>  [] inet_sendmsg+0x45/0x80
>  [] sock_aio_write+0x104/0x120
>  [] do_sync_write+0xf1/0x130
>  [] autoremove_wake_function+0x0/0x40
>  [] vfs_write+0x159/0x170
>  [] sys_write+0x50/0x90
>  [] system_call+0x7e/0x83
>

So, it looks like x86_64 io_apic's IPI code was unused for too long...
I hope it's a piece of cake for Ingo now...

Thanks very much, Marcin!

If it's possible for you and Jean-Baptiste, please try today's patch
with -rc2, and maybe this patch once more (with -rc1 or older).

Regards,
Jarek P.
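
For readers following the thread, the decision that the test patch makes in check_irq_resend() can be sketched as a small user-space C model. This is only an illustration under stated assumptions, not the kernel code: the SKETCH_* flag values, struct sketch_desc and sketch_check_irq_resend() are hypothetical names invented here, the flag values are not taken from the kernel headers, and the model assumes CONFIG_HARDIRQS_SW_RESEND is enabled.

```c
#include <stdbool.h>

/* Illustrative flag values only; NOT copied from the kernel headers. */
#define SKETCH_IRQ_PENDING 0x1u
#define SKETCH_IRQ_REPLAY  0x2u

/* Outcome of the modelled resend check (hypothetical names). */
enum sketch_action {
	SKETCH_NOTHING,      /* nothing is resent */
	SKETCH_HW_RETRIGGER, /* edge flow handler: chip->retrigger() is called */
	SKETCH_SW_RESEND     /* other handlers: warn once, resend via tasklet */
};

/* Hypothetical stand-in for the few irq_desc fields the patch looks at. */
struct sketch_desc {
	unsigned int status;
	bool edge_handler;   /* models desc->handle_irq == handle_edge_irq */
	bool has_retrigger;  /* models desc->chip->retrigger != NULL */
};

enum sketch_action sketch_check_irq_resend(struct sketch_desc *desc)
{
	unsigned int status = desc->status;

	/* Only act when the irq is pending and not already being replayed. */
	if ((status & (SKETCH_IRQ_PENDING | SKETCH_IRQ_REPLAY)) != SKETCH_IRQ_PENDING)
		return SKETCH_NOTHING;

	/* Clear PENDING and mark the irq as being replayed. */
	desc->status = (status & ~SKETCH_IRQ_PENDING) | SKETCH_IRQ_REPLAY;

	/* Edge handlers use the hardware retrigger only, when the chip has one. */
	if (desc->edge_handler)
		return desc->has_retrigger ? SKETCH_HW_RETRIGGER : SKETCH_NOTHING;

	/* Everything else takes the software resend path. */
	return SKETCH_SW_RESEND;
}
```

In this model, the WARNING in Marcin's trace appears to correspond to the SKETCH_SW_RESEND result: the interrupt was pending, its flow handler was not handle_edge_irq, and the resend went through the tasklet.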