2002-01-02 22:04:44

by kees

Subject: [PATCH] solves freeze due to serial comm. on SMP

Hi,

At the beginning of last year I reported a solid freeze problem with Linux
when I moved from UP to SMP. After some bug hunting, especially with kdb and
hints from AM, I was able to nail it down to some SMP-unsafe irq table
handling in serial.c.
I submitted the attached patch to Ted but that never made it to the
kernel. It _really_ solved the problem as I had a crash sometimes within
15 minutes and after applying it I reached uptimes over 100 days.

The problem is however that this patch applies to Linux-2.4.4 and serial.c
has had some tweaks in the meantime. Please merge it.

Kees


Attachments:
patch_serial.c_spinlocks (4.83 kB)

2002-01-03 05:50:49

by Andrew Morton

Subject: Re: [PATCH] solves freeze due to serial comm. on SMP

kees wrote:
>
> Hi,
>
> At the beginning of last year I reported a solid freeze problem with Linux
> when I moved from UP to SMP. After some bug hunting, especially with kdb and
> hints from AM, I was able to nail it down to some SMP-unsafe irq table
> handling in serial.c.
> I submitted the attached patch to Ted but that never made it to the
> kernel. It _really_ solved the problem as I had a crash sometimes within
> 15 minutes and after applying it I reached uptimes over 100 days.
>

It looks like somebody has already had a go at fixing this in current
kernels - the restore_flags() has been moved to the end of
shutdown(). (It's not a complete fix, because request_irq() can
schedule).
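
A minimal userspace sketch of the kind of protection under discussion. It is not the attached patch and not kernel code: the `struct port` type, the `irq_chain` list and all function names are hypothetical, and a pthread mutex stands in for the lock. It only illustrates that every link and unlink on a shared per-IRQ chain must happen under one lock, otherwise two CPUs opening and closing ports at the same time can corrupt the list.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a serial port linked into a per-IRQ chain. */
struct port {
    struct port *next;
    struct port *prev;
};

static struct port *irq_chain;  /* head of the shared chain */
static pthread_mutex_t irq_chain_lock = PTHREAD_MUTEX_INITIALIZER;

/* Link a port at the head of the chain (what an open would do). */
void chain_link(struct port *p)
{
    pthread_mutex_lock(&irq_chain_lock);
    p->prev = NULL;
    p->next = irq_chain;
    if (irq_chain)
        irq_chain->prev = p;
    irq_chain = p;
    pthread_mutex_unlock(&irq_chain_lock);
}

/* Unlink a port from the chain (what a shutdown would do). */
void chain_unlink(struct port *p)
{
    pthread_mutex_lock(&irq_chain_lock);
    if (p->next)
        p->next->prev = p->prev;
    if (p->prev)
        p->prev->next = p->next;
    else
        irq_chain = p->next;
    p->next = p->prev = NULL;
    pthread_mutex_unlock(&irq_chain_lock);
}

/* Count the ports currently on the chain. */
int chain_length(void)
{
    int n = 0;
    pthread_mutex_lock(&irq_chain_lock);
    for (struct port *p = irq_chain; p; p = p->next)
        n++;
    pthread_mutex_unlock(&irq_chain_lock);
    return n;
}

/* Each thread repeatedly links and unlinks its own port. */
static void *open_close_loop(void *arg)
{
    struct port *p = arg;
    for (int i = 0; i < 100000; i++) {
        chain_link(p);
        chain_unlink(p);
    }
    return NULL;
}

/* Run four threads concurrently and return the final chain length. */
int run_stress(void)
{
    struct port ports[4] = {0};
    pthread_t t[4];
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, open_close_loop, &ports[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return chain_length();
}
```

In the kernel the lock would more likely be a spinlock taken with interrupts disabled, and since request_irq() can schedule, such a lock could not simply be held across that call.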

Are you able to test 2.4.17?

2002-01-04 07:54:15

by kees

Subject: Re: [PATCH] solves freeze due to serial comm. on SMP

Andrew

I'll give it a try, but what I experienced in those days was that
adding the _spinlock protection_ finally solved everything.

Kees

2002-01-04 14:14:32

by kees

Subject: Re: [PATCH] solves freeze due to serial comm. on SMP

Hi Andrew,

I tested bare 2.4.17 and got TLWOH (Total Lockup Within One Hour).
So it is clear (to me at least) that the spinlock protection is _really_
needed.

I applied the 'Patch_for_ted' to 2.4.17 (without difficulty), built a new
kernel and I'm running it now. On 2.4.4 with the patch applied I got an
uptime of 108 days before a power outage stopped the box.


regards


Kees

