2006-05-19 17:22:18

by Mark Knecht

Subject: 2.6.16-rt22/23 kernels hanging after registering IO schedulers

Hi Ingo and others,
It's been quite a while since I wrote anything here. I have been
using the 2.6.15-rt kernels for audio work on my AMD64 machine, but I
haven't written much music lately so the work load has been very low.
Actually I've found that for light work the standard Gentoo 2.6.16
kernel without -rt patches has been good enough for my real-time needs
of late, so thanks to all the kernel developers for those good results
as well.

Working quite nicely on my system are:

2.6.15-rt18
2.6.16-gentoo-r2

Some Gentoo oriented folks created a new 'pro-audio' overlay which
allows Gentoo users to build both applications and -rt kernels using
normal portage methods. Here's the link:

http://proaudio.tuxfamily.org/wiki/index.php?title=Main_Page

Since the newest -rt kernels are part of the overlay I thought I'd
give building and booting one a try. At the time the version in the
overlay was 2.6.16-rt22 so I tried that. Unfortunately the boot
process hangs immediately after some messages about registering IO
schedulers.

My method to build 2.6.16-rt22 was to emerge the source from the
pro-audio overlay, copy the 2.6.15-rt18 .config file, run make
oldconfig, make menuconfig, make && make modules_install, and then
copy the kernel source to /boot.

My method to build 2.6.16-rt23 was to download linux-2.6.16, unzip
and untar, patch it with the -rt23 patch from your site, then again
use the 2.6.15-rt18 .config file as in the previous paragraph. Same
results.

I did try changing the default IO Scheduler, as well as leaving a
few out. I still hang at the same place.

My guess is that it's the step immediately after registering these
schedulers that's hanging but I no longer have a second computer so I
cannot do the console thing with the serial cable. Sorry.

Again, no serious problems for me since both 2.6.15-rt18 &
2.6.16-gentoo-r2 are working great, but I thought I should at least
report this to see if there is some known thing I need to change or,
if not, make sure you are aware of the problem.

Here's some very basic info about the hardware

lightning ~ # lspci
00:00.0 Memory controller: nVidia Corporation CK804 Memory Controller (rev a3)
00:01.0 ISA bridge: nVidia Corporation CK804 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation CK804 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation CK804 USB Controller (rev a2)
00:02.1 USB Controller: nVidia Corporation CK804 USB Controller (rev a3)
00:04.0 Multimedia audio controller: nVidia Corporation CK804 AC'97 Audio Controller (rev a2)
00:06.0 IDE interface: nVidia Corporation CK804 IDE (rev f2)
00:07.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:08.0 IDE interface: nVidia Corporation CK804 Serial ATA Controller (rev f3)
00:09.0 PCI bridge: nVidia Corporation CK804 PCI Bridge (rev a2)
00:0a.0 Bridge: nVidia Corporation CK804 Ethernet Controller (rev a3)
00:0b.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0c.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0d.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:0e.0 PCI bridge: nVidia Corporation CK804 PCIE Bridge (rev a3)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: ATI Technologies Inc RV370 5B60 [Radeon X300 (PCIE)]
01:00.1 Display controller: ATI Technologies Inc RV370 [Radeon X300SE]
05:06.0 Multimedia audio controller: Xilinx Corporation RME Hammerfall DSP (rev 68)
05:08.0 FireWire (IEEE 1394): Texas Instruments TSB82AA2 IEEE-1394b Link Layer Controller (rev 01)
lightning ~ #

As always, thanks for all your work on kernels for us audio folks.
I'm sure we don't say thanks enough. Sorry about that.

Cheers,
Mark


2006-05-21 17:25:15

by Steven Rostedt

Subject: Re: 2.6.16-rt22/23 kernels hanging after registering IO schedulers


On Fri, 19 May 2006, Mark Knecht wrote:

> Hi Ingo and others,
> It's been quite a while since I wrote anything here. I have been
> using the 2.6.15-rt kernels for audio work on my AMD64 machine, but I
> haven't written much music lately so the work load has been very low.
> Actually I've found that for light work the standard Gentoo 2.6.16
> kernel without -rt patches has been good enough for my real-time needs
> of late, so thanks to all the kernel developers for those good results
> as well.
>
> Working quite nicely on my system are:
>
> 2.6.15-rt18

Well this is good to hear.

> 2.6.16-gentoo-r2
>
> Some Gentoo oriented folks created a new 'pro-audio' overlay which
> allows Gentoo users to build both applications and -rt kernels using
> normal portage methods. Here's the link:
>
> http://proaudio.tuxfamily.org/wiki/index.php?title=Main_Page

Also interesting to know.

>
> Since the newest -rt kernels are part of the overlay I thought I'd
> give building and booting one a try. At the time the version in the
> overlay was 2.6.16-rt22 so I tried that. Unfortunately the boot
> process hangs immediately after some messages about registering IO
> schedulers.
>
> My method to build 2.6.16-rt22 was to emerge the source from the
> pro-audio overlay, copy the 2.6.15-rt18 .config file, run make
> oldconfig, make menuconfig, make && make modules_install, and then
> copy the kernel source to /boot.
>
> My method to build 2.6.16-rt23 was to download linux-2.6.16, unzip
> and untar, patch it with the -rt23 patch from your site, then again
> use the 2.6.15-rt18 .config file as in the previous paragraph. Same
> results.
>
> I did try changing the default IO Scheduler, as well as leaving a
> few out. I still hang at the same place.
>
> My guess is that it's the step immediately after registering these
> schedulers that's hanging but I no longer have a second computer so I
> cannot do the console thing with the serial cable. Sorry.
>
> Again, no serious problems for me since both 2.6.15-rt18 &
> 2.6.16-gentoo-r2 are working great, but I thought I should at least
> report this to see if there is some known thing I need to change or,
> if not, make sure you are aware of the problem.
>
> Here's some very basic info about the hardware
>

OK, there's been other reports of problems with booting AMD64 with the
latest -rt kernels. I just got home from Germany, and will start testing
on my machine tomorrow. I'll let you know what I find then.

-- Steve

2006-05-21 21:10:27

by Mark Knecht

Subject: Re: 2.6.16-rt22/23 kernels hanging after registering IO schedulers

On 5/21/06, Steven Rostedt wrote:
>
> On Fri, 19 May 2006, Mark Knecht wrote:
>
> > Hi Ingo and others,
> > It's been quite a while since I wrote anything here. I have been
> > using the 2.6.15-rt kernels for audio work on my AMD64 machine, but I
> > haven't written much music lately so the work load has been very low.
> > Actually I've found that for light work the standard Gentoo 2.6.16
> > kernel without -rt patches has been good enough for my real-time needs
> > of late, so thanks to all the kernel developers for those good results
> > as well.
> >
> > Working quite nicely on my system are:
> >
> > 2.6.15-rt18
>
> Well this is good to hear.
>
> > 2.6.16-gentoo-r2
> >
> > Some Gentoo oriented folks created a new 'pro-audio' overlay which
> > allows Gentoo users to build both applications and -rt kernels using
> > normal portage methods. Here's the link:
> >
> > http://proaudio.tuxfamily.org/wiki/index.php?title=Main_Page
>
> Also interesting to know.
>
> >
> > Since the newest -rt kernels are part of the overlay I thought I'd
> > give building and booting one a try. At the time the version in the
> > overlay was 2.6.16-rt22 so I tried that. Unfortunately the boot
> > process hangs immediately after some messages about registering IO
> > schedulers.
> >
> > My method to build 2.6.16-rt22 was to emerge the source from the
> > pro-audio overlay, copy the 2.6.15-rt18 .config file, run make
> > oldconfig, make menuconfig, make && make modules_install, and then
> > copy the kernel source to /boot.
> >
> > My method to build 2.6.16-rt23 was to download linux-2.6.16, unzip
> > and untar, patch it with the -rt23 patch from your site, then again
> > use the 2.6.15-rt18 .config file as in the previous paragraph. Same
> > results.
> >
> > I did try changing the default IO Scheduler, as well as leaving a
> > few out. I still hang at the same place.
> >
> > My guess is that it's the step immediately after registering these
> > schedulers that's hanging but I no longer have a second computer so I
> > cannot do the console thing with the serial cable. Sorry.
> >
> > Again, no serious problems for me since both 2.6.15-rt18 &
> > 2.6.16-gentoo-r2 are working great, but I thought I should at least
> > report this to see if there is some known thing I need to change or,
> > if not, make sure you are aware of the problem.
> >
> > Here's some very basic info about the hardware
> >
>
> OK, there's been other reports of problems with booting AMD64 with the
> latest -rt kernels. I just got home from Germany, and will start testing
> on my machine tomorrow. I'll let you know what I find then.
>
> -- Steve

Hi Steve,
It's good to hear from you and great if you can take a look at
this. I did some more ground work and now feel bad that I didn't
report back much earlier. It appears that the problems have started
with the very first revision of 2.6.16-rt support. I am currently
writing you from 2.6.16 from kernel.org which booted fine. However
2.6.16-rt1 fails the same way as all the later kernels that I tried
with a hang right after registering the schedulers.

To try finding a place where the problem started I also tried
2.6.15-rt21 which was the last 2.6.15-rt kernel in the pro-audio
overlay. That kernel works fine also so it seems to be something right
at the beginning of the 2.6.16 series.
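For anyone who wants to narrow a range like this with fewer boots, the search can be sketched as a plain bisection over the version list. This is only a rough illustration: the function name `first_bad` and the `boots` callback are made up for the example, the callback stands in for a manual boot test, and it assumes that once a version hangs every later one hangs too. The results table below only restates what has been reported in this thread.

```python
from typing import Callable, Optional, Sequence


def first_bad(versions: Sequence[str],
              boots: Callable[[str], bool]) -> Optional[str]:
    """Return the earliest version that fails to boot, or None if all boot.

    Assumes `versions` is ordered oldest to newest and that every version
    after the first failing one also fails (the usual bisection assumption).
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if boots(versions[mid]):
            lo = mid + 1   # mid is good, the first bad one is later
        else:
            hi = mid       # mid is bad, the first bad one is mid or earlier
    return versions[lo] if lo < len(versions) else None


# Boot results as reported in this thread (True = boots, False = hangs).
reported = {
    "2.6.15-rt18": True,
    "2.6.15-rt21": True,
    "2.6.16-rt1": False,
    "2.6.16-rt22": False,
    "2.6.16-rt23": False,
}
versions = list(reported)  # dicts keep insertion order, oldest first here
culprit = first_bad(versions, reported.__getitem__)
print(culprit)
```

In real use the callback would be "build it, boot it, answer yes or no by hand", so the gain is only in how many kernels have to be built.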

If this could possibly be something in my .config files let me know
and I'll try making changes directed by you or Ingo there.

I hope this info might help you get to the problem a little more quickly.

Cheers,
Mark

2006-05-22 21:27:41

by Steven Rostedt

Subject: Re: 2.6.16-rt22/23 kernels hanging after registering IO schedulers

On Sun, 2006-05-21 at 14:10 -0700, Mark Knecht wrote:

> Hi Steve,
> It's good to hear from you and great if you can take a look at
> this. I did some more ground work and now feel bad that I didn't
> report back much earlier. It appears that the problems have started
> with the very first revision of 2.6.16-rt support. I am currently
> writing you from 2.6.16 from kernel.org which booted fine. However
> 2.6.16-rt1 fails the same way as all the later kernels that I tried
> with a hang right after registering the schedulers.

You're getting farther than I am. My system crashes in init_8259A right
in the unlocking of the i8259A_lock. It takes an exception in the
local_irq_restore of the raw_spin_unlock_irqrestore. I tried unlocking
the lock and locking it again at the beginning of the function, and that
seems to work fine. But this function didn't change between the
previous versions that do work. So I think something very wacky is
going on someplace else.

Unfortunately, I'm very behind in the work that I get paid for, so I
really don't have any more time to look into this. Especially since my
main development machine happens to be my x86_64.

Hopefully, Ingo can find something, or I catch up and can work on this
again.

-- Steve

Here's my dump:

...
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 15 global_irq 15 high edge)
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 80000000 (gap: 7ff00000:60100000)
Real-Time Preemption Support (C) 2004-2006 Ingo Molnar
Built 1 zonelists
Kernel command line: root=/dev/md0 ro console=ttyS0,115200 console=tty0 nmi_watm
Initializing CPU#0
WARNING: experimental RCU implementation.
PANIC: early exception rip 10 error ffffffff8034d270 cr2 f0aeaa

Call Trace:
<ffffffff8034d270>{_raw_spin_unlock_irqrestore+32}
<ffffffff80111582>{init_8259A+226}
<ffffffff8093a935>{init_ISA_irqs+21}
<ffffffff8093a9a2>{init_IRQ+18}
<ffffffff809327f5>{start_kernel+213}
<ffffffff80932293>{_sinittext+659}
RIP 0x10
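
In case it helps anyone comparing dumps, here is a small sketch that pulls the frames out of a trace in this format. The regular expression and the name `parse_frames` are just for this example, and it assumes each frame looks like `<address>{symbol+offset}` with a decimal offset, which is how the lines above appear to be printed.

```python
import re

# One frame looks like: <ffffffff80111582>{init_8259A+226}
FRAME = re.compile(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)\}")


def parse_frames(text):
    """Return a list of (address, symbol, offset) tuples, innermost first."""
    return [(int(addr, 16), sym, int(off))
            for addr, sym, off in FRAME.findall(text)]


trace = """
<ffffffff8034d270>{_raw_spin_unlock_irqrestore+32}
<ffffffff80111582>{init_8259A+226}
<ffffffff8093a935>{init_ISA_irqs+21}
<ffffffff8093a9a2>{init_IRQ+18}
<ffffffff809327f5>{start_kernel+213}
<ffffffff80932293>{_sinittext+659}
"""

frames = parse_frames(trace)
for addr, sym, off in frames:
    # addr - off is where the symbol starts, if the offset really is decimal
    print(f"{sym:<30} starts at {addr - off:#x}")
```

The computed start addresses can then be checked against System.map for the kernel that produced the dump.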

>
> To try finding a place where the problem started I also tried
> 2.6.15-rt21 which was the last 2.6.15-rt kernel in the pro-audio
> overlay. That kernel works fine also so it seems to be something right
> at the beginning of the 2.6.16 series.
>
> If this could possibly be something in my .config files let me know
> and I'll try making changes directed by you or Ingo there.
>
> I hope this info might help you get to the problem a little more quickly.
>
> Cheers,
> Mark

2006-05-23 00:30:09

by Mark Knecht

Subject: Re: 2.6.16-rt22/23 kernels hanging after registering IO schedulers

On 5/22/06, Steven Rostedt wrote:
> On Sun, 2006-05-21 at 14:10 -0700, Mark Knecht wrote:
>
> > Hi Steve,
> > It's good to hear from you and great if you can take a look at
> > this. I did some more ground work and now feel bad that I didn't
> > report back much earlier. It appears that the problems have started
> > with the very first revision of 2.6.16-rt support. I am currently
> > writing you from 2.6.16 from kernel.org which booted fine. However
> > 2.6.16-rt1 fails the same way as all the later kernels that I tried
> > with a hang right after registering the schedulers.
>
> You're getting farther than I am. My system crashes in init_8259A right
> in the unlocking of the i8259A_lock. It takes an exception in the
> local_irq_restore of the raw_spin_unlock_irqrestore. I tried unlocking
> the lock and locking it again at the beginning of the function, and that
> seems to work fine. But this function didn't change between the
> previous versions that do work. So I think something very wacky is
> going on someplace else.
>
> Unfortunately, I'm very behind in the work that I get paid for, so I
> really don't have any more time to look into this. Especially since my
> main development machine happens to be my x86_64.
>
> Hopefully, Ingo can find something, or I catch up and can work on this
> again.
>
> -- Steve
>
<SNIP>

Steve,
Hi. Hey, I'm happy to even get a response! Don't worry at all about
not being able to work on it right now. I completely understand. I'm
sure Ingo will pop up one of these days soon and we'll get all this
ironed out.

In the meantime I'm looking into redoing the config file from
scratch tonight to see if something has cropped up using make
oldconfig. I've not had problems with that myself but I have read
reports on the web that others have.
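
To see what actually changes between the carried-over config and a fresh one, a comparison along these lines might be enough. It is a minimal sketch: `parse_config` and `diff_configs` are names invented here, options written as "# CONFIG_X is not set" are treated as "n", and the two sample configs are made up for illustration and are not my real files.

```python
def parse_config(text):
    """Map option name -> value for a kernel .config; unset options map to 'n'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts


def diff_configs(old, new):
    """Return {name: (old_value, new_value)} for options that differ.

    A value of None means the option does not appear in that config at all.
    """
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n))
            for n in names if old.get(n) != new.get(n)}


# Illustrative samples only; real input would be read from the two files.
old_text = """
CONFIG_IOSCHED_CFQ=y
# CONFIG_IOSCHED_DEADLINE is not set
CONFIG_HZ=1000
"""
new_text = """
CONFIG_IOSCHED_CFQ=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_HZ=1000
CONFIG_DEFAULT_IOSCHED="cfq"
"""

changes = diff_configs(parse_config(old_text), parse_config(new_text))
for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
```

Options that show up with None on the old side are the ones that did not exist in the old config, which is where I would look first.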

I'll let you all know if I find any new results.

Cheers,
Mark