2005-09-17 05:10:18

by Bernardo Innocenti

Subject: Assertion failed in libata-core.c:ata_qc_complete(3051)

Sorry for attaching a screenshot, I couldn't find a better
way to grab the panic message :-)

I get this panic occasionally (every 1-2 days) since I
upgraded to kernel-2.6.12-1.1447_FC4.
I've gone back to 2.6.12-1.1369_FC4 and the machine has
not yet crashed after 3 days.

I have a Promise TX4 controller with 4 SATA drives
formatted with a RAID1 and a RAID5 md. LVM on top of this.


The relevant changelog is:

* Sat Aug 27 2005 Dave Jones <[email protected]> [2.6.12-1.1447_FC4]
- Better identify local builds. (#159696)
- Fix disk/net dump & netconsole. (#152586)
- Fix up sleeping in invalid context in sym2 driver. (#164995)
- Fix 'semaphore is not ready' error in snd-intel8x0m.
- Restore hwclock functionality on some systems. (#144894)
- Merge patches proposed for 2.6.12.6
- Fix typo in ALPS driver.
- Fix 'No sense' error with Transcend USB key. (#162559)
- Fix up ide-scsi check for medium not present. (#160868)
- powernow-k8 driver update from 2.6.13rc7.

* Wed Aug 24 2005 Dave Jones <[email protected]> [2.6.12-1.1435_FC4]
- Work around AMD x86-64 errata 122.

* Wed Aug 24 2005 Rik van Riel <[email protected]>
- upgrade to today's Xen snapshot

* Tue Aug 23 2005 Rik van Riel <[email protected]>
- make sure that the vsyscall-note is linked in so the right glibc is used

* Mon Aug 22 2005 Rik van Riel <[email protected]>
- fix the Xen vsyscall problem

* Fri Aug 19 2005 David Woodhouse <[email protected]>
- Don't probe 8250 ports on ppc32 unless they're in the device tree
- Enable ISDN, 8250 console, i8042 keyboard controller on ppc32
- Audit updates from git tree

* Thu Aug 18 2005 Rik van Riel <[email protected]>
- temporarily disable the vsyscall page for Xen

* Wed Aug 17 2005 Dave Jones <[email protected]>
- Restrict ipsec socket policy loading to CAP_NET_ADMIN. (CAN-2005-2555)

* Tue Aug 16 2005 Rik van Riel <[email protected]>
- upgrade Xen to a newer version

* Tue Aug 16 2005 Dave Jones <[email protected]>
- 2.6.12.5
- Fix module_verify_elf check that rejected valid .ko files. (#165528)

* Fri Aug 12 2005 Dave Jones <[email protected]>
- Audit speedup in syscall path.
- Update to a newer ACPI drop.

* Sat Aug 06 2005 Dave Jones <[email protected]> [2.6.12-1.1420_FC4]
- update to final 2.6.12.4 patchset.
- ACPI update to 20050729.
- Disable experimental ACPI HOTKEY driver. (#163355)

* Fri Aug 05 2005 Dave Jones <[email protected]>
- Enable Amiga partition support. (#149802)

* Thu Aug 04 2005 Dave Jones <[email protected]> [2.6.12-1.1411_FC4]
- Include pre-release 2.6.12.4 patchset
- Silence some messages from PowerMac thermal driver. (#158739)
- nfs server intermittently claimed ENOENT on existing files or directories. (#150759)
- Stop usbhid driver incorrectly claiming Wireless Security Lock as a mouse. (#147479)
- Further NFSD fixing for non-standard ports.
- Fix up miscalculated i_nlink in /proc (#162418)
- Fix addrlen checks in selinux_socket_connect. (#164165)

* Fri Jul 29 2005 Dave Jones <[email protected]>
- Fix compilation with older gcc. (#164041)

* Sat Jul 16 2005 Dave Jones <[email protected]>
- Enable the DC395x driver. (#151010)

* Sat Jul 16 2005 Dave Jones <[email protected]> [2.6.12-1.1398_FC4]
- Include a number of patches likely to show up in 2.6.12.3


--
// Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/ http://www.develer.com/


Attachments:
panic_screenshot.jpg (23.29 kB)

2005-09-17 15:10:04

by Matheus Izvekov

Subject: Re: Assertion failed in libata-core.c:ata_qc_complete(3051)


On Sat, September 17, 2005 2:09 am, Bernardo Innocenti said:
> Sorry for attaching a screenshot, I couldn't find a better
> way to grab the panic message :-)
>
> I get this panic occasionally (every 1-2 days) since I
> upgraded to kernel-2.6.12-1.1447_FC4.
> I've gone back to 2.6.12-1.1369_FC4 and the machine has
> not yet crashed after 3 days.
>
> I have a Promise TX4 controller with 4 SATA drives
> formatted with a RAID1 and a RAID5 md. LVM on top of this.
>

Can you reproduce this with a stock kernel? Also, I think it would be
better if, instead of sending a screenshot, you got a serial cable and
booted with console=ttyS*

2005-09-18 00:15:34

by Bernardo Innocenti

Subject: Re: Assertion failed in libata-core.c:ata_qc_complete(3051)

Matheus Izvekov wrote:

>>I have a Promise TX4 controller with 4 SATA drives
>>formatted with a RAID1 and a RAID5 md. LVM on top of this.
>
> Can you reproduce this with a stock kernel?

I've just opened the case to install some more RAM and
noticed that the SATA controller card wasn't completely
fitted into the PCI slot. Could it be just a hardware
problem? I don't know what that assertion is about.

Nowadays, Fedora kernels don't differ much from stock
kernels plus the usual bugfixes. I've now upgraded to
2.6.13-1.1555-FC5 because it fixes an iptables bug.
I'll report if I see this bug again.


> Also, I think it would be
> better if, instead of sending a screenshot, you got a serial cable and
> booted with console=ttyS*

This is happening on our production server, and there are no
other computers next to it, so I can't easily hook in a
serial cable.

--
// Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/ http://www.develer.com/

2005-09-18 00:52:56

by Jesper Juhl

Subject: Re: Assertion failed in libata-core.c:ata_qc_complete(3051)

On 9/18/05, Bernardo Innocenti <[email protected]> wrote:
> Matheus Izvekov wrote:
>
> >>I have a Promise TX4 controller with 4 SATA drives
> >>formatted with a RAID1 and a RAID5 md. LVM on top of this.
> >
> > Can you reproduce this with a stock kernel?
>
> I've just opened the case to install some more RAM and
> noticed that the SATA controller card wasn't completely
> fitted into the PCI slot. Could it be just a hardware
> problem? I don't know what that assertion is about.
>
> Nowadays, Fedora kernels don't differ much from stock
> kernels plus the usual bugfixes. I've now upgraded to

They still do differ though. When asked to retest with a stock kernel,
indulging the person who asks is usually a good idea if you want your
problem solved :)


> 2.6.13-1.1555-FC5 because it fixes an iptables bug.
> I'll report if I see this bug again.
>
>
> > Also, I think it would be
> > better if, instead of sending a screenshot, you got a serial cable and
> > booted with console=ttyS*
>
> This is happening on our production server, and there are no
> other computers next to it, so I can't easily hook in a
> serial cable.
>
netconsole may be a useful alternative for you then.
See Documentation/networking/netconsole.txt
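
For reference, here is a rough sketch of how the module parameter described in that file is put together. The helper name and the addresses are made up for illustration, and the default ports (6665 source, 6666 target) are the ones I remember from the documentation, so check the format against the netconsole.txt shipped with your kernel.

```python
def netconsole_param(tgt_ip, tgt_port=6666, src_port=6665,
                     src_ip="", dev="eth0", tgt_mac=""):
    """Build a string of the documented form
    netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]

    Hypothetical helper for illustration only; it does no validation
    beyond requiring a target IP.
    """
    if not tgt_ip:
        raise ValueError("target IP is required")
    return f"netconsole={src_port}@{src_ip}/{dev},{tgt_port}@{tgt_ip}/{tgt_mac}"


# Example addresses are assumptions, not taken from the reporter's setup.
print("modprobe netconsole "
      + netconsole_param("192.168.0.2", src_ip="192.168.0.1"))
```

Leaving the target MAC empty keeps the trailing slash, which the parameter format still expects.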


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2005-09-18 02:41:38

by Dave Jones

Subject: Re: Assertion failed in libata-core.c:ata_qc_complete(3051)

On Sun, Sep 18, 2005 at 02:52:55AM +0200, Jesper Juhl wrote:
> On 9/18/05, Bernardo Innocenti <[email protected]> wrote:
> > Matheus Izvekov wrote:
> >
> > >>I have a Promise TX4 controller with 4 SATA drives
> > >>formatted with a RAID1 and a RAID5 md. LVM on top of this.
> > >
> > > Can you reproduce this with a stock kernel?
> >
> > I've just opened the case to install some more RAM and
> > noticed that the SATA controller card wasn't completely
> > fitted into the PCI slot. Could it be just a hardware
> > problem? I don't know what that assertion is about.
> >
> > Nowadays, Fedora kernels don't differ much from stock
> > kernels plus the usual bugfixes. I've now upgraded to
>
> They still do differ though. When asked to retest with a stock kernel,
> indulging the person who asks is usually a good idea if you want your
> problem solved :)

The libata / SCSI layer in that kernel should be 1:1 with mainline
as of 2.6.12.

Dave

2005-09-18 02:57:59

by Bernardo Innocenti

Subject: Re: Assertion failed in libata-core.c:ata_qc_complete(3051)

Jesper Juhl wrote:
> On 9/18/05, Bernardo Innocenti <[email protected]> wrote:

>>I've just opened the case to install some more RAM and
>>noticed that the SATA controller card wasn't completely
>>fitted into the PCI slot. Could it be just a hardware
>>problem? I don't know what that assertion is about.
>>
>>Nowadays, Fedora kernels don't differ much from stock
>>kernels plus the usual bugfixes. I've now upgraded to
>
> They still do differ though. When asked to retest with a stock kernel,
> indulging the person who asks is usually a good idea if you want your
> problem solved :)

I appreciate Matheus's help, but installing a stock
kernel on a production server and waiting a few days
to see if the bug shows up is problematic for me.


I've already reviewed Fedora-specific changes in this
kernel and none of them appears to be related to my
problem. The only patch that comes close is:

linux-2.6.11-libata-promise-pata-on-sata.patch


>>This is happening on our production server, and there are no
>>other computers next to it, so I can't easily hook in a
>>serial cable.
>
> netconsole may be a useful alternative for you then.
> See Documentation/networking/netconsole.txt

Thanks for the suggestion. I think I'll leave it enabled
on the server. I've just compiled the netconsole module,
but it depends on the non-modular netpoll, so I'll have to
wait until next reboot in order to try it out.

By the way, the documentation doesn't say how to interface
with syslogd. Is it sufficient to use port 514 and turn
on the -r option?
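
Failing that, a small UDP listener on the logging host would probably do. Below is a minimal sketch, assuming netconsole delivers each kernel message as a plain-text UDP datagram. The port (6666), the log file name, and the function names are my own choices for illustration, not anything taken from the documentation.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_message(sock, bufsize=4096):
    """Block until one datagram arrives; return (sender_ip, text)."""
    data, (sender_ip, _sender_port) = sock.recvfrom(bufsize)
    # Kernel messages are normally ASCII; replace anything that is not.
    return sender_ip, data.decode("ascii", errors="replace")


def log_forever(sock, path="netconsole.log"):
    """Append every received message to a file, flushing after each one
    so that the last lines before a crash are not lost in a buffer."""
    with open(path, "a") as log:
        while True:
            sender_ip, text = receive_message(sock)
            log.write(f"[{sender_ip}] {text}")
            log.flush()


# To run it on the logging host: log_forever(open_listener())
```

Flushing after every message matters here, since the interesting output is whatever arrives just before the sender dies.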

--
// Bernardo Innocenti - Develer S.r.l., R&D dept.
\X/ http://www.develer.com/