2005-02-17 01:04:32

by Guennadi Liakhovetski

Subject: [OOPS] 2.6.10, ReiserFS errors, preempt

Hi

The machine was running updatedb, while I was trying to burn on an ATAPI
CDR (hdc) and read a SCSI DVD-ROM not very intensively, ran a couple of
random applications, when my /home partition (hda7) became inaccessible,
then came a few Oopses. I sysrq-dumped traces and rebooted. After the
reboot /home mounted without any errors, nothing seems to be lost
(phew...). In the logs I found a few ReiserFS errors before the Oops.
Attached is a log, perhaps too verbose - sorry.

Configured in the kernel: ACPI (no suspends), USB-UHCI, Bluetooth, bttv, no
ide-scsi, CONFIG_BLK_DEV_VIA82CXXX, LAPIC; boot parameter nmi_watchdog=2
(doesn't seem to work anyway). hda is pretty new. I just looked at
smartctl -a /dev/hda - didn't see anything bad (not 100% sure though).

A comment "fixed in latest 2.6.11-rcX" would be gladly accepted:-)

Thanks
Guennadi

P.S.
Just noticed while studying S.M.A.R.T. output - while scrolling a tty with
SHIFT-PGUP / DOWN there always was a missing character in the bottom line
where the cursor was. Say, if cursor was at position 11, after SHIFT-PGUP
11th character in the last line is black. A bug in framebuffer / tty?

---
Guennadi Liakhovetski


Attachments:
hda7-2.6.10_klog-clean.oops (49.72 kB)

2005-02-17 12:50:38

by Sami Farin

Subject: Re: [OOPS] 2.6.10, ReiserFS errors, preempt

On Thu, Feb 17, 2005 at 01:59:47AM +0100, Guennadi Liakhovetski wrote:
> Hi
>
> The machine was running updatedb, while I was trying to burn on an ATAPI
> CDR (hdc) and read a SCSI DVD-ROM not very intensively, ran a couple of
> random applications, when my /home partition (hda7) became inaccessible,

some apps using bttv?

> then came a few Oopses. I sysrq-dumped traces and rebooted. After the
> reboot /home mounted without any errors, nothing seems to be lost
> (phew...). In the logs I found a few ReiserFS errors before the Oops.
> Attached is a log, perhaps too verbose - sorry.
>
> Configured in the kernel: ACPI (no suspends), USB-UHCI, Bluetooth, bttv, no
> ide-scsi, CONFIG_BLK_DEV_VIA82CXXX, LAPIC; boot parameter nmi_watchdog=2
> (doesn't seem to work anyway). hda is pretty new. I just looked at
> smartctl -a /dev/hda - didn't see anything bad (not 100% sure though).
>
> A comment "fixed in latest 2.6.11-rcX" would be gladly accepted:-)

I don't know, but if I keep on whacking 'v' in xawtv (Video (Capture) on/off),
I get this kind of crap sooner or later (and Oops a bit later).

Dec 25 16:27:37 safari kernel: bttv0: timeout: drop=18 irq=3151491/3151491, risc=0d85301c, bits: VSYNC HSYNC OFLOW
Dec 25 16:27:38 safari kernel: bttv0: reset, reinitialize
Dec 25 16:27:38 safari kernel: bttv0: PLL: 28636363 => 35468950 . ok
Dec 25 16:29:56 safari kernel: ReiserFS: warning: is_tree_node: node level 8 does not match to the expected one 1
Dec 25 16:29:56 safari kernel: ReiserFS: hda6: warning: vs-5150: search_by_key: invalid format found in block 1774456. Fsck?

so, can you reproduce the Oops without using bttv?
I believe there's an unresolved memory corruption bug in bttv...

--

2005-02-17 21:34:17

by Guennadi Liakhovetski

Subject: Re: [OOPS] 2.6.10, ReiserFS errors, preempt

Hello

On Thu, 17 Feb 2005 [email protected] wrote:

> > I believe there's an unresolved memory corruption bug in bttv...
> yes, I think so; others also have a similar problem:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=110820804010204&w=2
> http://marc.theaimsgroup.com/?t=110531543900002&r=1&w=2
> http://www.ussg.iu.edu/hypermail/linux/kernel/0412.3/0881.html

Ahh... /me stops the memory test after 18 hours without a single error,
pulls the card out of my desktop and inserts it back into the experimental
machine. Unfortunately, unlike in other posts you quoted above, I cannot
reproduce my Oops. Is anybody working on this?

Thanks
Guennadi
---
Guennadi Liakhovetski

2005-03-14 22:04:58

by Guennadi Liakhovetski

Subject: [lockup] no NMI (was Re: [OOPS] 2.6.10, ReiserFS errors, preempt)

On Thu, 17 Feb 2005, Guennadi Liakhovetski wrote:

> Hello
>
> On Thu, 17 Feb 2005 [email protected] wrote:
>
> > > I believe there's an unresolved memory corruption bug in bttv...
> > yes, I think so; others also have a similar problem:
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=110820804010204&w=2
> > http://marc.theaimsgroup.com/?t=110531543900002&r=1&w=2
> > http://www.ussg.iu.edu/hypermail/linux/kernel/0412.3/0881.html
>
> Ahh... /me stops the memory test after 18 hours without a single error,
> pulls the card out of my desktop and inserts it back into the experimental
> machine. Unfortunately, unlike in other posts you quoted above, I cannot
> reproduce my Oops. Is anybody working on this?

Well, I did remove the tv-card - and today got a hard lockup. It's a VIA
A7VI-VM motherboard with a 900MHz Duron, lapic explicitly re-enabled on
the command line:

Kernel command line: BOOT_IMAGE=2.6.10 ro root=308 3 lapic nmi_watchdog=2 console=tty1 console=ttyS0,38400

/proc/interrupts:

NMI: 59

and still it didn't trigger. Why? This was 2.6.10; going to get the latest
2.6.11 now...

Thanks
Guennadi
---
Guennadi Liakhovetski
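
A single NMI count from /proc/interrupts, such as the 59 quoted above, does not show whether the watchdog is still firing. One way to check, sketched here under assumptions, is to sample the NMI line twice and compare. The helper names `parse_nmi_counts` and `nmi_is_ticking` are hypothetical, and the sketch assumes the usual "NMI: <count per CPU>" line layout. With nmi_watchdog=2 the rate may depend on CPU activity, so a flat counter on an idle machine is not conclusive on its own.

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at the first non-numeric column: some kernels append
                # a text description after the per-CPU counters.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []  # no NMI line present


def nmi_is_ticking(interval=5.0, path="/proc/interrupts"):
    """Sample the NMI counters twice; True if any CPU's count increased."""
    with open(path) as f:
        before = parse_nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_nmi_counts(f.read())
    return any(b > a for a, b in zip(before, after))
```

Calling `nmi_is_ticking()` on the affected machine while it is under some load would indicate whether the counter moves at all; if it stays flat, the watchdog is probably not armed, which would be consistent with the lockup going undetected.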