The ide drive holding the mounted filesystem dropped out of DMA and then
spewed the following a number of times. Anyone interested?
buffer layer error at buffer.c:2326
Pass this trace through ksymoops for reporting
c13f7e8c 00000916 c1014ef0 c13f6000 c0868d8c c0861da0 c014e500 c1014ef0
c13f6000 c0861da0 c13f6000 c0861da0 c0186f3b c1014ef0 00000000 00000000
c1000018 c033893c 00000203 000001d0 c13f6000 c0868d8c 0000000a c017d55e
Call Trace: [<c014e500>] [<c0186f3b>] [<c017d55e>] [<c014cb62>] [<c013cb70>]
[<c013d0b5>] [<c013d11c>] [<c013d1c2>] [<c013d236>] [<c013d38f>] [<c0117820>]
[<c0105000>] [<c0105000>] [<c01057f6>] [<c013d2d0>]
Trace; c014e500 <try_to_free_buffers+80/110>
Trace; c0186f3b <journal_try_to_free_buffers+20b/220>
Trace; c017d55e <ext3_releasepage+1e/30>
Trace; c014cb62 <try_to_release_page+42/60>
Trace; c013cb70 <shrink_cache+3e0/6e0>
Trace; c013d0b5 <shrink_caches+65/a0>
Trace; c013d11c <try_to_free_pages+2c/50>
Trace; c013d1c2 <kswapd_balance_pgdat+52/a0>
Trace; c013d236 <kswapd_balance+26/40>
Trace; c013d38f <kswapd+bf/c6>
Trace; c0117820 <default_wake_function+0/40>
Trace; c0105000 <_stext+0/0>
Trace; c0105000 <_stext+0/0>
Trace; c01057f6 <kernel_thread+26/30>
Trace; c013d2d0 <kswapd+0/c6>
--
http://function.linuxpower.ca
Zwane Mwaikambo wrote:
>
> The ide drive holding the mounted filesystem dropped out of DMA and then
> spewed the following a number of times. Anyone interested?
>
> buffer layer error at buffer.c:2326
y'know, just this morning I was thinking it may be time to pull the
debug code out of buffer.c. Silly me.
So we had a non-uptodate buffer against an uptodate page. Were
there any other messages in the logs? I'd have expected a
"buffer IO error" to come out first?
Looking at the code, it seems likely that you hit an I/O error
on a write. That will leave the page uptodate, but with PageError
set. And the buffer is marked not uptodate, which is silly, because the
buffer _is_ uptodate.
What this says is: I still need to get down and set up a fault simulator
and make sure that we're doing all the right things when I/O errors occur.
Does anyone have any opinions on what the kernel's behaviour should
be in the presence of a write I/O error? Our options appear to be:
1: Just drop the data. That's what we do now.
2: Mark it dirty again, so it gets written indefinitely
3: Mark the page dirty again, but also set PageError. So we
attempt to write the same blocks a second time only. Then
drop the data.
4: (Just thought of this): mark the page PageError and PageDirty,
and unmap it from disk. So when it gets written again, the
filesystem's get_block function will be called. It can look at
PageError(bh_result->b_page) and say "hey, I need to find a
different set of blocks for this page". The bad blocks will
just be leaked.
To back that up: if we get an IO error and the page is _already_
PageError, give up. Mark it clean and lose the data. This gives the
fs the option of clearing PageError inside get_block(), so it will end
up trying every block on the disk.
Pretty sneaky, I think. But it only works for file data. If the
blocks are for metadata, we're screwed..
-
At 21:02 19/06/02, Andrew Morton wrote:
> To back that up: if we get an IO error and the page is _already_
> PageError, give up. Mark it clean and lose the data. This gives the
> fs the option of clearing PageError inside get_block(), so it will end
> up trying every block on the disk.
Nice!
> Pretty sneaky, I think. But it only works for file data. If the
> blocks are for metadata, we're screwed..
Not necessarily. NTFS metadata is stored in "normal" files. So the two
statements above are incompatible. Either it will work for all of NTFS or
for none of it.
I definitely like the idea. Especially if we can somehow combine it with
moving the bad blocks to the "bad blocks list" in an fs specific manner
instead of just leaking them, it would turn into a software based fault
tolerance solution for writes, which would be damn neat.
Best regards,
Anton
--
"I've not lost my mind. It's backed up on tape somewhere." - Unknown
--
Anton Altaparmakov <aia21 at cantab.net> (replace at with @)
Linux NTFS Maintainer / IRC: #ntfs on irc.openprojects.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
Hi,
On Wed, 19 Jun 2002, Andrew Morton wrote:
> Does anyone have any opinions on what the kernel's behaviour should
> be in the presence of a write I/O error? Our options appear to be:
Another possibility would be to add a more flexible error handling
infrastructure controllable from user space. Think for example of
hotpluggable devices. Either we add support for this to all subsystems or
we add generic support at a higher level. A single daemon could ask the
user to replug the disk or temporarily save the dirty pages to a
different device.
bye, Roman
On Wed, 19 Jun 2002, Andrew Morton wrote:
> So we had a non-uptodate buffer against an uptodate page. Were
> there any other messages in the logs? I'd have expected a
> "buffer IO error" to come out first?
end_request: I/O error, dev 03:00, sector 180247
Buffer I/O error on device ide0(3,1), logical block 90123
EXT3-fs error (device ide0(3,1)): ext3_get_inode_loc: unable to read inode block - inode=22484, block=90123
EXT3-fs error (device ide0(3,1)) in ext3_reserve_inode_write: IO failure
EXT3-fs error (device ide0(3,1)) in ext3_new_inode: IO failure
hda: ide_dma_intr: status=0x51 [ drive ready seek complete error ]
hda: ide_dma_intr: error=0x40 [ uncorrectable error ] , CHS=181/11/14, sector=180247
Yep, I got 'em all.
Cheers,
Zwane Mwaikambo
Hi,
On Wed, Jun 19, 2002 at 01:02:32PM -0700, Andrew Morton wrote:
> What this says is: I still need to get down and set up a fault simulator
> and make sure that we're doing all the right things when I/O errors occur.
I've got one for 2.4:
http://people.redhat.com/sct/patches/testdrive/
The testdrive-1.1-for-2.4.19pre10.patch can do random fault injection,
at pseudo-random intervals of selectable frequency, on reads or writes
or both. It's a modified loop.o which requires a separate
testdrive.o, and you just losetup it over a block device (or, more
easily, "mount -o loop /dev/foo /mnt/bar".)
It can trace IOs and will watch for suspicious activity such as
overlapping IOs being submitted. The fault injection code trips in
before the bh request ever gets to the underlying block device.
It shouldn't be too hard to adapt it to bio if you want.
Cheers,
Stephen
On Thu, 20 Jun 2002 21:50, Stephen C. Tweedie wrote:
> The testdrive-1.1-for-2.4.19pre10.patch can do random fault injection,
> at pseudo-random intervals of selectable frequency, on reads or writes
> or both. It's a modified loop.o which requires a separate
> testdrive.o, and you just losetup it over a block device (or, more
> easily, "mount -o loop /dev/foo /mnt/bar".)
I've often thought that more "in kernel" test features should be available for
those who'd like to do a bit of torture testing, but don't have all the
patches to hand.
In addition to this, maybe tests to:
1. Fail kmalloc occasionally
2. Corrupt network data packets
3. Test USB hardware (there was a kernel patch for isoc bandwidth tests that
allowed writing to a non-existent endpoint - bitrotted now)
and probably lots more.
I'd like to see all of these enabled separately as CONFIG_ options. They also
need to be protected by something like CONFIG_EXPERIMENTAL, so people don't
unwittingly enable them on production systems.
An example (not meant to be applied) of this is attached. You'd then just
dep_(m)bool on CONFIG_TESTONLY for whatever CONFIG_KMALLOC_TESTMODE type
thing you've added.
Thoughts?
Brad
--
http://conf.linux.org.au. 22-25Jan2003. Perth, Australia. Birds in Black.
On Thu, Jun 20, 2002 at 11:22:41PM +1000, Brad Hards wrote:
> In addition to this, maybe tests to:
> 1. Fail kmalloc occasionally
> I'd like to see all of these enabled seperately as CONFIG_ options.
Arnaldo (or maybe one of the other Conectiva folks) had a patch
that did this.
Dave.
--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs