2008-11-20 09:03:41

by Paweł Sikora

Subject: [2.6.27.6] jfs on raid1 => attempt to access beyond end of device.

hi,

a few hours ago i set up jfs filesystems on raid1 and raid0.
while restoring a backup i got errors in dmesg.
the testcase on my system is easy and 100% reproducible:
just run the following command on the jfs/raid1 device:

# dd if=/dev/zero of=testfile bs=1M count=4096

# fdisk -l /dev/sd{a,b}

Disk /dev/sda: 164.6 GB, 164696555520 bytes
255 heads, 63 sectors/track, 20023 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0f800000

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1         747     6000246   82  Linux swap / Solaris
/dev/sda2   *         748        2206    11719417+  fd  Linux raid autodetect
/dev/sda3            2207       15093   103514827+  83  Linux
/dev/sda4           15094       20023    39600225   83  Linux

Disk /dev/sdb: 164.6 GB, 164696555520 bytes
255 heads, 63 sectors/track, 20023 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x90909090

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1         747     6000246   82  Linux swap / Solaris
/dev/sdb2   *         748        2206    11719417+  fd  Linux raid autodetect
/dev/sdb3            2207       15093   103514827+  83  Linux
/dev/sdb4           15094       20023    39600225   83  Linux

# cat /proc/mdstat
Personalities : [raid1] [raid0]
md1 : active raid0 sda3[0] sdb3[1]
207029504 blocks 64k chunks

md0 : active raid1 sda2[0] sdb2[1]
11719296 blocks [2/2] [UU]


[ 5583.795772] attempt to access beyond end of device
[ 5583.795789] md0: rw=1, want=23438600, limit=23438592
[ 5583.795792] __ratelimit: 20 callbacks suppressed
[ 5583.795794] Buffer I/O error on device md0, logical block 2929824
[ 5583.795796] lost page write due to I/O error on md0
[ 5583.795799] attempt to access beyond end of device
[ 5583.795801] md0: rw=1, want=23438608, limit=23438592
[ 5583.795802] Buffer I/O error on device md0, logical block 2929825
[ 5583.795803] lost page write due to I/O error on md0
[ 5583.795806] attempt to access beyond end of device
[ 5583.795807] md0: rw=1, want=23438616, limit=23438592
[ 5583.795808] Buffer I/O error on device md0, logical block 2929826
[ 5583.795809] lost page write due to I/O error on md0
[ 5583.795813] attempt to access beyond end of device
[ 5583.795814] md0: rw=1, want=23438624, limit=23438592
[ 5583.795820] Buffer I/O error on device md0, logical block 2929827
[ 5583.795826] lost page write due to I/O error on md0
[ 5583.795833] attempt to access beyond end of device
[ 5583.795838] md0: rw=1, want=23438632, limit=23438592
[ 5583.795844] Buffer I/O error on device md0, logical block 2929828
[ 5583.795850] lost page write due to I/O error on md0
[ 5583.795857] attempt to access beyond end of device
[ 5583.795862] md0: rw=1, want=23438640, limit=23438592
[ 5583.795868] Buffer I/O error on device md0, logical block 2929829
[ 5583.795874] lost page write due to I/O error on md0
[ 5583.795881] attempt to access beyond end of device
[ 5583.795886] md0: rw=1, want=23438648, limit=23438592
[ 5583.795892] Buffer I/O error on device md0, logical block 2929830
[ 5583.795898] lost page write due to I/O error on md0
[ 5583.795905] attempt to access beyond end of device
[ 5583.795911] md0: rw=1, want=23438656, limit=23438592
[ 5583.795916] Buffer I/O error on device md0, logical block 2929831
[ 5583.795923] lost page write due to I/O error on md0
[ 5583.795929] attempt to access beyond end of device
[ 5583.795935] md0: rw=1, want=23438664, limit=23438592
[ 5583.795941] Buffer I/O error on device md0, logical block 2929832
[ 5583.795947] lost page write due to I/O error on md0
[ 5583.795954] attempt to access beyond end of device
[ 5583.795960] md0: rw=1, want=23438672, limit=23438592
[ 5583.795966] Buffer I/O error on device md0, logical block 2929833
[ 5583.795972] lost page write due to I/O error on md0
[ 5583.795978] attempt to access beyond end of device
[ 5583.795984] md0: rw=1, want=23438680, limit=23438592
[ 5583.795991] attempt to access beyond end of device
[ 5583.795997] md0: rw=1, want=23438688, limit=23438592
[ 5583.796004] attempt to access beyond end of device
[ 5583.796009] md0: rw=1, want=23438696, limit=23438592
[ 5583.796017] attempt to access beyond end of device
[ 5583.796022] md0: rw=1, want=23438704, limit=23438592
[ 5583.796030] attempt to access beyond end of device
[ 5583.796035] md0: rw=1, want=23438712, limit=23438592
[ 5583.796042] attempt to access beyond end of device
[ 5583.796048] md0: rw=1, want=23438720, limit=23438592
[ 5583.796055] attempt to access beyond end of device
[ 5583.796060] md0: rw=1, want=23438728, limit=23438592
[ 5583.796068] attempt to access beyond end of device
[ 5583.796074] md0: rw=1, want=23438736, limit=23438592
[ 5583.796081] attempt to access beyond end of device
[ 5583.796086] md0: rw=1, want=23438744, limit=23438592
[ 5583.796094] attempt to access beyond end of device
[ 5583.796099] md0: rw=1, want=23438752, limit=23438592
[ 5583.796107] attempt to access beyond end of device
[ 5583.796112] md0: rw=1, want=23438760, limit=23438592
[ 5583.796120] attempt to access beyond end of device
[ 5583.796125] md0: rw=1, want=23438768, limit=23438592
[ 5583.796132] attempt to access beyond end of device
[ 5583.796138] md0: rw=1, want=23438776, limit=23438592
[ 5583.796145] attempt to access beyond end of device
[ 5583.796150] md0: rw=1, want=23438784, limit=23438592
[ 5583.796157] attempt to access beyond end of device
[ 5583.796163] md0: rw=1, want=23438792, limit=23438592
[ 5583.796170] attempt to access beyond end of device
[ 5583.796176] md0: rw=1, want=23438800, limit=23438592
[ 5583.796183] attempt to access beyond end of device
[ 5583.796188] md0: rw=1, want=23438808, limit=23438592
[ 5583.796195] attempt to access beyond end of device
[ 5583.796202] md0: rw=1, want=23438816, limit=23438592
[ 5583.796208] attempt to access beyond end of device
[ 5583.796214] md0: rw=1, want=23438824, limit=23438592
[ 5583.796222] attempt to access beyond end of device
[ 5583.796227] md0: rw=1, want=23438832, limit=23438592
[ 5607.575701] JBD: Detected IO errors while flushing file data on md0

any ideas what's wrong?

BR,
Pawel.


2008-11-20 10:37:01

by NeilBrown

Subject: Re: [2.6.27.6] jfs on raid1 => attempt to access beyond end of device.

On Thu, November 20, 2008 8:03 pm, Paweł Sikora wrote:
> hi,
>
> few hours ago i've set up jfs filesystems on raid1 and raid0.
> during restoring backup i've got an errors in dmesg.
> the testcase on my system is easy and 100% reproducible:
> just do the following command on jfs/raid1 device:
....

> /dev/sda2 * 748 2206 11719417+ fd Linux raid
                       ^^^^^^^^^
size of sda2 in kilobytes - 23438835 sectors.


>
> md0 : active raid1 sda2[0] sdb2[1]
> 11719296 blocks [2/2] [UU]
  ^^^^^^^^

size of md0 in kilobytes - 23438592 sectors.


> [ 5583.796222] attempt to access beyond end of device
> [ 5583.796227] md0: rw=1, want=23438832, limit=23438592
Largest 'want' value.

'want' is just less than size of sda2
'limit' is exactly size of md0 (no surprise there).
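
The arithmetic behind these annotations can be checked with a short sketch. It assumes only what the outputs above show: fdisk and /proc/mdstat report sizes in 1 KiB blocks, a trailing '+' in fdisk marks one extra 512-byte sector, and the filesystem uses 4 KiB blocks (8 sectors each). The variable names are made up for illustration.

```python
# Sizes taken from the fdisk, /proc/mdstat and dmesg output in this thread.
sda2_sectors = 11719417 * 2 + 1   # 11719417+ KiB blocks -> 512-byte sectors
md0_sectors = 11719296 * 2        # 11719296 KiB blocks -> 512-byte sectors

limit = 23438592                  # 'limit' from the dmesg lines
largest_want = 23438832           # last 'want' value in the log

# 4 KiB filesystem blocks are 8 sectors each, so the first block that
# starts at or past the end of md0 is:
first_bad_block = limit // 8

print(sda2_sectors)                      # 23438835
print(md0_sectors == limit)              # True
print(limit < largest_want < sda2_sectors)  # True
print(first_bad_block)                   # 2929824, the first "logical block" in the log
print((sda2_sectors - md0_sectors) * 512 / 1024)  # 121.5 (KiB lost to md)
```

So every failing write lands in the roughly 120K gap between the end of md0 and the end of sda2.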

> any ideas what's wrong?

I suspect you created the filesystem on /dev/sda2, not realising
that when you created a raid1 from sda2 and sdb2 it would be slightly
smaller than sda2 (as md used up to 120K for metadata storage).
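
A rough sketch of where that size difference comes from, assuming the array uses the old version 0.90 superblock (the thread does not state the metadata version, so this is an assumption). For 0.90 the kernel rounds the component size down to a 64K boundary and then reserves a further 64K for the superblock; the helper name below is hypothetical.

```python
MD_RESERVED_SECTORS = 128  # 64 KiB expressed in 512-byte sectors


def md090_data_sectors(component_sectors: int) -> int:
    """Usable sectors of a component under the 0.90 superblock layout.

    Assumption: round down to a 64 KiB boundary, then subtract the
    64 KiB reserved for the superblock.
    """
    aligned = component_sectors & ~(MD_RESERVED_SECTORS - 1)
    return aligned - MD_RESERVED_SECTORS


sda2_sectors = 23438835  # size of sda2 worked out from the fdisk output
usable = md090_data_sectors(sda2_sectors)

print(usable)                  # 23438592, the 'limit' reported for md0
print(usable // 2)             # 11719296, the block count in /proc/mdstat
print(sda2_sectors - usable)   # 243 sectors, about 120K
```

Under that assumption the overhead is always between 64K and just under 128K, depending on how far the partition size is from a 64K boundary.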

NeilBrown

(P.S. I love it when people provide all the details thus making it
easier for me to spot what is happening - thanks!)

2008-11-20 12:42:33

by Paweł Sikora

Subject: Re: [2.6.27.6] jfs on raid1 => attempt to access beyond end of device.

20/11/2008, "Neil Brown" <[email protected]> wrote:

>Hi.
> I sent this reply, but it bounced. I don't know why, but I'm sending
> it again a different way. Hopefully it will get through.
> Please Cc any followup to [email protected]. I think the
> email got through to the list.
>NeilBrown
>
>
>
>On Thu, November 20, 2008 8:03 pm, Paweł Sikora wrote:
>> hi,
>>
>> few hours ago i've set up jfs filesystems on raid1 and raid0.
>> during restoring backup i've got an errors in dmesg.
>> the testcase on my system is easy and 100% reproducible:
>> just do the following command on jfs/raid1 device:
>....
>
>> /dev/sda2 * 748 2206 11719417+ fd Linux raid
>                       ^^^^^^^^^
>size of sda2 in kilobytes - 23438835 sectors.
>
>
>>
>> md0 : active raid1 sda2[0] sdb2[1]
>> 11719296 blocks [2/2] [UU]
>  ^^^^^^^^
>
>size of md0 in kilobytes - 23438592 sectors.
>
>
>> [ 5583.796222] attempt to access beyond end of device
>> [ 5583.796227] md0: rw=1, want=23438832, limit=23438592
>Largest 'want' value.
>
>'want' is just less than size of sda2
>'limit' is exactly size of md0 (no surprise there).
>
>> any ideas what's wrong?
>
>I suspect you created the filesystem on /dev/sda2, not realising
>that when you created a raid1 from sda2 and sdb2 it would be slightly
>smaller than sda2 (as md used up to 120K for metadata storage).

thanks for the quick reply!
afair i ran mkfs.jfs on /dev/md/0.
quick test...

working raid0 device:

# fsck.jfs -f -n /dev/md/1
fsck.jfs version 1.1.13, 17-Jul-2008
processing started: 11/20/2008 13.38.32
Filesystem is currently mounted.
WARNING: Checking a mounted filesystem does not produce dependable
results.
The current device is: /dev/md/1
Block size in bytes: 4096
Filesystem size in blocks: 51757376
**Phase 0 - Replay Journal Log
**Phase 1 - Check Blocks, Files/Directories, and Directory Entries
**Phase 2 - Count links
(...)

failing raid1 device:

# fsck.jfs -f -n -v /dev/md/0
fsck.jfs version 1.1.13, 17-Jul-2008
processing started: 11/20/2008 13.34.54

/dev/md/0 is mounted and the file system is not type JFS.
(...)

and the raw sda2 device:

# fsck.jfs -f -n -v /dev/sda2
fsck.jfs version 1.1.13, 17-Jul-2008
processing started: 11/20/2008 13.34.50
The current device is: /dev/sda2
Open(...READ/WRITE EXCLUSIVE...) returned rc = 0
Invalid magic number in the superblock (P).
Invalid magic number in the superblock (S).

The superblock does not describe a correct jfs file system.
(...)

2008-11-20 20:38:17

by NeilBrown

Subject: Re: [2.6.27.6] jfs on raid1 => attempt to access beyond end of device.

On Thu, November 20, 2008 11:42 pm, Paweł Sikora wrote:
> 20/11/2008, "Neil Brown" <[email protected]> wrote:

> failing raid1 device:
>
> # fsck.jfs -f -n -v /dev/md/0
> fsck.jfs version 1.1.13, 17-Jul-2008
> processing started: 11/20/2008 13.34.54
>
> /dev/md/0 is mounted and the file system is not type JFS.

Try
grep md /proc/mounts
and
fsck.ext3 -f -n /dev/md/0

I looked again at the logs and there is evidence that it is an
ext3 filesystem that was triggering those errors.
Maybe you never did a mkfs after creating the raid1???
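
The "JBD:" line at the end of the original dmesg output comes from the journalling layer used by ext3, which is one piece of that evidence. A minimal sketch of the /proc/mounts check, roughly what the grep does but printing only device, mount point and filesystem type; the sample text is hypothetical and not taken from the reporter's system.

```python
def md_mounts(mounts_text):
    """Return (device, mountpoint, fstype) for each mount whose device name contains 'md'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and "md" in fields[0]:
            result.append((fields[0], fields[1], fields[2]))
    return result


# Hypothetical sample; on a real system use open("/proc/mounts").read()
sample = """\
rootfs / rootfs rw 0 0
/dev/md0 /mnt/test ext3 rw,relatime 0 0
/dev/md1 /home jfs rw,relatime 0 0
"""

for dev, mnt, fstype in md_mounts(sample):
    print(dev, mnt, fstype)
```

If the third field for md0 is ext3 rather than jfs, the filesystem that is mounted is not the one mkfs.jfs was expected to create.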

NeilBrown