2009-03-10 00:13:31

by Kevin Shanahan

[permalink] [raw]
Subject: Possible ext4 corruption - ACL related?

Hi,

I think we have some minor corruption happening with a new (about two
weeks old) ext4 file system. Our backup script spat out bunch of errors
last night while trying to read the file ACLs:

getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll: Input/output error
getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/U3520chs.dll: Input/output error
getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/ReportRenderer.dll: Input/output error
getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/ParameterDesigner.dll: Input/output error
...

And in syslog I see these warnings from the kernel:

Mar 10 00:06:01 hermes /USR/SBIN/CRON[10875]: (root) CMD ( /usr/local/bin/rsync-backup-all.sh)
Mar 10 00:06:24 hermes kernel: init_special_inode: bogus i_mode (53253)
Mar 10 00:06:24 hermes kernel: attempt to access beyond end of device
Mar 10 00:06:24 hermes kernel: dm-0: rw=0, want=1312475392770056, limit=2147483648
Mar 10 00:06:25 hermes kernel: attempt to access beyond end of device
Mar 10 00:06:25 hermes kernel: dm-0: rw=0, want=1312475392770056, limit=2147483648

The filesystem is still in use (and working okay, AFAICT), but I'll need
to fsck this tonight. Is there any useful data I can gather from the fs
before (or while) I do that which might help track down the cause?

This system has been running 2.6.29-rc6 (x86_64) since this filesystem
was created. I did a quick scan of the ext4 commits since then, but none
of them obviously address this unless maybe it's related to 7ce9d5d -
"ext4: fix ext4_free_inode() vs. ext4_claim_inode() race"

Regards,
Kevin.




2009-03-10 04:35:21

by Eric Sandeen

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

Kevin Shanahan wrote:
> On Mon, 2009-03-09 at 21:57 -0500, Eric Sandeen wrote:
>> Kevin Shanahan wrote:
...
>> If you try a manual getfacl of the above files do you still get the error?
>
> Yeah, same thing:
>
> hermes:/srv/samba/local# getfacl apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/Cdo32sv.dll
> getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll: Input/output error
>
> Nothing extra showing up in dmesg after doing that, so it's possible the
> two are unrelated - although, the coincidence would surprise me.

Can you try debugfs /dev/whatever, and then do:

debugfs> stat
apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll

(all one line there, and this is assuming that apps/ is the root dir on
that filesystem, if not adjust accordingly...)

-Eric

2009-03-10 05:03:00

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mon, 2009-03-09 at 23:35 -0500, Eric Sandeen wrote:
> Kevin Shanahan wrote:
> > On Mon, 2009-03-09 at 21:57 -0500, Eric Sandeen wrote:
> >> Kevin Shanahan wrote:
> ...
> >> If you try a manual getfacl of the above files do you still get the error?
> >
> > Yeah, same thing:
> >
> > hermes:/srv/samba/local# getfacl apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/Cdo32sv.dll
> > getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll: Input/output error
> >
> > Nothing extra showing up in dmesg after doing that, so it's possible the
> > two are unrelated - although, the coincidence would surprise me.
>
> Can you try debugfs /dev/whatever, and then do:
>
> debugfs> stat
> apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll
>
> (all one line there, and this is assuming that apps/ is the root dir on
> that filesystem, if not adjust accordingly...)

hermes:~# debugfs /dev/dm-0
debugfs 1.41.3 (12-Oct-2008)
debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll"

Gives the following output:

Inode: 867 Type: bad type Mode: 0404 Flags: 0x802a61af
Generation: 2483046020 Version: 0xb9286359:17a7fdfd
User: 1455931783 Group: -798021131 Size: -1808719531
File ACL: 141934744 Directory ACL: 0
Links: 15681 Blockcount: 171984001880781
Fragment: Address: 956780679 Number: 0 Size: 0
ctime: 0xdca60244:006c5b08 -- Wed Apr 23 01:54:36 2087
atime: 0x5c9e956c:777587a4 -- Sat Mar 30 08:30:12 2019
mtime: 0x2ce44e11:286138f8 -- Sat Nov 13 13:31:37 1993
crtime: 0x737781cb:5661f351 -- Thu May 22 19:54:11 2031
dtime: 0xf19c4882 -- Sat Jun 14 11:57:14 2098
Size of extra inode fields: 3625
BLOCKS:

And when I exit the pager and return to the prompt I also see:

invalid inode->i_extra_isize (3625)
debugfs:

Cheers,
Kevin.



2009-03-10 07:09:43

by Andreas Dilger

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mar 10, 2009 15:32 +1030, Kevin Shanahan wrote:
> On Mon, 2009-03-09 at 23:35 -0500, Eric Sandeen wrote:
> > Kevin Shanahan wrote:
> > > On Mon, 2009-03-09 at 21:57 -0500, Eric Sandeen wrote:
> > >> Kevin Shanahan wrote:
> > >>> kernel: init_special_inode: bogus i_mode (53253)

If anyone has a chance, fixing this error message to be not-useless would
be good... Including the device name and the inode number would help
track down the source of the problem.

> hermes:~# debugfs /dev/dm-0
> debugfs 1.41.3 (12-Oct-2008)
> debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll"
>
> Gives the following output:
>
> Inode: 867 Type: bad type Mode: 0404 Flags: 0x802a61af
> Generation: 2483046020 Version: 0xb9286359:17a7fdfd
> User: 1455931783 Group: -798021131 Size: -1808719531
> File ACL: 141934744 Directory ACL: 0
> Links: 15681 Blockcount: 171984001880781
> Fragment: Address: 956780679 Number: 0 Size: 0
> ctime: 0xdca60244:006c5b08 -- Wed Apr 23 01:54:36 2087
> atime: 0x5c9e956c:777587a4 -- Sat Mar 30 08:30:12 2019
> mtime: 0x2ce44e11:286138f8 -- Sat Nov 13 13:31:37 1993
> crtime: 0x737781cb:5661f351 -- Thu May 22 19:54:11 2031
> dtime: 0xf19c4882 -- Sat Jun 14 11:57:14 2098
> Size of extra inode fields: 3625
> BLOCKS:
>
> And when I exit the pager and return to the prompt I also see:
>
> invalid inode->i_extra_isize (3625)

All garbage. Running e2fsck should fix this, though it won't explain
why it got corrupted in the first place..

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-03-10 12:15:56

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mon, 2009-03-09 at 21:57 -0500, Eric Sandeen wrote:
> > I think we have some minor corruption happening with a new (about two
> > weeks old) ext4 file system. Our backup script spat out bunch of errors
> > last night while trying to read the file ACLs:
> >
> > getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/Cdo32sv.dll: Input/output error
> > getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/U3520chs.dll: Input/output error
> > getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/ReportRenderer.dll: Input/output error
> > getfacl: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/ParameterDesigner.dll: Input/output error
> > ...
> >
> > And in syslog I see these warnings from the kernel:
> >
> > Mar 10 00:06:01 hermes /USR/SBIN/CRON[10875]: (root) CMD ( /usr/local/bin/rsync-backup-all.sh)
> > Mar 10 00:06:24 hermes kernel: init_special_inode: bogus i_mode (53253)
> > Mar 10 00:06:24 hermes kernel: attempt to access beyond end of device
> > Mar 10 00:06:24 hermes kernel: dm-0: rw=0, want=1312475392770056, limit=2147483648
> > Mar 10 00:06:25 hermes kernel: attempt to access beyond end of device
> > Mar 10 00:06:25 hermes kernel: dm-0: rw=0, want=1312475392770056, limit=2147483648
> >
> > The filesystem is still in use (and working okay, AFAICT), but I'll need
> > to fsck this tonight. Is there any useful data I can gather from the fs
> > before (or while) I do that which might help track down the cause?
>
> an e2image of the filesystem before fsck might help, and be sure to save
> fsck output.

Okay, I grabbed the e2image file first.
Huge, but full of holes of course.

hermes:~# stat lv_samba.e2image
File: `lv_samba.e2image'
Size: 17222082560 Blocks: 227456 IO Block: 4096 regular file
Device: 901h/2305d Inode: 1058747 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2009-03-10 22:22:49.000000000 +1030
Modify: 2009-03-10 22:25:18.000000000 +1030
Change: 2009-03-10 22:25:18.000000000 +1030

fsck is not happy, wants to be run manually.

hermes:~# e2fsck -pfv /dev/dm-0
/dev/dm-0: Inode 865 is in use, but has dtime set. FIXED.
/dev/dm-0: Inode 865 has a extra size (33096) which is invalid
FIXED.
/dev/dm-0: Inode 865 has INDEX_FL flag set but is not a directory.
HTREE INDEX CLEARED.
/dev/dm-0: Inode 865, i_size is 11708523728200260508, should be 0. FIXED.
/dev/dm-0: Inode 865, i_blocks is 3851901040, should be 0. FIXED.
/dev/dm-0: Inode 866 is in use, but has dtime set. FIXED.
/dev/dm-0: Inode 867 is in use, but has dtime set. FIXED.
/dev/dm-0: Inode 867 has imagic flag set.

/dev/dm-0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)

Full output from the manual run is below.

Cheers,
Kevin.
--

hermes:~# e2fsck -fv /dev/dm-0
e2fsck 1.41.3 (12-Oct-2008)
Pass 1: Checking inodes, blocks, and sizes
Inode 867 has imagic flag set. Clear<y>? yes

Inode 867 has a extra size (3625) which is invalid
Fix<y>? yes

Inode 867 has a bad extended attribute block 141934744. Clear<y>? yes

Inode 867, i_size is 1621929268641603925, should be 0. Fix<y>? yes

Inode 867, i_blocks is 626447053, should be 0. Fix<y>? yes

Inode 868 is in use, but has dtime set. Fix<y>? yes

Inode 868 has a extra size (8206) which is invalid
Fix<y>? yes

Inode 869 is in use, but has dtime set. Fix<y>? yes

Inode 869 has imagic flag set. Clear<y>? yes

Inode 869 has a extra size (6152) which is invalid
Fix<y>? yes

Inode 870 is in use, but has dtime set. Fix<y>? yes

Inode 870 has a extra size (37249) which is invalid
Fix<y>? yes

Inode 871 is in use, but has dtime set. Fix<y>? yes

Inode 871 has imagic flag set. Clear<y>? yes

Inode 871 has a extra size (25184) which is invalid
Fix<y>? yes

Inode 871 has compression flag set on filesystem without compression support. Clear<y>? yes

Inode 871, i_size is 11354597595663089880, should be 0. Fix<y>? yes

Inode 871, i_blocks is 3084592992, should be 0. Fix<y>? yes

Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes

Inode 872 was part of the orphaned inode list. FIXED.
Inode 872 has a extra size (49847) which is invalid
Fix<y>? yes

Inode 872 has compression flag set on filesystem without compression support. Clear<y>? yes

Inode 872 has a bad extended attribute block 2621440. Clear<y>? yes

Error while reading over extent tree in inode 872: Corrupt extent header
Clear inode<y>? yes

Inode 872 is a zero-length directory. Clear<y>? yes

Inode 873 is in use, but has dtime set. Fix<y>? yes

Inode 873 has imagic flag set. Clear<y>? yes

Inode 873 has a extra size (18362) which is invalid
Fix<y>? yes

Inode 873 has compression flag set on filesystem without compression support. Clear<y>? yes

Error while reading over extent tree in inode 873: Corrupt extent header
Clear inode<y>? yes

Inode 873, i_blocks is 750551612, should be 0. Fix<y>? yes

Inode 874 is in use, but has dtime set. Fix<y>? yes

Inode 874 has a extra size (3840) which is invalid
Fix<y>? yes

Inode 875 is in use, but has dtime set. Fix<y>? yes

Inode 875 has imagic flag set. Clear<y>? yes

Inode 875 has a extra size (18005) which is invalid
Fix<y>? yes

Inode 875 has a bad extended attribute block 21679127. Clear<y>? yes

Inode 875 has INDEX_FL flag set but is not a directory.
Clear HTree index<y>? yes

Inode 875, i_size is 2269162663506877006, should be 0. Fix<y>? yes

Inode 875, i_blocks is 2538914224, should be 0. Fix<y>? yes

Inode 876 is in use, but has dtime set. Fix<y>? yes

Inode 876 has imagic flag set. Clear<y>? yes

Inode 876 has a extra size (27586) which is invalid
Fix<y>? yes

Inode 876 has compression flag set on filesystem without compression support. Clear<y>? yes

Inode 876, i_size is 8678493212014404941, should be 0. Fix<y>? yes

Inode 876, i_blocks is 86534675801319, should be 0. Fix<y>? yes

Inode 877 is in use, but has dtime set. Fix<y>? yes

Inode 877 has a extra size (50847) which is invalid
Fix<y>? yes

Error while reading over extent tree in inode 877: Corrupt extent header
Clear inode<y>? yes

Inode 877, i_blocks is 3392462386, should be 0. Fix<y>? yes

Inode 878 is in use, but has dtime set. Fix<y>? yes

Inode 878 has imagic flag set. Clear<y>? yes

Inode 878 has a extra size (26132) which is invalid
Fix<y>? yes

Inode 878 has a bad extended attribute block 16842752. Clear<y>? yes

Error while reading over extent tree in inode 878: Corrupt extent header
Clear inode<y>? yes

Inode 878, i_blocks is 2763914436, should be 0. Fix<y>? yes

Inode 879 is in use, but has dtime set. Fix<y>? yes

Inode 879 has imagic flag set. Clear<y>? yes

Inode 879 has a extra size (7719) which is invalid
Fix<y>? yes

Inode 880 is in use, but has dtime set. Fix<y>? yes

Inode 880 has imagic flag set. Clear<y>? yes

Inode 880 has a extra size (20011) which is invalid
Fix<y>? yes

Inode 880, i_size is 15560619716461179914, should be 0. Fix<y>? yes

Inode 880, i_blocks is 131913893443436, should be 0. Fix<y>? yes


Inode 868 has compression flag set on filesystem without compression support. Clear<y>? yes

Inode 868, i_size is 4028226676621914086, should be 0. Fix<y>? yes

Inode 868, i_blocks is 134639139614610, should be 0. Fix<y>? yes

Inode 866 has a bad extended attribute block 263323648. Clear<y>? yes

Inode 866, i_size is 2427760314128421, should be 0. Fix<y>? yes

Inode 866, i_blocks is 11426071, should be 0. Fix<y>? yes

Inode 874, i_size is 22524735385517479, should be 0. Fix<y>? yes

Inode 874, i_blocks is 279313060335481, should be 0. Fix<y>? yes

Inode 869 has a bad extended attribute block 185417860. Clear<y>? yes

Inode 869 has INDEX_FL flag set but is not a directory.
Clear HTree index<y>? yes

Inode 869, i_size is 288707807517097708, should be 0. Fix<y>? yes

Inode 869, i_blocks is 4014035729, should be 0. Fix<y>? yes

Inode 870, i_size is 4369557840038834079, should be 0. Fix<y>? yes

Inode 870, i_blocks is 2905973680, should be 0. Fix<y>? yes

Inode 879 has compression flag set on filesystem without compression support. Clear<y>? yes

Inode 879 has INDEX_FL flag set but is not a directory.
Clear HTree index<y>? yes

Inode 879, i_size is 15592687770754169968, should be 0. Fix<y>? yes

Inode 879, i_blocks is 86032481815687, should be 0. Fix<y>? yes

Extended attribute block 8626 has reference count 1024, should be 1008. Fix<y>? yes

Pass 2: Checking directory structure
Inode 867 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll) has invalid mode (0152404).
Clear<y>? yes

Inode 871 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/ParameterDesigner.dll) has invalid mode (0135566).
Clear<y>? yes

Entry 'ReportRenderer.dll' in /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin (136) has deleted/unused inode 873. Clear<y>? yes

Inode 876 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/U25dts.dll) has invalid mode (0171451).
Clear<y>? yes

Entry 'U3520chs.dll' in /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin (136) has deleted/unused inode 877. Clear<y>? yes

i_faddr for inode 875 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll) is 734152772, should be zero.
Clear<y>? yes

Entry 'RptControllers.dll' in /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin (136) has an incorrect filetype (was 1, should be 5).
Fix<y>? yes

Entry 'ReportPromptEMF.dll' in /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin (136) has deleted/unused inode 872. Clear<y>? yes

Inode 879 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/U3520de.dll) has invalid mode (0114355).
Clear<y>? yes

Inode 869 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Implode.dll) is an illegal FIFO.
Clear<y>? yes

Entry 'U3520cht.dll' in /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin (136) has deleted/unused inode 878. Clear<y>? yes

Inode 868 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/ExportModeller.dll) is an illegal block device.
Clear<y>? yes

Inode 880 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/U3520en.dll) has invalid mode (073230).
Clear<y>? yes

Inode 866 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32ru.dll) has invalid mode (053253).
Clear<y>? yes

Inode 874 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/ReportSourceBridge.dll) is an illegal socket.
Clear<y>? yes

Inode 870 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/License_xerces-c_2_1_0.txt) has invalid mode (077332).
Clear<y>? yes

Inode 865 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32pt.dll) is an illegal FIFO.
Clear<y>? yes

Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Error reading block 2705859237 (Invalid argument). Ignore error<y>? yes

Force rewrite<y>? yes

Error writing block 2705859237 (Invalid argument). Ignore error<y>? yes

Failed to optimize directory ??? (872): EXT2 directory corruptedPass 4: Checking reference counts
Inode 875 ref count is 38039, should be 1. Fix<y>? yes

Pass 5: Checking group summary information
Block bitmap differences: -(27296--27306) -(62051--62062) -(135936--136113) -(137216--137390) -(137472--137520) -(137728--138033) -(138240--138409) -(138752--139065)
Fix<y>? yes

Free blocks count wrong for group #0 (3435, counted=3446).
Fix<y>? yes

Free blocks count wrong for group #1 (804, counted=816).
Fix<y>? yes

Free blocks count wrong for group #4 (2203, counted=3395).
Fix<y>? yes

Free blocks count wrong (227860452, counted=227861667).
Fix<y>? yes

Inode bitmap differences: -(872--873) -(877--878)
Fix<y>? yes

Free inodes count wrong for group #0 (11, counted=15).
Fix<y>? yes

Free inodes count wrong (66820124, counted=66820128).
Fix<y>? yes


/dev/dm-0: ***** FILE SYSTEM WAS MODIFIED *****

288736 inodes used (0.43%)
11724 non-contiguous inodes (4.1%)
# of inodes with ind/dind/tind blocks: 5/5/5
Extent depth histogram: 288384/287
40573789 blocks used (15.11%)
0 bad blocks
2 large files

262926 regular files
25742 directories
0 character device files
0 block device files
0 fifos
0 links
58 symbolic links (54 fast symbolic links)
0 sockets
--------
288727 files



2009-03-10 14:47:04

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, Mar 10, 2009 at 01:09:15AM -0600, Andreas Dilger wrote:
> > > >>> kernel: init_special_inode: bogus i_mode (53253)
>
> If anyone has a chance, fixing this error message to be not-useless would
> be good... Including the device name and the inode number would help
> track down the source of the problem.

Agreed!

> > hermes:~# debugfs /dev/dm-0
> > debugfs 1.41.3 (12-Oct-2008)
> > debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll"
> >
> > Gives the following output:
> >
> > Inode: 867 Type: bad type Mode: 0404 Flags: 0x802a61af
> > Generation: 2483046020 Version: 0xb9286359:17a7fdfd
> > User: 1455931783 Group: -798021131 Size: -1808719531
> > File ACL: 141934744 Directory ACL: 0
> > Links: 15681 Blockcount: 171984001880781
> > Fragment: Address: 956780679 Number: 0 Size: 0
> > ctime: 0xdca60244:006c5b08 -- Wed Apr 23 01:54:36 2087
> > atime: 0x5c9e956c:777587a4 -- Sat Mar 30 08:30:12 2019
> > mtime: 0x2ce44e11:286138f8 -- Sat Nov 13 13:31:37 1993
> > crtime: 0x737781cb:5661f351 -- Thu May 22 19:54:11 2031
> > dtime: 0xf19c4882 -- Sat Jun 14 11:57:14 2098
> > Size of extra inode fields: 3625
> > BLOCKS:

This looks like a block written to the wrong place on disk. One of
the things that can be done is to dump out the disk block
corresponding to inode #867 before the fsck, and see if we can
recognize what the source of the block was. I don't see any ASCII in
any of the files (i.e., 0xdca60244, 0x5c9e956c, etc. don't appear to
be ascii), but it might be that we could determine what the block that
got written into the inode table originally came from.

As Andreas said, this e2fsck will this, but it won't explain how a
block in the inode table got corrupted in the first place. It could
be a random hardware hiccup; it could be a corruption on disk or in
memory that lead the Linux kernel to write the block in the wrong
place, etc. The problem is that it could be a filesystem or kernel
bug, but it could also just as easily be a hardware one-time failure.

Failures like this have been reported on ext3 filesystems as well, but
it's rare, and it's been written off to hardware failure, although it
could be really anything --- from the device driver, block layer, a
filesystem problem, or hardware hiccup.

Given that you've fixed it, all I think we can tell you is to keep an
eye out for any further failures....

- Ted

2009-03-10 20:44:40

by Andreas Dilger

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mar 10, 2009 22:45 +1030, Kevin Shanahan wrote:
> hermes:~# e2fsck -pfv /dev/dm-0
> /dev/dm-0: Inode 865 is in use, but has dtime set. FIXED.
> /dev/dm-0: Inode 865 has a extra size (33096) which is invalid
> FIXED.
> /dev/dm-0: Inode 865 has INDEX_FL flag set but is not a directory.
> HTREE INDEX CLEARED.
> /dev/dm-0: Inode 865, i_size is 11708523728200260508, should be 0. FIXED.
> /dev/dm-0: Inode 865, i_blocks is 3851901040, should be 0. FIXED.

These all mean the inode is garbage. The Lustre e2fsprogs has a patch
"e2fsprogs-badness_counter" that tracks the number of errors hit in
the inode, and clears it if it appears to be garbage (as in your case),
instead of turning a pile of manure into neatly organized fertilizer.

> Pass 2: Checking directory structure
> Inode 867 (/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll) has invalid mode (0152404).
> Clear<y>? yes

This is just more indication of the inodes being corrupted.

> Pass 3: Checking directory connectivity
> Pass 3A: Optimizing directories
> Error reading block 2705859237 (Invalid argument). Ignore error<y>? yes
>
> Force rewrite<y>? yes
>
> Error writing block 2705859237 (Invalid argument). Ignore error<y>? yes

Hmm, is this read beyond EOF? The block pointers should be fixed by this
time, so there may be a hole in e2fsck at this point. I assume you do
not have a 10TB filesystem here.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-03-10 20:59:48

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, 2009-03-10 at 14:44 -0600, Andreas Dilger wrote:
> On Mar 10, 2009 22:45 +1030, Kevin Shanahan wrote:
> > Pass 3: Checking directory connectivity
> > Pass 3A: Optimizing directories
> > Error reading block 2705859237 (Invalid argument). Ignore error<y>? yes
> >
> > Force rewrite<y>? yes
> >
> > Error writing block 2705859237 (Invalid argument). Ignore error<y>? yes
>
> Hmm, is this read beyond EOF? The block pointers should be fixed by this
> time, so there may be a hole in e2fsck at this point. I assume you do
> not have a 10TB filesystem here.

Yeah, this is a 1TB filesystem.

hermes:~# tune2fs -l /dev/dm-0
tune2fs 1.41.3 (12-Oct-2008)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 3db511f5-97fe-4a35-b327-ea58dd498414
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent flex_bg sparse_super large_file huge_file
uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 67108864
Block count: 268435456
Reserved block count: 13421772
Free blocks: 227859007
Free inodes: 66820121
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 192
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Fri Feb 27 01:58:46 2009
Last mount time: Tue Mar 10 22:39:18 2009
Last write time: Tue Mar 10 22:39:18 2009
Mount count: 1
Maximum mount count: 22
Last checked: Tue Mar 10 22:36:30 2009
Check interval: 15552000 (6 months)
Next check after: Sun Sep 6 21:36:30 2009
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 14c2ef0e-3417-45b2-8327-07dfb3603af1
Journal backup: inode blocks

Cheers,
Kevin.



2009-03-10 21:14:55

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, 2009-03-10 at 10:46 -0400, Theodore Tso wrote:
> > > hermes:~# debugfs /dev/dm-0
> > > debugfs 1.41.3 (12-Oct-2008)
> > > debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll"
> > >
> > > Gives the following output:
> > >
> > > Inode: 867 Type: bad type Mode: 0404 Flags: 0x802a61af
> > > Generation: 2483046020 Version: 0xb9286359:17a7fdfd
> > > User: 1455931783 Group: -798021131 Size: -1808719531
> > > File ACL: 141934744 Directory ACL: 0
> > > Links: 15681 Blockcount: 171984001880781
> > > Fragment: Address: 956780679 Number: 0 Size: 0
> > > ctime: 0xdca60244:006c5b08 -- Wed Apr 23 01:54:36 2087
> > > atime: 0x5c9e956c:777587a4 -- Sat Mar 30 08:30:12 2019
> > > mtime: 0x2ce44e11:286138f8 -- Sat Nov 13 13:31:37 1993
> > > crtime: 0x737781cb:5661f351 -- Thu May 22 19:54:11 2031
> > > dtime: 0xf19c4882 -- Sat Jun 14 11:57:14 2098
> > > Size of extra inode fields: 3625
> > > BLOCKS:
>
> This looks like a block written to the wrong place on disk. One of
> the things that can be done is to dump out the disk block
> corresponding to inode #867 before the fsck, and see if we can
> recognize what the source of the block was.

Thanks Ted. Just so I know what I'm doing if/when this comes up again, I
guess the process would be:

- debugfs /dev/some-filesystem
- debugfs: stat "some/problem/file"
- get the inode number from the output above (867 in this case)
- debugfs dump 867 inode867.bin

Or perhaps running e2fsck -n first to see which inodes really look
corrupted and get the numbers that way.

> Failures like this have been reported on ext3 filesystems as well, but
> it's rare, and it's been written off to hardware failure, although it
> could be really anything --- from the device driver, block layer, a
> filesystem problem, or hardware hiccup.
>
> Given that you've fixed it, all I think we can tell you is to keep an
> eye out for any further failures....

Thanks, will do.

Cheers,
Kevin.



2009-03-10 22:48:26

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Wed, Mar 11, 2009 at 07:44:51AM +1030, Kevin Shanahan wrote:
> Thanks Ted. Just so I know what I'm doing if/when this comes up again, I
> guess the process would be:
>
> - debugfs /dev/some-filesystem
> - debugfs: stat "some/problem/file"
> - get the inode number from the output above (867 in this case)
> - debugfs dump 867 inode867.bin

Actually, it's a bit more complicated than that. I should really add
a hook to debugfs to automate this, but... to find the block to grab,
you do this.

1) Grab the following fields from dumpe2fs (or debugfs's "stats" command)

Inodes per group: 8192
Block size: 4096
Inodes per group: 8192
Inode size: 256

2) To find out which block group the inode is located in, we calculate:

inode bg = inode_number / inodes_per_group
= 867 / 8192 = block group 0

(everything done using integer division, rounding any fraction down to 0.)

3) To find where the inode table for block group in question can be
found, check out dumpe2fs.

Group 0: (Blocks 0-32767) [ITABLE_ZEROED]
...
Inode table at 65-576 (+65)
^^^^^^

OK, we know it begins at block 65.

Now some more math:

inode_per_block = block_size / inode_size
= 4096 / 256 = 16

offset_block = (inode_number % inodes_per_group) / inodes_per_block
= (867 % 8192) / 16
= 54

So now we know that the block containing inode 867 is 65+54 = 119

Now to dump that block, we do:

dd if=/dev/fs-device of=block-119.dump bs=4k skip=119 count=1


> Or perhaps running e2fsck -n first to see which inodes really look
> corrupted and get the numbers that way.

Yep. In general, when you get corruption like this, the bad inodes
comes in chunks of 16 (if you are using 256 byte inodes) or 32 (if you
are using 128 byte inodes).

- Ted

2009-03-11 00:39:04

by Andreas Dilger

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mar 10, 2009 18:48 -0400, Theodore Ts'o wrote:
> On Wed, Mar 11, 2009 at 07:44:51AM +1030, Kevin Shanahan wrote:
> > Thanks Ted. Just so I know what I'm doing if/when this comes up again, I
> > guess the process would be:
> >
> > - debugfs /dev/some-filesystem
> > - debugfs: stat "some/problem/file"
> > - get the inode number from the output above (867 in this case)
> > - debugfs dump 867 inode867.bin
>
> Actually, it's a bit more complicated than that. I should really add
> a hook to debugfs to automate this, but... to find the block to grab,
> you do this.

[many complex steps deleted]

> So now we know that the block containing inode 867 is 65+54 = 119

Or you can use imap (which Ted probably wrote in the first place. :-)

debugfs: imap some/problem/file
Inode 867 is part of block group 0
located at block 119, offset 0x0180

> Now to dump that block, we do:
>
> dd if=/dev/fs-device of=block-119.dump bs=4k skip=119 count=1

This part is the same.

> > Or perhaps running e2fsck -n first to see which inodes really look
> > corrupted and get the numbers that way.
>
> Yep. In general, when you get corruption like this, the bad inodes
> comes in chunks of 16 (if you are using 256 byte inodes) or 32 (if you
> are using 128 byte inodes).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-03-11 00:47:33

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, Mar 10, 2009 at 06:38:45PM -0600, Andreas Dilger wrote:
> > Actually, it's a bit more complicated than that. I should really add
> > a hook to debugfs to automate this, but... to find the block to grab,
> > you do this.
>
> [many complex steps deleted]
>
> > So now we know that the block containing inode 867 is 65+54 = 119
>
> Or you can use imap (which Ted probably wrote in the first place. :-)
>
> debugfs: imap some/problem/file
> Inode 867 is part of block group 0
> located at block 119, offset 0x0180

<Blush>

OK, now I'm embarassed. I had completely forgotten that I had already
implemented it.

And to think I've been doing the hard way most of the time (i.e.,
using dc and doing the calculations manually :-)

- Ted

2009-03-11 01:43:33

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, 2009-03-10 at 18:38 -0600, Andreas Dilger wrote:
> On Mar 10, 2009 18:48 -0400, Theodore Ts'o wrote:
> > So now we know that the block containing inode 867 is 65+54 = 119
>
> Or you can use imap (which Ted probably wrote in the first place. :-)
>
> debugfs: imap some/problem/file
> Inode 867 is part of block group 0
> located at block 119, offset 0x0180
>
> > Now to dump that block, we do:
> >
> > dd if=/dev/fs-device of=block-119.dump bs=4k skip=119 count=1
>
> This part is the same.

Thanks guys.

It looks like there is still a problem there, even after fsck has done
it's cleanup.

Last night I got:

getfattr: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error

And syslog shows:

Mar 11 00:06:01 hermes /USR/SBIN/CRON[26947]: (root) CMD ( /usr/local/bin/rsync-backup-all.sh)
Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648

hermes:~# getfattr /srv/samba/local/apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/RptControllers.dll
getfattr: /srv/samba/local/apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error

hermes:~# debugfs /dev/dm-0
debugfs 1.41.3 (12-Oct-2008)
debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"

Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3b9c185
Generation: 3690868 Version: 0x9d36b10d
User: 868313917 Group: -1340283792 Size: 0
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x0742afc4 -- Sun Nov 11 06:51:24 1973
atime: 0x472a2311 -- Fri Nov 2 05:33:45 2007
mtime: 0x80c59881 -- Fri Jun 18 09:51:21 2038
Size of extra inode fields: 4
BLOCKS:

debugfs: imap "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
Inode 875 is part of block group 0
located at block 343, offset 0x0a00

hermes:~# dd if=/dev/dm-0 of=block-343.dump bs=4k skip=343 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 3.6316e-05 s, 113 MB/s

"strings block-343.dump" doesn't show anything obvious to me about where
the data came from, but I guess the concern would be whether fsck didn't
pick this up or maybe the corruption returned very quickly for some
reason.

Cheers,
Kevin.


Attachments:
block-343.dump (4.00 kB)

2009-03-11 01:49:04

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Wed, 2009-03-11 at 12:13 +1030, Kevin Shanahan wrote:
> It looks like there is still a problem there, even after fsck has done
> it's cleanup.
>
> Last night I got:
>
> getfattr: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error
>
> And syslog shows:
>
> Mar 11 00:06:01 hermes /USR/SBIN/CRON[26947]: (root) CMD ( /usr/local/bin/rsync-backup-all.sh)
> Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
> Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
> Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
> Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
>
> hermes:~# getfattr /srv/samba/local/apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/RptControllers.dll
> getfattr: /srv/samba/local/apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error
>
> hermes:~# debugfs /dev/dm-0
> debugfs 1.41.3 (12-Oct-2008)
> debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
>
> Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3b9c185
> Generation: 3690868 Version: 0x9d36b10d
> User: 868313917 Group: -1340283792 Size: 0
> File ACL: 0 Directory ACL: 0
> Links: 1 Blockcount: 0
> Fragment: Address: 0 Number: 0 Size: 0
> ctime: 0x0742afc4 -- Sun Nov 11 06:51:24 1973
> atime: 0x472a2311 -- Fri Nov 2 05:33:45 2007
> mtime: 0x80c59881 -- Fri Jun 18 09:51:21 2038
> Size of extra inode fields: 4
> BLOCKS:
>
> debugfs: imap "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
> Inode 875 is part of block group 0
> located at block 343, offset 0x0a00
>
> hermes:~# dd if=/dev/dm-0 of=block-343.dump bs=4k skip=343 count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB) copied, 3.6316e-05 s, 113 MB/s
>
> "strings block-343.dump" doesn't show anything obvious to me about where
> the data came from, but I guess the concern would be whether fsck didn't
> pick this up or maybe the corruption returned very quickly for some
> reason.

Oh, by the way:

hermes:~# ls -l /srv/samba/local/apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/RptControllers.dll
prwS--x--t 1 868313917 2954683504 0 1902-05-13 03:23 /srv/samba/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll

I guess this is what Andreas meant by "turning a pile of manure into
neatly organized fertilizer" :)

Cheers,
Kevin.



2009-03-11 04:51:15

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

Can you tell us something about the hardware that is behind this
filesystem? What sort of disks, raid controller if any, etc. are you
using?

- Ted

2009-03-11 05:27:22

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Wed, 2009-03-11 at 00:50 -0400, Theodore Tso wrote:
> Can you tell us something about the hardware that is behind this
> filesystem? What sort of disks, raid controller if any, etc. are you
> using?

Sure.

This is a Xeon X3350 quad-core system on an Intel 3210SHLC motherboard,
with 8GB ECC RAM. The filesystem is on top of LVM which is on top of
RAID6 using the md driver. The RAID is comprised of 9 1TB SATA drives, a
mix of Samsung and WDC, using the on-board Intel AHCI SATA controller
and an additional Silicon Image 3132 PCI-e controller and a 5:1 port
multiplier (Sil3726, I think).

Below is the full dmesg after 14 days uptime. Includes bringing the RAID
up from 7/9 to 9/9 and various messing around with the ext4 filesystems.

Cheers,
Kevin.
--

Linux version 2.6.29-rc6 (kmshanah@ucwb-kvm-10) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #1 SMP Tue Feb 24 22:24:06 CST 2009
Command line: BOOT_IMAGE=/vmlinuz-2.6.29-rc6 root=/dev/md1 ro
KERNEL supported cpus:
Intel GenuineIntel
AMD AuthenticAMD
Centaur CentaurHauls
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009c800 (usable)
BIOS-e820: 000000000009c800 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000dfcf0000 (usable)
BIOS-e820: 00000000dfcf0000 - 00000000dfd96000 (ACPI NVS)
BIOS-e820: 00000000dfd96000 - 00000000dfdfa000 (usable)
BIOS-e820: 00000000dfdfa000 - 00000000dfe5f000 (reserved)
BIOS-e820: 00000000dfe5f000 - 00000000dfe69000 (usable)
BIOS-e820: 00000000dfe69000 - 00000000dfedf000 (ACPI NVS)
BIOS-e820: 00000000dfedf000 - 00000000dfee6000 (usable)
BIOS-e820: 00000000dfee6000 - 00000000dfeff000 (ACPI data)
BIOS-e820: 00000000dfeff000 - 00000000dff00000 (usable)
BIOS-e820: 00000000dff00000 - 00000000e0000000 (reserved)
BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
BIOS-e820: 00000000fff80000 - 00000000fff8c000 (reserved)
BIOS-e820: 0000000100000000 - 0000000220000000 (usable)
DMI 2.5 present.
last_pfn = 0x220000 max_arch_pfn = 0x100000000
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
original variable MTRRs
reg 0, base: 0GB, range: 2GB, type WB
reg 1, base: 2GB, range: 1GB, type WB
reg 2, base: 3GB, range: 512MB, type WB
reg 3, base: 3583MB, range: 1MB, type UC
reg 4, base: 4GB, range: 4GB, type WB
reg 5, base: 8GB, range: 512MB, type WB
total RAM coverred: 8191M
Found optimal setting for mtrr clean up
gran_size: 64K chunk_size: 2M num_reg: 6 lose cover RAM: 0G
New variable MTRRs
reg 0, base: 0GB, range: 2GB, type WB
reg 1, base: 2GB, range: 1GB, type WB
reg 2, base: 3GB, range: 512MB, type WB
reg 3, base: 3583MB, range: 1MB, type UC
reg 4, base: 4GB, range: 4GB, type WB
reg 5, base: 8GB, range: 512MB, type WB
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
last_pfn = 0xdff00 max_arch_pfn = 0x100000000
init_memory_mapping: 0000000000000000-00000000dff00000
0000000000 - 00dfe00000 page 2M
00dfe00000 - 00dff00000 page 4k
kernel direct mapping tables up to dff00000 @ 8000-e000
last_map_addr: dff00000 end: dff00000
init_memory_mapping: 0000000100000000-0000000220000000
0100000000 - 0220000000 page 2M
kernel direct mapping tables up to 220000000 @ c000-16000
last_map_addr: 220000000 end: 220000000
ACPI: RSDP 000F03C0, 0024 (r2 INTEL )
ACPI: XSDT DFEFE120, 00A4 (r1 INTEL S3200SHC 0 INTL 1000013)
ACPI: SLIC DFEFC000, 0176 (r1 INTEL S3200SHC 2 INTL 1000013)
ACPI: FACP DFEFA000, 00F4 (r3 INTEL S3200SHC 0 MSFT 1000013)
ACPI Warning (tbfadt-0568): 32/64X length mismatch in PmTimerBlock: 32/24 [20081204]
ACPI Warning (tbfadt-0412): Invalid length for PmTimerBlock: 24, using default 32 [20081204]
FADT: X_PM1a_EVT_BLK.bit_width (16) does not match PM1_EVT_LEN (4)
ACPI: DSDT DFEF4000, 568D (r1 INTEL S3200SHC 0 MSFT 1000013)
ACPI: FACS DFE69000, 0040
ACPI: APIC DFEF3000, 0084 (r1 INTEL S3200SHC 0 MSFT 1000013)
ACPI: WDDT DFEF2000, 0040 (r1 INTEL S3200SHC 0 MSFT 1000013)
ACPI: MCFG DFEF1000, 003C (r1 INTEL S3200SHC 0 MSFT 1000013)
ACPI: HPET DFEF0000, 0038 (r1 INTEL S3200SHC 1 MSFT 1000013)
ACPI: SPCR DFEEF000, 0050 (r1 INTEL S3200SHC 0 MSFT 1000013)
ACPI: SSDT DFEEE000, 0175 (r1 INTEL Cpu0Ist 10 MSFT 1000013)
ACPI: SSDT DFEED000, 0175 (r1 INTEL Cpu1Ist 10 MSFT 1000013)
ACPI: SSDT DFEEC000, 0175 (r1 INTEL Cpu2Ist 10 MSFT 1000013)
ACPI: SSDT DFEEB000, 0175 (r1 INTEL Cpu3Ist 10 MSFT 1000013)
ACPI: SSDT DFEEA000, 01BC (r1 INTEL CpuPm 10 MSFT 1000013)
ACPI: HEST DFEE9000, 00A8 (r1 INTEL S3200SHC 1 INTL 1)
ACPI: BERT DFEE8000, 0030 (r1 INTEL S3200SHC 1 INTL 1)
ACPI: ERST DFEE7000, 0230 (r1 INTEL S3200SHC 1 INTL 1)
ACPI: EINJ DFEE6000, 0130 (r1 INTEL S3200SHC 1 INTL 1)
ACPI: Local APIC address 0xfee00000
(6 early reservations) ==> bootmem [0000000000 - 0220000000]
#0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000]
#1 [0000006000 - 0000008000] TRAMPOLINE ==> [0000006000 - 0000008000]
#2 [0000200000 - 00007df9c4] TEXT DATA BSS ==> [0000200000 - 00007df9c4]
#3 [000009c800 - 0000100000] BIOS reserved ==> [000009c800 - 0000100000]
#4 [0000008000 - 000000c000] PGTABLE ==> [0000008000 - 000000c000]
#5 [000000c000 - 0000011000] PGTABLE ==> [000000c000 - 0000011000]
found SMP MP-table at [ffff8800000fd270] 000fd270
[ffffe20000000000-ffffe200077fffff] PMD -> [ffff880028200000-ffff88002f9fffff] on node 0
Zone PFN ranges:
DMA 0x00000000 -> 0x00001000
DMA32 0x00001000 -> 0x00100000
Normal 0x00100000 -> 0x00220000
Movable zone start PFN for each node
early_node_map[7] active PFN ranges
0: 0x00000000 -> 0x0000009c
0: 0x00000100 -> 0x000dfcf0
0: 0x000dfd96 -> 0x000dfdfa
0: 0x000dfe5f -> 0x000dfe69
0: 0x000dfedf -> 0x000dfee6
0: 0x000dfeff -> 0x000dff00
0: 0x00100000 -> 0x00220000
On node 0 totalpages: 2096386
DMA zone: 56 pages used for memmap
DMA zone: 1616 pages reserved
DMA zone: 2324 pages, LIFO batch:0
DMA32 zone: 14280 pages used for memmap
DMA32 zone: 898462 pages, LIFO batch:31
Normal zone: 16128 pages used for memmap
Normal zone: 1163520 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
ACPI: LAPIC_NMI (acpi_id[0x01] high level lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high level lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high level lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] high level lint[0x1])
ACPI: IOAPIC (id[0x05] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 5, version 0, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Using ACPI (MADT) for SMP configuration information
ACPI: HPET id: 0x8086a201 base: 0xfed00000
SMP: Allowing 4 CPUs, 0 hotplug CPUs
nr_irqs_gsi: 24
Allocating PCI resources starting at e1000000 (gap: e0000000:10000000)
NR_CPUS:32 nr_cpumask_bits:32 nr_cpu_ids:4 nr_node_ids:1
PERCPU: Allocating 45056 bytes of per cpu data
Built 1 zonelists in Zone order, mobility grouping on. Total pages: 2064306
Kernel command line: BOOT_IMAGE=/vmlinuz-2.6.29-rc6 root=/dev/md1 ro
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Extended CMOS year: 2000
TSC: PIT calibration matches PMTIMER. 1 loops
Detected 2659.997 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
No AGP bridge found
Calgary: detecting Calgary via BIOS EBDA area
Calgary: Unable to locate Rio Grande table in EBDA - bailing!
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing 64MB software IO TLB between ffff880020000000 - ffff880024000000
software IO TLB at phys 0x20000000 - 0x24000000
Memory: 8176948k/8912896k available (3452k kernel code, 527352k absent, 207624k reserved, 1566k data, 400k init)
hpet clockevent registered
HPET: 4 timers in total, 0 timers will be used for per-cpu timer
Calibrating delay loop (skipped), value calculated using timer frequency.. 5322.66 BogoMIPS (lpj=8866656)
Mount-cache hash table entries: 256
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
[ds] using Core 2/Atom configuration
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM2)
using mwait in idle threads.
Freeing SMP alternatives: 33k freed
ACPI: Core revision 20081204
Setting APIC routing to flat
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Xeon(R) CPU X3350 @ 2.66GHz stepping 07
Booting processor 1 APIC 0x2 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5322.67 BogoMIPS (lpj=8866675)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
[ds] using Core 2/Atom configuration
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 2
CPU1: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 1, old 0x7040600070406, new 0x7010600070106
CPU1: Intel(R) Xeon(R) CPU X3350 @ 2.66GHz stepping 07
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Booting processor 2 APIC 0x1 ip 0x6000
Initializing CPU#2
Calibrating delay using timer specific routine.. 5322.64 BogoMIPS (lpj=8866633)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
[ds] using Core 2/Atom configuration
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU2: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 2, old 0x7040600070406, new 0x7010600070106
CPU2: Intel(R) Xeon(R) CPU X3350 @ 2.66GHz stepping 07
checking TSC synchronization [CPU#0 -> CPU#2]: passed.
Booting processor 3 APIC 0x3 ip 0x6000
Initializing CPU#3
Calibrating delay using timer specific routine.. 5322.66 BogoMIPS (lpj=8866661)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 6144K
[ds] using Core 2/Atom configuration
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 3
CPU3: Thermal monitoring enabled (TM2)
x86 PAT enabled: cpu 3, old 0x7040600070406, new 0x7010600070106
CPU3: Intel(R) Xeon(R) CPU X3350 @ 2.66GHz stepping 07
checking TSC synchronization [CPU#0 -> CPU#3]: passed.
Brought up 4 CPUs
Total of 4 processors activated (21288.64 BogoMIPS).
net_namespace: 1344 bytes
xor: automatically using best checksumming function: generic_sse
generic_sse: 9922.800 MB/sec
xor: using function: generic_sse (9922.800 MB/sec)
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: MCFG configuration 0: base f0000000 segment 0 buses 0 - 63
PCI: MCFG area at f0000000 reserved in E820
PCI: Using MMCONFIG at f0000000 - f3ffffff
PCI: Using configuration type 1 for base access
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:06.0: PME# supported from D0 D3hot D3cold
pci 0000:00:06.0: PME# disabled
pci 0000:00:19.0: reg 10 32bit mmio: [0xe1d00000-0xe1d1ffff]
pci 0000:00:19.0: reg 14 32bit mmio: [0xe1d20000-0xe1d20fff]
pci 0000:00:19.0: reg 18 io port: [0x40e0-0x40ff]
pci 0000:00:19.0: PME# supported from D0 D3hot D3cold
pci 0000:00:19.0: PME# disabled
pci 0000:00:1a.0: reg 20 io port: [0x40c0-0x40df]
pci 0000:00:1a.1: reg 20 io port: [0x40a0-0x40bf]
pci 0000:00:1a.7: reg 10 32bit mmio: [0xe1d21c00-0xe1d21fff]
pci 0000:00:1a.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1a.7: PME# disabled
pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.0: PME# disabled
pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
pci 0000:00:1c.4: PME# disabled
pci 0000:00:1d.0: reg 20 io port: [0x4080-0x409f]
pci 0000:00:1d.1: reg 20 io port: [0x4060-0x407f]
pci 0000:00:1d.2: reg 20 io port: [0x4040-0x405f]
pci 0000:00:1d.7: reg 10 32bit mmio: [0xe1d21800-0xe1d21bff]
pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
pci 0000:00:1d.7: PME# disabled
pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0ca0 (mask 000f)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 2 PIO at 0680 (mask 00ff)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 3 PIO at 0800 (mask 007f)
pci 0000:00:1f.0: ICH7 LPC Generic IO decode 4 PIO at 0600 (mask 007f)
pci 0000:00:1f.2: reg 10 io port: [0x4408-0x440f]
pci 0000:00:1f.2: reg 14 io port: [0x4414-0x4417]
pci 0000:00:1f.2: reg 18 io port: [0x4400-0x4407]
pci 0000:00:1f.2: reg 1c io port: [0x4410-0x4413]
pci 0000:00:1f.2: reg 20 io port: [0x4020-0x403f]
pci 0000:00:1f.2: reg 24 32bit mmio: [0xe1d21000-0xe1d217ff]
pci 0000:00:1f.2: PME# supported from D3hot
pci 0000:00:1f.2: PME# disabled
pci 0000:00:1f.3: reg 10 64bit mmio: [0xe1d22000-0xe1d220ff]
pci 0000:00:1f.3: reg 20 io port: [0x4000-0x401f]
pci 0000:01:00.0: reg 10 32bit mmio: [0xe1c00000-0xe1c01fff]
pci 0000:01:00.0: PME# supported from D0 D3hot
pci 0000:01:00.0: PME# disabled
pci 0000:00:06.0: bridge io port: [0x3000-0x3fff]
pci 0000:00:06.0: bridge 32bit mmio: [0xe1b00000-0xe1cfffff]
pci 0000:02:08.0: reg 10 io port: [0x3000-0x30ff]
pci 0000:02:08.0: reg 14 64bit mmio: [0xe1b00000-0xe1b1ffff]
pci 0000:02:08.0: reg 1c 64bit mmio: [0xe1b20000-0xe1b3ffff]
pci 0000:02:08.0: reg 30 32bit mmio: [0xfff00000-0xffffffff]
pci 0000:02:08.0: supports D1 D2
pci 0000:01:00.0: bridge io port: [0x3000-0x3fff]
pci 0000:01:00.0: bridge 32bit mmio: [0xe1b00000-0xe1bfffff]
pci 0000:03:00.0: reg 10 64bit mmio: [0xe1a04000-0xe1a0407f]
pci 0000:03:00.0: reg 18 64bit mmio: [0xe1a00000-0xe1a03fff]
pci 0000:03:00.0: reg 20 io port: [0x2000-0x207f]
pci 0000:03:00.0: reg 30 32bit mmio: [0xfff80000-0xffffffff]
pci 0000:03:00.0: supports D1 D2
pci 0000:00:1c.0: bridge io port: [0x2000-0x2fff]
pci 0000:00:1c.0: bridge 32bit mmio: [0xe1a00000-0xe1afffff]
pci 0000:04:00.0: reg 10 32bit mmio: [0xe0000000-0xe0ffffff]
pci 0000:04:00.0: reg 14 32bit mmio: [0xe1800000-0xe1803fff]
pci 0000:04:00.0: reg 18 32bit mmio: [0xe1000000-0xe17fffff]
pci 0000:04:00.0: reg 30 32bit mmio: [0xffff0000-0xffffffff]
pci 0000:00:1c.4: bridge 32bit mmio: [0xe1000000-0xe18fffff]
pci 0000:00:1c.4: bridge 64bit mmio pref: [0xe0000000-0xe0ffffff]
pci 0000:05:02.0: reg 10 32bit mmio: [0xe1920000-0xe193ffff]
pci 0000:05:02.0: reg 14 32bit mmio: [0xe1900000-0xe191ffff]
pci 0000:05:02.0: reg 18 io port: [0x1000-0x103f]
pci 0000:05:02.0: reg 30 32bit mmio: [0xfffe0000-0xffffffff]
pci 0000:05:02.0: PME# supported from D0 D3hot D3cold
pci 0000:05:02.0: PME# disabled
pci 0000:00:1e.0: transparent bridge
pci 0000:00:1e.0: bridge io port: [0x1000-0x1fff]
pci 0000:00:1e.0: bridge 32bit mmio: [0xe1900000-0xe19fffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.NPE6._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P32_._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 *11 12)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 *10 11 12)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 *9 10 11 12)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 *11 12)
SCSI subsystem initialized
libata version 3.00 loaded.
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
hpet0: 4 comparators, 64-bit 14.318180 MHz counter
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 14 devices
ACPI: ACPI bus type pnp unregistered
system 00:01: iomem range 0xf0000000-0xf3ffffff has been reserved
system 00:01: iomem range 0xfeb00000-0xfeb03fff has been reserved
system 00:01: iomem range 0xfed14000-0xfed17fff has been reserved
system 00:01: iomem range 0xfed18000-0xfed18fff has been reserved
system 00:01: iomem range 0xfed19000-0xfed19fff has been reserved
system 00:01: iomem range 0xfed1c000-0xfed1ffff has been reserved
system 00:01: iomem range 0xfed20000-0xfed3ffff has been reserved
system 00:01: iomem range 0xfed45000-0xfed99fff has been reserved
system 00:01: iomem range 0xc0000-0xdffff could not be reserved
system 00:01: iomem range 0xe0000-0xfffff could not be reserved
system 00:01: iomem range 0xffc00000-0xffffffff could not be reserved
system 00:06: ioport range 0x500-0x53f has been reserved
system 00:06: ioport range 0x400-0x47f has been reserved
system 00:06: ioport range 0xca0-0xcaf has been reserved
system 00:06: ioport range 0x680-0x6ff has been reserved
system 00:06: ioport range 0x800-0x87f has been reserved
system 00:06: ioport range 0x600-0x61f has been reserved
pci 0000:01:00.0: PCI bridge, secondary bus 0000:02
pci 0000:01:00.0: IO window: 0x3000-0x3fff
pci 0000:01:00.0: MEM window: 0xe1b00000-0xe1bfffff
pci 0000:01:00.0: PREFETCH window: 0x000000e1e00000-0x000000e1efffff
pci 0000:00:06.0: PCI bridge, secondary bus 0000:01
pci 0000:00:06.0: IO window: 0x3000-0x3fff
pci 0000:00:06.0: MEM window: 0xe1b00000-0xe1cfffff
pci 0000:00:06.0: PREFETCH window: 0x000000e1e00000-0x000000e1efffff
pci 0000:00:1c.0: PCI bridge, secondary bus 0000:03
pci 0000:00:1c.0: IO window: 0x2000-0x2fff
pci 0000:00:1c.0: MEM window: 0xe1a00000-0xe1afffff
pci 0000:00:1c.0: PREFETCH window: 0x000000e1f00000-0x000000e1ffffff
pci 0000:00:1c.4: PCI bridge, secondary bus 0000:04
pci 0000:00:1c.4: IO window: disabled
pci 0000:00:1c.4: MEM window: 0xe1000000-0xe18fffff
pci 0000:00:1c.4: PREFETCH window: 0x000000e0000000-0x000000e0ffffff
pci 0000:00:1e.0: PCI bridge, secondary bus 0000:05
pci 0000:00:1e.0: IO window: 0x1000-0x1fff
pci 0000:00:1e.0: MEM window: 0xe1900000-0xe19fffff
pci 0000:00:1e.0: PREFETCH window: disabled
pci 0000:00:06.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:00:06.0: setting latency timer to 64
pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
pci 0000:01:00.0: setting latency timer to 64
pci 0000:00:1c.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.0: setting latency timer to 64
pci 0000:00:1c.4: PCI INT A -> GSI 17 (level, low) -> IRQ 17
pci 0000:00:1c.4: setting latency timer to 64
pci 0000:00:1e.0: setting latency timer to 64
pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffffffffffff]
pci_bus 0000:01: resource 0 io: [0x3000-0x3fff]
pci_bus 0000:01: resource 1 mem: [0xe1b00000-0xe1cfffff]
pci_bus 0000:01: resource 2 mem: [0xe1e00000-0xe1efffff]
pci_bus 0000:01: resource 3 mem: [0x0-0x0]
pci_bus 0000:02: resource 0 io: [0x3000-0x3fff]
pci_bus 0000:02: resource 1 mem: [0xe1b00000-0xe1bfffff]
pci_bus 0000:02: resource 2 mem: [0xe1e00000-0xe1efffff]
pci_bus 0000:02: resource 3 mem: [0x0-0x0]
pci_bus 0000:03: resource 0 io: [0x2000-0x2fff]
pci_bus 0000:03: resource 1 mem: [0xe1a00000-0xe1afffff]
pci_bus 0000:03: resource 2 mem: [0xe1f00000-0xe1ffffff]
pci_bus 0000:03: resource 3 mem: [0x0-0x0]
pci_bus 0000:04: resource 0 mem: [0x0-0x0]
pci_bus 0000:04: resource 1 mem: [0xe1000000-0xe18fffff]
pci_bus 0000:04: resource 2 mem: [0xe0000000-0xe0ffffff]
pci_bus 0000:04: resource 3 mem: [0x0-0x0]
pci_bus 0000:05: resource 0 io: [0x1000-0x1fff]
pci_bus 0000:05: resource 1 mem: [0xe1900000-0xe19fffff]
pci_bus 0000:05: resource 2 mem: [0x0-0x0]
pci_bus 0000:05: resource 3 io: [0x00-0xffff]
pci_bus 0000:05: resource 4 mem: [0x000000-0xffffffffffffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
NET: Registered protocol family 1
kvm: disabled by bios
Microcode Update Driver: v2.00 <[email protected]>, Peter Oruba
VFS: Disk quotas dquot_6.5.2
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
msgmni has been set to 15972
alg: No test for fcrypt (fcrypt-generic)
alg: No test for stdrng (krng)
async_tx: api initialized (sync-only)
io scheduler noop registered
io scheduler cfq registered (default)
pci 0000:04:00.0: Boot video device
pcieport-driver 0000:00:06.0: setting latency timer to 64
pcieport-driver 0000:00:06.0: irq 24 for MSI/MSI-X
pcieport-driver 0000:00:1c.0: setting latency timer to 64
pcieport-driver 0000:00:1c.0: irq 25 for MSI/MSI-X
pcieport-driver 0000:00:1c.4: setting latency timer to 64
pcieport-driver 0000:00:1c.4: irq 26 for MSI/MSI-X
input: Power Button (FF) as /class/input/input0
ACPI: Power Button (FF) [PWRF]
input: Sleep Button (CM) as /class/input/input1
ACPI: Sleep Button (CM) [SLPB]
input: Power Button (CM) as /class/input/input2
ACPI: Power Button (CM) [PWRB]
processor ACPI_CPU:00: registered as cooling_device0
processor ACPI_CPU:01: registered as cooling_device1
processor ACPI_CPU:02: registered as cooling_device2
processor ACPI_CPU:03: registered as cooling_device3
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
ipmi message handler version 39.2
ipmi device interface
IPMI System Interface driver.
ipmi_si: Trying SMBIOS-specified kcs state machine at i/o address 0xca2, slave address 0x20, irq 0
ipmi: Found new BMC (man_id: 0x000157, prod_id: 0x003a, dev_id: 0x21)
IPMI kcs interface initialized
IPMI Watchdog: driver initialized
Copyright (C) 2004 MontaVista Software - IPMI Powerdown via sys_reboot.
IPMI poweroff: ATCA Detect mfg 0x157 prod 0x3A
IPMI poweroff: Found a chassis style poweroff function
Hangcheck: starting hangcheck timer 0.9.0 (tick is 180 seconds, margin is 60 seconds).
Hangcheck: Using get_cycles().
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
brd: module loaded
loop: module loaded
Intel(R) PRO/1000 Network Driver - version 7.3.21-k3-NAPI
Copyright (c) 1999-2006 Intel Corporation.
e1000 0000:05:02.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
e1000: 0000:05:02.0: e1000_probe: (PCI:33MHz:32-bit) 00:15:17:26:b1:c7
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000e: Intel(R) PRO/1000 Network Driver - 0.3.3.3-k6
e1000e: Copyright (c) 1999-2008 Intel Corporation.
e1000e 0000:00:19.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
e1000e 0000:00:19.0: setting latency timer to 64
e1000e 0000:00:19.0: irq 27 for MSI/MSI-X
0000:00:19.0: eth1: (PCI Express:2.5GB/s:Width x1) 00:15:17:26:b1:c9
0000:00:19.0: eth1: Intel(R) PRO/1000 Network Connection
0000:00:19.0: eth1: MAC: 6, PHY: 6, PBA No: 0070ff-0ff
Linux video capture interface: v2.00
3ware 9000 Storage Controller device driver for Linux v2.26.02.011.
st: Version 20081215, fixed bufsize 32768, s/g segs 256
Driver 'st' needs updating - please use bus_type methods
Driver 'sd' needs updating - please use bus_type methods
Driver 'sr' needs updating - please use bus_type methods
ahci 0000:00:1f.2: version 3.0
ahci 0000:00:1f.2: PCI INT A -> GSI 21 (level, low) -> IRQ 21
ahci 0000:00:1f.2: irq 28 for MSI/MSI-X
ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
ahci 0000:00:1f.2: flags: 64bit ncq sntf pm led clo pio slum part ems
ahci 0000:00:1f.2: setting latency timer to 64
scsi0 : ahci
scsi1 : ahci
scsi2 : ahci
scsi3 : ahci
scsi4 : ahci
scsi5 : ahci
ata1: SATA max UDMA/133 irq_stat 0x00400040, connection status changed irq 28
ata2: SATA max UDMA/133 irq_stat 0x00400040, connection status changed
ata3: SATA max UDMA/133 irq_stat 0x00400040, connection status changed
ata4: SATA max UDMA/133 irq_stat 0x00400040, connection status changed
ata5: SATA max UDMA/133 irq_stat 0x00400040, connection status changed
ata6: SATA max UDMA/133 irq_stat 0x00400040, connection status changed
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-8: WDC WD3000GLFS-01F8U0, 03.03V01, max UDMA/133
ata1.00: 586072368 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA WDC WD3000GLFS-0 03.0 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 586072368 512-byte hardware sectors: (300 GB/279 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 586072368 512-byte hardware sectors: (300 GB/279 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3 sda4
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-8: WDC WD3000GLFS-01F8U0, 03.03V01, max UDMA/133
ata2.00: 586072368 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access ATA WDC WD3000GLFS-0 03.0 PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 586072368 512-byte hardware sectors: (300 GB/279 GiB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 586072368 512-byte hardware sectors: (300 GB/279 GiB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: sdb1 sdb2 sdb3 sdb4
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 1:0:0:0: Attached scsi generic sg1 type 0
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata3.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata3.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 2:0:0:0: [sdc] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 2:0:0:0: [sdc] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 2:0:0:0: [sdc] Write Protect is off
sd 2:0:0:0: [sdc] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdc: sdc1
sd 2:0:0:0: [sdc] Attached SCSI disk
sd 2:0:0:0: Attached scsi generic sg2 type 0
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata4.00: ATA-8: WDC WD1002FBYS-01A6B0, 03.00C05, max UDMA/133
ata4.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata4.00: configured for UDMA/133
scsi 3:0:0:0: Direct-Access ATA WDC WD1002FBYS-0 03.0 PQ: 0 ANSI: 5
sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 3:0:0:0: [sdd] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 3:0:0:0: [sdd] Write Protect is off
sd 3:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 3:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdd: sdd1
sd 3:0:0:0: [sdd] Attached SCSI disk
sd 3:0:0:0: Attached scsi generic sg3 type 0
ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5.00: ATA-8: WDC WD1002FBYS-01A6B0, 03.00C05, max UDMA/133
ata5.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata5.00: configured for UDMA/133
scsi 4:0:0:0: Direct-Access ATA WDC WD1002FBYS-0 03.0 PQ: 0 ANSI: 5
sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 4:0:0:0: [sde] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 4:0:0:0: [sde] Write Protect is off
sd 4:0:0:0: [sde] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sde] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sde: sde1
sd 4:0:0:0: [sde] Attached SCSI disk
sd 4:0:0:0: Attached scsi generic sg4 type 0
ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata6.00: ATA-8: WDC WD1002FBYS-01A6B0, 03.00C05, max UDMA/133
ata6.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata6.00: configured for UDMA/133
scsi 5:0:0:0: Direct-Access ATA WDC WD1002FBYS-0 03.0 PQ: 0 ANSI: 5
sd 5:0:0:0: [sdf] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 5:0:0:0: [sdf] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdf: sdf1 sdf2 sdf3
sd 5:0:0:0: [sdf] Attached SCSI disk
sd 5:0:0:0: Attached scsi generic sg5 type 0
sata_sil24 0000:03:00.0: version 1.1
sata_sil24 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
sata_sil24 0000:03:00.0: setting latency timer to 64
scsi6 : sata_sil24
scsi7 : sata_sil24
ata7: SATA max UDMA/100 host m128@0xe1a04000 port 0xe1a00000 irq 16
ata8: SATA max UDMA/100 host m128@0xe1a04000 port 0xe1a02000 irq 16
ata7: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
ata7.15: Port Multiplier 1.1, 0x1095:0x3726 r23, 6 ports, feat 0x1/0x9
ata7.00: hard resetting link
ata7.00: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
ata7.01: hard resetting link
ata7.01: SATA link down (SStatus 0 SControl 320)
ata7.02: hard resetting link
ata7.02: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata7.03: hard resetting link
ata7.03: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata7.04: hard resetting link
ata7.04: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata7.05: hard resetting link
ata7.05: SATA link up 1.5 Gbps (SStatus 113 SControl 320)
ata7.00: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata7.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata7.00: configured for UDMA/100
ata7.02: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata7.02: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata7.02: configured for UDMA/100
ata7.03: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata7.03: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata7.03: configured for UDMA/100
ata7.04: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata7.04: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata7.04: configured for UDMA/100
ata7: EH complete
scsi 6:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 6:0:0:0: [sdg] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:0:0:0: [sdg] Write Protect is off
sd 6:0:0:0: [sdg] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 6:0:0:0: [sdg] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:0:0:0: [sdg] Write Protect is off
sd 6:0:0:0: [sdg] Mode Sense: 00 3a 00 00
sd 6:0:0:0: [sdg] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdg: sdg1
sd 6:0:0:0: [sdg] Attached SCSI disk
sd 6:0:0:0: Attached scsi generic sg6 type 0
scsi 6:2:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 6:2:0:0: [sdh] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:2:0:0: [sdh] Write Protect is off
sd 6:2:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 6:2:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 6:2:0:0: [sdh] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:2:0:0: [sdh] Write Protect is off
sd 6:2:0:0: [sdh] Mode Sense: 00 3a 00 00
sd 6:2:0:0: [sdh] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdh: sdh1
sd 6:2:0:0: [sdh] Attached SCSI disk
sd 6:2:0:0: Attached scsi generic sg7 type 0
scsi 6:3:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 6:3:0:0: [sdi] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:3:0:0: [sdi] Write Protect is off
sd 6:3:0:0: [sdi] Mode Sense: 00 3a 00 00
sd 6:3:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 6:3:0:0: [sdi] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:3:0:0: [sdi] Write Protect is off
sd 6:3:0:0: [sdi] Mode Sense: 00 3a 00 00
sd 6:3:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdi: sdi1
sd 6:3:0:0: [sdi] Attached SCSI disk
sd 6:3:0:0: Attached scsi generic sg8 type 0
scsi 6:4:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 6:4:0:0: [sdj] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:4:0:0: [sdj] Write Protect is off
sd 6:4:0:0: [sdj] Mode Sense: 00 3a 00 00
sd 6:4:0:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 6:4:0:0: [sdj] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 6:4:0:0: [sdj] Write Protect is off
sd 6:4:0:0: [sdj] Mode Sense: 00 3a 00 00
sd 6:4:0:0: [sdj] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdj: sdj1
sd 6:4:0:0: [sdj] Attached SCSI disk
sd 6:4:0:0: Attached scsi generic sg9 type 0
ata8: SATA link up 3.0 Gbps (SStatus 123 SControl 0)
ata8.00: ATA-7: SAMSUNG HD103UJ, 1AA01112, max UDMA7
ata8.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32)
ata8.00: configured for UDMA/100
scsi 7:0:0:0: Direct-Access ATA SAMSUNG HD103UJ 1AA0 PQ: 0 ANSI: 5
sd 7:0:0:0: [sdk] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 7:0:0:0: [sdk] Write Protect is off
sd 7:0:0:0: [sdk] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 7:0:0:0: [sdk] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 7:0:0:0: [sdk] Write Protect is off
sd 7:0:0:0: [sdk] Mode Sense: 00 3a 00 00
sd 7:0:0:0: [sdk] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdk: sdk1
sd 7:0:0:0: [sdk] Attached SCSI disk
sd 7:0:0:0: Attached scsi generic sg10 type 0
Fusion MPT base driver 3.04.07
Copyright (c) 1999-2008 LSI Corporation
Fusion MPT SPI Host driver 3.04.07
mptspi 0000:02:08.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
mptbase: ioc0: Initiating bringup
ioc0: LSI53C1020A A1: Capabilities={Initiator,Target}
scsi8 : ioc0: LSI53C1020A A1, FwRev=01032700h, Ports=1, MaxQ=255, IRQ=16
scsi 8:0:0:0: Sequential-Access HP Ultrium 3-SCSI D22D PQ: 0 ANSI: 3
target8:0:0: Beginning Domain Validation
target8:0:0: Ending Domain Validation
target8:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, offset 64)
st 8:0:0:0: Attached scsi tape st0
st 8:0:0:0: st0: try direct i/o: yes (alignment 4 B)
st 8:0:0:0: Attached scsi generic sg11 type 1
Fusion MPT misc device (ioctl) driver 3.04.07
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:1a.7: PCI INT C -> GSI 17 (level, low) -> IRQ 17
ehci_hcd 0000:00:1a.7: setting latency timer to 64
ehci_hcd 0000:00:1a.7: EHCI Host Controller
ehci_hcd 0000:00:1a.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1a.7: debug port 1
ehci_hcd 0000:00:1a.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1a.7: irq 17, io mem 0xe1d21c00
ehci_hcd 0000:00:1a.7: USB 2.0 started, EHCI 1.00
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 6 ports detected
ehci_hcd 0000:00:1d.7: PCI INT A -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: setting latency timer to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 2
ehci_hcd 0000:00:1d.7: debug port 1
ehci_hcd 0000:00:1d.7: cache line size of 32 is not supported
ehci_hcd 0000:00:1d.7: irq 23, io mem 0xe1d21800
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 6 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
uhci_hcd 0000:00:1a.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1a.0: setting latency timer to 64
uhci_hcd 0000:00:1a.0: UHCI Host Controller
uhci_hcd 0000:00:1a.0: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1a.0: irq 18, io base 0x000040c0
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
uhci_hcd 0000:00:1a.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21
uhci_hcd 0000:00:1a.1: setting latency timer to 64
uhci_hcd 0000:00:1a.1: UHCI Host Controller
uhci_hcd 0000:00:1a.1: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1a.1: irq 21, io base 0x000040a0
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.0: PCI INT A -> GSI 23 (level, low) -> IRQ 23
uhci_hcd 0000:00:1d.0: setting latency timer to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.0: irq 23, io base 0x00004080
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.1: PCI INT B -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: setting latency timer to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 6
uhci_hcd 0000:00:1d.1: irq 19, io base 0x00004060
usb usb6: configuration #1 chosen from 1 choice
hub 6-0:1.0: USB hub found
hub 6-0:1.0: 2 ports detected
uhci_hcd 0000:00:1d.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: setting latency timer to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 7
uhci_hcd 0000:00:1d.2: irq 18, io base 0x00004040
usb usb7: configuration #1 chosen from 1 choice
hub 7-0:1.0: USB hub found
hub 7-0:1.0: 2 ports detected
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver usbserial
USB Serial support registered for generic
usbcore: registered new interface driver usbserial_generic
usbserial: USB Serial Driver core
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
i2c /dev entries driver
i801_smbus 0000:00:1f.3: PCI INT B -> GSI 18 (level, low) -> IRQ 18
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.05
iTCO_wdt: failed to reset NO_REBOOT flag, reboot disabled by hardware
iTCO_wdt: No card detected
md: raid1 personality registered for level 1
input: AT Translated Set 2 keyboard as /class/input/input3
raid6: int64x1 2579 MB/s
raid6: int64x2 3291 MB/s
raid6: int64x4 2736 MB/s
raid6: int64x8 2310 MB/s
raid6: sse2x1 4541 MB/s
raid6: sse2x2 5122 MB/s
raid6: sse2x4 7993 MB/s
raid6: using algorithm sse2x4 (7993 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
device-mapper: ioctl: 4.14.0-ioctl (2008-04-23) initialised: [email protected]
EDAC MC: Ver: 2.1.0 Feb 24 2009
cpuidle: using governor ladder
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
usbhid: v2.6:USB HID core driver
Netfilter messages via NETLINK v0.30.
nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
ip_tables: (C) 2000-2006 Netfilter Core Team
arp_tables: (C) 2002 David S. Miller
TCP bic registered
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 17
NET: Registered protocol family 15
Bridge firewalling registered
Ebtables v2.0 registered
ebt_ulog: out of memory trying to call netlink_kernel_create
NET: Registered protocol family 33
RxRPC: Registered security type 2 'rxkad'
802.1Q VLAN Support v1.8 Ben Greear <[email protected]>
All bugs added by David S. Miller <[email protected]>
drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
BIOS EDD facility v0.16 2004-Jun-25, 6 devices found
input: ImExPS/2 Generic Explorer Mouse as /class/input/input4
md: Waiting for all devices to be available before autodetect
md: If you don't use raid, use raid=noautodetect
md: Autodetecting RAID arrays.
md: Scanned 16 and added 16 devices.
md: autorun ...
md: considering sdk1 ...
md: adding sdk1 ...
md: adding sdj1 ...
md: adding sdi1 ...
md: adding sdh1 ...
md: adding sdg1 ...
md: adding sde1 ...
md: adding sdd1 ...
md: adding sdc1 ...
md: sdb4 has different UUID to sdk1
md: sdb3 has different UUID to sdk1
md: sdb2 has different UUID to sdk1
md: sdb1 has different UUID to sdk1
md: sda4 has different UUID to sdk1
md: sda3 has different UUID to sdk1
md: sda2 has different UUID to sdk1
md: sda1 has different UUID to sdk1
md: created md5
md: bind<sdc1>
md: bind<sdd1>
md: bind<sde1>
md: bind<sdg1>
md: bind<sdh1>
md: bind<sdi1>
md: bind<sdj1>
md: bind<sdk1>
md: running: <sdk1><sdj1><sdi1><sdh1><sdg1><sde1><sdd1><sdc1>
raid5: device sdj1 operational as raid disk 6
raid5: device sdi1 operational as raid disk 5
raid5: device sdh1 operational as raid disk 4
raid5: device sdg1 operational as raid disk 3
raid5: device sde1 operational as raid disk 2
raid5: device sdd1 operational as raid disk 1
raid5: device sdc1 operational as raid disk 0
raid5: allocated 9518kB for md5
raid5: raid level 6 set md5 active with 7 out of 9 devices, algorithm 2
RAID5 conf printout:
--- rd:9 wd:7
disk 0, o:1, dev:sdc1
disk 1, o:1, dev:sdd1
disk 2, o:1, dev:sde1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdi1
disk 6, o:1, dev:sdj1
md5: bitmap initialized from disk: read 15/15 pages, set 3 bits
created bitmap (233 pages) for device md5
md: considering sdb4 ...
RAID5 conf printout:
--- rd:9 wd:7
disk 0, o:1, dev:sdc1
disk 1, o:1, dev:sdd1
disk 2, o:1, dev:sde1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdi1
disk 6, o:1, dev:sdj1
disk 7, o:1, dev:sdk1
md: adding sdb4 ...
md: sdb3 has different UUID to sdb4
md: sdb2 has different UUID to sdb4
md: sdb1 has different UUID to sdb4
md: recovery of RAID array md5
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 976759936 blocks.
md: adding sda4 ...
md: sda3 has different UUID to sdb4
md: sda2 has different UUID to sdb4
md: sda1 has different UUID to sdb4
md: created md3
md: bind<sda4>
md: bind<sdb4>
md: running: <sdb4><sda4>
raid1: raid set md3 active with 2 out of 2 mirrors
md3: bitmap initialized from disk: read 15/15 pages, set 0 bits
created bitmap (235 pages) for device md3
md: considering sdb3 ...
md: adding sdb3 ...
md: sdb2 has different UUID to sdb3
md: sdb1 has different UUID to sdb3
md: adding sda3 ...
md: sda2 has different UUID to sdb3
md: sda1 has different UUID to sdb3
md: created md2
md: bind<sda3>
md: bind<sdb3>
md: running: <sdb3><sda3>
raid1: raid set md2 active with 2 out of 2 mirrors
md2: bitmap initialized from disk: read 9/9 pages, set 0 bits
created bitmap (130 pages) for device md2
md: considering sdb2 ...
md: adding sdb2 ...
md: sdb1 has different UUID to sdb2
md: adding sda2 ...
md: sda1 has different UUID to sdb2
md: created md1
md: bind<sda2>
md: bind<sdb2>
md: running: <sdb2><sda2>
raid1: raid set md1 active with 2 out of 2 mirrors
md1: bitmap initialized from disk: read 10/10 pages, set 108 bits
created bitmap (150 pages) for device md1
md: considering sdb1 ...
md: adding sdb1 ...
md: adding sda1 ...
md: created md0
md: bind<sda1>
md: bind<sdb1>
md: running: <sdb1><sda1>
raid1: raid set md0 active with 2 out of 2 mirrors
md0: bitmap initialized from disk: read 1/1 pages, set 0 bits
created bitmap (12 pages) for device md0
md: ... autorun DONE.
md1: unknown partition table
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly on device 9:1.
Freeing unused kernel memory: 400k freed
md2:<6> md0: unknown partition table
unknown partition table
md5:<6> md3: unknown partition table
unknown partition table
st0: Block limits 1 - 16777215 bytes.
Adding 1919672k swap on /dev/md3. Priority:-1 extents:1 across:1919672k
EXT3 FS on md1, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on md0, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sdf1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on sdf2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
warning: `ntpd' uses 32-bit capabilities (legacy support in use)
EXT4-fs: barriers enabled
kjournald2 starting: pid 3427, dev dm-2:8, commit interval 5 seconds
EXT4 FS on dm-2, internal journal on dm-2:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-2 with ordered data mode
JBD: barrier-based sync failed on dm-2:8 - disabling barriers
md: md5: recovery done.
RAID5 conf printout:
--- rd:9 wd:8
disk 0, o:1, dev:sdc1
disk 1, o:1, dev:sdd1
disk 2, o:1, dev:sde1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdi1
disk 6, o:1, dev:sdj1
disk 7, o:1, dev:sdk1
EXT4-fs: barriers enabled
kjournald2 starting: pid 11990, dev dm-1:8, commit interval 5 seconds
EXT4 FS on dm-1, internal journal on dm-1:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-1 with ordered data mode
JBD: barrier-based sync failed on dm-1:8 - disabling barriers
EXT4-fs: mballoc: 5944357 blocks 16104 reqs (9609 success)
EXT4-fs: mballoc: 7150 extents scanned, 5731 goal hits, 5869 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 806 generated and it took 1646776
EXT4-fs: mballoc: 3952320 preallocated, 943658 discarded
EXT3-fs: dm-1: couldn't mount because of unsupported optional features (240).
EXT4-fs: barriers enabled
kjournald2 starting: pid 12054, dev dm-1:8, commit interval 5 seconds
EXT4 FS on dm-1, internal journal on dm-1:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-1 with ordered data mode
EXT4-fs: mballoc: 0 blocks 0 reqs (0 success)
EXT4-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 0 generated and it took 0
EXT4-fs: mballoc: 0 preallocated, 0 discarded
EXT4-fs: barriers enabled
kjournald2 starting: pid 12059, dev dm-1:8, commit interval 5 seconds
EXT4 FS on dm-1, internal journal on dm-1:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-1 with ordered data mode
EXT4-fs: barriers enabled
kjournald2 starting: pid 12063, dev dm-0:8, commit interval 5 seconds
EXT4 FS on dm-0, internal journal on dm-0:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-0 with ordered data mode
JBD: barrier-based sync failed on dm-0:8 - disabling barriers
EXT4-fs: mballoc: 36014838 blocks 245875 reqs (164335 success)
EXT4-fs: mballoc: 89077 extents scanned, 34866 goal hits, 81378 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 6927 generated and it took 18103832
EXT4-fs: mballoc: 23427120 preallocated, 6893492 discarded
EXT4-fs: barriers enabled
kjournald2 starting: pid 12382, dev dm-0:8, commit interval 5 seconds
EXT4 FS on dm-0, internal journal on dm-0:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-0 with ordered data mode
JBD: barrier-based sync failed on dm-1:8 - disabling barriers
JBD: barrier-based sync failed on dm-0:8 - disabling barriers
sd 5:0:0:0: [sdf] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdf: sdf1 sdf2 sdf3
sd 5:0:0:0: [sdf] 1953525168 512-byte hardware sectors: (1.00 TB/931 GiB)
sd 5:0:0:0: [sdf] Write Protect is off
sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdf: sdf1
md: bind<sdf1>
RAID5 conf printout:
--- rd:9 wd:8
disk 0, o:1, dev:sdc1
disk 1, o:1, dev:sdd1
disk 2, o:1, dev:sde1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdi1
disk 6, o:1, dev:sdj1
disk 7, o:1, dev:sdk1
disk 8, o:1, dev:sdf1
md: recovery of RAID array md5
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
md: using 128k window, over a total of 976759936 blocks.
md: md5: recovery done.
RAID5 conf printout:
--- rd:9 wd:9
disk 0, o:1, dev:sdc1
disk 1, o:1, dev:sdd1
disk 2, o:1, dev:sde1
disk 3, o:1, dev:sdg1
disk 4, o:1, dev:sdh1
disk 5, o:1, dev:sdi1
disk 6, o:1, dev:sdj1
disk 7, o:1, dev:sdk1
disk 8, o:1, dev:sdf1
md: data-check of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 96256 blocks.
md: delaying data-check of md1 until md0 has finished (they share one or more physical units)
md: delaying data-check of md2 until md1 has finished (they share one or more physical units)
md: delaying data-check of md1 until md0 has finished (they share one or more physical units)
md: delaying data-check of md3 until md1 has finished (they share one or more physical units)
md: data-check of RAID array md5
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 976759936 blocks.
md: md0: data-check done.
md: delaying data-check of md3 until md1 has finished (they share one or more physical units)
md: delaying data-check of md1 until md3 has finished (they share one or more physical units)
md: delaying data-check of md2 until md1 has finished (they share one or more physical units)
md: data-check of RAID array md3
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 1919680 blocks.
md: md3: data-check done.
md: delaying data-check of md2 until md1 has finished (they share one or more physical units)
md: data-check of RAID array md1
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 19534976 blocks.
md: md1: data-check done.
md: data-check of RAID array md2
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for data-check.
md: using 128k window, over a total of 271482368 blocks.
md: md2: data-check done.
md: md5: data-check done.
init_special_inode: bogus i_mode (53253)
attempt to access beyond end of device
dm-0: rw=0, want=1312475392770056, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=1312475392770056, limit=2147483648
init_special_inode: bogus i_mode (53253)
init_special_inode: bogus i_mode (53253)
init_special_inode: bogus i_mode (53253)
EXT4-fs: mballoc: 22367576 blocks 135958 reqs (123393 success)
EXT4-fs: mballoc: 13766 extents scanned, 132082 goal hits, 2742 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 6665 generated and it took 32913144
EXT4-fs: mballoc: 16501501 preallocated, 257247 discarded
EXT4-fs: barriers enabled
kjournald2 starting: pid 26165, dev dm-0:8, commit interval 5 seconds
EXT4 FS on dm-0, internal journal on dm-0:8
EXT4-fs: delayed allocation enabled
EXT4-fs: file extents enabled
EXT4-fs: mballoc enabled
EXT4-fs: mounted filesystem dm-0 with ordered data mode
JBD: barrier-based sync failed on dm-0:8 - disabling barriers
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648
attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648



2009-03-11 06:18:54

by Andreas Dilger

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Mar 11, 2009 12:18 +1030, Kevin Shanahan wrote:
> On Wed, 2009-03-11 at 12:13 +1030, Kevin Shanahan wrote:
> >
> > getfattr: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error
> >
> > And syslog shows:
> > Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
> > Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
> >
> > hermes:~# debugfs /dev/dm-0
> > debugfs 1.41.3 (12-Oct-2008)
> > debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
> >
> > Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3b9c185
> > Generation: 3690868 Version: 0x9d36b10d
> > User: 868313917 Group: -1340283792 Size: 0
> > File ACL: 0 Directory ACL: 0
> > Links: 1 Blockcount: 0
> > Fragment: Address: 0 Number: 0 Size: 0
> > ctime: 0x0742afc4 -- Sun Nov 11 06:51:24 1973
> > atime: 0x472a2311 -- Fri Nov 2 05:33:45 2007
> > mtime: 0x80c59881 -- Fri Jun 18 09:51:21 2038
> > Size of extra inode fields: 4
> > BLOCKS:

There isn't anything obvious here that would imply reading a wacky block
beyond the end of the filesystem. I even checked if e.g. you had quotas
enabled and the bogus UID/GID would result in the quota file becoming
astronomically large or something, but the numbers don't seem to match.

There are no blocks in the file, no xattr block ("File ACL" should be
renamed...).

> hermes:~# ls -l /srv/samba/local/apps/Gestalt.Net/SetupCD/program\ files/Business\ Objects/Common/3.5/bin/RptControllers.dll
> prwS--x--t 1 868313917 2954683504 0 1902-05-13 03:23 /srv/samba/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll
>
> I guess this is what Andreas meant by "turning a pile of manure into
> neatly organized fertilizer" :)

Yes, you should just delete the inodes reported corrupted in your
earlier postings in the 87x range - they contain nothing of value
anymore, and I suspect your troubles would be gone. At least we
wouldn't be left wondering if you are seeing new corruption in
the same range of blocks, or just leftover badness.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.


2009-03-11 11:37:31

by Manish Katiyar

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Tue, Mar 10, 2009 at 12:39 PM, Andreas Dilger <[email protected]> wrote:
> On Mar 10, 2009 ?15:32 +1030, Kevin Shanahan wrote:
>> On Mon, 2009-03-09 at 23:35 -0500, Eric Sandeen wrote:
>> > Kevin Shanahan wrote:
>> > > On Mon, 2009-03-09 at 21:57 -0500, Eric Sandeen wrote:
>> > >> Kevin Shanahan wrote:
>> > >>> kernel: init_special_inode: bogus i_mode (53253)
>
> If anyone has a chance, fixing this error message to be not-useless would
> be good... ?Including the device name and the inode number would help
> track down the source of the problem.

Hi Andreas,

Below is the patch along with other trivial cleanups in fs/inode.c

Fix init_special_inode to print device name and inode number.
Convert simple_strtoul to strict_strtoul.
Remove checkpatch.pl warnings:
Before :-
total: 24 errors, 26 warnings, 1547 lines checked
After :-
total: 0 errors, 0 warnings, 1539 lines checked


Signed-off-by: Manish Katiyar <[email protected]>
---
fs/inode.c | 91 +++++++++++++++++++++++++++--------------------------------
1 files changed, 42 insertions(+), 49 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 913ab2d..ef4a9b1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -98,7 +98,7 @@ static DEFINE_MUTEX(iprune_mutex);
*/
struct inodes_stat_t inodes_stat;

-static struct kmem_cache * inode_cachep __read_mostly;
+static struct kmem_cache *inode_cachep __read_mostly;

static void wake_up_inode(struct inode *inode)
{
@@ -123,7 +123,7 @@ struct inode *inode_init_always(struct super_block
*sb, struct inode *inode)
static struct inode_operations empty_iops;
static const struct file_operations empty_fops;

- struct address_space * const mapping = &inode->i_data;
+ struct address_space *const mapping = &inode->i_data;

inode->i_sb = sb;
inode->i_blkbits = sb->s_blocksize_bits;
@@ -206,7 +206,7 @@ static struct inode *alloc_inode(struct super_block *sb)
return NULL;
}

-void destroy_inode(struct inode *inode)
+void destroy_inode(struct inode *inode)
{
BUG_ON(inode_has_buffers(inode));
security_inode_free(inode);
@@ -242,12 +242,11 @@ void inode_init_once(struct inode *inode)
mutex_init(&inode->inotify_mutex);
#endif
}
-
EXPORT_SYMBOL(inode_init_once);

static void init_once(void *foo)
{
- struct inode * inode = (struct inode *) foo;
+ struct inode *inode = (struct inode *) foo;

inode_init_once(inode);
}
@@ -255,7 +254,7 @@ static void init_once(void *foo)
/*
* inode_lock must be held
*/
-void __iget(struct inode * inode)
+void __iget(struct inode *inode)
{
if (atomic_read(&inode->i_count)) {
atomic_inc(&inode->i_count);
@@ -279,7 +278,7 @@ void clear_inode(struct inode *inode)
{
might_sleep();
invalidate_inode_buffers(inode);
-
+
BUG_ON(inode->i_data.nrpages);
BUG_ON(!(inode->i_state & I_FREEING));
BUG_ON(inode->i_state & I_CLEAR);
@@ -293,7 +292,6 @@ void clear_inode(struct inode *inode)
cd_forget(inode);
inode->i_state = I_CLEAR;
}
-
EXPORT_SYMBOL(clear_inode);

/*
@@ -341,8 +339,8 @@ static int invalidate_list(struct list_head *head,
struct list_head *dispose)

next = head->next;
for (;;) {
- struct list_head * tmp = next;
- struct inode * inode;
+ struct list_head *tmp = next;
+ struct inode *inode;

/*
* We can reschedule here without worrying about the list's
@@ -378,7 +376,7 @@ static int invalidate_list(struct list_head *head,
struct list_head *dispose)
* fails because there are busy inodes then a non zero value is returned.
* If the discard is successful all the inodes have been discarded.
*/
-int invalidate_inodes(struct super_block * sb)
+int invalidate_inodes(struct super_block *sb)
{
int busy;
LIST_HEAD(throw_away);
@@ -394,7 +392,6 @@ int invalidate_inodes(struct super_block * sb)

return busy;
}
-
EXPORT_SYMBOL(invalidate_inodes);

static int can_unuse(struct inode *inode)
@@ -490,7 +487,7 @@ static int shrink_icache_memory(int nr, gfp_t gfp_mask)
* Nasty deadlock avoidance. We may hold various FS locks,
* and we don't want to recurse into the FS that called us
* in clear_inode() and friends..
- */
+ */
if (!(gfp_mask & __GFP_FS))
return -1;
prune_icache(nr);
@@ -510,10 +507,11 @@ static void __wait_on_freeing_inode(struct inode *inode);
* by hand after calling find_inode now! This simplifies iunique and won't
* add any additional branch in the common code.
*/
-static struct inode * find_inode(struct super_block * sb, struct
hlist_head *head, int (*test)(struct inode *, void *), void *data)
+static struct inode *find_inode(struct super_block *sb, struct
hlist_head *head,
+ int (*test)(struct inode *, void *), void *data)
{
struct hlist_node *node;
- struct inode * inode = NULL;
+ struct inode *inode = NULL;

repeat:
hlist_for_each_entry(inode, node, head, i_hash) {
@@ -534,10 +532,11 @@ repeat:
* find_inode_fast is the fast path version of find_inode, see the comment at
* iget_locked for details.
*/
-static struct inode * find_inode_fast(struct super_block * sb, struct
hlist_head *head, unsigned long ino)
+static struct inode *find_inode_fast(struct super_block *sb,
+ struct hlist_head *head, unsigned long ino)
{
struct hlist_node *node;
- struct inode * inode = NULL;
+ struct inode *inode = NULL;

repeat:
hlist_for_each_entry(inode, node, head, i_hash) {
@@ -617,10 +616,10 @@ struct inode *new_inode(struct super_block *sb)
* here to attempt to avoid that.
*/
static unsigned int last_ino;
- struct inode * inode;
+ struct inode *inode;

spin_lock_prefetch(&inode_lock);
-
+
inode = alloc_inode(sb);
if (inode) {
spin_lock(&inode_lock);
@@ -631,7 +630,6 @@ struct inode *new_inode(struct super_block *sb)
}
return inode;
}
-
EXPORT_SYMBOL(new_inode);

void unlock_new_inode(struct inode *inode)
@@ -659,7 +657,6 @@ void unlock_new_inode(struct inode *inode)
inode->i_state &= ~(I_LOCK|I_NEW);
wake_up_inode(inode);
}
-
EXPORT_SYMBOL(unlock_new_inode);

/*
@@ -668,13 +665,16 @@ EXPORT_SYMBOL(unlock_new_inode);
* We no longer cache the sb_flags in i_flags - see fs.h
* -- [email protected]
*/
-static struct inode * get_new_inode(struct super_block *sb, struct
hlist_head *head, int (*test)(struct inode *, void *), int
(*set)(struct inode *, void *), void *data)
+static struct inode *get_new_inode(struct super_block *sb,
+ struct hlist_head *head,
+ int (*test)(struct inode *, void *),
+ int (*set)(struct inode *, void *), void *data)
{
- struct inode * inode;
+ struct inode *inode;

inode = alloc_inode(sb);
if (inode) {
- struct inode * old;
+ struct inode *old;

spin_lock(&inode_lock);
/* We released the lock, so.. */
@@ -716,13 +716,14 @@ set_failed:
* get_new_inode_fast is the fast path version of get_new_inode, see the
* comment at iget_locked for details.
*/
-static struct inode * get_new_inode_fast(struct super_block *sb,
struct hlist_head *head, unsigned long ino)
+static struct inode *get_new_inode_fast(struct super_block *sb,
+ struct hlist_head *head, unsigned long ino)
{
- struct inode * inode;
+ struct inode *inode;

inode = alloc_inode(sb);
if (inode) {
- struct inode * old;
+ struct inode *old;

spin_lock(&inode_lock);
/* We released the lock, so.. */
@@ -808,7 +809,6 @@ struct inode *igrab(struct inode *inode)
spin_unlock(&inode_lock);
return inode;
}
-
EXPORT_SYMBOL(igrab);

/**
@@ -909,7 +909,6 @@ struct inode *ilookup5_nowait(struct super_block
*sb, unsigned long hashval,

return ifind(sb, head, test, data, 0);
}
-
EXPORT_SYMBOL(ilookup5_nowait);

/**
@@ -938,7 +937,6 @@ struct inode *ilookup5(struct super_block *sb,
unsigned long hashval,

return ifind(sb, head, test, data, 1);
}
-
EXPORT_SYMBOL(ilookup5);

/**
@@ -961,7 +959,6 @@ struct inode *ilookup(struct super_block *sb,
unsigned long ino)

return ifind_fast(sb, head, ino);
}
-
EXPORT_SYMBOL(ilookup);

/**
@@ -1000,7 +997,6 @@ struct inode *iget5_locked(struct super_block
*sb, unsigned long hashval,
*/
return get_new_inode(sb, head, test, set, data);
}
-
EXPORT_SYMBOL(iget5_locked);

/**
@@ -1032,7 +1028,6 @@ struct inode *iget_locked(struct super_block
*sb, unsigned long ino)
*/
return get_new_inode_fast(sb, head, ino);
}
-
EXPORT_SYMBOL(iget_locked);

int insert_inode_locked(struct inode *inode)
@@ -1061,7 +1056,6 @@ int insert_inode_locked(struct inode *inode)
iput(old);
}
}
-
EXPORT_SYMBOL(insert_inode_locked);

int insert_inode_locked4(struct inode *inode, unsigned long hashval,
@@ -1091,7 +1085,6 @@ int insert_inode_locked4(struct inode *inode,
unsigned long hashval,
iput(old);
}
}
-
EXPORT_SYMBOL(insert_inode_locked4);

/**
@@ -1109,7 +1102,6 @@ void __insert_inode_hash(struct inode *inode,
unsigned long hashval)
hlist_add_head(&inode->i_hash, head);
spin_unlock(&inode_lock);
}
-
EXPORT_SYMBOL(__insert_inode_hash);

/**
@@ -1124,7 +1116,6 @@ void remove_inode_hash(struct inode *inode)
hlist_del_init(&inode->i_hash);
spin_unlock(&inode_lock);
}
-
EXPORT_SYMBOL(remove_inode_hash);

/*
@@ -1171,7 +1162,6 @@ void generic_delete_inode(struct inode *inode)
BUG_ON(inode->i_state != I_CLEAR);
destroy_inode(inode);
}
-
EXPORT_SYMBOL(generic_delete_inode);

static void generic_forget_inode(struct inode *inode)
@@ -1218,12 +1208,11 @@ void generic_drop_inode(struct inode *inode)
else
generic_forget_inode(inode);
}
-
EXPORT_SYMBOL_GPL(generic_drop_inode);

/*
* Called when we're dropping the last reference
- * to an inode.
+ * to an inode.
*
* Call the FS "drop()" function, defaulting to
* the legacy UNIX filesystem behaviour..
@@ -1243,7 +1232,7 @@ static inline void iput_final(struct inode *inode)
}

/**
- * iput - put an inode
+ * iput - put an inode
* @inode: inode to put
*
* Puts an inode, dropping its usage count. If the inode use count hits
@@ -1260,7 +1249,6 @@ void iput(struct inode *inode)
iput_final(inode);
}
}
-
EXPORT_SYMBOL(iput);

/**
@@ -1271,10 +1259,10 @@ EXPORT_SYMBOL(iput);
* Returns the block number on the device holding the inode that
* is the disk block number for the block of the file requested.
* That is, asked for block 4 of inode 1 the function will return the
- * disk block relative to the disk start that holds that block of the
+ * disk block relative to the disk start that holds that block of the
* file.
*/
-sector_t bmap(struct inode * inode, sector_t block)
+sector_t bmap(struct inode *inode, sector_t block)
{
sector_t res = 0;
if (inode->i_mapping->a_ops->bmap)
@@ -1377,7 +1365,6 @@ void file_update_time(struct file *file)
mark_inode_dirty_sync(inode);
mnt_drop_write(file->f_path.mnt);
}
-
EXPORT_SYMBOL(file_update_time);

int inode_needs_sync(struct inode *inode)
@@ -1388,7 +1375,6 @@ int inode_needs_sync(struct inode *inode)
return 1;
return 0;
}
-
EXPORT_SYMBOL(inode_needs_sync);

int inode_wait(void *word)
@@ -1461,9 +1447,15 @@ EXPORT_SYMBOL(inode_double_unlock);
static __initdata unsigned long ihash_entries;
static int __init set_ihash_entries(char *str)
{
+ int ret;
+
if (!str)
return 0;
- ihash_entries = simple_strtoul(str, &str, 0);
+
+ ret = strict_strtoul(str, 0, &ihash_entries);
+ if (ret < 0 || ihash_entries == 0)
+ return 0;
+
return 1;
}
__setup("ihash_entries=", set_ihash_entries);
@@ -1540,7 +1532,8 @@ void init_special_inode(struct inode *inode,
umode_t mode, dev_t rdev)
else if (S_ISSOCK(mode))
inode->i_fop = &bad_sock_fops;
else
- printk(KERN_DEBUG "init_special_inode: bogus i_mode (%o)\n",
- mode);
+ printk(KERN_DEBUG "init_special_inode: bogus i_mode (%o) for"
+ " inode %s:%lu\n", mode, inode->i_sb->s_id,
+ inode->i_ino);
}
EXPORT_SYMBOL(init_special_inode);
--
1.5.4.3



Thanks -
Manish

>
>> hermes:~# debugfs /dev/dm-0
>> debugfs 1.41.3 (12-Oct-2008)
>> debugfs: ?stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32sv.dll"
>>
>> Gives the following output:
>>
>> ? Inode: 867 ? Type: bad type ? ?Mode: ?0404 ? Flags: 0x802a61af
>> ? Generation: 2483046020 ? ?Version: 0xb9286359:17a7fdfd
>> ? User: 1455931783 ? Group: -798021131 ? Size: -1808719531
>> ? File ACL: 141934744 ? ?Directory ACL: 0
>> ? Links: 15681 ? Blockcount: 171984001880781
>> ? Fragment: ?Address: 956780679 ? ?Number: 0 ? ?Size: 0
>> ? ?ctime: 0xdca60244:006c5b08 -- Wed Apr 23 01:54:36 2087
>> ? ?atime: 0x5c9e956c:777587a4 -- Sat Mar 30 08:30:12 2019
>> ? ?mtime: 0x2ce44e11:286138f8 -- Sat Nov 13 13:31:37 1993
>> ? crtime: 0x737781cb:5661f351 -- Thu May 22 19:54:11 2031
>> ? dtime: 0xf19c4882 -- Sat Jun 14 11:57:14 2098
>> ? Size of extra inode fields: 3625
>> ? BLOCKS:
>>
>> And when I exit the pager and return to the prompt I also see:
>>
>> ? invalid inode->i_extra_isize (3625)
>
> All garbage. ?Running e2fsck should fix this, though it won't explain
> why it got corrupted in the first place..
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>

2009-03-11 13:26:20

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Wed, Mar 11, 2009 at 12:18:39AM -0600, Andreas Dilger wrote:
> On Mar 11, 2009 12:18 +1030, Kevin Shanahan wrote:
> > On Wed, 2009-03-11 at 12:13 +1030, Kevin Shanahan wrote:
> > >
> > > getfattr: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error
> > >
> > > And syslog shows:
> > > Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
> > > Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
> > >
> > > hermes:~# debugfs /dev/dm-0
> > > debugfs 1.41.3 (12-Oct-2008)
> > > debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
> > >
> > > Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3b9c185
> > > Generation: 3690868 Version: 0x9d36b10d
> > > User: 868313917 Group: -1340283792 Size: 0
> > > File ACL: 0 Directory ACL: 0
> > > Links: 1 Blockcount: 0
> > > Fragment: Address: 0 Number: 0 Size: 0
> > > ctime: 0x0742afc4 -- Sun Nov 11 06:51:24 1973
> > > atime: 0x472a2311 -- Fri Nov 2 05:33:45 2007
> > > mtime: 0x80c59881 -- Fri Jun 18 09:51:21 2038
> > > Size of extra inode fields: 4
> > > BLOCKS:
>
> There isn't anything obvious here that would imply reading a wacky block
> beyond the end of the filesystem. I even checked if e.g. you had quotas
> enabled and the bogus UID/GID would result in the quota file becoming
> astronomically large or something, but the numbers don't seem to match.

More to the point, given that mode bits of the file detected the file
as a named pipe ("Type: FIFO"), it wouldn't have tried to access the
the disk. Trying to read from a named pipe would have resulted in a
hang (assuming no data in the named pipe); writing to named pipe would
have succeeded (and queued the data until another program tried
reading from the named pipe). So getting an I/O error from that file
doesn't make any sense.

> Yes, you should just delete the inodes reported corrupted in your
> earlier postings in the 87x range - they contain nothing of value
> anymore, and I suspect your troubles would be gone. At least we
> wouldn't be left wondering if you are seeing new corruption in
> the same range of blocks, or just leftover badness.

The inodes in question that are on that block would be inode numbers
864 to 879, inclusive. You can get the names of the files in question
using the ncheck command:

debugfs: ncheck 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879

... but at this point, I'm beginning to wonder if what is going on is
something in the I/O stack is occasionally returning random garbage
when you read from the particular block in question. The contents
reported for debugfs for block 875 should not have caused an I/O error
when you tried reading from the file. You can create your own named
pipe by using the command "mknod /tmp/test-fifo p", and playing with
it. So I'm wondering if when the kernel read block 875, it got one
version of garbage, and then when debugfs read block 875 later, it got
another version of garbage.

One of the original inodes involved was 867, right? You might want to
try using the "stat <867>" command and seeing if it still contains
garbage or not. Since that was e2fsck should have deleted for you (or
did you delete it manually yourself?), it should either be all zero's,
or it should contain the same inode garbage you had sent to the list,
but with an i_links_count of zero if you deleting the file via the
"rm" command. If it contains a different version of garbage, then
something is corrupting that block, possibly on the read path or the
write path.

- Ted

P.S. The reason why I asked you about your RAID card was because I
recently ended up spending time helping a user who had flashed LSI
firmware onto a Dell PERCS card, because some forum had stated that
the Dell PERCS card was a rebadged LSI hardware. Turns out that while
the card was made by LSI, it was also customized by Dell, and flashing
the non-approved hardware caused a bug to trigger for devices larger
than 4TB, such that a block near the beginning of the filesystem (in
this case, it was a block group descriptor block, so we could easily
recover from the backup descriptor blocks) was getting trashed after
every boot. I had forgotten that you had said earlier that you only
had a 1TB filesystem, but otherwise the symptoms looked very similar,
so I figured I had to ask.

We're now at the stage where I have to start asking questions about
the storage stack --- i.e. have you used this with this exact
hardware/configuration with ext3, and was it stable there,, have you
made any recent changes to the hardware/configuration, etc., since
this is starting to smell like a potential storage stack problem.

2009-03-11 18:08:51

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Wed, 2009-03-11 at 09:25 -0400, Theodore Tso wrote:
> On Wed, Mar 11, 2009 at 12:18:39AM -0600, Andreas Dilger wrote:
> > On Mar 11, 2009 12:18 +1030, Kevin Shanahan wrote:
> > > On Wed, 2009-03-11 at 12:13 +1030, Kevin Shanahan wrote:
> > > >
> > > > getfattr: apps/Gestalt.Net/SetupCD/program\040files/Business\040Objects/Common/3.5/bin/RptControllers.dll: Input/output error
> > > >
> > > > And syslog shows:
> > > > Mar 11 00:06:24 hermes kernel: attempt to access beyond end of device
> > > > Mar 11 00:06:24 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
> > > >
> > > > hermes:~# debugfs /dev/dm-0
> > > > debugfs 1.41.3 (12-Oct-2008)
> > > > debugfs: stat "local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll"
> > > >
> > > > Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3b9c185
> > > > Generation: 3690868 Version: 0x9d36b10d
> > > > User: 868313917 Group: -1340283792 Size: 0
> > > > File ACL: 0 Directory ACL: 0
> > > > Links: 1 Blockcount: 0
> > > > Fragment: Address: 0 Number: 0 Size: 0
> > > > ctime: 0x0742afc4 -- Sun Nov 11 06:51:24 1973
> > > > atime: 0x472a2311 -- Fri Nov 2 05:33:45 2007
> > > > mtime: 0x80c59881 -- Fri Jun 18 09:51:21 2038
> > > > Size of extra inode fields: 4
> > > > BLOCKS:
> >
> > There isn't anything obvious here that would imply reading a wacky block
> > beyond the end of the filesystem. I even checked if e.g. you had quotas
> > enabled and the bogus UID/GID would result in the quota file becoming
> > astronomically large or something, but the numbers don't seem to match.
>
> More to the point, given that mode bits of the file detected the file
> as a named pipe ("Type: FIFO"), it wouldn't have tried to access the
> the disk. Trying to read from a named pipe would have resulted in a
> hang (assuming no data in the named pipe); writing to named pipe would
> have succeeded (and queued the data until another program tried
> reading from the named pipe). So getting an I/O error from that file
> doesn't make any sense.

But getfattr isn't going to cause a read from the pipe is it? I would
expect that to cause a read from the disk.

> > Yes, you should just delete the inodes reported corrupted in your
> > earlier postings in the 87x range - they contain nothing of value
> > anymore, and I suspect your troubles would be gone. At least we
> > wouldn't be left wondering if you are seeing new corruption in
> > the same range of blocks, or just leftover badness.
>
> The inodes in question that are on that block would be inode numbers
> 864 to 879, inclusive. You can get the names of the files in question
> using the ncheck command:
>
> debugfs: ncheck 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879

debugfs: ncheck 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879
Inode Pathname
864 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32pl.dll
875 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll

I haven't rm'd the files just yet, but good to know which ones they are.

> ... but at this point, I'm beginning to wonder if what is going on is
> something in the I/O stack is occasionally returning random garbage
> when you read from the particular block in question. The contents
> reported for debugfs for block 875 should not have caused an I/O error
> when you tried reading from the file. You can create your own named
> pipe by using the command "mknod /tmp/test-fifo p", and playing with
> it.

Okay, a quick check shows that getfattr on the test-fifo file doesn't
cause an I/O error and it doesn't block either. A straight read from the
pipe does block of course.

> So I'm wondering if when the kernel read block 875, it got one
> version of garbage, and then when debugfs read block 875 later, it got
> another version of garbage.
>
> One of the original inodes involved was 867, right? You might want to
> try using the "stat <867>" command and seeing if it still contains
> garbage or not. Since that was e2fsck should have deleted for you (or
> did you delete it manually yourself?), it should either be all zero's,
> or it should contain the same inode garbage you had sent to the list,
> but with an i_links_count of zero if you deleting the file via the
> "rm" command. If it contains a different version of garbage, then
> something is corrupting that block, possibly on the read path or the
> write path.

debugfs: stat <867>

Inode: 867 Type: bad type Mode: 0404 Flags: 0x0
Generation: 2483046020 Version: 0x17a7fdfd
User: 1455931783 Group: -798021131 Size: 0
File ACL: 0 Directory ACL: 0
Links: 0 Blockcount: 0
Fragment: Address: 956780679 Number: 0 Size: 0
ctime: 0xdca60244 -- Wed Apr 23 01:54:36 2087
atime: 0x5c9e956c -- Sat Mar 30 08:30:12 2019
mtime: 0x2ce44e11 -- Sat Nov 13 13:31:37 1993
dtime: 0x49b6564f -- Tue Mar 10 22:30:15 2009
Size of extra inode fields: 4
BLOCKS:
(0):1487030929, (1):3739364871, (2):16299385, (3):2955804704,
(4):3028301176, (5):3255403360, (6):4066441585, (7):643698920,
(8):377498450, (9):297332775, (10):2206476866, (11):169813600,
(IND):2885921245, (DIND):1077961371, (TIND):3308808842
TOTAL: 15

Looks like fsck cleaned up a number of the fields, but not all zeroed.
It seems to have gained some blocks too, but I guess that is meaningless
for an unlinked inode?

> We're now at the stage where I have to start asking questions about
> the storage stack --- i.e. have you used this with this exact
> hardware/configuration with ext3, and was it stable there,, have you
> made any recent changes to the hardware/configuration, etc., since
> this is starting to smell like a potential storage stack problem.

The same hardware has been stable with an ext3 filesystem for about 9
months before this. The only other change that I made at the same time
as moving to ext4 was reconfiguring the RAID from what was originally a
10-disk RAID6 array to a 9-disk RAID6 array with room for one hot spare
(though that spare is not installed currently).

In case it matters, the change to ext4 was done by creating a new
filesystem and rsync'ing the data across, rather than upgrading an
existing ext3 filesystem.

Cheers,
Kevin.



2009-03-13 00:55:29

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Thu, Mar 12, 2009 at 04:37:10AM +1030, Kevin Shanahan wrote:
> But getfattr isn't going to cause a read from the pipe is it? I would
> expect that to cause a read from the disk.

Ah, yes, the getfattr would have caused a read from disk. I missed
that the I/O error could have been caused by that.

Still, e2fsck should have cleared the acl block if it was illegal the
last time that you ran a full fsck on the filesystem.

> Inode Pathname
> 864 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32pl.dll
> 875 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll

Well, it's likely those files are corrupted, so you might as well
delete them and restore from backup if needed/appropriate/possible.

> > One of the original inodes involved was 867, right? You might want to
> > try using the "stat <867>" command and seeing if it still contains
> > garbage or not. Since that was e2fsck should have deleted for you (or
> > did you delete it manually yourself?), it should either be all zero's,
> > or it should contain the same inode garbage you had sent to the list,
> > but with an i_links_count of zero if you deleting the file via the
> > "rm" command. If it contains a different version of garbage, then
> > something is corrupting that block, possibly on the read path or the
> > write path.
>
> debugfs: stat <867>
>
> Inode: 867 Type: bad type Mode: 0404 Flags: 0x0
> Generation: 2483046020 Version: 0x17a7fdfd
> User: 1455931783 Group: -798021131 Size: 0
> File ACL: 0 Directory ACL: 0
> Links: 0 Blockcount: 0
> Fragment: Address: 956780679 Number: 0 Size: 0
> ctime: 0xdca60244 -- Wed Apr 23 01:54:36 2087
> atime: 0x5c9e956c -- Sat Mar 30 08:30:12 2019
> mtime: 0x2ce44e11 -- Sat Nov 13 13:31:37 1993
> dtime: 0x49b6564f -- Tue Mar 10 22:30:15 2009
> Size of extra inode fields: 4
> BLOCKS:
> (0):1487030929, (1):3739364871, (2):16299385, (3):2955804704,
> (4):3028301176, (5):3255403360, (6):4066441585, (7):643698920,
> (8):377498450, (9):297332775, (10):2206476866, (11):169813600,
> (IND):2885921245, (DIND):1077961371, (TIND):3308808842
> TOTAL: 15
>
> Looks like fsck cleaned up a number of the fields, but not all zeroed.
> It seems to have gained some blocks too, but I guess that is meaningless
> for an unlinked inode?

It's meaningless for an unlinked inode, but it still shouldn't have
happened. I'm not sure how it could have happened in the first place.

Hmmm, I really don't know what to tell you; at this point probably the
best thing to do is to delete the last inodes used in that inode
table block, and then keep a watchful eye on the filesystem.

- Ted

2009-03-13 14:28:35

by Kevin Shanahan

[permalink] [raw]
Subject: Re: Possible ext4 corruption - ACL related?

On Thu, 2009-03-12 at 20:55 -0400, Theodore Tso wrote:
> > Inode Pathname
> > 864 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/Cdo32pl.dll
> > 875 /local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll
>
> Well, it's likely those files are corrupted, so you might as well
> delete them and restore from backup if needed/appropriate/possible.

Hrm, deleting these files resulted in:

EXT4-fs error (device dm-0): ext4_xattr_delete_inode: inode 875: block 118279104364544 read error
Aborting journal on device dm-0:8.
Remounting filesystem read-only

Unmounting resulted in:

EXT4-fs error (device dm-0) in ext4_free_inode: Journal has aborted
EXT4-fs: mballoc: 6695972 blocks 38497 reqs (34673 success)
EXT4-fs: mballoc: 4434 extents scanned, 37231 goal hits, 946 2^N hits, 0 breaks, 0 lost
EXT4-fs: mballoc: 750 generated and it took 6082776
EXT4-fs: mballoc: 4997239 preallocated, 120774 discarded
ext4_abort called.
EXT4-fs error (device dm-0): ext4_put_super: Couldn't clean up the journal

hermes:~# e2fsck -pfv /dev/dm-0
/dev/dm-0: recovering journal
/dev/dm-0: Group descriptor 768 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 769 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 770 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 771 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 772 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 773 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 774 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 775 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 776 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 777 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 778 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 779 checksum is invalid. FIXED.
/dev/dm-0: Note: if several inode or block bitmap blocks or part
of the inode table require relocation, you may wish to try
running e2fsck with the '-b 32768' option first. The problem
may lie only with the primary block group descriptors, and
the backup block group descriptors may be OK.

/dev/dm-0: Inode bitmap for group 780 is not in group. (block 339410944)


/dev/dm-0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
(i.e., without -a or -p options)

hermes:~# e2fsck -pfv -b 32768 /dev/dm-0
/dev/dm-0: Group descriptor 0 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 1 checksum is invalid. FIXED.
/dev/dm-0: Group descriptor 2 checksum is invalid. FIXED.
....
(snip messages for every group descriptor in sequence)
....
/dev/dm-0: Group descriptor 8191 checksum is invalid. FIXED.

289080 inodes used (0.43%)
11818 non-contiguous inodes (4.1%)
# of inodes with ind/dind/tind blocks: 0/0/0
Extent depth histogram: 288728/287
40783849 blocks used (15.19%)
0 bad blocks
2 large files

263218 regular files
25794 directories
0 character device files
0 block device files
1 fifo
0 links
58 symbolic links (54 fast symbolic links)
0 sockets
--------
289071 files

I guess there was some additional badness that e2fsck hadn't picked up
the on previous runs. And dammit, there's still something wrong.
Attempting to copy the backup files back in, the cp command is hanging
and I got another one of these:

attempt to access beyond end of device
dm-0: rw=0, want=946232834916360, limit=2147483648

Okay, I guess the hang has something to do with the stray file which
turned itself into a named pipe. Trying to remove it again:

hermes:~# rm '/srv/samba/local/apps/Gestalt.Net/SetupCD/program files/Business Objects/Common/3.5/bin/RptControllers.dll'

And I get:

Mar 14 00:48:34 hermes kernel: attempt to access beyond end of device
Mar 14 00:48:34 hermes kernel: dm-0: rw=0, want=946232834916360, limit=2147483648
Mar 14 00:48:34 hermes kernel: EXT4-fs error (device dm-0): ext4_xattr_delete_inode: inode 875: block 118279104364544 read error
Mar 14 00:48:34 hermes kernel: Aborting journal on device dm-0:8.
Mar 14 00:48:34 hermes kernel: Remounting filesystem read-only
Mar 14 00:48:34 hermes kernel: EXT4-fs error (device dm-0) in ext4_free_inode: Journal has aborted

Where is this request for block 118279104364544 coming from?

hermes:~# debugfs /dev/dm-0
debugfs 1.41.3 (12-Oct-2008)
debugfs: stat <875>

Inode: 875 Type: FIFO Mode: 0611 Flags: 0xb3a9c185
Generation: 3690868 Version: 0x00000000:9d36b10d
User: 27453 Group: 58480 Size: 0
File ACL: 0 Directory ACL: 0
Links: 0 Blockcount: 0
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x49ba6b3a:5a58878c -- Sat Mar 14 00:48:34 2009
atime: 0x472a2311:00000000 -- Fri Nov 2 05:33:45 2007
mtime: 0x80c59881:ffffffff -- Fri Jun 18 09:51:21 2038
crtime: 0x49a6c1d1:5f76c580 -- Fri Feb 27 02:52:41 2009
dtime: 0x49ba6b3a -- Sat Mar 14 00:48:34 2009
Size of extra inode fields: 28
BLOCKS:

Does having this reliable way to reproduce the bug help?

Cheers,
Kevin.