2015-09-11 03:03:59

by Chris Hunter

Subject: e2fsck discrepancy with debugfs stat ?

Hi,
Are there scenarios where e2fsck will report a deleted/unused inode but
debugfs is able to read the inode structure ?

Some details:
I am using the Lustre version of e2fsprogs (1.42.12.wc1). When I run e2fsck
in no-fix (dry-run) mode on a block device, I get errors about unused inodes,
e.g.:
> $ e2fsck -nfv <DEV>
> e2fsck 1.42.12.wc1 (15-Sep-2014)
> Warning: skipping journal recovery because doing a read-only filesystem check.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Entry '131249395' in /O/0/d19 (118843419) has deleted/unused inode 5671802. Clear? no
etc...

However, when I run the command debugfs -c -R "stat /O/0/d19/131249395"
<DEV>, I can retrieve the inode contents. Furthermore, the debugfs "dump"
command successfully pulls out the contents (980 bytes) of the file entry.

thanks,
chris hunter
[email protected]


2015-09-11 12:39:07

by Chris Hunter

Subject: Re: e2fsck discrepancy with debugfs stat ?

To follow-up on my own question,

The "dry-run" e2fsck does not replay the ext journal. So my discrepancy
(e2fsck reports an unused/deleted inode, but debugfs is able to access the
inode and file data) could be due to uncommitted journal transactions.
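
One way to check whether a read-only e2fsck run skipped journal recovery is to look at the needs_recovery feature flag in the superblock. The following is a minimal sketch, assuming the standard ext2/3/4 on-disk layout (primary superblock at byte 1024, s_magic at offset 56, s_feature_incompat at offset 96, needs_recovery bit 0x0004). The function names are hypothetical and are not part of e2fsprogs.

```python
import struct

SUPERBLOCK_OFFSET = 1024      # primary superblock starts 1 KiB into the device
EXT_MAGIC = 0xEF53            # expected value of s_magic
INCOMPAT_RECOVER = 0x0004     # "needs_recovery" bit in s_feature_incompat


def journal_needs_recovery(superblock: bytes) -> bool:
    """Return True if the needs_recovery flag is set in a raw superblock."""
    if len(superblock) < 100:
        raise ValueError("superblock buffer too short")
    (magic,) = struct.unpack_from("<H", superblock, 56)      # s_magic
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    (incompat,) = struct.unpack_from("<I", superblock, 96)   # s_feature_incompat
    return bool(incompat & INCOMPAT_RECOVER)


def read_superblock(path: str) -> bytes:
    """Read the 1 KiB primary superblock from a device or image file."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET)
        return dev.read(1024)
```

If journal_needs_recovery(read_superblock("<DEV>")) returns True, the journal still holds transactions that have not been written back, so a check that skips replay may be looking at stale metadata.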

regards,
chris hunter
[email protected]


2015-09-11 17:55:52

by Theodore Ts'o

Subject: Re: e2fsck discrepancy with debugfs stat ?

On Thu, Sep 10, 2015 at 11:03:57PM -0400, Chris Hunter wrote:
> Are there scenarios where e2fsck will report a deleted/unused inode but
> debugfs is able to read the inode structure ?

When e2fsck reports that an inode is deleted/unused, it means that the
i_links_count field in the inode is zero. If that happens, it's
possible that the blocks previously associated with the inode have been
reassigned, and so may contain someone else's love letters, medical
records, etc. That is why the ext4 file system will report a corruption
and not allow you to read the inode, and why e2fsck assumes that the
appropriate resolution to the problem is to clear the directory entry.

(After all, you wouldn't want to accidentally commit a HIPAA
violation, when fines for violations range from $100 to $50,000 per
record, would you? Not to mention potentially getting lots of terrible
Yelp reviews. :-)

> However when I run command debugfs -c -R "stat /O/0/d19/131249395" <DEV>, I
> can retrieve inode contents. Further debugfs "dump" will successfully pull
> the contents (980 bytes) of the file entry.

You can do anything with debugfs. Debugfs doesn't care if the
i_links_count field is zero, so it will happily return whatever might
be pointed to by that inode.
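
To make the i_links_count test concrete, here is a minimal sketch that decodes the relevant fields from a raw on-disk inode. It assumes the classic ext2/ext4 inode layout (little-endian, i_mode at offset 0, i_dtime at offset 20, i_links_count at offset 26). The function name and the "looks_deleted" key are hypothetical, and the check is a simplification of what e2fsck actually does.

```python
import struct

I_MODE_OFFSET = 0           # 16-bit file mode
I_DTIME_OFFSET = 20         # 32-bit deletion time
I_LINKS_COUNT_OFFSET = 26   # 16-bit hard link count


def inode_summary(raw_inode: bytes) -> dict:
    """Decode a few fields from a raw inode (at least 128 bytes)."""
    if len(raw_inode) < 128:
        raise ValueError("raw inode must be at least 128 bytes")
    (mode,) = struct.unpack_from("<H", raw_inode, I_MODE_OFFSET)
    (dtime,) = struct.unpack_from("<I", raw_inode, I_DTIME_OFFSET)
    (links,) = struct.unpack_from("<H", raw_inode, I_LINKS_COUNT_OFFSET)
    return {
        "mode": mode,
        "dtime": dtime,
        "links_count": links,
        # Simplified rule: a zero link count means deleted/unused.
        "looks_deleted": links == 0,
    }
```

The block pointers in the inode stay readable whether or not links_count is zero, which is why a tool that ignores the link count can still dump data from such an inode.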

In terms of what might cause this: unless someone has been
manipulating file system structures using debugfs (for example,
"set_inode_field /O/0/d19/131249395 i_links_count 0"), it shouldn't
happen, modulo hardware or software malfunctions / bugs. For example,
most consumer-grade SSDs aren't power-failure protected. If such an SSD
doesn't make sure its own metadata is saved to stable storage when
power is lost, the underlying file system can get corrupted even if the
file system is properly using cache flush commands.

- Ted


2015-09-11 19:59:55

by Chris Hunter

Subject: Re: e2fsck discrepancy with debugfs stat ?

Hi Theodore, thanks for the reply.

Most of the e2fsck errors appear to have been uncommitted journal
transactions. After stopping filesystem activity (hopefully to clear the
journal), a second dry-run e2fsck produced a much shorter list of errors.

FYI, the debugfs "stat" command reported an entry "Links: 1 Blockcount: 8".
Is that "Links" value the same as the i_links_count field?

thanks,
chris hunter
[email protected]



2015-09-12 02:09:17

by Theodore Ts'o

Subject: Re: e2fsck discrepancy with debugfs stat ?

On Fri, Sep 11, 2015 at 03:59:50PM -0400, Chris Hunter wrote:
> Hi Theodore, Thanks for the reply.
>
> Most of the e2fsck errors appear to have been uncommitted journal
> transactions. After stopping filesystem activity (hopefully to clear
> journal) a second dry-run e2fsck produced a much shorter list of errors.

Oh, if you are trying to run e2fsck on a mounted file system: don't do
that. The results will be very confusing.
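
A quick guard against checking a mounted device can be sketched as follows, assuming a Linux system where /proc/mounts lists one mount per line with the device in the first field and the mount point in the second. The function names are hypothetical. Note that the device may be listed under a different name than the one you pass (for example a device-mapper alias), so an empty result is not proof that the device is unmounted.

```python
def mounted_at(device: str, mounts_text: str) -> list:
    """Return the mount points listed for `device` in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device name, field 1 is the mount point.
        if len(fields) >= 2 and fields[0] == device:
            points.append(fields[1])
    return points


def is_mounted(device: str, mounts_path: str = "/proc/mounts") -> bool:
    """Return True if `device` appears by that exact name in the mounts table."""
    with open(mounts_path) as mounts:
        return bool(mounted_at(device, mounts.read()))
```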

> FYI, There was an entry "Links: 1 Blockcount: 8" reported by debugfs "stat"
> command is that same as i_links_count field ?

Yes.

- Ted
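
Since the "Links:" value printed by debugfs "stat" is the i_links_count field, it can be pulled out of captured output with a small parser. This is a minimal sketch that assumes the output contains a line of the form quoted in this thread ("Links: 1 Blockcount: 8"). The function name is hypothetical and the exact spacing may differ between e2fsprogs versions.

```python
import re

# Matches e.g. "Links: 1   Blockcount: 8" with any amount of whitespace.
_LINKS_RE = re.compile(r"\bLinks:\s*(\d+)\s+Blockcount:\s*(\d+)")


def parse_links_line(stat_output: str):
    """Return (links_count, blockcount) from debugfs stat text, or None."""
    match = _LINKS_RE.search(stat_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A links_count of 0 from this parser would correspond to the deleted/unused case that e2fsck complains about in pass 2.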