2008-11-22 17:39:24

by Andreas Dilger

[permalink] [raw]
Subject: Re: Error Propagation bugs in ext4

On Oct 31, 2008 20:19 -0500, Cindy Rubio Gonz�lez wrote:
> I am a graduate student at the University of Wisconsin-Madison. My
> advisor (Prof. Ben Liblit) and I have been working on performing
> static analysis to find error propagation bugs in Linux file system
> implementations. ext4 is one of our case studies. We consider three
> general scenarios in which unchecked error are commonly lost: the
> variable holding the unchecked error value (1) is overwritten with a
> new value, (2) goes out of scope, or (3) is returned by a function but
> not saved by the caller. Our tool produces detailed witness traces
> that illustrate each bug reported.

I apologize for not getting back to you on this sooner, I've been very
busy and barely able to keep up on my own work for the past few weeks.

> Slice:
> fs/jbd2/transaction.c:105:"ret" receives an error from "TENTATIVE_ENOMEM"
> fs/jbd2/transaction.c:241: an unchecked error may be returned
> fs/jbd2/transaction.c:427:"ret" receives an error from function
> "start_this_handle"
> fs/jbd2/transaction.c:428: an unchecked error may be returned
> fs/ext4/ext4_jbd2.h:188:"cabs2cil_" receives an error from function
> "jbd2_journal_restart"
> fs/ext4/ext4_jbd2.h:188: an unchecked error may be returned
> fs/ext4/extents.c:104:"cabs2cil_" receives an error from function
> "ext4_journal_restart"
> fs/ext4/extents.c:104: an unchecked error may be returned
> fs/ext4/extents.c:1933:"err" receives an error from function
> "ext4_ext_journal_restart"
> fs/ext4/extents.c:1980: an unchecked error may be returned

In this particular example, it appears that there is a logic error
in the simulator. This is the fragment of code that is in question:

2037 while (i >= 0 && err == 0) {
2038 if (i == depth) {
2039 /* this is leaf block */
2040 err = ext4_ext_rm_leaf(handle, inode, path, start);
2041 /* root level has p_bh == NULL, brelse() eats this */
2042 brelse(path[i].p_bh);
2043 path[i].p_bh = NULL;
2044 i--;
2045 continue;
2046 }

> fs/ext4/extents.c:2040:"err" receives an error from "ext4_ext_rm_leaf"
2040 err = ext4_ext_rm_leaf(handle, inode, path, start);
> fs/ext4/extents.c:2042:"err" may have an unchecked error
2042 brelse(path[i].p_bh);
> include/linux/buffer_head.h:270:"err" may have an unchecked error
2043 path[i].p_bh = NULL;
> fs/ext4/extents.c:2043:"err" may have an unchecked error
2043 path[i].p_bh = NULL;
> fs/ext4/extents.c:2044:"err" may have an unchecked error
2044 i--;
> fs/ext4/extents.c:2045:"err" may have an unchecked error
2045 continue;
> fs/ext4/extents.c:2037:"err" may have an unchecked error
2037 while (i >= 0 && err == 0) {
> fs/ext4/extents.c:2037:"err" may have an unchecked error
> fs/ext4/extents.c:2037:"err" may have an unchecked error
> fs/ext4/extents.c:2038:"err" may have an unchecked error
:
:
> fs/ext4/extents.c:2075:"err" may have an unchecked error
2075 bh = sb_bread(sb, idx_pblock(path[i].p_idx));
> fs/ext4/extents.c:2076:"err" may have an unchecked error
2076 if (!bh) {
> fs/ext4/extents.c:2078: overwriting potential non-tentative unchecked
> error in "err"
2078 err = -EIO;
> ====

It is clear that this loop should terminate because "err == 0" is now
false. Not only did the simulator fail to detect that the error in
"err" was checked at this point, the simulator execution flow incorrectly
shows the loop continuing until line 2078 where "err" is ostensibly
overwritten, but that is not what would actually happen.


I haven't looked at the other failure cases, because I suspect there
may be some similar false positives and I'd rather that the software
detect them than me.

That said, I'm a HUGE fan of static code analysis, and I'd be happy if
you could re-run your analysis and forward the results to the list again.
I think this is an excellent way of detecting bugs that happen rarely
enough that there are no good reproduction cases, but likely hit users
on occasion and are otherwise impossible to diagnose.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.