2012-09-10 21:55:54

by Carlos Maiolino

[permalink] [raw]
Subject: [PATCH v2] ext4: fix possible non-initialized variable

htree_dirblock_to_tree() declares a non-initialized 'err' variable, which is
passed as a reference to another functions expecting them to set this variable
with thei error codes.
It's passed to ext4_bread(), which then passes it to ext4_getblk(). If
ext4_map_blocks() returns 0 due to a lookup failure, leaving the ext4_getblk()
buffer_head uninitialized, it will make ext4_getblk() return to ext4_bread()
without initialize the 'err' variable, and ext4_bread() will return to
htree_dirblock_to_tree() with this variable still uninitialized.
htree_dirblock_to_tree() will pass this variable with garbage back to
ext4_htree_fill_tree(), which expects a number of directory entries added to the
rb-tree. which, in case, might return a fake non-zero value due the garbage left
in the 'err' variable, leading the kernel to an Oops in ext4_dx_readdir(), once
this is expecting a filled rb-tree node, when in turn it will have a NULL-ed
one, causing an invalid page request when trying to get a fname struct from
this NULL-ed rb-tree node in this line:

fname = rb_entry(info->curr_node, struct fname, rb_hash);

The patch itself initializes the err variable in htree_dirblock_to_tree() to
avoid usage mistakes by the called functions, and also fix ext4_getblk() to
return a initialized 'err' variable when ext4_map_blocks() fails a
lookup.

Signed-off-by: Carlos Maiolino <[email protected]>
---
fs/ext4/inode.c | 4 +++-
fs/ext4/namei.c | 2 +-
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dff171c..41079f4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -732,11 +732,13 @@ struct buffer_head *ext4_getblk(handle_t *handle, struct inode *inode,
err = ext4_map_blocks(handle, inode, &map,
create ? EXT4_GET_BLOCKS_CREATE : 0);

+ /* ensure we send some value back into *errp */
+ *errp = 0;
+
if (err < 0)
*errp = err;
if (err <= 0)
return NULL;
- *errp = 0;

bh = sb_getblk(inode->i_sb, map.m_pblk);
if (!bh) {
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 2a42cc0..262f863 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -839,7 +839,7 @@ static int htree_dirblock_to_tree(struct file *dir_file,
{
struct buffer_head *bh;
struct ext4_dir_entry_2 *de, *top;
- int err, count = 0;
+ int err = 0, count = 0;

dxtrace(printk(KERN_INFO "In htree dirblock_to_tree: block %lu\n",
(unsigned long)block));
--
1.7.11.4



2012-09-15 18:30:30

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Mon, Sep 10, 2012 at 11:55:48AM -0000, Carlos Maiolino wrote:
> htree_dirblock_to_tree() declares a non-initialized 'err' variable,
> which is passed as a reference to another functions expecting them
> to set this variable with thei error codes. It's passed to
> ext4_bread(), which then passes it to ext4_getblk(). If
> ext4_map_blocks() returns 0 due to a lookup failure, leaving the
> ext4_getblk() buffer_head uninitialized, it will make ext4_getblk()
> return to ext4_bread() without initialize the 'err' variable, and
> ext4_bread() will return to htree_dirblock_to_tree() with this
> variable still uninitialized.

Hi Carlos,

Thanks for noticing this problem!

In the case where there is no block mapping for a particular block,
ext4_bread() can return NULL, and with your patch, *err will now be
zero instead of some uninitialized value. That's an improvement, and
in the case of htree_dirblock_to_tree(), when we return 0 as an
"error", the caller will do the right thing.

But there are other places where when ext4_bread() returns NULL with
err set to 0, the function ends up returning err, i.e., in ext4_add_entry:

bh = ext4_bread(handle, dir, block, 0, &retval);
if(!bh)
return retval;

... which will cause the caller of ext4_add_entry() to think that the
function had succeeded.

In the case of directories, there is never supposed to "holes" in
directories, so the right thing to do is to check to see if err = 0
and in that case to call ext4_error() to mark the file system as being
inconsistent, and then returning some kind of error like -EIO.

So your patch is an improvement, but I'm worried that there were cases
where we had been returning some uninitialized, non-zero stack
garbage, we had been serendipously treating the case of a directory
hole as an "error", now we we consider that situation as a "success"
even though the calling function (such as ext4_add_entry) had not
completed its processing. Which is a very long-winded way of saying
that we need to audit all of the functions which call ext4_bread() so
that they do the right thing when ext4_bread() returns NULL and err is
set to zero.

Regards,

- Ted

2012-09-17 14:55:55

by Eric Sandeen

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On 9/15/12 1:30 PM, Theodore Ts'o wrote:
> On Mon, Sep 10, 2012 at 11:55:48AM -0000, Carlos Maiolino wrote:
>> htree_dirblock_to_tree() declares a non-initialized 'err' variable,
>> which is passed as a reference to another functions expecting them
>> to set this variable with thei error codes. It's passed to
>> ext4_bread(), which then passes it to ext4_getblk(). If
>> ext4_map_blocks() returns 0 due to a lookup failure, leaving the
>> ext4_getblk() buffer_head uninitialized, it will make ext4_getblk()
>> return to ext4_bread() without initialize the 'err' variable, and
>> ext4_bread() will return to htree_dirblock_to_tree() with this
>> variable still uninitialized.
>
> Hi Carlos,
>
> Thanks for noticing this problem!
>
> In the case where there is no block mapping for a particular block,
> ext4_bread() can return NULL, and with your patch, *err will now be
> zero instead of some uninitialized value. That's an improvement, and
> in the case of htree_dirblock_to_tree(), when we return 0 as an
> "error", the caller will do the right thing.

Hm, sorry, I had counseled Carlos to do that. I figured a bmap
call w/o create set, returning a NULL bh was perfectly valid - it simply
means that it's not mapped there, right? - so a 0 retval made sense
to me.

> But there are other places where when ext4_bread() returns NULL with
> err set to 0, the function ends up returning err, i.e., in ext4_add_entry:
>
> bh = ext4_bread(handle, dir, block, 0, &retval);
> if(!bh)
> return retval;
>
> ... which will cause the caller of ext4_add_entry() to think that the
> function had succeeded.
>
> In the case of directories, there is never supposed to "holes" in
> directories, so the right thing to do is to check to see if err = 0
> and in that case to call ext4_error() to mark the file system as being
> inconsistent, and then returning some kind of error like -EIO.

Hm good point. Yeah, callers need to understand what that means.

> So your patch is an improvement, but I'm worried that there were cases
> where we had been returning some uninitialized, non-zero stack
> garbage, we had been serendipously treating the case of a directory
> hole as an "error", now we we consider that situation as a "success"
> even though the calling function (such as ext4_add_entry) had not
> completed its processing. Which is a very long-winded way of saying
> that we need to audit all of the functions which call ext4_bread() so
> that they do the right thing when ext4_bread() returns NULL and err is
> set to zero.

I agree with that. :)

-Eric

> Regards,
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


2012-09-17 15:30:49

by Eric Sandeen

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On 9/17/12 9:55 AM, Eric Sandeen wrote:
> On 9/15/12 1:30 PM, Theodore Ts'o wrote:
>> On Mon, Sep 10, 2012 at 11:55:48AM -0000, Carlos Maiolino wrote:
>>> htree_dirblock_to_tree() declares a non-initialized 'err' variable,
>>> which is passed as a reference to another functions expecting them
>>> to set this variable with thei error codes. It's passed to
>>> ext4_bread(), which then passes it to ext4_getblk(). If
>>> ext4_map_blocks() returns 0 due to a lookup failure, leaving the
>>> ext4_getblk() buffer_head uninitialized, it will make ext4_getblk()
>>> return to ext4_bread() without initialize the 'err' variable, and
>>> ext4_bread() will return to htree_dirblock_to_tree() with this
>>> variable still uninitialized.
>>
>> Hi Carlos,
>>
>> Thanks for noticing this problem!
>>
>> In the case where there is no block mapping for a particular block,
>> ext4_bread() can return NULL, and with your patch, *err will now be
>> zero instead of some uninitialized value. That's an improvement, and
>> in the case of htree_dirblock_to_tree(), when we return 0 as an
>> "error", the caller will do the right thing.
>
> Hm, sorry, I had counseled Carlos to do that. I figured a bmap
> call w/o create set, returning a NULL bh was perfectly valid - it simply
> means that it's not mapped there, right? - so a 0 retval made sense
> to me.

fwiw, the uninit variable came about as part of
2ed886852adfcb070bf350e66a0da0d98b2f3ab5; before that we happily returned
0 for an unmapped block; see below. So unless something else has changed
since then, Carlos' patch shouldn't be doing any harm, at least. An
audit may be in order but anyone misunderstanding a NULL/0 return has probably
been that way for a while.

struct buffer_head *ext4_getblk(handle_t *handle, struct inode *inode,
ext4_lblk_t block, int create, int *errp)
{
struct buffer_head dummy;
int fatal = 0, err;
int flags = 0;

J_ASSERT(handle != NULL || create == 0);

dummy.b_state = 0;
dummy.b_blocknr = -1000;
buffer_trace_init(&dummy.b_history);
if (create)
flags |= EXT4_GET_BLOCKS_CREATE;
err = ext4_get_blocks(handle, inode, block, 1, &dummy, flags);
/*
* ext4_get_blocks() returns number of blocks mapped. 0 in
* case of a HOLE.
*/
if (err > 0) {
if (err > 1)
WARN_ON(1);
err = 0;
}
*errp = err;




2012-09-17 15:37:10

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Mon, Sep 17, 2012 at 10:30:46AM -0500, Eric Sandeen wrote:
> fwiw, the uninit variable came about as part of
> 2ed886852adfcb070bf350e66a0da0d98b2f3ab5; before that we happily
> returned 0 for an unmapped block; see below. So unless something
> else has changed since then, Carlos' patch shouldn't be doing any
> harm, at least. An audit may be in order but anyone
> misunderstanding a NULL/0 return has probably been that way for a
> while.

Hmm, good point.

This is an audit that needs to happen for ext3 and ext4 as well, BTW
--- the callers of ext3_getblk() don't do the right thing when
ext3_getblk returns NULL and sets *errp to zero.

Fortunately it's rare that we have directories with holes, but there
are definitely bugs in terms of undefined behavior in the case of
directories with holes that we should look at and fix.

- Ted


2012-09-17 18:26:08

by Carlos Maiolino

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Mon, Sep 17, 2012 at 11:37:01AM -0400, Theodore Ts'o wrote:
> On Mon, Sep 17, 2012 at 10:30:46AM -0500, Eric Sandeen wrote:
> > fwiw, the uninit variable came about as part of
> > 2ed886852adfcb070bf350e66a0da0d98b2f3ab5; before that we happily
> > returned 0 for an unmapped block; see below. So unless something
> > else has changed since then, Carlos' patch shouldn't be doing any
> > harm, at least. An audit may be in order but anyone
> > misunderstanding a NULL/0 return has probably been that way for a
> > while.
>
> Hmm, good point.
>
> This is an audit that needs to happen for ext3 and ext4 as well, BTW
> --- the callers of ext3_getblk() don't do the right thing when
> ext3_getblk returns NULL and sets *errp to zero.
>
> Fortunately it's rare that we have directories with holes, but there
> are definitely bugs in terms of undefined behavior in the case of
> directories with holes that we should look at and fix.
>
> - Ted
>
Ted,

based on this conversation, is there anything else I should do to have this
patch accepted?

btw, I can review the callers for ext4_bread() and ext3_getblk in order to audit
it, is that ok for you?
--
--Carlos

2012-09-18 03:59:34

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Mon, Sep 17, 2012 at 03:26:04PM -0300, Carlos Maiolino wrote:
>
> based on this conversation, is there anything else I should do to have this
> patch accepted?

OK, I've applied this patch to my tree. However, I would really
appreciate if you could indeed review the callers for ext4_bread() and
ext3_bread() and send patches for ext3 and ext4 as you have suggested,
thanks!!

- Ted

2012-09-18 12:51:58

by Carlos Maiolino

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Mon, Sep 17, 2012 at 11:59:31PM -0400, Theodore Ts'o wrote:
> On Mon, Sep 17, 2012 at 03:26:04PM -0300, Carlos Maiolino wrote:
> >
> > based on this conversation, is there anything else I should do to have this
> > patch accepted?
>
> OK, I've applied this patch to my tree. However, I would really
> appreciate if you could indeed review the callers for ext4_bread() and
> ext3_bread() and send patches for ext3 and ext4 as you have suggested,
> thanks!!
>
On my todo list.
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
--Carlos

2012-09-19 20:11:03

by Carlos Maiolino

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

> > OK, I've applied this patch to my tree. However, I would really
> > appreciate if you could indeed review the callers for ext4_bread() and
> > ext3_bread() and send patches for ext3 and ext4 as you have suggested,
> > thanks!!
> >
> On my todo list.
> > - Ted
Ted,

In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
does that matches with your thoughts or is there a better way to handle with
this?
I'm talking about ext4_add_entry() behavious mainly as an example to better
understand how we should handle these situations. In case of ext4_add_entry(),
based on our discussions ext4_bread() should not fail once dir entries should
not have HOLES, so, a NULL return should indicate a on-disk corruption or an I/O
error.

Does that makes sense?

--
--Carlos

2012-09-19 20:41:03

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Wed, Sep 19, 2012 at 05:10:57PM -0300, Carlos Maiolino wrote:
> Ted,
>
> In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
> and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
> does that matches with your thoughts or is there a better way to handle with
> this?

Yeah, that's about it. The real issue is EIO isn't really the right
errno code, but we don't have a better one. I've considered trying to
propose adding a new EFSCORRUPT errno value, just so that the error
message that eventually gets displayed back to the user via an
application error message is more obvious, but I haven't had the
energy to see if we can get it past the fsdevel shed-painting
process.

> I'm talking about ext4_add_entry() behavious mainly as an example to
> better understand how we should handle these situations. In case of
> ext4_add_entry(), based on our discussions ext4_bread() should not
> fail once dir entries should not have HOLES, so, a NULL return
> should indicate a on-disk corruption or an I/O error.

Well, a NULL return indicates that there is a hole in that particular
inode. If err=0, then it's a hole. Whether or not a hole is an
actual file system corruption is up to the caller of ext4_bread() to
determine. In the case of directories, holes are an example file
system corruption, so for the code in fs/ext4/namei.c, the appropriate
thing to do would be to call ext4_error() and then return an error to
abort the operation.

There is an argument to be made that if the file system is mounted
with errors=continue, in this case we could just try to ignore the
hole, and try to recover as best we can. But that's going to be quite
tricky to implement, and I'm not at all convinced it's worth the extra
complexity to implement. I could imagine someone who had a very high
requirement for "zero downtime" who might argue for this, but then we
would need to make sure the fallback code really was bulletproof. If
we pursued this option, then the EIO or EFSCORRUPT error code would
only be returned in the errors=remount-ro case.

(This is the sort of thing which is what customers pay $$$ for IBM's
mainframe in zOS. Whether or not this is really needed for ext4 and
Linux, even for an enterprise product, is a different sort of
question. I wouldn't object to someone who tried to make the
errors=continue behaviour to recover in a clean and safe way, but the
code would have to be very clearly documented, tested, and carefully
reviewed.)

- Ted

2012-09-20 02:59:09

by Eric Sandeen

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On 9/19/12 3:41 PM, Theodore Ts'o wrote:
> On Wed, Sep 19, 2012 at 05:10:57PM -0300, Carlos Maiolino wrote:
>> Ted,
>>
>> In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
>> and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
>> does that matches with your thoughts or is there a better way to handle with
>> this?
>
> Yeah, that's about it. The real issue is EIO isn't really the right
> errno code, but we don't have a better one. I've considered trying to
> propose adding a new EFSCORRUPT errno value, just so that the error
> message that eventually gets displayed back to the user via an
> application error message is more obvious, but I haven't had the
> energy to see if we can get it past the fsdevel shed-painting
> process.

FWIW, xfs used to have a custom "EFSCORRUPTED" errno, but we ditched in
favor of a standard but seldom-used "EUCLEAN" (structure needs cleaning).
Doesn't seem like too bad a choice.

-Eric

>> I'm talking about ext4_add_entry() behavious mainly as an example to
>> better understand how we should handle these situations. In case of
>> ext4_add_entry(), based on our discussions ext4_bread() should not
>> fail once dir entries should not have HOLES, so, a NULL return
>> should indicate a on-disk corruption or an I/O error.
>
> Well, a NULL return indicates that there is a hole in that particular
> inode. If err=0, then it's a hole. Whether or not a hole is an
> actual file system corruption is up to the caller of ext4_bread() to
> determine. In the case of directories, holes are an example file
> system corruption, so for the code in fs/ext4/namei.c, the appropriate
> thing to do would be to call ext4_error() and then return an error to
> abort the operation.
>
> There is an argument to be made that if the file system is mounted
> with errors=continue, in this case we could just try to ignore the
> hole, and try to recover as best we can. But that's going to be quite
> tricky to implement, and I'm not at all convinced it's worth the extra
> complexity to implement. I could imagine someone who had a very high
> requirement for "zero downtime" who might argue for this, but then we
> would need to make sure the fallback code really was bulletproof. If
> we pursued this option, then the EIO or EFSCORRUPT error code would
> only be returned in the errors=remount-ro case.
>
> (This is the sort of thing which is what customers pay $$$ for IBM's
> mainframe in zOS. Whether or not this is really needed for ext4 and
> Linux, even for an enterprise product, is a different sort of
> question. I wouldn't object to someone who tried to make the
> errors=continue behaviour to recover in a clean and safe way, but the
> code would have to be very clearly documented, tested, and carefully
> reviewed.)
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>


2012-09-21 19:14:54

by Carlos Maiolino

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Wed, Sep 19, 2012 at 09:59:04PM -0500, Eric Sandeen wrote:
> On 9/19/12 3:41 PM, Theodore Ts'o wrote:
> > On Wed, Sep 19, 2012 at 05:10:57PM -0300, Carlos Maiolino wrote:
> >> Ted,
> >>
> >> In case of ext4_add_entry() I'm supposing to make the function call ext4_error()
> >> and return -EIO in the case where ext4_bread() returns NULL and err is 0'ed,
> >> does that matches with your thoughts or is there a better way to handle with
> >> this?
> >
> > Yeah, that's about it. The real issue is EIO isn't really the right
> > errno code, but we don't have a better one. I've considered trying to
> > propose adding a new EFSCORRUPT errno value, just so that the error
> > message that eventually gets displayed back to the user via an
> > application error message is more obvious, but I haven't had the
> > energy to see if we can get it past the fsdevel shed-painting
> > process.
>
> FWIW, xfs used to have a custom "EFSCORRUPTED" errno, but we ditched in
> favor of a standard but seldom-used "EUCLEAN" (structure needs cleaning).
> Doesn't seem like too bad a choice.
>
> -Eric
>
Not a bad idea really.

I've found some functions properly handling the problem (i.e !bh and !err), but
some of them not. I'm thinking in fix functions with missing hole handlers and
after that think about a 'default directory hole handler' for this, instead of
make each function work on a specific way.

What you guys think?
> >> I'm talking about ext4_add_entry() behavious mainly as an example to
> >> better understand how we should handle these situations. In case of
> >> ext4_add_entry(), based on our discussions ext4_bread() should not
> >> fail once dir entries should not have HOLES, so, a NULL return
> >> should indicate a on-disk corruption or an I/O error.
> >
> > Well, a NULL return indicates that there is a hole in that particular
> > inode. If err=0, then it's a hole. Whether or not a hole is an
> > actual file system corruption is up to the caller of ext4_bread() to
> > determine. In the case of directories, holes are an example file
> > system corruption, so for the code in fs/ext4/namei.c, the appropriate
> > thing to do would be to call ext4_error() and then return an error to
> > abort the operation.
> >
> > There is an argument to be made that if the file system is mounted
> > with errors=continue, in this case we could just try to ignore the
> > hole, and try to recover as best we can. But that's going to be quite
> > tricky to implement, and I'm not at all convinced it's worth the extra
> > complexity to implement. I could imagine someone who had a very high
> > requirement for "zero downtime" who might argue for this, but then we
> > would need to make sure the fallback code really was bulletproof. If
> > we pursued this option, then the EIO or EFSCORRUPT error code would
> > only be returned in the errors=remount-ro case.
> >
> > (This is the sort of thing which is what customers pay $$$ for IBM's
> > mainframe in zOS. Whether or not this is really needed for ext4 and
> > Linux, even for an enterprise product, is a different sort of
> > question. I wouldn't object to someone who tried to make the
> > errors=continue behaviour to recover in a clean and safe way, but the
> > code would have to be very clearly documented, tested, and carefully
> > reviewed.)
> >
> > - Ted
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
--Carlos

2012-09-22 00:52:04

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [v2] ext4: fix possible non-initialized variable

On Fri, Sep 21, 2012 at 04:14:49PM -0300, Carlos Maiolino wrote:
> Not a bad idea really.
>
> I've found some functions properly handling the problem (i.e !bh and !err), but
> some of them not. I'm thinking in fix functions with missing hole handlers and
> after that think about a 'default directory hole handler' for this, instead of
> make each function work on a specific way.

I'd have to take a look at a patch, but I didn't think there would be
enough code to be worth factoring out into a separate function. It's
just a conditional, a call to ext4_error(), and then setting up the
return code and releasing resources that need to be released on our
way out (which tends to be function-specific).

Am I missing something?

- Ted