2016-02-18 22:05:46

by Dilger, Andreas

[permalink] [raw]
Subject: problem in ext2fs_get_next_inode_full() ?

I was just looking at ext2fs_get_next_inode_full() to trace where we
are using large inodes and whether we could change the APIs to just
pass large inodes around instead of typecasting them. It has the
following hunk of code:

if (extra_bytes) {
memcpy(scan->temp_buffer+extra_bytes, scan->ptr,
scan->inode_size - extra_bytes);
scan->ptr += scan->inode_size - extra_bytes;
scan->bytes_left -= scan->inode_size - extra_bytes;

#ifdef WORDS_BIGENDIAN
memset(inode, 0, bufsize);
ext2fs_swap_inode_full(scan->fs,
(struct ext2_inode_large *) inode,
(struct ext2_inode_large *) scan->temp_buffer,
0, bufsize);
#else
*inode = *((struct ext2_inode *) scan->temp_buffer);
#endif

So if the inode is being swabbed then it handles the full inode size, but
if it is not being swabbed (the common case) it appears that it is only
copying the small inode into "*inode" using a struct assignment. This
appears like it would be dropping the large inode data, but I'm not sure
if or when this "extra_bytes" case is hit. The "else" clause appears to
copy the requested (full) inode size properly via "memcpy(..., bufsize)".

Should the struct assignment be changed similarly to use memcpy()?


Cheers, Andreas
--
Andreas Dilger

Lustre Principal Architect
Intel High Performance Data Division




2016-02-18 22:30:39

by Andreas Dilger

[permalink] [raw]
Subject: Re: problem in ext2fs_get_next_inode_full() ?

On Feb 18, 2016, at 3:05 PM, Dilger, Andreas <[email protected]> wrote:
>
> I was just looking at ext2fs_get_next_inode_full() to trace where we
> are using large inodes and whether we could change the APIs to just
> pass large inodes around instead of typecasting them. It has the
> following hunk of code:
>
> if (extra_bytes) {
> memcpy(scan->temp_buffer+extra_bytes, scan->ptr,
> scan->inode_size - extra_bytes);
> scan->ptr += scan->inode_size - extra_bytes;
> scan->bytes_left -= scan->inode_size - extra_bytes;
>
> #ifdef WORDS_BIGENDIAN
> memset(inode, 0, bufsize);
> ext2fs_swap_inode_full(scan->fs,
> (struct ext2_inode_large *) inode,
> (struct ext2_inode_large *) scan->temp_buffer,
> 0, bufsize);
> #else
> *inode = *((struct ext2_inode *) scan->temp_buffer);
> #endif
>
> So if the inode is being swabbed then it handles the full inode size, but
> if it is not being swabbed (the common case) it appears that it is only
> copying the small inode into "*inode" using a struct assignment. This
> appears like it would be dropping the large inode data, but I'm not sure
> if or when this "extra_bytes" case is hit. The "else" clause appears to
> copy the requested (full) inode size properly via "memcpy(..., bufsize)".
>
> Should the struct assignment be changed similarly to use memcpy()?

To follow up on my own email - I also see struct ext2_inode_cache_ent is
only caching the small inode, and not a large inode. This would seem to
potentially cause loss of the large inode data if the inode cache is
used by tools like resize2fs or others that move around inodes?

Cheers, Andreas






2016-02-19 14:24:34

by Theodore Ts'o

[permalink] [raw]
Subject: Re: problem in ext2fs_get_next_inode_full() ?

On Thu, Feb 18, 2016 at 03:30:35PM -0700, Andreas Dilger wrote:
> > So if the inode is being swabbed then it handles the full inode size, but
> > if it is not being swabbed (the common case) it appears that it is only
> > copying the small inode into "*inode" using a struct assignment. This
> > appears like it would be dropping the large inode data, but I'm not sure
> > if or when this "extra_bytes" case is hit. The "else" clause appears to
> > copy the requested (full) inode size properly via "memcpy(..., bufsize)".
> >
> > Should the struct assignment be changed similarly to use memcpy()?
>
> To follow up on my own email - I also see struct ext2_inode_cache_ent is
> only caching the small inode, and not a large inode. This would seem to
> potentially cause loss of the large inode data if the inode cache is
> used by tools like resize2fs or others that move around inodes?

Those are both bugs, and I'm guessing they were added when we added
metadata checksuming, as they aren't a problem in the maint branch.
The inode cache should *only* be used if we are reading the small
inode (which is the common case; we only need the full inode if we are
moving the inode around or if we need to access the xattrs). And
indeed we do that in the maint branch:

/* Check to see if it's in the inode cache */
if (bufsize == sizeof(struct ext2_inode)) {
/* only old good inode can be retrieved from the cache */
for (i=0; i < fs->icache->cache_size; i++) {
if (fs->icache->cache[i].ino == ino) {
*inode = fs->icache->cache[i].inode;
return 0;
}
}
}

Unfortunately this check got removed in the next branch, and I missed
it in my code reviews.

We should probably have some unit tests to make sure we don't regress
here again, and probably make the comments a bit more explicit.

- Ted