2007-05-25 05:01:30

by Dave Young

[permalink] [raw]
Subject: 2.6.22-rc2-mm1 NTFS & SLUB related fix

Hi,
As I umount a ntfs partition, the kernel report some trace infomation,
I can't call it oops, right?

Andrew, could you tell me who is the right person should I send to?

I navagated the ntfs inode.c, and found a possible bug, replaced
kmalloc with kzalloc,
because the ntfschar size is 2. then the kernel doesn't warning
again. and the slub debug info also disappeared.

This patch works for me:

diff -udr linux/fs/ntfs/inode.c linux.new/fs/ntfs/inode.c
--- linux/fs/ntfs/inode.c 2007-05-25 12:46:27.000000000 +0000
+++ linux.new/fs/ntfs/inode.c 2007-05-25 12:45:31.000000000 +0000
@@ -136,11 +136,10 @@

BUG_ON(!na->name);
i = na->name_len * sizeof(ntfschar);
- ni->name = kmalloc(i + sizeof(ntfschar), GFP_ATOMIC);
+ ni->name = kzalloc(i + sizeof(ntfschar), GFP_ATOMIC);
if (!ni->name)
return -ENOMEM;
memcpy(ni->name, na->name, i);
- ni->name[i] = 0;
}
return 0;
}



And please look the failed kernel message:

*** SLUB kmalloc-8: Redzone Active@0xc2959e38 slab 0xc1052b20
offset=3640 flags=0x400000c2 inuse=73 freelist=0x00000000
Bytes b4 0xc2959e28: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a
5a ........ZZZZZZZZ
Object 0xc2959e38: 24 00 51 00 00 00 6b a5
$.Q...k?
Redzone 0xc2959e40: 00 00 cc cc
..??
FreePointer 0xc2959e44 -> 0x00000000
Last alloc: ntfs_init_locked_inode+0x9e/0x110 jiffies_ago=5140 cpu=0 pid=1604
Last free : __vunmap+0xb2/0xe0 jiffies_ago=30727 cpu=0 pid=1491
Filler 0xc2959e68: 5a 5a 5a 5a 5a 5a 5a 5a
ZZZZZZZZ
[<c0163141>] check_object+0x71/0x250
[<c04432e0>] preempt_schedule+0x50/0x70
[<c01639a1>] free_debug_processing+0xc1/0x1a0
[<c011e587>] vprintk+0x227/0x250
[<c0164419>] __slab_free+0x79/0xe0
[<c02216da>] __ntfs_clear_inode+0x11a/0x1b0
[<c0164ee3>] kfree+0x63/0x70
[<c02216da>] __ntfs_clear_inode+0x11a/0x1b0
[<c02216da>] __ntfs_clear_inode+0x11a/0x1b0
[<c0221839>] ntfs_clear_big_inode+0x59/0x120
[<c019f690>] dquot_drop+0x0/0x50
[<c017d411>] clear_inode+0xc1/0x150
[<c017e3c7>] generic_forget_inode+0x107/0x180
[<c017e4b3>] iput+0x53/0x60
[<c0230b85>] ntfs_put_super+0x6c5/0x8e0
[<c016a61a>] generic_shutdown_super+0xea/0x100
[<c016b11c>] kill_block_super+0xc/0x20
[<c016a35e>] deactivate_super+0x4e/0xa0
[<c0180de5>] sys_umount+0x35/0x80
[<c0115274>] do_page_fault+0x434/0x5c0
[<c0180e45>] sys_oldumount+0x15/0x20
[<c0104098>] syscall_call+0x7/0xb
=======================
@@@ SLUB kmalloc-8: Restoring redzone (0xcc) from 0xc2959e40-0xc2959e43


2007-05-25 05:07:31

by Christoph Lameter

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

On Fri, 25 May 2007, young dave wrote:

> I can't call it oops, right?

Yes sure. This is a problem in the NTFS layer. It writes 2 bytes
after the allocated size.

> I navagated the ntfs inode.c, and found a possible bug, replaced
> kmalloc with kzalloc,
> because the ntfschar size is 2. then the kernel doesn't warning
> again. and the slub debug info also disappeared.

The kzalloc does not increase the size. So I suspect that the bug
did not trigger again after the change.

>
> This patch works for me:
>
> diff -udr linux/fs/ntfs/inode.c linux.new/fs/ntfs/inode.c
> --- linux/fs/ntfs/inode.c 2007-05-25 12:46:27.000000000 +0000
> +++ linux.new/fs/ntfs/inode.c 2007-05-25 12:45:31.000000000 +0000
> @@ -136,11 +136,10 @@
>
> BUG_ON(!na->name);
> i = na->name_len * sizeof(ntfschar);
> - ni->name = kmalloc(i + sizeof(ntfschar), GFP_ATOMIC);
> + ni->name = kzalloc(i + sizeof(ntfschar), GFP_ATOMIC);

Is this ntfs_init_locked_inode?

> Bytes b4 0xc2959e28: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a
> Object 0xc2959e38: 24 00 51 00 00 00 6b a5
> Redzone 0xc2959e40: 00 00 cc cc

First two bytes after the object overwritten. The allocation for this
object should have been two bytes longer.

> Last alloc: ntfs_init_locked_inode+0x9e/0x110 jiffies_ago=5140 cpu=0 pid=1604

This is the function that allocated a too short object.

2007-05-25 05:22:59

by Dave Young

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

Hi,

> Is this ntfs_init_locked_inode?

Yes, it is.

> > Bytes b4 0xc2959e28: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a
> > Object 0xc2959e38: 24 00 51 00 00 00 6b a5
> > Redzone 0xc2959e40: 00 00 cc cc
>
> First two bytes after the object overwritten. The allocation for this
> object should have been two bytes longer.
>
> > Last alloc: ntfs_init_locked_inode+0x9e/0x110 jiffies_ago=5140 cpu=0 pid=1604
>
> This is the function that allocated a too short object.
>

Only the last one byte of the string is zeroed, but It malloced 2
more byte appended the string because size of thentfschar type is 2
bytes , is this the reason? But why?

Regards
dave

2007-05-25 05:47:22

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

On Fri, 25 May 2007 05:22:50 +0000 "young dave" <[email protected]> wrote:

> Hi,
>
> > Is this ntfs_init_locked_inode?
>
> Yes, it is.
>
> > > Bytes b4 0xc2959e28: 00 00 00 00 00 00 00 00 5a 5a 5a 5a 5a 5a 5a
> > > Object 0xc2959e38: 24 00 51 00 00 00 6b a5
> > > Redzone 0xc2959e40: 00 00 cc cc
> >
> > First two bytes after the object overwritten. The allocation for this
> > object should have been two bytes longer.
> >
> > > Last alloc: ntfs_init_locked_inode+0x9e/0x110 jiffies_ago=5140 cpu=0 pid=1604
> >
> > This is the function that allocated a too short object.
> >
>
> Only the last one byte of the string is zeroed, but It malloced 2
> more byte appended the string because size of thentfschar type is 2
> bytes , is this the reason? But why?
>

Thing is, ntfs_inode.name[] is an array of le16's. But local variable `i'
in there is a byte index, not an le16 index. We end up writing that 0x0000
at twice the intended offset.

So I think this was meant:

--- a/fs/ntfs/inode.c~a
+++ a/fs/ntfs/inode.c
@@ -140,7 +140,7 @@ static int ntfs_init_locked_inode(struct
if (!ni->name)
return -ENOMEM;
memcpy(ni->name, na->name, i);
- ni->name[i] = 0;
+ ni->name[na->name_len] = 0;
}
return 0;
}
_

2007-05-25 06:28:01

by Dave Young

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

Hi Andrew,

> So I think this was meant:
>
> --- a/fs/ntfs/inode.c~a
> +++ a/fs/ntfs/inode.c
> @@ -140,7 +140,7 @@ static int ntfs_init_locked_inode(struct
> if (!ni->name)
> return -ENOMEM;
> memcpy(ni->name, na->name, i);
> - ni->name[i] = 0;
> + ni->name[na->name_len] = 0;
> }
> return 0;
> }
> _
>

I tested it , it doesn't work.

2007-05-25 06:44:17

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

On Fri, 25 May 2007 06:27:51 +0000 "young dave" <[email protected]> wrote:

> Hi Andrew,
>
> > So I think this was meant:
> >
> > --- a/fs/ntfs/inode.c~a
> > +++ a/fs/ntfs/inode.c
> > @@ -140,7 +140,7 @@ static int ntfs_init_locked_inode(struct
> > if (!ni->name)
> > return -ENOMEM;
> > memcpy(ni->name, na->name, i);
> > - ni->name[i] = 0;
> > + ni->name[na->name_len] = 0;
> > }
> > return 0;
> > }
> > _
> >
>
> I tested it , it doesn't work.

Surprised. Are you really sure that you had the patch applied, and that you
tested the right kernel, etc?

2007-05-25 06:48:42

by Dave Young

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

Yes, I'm sure. but the patch in top post of mine works, the diffrence
is using kzalloc and remove the "ni->name[i] = 0;" line.

Regards
dave

2007-05-25 07:01:36

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

On Fri, 25 May 2007 06:48:34 +0000 "young dave" <[email protected]> wrote:

> Yes, I'm sure. but the patch in top post of mine works, the diffrence
> is using kzalloc and remove the "ni->name[i] = 0;" line.
>

Let's walk through the existing code:

i = na->name_len * sizeof(ntfschar);

now, i = na->name_len * 2

ni->name = kmalloc(i + sizeof(ntfschar), GFP_ATOMIC);

we allocated (na->name_len * 2 + 2) bytes

if (!ni->name)
return -ENOMEM;
memcpy(ni->name, na->name, i);

we copied (na->name_len * 2) bytes

ni->name[i] = 0;

here, we zero the two bytes at byte offsets ((na->name_len * 2) * 2) and
((na->name_len * 2) * 2 + 1), and that is the bug. We _want_ to zero
the two bytes at byte offsets (na->name_len * 2) and (na->name_len * 2 + 1),
which we can do in C via

ni->name[na->name_len] = 0;

because sizeof(*(ni->name)) == 2.


So I'm still suspecting that you mistested that change.

2007-05-25 07:10:47

by Dave Young

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

Hi,
I'm very sorry, andrew, you are right.
I directly modified the source and forgot remove the line ni->name[i] = 0;

Regards
dave

2007-05-25 07:18:53

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.6.22-rc2-mm1 NTFS & SLUB related fix

On Fri, 25 May 2007 07:10:38 +0000 "young dave" <[email protected]> wrote:

> Hi,
> I'm very sorry, andrew, you are right.
> I directly modified the source and forgot remove the line ni->name[i] = 0;
>

Phew ;)

Thanks for checking.