2009-06-28 18:37:16

by Marton Balint

[permalink] [raw]
Subject: vfat creates multiple files with the same filename

Hi,

If I use vfat with the utf8 mount option, and create a file with an
invalid utf-8 character in it's name, then the invalid charater gets
removed from the filename silently. That's probably the expected way of
operation but the problem is, that this automatic filename sanitization
can be used to create multiple files with the same filename. Here's an
example how to do it:

mkdir /mnt/fatfs
FILENAME=`echo -ne "invalidutf8char_\\0341_endofchar"`
echo "Using filename: $FILENAME"
dd if=/dev/zero of=fatfs bs=512 count=128
mkdosfs -F 32 fatfs
mount -o loop,utf8 fatfs /mnt/fatfs
touch "/mnt/fatfs/$FILENAME"
umount /mnt/fatfs
mount -o loop,utf8 fatfs /mnt/fatfs
touch "/mnt/fatfs/$FILENAME"
ls -l /mnt/fatfs
umount /mnt/fatfs

---- And the output is:

Using filename: invalidutf8char_?_endofchar
128+0 records in
128+0 records out
65536 bytes (66 kB) copied, 0.000388118 s, 169 MB/s
mkdosfs 2.11 (12 Mar 2005)
total 0
-rwxr-xr-x 1 root root 0 Jun 28 19:46 invalidutf8char__endofchar
-rwxr-xr-x 1 root root 0 Jun 28 19:46 invalidutf8char__endofchar

I don't know if removing the invalid characters silently instead of
returning EINVAL is a good practice, but allowing two files with identical
filenames is definitely wrong.

Could someone please have a look at the problem? Unfortunately I am not
qualified enough to track it down...

I've run the testcase on both 2.6.30 and 2.6.31-rc1 kernels, but this bug
is probably not very new, since I remember having it also in 2.6.27.

Regards,
Marton


2009-06-30 02:24:56

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: vfat creates multiple files with the same filename

Marton Balint <[email protected]> writes:

> I don't know if removing the invalid characters silently instead of
> returning EINVAL is a good practice, but allowing two files with identical
> filenames is definitely wrong.

The utf8 function seems didn't return the error for invalid char.

> Could someone please have a look at the problem? Unfortunately I am not
> qualified enough to track it down...
>
> I've run the testcase on both 2.6.30 and 2.6.31-rc1 kernels, but this bug
> is probably not very new, since I remember having it also in 2.6.27.

This is quick hack though, this should return -EINVAL for invalid char
like nls. Can you try this?

Um..., probably, if utf8s_to_utf16s() returned the both of the error and
converted length to handle invalid char by caller, it would be better.
However, the user of it seems to be only vfat.

Signed-off-by: OGAWA Hirofumi <[email protected]>
---

fs/fat/namei_vfat.c | 15 ++++-----------
fs/nls/nls_base.c | 8 ++++----
2 files changed, 8 insertions(+), 15 deletions(-)

diff -puN fs/fat/namei_vfat.c~vfat-utf8-invalid-char fs/fat/namei_vfat.c
--- linux-2.6/fs/fat/namei_vfat.c~vfat-utf8-invalid-char 2009-06-30 10:09:32.000000000 +0900
+++ linux-2.6-hirofumi/fs/fat/namei_vfat.c 2009-06-30 10:24:48.000000000 +0900
@@ -500,17 +500,10 @@ xlate_to_uni(const unsigned char *name,
int charlen;

if (utf8) {
- int name_len = strlen(name);
-
- *outlen = utf8s_to_utf16s(name, PATH_MAX, (wchar_t *) outname);
-
- /*
- * We stripped '.'s before and set len appropriately,
- * but utf8s_to_utf16s doesn't care about len
- */
- *outlen -= (name_len - len);
-
- if (*outlen > 255)
+ *outlen = utf8s_to_utf16s(name, len, (wchar_t *)outname);
+ if (*outlen < 0)
+ return *outlen;
+ else if (*outlen > 255)
return -ENAMETOOLONG;

op = &outname[*outlen * sizeof(wchar_t)];
diff -puN fs/nls/nls_base.c~vfat-utf8-invalid-char fs/nls/nls_base.c
--- linux-2.6/fs/nls/nls_base.c~vfat-utf8-invalid-char 2009-06-30 10:10:04.000000000 +0900
+++ linux-2.6-hirofumi/fs/nls/nls_base.c 2009-06-30 10:10:54.000000000 +0900
@@ -124,10 +124,10 @@ int utf8s_to_utf16s(const u8 *s, int len
while (*s && len > 0) {
if (*s & 0x80) {
size = utf8_to_utf32(s, len, &u);
- if (size < 0) {
- /* Ignore character and move on */
- size = 1;
- } else if (u >= PLANE_SIZE) {
+ if (size < 0)
+ return -EINVAL;
+
+ if (u >= PLANE_SIZE) {
u -= PLANE_SIZE;
*op++ = (wchar_t) (SURROGATE_PAIR |
((u >> 10) & SURROGATE_BITS));
_

--
OGAWA Hirofumi <[email protected]>

2009-07-01 00:03:18

by Marton Balint

[permalink] [raw]
Subject: Re: vfat creates multiple files with the same filename

On Tue, 30 Jun 2009, OGAWA Hirofumi wrote:
> Marton Balint <[email protected]> writes:
>
> > I don't know if removing the invalid characters silently instead of
> > returning EINVAL is a good practice, but allowing two files with identical
> > filenames is definitely wrong.
>
> The utf8 function seems didn't return the error for invalid char.
>
> > Could someone please have a look at the problem? Unfortunately I am not
> > qualified enough to track it down...
> >
> > I've run the testcase on both 2.6.30 and 2.6.31-rc1 kernels, but this bug
> > is probably not very new, since I remember having it also in 2.6.27.
>
> This is quick hack though, this should return -EINVAL for invalid char
> like nls. Can you try this?

It works for me, filenames with invalid utf-8 characters are now rejected,
so I no longer can create multiple files with the same name. Thanks for
the fix.

Regards,
Marton

2009-07-01 00:29:44

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: vfat creates multiple files with the same filename

Marton Balint <[email protected]> writes:

> On Tue, 30 Jun 2009, OGAWA Hirofumi wrote:
>> Marton Balint <[email protected]> writes:
>>
>> > I don't know if removing the invalid characters silently instead of
>> > returning EINVAL is a good practice, but allowing two files with identical
>> > filenames is definitely wrong.
>>
>> The utf8 function seems didn't return the error for invalid char.
>>
>> > Could someone please have a look at the problem? Unfortunately I am not
>> > qualified enough to track it down...
>> >
>> > I've run the testcase on both 2.6.30 and 2.6.31-rc1 kernels, but this bug
>> > is probably not very new, since I remember having it also in 2.6.27.
>>
>> This is quick hack though, this should return -EINVAL for invalid char
>> like nls. Can you try this?
>
> It works for me, filenames with invalid utf-8 characters are now rejected,
> so I no longer can create multiple files with the same name. Thanks for
> the fix.

Thanks for testing.
--
OGAWA Hirofumi <[email protected]>

2009-07-06 19:12:05

by Marton Balint

[permalink] [raw]
Subject: Re: vfat creates multiple files with the same filename

On Wed, 1 Jul 2009, OGAWA Hirofumi wrote:

> Marton Balint <[email protected]> writes:
>
> > On Tue, 30 Jun 2009, OGAWA Hirofumi wrote:
> >> Marton Balint <[email protected]> writes:
> >>
> >> > I don't know if removing the invalid characters silently instead of
> >> > returning EINVAL is a good practice, but allowing two files with identical
> >> > filenames is definitely wrong.
> >>
> >> The utf8 function seems didn't return the error for invalid char.
> >>
> >> > Could someone please have a look at the problem? Unfortunately I am not
> >> > qualified enough to track it down...
> >> >
> >> > I've run the testcase on both 2.6.30 and 2.6.31-rc1 kernels, but this bug
> >> > is probably not very new, since I remember having it also in 2.6.27.
> >>
> >> This is quick hack though, this should return -EINVAL for invalid char
> >> like nls. Can you try this?
> >
> > It works for me, filenames with invalid utf-8 characters are now rejected,
> > so I no longer can create multiple files with the same name. Thanks for
> > the fix.
>
> Thanks for testing.

Will the patch be included in 2.6.31?

Regards,
Marton