2005-10-28 08:23:21

by Simon Horman

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

Ogawa-san,

I'm bringing this to your attention because a) I'm not sure who to ask
and b) I'm not sure what the correct behaviour is.

When a vfat filesystem is mounted with iocharset=iso8859-1, the
following works:

touch a.txt
ls A.txt

But when it is mounted with iocharset=utf8, ls complains that the file
is not found:

touch a.txt
ls A.txt

That is, with utf8, a != A on vfat, and thus it's not case-insensitive
as one might expect.

I took a quick look in fs/nls/nls_utf8.c and I see that this is
intentional.

static struct nls_table table = {
	.charset	= "utf8",
	.uni2char	= uni2char,
	.char2uni	= char2uni,
	.charset2lower	= identity,	/* no conversion */
	.charset2upper	= identity,
	.owner		= THIS_MODULE,
};

I guess it is charset2lower or charset2upper that vfat is calling,
which make no conversion, thus leading to the problem I outlined above.
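
To illustrate the guess above, here is a minimal userspace sketch (not the
kernel code). It assumes names are compared byte by byte after each byte is
folded through a 256-entry table, in the way charset2lower is laid out. The
names fill_identity, fill_ascii_lower and names_equal are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Fill a table that maps every byte to itself, like the "identity"
 * table used by the utf8 NLS module quoted above. */
static void fill_identity(unsigned char t[256])
{
	for (int i = 0; i < 256; i++)
		t[i] = (unsigned char)i;
}

/* Fill a table that lowercases ASCII A-Z and leaves all other bytes
 * alone (an assumption made for illustration only). */
static void fill_ascii_lower(unsigned char t[256])
{
	fill_identity(t);
	for (int c = 'A'; c <= 'Z'; c++)
		t[c] = (unsigned char)(c - 'A' + 'a');
}

/* Return 1 if the two names match after folding each byte through t. */
static int names_equal(const unsigned char *t, const char *a, const char *b)
{
	size_t len = strlen(a);

	if (len != strlen(b))
		return 0;
	for (size_t i = 0; i < len; i++)
		if (t[(unsigned char)a[i]] != t[(unsigned char)b[i]])
			return 0;
	return 1;
}
```

With the table from fill_ascii_lower, names_equal() treats "a.txt" and
"A.txt" as the same name; with the identity table it does not, which
matches the behaviour seen with iocharset=utf8.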

My question is: Is this behaviour correct, or is it a bug?

--
Horms


2005-10-28 14:55:17

by OGAWA Hirofumi

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

Horms <[email protected]> writes:

> static struct nls_table table = {
> .charset = "utf8",
> .uni2char = uni2char,
> .char2uni = char2uni,
> .charset2lower = identity, /* no conversion */
> .charset2upper = identity,
> .owner = THIS_MODULE,
> };
>
> I guess it is charset2lower or charset2upper that vfat is calling,
> which make no conversion, thus leading to the problem I outlined above.
>
> My question is: Is this behaviour correct, or is it a bug?

This is a known bug. To fix it cleanly, we will need to change both
nls and the filesystems substantially.

Thanks.
--
OGAWA Hirofumi <[email protected]>

2005-10-28 15:07:54

by OGAWA Hirofumi

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

OGAWA Hirofumi <[email protected]> writes:

> Horms <[email protected]> writes:
>
>> static struct nls_table table = {
>> .charset = "utf8",
>> .uni2char = uni2char,
>> .char2uni = char2uni,
>> .charset2lower = identity, /* no conversion */
>> .charset2upper = identity,
>> .owner = THIS_MODULE,
>> };
>>
>> I guess it is charset2lower or charset2upper that vfat is calling,
>> which make no conversion, thus leading to the problem I outlined above.
>>
>> My question is: Is this behaviour correct, or is it a bug?
>
> This is known bug. For fixing this bug cleanly, we will need to much
> change the both of nls and filesystems.

And fatfs has a "utf8" option; its behavior is probably preferable to
"iocharset=utf8". Unfortunately, "utf8" has problems too.
--
OGAWA Hirofumi <[email protected]>

2005-10-29 04:01:18

by Simon Horman [Horms]

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

On Sat, Oct 29, 2005 at 12:07:40AM +0900, OGAWA Hirofumi wrote:
> OGAWA Hirofumi <[email protected]> writes:
>
> > Horms <[email protected]> writes:
> >
> >> static struct nls_table table = {
> >> .charset = "utf8",
> >> .uni2char = uni2char,
> >> .char2uni = char2uni,
> >> .charset2lower = identity, /* no conversion */
> >> .charset2upper = identity,
> >> .owner = THIS_MODULE,
> >> };
> >>
> >> I guess it is charset2lower or charset2upper that vfat is calling,
> >> which make no conversion, thus leading to the problem I outlined above.
> >>
> >> My question is: Is this behaviour correct, or is it a bug?
> >
> > This is known bug. For fixing this bug cleanly, we will need to much
> > change the both of nls and filesystems.
>
> And fatfs has "utf8" option, probably the behavior is preferable than
> "iocharset=utf8". However, unfortunately "utf8" has problem too.

Thanks

2005-10-29 14:45:14

by Ingo Oeser

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

Hi,

On Friday 28 October 2005 16:54, OGAWA Hirofumi wrote:
> Horms <[email protected]> writes:
> > I guess it is charset2lower or charset2upper that vfat is calling,
> > which make no conversion, thus leading to the problem I outlined above.
> >
> > My question is: Is this behaviour correct, or is it a bug?
>
> This is known bug. For fixing this bug cleanly, we will need to much
> change the both of nls and filesystems.

Using per locale collation sequences? :-)

Do you know how Windows handles the problem of differing collation
sequences on the file system? Or is the file system always dependent
on the locale of the Windows version that created the file system?

I'm so happy that Unix is not case-insensitive :-)


Regards

Ingo Oeser



2005-10-29 16:28:38

by OGAWA Hirofumi

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

Ingo Oeser <[email protected]> writes:

>> This is known bug. For fixing this bug cleanly, we will need to much
>> change the both of nls and filesystems.
>
> Using per locale collation sequences? :-)
>
> Do you know, how Windows handles the problem of differing collation
> sequences on the file system?

I don't know. Why do we need to care about collation sequences here?

> Or is the file system always dependend on the locale of the Windows
> version, which created the file system?

Probably, yes. I think we need to know the code set of the on-disk filenames.
--
OGAWA Hirofumi <[email protected]>

2005-10-29 18:44:30

by Anton Altaparmakov

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

On Sun, 30 Oct 2005, OGAWA Hirofumi wrote:
> Ingo Oeser <[email protected]> writes:
> >> This is known bug. For fixing this bug cleanly, we will need to much
> >> change the both of nls and filesystems.
> >
> > Using per locale collation sequences? :-)
> >
> > Do you know, how Windows handles the problem of differing collation
> > sequences on the file system?
>
> I don't know. Why do we need to care the collation sequences here?
>
> > Or is the file system always dependend on the locale of the Windows
> > version, which created the file system?
>
> Probably, yes. I think we need to know on-disk filename's code set.

If FAT stores the filenames in 8 bits (non-UTF) then yes, it will be in
the current locale/code page of the Windows system writing them (e.g. that
happens with the names of EAs in NTFS).

If the names are stored in 16-bit Unicode like on NTFS then obviously they
are completely locale/code page independent. (Makes my life in NTFS a
_lot_ easier. Especially since the NTFS volume contains an upcase table
for the full 16-bit Unicode which we load and use to do upcasing for the
case insensitive comparisons...)
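
As a rough illustration of the upcase-table approach described above, here
is a userspace sketch (not the fs/ntfs code). It assumes a 65536-entry
table indexed by 16-bit code unit. The table here is filled with an
ASCII-only placeholder mapping rather than being loaded from a volume, and
the names upcase_init_ascii and ucs2_names_equal_ci are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define UPCASE_LEN 65536

/* Hypothetical in-memory upcase table, one entry per 16-bit code unit.
 * A real table would be read from the volume, not built like this. */
static uint16_t upcase[UPCASE_LEN];

/* Placeholder initialisation: identity mapping, then ASCII a-z -> A-Z. */
static void upcase_init_ascii(void)
{
	for (uint32_t i = 0; i < UPCASE_LEN; i++)
		upcase[i] = (uint16_t)i;
	for (uint16_t c = 'a'; c <= 'z'; c++)
		upcase[c] = (uint16_t)(c - 'a' + 'A');
}

/* Return 1 if two 16-bit names are equal once every code unit has been
 * mapped through the upcase table. */
static int ucs2_names_equal_ci(const uint16_t *a, size_t alen,
			       const uint16_t *b, size_t blen)
{
	if (alen != blen)
		return 0;
	for (size_t i = 0; i < alen; i++)
		if (upcase[a[i]] != upcase[b[i]])
			return 0;
	return 1;
}
```

Because the lookup is on 16-bit code units and the table comes with the
volume, the comparison does not depend on any locale or code page setting.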

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2005-10-29 20:07:37

by OGAWA Hirofumi

Subject: Re: Bug#333776: linux-2.6: vfat driver in 2.6.12 is not properly case-insensitive

Anton Altaparmakov <[email protected]> writes:

>> Probably, yes. I think we need to know on-disk filename's code set.
>
> If FAT stores the filenames in 8 bits (non-UTF) then yes, it will be in
> the current locale/code page of the Windows system writing them (e.g. that
> happens with the names of EAs in NTFS).
>
> If the names are stored in 16-bit Unicode like on NTFS then obviously they
> are completely locale/code page independent. (Makes my life in NTFS a
> _lot_ easier. Especially since the NTFS volume contains an upcase table
> for the full 16-bit Unicode which we load and use to do upcasing for the
> case insensitive comparisons...)

Yes, I got to know it from fs/ntfs/*. :) Unfortunately, FAT always
stores filenames in an 8/16-bit code set. (Unicode (UCS2?) is stored
only in the case of long names.)
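
For readers unfamiliar with the on-disk layout, here is a small userspace
sketch of the distinction above. It assumes the commonly documented 32-byte
VFAT long-name slot, with attribute byte 0x0F at offset 11 and 13
little-endian 16-bit units at offsets 1, 14 and 28. The helper names
fat_slot_is_lfn and fat_lfn_units are hypothetical; this is not the fs/fat
code.

```c
#include <stddef.h>
#include <stdint.h>

#define FAT_SLOT_SIZE	32
#define FAT_LFN_UNITS	13
#define FAT_ATTR_LFN	0x0F

/* Return 1 if a raw 32-byte directory slot is a long-name slot.
 * Otherwise bytes 0-10 hold the 8.3 short name in the code page of
 * the system that wrote it. */
static int fat_slot_is_lfn(const uint8_t *slot)
{
	return slot[11] == FAT_ATTR_LFN;
}

/* Copy the 13 16-bit units of one long-name slot into out[].
 * They live in three fields: 5 units at offset 1, 6 at offset 14,
 * and 2 at offset 28, each stored little-endian. */
static void fat_lfn_units(const uint8_t *slot, uint16_t *out)
{
	static const size_t offs[3] = { 1, 14, 28 };
	static const size_t cnt[3] = { 5, 6, 2 };
	size_t n = 0;

	for (size_t f = 0; f < 3; f++) {
		for (size_t i = 0; i < cnt[f]; i++) {
			const uint8_t *p = slot + offs[f] + 2 * i;

			out[n++] = (uint16_t)(p[0] | (p[1] << 8));
		}
	}
}
```

Only the units returned by fat_lfn_units() are independent of the code
page; the short name in a non-LFN slot still has to be interpreted in some
8/16-bit code set.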

Thanks.
--
OGAWA Hirofumi <[email protected]>