Ogawa-san,
I'm bringing this to your attention because a) I'm not sure who to ask
and b) I'm not sure what the correct behaviour is.
When a vfat filesystem is mounted with iocharset=iso8859-1, the
following works:
    touch a.txt
    ls A.txt
But when it is mounted with iocharset=utf8, ls complains that the file
is not found:
    touch a.txt
    ls A.txt
That is, with utf8, a != A on vfat, so it's not case insensitive
as one might expect.
I took a quick look in fs/nls/nls_utf8.c and I see that this is
intentional.
static struct nls_table table = {
	.charset	= "utf8",
	.uni2char	= uni2char,
	.char2uni	= char2uni,
	.charset2lower	= identity,	/* no conversion */
	.charset2upper	= identity,
	.owner		= THIS_MODULE,
};
I guess it is charset2lower or charset2upper that vfat is calling,
which make no conversion, thus leading to the problem I outlined above.
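A minimal userspace sketch of the effect I mean, assuming vfat folds each byte of both names through the charset's lower-case table before comparing (that is my guess, not what the real lookup code necessarily does). The struct and function names here are made up for illustration and are not the kernel's:

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical stand-in for the case table carried by struct nls_table. */
struct case_table {
	unsigned char lower[256];
};

/* Identity mapping: every byte maps to itself (what nls_utf8.c provides). */
static void fill_identity(struct case_table *t)
{
	for (int i = 0; i < 256; i++)
		t->lower[i] = (unsigned char)i;
}

/* ASCII-only lowercasing; bytes >= 0x80 are left alone in this sketch. */
static void fill_ascii_lower(struct case_table *t)
{
	for (int i = 0; i < 256; i++)
		t->lower[i] = (i < 0x80) ? (unsigned char)tolower(i)
					 : (unsigned char)i;
}

/* Return 1 if both names are equal after folding each byte via the table. */
static int names_equal(const struct case_table *t, const char *a, const char *b)
{
	assert(t && a && b);
	while (*a && *b) {
		if (t->lower[(unsigned char)*a] != t->lower[(unsigned char)*b])
			return 0;
		a++;
		b++;
	}
	return *a == *b;	/* equal only if both ended together */
}
```

With the lowercasing table, "a.txt" and "A.txt" compare equal; with the identity table they do not, which matches what I see with ls.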
My question is: Is this behaviour correct, or is it a bug?
--
Horms
Horms <[email protected]> writes:
> static struct nls_table table = {
> .charset = "utf8",
> .uni2char = uni2char,
> .char2uni = char2uni,
> .charset2lower = identity, /* no conversion */
> .charset2upper = identity,
> .owner = THIS_MODULE,
> };
>
> I guess it is charset2lower or charset2upper that vfat is calling,
> which make no conversion, thus leading to the problem I outlined above.
>
> My question is: Is this behaviour correct, or is it a bug?
This is a known bug. To fix it cleanly, we will need to change both
nls and the filesystems considerably.
Thanks.
--
OGAWA Hirofumi <[email protected]>
OGAWA Hirofumi <[email protected]> writes:
> Horms <[email protected]> writes:
>
>> static struct nls_table table = {
>> .charset = "utf8",
>> .uni2char = uni2char,
>> .char2uni = char2uni,
>> .charset2lower = identity, /* no conversion */
>> .charset2upper = identity,
>> .owner = THIS_MODULE,
>> };
>>
>> I guess it is charset2lower or charset2upper that vfat is calling,
>> which make no conversion, thus leading to the problem I outlined above.
>>
>> My question is: Is this behaviour correct, or is it a bug?
>
> This is a known bug. To fix it cleanly, we will need to change both
> nls and the filesystems considerably.
Also, fatfs has a "utf8" option, whose behaviour is probably preferable
to "iocharset=utf8". Unfortunately, "utf8" has problems too.
--
OGAWA Hirofumi <[email protected]>
On Sat, Oct 29, 2005 at 12:07:40AM +0900, OGAWA Hirofumi wrote:
> OGAWA Hirofumi <[email protected]> writes:
>
> > Horms <[email protected]> writes:
> >
> >> static struct nls_table table = {
> >> .charset = "utf8",
> >> .uni2char = uni2char,
> >> .char2uni = char2uni,
> >> .charset2lower = identity, /* no conversion */
> >> .charset2upper = identity,
> >> .owner = THIS_MODULE,
> >> };
> >>
> >> I guess it is charset2lower or charset2upper that vfat is calling,
> >> which make no conversion, thus leading to the problem I outlined above.
> >>
> >> My question is: Is this behaviour correct, or is it a bug?
> >
> > This is a known bug. To fix it cleanly, we will need to change both
> > nls and the filesystems considerably.
>
> Also, fatfs has a "utf8" option, whose behaviour is probably preferable
> to "iocharset=utf8". Unfortunately, "utf8" has problems too.
Thanks
Hi,
On Friday 28 October 2005 16:54, OGAWA Hirofumi wrote:
> Horms <[email protected]> writes:
> > I guess it is charset2lower or charset2upper that vfat is calling,
> > which make no conversion, thus leading to the problem I outlined above.
> >
> > My question is: Is this behaviour correct, or is it a bug?
>
> This is a known bug. To fix it cleanly, we will need to change both
> nls and the filesystems considerably.
Using per-locale collation sequences? :-)
Do you know how Windows handles the problem of differing collation
sequences on the file system? Or is the file system always dependent
on the locale of the Windows version that created the file system?
I'm so happy that Unix is not case insensitive :-)
Regards
Ingo Oeser
Ingo Oeser <[email protected]> writes:
>> This is a known bug. To fix it cleanly, we will need to change both
>> nls and the filesystems considerably.
>
> Using per-locale collation sequences? :-)
>
> Do you know how Windows handles the problem of differing collation
> sequences on the file system?
I don't know. Why do we need to care about the collation sequences here?
> Or is the file system always dependent on the locale of the Windows
> version that created the file system?
Probably, yes. I think we need to know the on-disk filename's code set.
--
OGAWA Hirofumi <[email protected]>
On Sun, 30 Oct 2005, OGAWA Hirofumi wrote:
> Ingo Oeser <[email protected]> writes:
> >> This is a known bug. To fix it cleanly, we will need to change both
> >> nls and the filesystems considerably.
> >
> > Using per-locale collation sequences? :-)
> >
> > Do you know how Windows handles the problem of differing collation
> > sequences on the file system?
>
> I don't know. Why do we need to care about the collation sequences here?
>
> > Or is the file system always dependent on the locale of the Windows
> > version that created the file system?
>
> Probably, yes. I think we need to know the on-disk filename's code set.
If FAT stores the filenames in 8 bits (non-UTF) then yes, it will be in
the current locale/code page of the Windows system writing them (e.g. that
happens with the names of EAs in NTFS).
If the names are stored in 16-bit Unicode like on NTFS then obviously they
are completely locale/code page independent. (Makes my life in NTFS a
_lot_ easier. Especially since the NTFS volume contains an upcase table
for the full 16-bit Unicode which we load and use to do upcasing for the
case insensitive comparisons...)
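Roughly, and only as a userspace sketch with a toy table (the names below are made up for illustration and this is not the fs/ntfs code; a real volume supplies its own table):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UPCASE_LEN 65536	/* one entry per 16-bit code unit */

/* Toy table: identity everywhere, except ASCII a-z mapped to A-Z. */
static void toy_upcase_init(uint16_t *upcase)
{
	for (uint32_t i = 0; i < UPCASE_LEN; i++)
		upcase[i] = (uint16_t)i;
	for (uint16_t c = 'a'; c <= 'z'; c++)
		upcase[c] = (uint16_t)(c - 'a' + 'A');
}

/* Return 1 if two 16-bit names are equal after upcasing every unit. */
static int names_equal_ci(const uint16_t *upcase,
			  const uint16_t *a, size_t a_len,
			  const uint16_t *b, size_t b_len)
{
	assert(upcase && a && b);
	if (a_len != b_len)
		return 0;
	for (size_t i = 0; i < a_len; i++) {
		if (upcase[a[i]] != upcase[b[i]])
			return 0;
	}
	return 1;
}
```

The point is that the comparison only ever consults the table, so no locale or code page enters into it.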
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
Anton Altaparmakov <[email protected]> writes:
>> Probably, yes. I think we need to know the on-disk filename's code set.
>
> If FAT stores the filenames in 8 bits (non-UTF) then yes, it will be in
> the current locale/code page of the Windows system writing them (e.g. that
> happens with the names of EAs in NTFS).
>
> If the names are stored in 16-bit Unicode like on NTFS then obviously they
> are completely locale/code page independent. (Makes my life in NTFS a
> _lot_ easier. Especially since the NTFS volume contains an upcase table
> for the full 16-bit Unicode which we load and use to do upcasing for the
> case insensitive comparisons...)
Yes, I learned that from fs/ntfs/*. :) Unfortunately, FAT always stores
filenames in an 8/16-bit code set. (Unicode (UCS2?) is stored only in
the case of long names.)
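To illustrate the long-name case, here is a sketch assuming the commonly documented 32-byte VFAT long-name directory slot, where 13 little-endian 16-bit units are split across three fields (5 + 6 + 2). The function and macro names are hypothetical, and this is not the fs/vfat code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE  32	/* size of one directory slot in bytes */
#define SLOT_CHARS 13	/* 16-bit name units carried by one slot */

/* Byte offsets of the 13 name units inside a slot: 5 + 6 + 2 units. */
static const uint8_t unit_offset[SLOT_CHARS] = {
	1, 3, 5, 7, 9,			/* first field, 5 units */
	14, 16, 18, 20, 22, 24,		/* second field, 6 units */
	28, 30				/* third field, 2 units */
};

/*
 * Copy the name units of one slot into out[], stopping at a 0x0000
 * terminator if present. Returns the number of units copied.
 */
static size_t slot_name_units(const uint8_t slot[SLOT_SIZE],
			      uint16_t out[SLOT_CHARS])
{
	size_t n = 0;

	assert(slot && out);
	for (size_t i = 0; i < SLOT_CHARS; i++) {
		uint16_t u = (uint16_t)(slot[unit_offset[i]] |
					(slot[unit_offset[i] + 1] << 8));
		if (u == 0x0000)
			break;
		out[n++] = u;
	}
	return n;
}
```

The short 8.3 name in the ordinary directory entry, by contrast, is still in the 8/16-bit code set, so both kinds of name have to be handled.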
Thanks.
--
OGAWA Hirofumi <[email protected]>