2002-03-01 23:42:35

by Urban Widmark

Subject: [PATCH] smbfs codepage fixes for 2.4.18


Ok, I think I've got something regarding the 2.4.18 oopses. (oopsen?)

There are two errors in the changes to smbfs in 2.4.18-rc3 (and 2.5.5):
1. SMB_MAXNAMELEN (max length of a single path component) was used where
SMB_MAXPATHLEN (max total path length) should have been used.
2. The charset conversion routine was modified to return errors as
negative values, but not all callers were changed to handle this. When an
"illegal" character was hit, the length of the string was set to
0xffffffff, and when computing the hash value it read outside the kernel
memory.

Attached is a patch vs 2.4.18 that fixes these issues for me. Please test
and let me know.

If I select a codepage/charset combination that doesn't match I now get a
somewhat cryptic message instead of an oops (just a temporary thing).
"smbfs: filename charset conversion failed"

The file is then hidden, which is bad. Conversion errors should map to '?'
as they used to, or be translated into ":####" strings. I'll do
something about that.
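
A rough sketch of what the ":####" idea could look like, in user space.
The exact format is an assumption (four hex digits of the character
value), and to_local and convert_with_escape are hypothetical names, not
functions from smbfs or the nls code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical converter: maps one 16-bit character to a local byte.
 * Returns 0 on success, -1 if the local charset has no such character.
 * In this sketch only ASCII counts as convertible. */
int to_local(unsigned int uni, char *out)
{
    if (uni >= 0x80)
        return -1;
    *out = (char)uni;
    return 0;
}

/* Convert n characters into out. Characters that fail to convert are
 * written as ":xxxx" instead of making the whole name fail.
 * Returns the output length (without the NUL), or -1 if out is too small. */
int convert_with_escape(const unsigned int *in, size_t n,
                        char *out, size_t outlen)
{
    size_t pos = 0, i;

    if (outlen == 0)
        return -1;
    for (i = 0; i < n; i++) {
        char c;

        if (to_local(in[i], &c) == 0) {
            if (pos + 1 >= outlen)      /* keep room for the NUL */
                return -1;
            out[pos++] = c;
        } else {
            if (pos + 5 >= outlen)      /* 5 chars plus the NUL */
                return -1;
            snprintf(out + pos, 6, ":%04x", in[i] & 0xffffu);
            pos += 5;
        }
    }
    out[pos] = '\0';
    return (int)pos;
}
```

With this, a name that contains one unconvertible character stays visible
(and recognizable) instead of being hidden.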


Some comments on what some of you have been doing:

The smbfs remote codepage can never be utf8 since there are no smb servers
that talk utf8. It can be one of the dos codepages, it can be blank or
with additional patches it can be a 2 byte little endian unicode format.

Furthermore, the local charset must be one that matches the chars used in
the remote set. Otherwise you get conversion errors. A few known good
combinations are:

cp850 <-> iso8859-1
cp866 <-> koi8-r
cp932 <-> euc-jp
(the right is the local = linux side)

See also the smb.conf manpage.

But even with these it seems to be possible to create chars that do not
match, and I think it is caused by windows trying to map unicode to a
codepage and not finding a matching char to use.

Local utf8 always matches the remote and is preferred if your system is
set up to handle it.

I would explain the reported
smb_proc_readdir_long: name=<directory> result=-2, rcls=1, err=2
as a name conversion problem. If the conversion failed one way it used to
be truncated and would then fail when sent back to the server. The error
is ERRDOS - ERRbadfile (File not found).

Check the config and the nls maps used.

/Urban


Attachments:
smbfs-2.4.18-codepage.patch (4.56 kB)

2002-03-02 07:38:31

by Christian Borntraeger

Subject: Re: [PATCH] smbfs codepage fixes for 2.4.18

Urban Widmark wrote:
> Attached is a patch vs 2.4.18 that fixes these issues for me. Please test
> and let me know.

There is no OOPS, which is good.

> If I select a codepage/charset combination that doesn't match I now get a
> somewhat cryptic message instead of an oops (just a temporary thing).
> "smbfs: filename charset conversion failed"

I see a lot of them.

my smb.conf:
character set = ISO8859-1
client code page = 850


But I think that my local code page is actually 8859-15 (I have euro support,
so it has to be 15).
Is that a problem? AFAIK the only difference between 1 and 15 is the
Euro sign.


> The smbfs remote codepage can never be utf8 since there are no smb servers
> that talk utf8. It can be one of the dos codepages, it can be blank or
> with additional patches it can be a 2 byte little endian unicode format.
>
> Furthermore, the local charset must be one that matches the chars used in
> the remote set. Otherwise you get conversion errors. A few known good
> combinations are:
>
> cp850 <-> iso8859-1
> cp866 <-> koi8-r
> cp932 <-> euc-jp
> (the right is the local = linux side)

> But even with these it seems to be possible to create chars that do not
> match, and I think it is caused by windows trying to map unicode to a
> codepage and not finding a matching char to use.

The computer I mount has samba 2.0.7. But I don't know which code page it is
running. If it is of interest I will ask.


> Local utf8 always matches the remote and is preferred if your system is
> setup to handle it.

I will try that someday. If I had the choice I would introduce 4-byte Unicode
for everything and forbid everything else......

> smb_proc_readdir_long: name=<directory> result=-2, rcls=1, err=2

If I run a find over all shares I still get some rare:

smb_proc_readdir_long: name=directory\*, result=-13, rcls=1, err=5

and

smb_proc_readdir_long: name=directory\*, result=-2, rcls=1, err=2

messages.
These directories are empty, as you posted above.


greetings

Christian

2002-03-03 13:21:33

by Urban Widmark

Subject: Re: [PATCH] smbfs codepage fixes for 2.4.18

On Sat, 2 Mar 2002, Christian Bornträger wrote:

> my smb.conf:
> character set = ISO8859-1
> client code page = 850

smbmount does not use smb.conf for these values.

It does matter what the server has as "client code page" (if anything), and
your client must use matching settings, either with the codepage/iocharset
mount options or with what was set as default in make *config.


> But I think, that my local code page is actually 8859-15 (I have euro-support
> so it has to be 15)
> Is that a problem? AFAIK the only difference between 1 and 15 is the
> Euro-sign.

I count 8 differences in the codepage->unicode mapping in the kernel.
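
For reference, the eight byte values where the two charsets differ can be
written as a small table. This is taken from the published ISO 8859-1 and
ISO 8859-15 charts, not copied from the kernel nls sources, and the struct
and function names are made up for this sketch.

```c
#include <stddef.h>

struct latin_diff {
    unsigned char  byte;    /* byte value in the 8-bit charset */
    unsigned short latin1;  /* Unicode code point under ISO 8859-1 */
    unsigned short latin9;  /* Unicode code point under ISO 8859-15 */
};

const struct latin_diff latin1_vs_latin9[] = {
    { 0xa4, 0x00a4, 0x20ac },   /* CURRENCY SIGN  -> EURO SIGN */
    { 0xa6, 0x00a6, 0x0160 },   /* BROKEN BAR     -> capital S with caron */
    { 0xa8, 0x00a8, 0x0161 },   /* DIAERESIS      -> small s with caron */
    { 0xb4, 0x00b4, 0x017d },   /* ACUTE ACCENT   -> capital Z with caron */
    { 0xb8, 0x00b8, 0x017e },   /* CEDILLA        -> small z with caron */
    { 0xbc, 0x00bc, 0x0152 },   /* ONE QUARTER    -> capital ligature OE */
    { 0xbd, 0x00bd, 0x0153 },   /* ONE HALF       -> small ligature oe */
    { 0xbe, 0x00be, 0x0178 },   /* THREE QUARTERS -> capital Y with diaeresis */
};

/* Map an ISO 8859-15 byte to Unicode: identical to ISO 8859-1 (where the
 * byte value equals the code point) except for the eight entries above. */
unsigned short latin9_to_unicode(unsigned char b)
{
    size_t i;

    for (i = 0; i < sizeof(latin1_vs_latin9) / sizeof(latin1_vs_latin9[0]); i++)
        if (latin1_vs_latin9[i].byte == b)
            return latin1_vs_latin9[i].latin9;
    return b;
}
```

So a filename that uses any of these eight bytes converts differently
depending on which of the two is configured as the local charset.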

Even if you pick the correct mappings you can get errors where the server
is failing to convert its unicode chars into codepage chars.

http://marc.theaimsgroup.com/?t=96709071500001&r=1&w=2
http://marc.theaimsgroup.com/?l=samba&m=96835905219782&w=2

What might be 0x00a8 (DIAERESIS) or 0x0308 (COMBINING DIAERESIS) is mapped
into 0x22 by the server, smbfs sees 0x22 and uses that for open requests
and others. The server then complains because 0x22 doesn't match 0x00a8
(or whatever that char is on the server side).

I have added 3 patches for 2.4.18 to my smbfs page:
http://www.hojdpunkten.ac.se/054/samba/index.html

00 - fixes the oopses on failed codepages, now maps failed conversions
into :## strings for debugging.
01 - adds LFS
02 - adds Unicode support

For 01 and 02 you need to patch samba and add some extra options when
mounting to activate them. Details on the page.


> If I run a find over all shares I still get some rare:
>
> smb_proc_readdir_long: name=directory\*, result=-13, rcls=1, err=5
> smb_proc_readdir_long: name=directory\*, result=-2, rcls=1, err=2

The first is access denied, permissions on the server?
The second is file not found, probably a char translation error.
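
How the numbers in those messages relate can be sketched like this. The
constants are the standard SMB values (class ERRDOS = 1, ERRbadfile = 2,
ERRnoaccess = 5) and result is a negative errno. The function name and the
-EIO fallback are assumptions of this sketch; it covers only the two codes
seen here and is not the full mapping smbfs uses.

```c
#include <errno.h>

#define ERRDOS       0x01   /* SMB error class: DOS-level error */
#define ERRbadfile   2      /* file not found */
#define ERRnoaccess  5      /* access denied */

/* Translate an (rcls, err) pair from the server into a negative errno,
 * as printed in the "result=" field of the log line. */
int smb_err_to_errno(int rcls, int err)
{
    if (rcls != ERRDOS)
        return -EIO;        /* other classes are not handled in this sketch */

    switch (err) {
    case ERRbadfile:
        return -ENOENT;     /* result=-2 on Linux */
    case ERRnoaccess:
        return -EACCES;     /* result=-13 on Linux */
    default:
        return -EIO;
    }
}
```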

/Urban