2004-09-17 21:02:46

by Samuel Thibault

[permalink] [raw]
Subject: [2.6] smbfs & "du" illness

Hi,

When smbmounting some samba-served share and running du on linux with
smbfs, one gets crazy results:

$ ls -l file
-rwxr-xr-x 1 samy thibault 348K 2004-05-12 18:04 file*
$ du file
512M file
$ stat file
File: `file'
Size: 356352 Blocks: 1048576 IO Block: 4096 fichier r?gulier
Device: bh/11d Inode: 80199 Links: 1
Access: (0755/-rwxr-xr-x) Uid: ( 1000/sthibaul) Gid: ( 101/ vmware)
Access: 2004-09-15 23:33:53.000000000 +0200
Modify: 2004-05-12 18:04:28.000000000 +0200
Change: 2004-05-12 18:04:28.000000000 +0200

This can be reproduced with a 2.6 kernel reading at any recent samba
server (2.2 or 3.0). 2.4 works fine.

What happens is that samba & kernel's smbfs don't agree on the meaning
of the 2nd 64-bit value in unix extension: samba/smbd/trans2.c tells
(and has always told since addition, cvs rev 1.149.4.47):
SOFF_T(p,0,get_allocation_size(NULL,&sbuf)); /* Number of bytes used on disk - 64 Bit
while the kernel does (and has always been doing since addition to
2.5.40)
fattr->f_blocks = LVAL(p, 8);
I.e. takes it as a number of sectors.

Who is wrong ? I could find some draft here:
http://uranus.it.swin.edu.au/~jn/linux/smbfs/Byron_Deadwiler_Paper.doc
which tells that:
CIFS Extensions for UNIX systems V1
LARGE_INTEGER NumOfBlocks
Number of file system block used to store file

Which is on the kernel's side...

But I discussed about it with samba people (see
http://lists.samba.org/archive/samba-technical/2004-September/thread.html
"Unix Extension & "du"" subject), and they told that they modified
it into bytes because there was no block size specified.

Some conservative way of correcting it is the following:

--- fs/smbfs/proc.c.vanilla 2004-09-17 22:18:38.000000000 +0200
+++ fs/smbfs/proc.c 2004-09-17 22:36:12.000000000 +0200
@@ -2095,6 +2095,9 @@ void smb_decode_unix_basic(struct smb_fa

fattr->f_size = LVAL(p, 0);
fattr->f_blocks = LVAL(p, 8);
+ if (fattr->f_blocks * 512 - fattr->f_size >= 512*512 - 512)
+ /* samba reports bytes, convert to sectors */
+ fattr->f_blocks >>= 9;
fattr->f_ctime = smb_ntutc2unixutc(LVAL(p, 16));
fattr->f_atime = smb_ntutc2unixutc(LVAL(p, 24));
fattr->f_mtime = smb_ntutc2unixutc(LVAL(p, 32));

Which works fine, even if samba people think back to tell a number of
sectors. Why 512*512 - 512 ? Because that's the minimum gap you'll
see between "bytes taken as sectors" size and real size: when the
file is 512 bytes and the disk uses 512b blocks.

Now another trouble is that samba people also use a minimum of 1Mo
(hence the number in the above fstat result). I'm not sure what to do
with this: should we then use file size which we divide into sectors
ourselves ? Or should we keep 1Mo, getting some strange results to some
extent ?

Regards,
Samuel Thibault


2004-09-25 16:42:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Fri, 17 Sep 2004, Samuel Thibault wrote:
>
> But I discussed about it with samba people (see
> http://lists.samba.org/archive/samba-technical/2004-September/thread.html
> "Unix Extension & "du"" subject), and they told that they modified
> it into bytes because there was no block size specified.

I hate to say so, but Jeremy is a git in this case.

There's definitely a very standard UNIX blocksize, and it's 512.

And yes, that's a _fixed_ blocksize. When you use "stat()", and you look
at "st_blocks", it's ALWAYS in 512-byte entities. It doesn't matter that
"st_blksize" might be something else - when UNIX counts blocks, it counts
them in 512-byte chunks.

So complaining that you don't have a blocksize is just silly.

This is a fairly common thinko, btw. Many people see "st_blocks" and
"st_blksize" in the UNIX stat info, and they think that they are defined
in terms of each other, and that disk usage is "st_blocks*st_blksize"
bytes. Not so. Disk usage is "st_blocks*512" bytes, and "st_blksize" is
something totally different - it's the _preferred_ IO blocking factor.

So let's all repeat after me: UNIX file "blocksize" is 512. It's not a
question of whether that value makes sense these days or not, it's just a
matter of simple _fact_.

> Some conservative way of correcting it is the following:

Since that won't fix "du" _anyway_, I think it's pointless. As you point
out:

> Now another trouble is that samba people also use a minimum of 1Mo
> (hence the number in the above fstat result). I'm not sure what to do
> with this: should we then use file size which we divide into sectors
> ourselves ? Or should we keep 1Mo, getting some strange results to some
> extent ?

I'm claiming that samba is broken, and we should not try to fix it on the
client side. We should ask the samba people to get their act together.
They've apparently been able to put _two_ bugs in one single integer:
both messing up the block size _and_ then using a totally illogical
minimum value for the thing.

I bet the minimum value comes from the fact that all files end up using
"n" bytes for things like inodes etc. Let's make up some numbers, and
assume that somebody thought that the minimum disk-space used was 2kB.
Instead of dividing that by 512, and coming up with the value "4", they
multiplied it by the block size and came up with the value 1Mb.

Whatever the reason, the minimum size is clearly a samba bug, even if you
were to (incorrectly, Jeremy) claim that there is no standard blocksize.

Linus

2004-09-25 17:11:48

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 09:41:59AM -0700, Linus Torvalds wrote:
>
> I hate to say so, but Jeremy is a git in this case.

Thanks, I prefer POSIX fascist myself :-).

> And yes, that's a _fixed_ blocksize. When you use "stat()", and you look
> at "st_blocks", it's ALWAYS in 512-byte entities. It doesn't matter that
> "st_blksize" might be something else - when UNIX counts blocks, it counts
> them in 512-byte chunks.

st_blocks and st_blksize are not in the POSIX spec
(I have one on my desk, and stroke it occasionally :-)
Besides which, on HPUX (which these extensions were
first created for) it returns st_blocks in 8192 byte
units, not 512, so your claim is incorrect.

That's why I got so pissed with the extensions spec
as it didn't specify a unit size. Rather an assume
"all the world is 512" which is plainly wrong, I
decided to make it a unit of bytes on the wire.
The client can then return in the correct blocksize
for it's own system.

> I'm claiming that samba is broken, and we should not try to fix it on the
> client side. We should ask the samba people to get their act together.
> They've apparently been able to put _two_ bugs in one single integer:
> both messing up the block size _and_ then using a totally illogical
> minimum value for the thing.

Well the minimum value is for Windows clients. They go a *lot*
faster with the minimum value as it tweaks some of their internal
buffers when they allocate. You can parameterize it (admittedly at
compile time, not runtime). I can make it runtime parameterized
in a later smbd if you want.

The CIFS client needs to divide the value returned by 512, or
whatever blocksize is being used on the UNIX clients. When
we have a 64-bit space it makes sense to return the absolute
bytes and let the client return it to userspace in whatever blocksize
it wants.

> I bet the minimum value comes from the fact that all files end up using
> "n" bytes for things like inodes etc. Let's make up some numbers, and
> assume that somebody thought that the minimum disk-space used was 2kB.
> Instead of dividing that by 512, and coming up with the value "4", they
> multiplied it by the block size and came up with the value 1Mb.

Nope. We have strange reasons for things, but they're usually
not *that* strange.

> Whatever the reason, the minimum size is clearly a samba bug, even if you
> were to (incorrectly, Jeremy) claim that there is no standard blocksize.

It's a Samba *feature* :-). But I agree for UNIX/Linux clients
it doesn't make much sense.

Jeremy.

2004-09-25 17:15:32

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 10:11:04AM -0700, Jeremy Allison wrote:

> Besides which, on HPUX (which these extensions were
> first created for) it returns st_blocks in 8192 byte
> units, not 512, so your claim is incorrect.

Oh yeah, and checking our configure.in, Stratos VOS
uses 4096 for the same units. That's when I gave up
in horror (I, like you, used to think it was 512 :-).
The original smbclient test code failed to return the
correct value when connecting to HPUX because of this,
we used to multiply by 512 bytes to show the user the
complete number.

Jeremy "standard, what standard ?" Allison.

2004-09-25 17:41:20

by YOSHIFUJI Hideaki

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

Hello.

In article <20040925171104.GN580@jeremy1> (at Sat, 25 Sep 2004 10:11:04 -0700), Jeremy Allison <[email protected]> says:

> > And yes, that's a _fixed_ blocksize. When you use "stat()", and you look
> > at "st_blocks", it's ALWAYS in 512-byte entities. It doesn't matter that
> > "st_blksize" might be something else - when UNIX counts blocks, it counts
> > them in 512-byte chunks.
>
> st_blocks and st_blksize are not in the POSIX spec
:
> That's why I got so pissed with the extensions spec
> as it didn't specify a unit size. Rather an assume
> "all the world is 512" which is plainly wrong, I
> decided to make it a unit of bytes on the wire.
> The client can then return in the correct blocksize
> for it's own system.

http://www.opengroup.org/onlinepubs/009695399/basedefs/sys/stat.h.html#tag_13_62

|The unit for the st_blocks member of the stat structure is not
|defined within IEEE Std 1003.1-2001. In some implementations it
|is 512 bytes. It may differ on a file system basis. There is no
|correlation between values of the st_blocks and st_blksize, and
|the f_bsize (from <sys/statvfs.h>) structure members.

--yoshfuji

2004-09-25 18:06:31

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> Thanks. Not only that. st_blocks is a UNIX standard, it's not
> in POSIX. I always try to keep the core fileserving code as
> close to POSIX as possible to allow people to do crazy things
> (like porting it to Windows :-).

Ehh, what's the problem with just having a simple config flag, and just
export whatever the underlying server exports. At least it won't be the
total mess that samba clearly makes of it now.

You cannot claim that the number samba now puts there makes _any_ sense.
Why the apparent 1MB minimum limit? And if it's just the size of the file
in bytes, then why bother with it at all? And if you cannot get a
blocksize from the OS, why make up a nonsensical number?

In other words, your arguments do not make any sense.

Let's go through this:

- you argue that you can't depend on st_blocks.

This implies that _any_ number you use for "number of bytes" _or_
"number of blocks" used on disk has _zero_ to do with reality.

This implies that the whole exercise is pointless, and you shouldn't
fill in the value at all.

- assuming you _do_ look at st_blocks on the server, you argue that you
cannot make sense of it.

Ok, so there may be broken systems out there. Tough. If you can't make
sense of the number, why are you trying to export it again?

In other words, neither of your arguments make any sense. You'd be better
off just initializing the field with plain zero, and not lying to the
client with a number that has no relevance to anything at all, and expect
the client to sort it out.

My argument is that if the only thing the client can really use the
information for is st_blocks (or equivalent) _anyway_, then just export
it. And if you think

unsigned long st_blocks = 0;

#ifdef HOST_HAS_ST_BLOCKS
st_blocks = stat->st_blocks;
#endif

is too ugly yo have in a header file somewhere, then why bother with
trying to put any number in there in the first place? What was the point
of:

CIFS Extensions for UNIX systems V1
LARGE_INTEGER NumOfBlocks
Number of file system block used to store file

again? Was it a CIFS Extension for POSIX? Or was it for UNIX like the
documentation specifies? Was it bytes, or was it blocks, like the
documentation says?

Methinks you've been looking too much at what MS does over the wire, and
that has made you think that it makes total sense to send random numbers
over the wire. Maybe so - maybe it's how you have to think if you make a
samba server. But if so, just _say_ so, instead of making arguments that
make no sense.

Linus

2004-09-25 18:12:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Linus Torvalds wrote:
>
> Methinks you've been looking too much at what MS does over the wire, and
> that has made you think that it makes total sense to send random numbers
> over the wire. Maybe so - maybe it's how you have to think if you make a
> samba server. But if so, just _say_ so, instead of making arguments that
> make no sense.

Btw, if you want to send bytes instead of blocks, I don't care. The Linux
client can easily do

blocks = bytes >> 9;

and I'll be perfectly happy. But if the "bytes" count you send has no
actual real-life meaning (ie it didn't actually come from the underlying
filesystem at all), then don't bother. The client might as well do

blocks = (filesize + 511) >> 9;

if that's what the server is (badly) mangling.

Linus


2004-09-25 18:21:07

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 11:12:04AM -0700, Linus Torvalds wrote:
>
> Btw, if you want to send bytes instead of blocks, I don't care. The Linux
> client can easily do
>
> blocks = bytes >> 9;
>
> and I'll be perfectly happy.

Good - that's all that needs doing.

> But if the "bytes" count you send has no
> actual real-life meaning (ie it didn't actually come from the underlying
> filesystem at all), then don't bother. The client might as well do
>
> blocks = (filesize + 511) >> 9;
>
> if that's what the server is (badly) mangling.

No, the number does have real life meaning if the underlying
OS supports st_blocks and st_blksize. We test for the presense
of these in configure (as they're not POSIX) and send the
correct values as bytes if they are there. We (smbd) takes care
of scaling them into bytes instead of the client having to
know if it's talking to HPUX in which case it's in 8192
byte units, or to VOS etc. etc.

To recap, if we have st_blocks from the filesystem we use it
and send the value scaled as bytes, if not we send the actual
file size there in bytes (as we know any POSIX system has at
least that).

Happy ?

Jeremy.

2004-09-25 18:25:24

by Samuel Thibault

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

Le sam 25 sep 2004 ? 11:20:21 -0700, Jeremy Allison a tapot? sur son clavier :
> On Sat, Sep 25, 2004 at 11:12:04AM -0700, Linus Torvalds wrote:
> >
> > Btw, if you want to send bytes instead of blocks, I don't care. The Linux
> > client can easily do
> >
> > blocks = bytes >> 9;
> >
> > and I'll be perfectly happy.
>
> Good - that's all that needs doing.

That's what I submitted, but:

> Happy ?

No, because there's still the minimum of 1MB... What have Microsoft
things to do with unix CIFS extensions ?...

Regards,
Samuel Thibault

2004-09-25 18:29:51

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 11:06:15AM -0700, Linus Torvalds wrote:

> You cannot claim that the number samba now puts there makes _any_ sense.

Yes I can :-).

> Why the apparent 1MB minimum limit? And if it's just the size of the file
> in bytes, then why bother with it at all? And if you cannot get a
> blocksize from the OS, why make up a nonsensical number?

As I've mentioned. The 1MB mess is to do with Windows clients
(it makes them faster with Samba). I apologise for that, you
can publicly beat me for it etc. and I promise to fix it for
Linux clients in future, but it works well for the majority
clients we have to support.

Also, the minumum size isn't the same issue as the st_blocks
issue.

> CIFS Extensions for UNIX systems V1
> LARGE_INTEGER NumOfBlocks
> Number of file system block used to store file
>
> again? Was it a CIFS Extension for POSIX? Or was it for UNIX like the
> documentation specifies? Was it bytes, or was it blocks, like the
> documentation says?

Some history. I turn up at HP and look at the patch they
have for this. They're putting the contents of st_blocks
from the OS (HPUX in their case) into this field. I then
write smbclient test code to test these extensions. Naturally
(and as god intended :-), I assume 512 byte blocks in this
value. When I look at the numbers they make no sense. So
I go digging. I find that 512 byte blocks are *not* guarenteed
(btw, thanks for your apology on being wrong on that... I'm
sure it's in the post :-) and so any clients using this have
no idea what this value actually means. So I have to decide
what this actually means. Yes, I could scale to 512 byte
blocks, but what if it's a HPUX client that expects 8192
byte blocks ? So I decide, unilaterally (like you in the
kernel I am god in the UNIX extensions space :-) to make this
the number returned from st_blocks scaled as bytes. After
all we have a 64 bit field so it has room. And then the
client can scale it to whatever strange block size the
userspace programs expect on that system.

My argument that POSIX doesn't have st_blocks was more
about the "it's always 512" issues, than what the code
does. Of course the code uses it if it's there.

Does it all make more sense now ?

Jeremy.

2004-09-25 18:31:57

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 11:29:07AM -0700, Jeremy Allison wrote:

> So I decide, unilaterally (like you in the
> kernel I am god in the UNIX extensions space :-) to make this
> the number returned from st_blocks scaled as bytes.

We haven't re-published the UNIX extensions doc yet with
this fix (as a v2 or so) as we're still working on adding
things like POSIX ACLs and byte range locks (and maybe a
POSIX open) so my apologies on the current doc being out
of date.

Jeremy.

2004-09-25 18:33:45

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 08:25:06PM +0200, Samuel Thibault wrote:
>
> That's what I submitted, but:
>
> > Happy ?
>
> No, because there's still the minimum of 1MB... What have Microsoft
> things to do with unix CIFS extensions ?...

Well in this case it's because they're using the same underlying
logic in the call. I'll fix this for Samba 3.0.8 or 9 - hand out
on samba-technical if you want the patch before then.

Ok, it's a bug. So kill me :-). Happy *now* ? :-).

Jeremy.

2004-09-25 19:16:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> To recap, if we have st_blocks from the filesystem we use it
> and send the value scaled as bytes, if not we send the actual
> file size there in bytes (as we know any POSIX system has at
> least that).
>
> Happy ?

Not really. Where did that 1MB minimum value come from?

Linus

2004-09-25 19:20:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> Also, the minumum size isn't the same issue as the st_blocks
> issue.

Well, it _is_.

Because right now the number is meaningless, and the Linux client is
apparently better off ignoring it totally.

Which was kind of my point to begin with.

If the number doesn't make sense, then we shouldn't look at it. The
_only_ thing that number makes sense for is st_blocks as far as the Linux
client is concerned.

In other words, the Linux client is a hell of a lot better off just taking
"(filesize + 511) >> 9", as far as I can tell. It's more accurate than the
random number you have.

Linus

2004-09-25 19:23:17

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 12:16:24PM -0700, Linus Torvalds wrote:
>
>
> On Sat, 25 Sep 2004, Jeremy Allison wrote:
> >
> > To recap, if we have st_blocks from the filesystem we use it
> > and send the value scaled as bytes, if not we send the actual
> > file size there in bytes (as we know any POSIX system has at
> > least that).
> >
> > Happy ?
>
> Not really. Where did that 1MB minimum value come from?

A Samba bug for Linux clients. Sending 1mb minimum for Windows
clients causes them to use larger internal buffers and do
larger reads. So making this value 1mb is good for us when
serving Windows clients.

The "blocks used on disk" logic in smbd is one function
called from all the (several :-) different info levels
that return this value. I made this mistake of calling
it for the UNIX info level return also, instead of returning
the real value for UNIX clients (who can handle the truth
rather better than Windows :-).

I'll fix it for the next Samba release. Happy *now* ? :-).

Jeremy.

2004-09-25 19:26:28

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 12:20:20PM -0700, Linus Torvalds wrote:
>
> If the number doesn't make sense, then we shouldn't look at it. The
> _only_ thing that number makes sense for is st_blocks as far as the Linux
> client is concerned.
>
> In other words, the Linux client is a hell of a lot better off just taking
> "(filesize + 511) >> 9", as far as I can tell. It's more accurate than the
> random number you have.

At present this is true, but just because our implementation of the spec
is broken (and yes it's complicated by the fact that we're the only
ones implementing the server side of this at the moment) doesn't mean
that the client should depend on this.

After all, now I know about it I'll fix it for the next release
and eventually modern clients and servers will be consistent on
this issue. But if you want to have a fallback hack that does
this at present then that's fine by me. It'd be nice if it was
temporary though.

Jeremy.

2004-09-25 19:29:47

by Samuel Thibault

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

Le sam 25 sep 2004 ? 12:22:24 -0700, Jeremy Allison a tapot? sur son clavier :
> I'll fix it for the next Samba release. Happy *now* ? :-).

As far as I am concerned, *now* I am indeed (although buggy samba
releases will unfortunately still last for some time...).

Regards,
Samuel Thibault

2004-09-25 19:53:39

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 12:20:20PM -0700, Linus Torvalds wrote:
>
> Because right now the number is meaningless, and the Linux client is
> apparently better off ignoring it totally.

Actually, just to be clear - the number isn't completely
meaningless, it's the actual size on disk (from the st_blocks
if they're available, filesize if not) rounded up to the nearest
1mb boundary. Just didn't want you to think we were randomly
returning 1mb. It's a meaningful number, it's just the granularity
that's a bit off :-).

Jeremy.

2004-09-25 20:21:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> On Sat, Sep 25, 2004 at 12:20:20PM -0700, Linus Torvalds wrote:
> >
> > Because right now the number is meaningless, and the Linux client is
> > apparently better off ignoring it totally.
>
> Actually, just to be clear - the number isn't completely
> meaningless, it's the actual size on disk (from the st_blocks
> if they're available, filesize if not) rounded up to the nearest
> 1mb boundary. Just didn't want you to think we were randomly
> returning 1mb. It's a meaningful number, it's just the granularity
> that's a bit off :-).

I repeat: the Linux client is apparently better off ignoring it totally.

That makes it meaningless, Jeremy.

meaningless (vs. meaningful), nonmeaningful -- (having no meaning or
direction or purpose; "a meaningless endeavor"; "a meaningless
life"; "a verbose but meaningless explanation")

It clearly has no meaning or direction or purpose for any sane client.

After all, the only thing we can use it for is st_blocks, and since the
granularity is _so_ big, we're much better off looking at the file size
and guessing from that.

I still don't see why you argue for that totally meaningless number. As
far as I can tell, the _only_ thing it matters for is some Windows
benchmark.

Tell me again: why should the Linux client look at that number? Give me
just _one_ valid reason.

Linus

2004-09-25 21:11:38

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 01:21:29PM -0700, Linus Torvalds wrote:
>
> I repeat: the Linux client is apparently better off ignoring it totally.
>
> That makes it meaningless, Jeremy.

Currently I can't argue with you on that.

> After all, the only thing we can use it for is st_blocks, and since the
> granularity is _so_ big, we're much better off looking at the file size
> and guessing from that.
> Tell me again: why should the Linux client look at that number? Give me
> just _one_ valid reason.

Right now (Samba 3.0.7) you are correct. But the intent is
to fix it going forward, so that in non-broken server implementations
(yes the Samba implementation is broken right now) then it
will be correct (ie. follow the intent of the spec).

Sorry, can't fix the past, I can only fix the future :-).

Jeremy.

2004-09-25 21:59:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> Right now (Samba 3.0.7) you are correct. But the intent is
> to fix it going forward, so that in non-broken server implementations
> (yes the Samba implementation is broken right now) then it
> will be correct (ie. follow the intent of the spec).
>
> Sorry, can't fix the past, I can only fix the future :-).

So let's work out something that works well now, _and_ might have a chance
in hell of working in the future too.

What kinds of values do different smb servers actually fill this field
with? And what's the value you expect future samba severs will use?

Ie, will something like this DTRT in smb_decode_unix_basic():

u64 size = LVAL(p, 0); /* file size */
u64 blocks = LVAL(p, 8); /* pseudo-blocks */
u32 real_blocks;

fattr->f_size = size;
real_blocks = (size + 511) >> 9;

/*
* old samba sends "bytes used on disk rounded up to 1MB",
* which is useless - ignore it. But if the low bits are
* set, maybe the value is valid..
*/
if (blocks & 0xfffff) {
/*
* If the low 9 bits are non-zero, it can't
* be a byte count, but maybe it's a block
* count?
*/
if (blocks & 0x1ff) {
.. use it how? ..
} else {
real_blocks = blocks >> 9;
}
}
fattr->f_blocks = real_blocks;

the above is ugly as hell, but what choice is there?

It would be good to know what all different server implementations can
do, to know just _how_ defensive this code has to be.

Linus

2004-09-25 22:09:27

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 02:59:08PM -0700, Linus Torvalds wrote:
>
> So let's work out something that works well now, _and_ might have a chance
> in hell of working in the future too.

Ok. BTW I've fixed this in the current SVN codebase
to do the right thing (return st_blocks expressed as
bytes).

> What kinds of values do different smb servers actually fill this field
> with? And what's the value you expect future samba severs will use?

Ok, right now the only smb server that supports the UNIX
extensions is smbd (I'm working on getting NetApp to also
support them, but they don't as yet so we can get them to
do the right thing, bytes allocated, when they do), so we
know pretty much what will happen.

> Ie, will something like this DTRT in smb_decode_unix_basic():
>
> u64 size = LVAL(p, 0); /* file size */
> u64 blocks = LVAL(p, 8); /* pseudo-blocks */
> u32 real_blocks;
>
> fattr->f_size = size;
> real_blocks = (size + 511) >> 9;
>
> /*
> * old samba sends "bytes used on disk rounded up to 1MB",
> * which is useless - ignore it. But if the low bits are
> * set, maybe the value is valid..
> */
> if (blocks & 0xfffff) {
> /*
> * If the low 9 bits are non-zero, it can't
> * be a byte count, but maybe it's a block
> * count?
> */
> if (blocks & 0x1ff) {
> .. use it how? ..
> } else {
> real_blocks = blocks >> 9;
> }
> }
> fattr->f_blocks = real_blocks;
>
> the above is ugly as hell, but what choice is there?

Not much, sorry.

> It would be good to know what all different server implementations can
> do, to know just _how_ defensive this code has to be.

It has to deal with Samba sending bytes allocated on disk (from st_blocks),
rounded up to the nearest 1mb (or whatever the local admin compiled with,
but most people don't mess with that compile-time parameter) or starting
with Samba 3.0.8 it'll send the value from the server st_blocks expressed
as bytes. So it doesn't have to be too bad (I hope).

Jeremy.

2004-09-25 22:19:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness



On Sat, 25 Sep 2004, Jeremy Allison wrote:
>
> > What kinds of values do different smb servers actually fill this field
> > with? And what's the value you expect future samba severs will use?
>
> Ok, right now the only smb server that supports the UNIX
> extensions is smbd (I'm working on getting NetApp to also
> support them, but they don't as yet so we can get them to
> do the right thing, bytes allocated, when they do), so we
> know pretty much what will happen.

Ok. Then we don't have to worry about somebody using a block count instead
of a byte count anywhere. That simplifies things a bit, at least.

Samuel, does this patch work for you? I'll commit if after somebody
reports that it works (assuming nobody points out any stupid thinkos
in it).

This will inevitably get the disk usage a _bit_ wrong if the file really
_does_ happen to use up an exact multiple of 1MB of disk, but hey, having
a heuristic that is sometimes a bit wrong is better than having one that
is always very wrong.

This is totally untested, btw. For obvious reasons.

Linus

----
===== fs/smbfs/proc.c 1.40 vs edited =====
--- 1.40/fs/smbfs/proc.c 2004-07-11 02:23:29 -07:00
+++ edited/fs/smbfs/proc.c 2004-09-25 15:15:23 -07:00
@@ -2076,6 +2076,8 @@

void smb_decode_unix_basic(struct smb_fattr *fattr, char *p)
{
+ u64 size, disk_bytes;
+
/* FIXME: verify nls support. all is sent as utf8? */

fattr->f_unix = 1;
@@ -2093,8 +2095,19 @@
/* 84 L permissions */
/* 92 L link count */

- fattr->f_size = LVAL(p, 0);
- fattr->f_blocks = LVAL(p, 8);
+ size = LVAL(p, 0);
+ disk_bytes = LVAL(p, 8);
+
+ /*
+ * Some samba versions round up on-disk byte usage
+ * to 1MB boundaries, making it useless. When seeing
+ * that, use the size instead.
+ */
+ if (!(disk_bytes & 0xfffff))
+ disk_bytes = size+511;
+
+ fattr->f_size = size;
+ fattr->f_blocks = disk_bytes >> 9;
fattr->f_ctime = smb_ntutc2unixutc(LVAL(p, 16));
fattr->f_atime = smb_ntutc2unixutc(LVAL(p, 24));
fattr->f_mtime = smb_ntutc2unixutc(LVAL(p, 32));

2004-09-25 22:24:33

by Jeremy Allison

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 03:18:50PM -0700, Linus Torvalds wrote:
>
> Ok. Then we don't have to worry about somebody using a block count instead
> of a byte count anywhere. That simplifies things a bit, at least.

Good. Glad to make your life easier sometimes :-).

> This will inevitably get the disk usage a _bit_ wrong if the file really
> _does_ happen to use up an exact multiple of 1MB of disk, but hey, having
> a heuristic that is sometimes a bit wrong is better than having one that
> is always very wrong.

Yep. My bug, sorry. Should be fixed from now on.

> This is totally untested, btw. For obvious reasons.

Ah, you just wait until we've finished making smb a tier 1
unix to unix filesystem. You'll be using it every day :-) :-).

Jeremy.

2004-09-25 22:40:49

by Samuel Thibault

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

Le sam 25 sep 2004 ? 15:18:50 -0700, Linus Torvalds a tapot? sur son clavier :
> Samuel, does this patch work for you?

Yes it does, thanks !

Regards,
Samuel Thibault

2004-09-26 00:44:42

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [2.6] smbfs & "du" illness

On Sat, Sep 25, 2004 at 11:29:07AM -0700, Jeremy Allison wrote:
>
> Some history. I turn up at HP and look at the patch they
> have for this. They're putting the contents of st_blocks
> from the OS (HPUX in their case) into this field. I then
> write smbclient test code to test these extensions. Naturally
> (and as god intended :-), I assume 512 byte blocks in this
> value. When I look at the numbers they make no sense.

HPUX... it *reminds* you of Unix....

- Ted