2000-12-05 03:56:41

by Linus Torvalds

[permalink] [raw]
Subject: test12-pre5


Ok, this contains one of the fixes for the dirty inode buffer list (the
other fix is pending, simply because I still want to understand why it
would be needed at all). Al?

Also, it has the final installment of the PageDirty handling, and now
officially direct IO can work by just marking the physical page dirty and
be done with it. NFS along with all filesystems have been converted, the
one hold-out still being smbfs.

Who works on smbfs these days? I see two ways of fixing smbfs now that
"writepage()" only gets an anonymous page and no "struct file" information
any more (this also fixes the double page unlock that Andrew saw).

- disable shared mmap over smbfs (very easily done by just setting
writepage to NULL)

- fetch the dentry that writepage needs by just looking at the
inode->i_dentry list and/or just make the smbfs page cache host be the
dentry instead of the inode like other filesystems. The first approach
assumes that all paths are equal for writeout, the second one assumes
that there are no hard linking going on in smbfs.

Somebody more knowledgeable than I will have to make the decision
(otherwise I'll just end up disabling shared mmap - I doubt anybody really
uses it anyway, but it would be more polite to just support it).

NOTE! There's another change to "writepage()" semantics than just dropping
the "struct file": the new writepage() is supposed to mirror the logic of
readpage(), and unlock the page when it is done with it. This allows the
VM system more visibility into what IO is pending (which the VM doesn't
take advantage of yet, but now it can _truly_ use the same logic for both
swapout and for dirty file writeback).

The other change is that I forward-ported the ymfpci driver from 2.2.18,
as it works better than the ALSA one on my now-to-be-main-laptop ;)

[ Alan - I ahve your patches in my incoming queue still, I wanted to get
an interim version out to check with Al on the block list and the VM
stuff with Rik and people. ]

Linus

----
- pre5:
- Jaroslav Kysela: ymfpci driver
- me: get rid of bogus MS_INVALIDATE semantics
- me: final part of the PageDirty() saga
- Rusty Russell: 4-way SMP iptables fix
- Al Viro: oops - bad ext2 inode dirty block bug

- pre4:
- Andries Brouwer: final isofs pieces.
- Kai Germaschewski: ISDN
- play CD audio correctly, don't stop after 12 minutes.
- Anton Altaparmakov: disable NTFS mmap for now, as it doesn't work.
- Stephen Tweedie: fix inode dirty block handling
- Bill Hartner: reschedule_idle - prefer right cpu
- Johannes Erdfelt: USB updates
- Alan Cox: synchronize
- Richard Henderson: alpha updates and optimizations
- Geert Uytterhoeven: fbdev could be fooled into crashing fix
- Trond Myklebust: NFS filehandles in inode rather than dentry

- pre3:
- me: more PageDirty / swapcache handling
- Neil Brown: raid and md init fixes
- David Brownell: pci hotplug sanitization.
- Kanoj Sarcar: mips64 update
- Kai Germaschewski: ISDN sync
- Andreas Bombe: ieee1394 cleanups and fixes
- Johannes Erdfelt: USB update
- David Miller: Sparc and net update
- Trond Myklebust: RPC layer SMP fixes
- Thomas Sailer: mixed sound driver fixes
- Tigran Aivazian: use atomic_dec_and_lock() for free_uid()

- pre2:
- Peter Anvin: more P4 configuration parsing
- Stephen Tweedie: O_SYNC patches. Make O_SYNC/fsync/fdatasync
do the right thing.
- Keith Owens: make mdule loading use the right struct module size
- Boszormenyi Zoltan: get MTRR's right for the >32-bit case
- Alan Cox: various random documentation etc
- Dario Ballabio: EATA and u14-34f update
- Ivan Kokshaysky: unbreak alpha ruffian
- Richard Henderson: PCI bridge initialization on alpha
- Zach Brown: correct locking in Maestro driver
- Geert Uytterhoeven: more m68k updates
- Andrey Savochkin: eepro100 update
- Dag Brattli: irda update
- Johannes Erdfelt: USB update

- pre1: (for ISDN synchronization _ONLY_! Not complete!)
- Byron Stanoszek: correct decimal precision for CPU MHz in
/proc/cpuinfo
- Ollie Lho: SiS pirq routing.
- Andries Brouwer: isofs cleanups
- Matt Kraai: /proc read() on directories should return EISDIR, not EINVAL
- me: be stricter about what we accept as a PCI bridge setup.
- me: always set PCI interrupts to be level-triggered when we enable them.
- me: updated PageDirty and swap cache handling
- Peter Anvin: update A20 code to work without keyboard controller
- Kai Germaschewski: ISDN updates
- Russell King: ARM updates
- Geert Uytterhoeven: m68k updates


2000-12-05 04:18:06

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Mon, 4 Dec 2000, Linus Torvalds wrote:

>
> Ok, this contains one of the fixes for the dirty inode buffer list (the
> other fix is pending, simply because I still want to understand why it
> would be needed at all). Al?

See previous posting. BTW, -pre5 doesn't do the right thing in clear_inode().

Scenario: bh of indirect block is busy (whatever reason, flush_dirty_buffers(),
anything that can bump ->b_count for a while). ext2_truncate() frees the
thing and does bforget(). bh is left on the inode's list. Woops...

The minimal fix would be to make clear_inode() empty the list. IMO it's
worse than preventing the freed stuff from being on that list...

Comments?

2000-12-05 04:36:22

by Linus Torvalds

[permalink] [raw]
Subject: Re: test12-pre5



On Mon, 4 Dec 2000, Alexander Viro wrote:
>
> On Mon, 4 Dec 2000, Linus Torvalds wrote:
> >
> > Ok, this contains one of the fixes for the dirty inode buffer list (the
> > other fix is pending, simply because I still want to understand why it
> > would be needed at all). Al?
>
> See previous posting. BTW, -pre5 doesn't do the right thing in clear_inode().
>
> Scenario: bh of indirect block is busy (whatever reason, flush_dirty_buffers(),
> anything that can bump ->b_count for a while). ext2_truncate() frees the
> thing and does bforget(). bh is left on the inode's list. Woops...

So? Why wouldn't clear_inode() get rid of it?

> The minimal fix would be to make clear_inode() empty the list. IMO it's
> worse than preventing the freed stuff from being on that list...

This _is_ what clear_inode() does in pre5 (and in pre4, for that matter):

void clear_inode(struct inode *inode)
{
if (!list_empty(&inode->i_dirty_buffers))
invalidate_inode_buffers(inode);

...

which I find perfectly readable. And should mean that no dirty buffers
should be associated with the inode any more. ext2 calls clear_inode()
from ext2_free_inode(), and as far as I can tell the only thing that can
happen after that is that the inode is still scheduled for write-out
(which explains how the bug you fixed would cause a dirty block to be
attached to the inode _after_ we did a clear_inode() on it).

Or are you thinking of something else?

Linus

2000-12-05 05:01:47

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Mon, 4 Dec 2000, Linus Torvalds wrote:

> So? Why wouldn't clear_inode() get rid of it?

It will. Mea culpa. However, other reasons for taking the bh of freed
block from the list still stand. IOW, consider that part as an
optimization.

2000-12-05 06:01:48

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: test12-pre5

The following fixes to many arguments error in fs/udf/inode.c for
test12-pre5

--- fs/udf/inode.c.orig Mon Dec 4 23:34:23 2000
+++ fs/udf/inode.c Tue Dec 5 00:26:59 2000
@@ -202,7 +202,7 @@
mark_buffer_dirty(bh);
udf_release_data(bh);

- inode->i_data.a_ops->writepage(NULL, page);
+ inode->i_data.a_ops->writepage(page);
UnlockPage(page);
page_cache_release(page);


--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-12-05 08:18:12

by Andrew Morton

[permalink] [raw]
Subject: Re: test12-pre5

Linus Torvalds wrote:
>
> Ok, this contains one of the fixes for the dirty inode buffer list (the
> other fix is pending, simply because I still want to understand why it
> would be needed at all). Al?

I've run the same test suite against vanilla test12-pre5 on two machines for
five hours. On ext2/IDE/SMP+UP it's solid.

I'll test Al's latest bforget_inode patch overnight, but that's already
been through the wringer once.

> Also, it has the final installment of the PageDirty handling, and now
> officially direct IO can work by just marking the physical page dirty and
> be done with it. NFS along with all filesystems have been converted, the
> one hold-out still being smbfs.
>
> Who works on smbfs these days? I see two ways of fixing smbfs now that
> "writepage()" only gets an anonymous page and no "struct file" information
> any more (this also fixes the double page unlock that Andrew saw).
^^^^^^
Mike Galbraith.

-

2000-12-05 16:20:09

by Juri Haberland

[permalink] [raw]
Subject: Re: test12-pre5

Linus Torvalds wrote:
> NOTE! There's another change to "writepage()" semantics than just dropping
> the "struct file": the new writepage() is supposed to mirror the logic of
> readpage(), and unlock the page when it is done with it. This allows the
> VM system more visibility into what IO is pending (which the VM doesn't
> take advantage of yet, but now it can _truly_ use the same logic for both
> swapout and for dirty file writeback).

Or maybe readpage should *not* unlock the page. What if we wanted to
follow the writepage immediately by traversing the page's buffers? We'd
have to lock the page again, and we wouldn't know what happened in the
interim.

Thanks for fixing the (struct file *)'s! (major wart gone)

--
Daniel

2000-12-05 17:43:43

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: test12-pre5

Hi,

On Mon, Dec 04, 2000 at 08:00:03PM -0800, Linus Torvalds wrote:
>
> On Mon, 4 Dec 2000, Alexander Viro wrote:
> >
> This _is_ what clear_inode() does in pre5 (and in pre4, for that matter):
>
> void clear_inode(struct inode *inode)
> {
> if (!list_empty(&inode->i_dirty_buffers))
> invalidate_inode_buffers(inode);

That is still buggy. We MUST NOT invalidate the inode buffers unless
i_nlink == 0, because otherwise a subsequent open() and fsync() will
have forgotten what buffers are dirty, and hence will fail to
synchronise properly with the disk.

Al, I agreed with your observation on bforget() needing the
remove_inode_queue() call. Is there anywhere else we need it?

--Stephen

2000-12-05 17:59:37

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Stephen C. Tweedie wrote:

> That is still buggy. We MUST NOT invalidate the inode buffers unless
> i_nlink == 0, because otherwise a subsequent open() and fsync() will
> have forgotten what buffers are dirty, and hence will fail to
> synchronise properly with the disk.

Correction: they _will_ eventually end up on disk, but yes, fsync() may
miss them.

> Al, I agreed with your observation on bforget() needing the
> remove_inode_queue() call. Is there anywhere else we need it?

unmap_buffer(). Same story, but on the data side. I don't see anything else
right now.

2000-12-05 18:21:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Stephen C. Tweedie wrote:
>
> On Mon, Dec 04, 2000 at 08:00:03PM -0800, Linus Torvalds wrote:
> >
> > On Mon, 4 Dec 2000, Alexander Viro wrote:
> > >
> > This _is_ what clear_inode() does in pre5 (and in pre4, for that matter):
> >
> > void clear_inode(struct inode *inode)
> > {
> > if (!list_empty(&inode->i_dirty_buffers))
> > invalidate_inode_buffers(inode);
>
> That is still buggy. We MUST NOT invalidate the inode buffers unless
> i_nlink == 0, because otherwise a subsequent open() and fsync() will
> have forgotten what buffers are dirty, and hence will fail to
> synchronise properly with the disk.

Are you all on drugs?

Look at where clear_inode() is called. It's called by
ext2_delete_inode(). It's not called by close(). Never has, never will.

clear_inode() _destroys_ the inode. WE HAVE TO GET RID OF BUFFERS at that
point. Because there won't be anything left to index to. We're going to
free the memory in RAM held by the inode. Blathering on about
"i_nlink" etc is a complete and utter waste of time, because even if nlink
is a million, there's no way we could leave the buffers indexed on the
inode. We _will_ call destroy_inode() soon afterwards.

The thing that protects the inode while it is still truly in use, is the
CAN_UNUSE() macro, which explicitly tests that the inode is not used by
anybody and has no dirty pages. If that macro doesn't work, then we have a
damn more serious problem than nlink to worry about.

Get your acts together, guys. Stop blathering and frothing at the mouth.
The only time clear_inode() should be called is (a) when we prune the
inode cache - and we CLEARLY cannot prune an inode if it still has dirty
blocks associated with it and CAN_UNUSE() already enforces this or (b)
when we're getting rid of the very last occurrence of the inode. In one
case the dirty list is already empty. In the other, invalidating it is
clearly the right thing and the _required_ thing to do.

So I repeat: are there known bugs in this area left in pre5? And with
"bugs", I don't mean fever-induced rants like the above (*).

I don't see any. Andrew cannot re-create any. But I won't rest easy until
people agree that the code is ok as-is. After that I can consider applying
optimizations, because right now my personal conviction is that the
patches from Al are cleanups and optimizations, but _not_ bug-fixes. And
they will absolutely not get applied before there is some consensus on
these issues.

Linus

(*) And yes, you can smack me on the head for that outburst if it turns
out that I just didn't see anything. I'll apologize. But right now I'm
irritated.

2000-12-05 18:45:24

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Linus Torvalds wrote:

> Get your acts together, guys. Stop blathering and frothing at the mouth.
> The only time clear_inode() should be called is (a) when we prune the
> inode cache - and we CLEARLY cannot prune an inode if it still has dirty
> blocks associated with it and CAN_UNUSE() already enforces this or (b)
> when we're getting rid of the very last occurrence of the inode. In one
> case the dirty list is already empty. In the other, invalidating it is
> clearly the right thing and the _required_ thing to do.


Sigh. OK, let me put it that way:

* we _can_ have dirty blocks on the list when inode gets freed.
* no, CAN_UNUSE will not see them.
* no, we do not give a damn about forgetting them.

So Stephen is right wrt fsync() (it will not get that stuff on disk).
However, it's not a bug - if that crap will not end up on disk we
will only win.

However, I think that things would be cleaner if we wouldn't have that
call in clear_inode(). Why? Because filesystems that had ordered writing
the blocks would better tell us when they are no longer interested in it.
It does not matter for the current code. It may matter a lot for any
fs that does write ordering of any kind.

IOW, while with the current tree invalidate_inode_buffers() is not a band-aid
(just a bad misnomer) it may easily become such.

> So I repeat: are there known bugs in this area left in pre5?

AFAIK, none. I _really_ don't like the way we handle that stuff now (at
the very least let's rename the bloody thing - it doesn't invalidate
anything), but as far as I can see the code is technically correct.

2000-12-05 19:05:10

by Linus Torvalds

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Alexander Viro wrote:
>
> Sigh. OK, let me put it that way:
>
> * we _can_ have dirty blocks on the list when inode gets freed.

Agreed.

> * no, CAN_UNUSE will not see them.

CAN_UNUSE() is not used at all for the final forcible removal of an inode
that has no users.

> * no, we do not give a damn about forgetting them.

Indeed. We'd be better off marking them clean, to make sure they never hit
the disk.

> So Stephen is right wrt fsync() (it will not get that stuff on disk).
> However, it's not a bug - if that crap will not end up on disk we
> will only win.

Stephen is _wrong_ wrt fsync().

Why?

Think about it for a second. How the hell could you even _call_ fsync() on
a file that no longer exists, and has no file handles open to it?

By the time we call clear_inode(), fsync() had better not be an issue any
more. The only reason fscyn() could be an issue is if somebody starts
calling clear_inode() left and right, but hey, if that happens your
filesystem would be so FUBAR'ed that you really wouldn't care any more.
Data integrity of "fsync()" is the least of your worries.

Linus

2000-12-05 19:24:29

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: test12-pre5

Hi,

On Tue, Dec 05, 2000 at 09:48:51AM -0800, Linus Torvalds wrote:
>
> On Tue, 5 Dec 2000, Stephen C. Tweedie wrote:
> >
> > That is still buggy. We MUST NOT invalidate the inode buffers unless
> > i_nlink == 0, because otherwise a subsequent open() and fsync() will
> > have forgotten what buffers are dirty, and hence will fail to
> > synchronise properly with the disk.
>
> Are you all on drugs?
>
> Look at where clear_inode() is called. It's called by
> ext2_delete_inode(). It's not called by close(). Never has, never will.

It is also called during prune_icache(). Yes, I know that the
CAN_UNUSE() macro should make sure we never try to do this on a
count==0 inode which still has dirty buffers, but I'd feel happier if
we had the safety net to guard against any callers ever doing a
clear_inode while we have dirty buffers around.

We have two completely different paths to clear_inode() here: one in
which there had better not be any dirty buffers or we've got a data
corrupter on our hands, and a second in which dirty buffers are
irrelevant because we're doing a delete.

That's the problem --- doing the invalidate in clear_inode() feels
like the wrong place to do it, because our treatment of the dirty
buffers list differs depending on the caller. I'd be much happer with
the delete case doing the invalidate, and leave clear_inode BUG()ing
if there are any dirty buffers left, and that's basically what the
test with i_nlink==0 did.

> So I repeat: are there known bugs in this area left in pre5? And with
> "bugs", I don't mean fever-induced rants like the above (*).

No, I don't think so.

Cheers,
Stephen

2000-12-05 19:30:49

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Linus Torvalds wrote:

> > So Stephen is right wrt fsync() (it will not get that stuff on disk).
> > However, it's not a bug - if that crap will not end up on disk we
> > will only win.
>
> Stephen is _wrong_ wrt fsync().
>
> Why?
>
> Think about it for a second. How the hell could you even _call_ fsync() on
> a file that no longer exists, and has no file handles open to it?
^^^^^^^^^^^^^^
clear_inode() <- dispose_list() <- prune_icache().

IOW, if file still exists, but is closed by everyone, etc. you _can_
get clear_inode() on it. <thinks> Ah, I see your point. OK, how about that:
* clear_inode() _can_ be called for still-alive objects.
* no matter how it is called, we don't give a damn for the stuff
on the list.
* moreover, if it gets called for object that is still alive the
list is just empty. It doesn't contain anything valuable (as in every
case) _and_ it doesn't contain random crap.

If that's what you were talking about - fine with me.

2000-12-05 20:20:33

by Linus Torvalds

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Alexander Viro wrote:
> >
> > Stephen is _wrong_ wrt fsync().
> >
> > Why?
> >
> > Think about it for a second. How the hell could you even _call_ fsync() on
> > a file that no longer exists, and has no file handles open to it?
> ^^^^^^^^^^^^^^
> clear_inode() <- dispose_list() <- prune_icache().

No. prune_icache() will absolutely _refuse_ to prune an inode that is
still in use. Where "in use" is defined to be an aggregate of many things,
including the fact that the inode is dirty (even if there are no actual
references to it any more) _or_ the fact that the inode has cached data
associated with it.

Page cache counts as cached data.

As does dirty buffers.

So clean_inode() is basically always called only for "dead" objects.

And this is not just a "it happens to be like this" kind of thing. It
_has_ to be like this, because every time we call clear_inode() we are
going to physically free the memory associated with the inode very soon
afterwards. Which means that _any_ use of the inode had better be long
gone. Dirty buffers included.

So it's definitely ok to say that "once we get to clean_inode(), there is
no way in h*ll that dirty buffers can be valid any more". Because either
we have checked that by hand (prune_icache), or we're killing the inode
outright (no more users and the inode has been removed).

Linus

2000-12-05 20:57:10

by Alexander Viro

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Linus Torvalds wrote:

> And this is not just a "it happens to be like this" kind of thing. It
> _has_ to be like this, because every time we call clear_inode() we are
> going to physically free the memory associated with the inode very soon
> afterwards. Which means that _any_ use of the inode had better be long
> gone. Dirty buffers included.

Urgh. Linus, AFAICS we _all_ agree on that. The only real question is
whether we consider calling clear_inode() with droppable dirty buffers
to be OK. It can't happen on the dispose_list() path and I'ld rather
see it _not_ happening on the delete_inode() one. It's a policy question,
not the correctness one.

IOW, I would prefer to have BUG() instead of invalidate_inode_buffers()
and let the ->delete_inode() make sure that list is empty. I'm not saying
that current code doesn't work. However, "let's clean after the
truncate_inode_page()/foo_delete_inode(), they might leave some junk
on the list" looks like a wrong thing.

Notice that policy wrt pages is already of that kind - clear_inode()
expects the callers to make sure that ->i_data.nr_pages is zero instead of
trying to clean after them. I think that we will be better off with
similar rules wrt dirty buffers list.

Comments?

2000-12-05 22:29:26

by Linus Torvalds

[permalink] [raw]
Subject: Re: test12-pre5



On Tue, 5 Dec 2000, Daniel Phillips wrote:
>
> OK, I see - this isn't easy at all. You start the io if necessary, and
> some time later it completes.

Right. You don't know when. Once completed, it will unlock the page and
wake up waiters. It will also set PG_Uptodate if the read was successful
(and obviously it might not have been).

If you only care about the contents of the page, you can just test the
Uptodate flag - if the page is up-to-date you may not care whether IO is
outstanding on the page, or whether somebody is modifying page state
(removing page buffers etc). So it's entirely legal to look up a page in
the page cache and only look at uptodate.

(Of course, if it _isn't_ uptodate, you will still have to get the page
lock and re-test and possibly start the IO if it still isn't up-to-date
after you got the page lock. This is what all the generic_file_read()
stuff does for you).

> The locking state is therefore
> indeterminate after ->readpage or ->writepage; all we know is that
> after some finite amount of time the lock bit will go down.

Yes. It might have completed immediately (ramdisks or what not), or be
fast enough that by the time you get back it's already done. But the
likely case is that the page will be locked, and you'd be better off going
away and doing something else in the meantime (this is certainly important
for the VM layer - it needs to continue doing swap-outs so that it
doesn't just dribble the pages out one by one).

The main reason for the page locking changes for the writepage() case were
really:

- the swapping code serializes accesses by using the page lock, and
doesn't have any other underlying serializing primitives. So we needed
to let the brw_page() code leave the lock set until the write has
physically completed.

- the VM code really wants to have a notion of "I have this many pages in
flight", so that it can sanely make a decision on when to start waiting
on page writeout completion, instead of just writing as much as it can.
The block device layer does some of this, of course, but the VM layer
can do more by just using the "nr_async_pages" thing. However, before
the unlock changes, this could not work for shared file mappings.

I hope that explains the change,

Linus

2000-12-05 23:56:29

by Urban Widmark

[permalink] [raw]
Subject: smbfs writepage & struct file

On Mon, 4 Dec 2000, Linus Torvalds wrote:

> Who works on smbfs these days? I see two ways of fixing smbfs now that

That would be me, I guess. This stuff is a bit over my current level of
understanding so try not to laugh too hard ... :)

Maybe it is best to just disable it for now. Below is a patch that passes
a simple 5 minute test. Included for comments/corrections.

Note that MAP_SHARED doesn't seem to work anyway for smbfs (at least my
testprogram fails to actually write anything in 2.4.0-test11, works vs
ext2). I'll look into that later. I'm not even sure this code is being
called (where is my oops?!).

> - fetch the dentry that writepage needs by just looking at the
> inode->i_dentry list and/or just make the smbfs page cache host be the
> dentry instead of the inode like other filesystems. The first approach
> assumes that all paths are equal for writeout, the second one assumes
> that there are no hard linking going on in smbfs.

Hardlinks are not supported by smbfs, but they may be supported on the
server side (ntfs). Haven't looked if smb has anything on this. Not sure
if there are any implications on caching and such for smbfs. (If hardlinks
need to be handled, smbfs is probably broken anyway wrt those).

inode->i_dentry is a list of dentries that relate to this inode? When
would there be >1? (Except hardlinks, we don't have those here.) Is it
guaranteed to always be at least one entry in that list?

The write ends up needing 'dentry->d_inode->u.smbfs_i.fileid' to find the
id to tell the server. I can't now think of a case where the dentry
doesn't map to the one inode we have for this fileid.


I was "borrowing" stuff from the nfs code and this looked strange:

nfs_writepage_sync:
struct dentry *dentry = file->f_dentry;
struct rpc_cred *cred = nfs_file_cred(file);

yet it is called like this from nfs_writepage:
err = nfs_writepage_sync(NULL, inode, page, 0, offset);

where file is the 1st argument. Oh, it calls nfs_writepage_async most of
the time. All of the time?

/Urban


diff -ur -X exclude linux-2.4.0-test12-pre5-orig/fs/smbfs/file.c linux-2.4.0-test12-pre5-smbfs/fs/smbfs/file.c
--- linux-2.4.0-test12-pre5-orig/fs/smbfs/file.c Sun Aug 6 20:43:18 2000
+++ linux-2.4.0-test12-pre5-smbfs/fs/smbfs/file.c Tue Dec 5 23:43:30 2000
@@ -82,7 +82,7 @@
}

/*
- * We are called with the page locked and the caller unlocks.
+ * We are called with the page locked and we unlock it when done.
*/
static int
smb_readpage(struct file *file, struct page *page)
@@ -147,17 +147,33 @@
* Write a page to the server. This will be used for NFS swapping only
* (for now), and we currently do this synchronously only.
*
- * We are called with the page locked and the caller unlocks.
+ * We are called with the page locked and we unlock it when done.
*/
static int
-smb_writepage(struct file *file, struct page *page)
+smb_writepage(struct page *page)
{
- struct dentry *dentry = file->f_dentry;
- struct inode *inode = dentry->d_inode;
- unsigned long end_index = inode->i_size >> PAGE_CACHE_SHIFT;
+ struct address_space *mapping = page->mapping;
+ struct dentry *dentry;
+ struct inode *inode;
+ struct list_head *head;
+ unsigned long end_index;
unsigned offset = PAGE_CACHE_SIZE;
int err;

+ if (!mapping)
+ BUG();
+ inode = (struct inode *)mapping->host;
+ if (!inode)
+ BUG();
+
+ /* Pick the first dentry for this inode. */
+ head = &inode->i_dentry;
+ if (list_empty(head))
+ BUG(); /* We need one, are we guaranteed to have one? */
+ dentry = list_entry(head->next, struct dentry, d_alias);
+
+ end_index = inode->i_size >> PAGE_CACHE_SHIFT;
+
/* easy case */
if (page->index < end_index)
goto do_it;
@@ -170,6 +186,7 @@
get_page(page);
err = smb_writepage_sync(dentry, page, 0, offset);
SetPageUptodate(page);
+ UnlockPage(page);
put_page(page);
return err;
}

2000-12-06 00:28:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: smbfs writepage & struct file



On Wed, 6 Dec 2000, Urban Widmark wrote:
>
> Hardlinks are not supported by smbfs, but they may be supported on the
> server side (ntfs). Haven't looked if smb has anything on this. Not sure
> if there are any implications on caching and such for smbfs. (If hardlinks
> need to be handled, smbfs is probably broken anyway wrt those).

Without hardlinks there will be a 1:1 relation between inodes and
dentries, which means that the dentry could be used as a mapping host.
Then you could just do

struct dentry *dentry = (struct dentry *)mapping->host;

and get the inode from "dentry->d_inode". The problem with going the other
ways is that you can have an inode where the dentry has been reaped by
shrink_dcache() due to memory pressure, so the inode->i_dentry list could
in theory be empty. The reverse can never be true (if the dentry exists,
the inode will exist too).

HOWEVER, the problem with just making the dentry be the mapping host is
that I don't think it has ever been done before, so there is no good
source of examples on how to do this. You would just initialize it in
"smb_iget()", but then you'd have to pass the dentry down to that routine
in order to know what to initialize the mapping host to, of course.

Even after you'd do that, there are other problems: the inode shrinking
code knows about having pages attached to the inode. The same is not true
of the dentry shrinking code - so we could shrink a dentry that is still
busy in that it has pages in its mapping.

In comparison to these dentry host problems, your patch looks fine. But I
suspect that you _can_ trigger the BUG() with an empty dentry list due to
dentry shrinking. So what I did is basically to (a) apply your patch and
(b) set writepage to NULL in the address space operations, so that shared
mappings will be disabled for the time being.

I also cc'd Al, because he's likely to just come up with a clever scheme
that solves this, and call me an idiot.

Hmm.. The smb filesystem code in general seems to assume that if we have
an inode, we'd always have a dentry and vice versa. Because otherwise we'd
have potentially multiple inodes for the same dentry. Every time we do a
successful dentry lookup(), we do a "new_inode()", so if we drop the
dentry and re-lookup, we'd have aliased inodes (and thus various page
cache aliases too, etc).

Having aliased inodes is not wrong per se: it's just that it doesn't mix
well with having the page cache associated with the inode. So it probably
works fine as-is, it just has the potential to not cache things as well as
it could (because dropping the dentry will basically mean dropping the
full page cache).

Al? Any ideas?

> I was "borrowing" stuff from the nfs code and this looked strange:
>
> nfs_writepage_sync:
> struct dentry *dentry = file->f_dentry;
> struct rpc_cred *cred = nfs_file_cred(file);
>
> yet it is called like this from nfs_writepage:
> err = nfs_writepage_sync(NULL, inode, page, 0, offset);
>
> where file is the 1st argument. Oh, it calls nfs_writepage_async most of
> the time. All of the time?

That's bogus. But Trond probably overlooked it in testing, because yes, it
always calls the async version unless there has been errors doing async
writes (which should be rather uncommon).

Trond? Can you massage the sync version to do the same as the async one?

Thanks for noticing.

Linus

2000-12-06 00:30:54

by Stephen C. Tweedie

[permalink] [raw]
Subject: Re: test12-pre5

Hi,

On Tue, Dec 05, 2000 at 03:17:07PM -0500, Alexander Viro wrote:
>
>
> On Tue, 5 Dec 2000, Linus Torvalds wrote:
>
> > And this is not just a "it happens to be like this" kind of thing. It
> > _has_ to be like this, because every time we call clear_inode() we are
> > going to physically free the memory associated with the inode very soon
> > afterwards. Which means that _any_ use of the inode had better be long
> > gone. Dirty buffers included.
>
> Urgh. Linus, AFAICS we _all_ agree on that. The only real question is
> whether we consider calling clear_inode() with droppable dirty buffers
> to be OK. It can't happen on the dispose_list() path and I'ld rather
> see it _not_ happening on the delete_inode() one. It's a policy question,
> not the correctness one.

Right, because if we get this wrong, the kernel won't complain, but
we'll have a rare, impossible-to-diagnose potential data corrupter for
any applications which do recovery on reboot.

If we're not going to change the code, then at the very least a huge
warning comment above clear_inode() is necessary to make it explicit
that you shouldn't ever pass in an inode with dirty buffers unless
nlink==0, and I'd rather see the BUG there instead.

--Stephen

2000-12-06 10:36:06

by Trond Myklebust

[permalink] [raw]
Subject: Re: smbfs writepage & struct file

>>>>> " " == Linus Torvalds <[email protected]> writes:

>> I was "borrowing" stuff from the nfs code and this looked
>> strange:
>>
>> nfs_writepage_sync: struct dentry *dentry = file->f_dentry;
>> struct rpc_cred *cred = nfs_file_cred(file);
>>
>> yet it is called like this from nfs_writepage: err =
>> nfs_writepage_sync(NULL, inode, page, 0, offset);
>>
>> where file is the 1st argument. Oh, it calls
>> nfs_writepage_async most of the time. All of the time?

> That's bogus. But Trond probably overlooked it in testing,
> because yes, it always calls the async version unless there has
> been errors doing async writes (which should be rather
> uncommon).

> Trond? Can you massage the sync version to do the same as the
> async one?

Oops. That was due to a stupid oversight... I grepped for dependencies
upon dentry->d_inode, but forgot to do the same in order to weed out
the remaining file->f_dentry.

The appended patch should remove the last 'struct file' dependencies
from both readpage() and writepage().

Cheers,
Trond

diff -u --recursive --new-file linux-2.4.0-test12-pre6/fs/nfs/read.c linux-2.4.0-test12-fixme/fs/nfs/read.c
--- linux-2.4.0-test12-pre6/fs/nfs/read.c Wed Dec 6 08:34:53 2000
+++ linux-2.4.0-test12-fixme/fs/nfs/read.c Wed Dec 6 08:41:54 2000
@@ -84,8 +84,7 @@
static int
nfs_readpage_sync(struct file *file, struct inode *inode, struct page *page)
{
- struct dentry *dentry = file->f_dentry;
- struct rpc_cred *cred = nfs_file_cred(file);
+ struct rpc_cred *cred = NULL;
struct nfs_fattr fattr;
loff_t offset = page_offset(page);
char *buffer;
@@ -97,6 +96,9 @@

dprintk("NFS: nfs_readpage_sync(%p)\n", page);

+ if (file)
+ cred = nfs_file_cred(file);
+
/*
* This works now because the socket layer never tries to DMA
* into this buffer directly.
@@ -106,9 +108,9 @@
if (count < rsize)
rsize = count;

- dprintk("NFS: nfs_proc_read(%s, (%s/%s), %Ld, %d, %p)\n",
+ dprintk("NFS: nfs_proc_read(%s, (%x/%Ld), %Ld, %d, %p)\n",
NFS_SERVER(inode)->hostname,
- dentry->d_parent->d_name.name, dentry->d_name.name,
+ inode->i_dev, (long long)NFS_FILEID(inode),
(long long)offset, rsize, buffer);

lock_kernel();
diff -u --recursive --new-file linux-2.4.0-test12-pre6/fs/nfs/write.c linux-2.4.0-test12-fixme/fs/nfs/write.c
--- linux-2.4.0-test12-pre6/fs/nfs/write.c Wed Dec 6 08:34:53 2000
+++ linux-2.4.0-test12-fixme/fs/nfs/write.c Wed Dec 6 08:43:52 2000
@@ -171,8 +171,7 @@
nfs_writepage_sync(struct file *file, struct inode *inode, struct page *page,
unsigned int offset, unsigned int count)
{
- struct dentry *dentry = file->f_dentry;
- struct rpc_cred *cred = nfs_file_cred(file);
+ struct rpc_cred *cred = NULL;
loff_t base;
unsigned int wsize = NFS_SERVER(inode)->wsize;
int result, refresh = 0, written = 0, flags;
@@ -180,9 +179,13 @@
struct nfs_fattr fattr;
struct nfs_writeverf verf;

+
+ if (file)
+ cred = nfs_file_cred(file);
+
lock_kernel();
- dprintk("NFS: nfs_writepage_sync(%s/%s %d@%Ld)\n",
- dentry->d_parent->d_name.name, dentry->d_name.name,
+ dprintk("NFS: nfs_writepage_sync(%x/%Ld %d@%Ld)\n",
+ inode->i_dev, (long long)NFS_FILEID(inode),
count, (long long)(page_offset(page) + offset));

buffer = kmap(page) + offset;

2000-12-06 13:50:11

by Panu Matilainen

[permalink] [raw]
Subject: Re: test12-pre5

On Mon, 4 Dec 2000, Linus Torvalds wrote:

>
> Ok, this contains one of the fixes for the dirty inode buffer list (the
> other fix is pending, simply because I still want to understand why it
> would be needed at all). Al?
>
> Also, it has the final installment of the PageDirty handling, and now
> officially direct IO can work by just marking the physical page dirty and
> be done with it. NFS along with all filesystems have been converted, the
> one hold-out still being smbfs.
>
> Who works on smbfs these days? I see two ways of fixing smbfs now that
> "writepage()" only gets an anonymous page and no "struct file" information
> any more (this also fixes the double page unlock that Andrew saw).

Urban Widmark is the smbfs maintainer nowadays. BTW Urban thanks again for
the NetApp-fix, it has saved my behind quite a few times by now :)

- Panu -


>
> - disable shared mmap over smbfs (very easily done by just setting
> writepage to NULL)
>
> - fetch the dentry that writepage needs by just looking at the
> inode->i_dentry list and/or just make the smbfs page cache host be the
> dentry instead of the inode like other filesystems. The first approach
> assumes that all paths are equal for writeout, the second one assumes
> that there are no hard linking going on in smbfs.
>
> Somebody more knowledgeable than I will have to make the decision
> (otherwise I'll just end up disabling shared mmap - I doubt anybody really
> uses it anyway, but it would be more polite to just support it).
>
> NOTE! There's another change to "writepage()" semantics than just dropping
> the "struct file": the new writepage() is supposed to mirror the logic of
> readpage(), and unlock the page when it is done with it. This allows the
> VM system more visibility into what IO is pending (which the VM doesn't
> take advantage of yet, but now it can _truly_ use the same logic for both
> swapout and for dirty file writeback).
>
> The other change is that I forward-ported the ymfpci driver from 2.2.18,
> as it works better than the ALSA one on my now-to-be-main-laptop ;)
>
> [ Alan - I ahve your patches in my incoming queue still, I wanted to get
> an interim version out to check with Al on the block list and the VM
> stuff with Rik and people. ]
>
> Linus
>
> ----
> - pre5:
> - Jaroslav Kysela: ymfpci driver
> - me: get rid of bogus MS_INVALIDATE semantics
> - me: final part of the PageDirty() saga
> - Rusty Russell: 4-way SMP iptables fix
> - Al Viro: oops - bad ext2 inode dirty block bug
>
> - pre4:
> - Andries Brouwer: final isofs pieces.
> - Kai Germaschewski: ISDN
> - play CD audio correctly, don't stop after 12 minutes.
> - Anton Altaparmakov: disable NTFS mmap for now, as it doesn't work.
> - Stephen Tweedie: fix inode dirty block handling
> - Bill Hartner: reschedule_idle - prefer right cpu
> - Johannes Erdfelt: USB updates
> - Alan Cox: synchronize
> - Richard Henderson: alpha updates and optimizations
> - Geert Uytterhoeven: fbdev could be fooled into crashing fix
> - Trond Myklebust: NFS filehandles in inode rather than dentry
>
> - pre3:
> - me: more PageDirty / swapcache handling
> - Neil Brown: raid and md init fixes
> - David Brownell: pci hotplug sanitization.
> - Kanoj Sarcar: mips64 update
> - Kai Germaschewski: ISDN sync
> - Andreas Bombe: ieee1394 cleanups and fixes
> - Johannes Erdfelt: USB update
> - David Miller: Sparc and net update
> - Trond Myklebust: RPC layer SMP fixes
> - Thomas Sailer: mixed sound driver fixes
> - Tigran Aivazian: use atomic_dec_and_lock() for free_uid()
>
> - pre2:
> - Peter Anvin: more P4 configuration parsing
> - Stephen Tweedie: O_SYNC patches. Make O_SYNC/fsync/fdatasync
> do the right thing.
> - Keith Owens: make mdule loading use the right struct module size
> - Boszormenyi Zoltan: get MTRR's right for the >32-bit case
> - Alan Cox: various random documentation etc
> - Dario Ballabio: EATA and u14-34f update
> - Ivan Kokshaysky: unbreak alpha ruffian
> - Richard Henderson: PCI bridge initialization on alpha
> - Zach Brown: correct locking in Maestro driver
> - Geert Uytterhoeven: more m68k updates
> - Andrey Savochkin: eepro100 update
> - Dag Brattli: irda update
> - Johannes Erdfelt: USB update
>
> - pre1: (for ISDN synchronization _ONLY_! Not complete!)
> - Byron Stanoszek: correct decimal precision for CPU MHz in
> /proc/cpuinfo
> - Ollie Lho: SiS pirq routing.
> - Andries Brouwer: isofs cleanups
> - Matt Kraai: /proc read() on directories should return EISDIR, not EINVAL
> - me: be stricter about what we accept as a PCI bridge setup.
> - me: always set PCI interrupts to be level-triggered when we enable them.
> - me: updated PageDirty and swap cache handling
> - Peter Anvin: update A20 code to work without keyboard controller
> - Kai Germaschewski: ISDN updates
> - Russell King: ARM updates
> - Geert Uytterhoeven: m68k updates
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> Please read the FAQ at http://www.tux.org/lkml/
>

2000-12-10 14:14:16

by Urban Widmark

[permalink] [raw]
Subject: Re: smbfs writepage & struct file

On Tue, 5 Dec 2000, Linus Torvalds wrote:

> In comparison to these dentry host problems, your patch looks fine. But I
> suspect that you _can_ trigger the BUG() with an empty dentry list due to
> dentry shrinking. So what I did is basically to (a) apply your patch and
> (b) set writepage to NULL in the address space operations, so that shared
> mappings will be disabled for the time being.

A simple fix for this seems to be just not looking at the dentry. Turns
out it is only used to find the inode anyway ...
(and for a debug printout of the filename)

Replaced smb_invent_inos with iunique.

Changed the smb_proc_read to use an inode argument too, for symmetry with
smb_proc_write. smb_readpage_sync needs the dentry since it calls smb_open
(what a strange thing to do ... see below) and needs to find the pathname.

smb_file_open doesn't actually open anything on the remote end. So for
mmap you can open, mmap, write to mem, msync and writepage gets called on
a file the server doesn't consider opened. It will work the first time
since readpage opens the file, but not the second if the page cache
already has the page.

writepage enabled again, although I think there are problems with
reconnections (a file you had open is no longer open after reconnect).

Below a patch for this vs 2.4.0-test12-pre7.


If the aliasing of inodes&pages only affects efficiency of caching in
smbfs I think that may be acceptable. With nothing good to use for ino's I
don't see any simple inexpensive way to map filenames to inodes which
would be necessary to avoid aliases.
(Any smbfs-inode with pages could be placed on a list/hash with the
pathname attached. Lookup would search that list to avoid creating
aliases. Or something clever. Abusing the dcache so that dentries for
smbfs are not dropped while the inode is in use? If that is even
possible.)

Or is it possible to loose data because of aliased pages? even if the
application(s) do proper locking/msync/... Is it possible to get two pages
of the same file but the two pages in the page cache disagree on the
contents, and both can be written to the fs?

/Urban


diff -ur -X exclude linux-2.4.0-test12-pre7-orig/fs/smbfs/dir.c linux-2.4.0-test12-pre7-smbfs/fs/smbfs/dir.c
--- linux-2.4.0-test12-pre7-orig/fs/smbfs/dir.c Mon Aug 14 22:31:10 2000
+++ linux-2.4.0-test12-pre7-smbfs/fs/smbfs/dir.c Thu Dec 7 22:11:09 2000
@@ -124,7 +124,7 @@
qname.len = entry->len;
entry->ino = find_inode_number(dentry, &qname);
if (!entry->ino)
- entry->ino = smb_invent_inos(1);
+ entry->ino = iunique(dentry->d_sb, 2);
}

if (filldir(dirent, entry->name, entry->len,
@@ -325,7 +325,7 @@
goto add_entry;
if (!error) {
error = -EACCES;
- finfo.f_ino = smb_invent_inos(1);
+ finfo.f_ino = iunique(dentry->d_sb, 2);
inode = smb_iget(dir->i_sb, &finfo);
if (inode) {
add_entry:
@@ -362,7 +362,7 @@
goto out_close;

smb_renew_times(dentry);
- fattr.f_ino = smb_invent_inos(1);
+ fattr.f_ino = iunique(dentry->d_sb, 2);
inode = smb_iget(dentry->d_sb, &fattr);
if (!inode)
goto out_no_inode;
diff -ur -X exclude linux-2.4.0-test12-pre7-orig/fs/smbfs/file.c linux-2.4.0-test12-pre7-smbfs/fs/smbfs/file.c
--- linux-2.4.0-test12-pre7-orig/fs/smbfs/file.c Thu Dec 7 20:29:38 2000
+++ linux-2.4.0-test12-pre7-smbfs/fs/smbfs/file.c Sun Dec 10 14:21:32 2000
@@ -48,8 +48,7 @@
DENTRY_PATH(dentry), count, offset, rsize);

result = smb_open(dentry, SMB_O_RDONLY);
- if (result < 0)
- {
+ if (result < 0) {
PARANOIA("%s/%s open failed, error=%d\n",
DENTRY_PATH(dentry), result);
goto io_error;
@@ -59,7 +58,7 @@
if (count < rsize)
rsize = count;

- result = smb_proc_read(dentry, offset, rsize, buffer);
+ result = smb_proc_read(dentry->d_inode, offset, rsize, buffer);
if (result < 0)
goto io_error;

@@ -103,25 +102,27 @@
* Offset is the data offset within the page.
*/
static int
-smb_writepage_sync(struct dentry *dentry, struct page *page,
+smb_writepage_sync(struct inode *inode, struct page *page,
unsigned long offset, unsigned int count)
{
- struct inode *inode = dentry->d_inode;
u8 *buffer = page_address(page) + offset;
- int wsize = smb_get_wsize(server_from_dentry(dentry));
+ int wsize = smb_get_wsize(server_from_inode(inode));
int result, written = 0;

offset += page->index << PAGE_CACHE_SHIFT;
- VERBOSE("file %s/%s, count=%d@%ld, wsize=%d\n",
- DENTRY_PATH(dentry), count, offset, wsize);
+ VERBOSE("file ino=%ld, fileid=%d, count=%d@%ld, wsize=%d\n",
+ inode->i_ino, inode->u.smbfs_i.fileid, count, offset, wsize);

do {
if (count < wsize)
wsize = count;

- result = smb_proc_write(dentry, offset, wsize, buffer);
- if (result < 0)
+ result = smb_proc_write(inode, offset, wsize, buffer);
+ if (result < 0) {
+ PARANOIA("failed write, wsize=%d, result=%d\n",
+ wsize, result);
break;
+ }
/* N.B. what if result < wsize?? */
#ifdef SMBFS_PARANOIA
if (result < wsize)
@@ -153,9 +154,7 @@
smb_writepage(struct page *page)
{
struct address_space *mapping = page->mapping;
- struct dentry *dentry;
struct inode *inode;
- struct list_head *head;
unsigned long end_index;
unsigned offset = PAGE_CACHE_SIZE;
int err;
@@ -166,12 +165,6 @@
if (!inode)
BUG();

- /* Pick the first dentry for this inode. */
- head = &inode->i_dentry;
- if (list_empty(head))
- BUG(); /* We need one, are we guaranteed to have one? */
- dentry = list_entry(head->next, struct dentry, d_alias);
-
end_index = inode->i_size >> PAGE_CACHE_SHIFT;

/* easy case */
@@ -184,7 +177,7 @@
return -EIO;
do_it:
get_page(page);
- err = smb_writepage_sync(dentry, page, 0, offset);
+ err = smb_writepage_sync(inode, page, 0, offset);
SetPageUptodate(page);
UnlockPage(page);
put_page(page);
@@ -200,7 +193,7 @@
DEBUG1("(%s/%s %d@%ld)\n", DENTRY_PATH(dentry),
count, (page->index << PAGE_CACHE_SHIFT)+offset);

- return smb_writepage_sync(dentry, page, offset, count);
+ return smb_writepage_sync(dentry->d_inode, page, offset, count);
}

static ssize_t
@@ -281,7 +274,7 @@

struct address_space_operations smb_file_aops = {
readpage: smb_readpage,
- /* writepage: smb_writepage, */
+ writepage: smb_writepage,
prepare_write: smb_prepare_write,
commit_write: smb_commit_write
};
@@ -325,8 +318,16 @@
static int
smb_file_open(struct inode *inode, struct file * file)
{
+ int result;
+ struct dentry *dentry = file->f_dentry;
+ int smb_mode = (file->f_mode & O_ACCMODE) - 1;
+
lock_kernel();
+ result = smb_open(dentry, smb_mode);
+ if (result)
+ goto out;
inode->u.smbfs_i.openers++;
+out:
unlock_kernel();
return 0;
}
diff -ur -X exclude linux-2.4.0-test12-pre7-orig/fs/smbfs/inode.c linux-2.4.0-test12-pre7-smbfs/fs/smbfs/inode.c
--- linux-2.4.0-test12-pre7-orig/fs/smbfs/inode.c Thu Nov 16 22:18:26 2000
+++ linux-2.4.0-test12-pre7-smbfs/fs/smbfs/inode.c Sun Dec 10 14:23:37 2000
@@ -53,22 +53,6 @@
statfs: smb_statfs,
};

-/* FIXME: Look at all inodes whether so that we do not get duplicate
- * inode numbers. */
-
-unsigned long
-smb_invent_inos(unsigned long n)
-{
- static unsigned long ino = 2;
-
- if (ino + 2*n < ino)
- {
- /* wrap around */
- ino = 2;
- }
- ino += n;
- return ino;
-}

/* We are always generating a new inode here */
struct inode *
@@ -282,7 +266,7 @@
static void
smb_delete_inode(struct inode *ino)
{
- DEBUG1("\n");
+ DEBUG1("ino=%ld\n", ino->i_ino);
lock_kernel();
if (smb_close(ino))
PARANOIA("could not close inode %ld\n", ino->i_ino);
diff -ur -X exclude linux-2.4.0-test12-pre7-orig/fs/smbfs/proc.c linux-2.4.0-test12-pre7-smbfs/fs/smbfs/proc.c
--- linux-2.4.0-test12-pre7-orig/fs/smbfs/proc.c Mon Aug 14 22:31:10 2000
+++ linux-2.4.0-test12-pre7-smbfs/fs/smbfs/proc.c Sun Dec 10 14:25:14 2000
@@ -1072,9 +1072,9 @@
file-id would not be valid after a reconnection. */

int
-smb_proc_read(struct dentry *dentry, off_t offset, int count, char *data)
+smb_proc_read(struct inode *inode, off_t offset, int count, char *data)
{
- struct smb_sb_info *server = server_from_dentry(dentry);
+ struct smb_sb_info *server = server_from_inode(inode);
__u16 returned_count, data_len;
unsigned char *buf;
int result;
@@ -1082,7 +1082,7 @@
smb_lock_server(server);
smb_setup_header(server, SMBread, 5, 0);
buf = server->packet;
- WSET(buf, smb_vwv0, dentry->d_inode->u.smbfs_i.fileid);
+ WSET(buf, smb_vwv0, inode->u.smbfs_i.fileid);
WSET(buf, smb_vwv1, count);
DSET(buf, smb_vwv2, offset);
WSET(buf, smb_vwv4, 0);
@@ -1114,25 +1114,26 @@
result = data_len;

out:
- VERBOSE("file %s/%s, count=%d, result=%d\n",
- DENTRY_PATH(dentry), count, result);
+ VERBOSE("ino=%ld, fileid=%d, count=%d, result=%d\n",
+ inode->ino, inode->u.smbfs_i.fileid, count, result);
smb_unlock_server(server);
return result;
}

int
-smb_proc_write(struct dentry *dentry, off_t offset, int count, const char *data)
+smb_proc_write(struct inode *inode, off_t offset, int count, const char *data)
{
- struct smb_sb_info *server = server_from_dentry(dentry);
+ struct smb_sb_info *server = server_from_inode(inode);
int result;
__u8 *p;

- VERBOSE("file %s/%s, count=%d@%ld, packet_size=%d\n",
- DENTRY_PATH(dentry), count, offset, server->packet_size);
+ VERBOSE("ino=%ld, fileid=%d, count=%d@%ld, packet_size=%d\n",
+ inode->ino, inode->u.smbfs_i.fileid, count, offset,
+ server->packet_size);

smb_lock_server(server);
p = smb_setup_header(server, SMBwrite, 5, count + 3);
- WSET(server->packet, smb_vwv0, dentry->d_inode->u.smbfs_i.fileid);
+ WSET(server->packet, smb_vwv0, inode->u.smbfs_i.fileid);
WSET(server->packet, smb_vwv1, count);
DSET(server->packet, smb_vwv2, offset);
WSET(server->packet, smb_vwv4, 0);
diff -ur -X exclude linux-2.4.0-test12-pre7-orig/include/linux/smb_fs.h linux-2.4.0-test12-pre7-smbfs/include/linux/smb_fs.h
--- linux-2.4.0-test12-pre7-orig/include/linux/smb_fs.h Thu Dec 7 20:42:42 2000
+++ linux-2.4.0-test12-pre7-smbfs/include/linux/smb_fs.h Thu Dec 7 21:17:57 2000
@@ -126,8 +126,8 @@
int smb_close(struct inode *);
int smb_close_fileid(struct dentry *, __u16);
int smb_open(struct dentry *, int);
-int smb_proc_read(struct dentry *, off_t, int, char *);
-int smb_proc_write(struct dentry *, off_t, int, const char *);
+int smb_proc_read(struct inode *, off_t, int, char *);
+int smb_proc_write(struct inode *, off_t, int, const char *);
int smb_proc_create(struct dentry *, __u16, time_t, __u16 *);
int smb_proc_mv(struct dentry *, struct dentry *);
int smb_proc_mkdir(struct dentry *);
diff -ur -X exclude linux-2.4.0-test12-pre7-orig/include/linux/smb_fs_sb.h linux-2.4.0-test12-pre7-smbfs/include/linux/smb_fs_sb.h
--- linux-2.4.0-test12-pre7-orig/include/linux/smb_fs_sb.h Thu Dec 7 20:35:56 2000
+++ linux-2.4.0-test12-pre7-smbfs/include/linux/smb_fs_sb.h Thu Dec 7 21:11:56 2000
@@ -14,8 +14,9 @@
#include <linux/types.h>
#include <linux/smb.h>

-/* Get the server for the specified dentry */
-#define server_from_dentry(dentry) &dentry->d_sb->u.smbfs_sb
+/* structure access macros */
+#define server_from_inode(inode) (&(inode)->i_sb->u.smbfs_sb)
+#define server_from_dentry(dentry) (&(dentry)->d_sb->u.smbfs_sb)
#define SB_of(server) ((struct super_block *) ((char *)(server) - \
(unsigned long)(&((struct super_block *)0)->u.smbfs_sb)))