2002-11-15 10:17:53

by Richard Bouska

[permalink] [raw]
Subject: NFS mountned directory and apache2 (2.5.47)

Hello
In 2.45.47 and 2.5.46 (at least) did not try any other am I not able to
serve files bigger than 255 bytes by apache2 from nfs mounted directory.
The local files are served correctly.
The server is on 2.4.18. When I use the 2.4 also on client everything
works. When the content of the page is CGI generated then the size is
not limited like this.
Both client and server are x86 (athlon)
kernel compiled with: gcc version 2.95.4 20011002 (Debian prerelease)

sendfile() bug ??

Any ideas
Richard Bouska



2002-11-15 18:04:51

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

>>>>> " " == Richard Bouska <[email protected]> writes:

> Hello In 2.45.47 and 2.5.46 (at least) did not try any other am
> I not able to serve files bigger than 255 bytes by apache2 from
> nfs mounted directory. The local files are served correctly.

> The server is on 2.4.18. When I use the 2.4 also on client
> everything works. When the content of the page is CGI generated
> then the size is not limited like this.

> Both client and server are x86 (athlon) kernel compiled with:
> gcc version 2.95.4 20011002 (Debian prerelease)

> sendfile() bug ??

Looks like whoever it was that changed the 'sendfile' API forgot to
update NFS. Try the following patch.

Cheers,
Trond

--- linux-2.5.47/fs/nfs/file.c.orig 2002-11-08 14:16:27.000000000 -0500
+++ linux-2.5.47/fs/nfs/file.c 2002-11-15 13:06:06.000000000 -0500
@@ -175,6 +175,7 @@
#ifdef CONFIG_NFS_DIRECTIO
.direct_IO = nfs_direct_IO,
#endif
+ .sendfile = generic_file_sendfile,
};

/*

2002-11-15 18:12:46

by Petr Vandrovec

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

On 15 Nov 02 at 19:10, Trond Myklebust wrote:

> > Hello In 2.45.47 and 2.5.46 (at least) did not try any other am
> > I not able to serve files bigger than 255 bytes by apache2 from
> > nfs mounted directory. The local files are served correctly.
>
> Looks like whoever it was that changed the 'sendfile' API forgot to
> update NFS. Try the following patch.

It does not change anything on the brokeness of apache2 (or maybe
glibc). It must be able to revert to read/write loop if sendfile fails
with EINVAL. There is no guarantee that existing sendfile() API means
that you can use it with all filesystems.
Best regards,
Petr Vandrovec
[email protected]

2002-11-15 20:09:24

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

>>>>> " " == Petr Vandrovec <[email protected]> writes:

> It does not change anything on the brokeness of apache2 (or
> maybe glibc). It must be able to revert to read/write loop if
> sendfile fails with EINVAL. There is no guarantee that existing
> sendfile() API means that you can use it with all filesystems.

I disagree. Sendfile can *always* be emulated using the standard file
'read' method.

For most filesystems, that means just reading directly from the page
cache, and that is precisely what generic_file_sendfile() does. IIRC,
the sendfile() inode op was added so that those few filesystems which
don't support direct reading of the page cache could roll their own.

Cheers,
Trond

2002-11-15 20:20:10

by Christoph Hellwig

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

On Fri, Nov 15, 2002 at 09:16:16PM +0100, Trond Myklebust wrote:
> >>>>> " " == Petr Vandrovec <[email protected]> writes:
>
> > It does not change anything on the brokeness of apache2 (or
> > maybe glibc). It must be able to revert to read/write loop if
> > sendfile fails with EINVAL. There is no guarantee that existing
> > sendfile() API means that you can use it with all filesystems.
>
> I disagree. Sendfile can *always* be emulated using the standard file
> 'read' method.

Linus removed that in early 2.5 because it led to kmap() deadlocks.
sendfile can fail with EINVAL and userspace must not rely on it
working on any object.

2002-11-15 20:46:10

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)


>> I disagree. Sendfile can *always* be emulated using the
>> standard file 'read' method.

> Linus removed that in early 2.5 because it led to kmap()
> deadlocks. sendfile can fail with EINVAL and userspace must
> not rely on it working on any object.

Fair enough. The kernel may not be the appropriate place for providing
such an emulation, but there's no reason why glibc shouldn't be able
to do so for the case where sendfile returns EINVAL.

However none of this changes the matter of the NFS client. The latter
*does* support a pagecache, and so the one-line patch is appropriate.

Cheers,
Trond

2002-11-15 21:03:52

by Linus Torvalds

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

In article <[email protected]>,
Christoph Hellwig <[email protected]> wrote:
>On Fri, Nov 15, 2002 at 09:16:16PM +0100, Trond Myklebust wrote:
>> >>>>> " " == Petr Vandrovec <[email protected]> writes:
>>
>> > It does not change anything on the brokeness of apache2 (or
>> > maybe glibc). It must be able to revert to read/write loop if
>> > sendfile fails with EINVAL. There is no guarantee that existing
>> > sendfile() API means that you can use it with all filesystems.
>>
>> I disagree. Sendfile can *always* be emulated using the standard file
>> 'read' method.
>
>Linus removed that in early 2.5 because it led to kmap() deadlocks.
>sendfile can fail with EINVAL and userspace must not rely on it
>working on any object.

Now that I think we've fixed the deadlocks a different way, we _might_
be able to re-introduce a more generic sendfile().

We should also change the name of the dang thing at least internally,
since it has very little to do with sending a file any more. And
furthermore we should probably introduce an internal file operation that
is the reverse of our misnamed "sendfile", ie a "receive actor" (we
already have the notion of actors, but we don't use them for receiving
directly into a "struct file *").

Then we could actually do a real "copyfile()", by just matching up the
source file "sendfile()" function with the destination file "receive
actor" function and letting it rip. That should allow true "move the
page cache page from one file to another" copies of files, for example.

Linus

2002-11-16 04:38:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

On Fri, Nov 15, 2002 at 09:53:03PM +0100, Trond Myklebust wrote:
> > Linus removed that in early 2.5 because it led to kmap()
> > deadlocks. sendfile can fail with EINVAL and userspace must
> > not rely on it working on any object.
>
> Fair enough. The kernel may not be the appropriate place for providing
> such an emulation, but there's no reason why glibc shouldn't be able
> to do so for the case where sendfile returns EINVAL.

*nod*

> However none of this changes the matter of the NFS client. The latter
> *does* support a pagecache, and so the one-line patch is appropriate.

just a little bit more then one line :) nfs needs to do the same
revalidation as in nfs_file_read and then call generic_file_sendfile.

2002-11-17 20:11:16

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS mountned directory and apache2 (2.5.47)

>>>>> " " == Christoph Hellwig <[email protected]> writes:

> just a little bit more then one line :) nfs needs to do the
> same revalidation as in nfs_file_read and then call
> generic_file_sendfile.

Duh, I must have eaten a bad mushroom at dinner. Thanks for staying
awake...

The correct patch is of course as appended.

Cheers,
Trond

--- linux-2.5.47/fs/nfs/file.c 2002-11-08 14:16:27.000000000 -0500
+++ linux-2.5.47-01-sendfile/fs/nfs/file.c 2002-11-17 15:15:04.000000000 -0500
@@ -35,6 +35,7 @@
#define NFSDBG_FACILITY NFSDBG_FILE

static int nfs_file_mmap(struct file *, struct vm_area_struct *);
+static ssize_t nfs_file_sendfile(struct file *, loff_t *, size_t, read_actor_t, void *);
static ssize_t nfs_file_read(struct kiocb *, char *, size_t, loff_t);
static ssize_t nfs_file_write(struct kiocb *, const char *, size_t, loff_t);
static int nfs_file_flush(struct file *);
@@ -52,6 +53,7 @@
.release = nfs_release,
.fsync = nfs_fsync,
.lock = nfs_lock,
+ .sendfile = nfs_file_sendfile,
};

struct inode_operations nfs_file_inode_operations = {
@@ -102,6 +104,24 @@
return result;
}

+static ssize_t
+nfs_file_sendfile(struct file *filp, loff_t *ppos, size_t count,
+ read_actor_t actor, void *target)
+{
+ struct dentry *dentry = filp->f_dentry;
+ struct inode *inode = dentry->d_inode;
+ ssize_t res;
+
+ dfprintk(VFS, "nfs: sendfile(%s/%s, %lu@%Lu)\n",
+ dentry->d_parent->d_name.name, dentry->d_name.name,
+ (unsigned long) count, (unsigned long long) *ppos);
+
+ res = nfs_revalidate_inode(NFS_SERVER(inode), inode);
+ if (!res)
+ res = generic_file_sendfile(filp, ppos, count, actor, target);
+ return res;
+}
+
static int
nfs_file_mmap(struct file * file, struct vm_area_struct * vma)
{