Hi,
SHORT:
the current 2.2.19 fs/nfs/dir.c ll. 455ff. nfs_dir_llseek breaks
fdopen(3) which (at least with glibc 2.1.3) calls __llseek with offset==0
and whence==1 (SEEK_CUR), probably to poll the current file position.
Application software affected comprises cvs (tried 1.10.7) and Perl5
(sysopen, see below).
I suggest that SEEK_CUR be allowed for offset == 0 in nfs_dir_llseek,
but I'm asking for help since I'm not into this and cannot do this on my
own. Thanks in advance.
LONG, with hints and logs:
Symptom: software calls __llseek with SEEK_CUR, offset=0, possibly to
obtain the file pointer position on fdopen, and gets EINVAL. This seems to
be a deliberate decision from what I see in fs/nfs/dir.c. I traced what
happens to a simple Perl5 script in gdb, see below.
Regrettably, I can try neither 2.4.3 (fails to detect sda attached to
aic7xxx) nor 2.4.4-pre6 (does not compile because of __builtin_expect) nor
2.4.3-ac14 (pcnet32.c:327: pcnet32_pci_tbl causes a section type
conflict) to see whether they behave the same. Using gcc 2.95.2 here.
Solaris 7 clients doing the same user-space operations on the same NFS
server (Linux 2.2.19 knfsd) are fine.
Perl script to test:
use Fcntl;   # provides O_RDONLY
sysopen(O, "/net/server/usr/net", O_RDONLY) or die "sysopen failed: $!";
/net/server/usr/net is an automounted NFSv2 or NFSv3 directory
C source to trigger problem, with strace:
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
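int main(void) {
    int fd = open("/net/server/usr/net", O_RDONLY);
    /* again, that's an NFS-imported directory, no matter if NFSv2 or
       v3 */
    if (fd >= 0) {   /* open() returns -1 on failure; 0 is a valid fd */
        FILE *f = fdopen(fd, "r");
        if (f == NULL)
            perror("fdopen");   /* reports the EINVAL seen in the strace */
    }
    return 0;
}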
strace, omitting brk:
open("/net/server/usr/net", O_RDONLY) = 3
fcntl(3, F_GETFL) = 0 (flags O_RDONLY)
...
fstat64(3, 0xbffff45c) = -1 ENOSYS (Function not implemented)
fstat(3, {st_mode=S_IFDIR|S_ISGID|0755, st_size=2048, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x124000
_llseek(3, 0, 0xbffff4a4, SEEK_CUR) = -1 EINVAL (Invalid argument)
_exit(0) = ?
Stack trace, caught when entering _llseek:
#0 0x1cf665 in __llseek (fd=5, offset=0, whence=1)
at ../sysdeps/unix/sysv/linux/llseek.c:32
#1 0x17d49b in _IO_file_seek () at fileops.c:671
#2 0x17d3a9 in _IO_new_file_seekoff (fp=0x8049618, offset=0, dir=1, mode=3)
at fileops.c:652
#3 0x17cbc3 in _IO_new_file_attach (fp=0x8049618, fd=5) at fileops.c:268
#4 0x178f7c in _IO_new_fdopen (fd=5, mode=0x80484dc "r") at iofdopen.c:126
#5 0x804845e in main () at test.c:10
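The failing step can be isolated from stdio. The following is only a minimal sketch, assuming that plain lseek(2) with offset 0 and SEEK_CUR reaches the same kernel path as the _llseek call in the strace above. The helper names probe_position and report_position are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open `path` read-only and ask the kernel for the
 * current position without moving it. Returns the position, or (off_t)-1
 * with errno describing the failure. */
off_t probe_position(const char *path)
{
    int fd, saved;
    off_t pos;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return (off_t)-1;
    pos = lseek(fd, 0, SEEK_CUR);   /* same request fdopen ends up making */
    saved = errno;                  /* close() must not clobber errno */
    close(fd);
    errno = saved;
    return pos;
}

/* Hypothetical helper: print the outcome for one path. */
void report_position(const char *path)
{
    off_t pos = probe_position(path);

    if (pos == (off_t)-1)
        printf("%s: %s\n", path, strerror(errno));
    else
        printf("%s: position %ld\n", path, (long)pos);
}
```

Calling report_position("/net/server/usr/net") on an affected client should report "Invalid argument" if the assumption holds. On a local directory it should print a position instead.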
--
Matthias Andree
>>>>> " " == Matthias Andree <[email protected]> writes:
> Hi, SHORT:
> the current 2.2.19 fs/nfs/dir.c ll. 455ff. nfs_dir_llseek breaks
> fdopen(3) which (at least with glibc 2.1.3) calls __llseek with
> offset==0 and whence==1 (SEEK_CUR), probably to poll the
> current file position. Application software affected comprises
> cvs (tried 1.10.7) and Perl5 (sysopen, see below).
> I suggest that SEEK_CUR be allowed for offset == 0 in
> nfs_dir_llseek, but I'm asking for help since I'm not into this
> and cannot do this on my own. Thanks in advance.
Ion has already sent in a patch to Alan for this. Here it is...
Please note that if glibc is checking this return value, it will still
screw up if file->f_pos > 0x7fffffff, which can and does happen
against certain servers (particularly IRIX).
As I've said before: it is a bug for glibc to be relying on seekdir if
we want to support non-POSIX compliant filesystems under Linux.
Cheers,
Trond
--- /mnt/3/linux-2.2.19/fs/nfs/dir.c Sun Mar 25 08:37:38 2001
+++ linux-2.2.19/fs/nfs/dir.c Thu Apr 5 14:37:59 2001
@@ -454,6 +454,9 @@
*/
static loff_t nfs_dir_llseek(struct file *file, loff_t offset, int origin)
{
+ /* Glibc 2.0 backwards compatibility crap... */
+ if (origin == 1 && offset == 0)
+ return file->f_pos;
/* We disallow SEEK_CUR and SEEK_END */
if (origin != 0)
return -EINVAL;
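For readers who want to see the decision logic outside the kernel, here is a user-space model of the patched check. It is a sketch, not the kernel function. The name model_dir_llseek is made up, SEEK_SET, SEEK_CUR and SEEK_END stand in for the numeric origin values 0, 1 and 2, and the SEEK_SET branch simply returns the requested offset, which is an assumption because the hunk above does not show the rest of the function.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical model of the patched check. `f_pos` plays the role of
 * file->f_pos. Returns the resulting position or a negative errno. */
long long model_dir_llseek(long long f_pos, long long offset, int origin)
{
    assert(origin == SEEK_SET || origin == SEEK_CUR || origin == SEEK_END);

    /* New special case: "where am I?" queries are answered. */
    if (origin == SEEK_CUR && offset == 0)
        return f_pos;

    /* Every other SEEK_CUR and all SEEK_END requests are still refused. */
    if (origin != SEEK_SET)
        return -EINVAL;

    /* Assumption: an absolute seek just adopts the requested offset. */
    return offset;
}
```

Only the pure position query gets through. A relative seek with a nonzero offset still returns -EINVAL, which matches the comment kept in the patch.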
On Thu, 26 Apr 2001, Trond Myklebust wrote:
> Please note that if glibc is checking this return value, it will still
> screw up if file->f_pos > 0x7fffffff, which can and does happen
> against certain servers (particularly IRIX).
Do servers have directories that are this large? It'd take quite some
files to get a directory itself (not counting its files) to exceed 2 GB,
wouldn't it?
> As I've said before: it is a bug for glibc to be relying on seekdir if
> we want to support non-POSIX compliant filesystems under Linux.
There's no seekdir. No telldir. Just a "get the current file position".
Meanwhile, I took a glance at fileops.c and iofdopen.c in the glibc
source RPM that SuSE 7.0 uses. There is no seekdir-like stuff involved;
it's just that when glibc rolls the dice to get a FILE structure filled,
it fetches the current file position, since someone might have called
read before fdopen. I think that's legitimate.
Here's the excerpt, glibc-2.1/libio/fileops.c, ll. 255 ff.:
_IO_FILE *
_IO_new_file_attach (fp, fd)
_IO_FILE *fp;
int fd;
{
if (_IO_file_is_open (fp))
return NULL;
fp->_fileno = fd;
fp->_flags &= ~(_IO_NO_READS+_IO_NO_WRITES);
fp->_flags |= _IO_DELETE_DONT_CLOSE;
/* Get the current position of the file. */
/* We have to do that since that may be junk. */
fp->_offset = _IO_pos_BAD;
if (_IO_SEEKOFF (fp, (_IO_off64_t)0, _IO_seek_cur, _IOS_INPUT|_IOS_OUTPUT)
== _IO_pos_BAD && errno != ESPIPE)
return NULL;
return fp;
}
No seekdir (which would be pointless, since seekdir does not return a
value). Not even telldir. Just plain fdopen. Is a plain fdopen supposed to
fail just because some clients don't understand the semantics some server
uses, not even considering whose fault it might be? Certainly not.
> --- /mnt/3/linux-2.2.19/fs/nfs/dir.c Sun Mar 25 08:37:38 2001
> +++ linux-2.2.19/fs/nfs/dir.c Thu Apr 5 14:37:59 2001
> @@ -454,6 +454,9 @@
Thanks, will try.
--
Matthias Andree
On Thu, 26 Apr 2001, Trond Myklebust wrote:
> --- /mnt/3/linux-2.2.19/fs/nfs/dir.c Sun Mar 25 08:37:38 2001
> +++ linux-2.2.19/fs/nfs/dir.c Thu Apr 5 14:37:59 2001
> @@ -454,6 +454,9 @@
> */
> static loff_t nfs_dir_llseek(struct file *file, loff_t offset, int origin)
> {
> + /* Glibc 2.0 backwards compatibility crap... */
> + if (origin == 1 && offset == 0)
> + return file->f_pos;
This fixes my problems with cvs and my test script. Thanks a lot again!
>>>>> " " == Matthias Andree <[email protected]> writes:
> On Thu, 26 Apr 2001, Trond Myklebust wrote:
>> Please note that if glibc is checking this return value, it
>> will still screw up if file->f_pos > 0x7fffffff, which can and
>> does happen against certain servers (particularly IRIX).
> Do servers have directories that are this large? It'd take
> quite some files to get a directory itself (not counting its
> files) to exceed 2 GB, wouldn't it?
Check out IRIX. The xfs filesystem uses full 32 or 64 bit unsigned
cookies on all directories whether they are short or long.
Bottom line: you should not confuse directory cookies with offsets.
>> As I've said before: it is a bug for glibc to be relying on
>> seekdir if we want to support non-POSIX compliant filesystems
>> under Linux.
> There's no seekdir. No telldir. Just a "get the current file
> position".
Same difference.
> Meanwhile, I took a glance at fileops.c and iofdopen.c of the
> glibc source RPM that SuSE 7.0 uses, there is no seekdir like
> stuff involved, it's just that when glibc rolls the dice to get
> a FILE structure filled, it gathers the current file position,
> since someone might have called read before fdopen. I think
> that's legitimate.
> No seekdir. (would be pointless since seekdir does not return a
> value) Not even telldir. Just plain fdopen. Is a plain fdopen
> supposed to fail just because some clients don't understand the
> semantics some server uses; not even considering whose fault it
> might be? Certainly not.
File position for an NFS directory can take any 32bit or 64bit
unsigned value. This has been the case on Linux since before glibc2
was even a glimmer in Ulrich Drepper's eye.
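A short sketch of the arithmetic behind the 0x7fffffff remark: an opaque unsigned cookie can only be handed back through a signed offset type if it does not exceed that type's maximum. The helper names below are made up for illustration, and nothing here is taken from the kernel or glibc sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: can this cookie be returned through a signed
 * 32-bit offset without turning negative? */
int cookie_fits_signed32(uint64_t cookie)
{
    return cookie <= (uint64_t)INT32_MAX;
}

/* Hypothetical helper: the same question for a signed 64-bit offset. */
int cookie_fits_signed64(uint64_t cookie)
{
    return cookie <= (uint64_t)INT64_MAX;
}

/* Boundary check: 0x7fffffff is the last value a signed 32-bit offset can
 * carry; 0x80000000 already needs the sign bit. */
void cookie_selfcheck(void)
{
    assert(cookie_fits_signed32(UINT64_C(0x7fffffff)));
    assert(!cookie_fits_signed32(UINT64_C(0x80000000)));
    assert(cookie_fits_signed64(UINT64_C(0x80000000)));
    assert(!cookie_fits_signed64(UINT64_MAX));
}
```

A directory with only a few entries can still have a cookie above the 32-bit limit, because the value is an opaque token chosen by the server and not a byte count.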
Cheers,
Trond