2008-05-15 14:06:30

by Adam Olsen

Subject: NFS+GD issues on kernel 2.6.24, but not 2.6.22

Hello,

I'm having an issue with perl and libgd reading TrueType fonts over an
NFS mount.

The NFS server is an Isilon cluster (I believe they are based on FreeBSD 6.1?)
The client is a machine that's just been updated to Hardy.
* Kernel 2.6.24-16-server
* libgd-gd2-perl 2.35-1
* libgd2-xpm 2.0.35

The issue is this: using libgd via a perl script, I try writing text
to an image in a font, specified by a file name (the font is located
on the remote NFS server). The script says that it cannot locate the
font. The client machine does not have any other known problems with
the NFS share, it can read and write other files just fine. We have
several other machines that are reading this same share, though they
are running kernel 2.6.22 from Gutsy (I confirmed that a downgrade to
2.6.22 on Hardy resolves the issue).

Here is a link to a script that can duplicate the problem 100% of the
time: http://soc.ath.cx/testfont.pl.txt

An strace using the kernel that doesn't work looks like this:

access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFREG|0666, st_size=48784, ...}) = 0
close(3) = 0

And the strace on the system that *does* work looks like this:

access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFREG|0666, st_size=48784, ...}) = 0
mmap2(NULL, 48784, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb79cd000
close(3) = 0

Everything looks the same except the missing "mmap2" section. Again,
a note, this same machine *can* access other shares and read fonts
just fine. It appears to just be the connection to the Isilon
machine. Everything worked before the upgrade, and a downgrade to
2.6.22 works. Any information would be helpful.

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic


2008-05-15 15:46:46

by Jeff Layton

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, 15 May 2008 08:06:29 -0600
"Adam Olsen" <[email protected]> wrote:

> Hello,
>
> I'm having an issue with perl and libgd reading TrueType fonts over an
> NFS mount.
>
> The NFS server is an Isilon cluster (I believe they are based on FreeBSD 6.1?)
> The client is a machine that's just been updated to Hardy.
> * Kernel 2.6.24-16-server
> * libgd-gd2-perl 2.35-1
> * libgd2-xpm 2.0.35
>
> The issue is this: using libgd via a perl script, I try writing text
> to an image in a font, specified by a file name (the font is located
> on the remote NFS server). The script says that it cannot locate the
> font. The client machine does not have any other known problems with
> the NFS share, it can read and write other files just fine. We have
> several other machines that are reading this same share, though they
> are running kernel 2.6.22 from Gutsy (I confirmed that a downgrade to
> 2.6.22 on Hardy resolves the issue).
>
> Here is a link to a script that can duplicate the problem 100% of the
> time: http://soc.ath.cx/testfont.pl.txt
>
> An strace using the kernel that doesn't work looks like this:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_mode=S_IFREG|0666, st_size=48784, ...}) = 0
> close(3) = 0
>
> And the strace on the system that *does* work looks like this:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_mode=S_IFREG|0666, st_size=48784, ...}) = 0
> mmap2(NULL, 48784, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb79cd000
> close(3) = 0
>

> Everything looks the same except the missing "mmap2" section. Again,
> a note, this same machine *can* access other shares and read fonts
> just fine. It appears to just be the connection to the Isilon
> machine. Everything worked before the upgrade, and a downgrade to
> 2.6.22 works. Any information would be helpful.
>


It really depends on the program, but I'd guess that it saw something
in the fstat64() call that it didn't like. You might want to use strace
with '-v -s 256' or something and look for differences in the info
returned by the fstat64 call.
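
Alongside the verbose strace, the same field can be read from a small script and compared between the two kernels. This is only a minimal sketch in Python; the helper name `describe_inode` is made up for illustration and is not part of any tool mentioned in this thread.

```python
import os

UINT32_MAX = 0xFFFFFFFF  # largest value a 32-bit inode field can hold


def describe_inode(path):
    """Return (st_ino, fits_in_32_bits) for the file at `path`."""
    st = os.stat(path)
    return st.st_ino, st.st_ino <= UINT32_MAX
```

Calling `describe_inode("/mnt/isilon/fonts/arial.ttf")` on each client would show whether the inode number reported there still fits in 32 bits.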

Cheers,
--
Jeff Layton <[email protected]>

2008-05-15 15:55:19

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 9:46 AM, Jeff Layton <[email protected]> wrote:
> It really depends on the program, but I'd guess that it saw something
> in the fstat64() call that it didn't like. You might want to use strace
> with '-v -s 256' or something and look for differences in the info
> returned by the fstat64 call.

With the working kernel:

access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_dev=makedev(0, 25), st_ino=4634215,
st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
st_blksize=32768, st_blocks=194, st_size=48784,
st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
st_ctime=2008/05/14-21:12:18}) = 0
mmap2(NULL, 48784, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7a07000
close(3)

With the *non* working kernel:

access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_dev=makedev(0, 23), st_ino=4299601510,
st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
st_blksize=32768, st_blocks=194, st_size=48784,
st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
st_ctime=2008/05/14-21:12:18}) = 0
close(3) = 0

Still looks almost identical, except the missing mmap2 in the
non-working kernel. Also, the st_ino is different... should they be
the same?

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 16:20:27

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 10:08 AM, James Pearson
<[email protected]> wrote:
> Is this a 32 bit app running on a 64 bit OS by any chance?

Nope, here is `uname -a` on the failing system:

Linux slavedb 2.6.24-16-server #1 SMP Thu Apr 10 13:58:00 UTC 2008
i686 GNU/Linux

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 16:35:45

by James Pearson

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

Adam Olsen wrote:
> On Thu, May 15, 2008 at 9:46 AM, Jeff Layton <[email protected]> wrote:
>
>> It really depends on the program, but I'd guess that it saw something
>> in the fstat64() call that it didn't like. You might want to use strace
>> with '-v -s 256' or something and look for differences in the info
>> returned by the fstat64 call.
>
>
> With the working kernel:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_dev=makedev(0, 25), st_ino=4634215,
> st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
> st_blksize=32768, st_blocks=194, st_size=48784,
> st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
> st_ctime=2008/05/14-21:12:18}) = 0
> mmap2(NULL, 48784, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7a07000
> close(3)
>
> With the *non* working kernel:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_dev=makedev(0, 23), st_ino=4299601510,
> st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
> st_blksize=32768, st_blocks=194, st_size=48784,
> st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
> st_ctime=2008/05/14-21:12:18}) = 0
> close(3) = 0
>
> Still looks almost identical, except the missing mmap2 in the
> non-working kernel. Also, the st_ino is different... should they be
> the same?

Is this a 32 bit app running on a 64 bit OS by any chance?

James Pearson


2008-05-15 18:37:36

by Trond Myklebust

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, 2008-05-15 at 09:55 -0600, Adam Olsen wrote:
> On Thu, May 15, 2008 at 9:46 AM, Jeff Layton <[email protected]> wrote:
> > It really depends on the program, but I'd guess that it saw something
> > in the fstat64() call that it didn't like. You might want to use strace
> > with '-v -s 256' or something and look for differences in the info
> > returned by the fstat64 call.
>
> With the working kernel:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_dev=makedev(0, 25), st_ino=4634215,
> st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
> st_blksize=32768, st_blocks=194, st_size=48784,
> st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
> st_ctime=2008/05/14-21:12:18}) = 0
> mmap2(NULL, 48784, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7a07000
> close(3)
>
> With the *non* working kernel:
>
> access("/mnt/isilon/fonts/arial.ttf", R_OK) = 0
> open("/mnt/isilon/fonts/arial.ttf", O_RDONLY) = 3
> fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
> fstat64(3, {st_dev=makedev(0, 23), st_ino=4299601510,
> st_mode=S_IFREG|0666, st_nlink=1, st_uid=1001, st_gid=100,
> st_blksize=32768, st_blocks=194, st_size=48784,
> st_atime=2008/05/14-11:46:53, st_mtime=2008/05/14-11:46:53,
> st_ctime=2008/05/14-21:12:18}) = 0
> close(3) = 0
>
> Still looks almost identical, except the missing mmap2 in the
> non-working kernel. Also, the st_ino is different... should they be
> the same?

Looks as if you've got a 32-bit application that doesn't like 64-bit
inode numbers. Try booting with the kernel parameter
'nfs.enable_ino64=0'.
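
The arithmetic behind this can be checked with the two st_ino values from the traces. The sketch below is Python and rests on an assumption that is not stated in this thread: that with 64-bit inode numbers disabled, the client folds the 64-bit file ID into 32 bits by XOR-ing the upper half into the lower half. The function names are illustrative only.

```python
UINT32_MASK = 0xFFFFFFFF


def fits_in_32_bits(ino):
    """True if the inode number can be stored in a 32-bit field."""
    return 0 <= ino <= UINT32_MASK


def fold_ino64(fileid):
    """Fold a 64-bit file ID into 32 bits by XOR-ing its halves (assumed scheme)."""
    return (fileid ^ (fileid >> 32)) & UINT32_MASK


ino_2_6_24 = 4299601510  # st_ino from the non-working trace
ino_2_6_22 = 4634215     # st_ino from the working trace

print(fits_in_32_bits(ino_2_6_24))           # False: larger than 2**32 - 1
print(fold_ino64(ino_2_6_24) == ino_2_6_22)  # True under the assumed folding
```

Under that assumption the two values describe the same file: 4299601510 is 2**32 + 4634214, and XOR-ing the upper half (1) into the lower half gives 4634215.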

Trond


2008-05-15 20:00:04

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 12:37 PM, Trond Myklebust
<[email protected]> wrote:
> Looks as if you've got a 32-bit application that doesn't like 64-bit
> inode numbers. Try booting with the kernel parameter
> 'nfs.enable_ino64=0'.

Sorry if this is a dumb question, but is that going to hurt me in the
long run? This filesystem has the capability of being rather huge
(it's currently at 27TB).

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 20:04:49

by Trond Myklebust

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, 2008-05-15 at 14:00 -0600, Adam Olsen wrote:
> On Thu, May 15, 2008 at 12:37 PM, Trond Myklebust
> <[email protected]> wrote:
> > Looks as if you've got a 32-bit application that doesn't like 64-bit
> > inode numbers. Try booting with the kernel parameter
> > 'nfs.enable_ino64=0'.
>
> Sorry if this is a dumb question, but is that going to hurt me in the
> long run? This filesystem has the capability of being rather huge
> (it's currently at 27TB).

Some applications (e.g. backup apps) may gripe at the fact that the
inode numbers are no longer guaranteed to be unique, but the only
alternative solution to that would be to convert your apps to be 64-bit
safe.
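
To illustrate the uniqueness point: once 64-bit file IDs are squeezed into 32 bits, two different files can end up with the same inode number. A minimal Python sketch, assuming an XOR-fold of the upper and lower 32-bit halves (an assumption, not something confirmed here); the second file ID is hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF


def fold_ino64(fileid):
    """Fold a 64-bit file ID into 32 bits by XOR-ing its halves (assumed scheme)."""
    return (fileid ^ (fileid >> 32)) & UINT32_MASK


file_a = 4299601510  # file ID seen in the trace from the 2.6.24 client
file_b = 4634215     # hypothetical second file whose ID already fits in 32 bits

print(file_a != file_b)                          # True: distinct files on the server
print(fold_ino64(file_a) == fold_ino64(file_b))  # True: same inode number on the client
```

A backup tool that uses (st_dev, st_ino) to detect hard links could then mistake two such files for one.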

Trond


2008-05-15 20:06:30

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 2:04 PM, Trond Myklebust
<[email protected]> wrote:
> Some applications (e.g. backup apps) may gripe at the fact that the
> inode numbers are no longer guaranteed to be unique, but the only
> alternative solution to that would be to convert your apps to be 64-bit
> safe.

In general, it would be ok though?

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 20:15:52

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 12:37 PM, Trond Myklebust
<[email protected]> wrote:
> Looks as if you've got a 32-bit application that doesn't like 64-bit
> inode numbers. Try booting with the kernel parameter
> 'nfs.enable_ino64=0'.

Ok, tried that. It's still a no go.

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 20:50:03

by Jeff Layton

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, 15 May 2008 14:15:51 -0600
"Adam Olsen" <[email protected]> wrote:

> On Thu, May 15, 2008 at 12:37 PM, Trond Myklebust
> <[email protected]> wrote:
> > Looks as if you've got a 32-bit application that doesn't like 64-bit
> > inode numbers. Try booting with the kernel parameter
> > 'nfs.enable_ino64=0'.
>
> Ok, tried that. It's still a no go.
>

I suggest some debugging of the actual application. Everything looks
fine from a system call standpoint. You'll need to debug the
application and figure out why it's not doing what you expect.

--
Jeff Layton <[email protected]>

2008-05-15 20:57:22

by Jeff Layton

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, 15 May 2008 16:49:13 -0400
Jeff Layton <[email protected]> wrote:

> On Thu, 15 May 2008 14:15:51 -0600
> "Adam Olsen" <[email protected]> wrote:
>
> > On Thu, May 15, 2008 at 12:37 PM, Trond Myklebust
> > <[email protected]> wrote:
> > > Looks as if you've got a 32-bit application that doesn't like 64-bit
> > > inode numbers. Try booting with the kernel parameter
> > > 'nfs.enable_ino64=0'.
> >
> > Ok, tried that. It's still a no go.
> >
>
> I suggest some debugging of the actual application. Everything looks
> fine from a system call standpoint. You'll need to debug the
> application and figure out why it's not doing what you expect.
>

Actually...my suspicion is that Trond is right and this app (or maybe a
library) doesn't like 64 bit inode numbers. It was probably not built
with LFS defines. glibc will turn that into a fstat64() system call,
and when it gets an inode number that won't fit in the field, it will
generate a -EOVERFLOW in userspace. You won't see it in an strace. An
ltrace *might* show it, or you could hook up gdb to your program and
try to look at it that way.
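
The userspace check described above can be modelled in a few lines. This is a Python simulation, not glibc's actual code; `narrow_st_ino` is a made-up name, and the only behaviour it copies is "fail with EOVERFLOW when the 64-bit value does not fit in the 32-bit field".

```python
import errno
import os

UINT32_MAX = 0xFFFFFFFF  # capacity of a 32-bit st_ino field


def narrow_st_ino(st_ino64):
    """Copy a 64-bit inode number into a 32-bit field, failing the way a
    non-LFS stat wrapper is described to fail in this thread."""
    if st_ino64 > UINT32_MAX:
        raise OSError(errno.EOVERFLOW, os.strerror(errno.EOVERFLOW))
    return st_ino64


for ino in (4634215, 4299601510):  # the st_ino values from the two traces
    try:
        print(ino, "->", narrow_st_ino(ino))
    except OSError as exc:
        print(ino, "->", errno.errorcode[exc.errno])
```

The system call itself succeeds in both cases, which would explain why the straces look identical up to the point where the library gives up and closes the file.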


Good luck

--
Jeff Layton <[email protected]>

2008-05-15 21:19:54

by Adam Olsen

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On Thu, May 15, 2008 at 2:56 PM, Jeff Layton <[email protected]> wrote:
> Actually...my suspicion is that Trond is right and this app (or maybe a
> library) doesn't like 64 bit inode numbers. It was probably not built
> with LFS defines. glibc will turn that into a fstat64() system call,
> and when it gets an inode number that won't fit in the field, it will
> generate a -EOVERFLOW in userspace. You won't see it in an strace. An
> ltrace *might* show it, or you could hook up gdb to your program and
> try to look at it that way.

Ok, the library is freetype. I'll look for related bugs on their
site. Why does it work in kernel 2.6.22, though?

--
Adam Olsen
SendOutCards.com
http://www.vimtips.org
http://last.fm/user/synic

2008-05-15 21:29:05

by James Pearson

Subject: Re: NFS+GD issues on kernel 2.6.24, but not 2.6.22

On 15/05/2008, Jeff Layton <[email protected]> wrote:
>
> Actually...my suspicion is that Trond is right and this app (or maybe a
> library) doesn't like 64 bit inode numbers. It was probably not built
> with LFS defines. glibc will turn that into a fstat64() system call,
> and when it gets an inode number that won't fit in the field, it will
> generate a -EOVERFLOW in userspace. You won't see it in an strace. An
> ltrace *might* show it, or you could hook up gdb to your program and
> try to look at it that way.

We had a similar issue with Isilon NFS mounts - but only had an issue
with non-LFS apps (32 bit) running on 64 bit clients - hence my
previous question ...

The 'fix' we use on the Isilon servers is to enable 32 bit 'fileids'
on the server nodes by adding the line:

vfs.nfsrv.do_32bit_fileid=1

to /etc/mcp/override/sysctl.conf (create the file if it doesn't
exist) on any node (it gets copied to all nodes and updated
automagically) - however, _do not_ do this if you have any mounted
clients, as they are likely to get stale mounts ...

However, we get this issue with earlier 2.6 clients - so I'm not sure
it is exactly the same problem ...

James Pearson