Hi Trond,
I am running a sparc-linux kernel using an NFS Root and it is falling over
with the trace below.
My Kernel is not a standard kernel (I have had to tweak it to get the
SBUS GC3 and the 82077 floppy to work on my OPUS Sparc 1 clone).
Can you advise me on any known issues in the NFS Client code that might
put NULL pointers into 'slot->slots[i]' in __lookup_tag?
If there are none that you are aware of, are there any specific areas that
I should investigate with printk statements?
The Kernel is cross compiled on an Athlon 64 3400+ (32-bit Linux at the
moment) using GCC-4.0.2 and Binutils-2.16.1. Compilation takes about 10
minutes so there is no real issue in making changes to the kernel to find
the source of the problem.
A compiler/binutils bug should not be ruled out. I might try
gcc-3.4.3/binutils-2.15.
Please let me know if you would like further information.
Regards
Mark Fortescue.
--------------------------------------------------------------------------
kernel BUG at /L64/src/linux-2.6/linux-2.6.13.4-p01/lib/radix-tree.c:575!
\|/ ____ \|/
"@'/ ,. \`@"
/_| \__/ |_\
\__U_/
ld(45): Kernel bad trap [#1]
PSR: 004000c4 PC: f00e0ff4 NPC: f00e0ff8 Y: 00000000 Not tainted
PC: <radix_tree_gang_lookup_tag+0x144/0x1ac>
%G: 00000001 f022ec00 f022eccc 00400fe2 f002fd18 f022ec00 ff020000 00000000
%O: 0000004d f01fbd78 0000023f 00000000 00000001 00000000 ff021a48 f00e0fec
RPC: <radix_tree_gang_lookup_tag+0x13c/0x1ac>
%L: 00000001 ff021b14 0000003f 00000001 00000002 00000000 ff020000 e0162000
%I: 00000000 ff021b14 00000000 00000001 00000008 ff021b10 ff021ab0 f00b10d4
Caller[f00b10d4]: nfs_wait_on_requests+0x98/0xb8
Caller[f00b2a70]: nfs_sync_inode+0x20/0x74
Caller[f00b063c]: nfs_readpage+0x44/0x44c
Caller[f004fc8c]: do_generic_mapping_read+0x290/0x564
Caller[f005084c]: __generic_file_aio_read+0x168/0x1cc
Caller[f0050a2c]: generic_file_aio_read+0x44/0x54
Caller[f006e298]: do_sync_read+0x94/0xc8
Caller[f006e62c]: vfs_read+0xa0/0x15c
Caller[f006f200]: sys_read+0x30/0x64
Caller[f001144c]: syscall_is_too_hard+0x34/0x40
Caller[e0096e58]: 0xe0096e58
Instruction DUMP: 90122178 7ffcc514 01000000 <91d02005> 9402a001 80a28010 0280000f c4244001 8600e001
--------------------------------------------------------------------------
On Thu, 2005-11-03 at 18:10 +0000, Mark Fortescue wrote:
> Hi Trond,
>
> I am running a sparc-linux kernel using an NFS Root and it is falling over
> with the trace below.
>
> My Kernel is not a standard kernel (I have had to tweak it to get the
> SBUS GC3 and the 82077 floppy to work on my OPUS Sparc 1 clone).
>
> Can you advise me on any known issues in the NFS Client code that might
> enter NULL pointers into the 'slot->slots[i]' in __lookup_tag.
>
> If there are none that you are aware of, are there any specific areas that
> I should investigate with printk statements.
NFS does not ever directly access the radix tree internals: it always
uses the API, and it always protects those operations using the
NFS_I(inode)->req_lock.
Are you sure that radix_tree_init() is being called before the NFSroot
stuff is started? To me, this whole thing smells of memory scribble.
Cheers,
Trond
Hi Trond,
The error occurs well after the NFS root is up and running (during the
link phase of a gcc compilation of hello.c). I thought it might be part of
the NFS system due to the backtrace.
I am currently working on a GCC-3.4.3, Binutils-2.15 version to see if it
is a compiler/binary utilities issue. The problem I have is that
GCC-3.4.3, Binutils-2.15 does not cope with printk("%llu") so I know that
there is a high potential for failure with this combination.
If it works OK, I will try with GCC-3.4.3, Binutils-2.16.1 and GCC-4.0.2,
Binutils-2.15 to try and eliminate compiler/binutils issues.
Once I have eliminated compiler/binutils bugs/features then I will start
to concentrate on the Kernel to try to identify the initial point of
failure.
Regards
Mark Fortescue.
On Thu, 3 Nov 2005, Trond Myklebust wrote:
> On Thu, 2005-11-03 at 18:10 +0000, Mark Fortescue wrote:
> > Hi Trond,
> >
> > I am running a sparc-linux kernel using an NFS Root and it is falling over
> > with the trace below.
> >
> > My Kernel is not a standard kernel (I have had to tweak it to get the
> > SBUS GC3 and the 82077 floppy to work on my OPUS Sparc 1 clone).
> >
> > Can you advise me on any known issues in the NFS Client code that might
> > enter NULL pointers into the 'slot->slots[i]' in __lookup_tag.
> >
> > If there are none that you are aware of, are there any specific areas that
> > I should investigate with printk statements.
>
> NFS does not ever directly access the radix tree internals: it always
> uses the API, and it always protects those operations using the
> NFS_I(inode)->req_lock.
>
> Are you sure that radix_tree_init() is being called before the NFSroot
> stuff is started? To me, this whole thing smells of memory scribble.
>
> Cheers,
> Trond
>
>
Hi Trond,
I have found a working combination of GCC/Binutils: gcc-3.4.3 with
binutils-2.16.1 (GCC needs more work, as it got its specs wrong and has a
bug in it regarding %llu on sparc).
This suggests that there is a kernel build error associated with GCC-4.0.2
(for sparc-linux). I will need to investigate this, as GCC-4.0.2 has a
variety of bug fixes in it that affect the sparc-linux target. It also has
improved configuration/build scripts that are relevant to what I am trying
to do.
I will let you know what I find. It may take me some time, as my sparc
assembly is not too good and the generated assembly is the best place to
find compiler hiccups.
Regards
Mark Fortescue.