2005-04-01 17:32:09

by Steven Procter

Subject: NFS crash problem in readdirplus


A readdirplus call with count=0 reliably causes a server crash due to a
null pointer dereference. The relevant information from my
/var/log/messages is at the end of this message.

I believe this is caused by the following code:

fs/nfsd/nfs3xdr.c:562:
int
nfs3svc_decode_readdirplusargs(struct svc_rqst *rqstp, u32 *p,
                               struct nfsd3_readdirargs *args)
{
        int len, pn;

        if (!(p = decode_fh(p, &args->fh)))
                return 0;
        p = xdr_decode_hyper(p, &args->cookie);
        args->verf = p; p += 2;
        args->dircount = ntohl(*p++);
        args->count = ntohl(*p++);

        len = (args->count > NFSSVC_MAXBLKSIZE) ? NFSSVC_MAXBLKSIZE :
                                                  args->count;
        args->count = len;

here>   while (len > 0) {
                pn = rqstp->rq_resused;
                svc_take_page(rqstp);
                if (!args->buffer)
                        args->buffer = page_address(rqstp->rq_respages[pn]);
                len -= PAGE_SIZE;
        }

        return xdr_argsize_check(rqstp, p);
}

If len is 0, the while loop never executes, so no page is taken and
args->buffer is never assigned.
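
To illustrate the point, here is a small userspace model of that loop. It is
only a sketch: model_decode_buffer, MODEL_PAGE_SIZE, MODEL_MAXBLKSIZE and the
static page array are names made up for this example and stand in for the
kernel's svc_take_page/page_address machinery. It shows that a count of 0
leaves the buffer pointer unset, while any nonzero count sets it.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE  4096   /* assumed page size for this model */
#define MODEL_MAXBLKSIZE 32768  /* hypothetical stand-in for NFSSVC_MAXBLKSIZE */
#define MODEL_NPAGES     (MODEL_MAXBLKSIZE / MODEL_PAGE_SIZE)

/* Hypothetical stand-in for the pages svc_take_page() would hand out. */
static char model_pages[MODEL_NPAGES][MODEL_PAGE_SIZE];

/*
 * Model of the decode loop: returns what the decoder would leave in
 * args->buffer for a given client-supplied count.
 */
static void *model_decode_buffer(unsigned int count)
{
        void *buffer = NULL;
        int len = (count > MODEL_MAXBLKSIZE) ? MODEL_MAXBLKSIZE : (int)count;
        int pn = 0;

        while (len > 0) {
                assert(pn < MODEL_NPAGES);  /* stay inside the model's page array */
                if (!buffer)
                        buffer = model_pages[pn];
                pn++;
                len -= MODEL_PAGE_SIZE;
        }

        /* With count == 0 the loop body never ran, so this is still NULL. */
        return buffer;
}
```

Calling model_decode_buffer(0) returns NULL; model_decode_buffer(1) or any
larger count returns a pointer to the first page.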

Here is my system information. I have seen this happen on various 2.4
servers as well.

# cat /proc/version
Linux version 2.6.5-1.358 ([email protected]) (gcc version 3.3.3 20040412 (Red Hat Linux 3.3.3-7)) #1 Sat May 8 09:04:50 EDT 2004

--Steven

--- /var/log/messages ---

Mar 29 13:26:31 tc47 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Mar 29 13:26:31 tc47 kernel: printing eip:
Mar 29 13:26:31 tc47 kernel: 42aeef47
Mar 29 13:26:31 tc47 kernel: *pde = 00000000
Mar 29 13:26:31 tc47 kernel: Oops: 0002 [#1]
Mar 29 13:26:31 tc47 kernel: CPU: 0
Mar 29 13:26:31 tc47 kernel: EIP: 0060:[<42aeef47>] Not tainted
Mar 29 13:26:31 tc47 kernel: EFLAGS: 00010246 (2.6.5-1.358)
Mar 29 13:26:31 tc47 kernel: EIP is at nfs3svc_encode_readdirres+0x3f/0x89 [nfsd]
Mar 29 13:26:31 tc47 kernel: eax: 00000000 ebx: 2d690800 ecx: 00000000 edx: 00000000
Mar 29 13:26:31 tc47 kernel: esi: 2d6908f8 edi: 33e5b080 ebp: 03a92800 esp: 3fe58f50
Mar 29 13:26:31 tc47 kernel: ds: 007b es: 007b ss: 0068
Mar 29 13:26:31 tc47 kernel: Process nfsd (pid: 1190, threadinfo=3fe58000 task=3cf7f930)
Mar 29 13:26:31 tc47 kernel: Stack: 03a92864 03a92800 42aeef08 33e5b020 42b019e4 42ae35a6 33e5b018 03a92864
Mar 29 13:26:31 tc47 kernel: 03a92800 42b01a98 33e5b018 42a7ec24 fffffeff 00000043 0000010c 00000100
Mar 29 13:26:31 tc47 kernel: 000186a3 03a92840 42b019e4 42b01a98 42b00ee0 03948504 00000000 18b7b1a1
Mar 29 13:26:31 tc47 kernel: Call Trace:
Mar 29 13:26:31 tc47 kernel: [<42aeef08>] nfs3svc_encode_readdirres+0x0/0x89 [nfsd]
Mar 29 13:26:31 tc47 kernel: [<42ae35a6>] nfsd_dispatch+0x117/0x165 [nfsd]
Mar 29 13:26:31 tc47 kernel: [<42a7ec24>] svc_process+0x323/0x55f [sunrpc]
Mar 29 13:26:31 tc47 kernel: [<42ae3355>] nfsd+0x18f/0x2c9 [nfsd]
Mar 29 13:26:31 tc47 kernel: [<42ae31c6>] nfsd+0x0/0x2c9 [nfsd]
Mar 29 13:26:31 tc47 kernel: [<021041d9>] kernel_thread_helper+0x5/0xb
Mar 29 13:26:31 tc47 kernel:
Mar 29 13:26:31 tc47 kernel: Code: c7 02 00 00 00 00 81 bb f8 00 00 00 00 00 75 31 0f 94 c0 0f




_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-04-20 08:49:41

by Olaf Kirch

Subject: Re: NFS crash problem in readdirplus

On Fri, Apr 01, 2005 at 09:30:53AM -0800, Steven wrote:
> A readdirplus call with count=0 reliably causes a server crash due to a
> null pointer dereference. The relevant information from my
> /var/log/messages is at the end of this message.

According to your Oops, it dies here:

int
nfs3svc_encode_readdirres(struct svc_rqst *rqstp, u32 *p,
                          struct nfsd3_readdirres *resp)
{
        p = encode_post_op_attr(rqstp, p, &resp->fh);

        if (resp->status == 0) {
                /* stupid readdir cookie */
                memcpy(p, resp->verf, 8); p += 2;
                xdr_ressize_check(rqstp, p);
                p = resp->buffer;
here -->>       *p++ = 0; /* no more entries */
                *p++ = htonl(resp->common.err == nfserr_eof);

resp->buffer is NULL because no entries were encoded. This was fixed
by Neil in 2.6.9.
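
For illustration only, the failing store can be modelled in userspace like
this. This is not the change that went into 2.6.9, just a sketch of what a
defensive check would look like; struct model_readdirres and
model_encode_tail are made-up names, and the byte-order conversion is left
out to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the response state; not the kernel's struct. */
struct model_readdirres {
        uint32_t *buffer;  /* NULL when the decoder never took a page */
        int eof;           /* nonzero if the directory listing is complete */
};

/*
 * Writes the two trailing words ("no more entries" and the eof flag).
 * Returns the number of words written, or -1 if there is no buffer to
 * write into, which is the case that oopses in the code above.
 */
static int model_encode_tail(struct model_readdirres *resp)
{
        uint32_t *p = resp->buffer;

        if (p == NULL)
                return -1;  /* refuse instead of dereferencing NULL */

        *p++ = 0;                  /* no more entries */
        *p++ = resp->eof ? 1 : 0;  /* htonl() omitted in this model */
        return 2;
}
```

With buffer set to NULL the function returns -1 rather than faulting; with
a two-word buffer it writes 0 followed by the eof flag.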

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax

