OK. I've managed to squash the NFS read corruption problems that I had
on my 2.5.x client setup with the following patch.
Since the two of you reported what appears to be the same problem,
would you mind trying it out?
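The fix tightens up the consistency checks made while reading the skb
(which is done in the sk->data_ready() callback): a failed
skb_copy_bits() now counts as zero bytes read, and any bytes still
outstanding in desc.count after the copy are treated as an error.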
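
For anyone who wants to see the idea outside the kernel, here is a rough
userspace sketch of the same pattern. It is not the kernel code: struct
buf_reader, buf_copy_bits(), buf_read_bits() and buf_read_all() are
hypothetical stand-ins for the skb reader descriptor, skb_copy_bits() and
the read helpers, and the 4-byte chunk size is arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the skb reader descriptor. */
struct buf_reader {
	const unsigned char *data;	/* source buffer (plays the role of the skb) */
	size_t size;			/* bytes actually present in the source */
	size_t offset;			/* current read position */
	size_t count;			/* bytes the caller still expects */
};

/* Stand-in for skb_copy_bits(): nonzero if the range is out of bounds. */
static int buf_copy_bits(const struct buf_reader *r, size_t off,
			 void *to, size_t len)
{
	if (off > r->size || len > r->size - off)
		return -1;
	memcpy(to, r->data + off, len);
	return 0;
}

/* Copy up to len bytes; return 0 without advancing if the copy fails. */
static size_t buf_read_bits(struct buf_reader *r, void *to, size_t len)
{
	if (len > r->count)
		len = r->count;
	if (buf_copy_bits(r, r->offset, to, len))
		return 0;
	r->count -= len;
	r->offset += len;
	return len;
}

/* Read `want` bytes in small chunks; return -1 if the source runs short. */
static int buf_read_all(const unsigned char *src, size_t src_len,
			unsigned char *dst, size_t want)
{
	struct buf_reader r = { src, src_len, 0, want };
	size_t done = 0;

	while (r.count) {
		size_t chunk = r.count < 4 ? r.count : 4;
		size_t n = buf_read_bits(&r, dst + done, chunk);

		if (n == 0)
			break;		/* copy failed: stop, do not pretend success */
		done += n;
	}
	/* Leftover count means we did not get everything we asked for. */
	return r.count ? -1 : 0;
}
```

Asking buf_read_all() for more bytes than the source holds returns -1
instead of quietly handing back a partly filled destination buffer.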
Cheers,
Trond
diff -u --recursive --new-file linux-2.5.66-10-nr_dirty/net/sunrpc/xprt.c linux-2.5.66-11-fix_read/net/sunrpc/xprt.c
--- linux-2.5.66-10-nr_dirty/net/sunrpc/xprt.c 2003-03-27 18:34:08.000000000 +0100
+++ linux-2.5.66-11-fix_read/net/sunrpc/xprt.c 2003-04-07 15:15:29.000000000 +0200
@@ -625,7 +625,8 @@
 {
 	if (len > desc->count)
 		len = desc->count;
-	skb_copy_bits(desc->skb, desc->offset, to, len);
+	if (skb_copy_bits(desc->skb, desc->offset, to, len))
+		return 0;
 	desc->count -= len;
 	desc->offset += len;
 	return len;
@@ -669,11 +670,15 @@
 		csum2 = skb_checksum(skb, desc.offset, skb->len - desc.offset, 0);
 		desc.csum = csum_block_add(desc.csum, csum2, desc.offset);
 	}
+	if (desc.count)
+		return -1;
 	if ((unsigned short)csum_fold(desc.csum))
 		return -1;
 	return 0;
 no_checksum:
 	xdr_partial_copy_from_skb(xdr, 0, &desc, skb_read_bits);
+	if (desc.count)
+		return -1;
 	return 0;
 }
@@ -750,7 +755,8 @@
 {
 	if (len > desc->count)
 		len = desc->count;
-	skb_copy_bits(desc->skb, desc->offset, p, len);
+	if (skb_copy_bits(desc->skb, desc->offset, p, len))
+		return 0;
 	desc->offset += len;
 	desc->count -= len;
 	return len;
On Mon, 2003-04-07 at 09:23, Trond Myklebust wrote:
> OK. I've managed to squash the NFS read corruption problems that I had
> on my 2.5.x client setup with the following patch.
> Since the two of you reported what appears to be the same problem,
> would you mind trying it out?
This fixes it for me. No errors, no corruption.
I did a verify of the md5sums of all of the Red Hat 9 RPM packages over
NFS. I had random failures (in different packages each time) before.
I just did it twice to be sure -- it works.
Thank you, Trond.
Robert Love