LinuxLists.cc - NFS corruption on 2.6.11.7

2005-05-23 23:45:04

Subject: NFS corruption on 2.6.11.7

I have both the server and client running 2.6.11.7 and see severe data
corruption when reading from the server (possibly on write too; I have
not tested that).

If I copy the data over with scp or ftp I get correct data. NFS also
works OK with a Mac OS X 10.4 client.

Running gen.sh on the server and then cmp.sh on the client results in an
md5 checksum difference on 5-12 files. I have never had a single run
without errors.

This is what cat /proc/mounts reports for the NFS mount:

:/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
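
The contents of gen.sh and cmp.sh are not shown in the thread. As a rough sketch of the kind of test described above, assuming the scripts write a set of random files plus their checksums on the server and then recompute the checksums on the client, something like the following Python would do the same job. The function names, file names, file count, and sizes are hypothetical.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generate(directory, count=4, size=1 << 16):
    """Server side: write `count` random files and return {name: md5}."""
    os.makedirs(directory, exist_ok=True)
    sums = {}
    for i in range(count):
        name = str(i)
        path = os.path.join(directory, name)
        with open(path, "wb") as fh:
            fh.write(os.urandom(size))
        sums[name] = md5_of(path)
    return sums


def compare(directory, expected):
    """Client side: return the sorted names whose MD5 no longer matches."""
    return sorted(
        name for name, digest in expected.items()
        if md5_of(os.path.join(directory, name)) != digest
    )
```

For the comparison to exercise the network path, the client has to read the files over the NFS mount rather than from its own page cache, so the files should be freshly generated for each run.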

Attachments:

gen.sh (176.00 B)
cmp.sh (147.00 B)

2005-05-24 00:36:15

by Trond Myklebust

Subject: Re: NFS corruption on 2.6.11.7

On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:
> I have both the server and client running 2.6.11.7 and see severe data
> corruption when reading from the server (possibly on write too; I have
> not tested that).
>
> If I copy the data over with scp or ftp I get correct data. NFS also
> works OK with a Mac OS X 10.4 client.
>
> Running gen.sh on the server and then cmp.sh on the client results in an
> md5 checksum difference on 5-12 files. I have never had a single run
> without errors.
>
> This is what cat /proc/mounts reports for the NFS mount:
>
> :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
>

I'm seeing no problems at all with this on a loopback mount with
2.6.12-rc4. Mind giving us some more details on your setup?

Cheers,
Trond

Attachments:

sum_org (1.07 kB)
sum_new (1.07 kB)

2005-05-24 01:12:10

by Trond Myklebust

Subject: Re: NFS corruption on 2.6.11.7

On Mon, 23.05.2005 at 20:35 (-0400), Trond Myklebust wrote:
> On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:

> > This is what cat /proc/mounts reports for the NFS mount:
> >
> > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> >

BTW: Why is /proc/mounts reporting the server as being an empty string?
Normally, the "mount" program should be setting that to whatever you
specified on the command line.

Cheers,
Trond

2005-05-24 10:22:00

by Kenneth Johansson

Subject: Re: NFS corruption on 2.6.11.7

On Mon, 2005-05-23 at 20:35 -0400, Trond Myklebust wrote:
> On Tue, 24.05.2005 at 00:47 (+0200), Kenneth Johansson wrote:
> > I have both the server and client running 2.6.11.7 and see severe data
> > corruption when reading from the server (possibly on write too; I have
> > not tested that).
> >
> > If I copy the data over with scp or ftp I get correct data. NFS also
> > works OK with a Mac OS X 10.4 client.
> >
> > Running gen.sh on the server and then cmp.sh on the client results in an
> > md5 checksum difference on 5-12 files. I have never had a single run
> > without errors.
> >
> > This is what cat /proc/mounts reports for the NFS mount:
> >
> > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> >
>
> I'm seeing no problems at all with this on a loopback mount with
> 2.6.12-rc4. Mind giving us some more details on your setup?
>
> Cheers,
> Trond

I did some more investigation into what kind of data errors I get, and
it looks a bit strange. I always get 28 consecutive wrong bytes.
Sometimes this is data repeated from earlier in the file, but not
always. Does anybody know what cache line size this CPU has?

processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(TM) XP 2200+
stepping : 0
cpu MHz : 1802.998
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse pni syscall mmxext 3dnowext 3dnow
bogomips : 3547.13

Here is a sample of three files with errors in them.

file 13 "od -Ax -tx1z"

-924dc0 df b3 0c 89 2d a2 83 da 1c 08 f2 66 da f6 6b f4 >....-......f..k.<
+924dc0 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

-924dd0 af c2 44 57 9a 13 01 43 84 bf 99 c3 1b 16 8a 00 >..DW...C........<
+924dd0 3e 64 d7 bd 4f 8d 26 cf 4f 4f 2c 62 1b 16 8a 00 >>d..O.&.OO,b....<

28 consecutive bytes are wrong.
The data is a repeat of earlier data in the file.

>grep "43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd" 13_org
924d40 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

>grep "43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd" 13_err
924d40 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<
924dc0 43 11 2a f4 98 09 d5 76 aa 26 83 00 24 3d 11 fd >C.*....v.&..$=..<

924dc0 is a copy of 924d40
128 bytes offset

file 14 "od -Ax -tx1z"

-0912f0 91 45 bb cd eb 4f 01 d3 69 27 88 b5 7d 7d 17 8d >.E...O..i'..}}..<
+0912f0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<

-091300 7d 94 8e f9 81 d0 c2 4a b5 8e c6 af b0 03 4c 16 >}......J......L.<
+091300 d9 05 ac 0d fc eb 00 71 17 bd fb 3e b0 03 4c 16 >.......q...>..L.<

>grep "b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29" 14_err
0912b0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<
0912f0 b8 3f 4e 5d 2e 86 ed c0 51 79 fe ec 3e 53 c9 29 >.?N]....Qy..>S.)<

28 bytes wrong
64 bytes offset

file 16 "od -Ax -tx1z"

-635200 c3 1d f2 b8 c4 d5 12 c1 3f 48 e6 9d dc 98 1f e5 >........?H......<
+635200 c3 1d f2 b8 c4 d5 12 c1 00 10 00 00 00 d0 ec 08 >................<

-635210 9e 54 e7 f1 49 5b 1e d0 9f e2 7c 26 24 cb 98 24 >.T..I[....|&$..$<
+635210 00 10 00 00 00 90 14 08 00 10 00 00 00 50 25 06 >.............P%.<

-635220 25 fc 63 2a bf 07 b4 c0 cf a1 67 9b ef 01 5d 6d >%.c*......g...]m<
+635220 00 10 00 00 bf 07 b4 c0 cf a1 67 9b ef 01 5d 6d >..........g...]m<

28 bytes wrong
This time the data is not from this file.
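
The manual od/grep comparison above can be automated. The sketch below, with hypothetical function names, finds each run of consecutive differing bytes between a good and a corrupt copy, then checks whether the corrupt run is a copy of good data from a fixed distance earlier in the file. The candidate distances of 64 and 128 bytes are taken from the two cases above (0x0912f0 - 0x0912b0 = 0x40 and 0x924dc0 - 0x924d40 = 0x80). For reference, the Athlon XP uses 64-byte cache lines, so both distances are whole multiples of a line.

```python
def diff_runs(good, bad):
    """Return (start, length) for each run of consecutive differing bytes."""
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        # The run extends to the end of the shorter input.
        runs.append((start, min(len(good), len(bad)) - start))
    return runs


def copy_source_offsets(good, bad, start, length, candidates=(64, 128)):
    """Return the backward distances at which the corrupt run equals good data."""
    corrupt = bad[start:start + length]
    return [
        d for d in candidates
        if start >= d and good[start - d:start - d + length] == corrupt
    ]
```

A run reported by `diff_runs` can be shorter than the region that was actually overwritten, because an overwritten byte that happens to equal the original does not show up as a difference.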

2005-05-24 12:01:47

by Trond Myklebust

Subject: Re: NFS corruption on 2.6.11.7

On Tue, 24.05.2005 at 12:15 (+0200), Kenneth Johansson wrote:

> > > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> > >
> >
> > I'm seeing no problems at all with this on a loopback mount with
> > 2.6.12-rc4. Mind giving us some more details on your setup?
> >
> > Cheers,
> > Trond

Does the above export line mean that you are running with amd? If so,
could you retry using an ordinary NFS mount (preferably a loopback mount
- i.e. mount something over "localhost").

Again, please could you give us more details on how you are doing these
tests: what hardware (i.e. what NIC, switch, server, memory,...), lsmod
output, (and ditto for the server).
How are you using your scripts? Are you first running one on the server,
then the other on the client, are you deleting the old files before you
start a new run, etc.

2005-05-24 15:00:01

by Kenneth Johansson

Subject: Re: NFS corruption on 2.6.11.7

On Tue, 2005-05-24 at 08:01 -0400, Trond Myklebust wrote:
> On Tue, 24.05.2005 at 12:15 (+0200), Kenneth Johansson wrote:
>
> > > > :/export/home/ken /home/ken nfs rw,v3,rsize=32768,wsize=32768,hard,udp,lock,addr=amd 0 0
> > > >
> > >
> > > I'm seeing no problems at all with this on a loopback mount with
> > > 2.6.12-rc4. Mind giving us some more details on your setup?
> > >
> > > Cheers,
> > > Trond
>
> Does the above export line mean that you are running with amd? If so,

No. It only means that I had no imagination when naming the computer and
simply used the name of its CPU manufacturer.

> could you retry using an ordinary NFS mount (preferably a loopback mount
> - i.e. mount something over "localhost").

This works OK.

> Again, please could you give us more details on how you are doing these
> tests: what hardware (i.e. what NIC, switch, server, memory,...), lsmod
> output, (and ditto for the server).

The only new thing is the NIC:
0000:00:0e.0 Ethernet controller: D-Link System Inc Gigabit Ethernet Adapter (rev 11)
The driver is sk98lin, compiled into the kernel.

Everything else has been the same for over a year. Hmm, I did also
change the switch, but I do not remember which model I got.

I do not get any problems reading with an OS X client, also at gigabit
speed, but the client CPU is much slower, so it's not exactly the same
thing.

> How are you using your scripts? Are you first running one on the server,
> then the other on the client, are you deleting the old files before you
> start a new run, etc.

I telnet to the server and run the gen part, then run the cmp part on
the client. And yes, I do delete the files first; otherwise they would
more or less only be read from the cache.

2005-05-25 20:13:33

by Kenneth Johansson

Subject: Re: NFS corruption on 2.6.11.7

On Tue, 2005-05-24 at 08:01 -0400, Trond Myklebust wrote:

> Again, please could you give us more details on how you are doing these
> tests: what hardware (i.e. what NIC, switch, server, memory,...), lsmod
> output, (and ditto for the server).

After changing the mount option to use tcp instead of udp I have now
read several gigabytes without a single error.

Is there some fundamental difference in how the packet contents are
handled for NFS over UDP versus TCP, such as TCP using the TCP checksum
while UDP does not use the UDP checksum, or something like that?

Are there any counters for UDP and TCP checksum errors that can be
read? I failed to spot anything in /proc.
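
One place to look, as a sketch: Linux exposes per-protocol SNMP counters in /proc/net/snmp, laid out as pairs of lines per protocol (a line of field names followed by a line of values). The Udp InErrors and Tcp InErrs counters include packets dropped for a bad checksum, although they also count other receive errors, so a rising value is a hint rather than proof. The helper names below are hypothetical and the code only parses the file format.

```python
def parse_proc_net_snmp(text):
    """Parse /proc/net/snmp text into {protocol: {counter: value}}."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    table = {}
    # Lines come in pairs: "Proto: name name ..." then "Proto: value value ...".
    for header, values in zip(lines[0::2], lines[1::2]):
        proto = header[0].rstrip(":")
        table[proto] = dict(zip(header[1:], (int(v) for v in values[1:])))
    return table


def receive_error_counters(path="/proc/net/snmp"):
    """Return the UDP and TCP receive-error counters, or None where absent."""
    with open(path) as fh:
        table = parse_proc_net_snmp(fh.read())
    return {
        "Udp.InErrors": table.get("Udp", {}).get("InErrors"),
        "Tcp.InErrs": table.get("Tcp", {}).get("InErrs"),
    }
```

Reading the counters before and after a test run and comparing the two snapshots shows whether any receive errors were recorded during the transfer.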

2005-05-25 20:34:09

by Kenneth Johansson

Subject: Re: NFS corruption on 2.6.11.7

On Wed, 2005-05-25 at 13:16 -0700, David S. Miller wrote:
> From: Kenneth Johansson <[email protected]>
> Date: Wed, 25 May 2005 22:13:27 +0200
>
> > Is there some fundamental difference in how the packet contents are
> > handled for NFS over UDP versus TCP, such as TCP using the TCP checksum
> > while UDP does not use the UDP checksum, or something like that?
> >
> > Are there any counters for UDP and TCP checksum errors that can be
> > read? I failed to spot anything in /proc.
>
> If you are on a gigabit or faster network, IPv4 fragment sequence
> numbers can wrap and if you are very unlucky the checksums will
> match as well, corrupting your data. This is a fatal limitation of
> the small 16-bit IPv4 fragment ID.
>
> Use TCP for NFS unless you want NFS data corruption.
>
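
To put rough numbers on the wrap-around described above: the IPv4 Identification field is 16 bits, so IDs repeat after 65,536 datagrams. With rsize=32768 over UDP, each READ reply is a single datagram of roughly 32 KB that is split into fragments on the wire. The sketch below estimates how quickly the IDs wrap at a steady throughput. The function names are hypothetical, the 30-second default is the Linux ipfrag_time reassembly timeout, and the model assumes every datagram between the two hosts consumes one ID.

```python
ID_SPACE = 2 ** 16  # distinct values of the 16-bit IPv4 Identification field


def seconds_until_id_wrap(bytes_per_second, datagram_bytes):
    """Time for a sender to cycle through every IPv4 ID at a steady rate."""
    datagrams_per_second = bytes_per_second / datagram_bytes
    return ID_SPACE / datagrams_per_second


def can_wrap_within_timeout(bytes_per_second, datagram_bytes, timeout_s=30.0):
    """True if IDs can repeat while old fragments may still await reassembly."""
    return seconds_until_id_wrap(bytes_per_second, datagram_bytes) < timeout_s
```

Under these assumptions, 100 MB/s with 32 KB datagrams wraps in roughly 21 seconds, which is inside the 30-second window, while 10 MB/s takes over 200 seconds. A wrap only matters when a fragment was lost, leaving an incomplete datagram in the reassembly queue for a later fragment with the same ID to complete.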

Unlikely to be the case this time. I get a sequence of 28 wrong bytes
in the data, and often the wrong data is a copy of data from 64 or 128
bytes earlier in the file. If this were not a PC with cache coherency, I
would guess that someone forgot to do a cache invalidate/flush. But I do
wonder why I only see this problem with NFS over UDP.
