2016-04-13 11:35:20

by Ding Tianhong

Subject: [RFC] Is it a bug for nfs on udp6 mode or kernel?

Hi everyone:

I ran into this problem while testing NFS over udp6. My environment is:

Server:
kernel: 4.1.15
IP:xxxx::36/64
MTU:1500
Setting: /etc/exports: /home/nfs *(rw,sync,no_subtree_check,no_root_squash)

Client:
kernel: 4.1.18
IP:xxxx::90/64
MTU:1500
command: mount -t nfs -o vers=3,proto=udp6,rsize=4096,wsize=4096 [xxxx::36]:/home/nfs /home/tmp

I checked the NFS parameter configuration; it looks fine, and the same setup works well with proto=tcp6.

The mount succeeds, but when I run "ls" on the mount point, it hangs.

When I mount with rsize=1024 and wsize=1024 instead, the problem disappears, so I suspect a problem with GSO or GRO for UDP.
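
The difference between the two mount sizes can be illustrated with some fragmentation arithmetic. This is only a rough sketch: it assumes a plain IPv6 header with no extension headers other than the fragment header, and it ignores the RPC and NFS header overhead, so the count actually seen on the wire depends on the full reply size. The function name `ipv6_fragment_count` is made up for this illustration.

```python
import math

IPV6_HEADER = 40   # fixed IPv6 header, bytes
FRAG_HEADER = 8    # IPv6 fragment extension header, bytes
UDP_HEADER = 8     # UDP header, bytes


def ipv6_fragment_count(udp_payload_len, mtu=1500):
    """Estimate how many IPv6 packets carry one UDP datagram (sketch only)."""
    datagram = UDP_HEADER + udp_payload_len
    if IPV6_HEADER + datagram <= mtu:
        return 1  # fits in a single packet, no fragment header needed
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)
```

With an MTU of 1500, a 1024-byte payload fits in one packet, while a 4096-byte payload has to be fragmented, which is consistent with the hang only appearing at the larger size.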

Then I tried to debug the problem. First I ran tcpdump on the traffic between client and server and found that
the client sends the READDIRPLUS request to the server correctly, and the server then sends a 4k reply
to the client (the big packet is split into 4 packets by GSO). Up to this point everything looks fine. The client NIC
receives the 4 skbs and passes them up the stack to IPv6 and UDP. I found that the 4 incoming packets are merged
into one and passed on to the upper layer, i.e. sunrpc, but when I enabled rpc_debug, it looked as though the RPC layer
never received the message.


I built a simple demo to test the UDP stack: a client socket sends a big packet to a server socket, and it works well,
so I think UDP is fine and the bug may be in sunrpc.
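
The demo itself was not posted. A minimal sketch of such a test, assuming a single 4096-byte datagram and using the loopback address as a stand-in, might look like the following; `udp_roundtrip` and its parameters are hypothetical names.

```python
import socket


def udp_roundtrip(payload_len=4096, family=socket.AF_INET6, host="::1"):
    """Send one UDP datagram to a local receiver and return the bytes received."""
    with socket.socket(family, socket.SOCK_DGRAM) as server, \
         socket.socket(family, socket.SOCK_DGRAM) as client:
        server.bind((host, 0))          # port 0: let the kernel pick a free port
        server.settimeout(2.0)          # fail instead of hanging if nothing arrives
        client.sendto(b"x" * payload_len, server.getsockname())
        data, _peer = server.recvfrom(65535)
        return len(data)
```

Calling `udp_roundtrip()` should return 4096 when the datagram arrives intact. Note that over loopback the datagram is neither fragmented nor touched by NIC offloads, so reproducing the reported setup means binding the receiver on one host and sending from the other.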

The test is very simple. Has anybody met the same problem? Thanks for any suggestions.

Ding



2016-04-13 14:11:38

by Benjamin Coddington

Subject: Re: [RFC] Is it a bug for nfs on udp6 mode or kernel?

On Wed, 13 Apr 2016, Ding Tianhong wrote:

> [...]

I had a similar problem. Do you have these upstream commits?

405c92f ("ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment")
682b1a9 ("ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets")

Before these were applied, turning off GSO and GRO made sunrpc work properly.

https://bugzilla.redhat.com/show_bug.cgi?id=1271759

Ben

2016-04-13 16:17:12

by Eric Dumazet

Subject: Re: [RFC] Is it a bug for nfs on udp6 mode or kernel?

On Wed, 2016-04-13 at 19:28 +0800, Ding Tianhong wrote:
> [...]

Have you tried disabling UFO?

ethtool -K eth... ufo off
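
To see the current offload state before changing it, the output of `ethtool -k <dev>` (lowercase `-k`) can be inspected. The sketch below parses that text into a dictionary. The sample output is illustrative only and was not captured from the machines in this thread; `eth0` and `parse_offloads` are placeholder names.

```python
def parse_offloads(ethtool_k_output):
    """Map feature name -> (enabled, fixed) from `ethtool -k <dev>` text."""
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.strip().partition(":")
        words = value.split()
        if not sep or not words or words[0] not in ("on", "off"):
            continue  # skip the "Features for ...:" header and blank lines
        # "[fixed]" marks a feature that cannot be changed on this device
        features[name] = (words[0] == "on", "[fixed]" in words)
    return features


# Illustrative sample, not real output from the reported systems.
SAMPLE = """Features for eth0:
generic-segmentation-offload: on
generic-receive-offload: on
udp-fragmentation-offload: off [fixed]
"""
```

Here `parse_offloads(SAMPLE)["udp-fragmentation-offload"]` gives `(False, True)`, i.e. off and not changeable.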


2016-04-14 00:56:25

by Ding Tianhong

Subject: Re: [RFC] Is it a bug for nfs on udp6 mode or kernel?

On 2016/4/13 22:11, Benjamin Coddington wrote:
> On Wed, 13 Apr 2016, Ding Tianhong wrote:
>> [...]
>
> I had a similar problem. Do you have upstream
>
> 405c92f ("ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment")
> 682b1a9 ("ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets")
>

Hi Ben:

Thanks for the feedback. My kernel is 4.1.15 and does not have these two patches merged. I will check and try them, thanks a lot.
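
One way to check whether a tree carries the two fixes is to search by commit subject instead of by hash, since a backported commit gets a new hash. This is a minimal sketch, assuming the subjects are unchanged in the backport and that the input is the text printed by `git log --format=%s` in the tree under test; `missing_fixes` is a hypothetical helper name.

```python
# Subjects of the two commits named earlier in the thread.
WANTED_SUBJECTS = (
    "ipv6: add defensive check for CHECKSUM_PARTIAL skbs in ip_fragment",
    "ipv6: no CHECKSUM_PARTIAL on MSG_MORE corked sockets",
)


def missing_fixes(git_log_subjects, wanted=WANTED_SUBJECTS):
    """Return the wanted subjects that do not appear in the git log text."""
    present = {line.strip() for line in git_log_subjects.splitlines()}
    return [subject for subject in wanted if subject not in present]
```

An empty list means both subjects were found; anything else names the fixes that still need to be applied.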


> Turning off GSO and GRO caused sunrpc to work properly before this.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1271759
>

Sorry, I don't have permission to read this, but thanks for the pointer. :)

Ding

> Ben
>


2016-04-14 01:01:52

by Ding Tianhong

Subject: Re: [RFC] Is it a bug for nfs on udp6 mode or kernel?

On 2016/4/14 0:17, Eric Dumazet wrote:
> On Wed, 2016-04-13 at 19:28 +0800, Ding Tianhong wrote:
>> [...]
>
> Have you tried to disable UFO ?
>
> ethtool -K eth... ufo off
>
>
Hi Eric:

UFO is already disabled; my NIC doesn't support it.

Thanks.

Ding