2020-04-29 12:05:13

by gaurav gangalwar

[permalink] [raw]
Subject: [Problem] Client discarding data, mount hung.

I am getting these logs in rpcdebug, client is discarding replies for
FSINFO and FSSTAT.
Mount point is not accessible, but I am able to mount and access from other ips.

C02W91BDHV2R:jita-hang-discard gaurav.gangalwar$ grep "discarded" syslog.1

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
discarded 172 bytes

Apr 28 16:13:09 jita_tester_host_12 kernel: [2345499.393657] RPC:
discarded 168 bytes

Apr 28 16:13:16 jita_tester_host_12 kernel: [2345507.216440] RPC:
discarded 172 bytes

Apr 28 16:13:44 jita_tester_host_12 kernel: [2345534.412376] RPC:
discarded 168 bytes


I am not sure what could be the reason for discarding, I am using tcp
as mount option, as per code looks like tcp_flags on transport may not
be set.
I am clueless, how to debug, as I couldn’t find any failure in server
logs and tcpdump.
Please let me know how to debug this issue further?

Pasting more log snippet and kernel version

nutanix@jita_tester_host_12:~$ uname -a

Linux jita_tester_host_12 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10
10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782554] RPC:
61642 xprt_transmit(72)

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782606] RPC:
xs_tcp_send_request(72) = 0

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782607] RPC:
61642 xmit complete

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782609] RPC:
61642 sleep_on(queue "xprt_pending" time 4881255675)

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782610] RPC:
61642 added to queue 00000000d869a80a "xprt_pending"

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782611] RPC:
61642 setting alarm for 60000 ms

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782612] RPC:
wake_up_first(00000000ae52133b "xprt_sending")

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782613] RPC:
61642 sync task going to sleep

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783719] RPC:
xs_data_ready...

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783759] RPC:
xs_tcp_data_recv started

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
discarded 172 bytes

Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
xs_tcp_data_recv done



Regards,
Gaurav


2020-04-30 19:44:58

by Ashish Sangwan

[permalink] [raw]
Subject: Re: [Problem] Client discarding data, mount hung.

On Wed, Apr 29, 2020 at 5:35 PM gaurav gangalwar
<[email protected]> wrote:
>
> I am getting these logs in rpcdebug, client is discarding replies for
> FSINFO and FSSTAT.
> Mount point is not accessible, but I am able to mount and access from other ips.
>
> C02W91BDHV2R:jita-hang-discard gaurav.gangalwar$ grep "discarded" syslog.1
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
> discarded 172 bytes
>
> Apr 28 16:13:09 jita_tester_host_12 kernel: [2345499.393657] RPC:
> discarded 168 bytes
>
> Apr 28 16:13:16 jita_tester_host_12 kernel: [2345507.216440] RPC:
> discarded 172 bytes
>
> Apr 28 16:13:44 jita_tester_host_12 kernel: [2345534.412376] RPC:
> discarded 168 bytes
>
Adding some more information to what Gaurav mentioned:
A quick glance at the code, looks like in xs_tcp_data_recv(),
xs_tcp_read_discard() is called for the entire packet.
That would mean transport->tcp_flags are 0?

But we have seen that the connection is still in ESTABLISHED state:
tcp 0 0 10.41.24.83:973 10.41.24.14:2049 ESTABLISHED -

Not sure how can tcp_flags become 0 for established connection?


>
> I am not sure what could be the reason for discarding, I am using tcp
> as mount option, as per code looks like tcp_flags on transport may not
> be set.
> I am clueless, how to debug, as I couldn’t find any failure in server
> logs and tcpdump.
> Please let me know how to debug this issue further?
>
> Pasting more log snippet and kernel version
>
> nutanix@jita_tester_host_12:~$ uname -a
>
> Linux jita_tester_host_12 4.15.0-38-generic #41-Ubuntu SMP Wed Oct 10
> 10:59:38 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782554] RPC:
> 61642 xprt_transmit(72)
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782606] RPC:
> xs_tcp_send_request(72) = 0
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782607] RPC:
> 61642 xmit complete
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782609] RPC:
> 61642 sleep_on(queue "xprt_pending" time 4881255675)
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782610] RPC:
> 61642 added to queue 00000000d869a80a "xprt_pending"
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782611] RPC:
> 61642 setting alarm for 60000 ms
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782612] RPC:
> wake_up_first(00000000ae52133b "xprt_sending")
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.782613] RPC:
> 61642 sync task going to sleep
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783719] RPC:
> xs_data_ready...
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783759] RPC:
> xs_tcp_data_recv started
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
> discarded 172 bytes
>
> Apr 28 16:12:16 jita_tester_host_12 kernel: [2345446.783761] RPC:
> xs_tcp_data_recv done
>
>
>
> Regards,
> Gaurav