Hey Trond,
I've been noticing slower than usual xfstests, and in the logs:
"RPC: Could not send backchannel reply error: -110"
Since "59464b262ff5 SUNRPC: SOFTCONN tasks should time out when on the
sending list", some backchannel reqs will immediately reset the connection
if they need to sleep on ->sending in xprt_reserve_xprt.
I don't think we set up rq_timeout and rq_majortimeout for backchannel reqs,
so they immediately fail with -ETIMEDOUT.
I'm hunting around for the best fix, but maybe you've got one I can test.
Ben
On 7 Dec 2023, at 10:25, Benjamin Coddington wrote:
> Hey Trond,
>
> I've been noticing slower than usual xfstests, and in the logs:
>
> "RPC: Could not send backchannel reply error: -110"
>
> Since "59464b262ff5 SUNRPC: SOFTCONN tasks should time out when on the
> sending list", some backchannel reqs will immediately reset the connection
> if they need to sleep on ->sending in xprt_reserve_xprt.
>
> I don't think we set up rq_timeout and rq_majortimeout for backchannel reqs,
> so they immediately fail with -ETIMEDOUT.
>
> I'm hunting around for the best fix, but maybe you've got one I can test.
Assuming we want backchannel reqs to actually check/timeout/reset, it's
looking like we need to do a version of xprt_init_majortimeo() for every
xprt_get_bc_request().
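
To make the suspected failure concrete, here is a rough user-space model, not
kernel code. The struct and every model_* helper are hypothetical; only the
field names rq_timeout and rq_majortimeout come from the real request. It
assumes a request whose timeout fields were never initialised looks "already
expired" the moment it has to wait, and that a per-request init step gives it
a real deadline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two request fields discussed above. */
struct model_req {
	unsigned long rq_timeout;      /* per-try timeout, in ticks */
	unsigned long rq_majortimeout; /* absolute deadline, in ticks */
};

/* Wrap-safe "a is later than b", same idea as the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Assumed shape of the fix: give each request a non-zero timeout and a
 * deadline relative to "now" when it is handed out.
 */
static void model_init_majortimeo(struct model_req *req, unsigned long now,
				  unsigned long to_initval)
{
	assert(to_initval > 0);
	req->rq_timeout = to_initval;
	req->rq_majortimeout = now + to_initval;
}

/* True if a wait started at "now" would be treated as timed out at once. */
static bool model_times_out_immediately(const struct model_req *req,
					unsigned long now)
{
	return req->rq_timeout == 0 ||
	       !model_time_after(req->rq_majortimeout, now);
}
```

In this model a zeroed request reports an immediate timeout, which would match
the -110 (-ETIMEDOUT) in the logs, while one that went through the init step
does not time out until its deadline passes.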
Ben