2022-05-23 01:42:48

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH] xprtrdam: Don't treat a call as bcall when bc_serv is NULL



On 2022/5/22 9:41 AM, Chuck Lever III wrote:
>
>
>> On May 21, 2022, at 9:24 PM, Kinglong Mee <[email protected]> wrote:
>>
>> Hi Chuck,
>>
>> On 2022/5/22 12:33 AM, Chuck Lever III wrote:
>>>> On May 21, 2022, at 5:51 AM, Kinglong Mee <[email protected]> wrote:
>>>>
>>>> When rdma server returns a fault reply, rpcrdma may treats it as a bcall.
>>>> As using NFSv3, a bc server is never exist.
>>>> rpcrdma_bc_receive_call will meets NULL pointer as,
>>>>
>>>> [ 226.057890] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
>>>> ...
>>>> [ 226.058704] RIP: 0010:_raw_spin_lock+0xc/0x20
>>>> ...
>>>> [ 226.059732] Call Trace:
>>>> [ 226.059878] rpcrdma_bc_receive_call+0x138/0x327 [rpcrdma]
>>>> [ 226.060011] __ib_process_cq+0x89/0x170 [ib_core]
>>>> [ 226.060092] ib_cq_poll_work+0x26/0x80 [ib_core]
>>>> [ 226.060257] process_one_work+0x1a7/0x360
>>>> [ 226.060367] ? create_worker+0x1a0/0x1a0
>>>> [ 226.060440] worker_thread+0x30/0x390
>>>> [ 226.060500] ? create_worker+0x1a0/0x1a0
>>>> [ 226.060574] kthread+0x116/0x130
>>>> [ 226.060661] ? kthread_flush_work_fn+0x10/0x10
>>>> [ 226.060724] ret_from_fork+0x35/0x40
>>>> ...
>>>>
>>>> Signed-off-by: Kinglong Mee <[email protected]>
>>>> ---
>>>> net/sunrpc/xprtrdma/rpc_rdma.c | 5 +++++
>>>> 1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/net/sunrpc/xprtrdma/rpc_rdma.c b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> index 281ddb87ac8d..9486bb98eb2f 100644
>>>> --- a/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> +++ b/net/sunrpc/xprtrdma/rpc_rdma.c
>>>> @@ -1121,9 +1121,14 @@ static bool
>>>> rpcrdma_is_bcall(struct rpcrdma_xprt *r_xprt, struct rpcrdma_rep *rep)
>>>> #if defined(CONFIG_SUNRPC_BACKCHANNEL)
>>>> {
>>>> + struct rpc_xprt *xprt = &r_xprt->rx_xprt;
>>>> struct xdr_stream *xdr = &rep->rr_stream;
>>>> __be32 *p;
>>>>
>>>> + /* no bc service, not a bcall. */
>>>> + if (xprt->bc_serv == NULL)
>>>> + return false;
>>>> +
>>>> if (rep->rr_proc != rdma_msg)
>>>> return false;
>>> I'm not sure what you mean above by "fault reply".
>>> The check here for whether the RPC/RDMA procedure is an RDMA_MSG
>>> is supposed to be enough to avoid any further processing of an
>>> RDMA_ERR type procedure.
>>> What kind of fault has occurred? Can you share with us the
>>> actual RPC/RDMA transport header that triggers the BUG?
>>
>> I have a nfs rdma server, but it handles drc reply wrongly sometimes,
>> it returns a bad format reply to nfs client.
>> Nfs rdma client treats the bad format reply as a bcall, and
>
> Doesn't this test return false:
>
> 1144 if (*p != cpu_to_be32(RPC_CALL))
> 1145 return false;

No, it doesn't return false here.

The debug message at rpcrdma_bc_receive_call are,

[56579.837169] RPC: rpcrdma_bc_receive_call: callback XID
00000001, length=20
[56579.837174] RPC: rpcrdma_bc_receive_call: 00 00 00 01 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 04

>
> But OK: a malicious NFSv3 server can trigger a client crash.
> That's a bug.
>
> This is an additional conditional in a hot path. Would it make
> sense to move the new test to just after 1145? >
> Even then, it could be a bcall, the client simply isn't
> prepared to process it. So "/* no bc service */" by itself
> would be a more accurate comment.

Okay, I will update it and send a v2 patch after moving the checking
after 1145.

Thanks,
Kinglong Mee