If RPC use udp as it's transport protocol, transport->connect_worker
will call xs_udp_setup_socket.
xs_udp_setup_socket
sock = xs_create_sock
if (IS_ERR(sock))
goto out;
out:
xprt_unlock_connect
xprt_schedule_autodisconnect
mod_timer
internal_add_timer -->insert xprt->timer to base timer list
xs_destroy
cancel_delayed_work_sync(&transport->connect_worker)
xs_xprt_free(xprt) -->free xprt
Thus use-after-free will happen.
Signed-off-by: Zheng Bin <[email protected]>
---
net/sunrpc/xprtsock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 845d0be805ec..c796808e7f7a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
dprintk("RPC: xs_destroy xprt %p\n", xprt);
cancel_delayed_work_sync(&transport->connect_worker);
+ del_timer_sync(&xprt->timer);
xs_close(xprt);
cancel_work_sync(&transport->recv_worker);
cancel_work_sync(&transport->error_worker);
--
2.26.0.106.g9fadedd
On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
> If RPC use udp as it's transport protocol, transport->connect_worker
> will call xs_udp_setup_socket.
> xs_udp_setup_socket
> sock = xs_create_sock
> if (IS_ERR(sock))
> goto out;
> out:
> xprt_unlock_connect
> xprt_schedule_autodisconnect
> mod_timer
> internal_add_timer -->insert xprt->timer to base timer
> list
>
> xs_destroy
> cancel_delayed_work_sync(&transport->connect_worker)
> xs_xprt_free(xprt) -->free xprt
>
> Thus use-after-free will happen.
>
> Signed-off-by: Zheng Bin <[email protected]>
> ---
> net/sunrpc/xprtsock.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index 845d0be805ec..c796808e7f7a 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
> dprintk("RPC: xs_destroy xprt %p\n", xprt);
>
> cancel_delayed_work_sync(&transport->connect_worker);
> + del_timer_sync(&xprt->timer);
> xs_close(xprt);
> cancel_work_sync(&transport->recv_worker);
> cancel_work_sync(&transport->error_worker);
> --
> 2.26.0.106.g9fadedd
>
I'm confused. How can this happen given that xprt_destroy() first takes
the XPRT_LOCK, and then deletes xprt->timer?
Right now, the socket code knows nothing about the details of xprt-
>timer and what it is used for. I'd prefer to keep it that way if
possible.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
The complete process is like this:
xprt_destroy
wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE)
-->getlock
del_timer_sync(&xprt->timer) -->del xprt->timer
INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
xprt_destroy_cb
xs_destroy(xprt->ops->destroy)
cancel_delayed_work_sync -->will call
transport->connect_worker, whose callback is xs_udp_setup_socket
xs_xprt_free(xprt) -->free xprt
xs_udp_setup_socket
sock = xs_create_sock
xprt_unlock_connect
if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED
xprt_schedule_autodisconnect
mod_timer
internal_add_timer -->insert xprt->timer to base timer list
On 2020/6/4 23:39, Trond Myklebust wrote:
> On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
>> If RPC use udp as it's transport protocol, transport->connect_worker
>> will call xs_udp_setup_socket.
>> xs_udp_setup_socket
>> sock = xs_create_sock
>> if (IS_ERR(sock))
>> goto out;
>> out:
>> xprt_unlock_connect
>> xprt_schedule_autodisconnect
>> mod_timer
>> internal_add_timer -->insert xprt->timer to base timer
>> list
>>
>> xs_destroy
>> cancel_delayed_work_sync(&transport->connect_worker)
>> xs_xprt_free(xprt) -->free xprt
>>
>> Thus use-after-free will happen.
>>
>> Signed-off-by: Zheng Bin <[email protected]>
>> ---
>> net/sunrpc/xprtsock.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>> index 845d0be805ec..c796808e7f7a 100644
>> --- a/net/sunrpc/xprtsock.c
>> +++ b/net/sunrpc/xprtsock.c
>> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
>> dprintk("RPC: xs_destroy xprt %p\n", xprt);
>>
>> cancel_delayed_work_sync(&transport->connect_worker);
>> + del_timer_sync(&xprt->timer);
>> xs_close(xprt);
>> cancel_work_sync(&transport->recv_worker);
>> cancel_work_sync(&transport->error_worker);
>> --
>> 2.26.0.106.g9fadedd
>>
> I'm confused. How can this happen given that xprt_destroy() first takes
> the XPRT_LOCK, and then deletes xprt->timer?
>
> Right now, the socket code knows nothing about the details of xprt-
>> timer and what it is used for. I'd prefer to keep it that way if
> possible.
>
The issue trigger like this:
mount_fs
nfs_try_mount
nfs_create_server
nfs_init_server
nfs_init_client
nfs_create_rpc_client
rpc_create
xprt_create_transport
xs_setup_udp
xs_setup_xprt
INIT_DELAYED_WORK(xs_udp_setup_socket)//timer
xs_connect
queue_delayed_work
rpc_create_xprt
rpc_new_client
rpc_ping //fail
rpc_shutdown
rpc_release_client
rpc_free_auth
rpc_free_client
xprt_put
xprt_destroy
XPRT_LOCKED //wait lock
del_timer_sync
xprt_destroy_cb
xs_destroy
cancel_delayed_work_sync --fire--> xs_udp_setup_socket
xprt_unlock_connect
xprt_schedule_autodisconnect
mod_timer
xs_xprt_free
timer out, access xprt //UAF
On 2020/6/5 10:10, Zhengbin (OSKernel) wrote:
> The complete process is like this:
>
> xprt_destroy
> wait_on_bit_lock(&xprt->state, XPRT_LOCKED, TASK_UNINTERRUPTIBLE) -->getlock
> del_timer_sync(&xprt->timer) -->del xprt->timer
> INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
>
> xprt_destroy_cb
> xs_destroy(xprt->ops->destroy)
> cancel_delayed_work_sync -->will call transport->connect_worker, whose callback is xs_udp_setup_socket
> xs_xprt_free(xprt) -->free xprt
>
> xs_udp_setup_socket
> sock = xs_create_sock
> xprt_unlock_connect
> if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is XPRT_LOCKED
> xprt_schedule_autodisconnect
> mod_timer
> internal_add_timer -->insert xprt->timer to base timer list
>
> On 2020/6/4 23:39, Trond Myklebust wrote:
>> On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
>>> If RPC use udp as it's transport protocol, transport->connect_worker
>>> will call xs_udp_setup_socket.
>>> xs_udp_setup_socket
>>> sock = xs_create_sock
>>> if (IS_ERR(sock))
>>> goto out;
>>> out:
>>> xprt_unlock_connect
>>> xprt_schedule_autodisconnect
>>> mod_timer
>>> internal_add_timer -->insert xprt->timer to base timer
>>> list
>>>
>>> xs_destroy
>>> cancel_delayed_work_sync(&transport->connect_worker)
>>> xs_xprt_free(xprt) -->free xprt
>>>
>>> Thus use-after-free will happen.
>>>
>>> Signed-off-by: Zheng Bin <[email protected]>
>>> ---
>>> net/sunrpc/xprtsock.c | 1 +
>>> 1 file changed, 1 insertion(+)
>>>
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index 845d0be805ec..c796808e7f7a 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt *xprt)
>>> dprintk("RPC: xs_destroy xprt %p\n", xprt);
>>>
>>> cancel_delayed_work_sync(&transport->connect_worker);
>>> + del_timer_sync(&xprt->timer);
>>> xs_close(xprt);
>>> cancel_work_sync(&transport->recv_worker);
>>> cancel_work_sync(&transport->error_worker);
>>> --
>>> 2.26.0.106.g9fadedd
>>>
>> I'm confused. How can this happen given that xprt_destroy() first takes
>> the XPRT_LOCK, and then deletes xprt->timer?
>>
>> Right now, the socket code knows nothing about the details of xprt-
>>> timer and what it is used for. I'd prefer to keep it that way if
>> possible.
>>
>
>
> .
>
On Fri, 2020-06-05 at 10:10 +0800, Zhengbin (OSKernel) wrote:
> The complete process is like this:
>
> xprt_destroy
> wait_on_bit_lock(&xprt->state, XPRT_LOCKED,
> TASK_UNINTERRUPTIBLE)
> -->getlock
> del_timer_sync(&xprt->timer) -->del xprt->timer
> INIT_WORK(&xprt->task_cleanup, xprt_destroy_cb)
>
> xprt_destroy_cb
> xs_destroy(xprt->ops->destroy)
> cancel_delayed_work_sync -->will call
> transport->connect_worker, whose callback is xs_udp_setup_socket
> xs_xprt_free(xprt) -->free xprt
>
> xs_udp_setup_socket
> sock = xs_create_sock
> xprt_unlock_connect
> if (!test_bit(XPRT_LOCKED, &xprt->state)) -->state is
> XPRT_LOCKED
> xprt_schedule_autodisconnect
> mod_timer
> internal_add_timer -->insert xprt->timer to base timer
> list
The call to xprt_lock_connect() in xs_connect() is supposed to ensure
exclusion w.r.t. wait_on_bit_lock(). In other words if the transport-
>connect_worker is actually queued, it is also supposed to be holding
the XPRT_LOCK and ensuring that we can't get into the above situation.
Why is that not the case here?
>
> On 2020/6/4 23:39, Trond Myklebust wrote:
> > On Thu, 2020-06-04 at 22:49 +0800, Zheng Bin wrote:
> > > If RPC use udp as it's transport protocol, transport-
> > > >connect_worker
> > > will call xs_udp_setup_socket.
> > > xs_udp_setup_socket
> > > sock = xs_create_sock
> > > if (IS_ERR(sock))
> > > goto out;
> > > out:
> > > xprt_unlock_connect
> > > xprt_schedule_autodisconnect
> > > mod_timer
> > > internal_add_timer -->insert xprt->timer to base
> > > timer
> > > list
> > >
> > > xs_destroy
> > > cancel_delayed_work_sync(&transport->connect_worker)
> > > xs_xprt_free(xprt) -->free xprt
> > >
> > > Thus use-after-free will happen.
> > >
> > > Signed-off-by: Zheng Bin <[email protected]>
> > > ---
> > > net/sunrpc/xprtsock.c | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> > > index 845d0be805ec..c796808e7f7a 100644
> > > --- a/net/sunrpc/xprtsock.c
> > > +++ b/net/sunrpc/xprtsock.c
> > > @@ -1242,6 +1242,7 @@ static void xs_destroy(struct rpc_xprt
> > > *xprt)
> > > dprintk("RPC: xs_destroy xprt %p\n", xprt);
> > >
> > > cancel_delayed_work_sync(&transport->connect_worker);
> > > + del_timer_sync(&xprt->timer);
> > > xs_close(xprt);
> > > cancel_work_sync(&transport->recv_worker);
> > > cancel_work_sync(&transport->error_worker);
> > > --
> > > 2.26.0.106.g9fadedd
> > >
> > I'm confused. How can this happen given that xprt_destroy() first
> > takes
> > the XPRT_LOCK, and then deletes xprt->timer?
> >
> > Right now, the socket code knows nothing about the details of xprt-
> > > timer and what it is used for. I'd prefer to keep it that way if
> > possible.
> >
--
Trond Myklebust
CTO, Hammerspace Inc
4984 El Camino Real, Suite 208
Los Altos, CA 94022
http://www.hammer.space