From: Baokun Li <[email protected]>
Hi all!
This is the second version of this patch series. Thank you, Jia Zhu and
Gao Xiang, for the feedback in the previous version.
We've been testing ondemand mode for cachefiles since January, and we're
almost done. We hit a lot of issues during the testing period, and this
patch set fixes some of the issues related to reopen worker/send req/poll.
The patches have passed internal testing without regression.
Patch 1-3: A read request waiting for reopen could be closed maliciously
before the reopen worker is executing or waiting to be scheduled. So
ondemand_object_worker() may be called after the info and object and even
the cache have been freed and trigger use-after-free. So use
cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the
reopen worker or wait for it to finish. Since it makes no sense to wait
for the daemon to complete the reopen request, to avoid this pointless
operation blocking cancel_work_sync(), Patch 1 avoids request generation
by the DROPPING state when the request has not been sent, and Patch 2
flushes the requests of the current object before cancel_work_sync().
Patch 4: Cyclic allocation of msg_id to avoid msg_id reuse misleading
the daemon to cause hung.
Patch 5: Hold xas_lock during polling to avoid dereferencing reqs causing
use-after-free. This issue was triggered frequently in our tests, and we
found that anolis 5.10 had fixed it, so to avoid failing the test, this
patch was pushed upstream as well.
Comments and questions are, as always, welcome.
Please let me know what you think.
Thanks,
Baokun
Changes since v1:
* Collect RVB from Jia Zhu and Gao Xiang.(Thanks for your review!)
* Pathch 1,2:Add more commit messages.
* Pathch 3:Add Fixes tag as suggested by Jia Zhu.
* Pathch 4:No longer changing "do...while" to "retry" to focus changes
and optimise commit messages.
* Pathch 5: Drop the internal RVB tag.
[V1]: https://lore.kernel.org/all/[email protected]
Baokun Li (3):
cachefiles: stop sending new request when dropping object
cachefiles: flush all requests for the object that is being dropped
cachefiles: cyclic allocation of msg_id to avoid reuse
Hou Tao (1):
cachefiles: flush ondemand_object_worker during clean object
Jingbo Xu (1):
cachefiles: add missing lock protection when polling
fs/cachefiles/daemon.c | 4 ++--
fs/cachefiles/internal.h | 3 +++
fs/cachefiles/ondemand.c | 52 +++++++++++++++++++++++++++++++++++-----
3 files changed, 51 insertions(+), 8 deletions(-)
--
2.39.2
From: Hou Tao <[email protected]>
When queuing ondemand_object_worker() to re-open the object,
cachefiles_object is not pinned. The cachefiles_object may be freed when
the pending read request is completed intentionally and the related
erofs is umounted. If ondemand_object_worker() runs after the object is
freed, it will incur use-after-free problem as shown below.
process A processs B process C process D
cachefiles_ondemand_send_req()
// send a read req X
// wait for its completion
// close ondemand fd
cachefiles_ondemand_fd_release()
// set object as CLOSE
cachefiles_ondemand_daemon_read()
// set object as REOPENING
queue_work(fscache_wq, &info->ondemand_work)
// close /dev/cachefiles
cachefiles_daemon_release
cachefiles_flush_reqs
complete(&req->done)
// read req X is completed
// umount the erofs fs
cachefiles_put_object()
// object will be freed
cachefiles_ondemand_deinit_obj_info()
kmem_cache_free(object)
// both info and object are freed
ondemand_object_worker()
When dropping an object, it is no longer necessary to reopen the object,
so use cancel_work_sync() to cancel or wait for ondemand_object_worker()
to complete.
Fixes: 0a7e54c1959c ("cachefiles: resend an open request if the read request's object is closed")
Signed-off-by: Hou Tao <[email protected]>
Signed-off-by: Baokun Li <[email protected]>
Reviewed-by: Jia Zhu <[email protected]>
---
fs/cachefiles/ondemand.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index d24bff43499b..f6440b3e7368 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -589,6 +589,9 @@ void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
}
}
xa_unlock(&cache->reqs);
+
+ /* Wait for ondemand_object_worker() to finish to avoid UAF. */
+ cancel_work_sync(&object->ondemand->ondemand_work);
}
int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
--
2.39.2
From: Jingbo Xu <[email protected]>
Add missing lock protection in poll routine when iterating xarray,
otherwise:
Even with RCU read lock held, only the slot of the radix tree is
ensured to be pinned there, while the data structure (e.g. struct
cachefiles_req) stored in the slot has no such guarantee. The poll
routine will iterate the radix tree and dereference cachefiles_req
accordingly. Thus RCU read lock is not adequate in this case and
spinlock is needed here.
Fixes: b817e22b2e91 ("cachefiles: narrow the scope of triggering EPOLLIN events in ondemand mode")
Signed-off-by: Jingbo Xu <[email protected]>
Signed-off-by: Baokun Li <[email protected]>
Reviewed-by: Jia Zhu <[email protected]>
Reviewed-by: Gao Xiang <[email protected]>
---
fs/cachefiles/daemon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cachefiles/daemon.c b/fs/cachefiles/daemon.c
index 6465e2574230..73ed2323282a 100644
--- a/fs/cachefiles/daemon.c
+++ b/fs/cachefiles/daemon.c
@@ -365,14 +365,14 @@ static __poll_t cachefiles_daemon_poll(struct file *file,
if (cachefiles_in_ondemand_mode(cache)) {
if (!xa_empty(&cache->reqs)) {
- rcu_read_lock();
+ xas_lock(&xas);
xas_for_each_marked(&xas, req, ULONG_MAX, CACHEFILES_REQ_NEW) {
if (!cachefiles_ondemand_is_reopening_read(req)) {
mask |= EPOLLIN;
break;
}
}
- rcu_read_unlock();
+ xas_unlock(&xas);
}
} else {
if (test_bit(CACHEFILES_STATE_CHANGED, &cache->flags))
--
2.39.2
From: Baokun Li <[email protected]>
Because after an object is dropped, requests for that object are useless,
flush them to avoid causing other problems.
This prepares for the later addition of cancel_work_sync(). After the
reopen requests is generated, flush it to avoid cancel_work_sync()
blocking by waiting for daemon to complete the reopen requests.
Signed-off-by: Baokun Li <[email protected]>
---
fs/cachefiles/ondemand.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 73da4d4eaa9b..d24bff43499b 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -564,12 +564,31 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object)
void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
{
+ unsigned long index;
+ struct cachefiles_req *req;
+ struct cachefiles_cache *cache;
+
if (!object->ondemand)
return;
cachefiles_ondemand_send_req(object, CACHEFILES_OP_CLOSE, 0,
cachefiles_ondemand_init_close_req, NULL);
+
+ if (!object->ondemand->ondemand_id)
+ return;
+
+ /* Flush all requests for the object that is being dropped. */
+ cache = object->volume->cache;
+ xa_lock(&cache->reqs);
cachefiles_ondemand_set_object_dropping(object);
+ xa_for_each(&cache->reqs, index, req) {
+ if (req->object == object) {
+ req->error = -EIO;
+ complete(&req->done);
+ __xa_erase(&cache->reqs, index);
+ }
+ }
+ xa_unlock(&cache->reqs);
}
int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
--
2.39.2
From: Baokun Li <[email protected]>
Added CACHEFILES_ONDEMAND_OBJSTATE_DROPPING indicates that the cachefiles
object is being dropped, and is set after the close request for the dropped
object completes, and no new requests are allowed to be sent after this
state.
This prepares for the later addition of cancel_work_sync(). It prevents
leftover reopen requests from being sent, to avoid processing unnecessary
requests and to avoid cancel_work_sync() blocking by waiting for daemon to
complete the reopen requests.
Signed-off-by: Baokun Li <[email protected]>
---
fs/cachefiles/internal.h | 2 ++
fs/cachefiles/ondemand.c | 10 ++++++++--
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/cachefiles/internal.h b/fs/cachefiles/internal.h
index d33169f0018b..8ecd296cc1c4 100644
--- a/fs/cachefiles/internal.h
+++ b/fs/cachefiles/internal.h
@@ -48,6 +48,7 @@ enum cachefiles_object_state {
CACHEFILES_ONDEMAND_OBJSTATE_CLOSE, /* Anonymous fd closed by daemon or initial state */
CACHEFILES_ONDEMAND_OBJSTATE_OPEN, /* Anonymous fd associated with object is available */
CACHEFILES_ONDEMAND_OBJSTATE_REOPENING, /* Object that was closed and is being reopened. */
+ CACHEFILES_ONDEMAND_OBJSTATE_DROPPING, /* Object is being dropped. */
};
struct cachefiles_ondemand_info {
@@ -332,6 +333,7 @@ cachefiles_ondemand_set_object_##_state(struct cachefiles_object *object) \
CACHEFILES_OBJECT_STATE_FUNCS(open, OPEN);
CACHEFILES_OBJECT_STATE_FUNCS(close, CLOSE);
CACHEFILES_OBJECT_STATE_FUNCS(reopening, REOPENING);
+CACHEFILES_OBJECT_STATE_FUNCS(dropping, DROPPING);
static inline bool cachefiles_ondemand_is_reopening_read(struct cachefiles_req *req)
{
diff --git a/fs/cachefiles/ondemand.c b/fs/cachefiles/ondemand.c
index 4ba42f1fa3b4..73da4d4eaa9b 100644
--- a/fs/cachefiles/ondemand.c
+++ b/fs/cachefiles/ondemand.c
@@ -422,7 +422,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
*/
xas_lock(&xas);
- if (test_bit(CACHEFILES_DEAD, &cache->flags)) {
+ if (test_bit(CACHEFILES_DEAD, &cache->flags) ||
+ cachefiles_ondemand_object_is_dropping(object)) {
xas_unlock(&xas);
ret = -EIO;
goto out;
@@ -463,7 +464,8 @@ static int cachefiles_ondemand_send_req(struct cachefiles_object *object,
* If error occurs after creating the anonymous fd,
* cachefiles_ondemand_fd_release() will set object to close.
*/
- if (opcode == CACHEFILES_OP_OPEN)
+ if (opcode == CACHEFILES_OP_OPEN &&
+ !cachefiles_ondemand_object_is_dropping(object))
cachefiles_ondemand_set_object_close(object);
kfree(req);
return ret;
@@ -562,8 +564,12 @@ int cachefiles_ondemand_init_object(struct cachefiles_object *object)
void cachefiles_ondemand_clean_object(struct cachefiles_object *object)
{
+ if (!object->ondemand)
+ return;
+
cachefiles_ondemand_send_req(object, CACHEFILES_OP_CLOSE, 0,
cachefiles_ondemand_init_close_req, NULL);
+ cachefiles_ondemand_set_object_dropping(object);
}
int cachefiles_ondemand_init_obj_info(struct cachefiles_object *object,
--
2.39.2