2021-09-16 09:25:00

by Yu Kuai

Subject: [patch v8 0/7] handle unexpected message from server

This patch set fixes the problem that the client might oops if the nbd
server sends an unexpected message to the client. For example, our
syzkaller reported a uaf in nbd_read_stat():

Call trace:
dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x144/0x1b4 lib/dump_stack.c:118
print_address_description+0x68/0x2d0 mm/kasan/report.c:253
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x134/0x2f0 mm/kasan/report.c:409
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
__asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
__read_once_size include/linux/compiler.h:193 [inline]
blk_mq_rq_state block/blk-mq.h:106 [inline]
blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
nbd_read_stat drivers/block/nbd.c:670 [inline]
recv_work+0x1bc/0x890 drivers/block/nbd.c:749
process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
kthread+0x1d8/0x1e0 kernel/kthread.c:255
ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
blk_mq_rq_ctx_init
sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
__blk_mq_get_driver_tag -> get tag from tags
tags->rq[tag] = sched_tag->static_rq[internel_tag]

So both tags->rq[tag] and sched_tags->rq[internel_tag] point to the
same request, sched_tags->static_rq[internel_tag], even after the
io is finished.

2) the nbd server sends a reply with a random tag directly:

recv_work
nbd_read_stat
blk_mq_tag_to_rq(tags, tag)
rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
-> step 2) access rq before clearing rq mapping
blk_mq_clear_rq_mapping(set, tags, hctx_idx);
__free_pages() -> rq is freed here

4) Then, nbd continues to use the freed request in nbd_read_stat()
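The stale-mapping step above can be sketched as a toy C model (all names here are illustrative stand-ins, not the real blk-mq structures): completion clears the request's state but not the tags->rq[] mapping, so a later lookup by tag still resolves to the old request, which becomes a use-after-free once the static_rq pages are freed.

```c
#include <assert.h>

/* Hypothetical miniature model of the tag mapping described above;
 * the names mirror the trace but none of this is the real blk-mq API. */
struct request { int in_flight; };

#define NR_TAGS 4
static struct request static_rq[NR_TAGS];   /* models sched_tags->static_rq */
static struct request *tags_rq[NR_TAGS];    /* models tags->rq[] */

/* Dispatch installs the driver-tag mapping to the scheduler's static rq. */
static void submit_io(int tag)
{
	tags_rq[tag] = &static_rq[tag];
	static_rq[tag].in_flight = 1;
}

/* Completion clears the inflight state but NOT the tags->rq[] mapping,
 * so a later lookup by tag still resolves to the old request. */
static void complete_io(int tag)
{
	static_rq[tag].in_flight = 0;
}

/* Models blk_mq_tag_to_rq(): a raw table lookup with no liveness check. */
static struct request *tag_to_rq(int tag)
{
	return tags_rq[tag];
}
```

If static_rq were freed at the point of the final lookup, the returned pointer would be dangling, which is exactly the uaf the series fixes.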

Changes in v8:
- add patch 5 to this series.
- modify some words.
Changes in v7:
- instead of exposing blk_queue_exit(), using percpu_ref_put()
directly.
- drop the ref right after nbd_handle_reply().
Changes in v6:
- don't set cmd->status to error if request is completed before
nbd_clear_req().
- get 'q_usage_counter' to prevent accessing freed request through
blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
Changes in v5:
- move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
- add some comment in patch 5
Changes in v4:
- change the name of the patchset, since uaf is not the only problem
if the server sends an unexpected reply message.
- instead of adding new interface, use blk_mq_find_and_get_req().
- add patch 5 to this series
Changes in v3:
- v2 can't fix the problem thoroughly, add patch 3-4 to this series.
- modify descriptions.
- patch 5 is just a cleanup
Changes in v2:
- as Bart suggested, add a new helper function for drivers to get
request by tag.

Yu Kuai (7):
nbd: don't handle response without a corresponding request message
nbd: make sure request completion won't concurrent
nbd: check sock index in nbd_read_stat()
nbd: don't start request if nbd_queue_rq() failed
nbd: clean up return value checking of sock_xmit()
nbd: partition nbd_read_stat() into nbd_read_reply() and
nbd_handle_reply()
nbd: fix uaf in nbd_handle_reply()

drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
1 file changed, 96 insertions(+), 39 deletions(-)

--
2.31.1


2021-09-16 09:25:52

by Yu Kuai

Subject: [patch v8 2/7] nbd: make sure request completion won't concurrent

commit cddce0116058 ("nbd: Aovid double completion of a request")
tried to fix the problem that nbd_clear_que() and recv_work() can
complete a request concurrently. However, the problem still exists:

t1 t2 t3

nbd_disconnect_and_put
flush_workqueue
recv_work
blk_mq_complete_request
blk_mq_complete_request_remote -> this is true
WRITE_ONCE(rq->state, MQ_RQ_COMPLETE)
blk_mq_raise_softirq
blk_done_softirq
blk_complete_reqs
nbd_complete_rq
blk_mq_end_request
blk_mq_free_request
WRITE_ONCE(rq->state, MQ_RQ_IDLE)
nbd_clear_que
blk_mq_tagset_busy_iter
nbd_clear_req
__blk_mq_free_request
blk_mq_put_tag
blk_mq_complete_request -> complete again

There are three places where a request can be completed in nbd:
recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they
all hold cmd->lock before completing the request, it's easy to
avoid the problem by setting and checking a cmd flag.
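A minimal sketch of that flag-based guard, with hypothetical names: in the kernel every caller holds cmd->lock around the test-and-clear, which is what makes it race-free; here the lock is elided and only the flag logic is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the single-completion guard (hypothetical names). In the
 * real driver the test-and-clear happens under cmd->lock; here only the
 * flag logic is shown. */
struct cmd {
	bool inflight;        /* models NBD_CMD_INFLIGHT */
	int completions;      /* how many paths actually completed the cmd */
};

/* Returns true only for the first path that claims the completion;
 * every later path sees the cleared flag and bails out. */
static bool try_complete(struct cmd *c)
{
	if (!c->inflight)
		return false;     /* already completed elsewhere */
	c->inflight = false;  /* claim the completion */
	c->completions++;
	return true;
}
```

Whichever of recv_work(), nbd_clear_que() or nbd_xmit_timeout() gets there first "owns" the completion; the others return early, so the request is never completed twice.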

Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
---
drivers/block/nbd.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 23ded569e8d3..614c6ab2b8fe 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -406,7 +406,11 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;

- __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ mutex_unlock(&cmd->lock);
+ return BLK_EH_DONE;
+ }
+
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -841,7 +845,10 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
return true;

mutex_lock(&cmd->lock);
- __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ mutex_unlock(&cmd->lock);
+ return true;
+ }
cmd->status = BLK_STS_IOERR;
mutex_unlock(&cmd->lock);

--
2.31.1

2021-09-16 09:25:52

by Yu Kuai

Subject: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

The sock on which the client sends a request in nbd_send_cmd() and
the sock on which it receives the reply in nbd_read_stat() should be
the same.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/nbd.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 614c6ab2b8fe..c724a5bd7fa4 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
ret = -ENOENT;
goto out;
}
+ if (cmd->index != index) {
+ dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
+ tag, index, cmd->index);
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
--
2.31.1

2021-09-16 09:26:02

by Yu Kuai

Subject: [patch v8 5/7] nbd: clean up return value checking of sock_xmit()

Checking whether sock_xmit() returns 0 is useless because it never
returns 0. Document this in a comment and remove such checks.
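A hypothetical userspace analogue of that contract (xmit() and its error mapping are assumptions for illustration, not the driver's code): the function returns the positive byte count on success or a negative errno, and maps any zero-byte transfer to an error, so callers only ever need to test for result < 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Sketch of the sock_xmit() return convention: positive byte count on
 * success, negative errno on failure, never 0. A zero-byte transfer
 * from the transport (peer closed) is converted to an error. */
static int xmit(const char *buf, size_t len, int peer_closed)
{
	(void)buf;
	if (peer_closed)
		return -EPIPE;    /* 0-byte result becomes an error */
	if (len == 0)
		return -EINVAL;   /* assumption: empty transfers rejected */
	return (int)len;          /* positive byte count on success */
}
```

With this convention, every `if (result <= 0)` in the callers can safely become `if (result < 0)`, which is exactly what the patch does.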

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/nbd.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 22c91d8901f6..7f9e030e41ce 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -490,7 +490,8 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
}

/*
- * Send or receive packet.
+ * Send or receive packet. Return a positive value on success and
+ * negative value on failure, and never return 0.
*/
static int sock_xmit(struct nbd_device *nbd, int index, int send,
struct iov_iter *iter, int msg_flags, int *sent)
@@ -616,7 +617,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
result = sock_xmit(nbd, index, 1, &from,
(type == NBD_CMD_WRITE) ? MSG_MORE : 0, &sent);
trace_nbd_header_sent(req, handle);
- if (result <= 0) {
+ if (result < 0) {
if (was_interrupted(result)) {
/* If we havne't sent anything we can just return BUSY,
* however if we have sent something we need to make
@@ -660,7 +661,7 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
skip = 0;
}
result = sock_xmit(nbd, index, 1, &from, flags, &sent);
- if (result <= 0) {
+ if (result < 0) {
if (was_interrupted(result)) {
/* We've already sent the header, we
* have no choice but to set pending and
@@ -712,7 +713,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
reply.magic = 0;
iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
- if (result <= 0) {
+ if (result < 0) {
if (!nbd_disconnected(config))
dev_err(disk_to_dev(nbd->disk),
"Receive control failed (result %d)\n", result);
@@ -783,7 +784,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
rq_for_each_segment(bvec, req, iter) {
iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
- if (result <= 0) {
+ if (result < 0) {
dev_err(disk_to_dev(nbd->disk), "Receive data failed (result %d)\n",
result);
/*
@@ -1229,7 +1230,7 @@ static void send_disconnects(struct nbd_device *nbd)
iov_iter_kvec(&from, WRITE, &iov, 1, sizeof(request));
mutex_lock(&nsock->tx_lock);
ret = sock_xmit(nbd, i, 1, &from, 0, NULL);
- if (ret <= 0)
+ if (ret < 0)
dev_err(disk_to_dev(nbd->disk),
"Send disconnect failed %d\n", ret);
mutex_unlock(&nsock->tx_lock);
--
2.31.1

2021-09-16 09:26:21

by Yu Kuai

Subject: [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()

Prepare to fix the uaf in nbd_read_stat(); no functional changes.

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/nbd.c | 74 +++++++++++++++++++++++++++------------------
1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 7f9e030e41ce..69dc5eac9ad3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -695,38 +695,45 @@ static int nbd_send_cmd(struct nbd_device *nbd, struct nbd_cmd *cmd, int index)
return 0;
}

-/* NULL returned = something went wrong, inform userspace */
-static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
+static int nbd_read_reply(struct nbd_device *nbd, int index,
+ struct nbd_reply *reply)
{
- struct nbd_config *config = nbd->config;
- int result;
- struct nbd_reply reply;
- struct nbd_cmd *cmd;
- struct request *req = NULL;
- u64 handle;
- u16 hwq;
- u32 tag;
- struct kvec iov = {.iov_base = &reply, .iov_len = sizeof(reply)};
+ struct kvec iov = {.iov_base = reply, .iov_len = sizeof(*reply)};
struct iov_iter to;
- int ret = 0;
+ int result;

- reply.magic = 0;
- iov_iter_kvec(&to, READ, &iov, 1, sizeof(reply));
+ reply->magic = 0;
+ iov_iter_kvec(&to, READ, &iov, 1, sizeof(*reply));
result = sock_xmit(nbd, index, 0, &to, MSG_WAITALL, NULL);
if (result < 0) {
- if (!nbd_disconnected(config))
+ if (!nbd_disconnected(nbd->config))
dev_err(disk_to_dev(nbd->disk),
"Receive control failed (result %d)\n", result);
- return ERR_PTR(result);
+ return result;
}

- if (ntohl(reply.magic) != NBD_REPLY_MAGIC) {
+ if (ntohl(reply->magic) != NBD_REPLY_MAGIC) {
dev_err(disk_to_dev(nbd->disk), "Wrong magic (0x%lx)\n",
- (unsigned long)ntohl(reply.magic));
- return ERR_PTR(-EPROTO);
+ (unsigned long)ntohl(reply->magic));
+ return -EPROTO;
}

- memcpy(&handle, reply.handle, sizeof(handle));
+ return 0;
+}
+
+/* NULL returned = something went wrong, inform userspace */
+static struct nbd_cmd *nbd_handle_reply(struct nbd_device *nbd, int index,
+ struct nbd_reply *reply)
+{
+ int result;
+ struct nbd_cmd *cmd;
+ struct request *req = NULL;
+ u64 handle;
+ u16 hwq;
+ u32 tag;
+ int ret = 0;
+
+ memcpy(&handle, reply->handle, sizeof(handle));
tag = nbd_handle_to_tag(handle);
hwq = blk_mq_unique_tag_to_hwq(tag);
if (hwq < nbd->tag_set.nr_hw_queues)
@@ -769,9 +776,9 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
ret = -ENOENT;
goto out;
}
- if (ntohl(reply.error)) {
+ if (ntohl(reply->error)) {
dev_err(disk_to_dev(nbd->disk), "Other side returned error (%d)\n",
- ntohl(reply.error));
+ ntohl(reply->error));
cmd->status = BLK_STS_IOERR;
goto out;
}
@@ -780,6 +787,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
if (rq_data_dir(req) != WRITE) {
struct req_iterator iter;
struct bio_vec bvec;
+ struct iov_iter to;

rq_for_each_segment(bvec, req, iter) {
iov_iter_bvec(&to, READ, &bvec, 1, bvec.bv_len);
@@ -793,7 +801,7 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
* and let the timeout stuff handle resubmitting
* this request onto another connection.
*/
- if (nbd_disconnected(config)) {
+ if (nbd_disconnected(nbd->config)) {
cmd->status = BLK_STS_IOERR;
goto out;
}
@@ -817,24 +825,30 @@ static void recv_work(struct work_struct *work)
work);
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
+ struct nbd_sock *nsock;
struct nbd_cmd *cmd;
struct request *rq;

while (1) {
- cmd = nbd_read_stat(nbd, args->index);
- if (IS_ERR(cmd)) {
- struct nbd_sock *nsock = config->socks[args->index];
+ struct nbd_reply reply;

- mutex_lock(&nsock->tx_lock);
- nbd_mark_nsock_dead(nbd, nsock, 1);
- mutex_unlock(&nsock->tx_lock);
+ if (nbd_read_reply(nbd, args->index, &reply))
+ break;
+
+ cmd = nbd_handle_reply(nbd, args->index, &reply);
+ if (IS_ERR(cmd))
break;
- }

rq = blk_mq_rq_from_pdu(cmd);
if (likely(!blk_should_fake_timeout(rq->q)))
blk_mq_complete_request(rq);
}
+
+ nsock = config->socks[args->index];
+ mutex_lock(&nsock->tx_lock);
+ nbd_mark_nsock_dead(nbd, nsock, 1);
+ mutex_unlock(&nsock->tx_lock);
+
nbd_config_put(nbd);
atomic_dec(&config->recv_threads);
wake_up(&config->recv_wq);
--
2.31.1

2021-09-16 09:26:31

by Yu Kuai

Subject: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()

There is a problem that nbd_handle_reply() might access a freed request:

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
blk_mq_rq_ctx_init
sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
__blk_mq_get_driver_tag -> get tag from tags
tags->rq[tag] = sched_tag->static_rq[internel_tag]

So both tags->rq[tag] and sched_tags->rq[internel_tag] point to the
same request, sched_tags->static_rq[internel_tag], even after the
io is finished.

2) the nbd server sends a reply with a random tag directly:

recv_work
nbd_handle_reply
blk_mq_tag_to_rq(tags, tag)
rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
-> step 2) access rq before clearing rq mapping
blk_mq_clear_rq_mapping(set, tags, hctx_idx);
__free_pages() -> rq is freed here

4) Then, nbd continues to use the freed request in nbd_handle_reply()

Fix the problem by getting 'q_usage_counter' before blk_mq_tag_to_rq();
this ensures the request can't be freed, because 'q_usage_counter' is
not zero.
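The refcount guard can be modeled with a toy counter (hypothetical names; the real mechanism is percpu_ref on q->q_usage_counter): as long as a reader holds a reference, the pool cannot reach the freeable state, and a frozen queue fails the tryget so the garbage reply is simply dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the percpu_ref_tryget()/percpu_ref_put() guard around
 * nbd_handle_reply(). Names are illustrative, not the kernel API. */
struct usage_counter {
	int refs;
	bool dead;      /* set once the queue is frozen and drained */
};

static bool counter_tryget(struct usage_counter *c)
{
	if (c->dead)
		return false;   /* frozen: no inflight requests exist */
	c->refs++;
	return true;
}

static void counter_put(struct usage_counter *c)
{
	c->refs--;
}

/* The request pool may only be freed once the counter is dead AND all
 * references have been dropped; holding a ref during reply handling is
 * what prevents the use-after-free in step 3). */
static bool pool_can_be_freed(const struct usage_counter *c)
{
	return c->dead && c->refs == 0;
}
```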

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/nbd.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 69dc5eac9ad3..b3a47fc6237f 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
work);
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
+ struct request_queue *q = nbd->disk->queue;
struct nbd_sock *nsock;
struct nbd_cmd *cmd;
struct request *rq;
@@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
if (nbd_read_reply(nbd, args->index, &reply))
break;

+ /*
+ * Grab .q_usage_counter so request pool won't go away, then no
+ * request use-after-free is possible during nbd_handle_reply().
+ * If the queue is frozen, there won't be any inflight requests; we
+ * needn't handle the incoming garbage message.
+ */
+ if (!percpu_ref_tryget(&q->q_usage_counter)) {
+ dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
+ __func__);
+ break;
+ }
+
cmd = nbd_handle_reply(nbd, args->index, &reply);
+ percpu_ref_put(&q->q_usage_counter);
if (IS_ERR(cmd))
break;

--
2.31.1

2021-09-16 09:26:47

by Yu Kuai

Subject: [patch v8 1/7] nbd: don't handle response without a corresponding request message

While handling a response message from the server, nbd_read_stat()
will try to get the request by tag and then complete it. However,
this is problematic if nbd hasn't sent a corresponding request
message:

t1 t2
submit_bio
nbd_queue_rq
blk_mq_start_request
recv_work
nbd_read_stat
blk_mq_tag_to_rq
blk_mq_complete_request
nbd_send_cmd

Thus add a new cmd flag 'NBD_CMD_INFLIGHT'; it will be set in
nbd_send_cmd() and checked in nbd_read_stat().

Note that this patch can't fix the problem that blk_mq_tag_to_rq()
might return a freed request; this will be fixed in the following
patches.

Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
---
drivers/block/nbd.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 5170a630778d..23ded569e8d3 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -126,6 +126,12 @@ struct nbd_device {
};

#define NBD_CMD_REQUEUED 1
+/*
+ * This flag will be set if nbd_queue_rq() succeeds, and will be checked and
+ * cleared in completion. Both setting and clearing of the flag are protected
+ * by cmd->lock.
+ */
+#define NBD_CMD_INFLIGHT 2

struct nbd_cmd {
struct nbd_device *nbd;
@@ -400,6 +406,7 @@ static enum blk_eh_timer_return nbd_xmit_timeout(struct request *req,
if (!mutex_trylock(&cmd->lock))
return BLK_EH_RESET_TIMER;

+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
if (!refcount_inc_not_zero(&nbd->config_refs)) {
cmd->status = BLK_STS_TIMEOUT;
mutex_unlock(&cmd->lock);
@@ -729,6 +736,12 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
cmd = blk_mq_rq_to_pdu(req);

mutex_lock(&cmd->lock);
+ if (!__test_and_clear_bit(NBD_CMD_INFLIGHT, &cmd->flags)) {
+ dev_err(disk_to_dev(nbd->disk), "Suspicious reply %d (status %u flags %lu)",
+ tag, cmd->status, cmd->flags);
+ ret = -ENOENT;
+ goto out;
+ }
if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
@@ -828,6 +841,7 @@ static bool nbd_clear_req(struct request *req, void *data, bool reserved)
return true;

mutex_lock(&cmd->lock);
+ __clear_bit(NBD_CMD_INFLIGHT, &cmd->flags);
cmd->status = BLK_STS_IOERR;
mutex_unlock(&cmd->lock);

@@ -964,7 +978,13 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
* returns EAGAIN can be retried on a different socket.
*/
ret = nbd_send_cmd(nbd, cmd, index);
- if (ret == -EAGAIN) {
+ /*
+ * Access to this flag is protected by cmd->lock, thus it's safe to set
+ * the flag after nbd_send_cmd() succeeds in sending the request to the server.
+ */
+ if (!ret)
+ __set_bit(NBD_CMD_INFLIGHT, &cmd->flags);
+ else if (ret == -EAGAIN) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Request send failed, requeueing\n");
nbd_mark_nsock_dead(nbd, nsock, 1);
--
2.31.1

2021-09-16 09:28:46

by Yu Kuai

Subject: [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed

commit 6a468d5990ec ("nbd: don't start req until after the dead
connection logic") moved blk_mq_start_request() from nbd_queue_rq()
to nbd_handle_cmd() to skip starting the request if the connection
is dead. However, the request is still started in other error paths.

Currently, blk_mq_end_request() will be called immediately if
nbd_queue_rq() fails, so starting the request in that situation is
useless. Remove blk_mq_start_request() from the error paths in
nbd_handle_cmd().
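A sketch of the contract this relies on, with hypothetical names: if the driver's queue hook returns an error without starting the request, the caller ends it immediately, so the request never goes through a pointless IDLE→STARTED transition on error paths.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the queue_rq error-path contract (illustrative names,
 * not the blk-mq API). */
enum rq_state { RQ_IDLE, RQ_STARTED, RQ_ENDED };
struct req { enum rq_state state; };

/* Driver hook: the request is started only once all checks passed;
 * error paths return without touching its state. */
static int queue_rq(struct req *r, bool sock_ok)
{
	if (!sock_ok)
		return -1;            /* error path: request never started */
	r->state = RQ_STARTED;
	return 0;
}

/* Models the block layer ending a failed request right away. */
static enum rq_state dispatch(struct req *r, bool sock_ok)
{
	if (queue_rq(r, sock_ok))
		r->state = RQ_ENDED;  /* immediate blk_mq_end_request() */
	return r->state;
}
```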

Signed-off-by: Yu Kuai <[email protected]>
---
drivers/block/nbd.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index c724a5bd7fa4..22c91d8901f6 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -934,7 +934,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
if (!refcount_inc_not_zero(&nbd->config_refs)) {
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Socks array is empty\n");
- blk_mq_start_request(req);
return -EINVAL;
}
config = nbd->config;
@@ -943,7 +942,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
dev_err_ratelimited(disk_to_dev(nbd->disk),
"Attempted send on invalid socket\n");
nbd_config_put(nbd);
- blk_mq_start_request(req);
return -EINVAL;
}
cmd->status = BLK_STS_OK;
@@ -967,7 +965,6 @@ static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
*/
sock_shutdown(nbd);
nbd_config_put(nbd);
- blk_mq_start_request(req);
return -EIO;
}
goto again;
--
2.31.1

2021-09-16 12:49:45

by Ming Lei

Subject: Re: [patch v8 4/7] nbd: don't start request if nbd_queue_rq() failed

On Thu, Sep 16, 2021 at 05:33:47PM +0800, Yu Kuai wrote:
> commit 6a468d5990ec ("nbd: don't start req until after the dead
> connection logic") move blk_mq_start_request() from nbd_queue_rq()
> to nbd_handle_cmd() to skip starting request if the connection is
> dead. However, request is still started in other error paths.
>
> Currently, blk_mq_end_request() will be called immediately if
> nbd_queue_rq() failed, thus start request in such situation is
> useless. So remove blk_mq_start_request() from error paths in
> nbd_handle_cmd().
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---

Reviewed-by: Ming Lei <[email protected]>

--
Ming

2021-09-16 12:51:33

by Ming Lei

Subject: Re: [patch v8 5/7] nbd: clean up return value checking of sock_xmit()

On Thu, Sep 16, 2021 at 05:33:48PM +0800, Yu Kuai wrote:
> Check if sock_xmit() return 0 is useless because it'll never return
> 0, comment it and remove such checkings.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/block/nbd.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)

Reviewed-by: Ming Lei <[email protected]>

--
Ming

2021-09-16 12:55:16

by Ming Lei

Subject: Re: [patch v8 6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()

On Thu, Sep 16, 2021 at 05:33:49PM +0800, Yu Kuai wrote:
> Prepare to fix uaf in nbd_read_stat(), no functional changes.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---

Reviewed-by: Ming Lei <[email protected]>

--
Ming

2021-09-16 12:59:17

by Ming Lei

Subject: Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()

On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access freed request:
>
> 1) At first, a normal io is submitted and completed with scheduler:
>
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
> blk_mq_rq_ctx_init
> sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
> __blk_mq_get_driver_tag -> get tag from tags
> tags->rq[tag] = sched_tag->static_rq[internel_tag]
>
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
>
> 2) nbd server send a reply with random tag directly:
>
> recv_work
> nbd_handle_reply
> blk_mq_tag_to_rq(tags, tag)
> rq = tags->rq[tag]
>
> 3) if the sched_tags->static_rq is freed:
>
> blk_mq_sched_free_requests
> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> -> step 2) access rq before clearing rq mapping
> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> __free_pages() -> rq is freed here
>
> 4) Then, nbd continue to use the freed request in nbd_handle_reply
>
> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
> thus request is ensured not to be freed because 'q_usage_counter' is
> not zero.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/block/nbd.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 69dc5eac9ad3..b3a47fc6237f 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
> work);
> struct nbd_device *nbd = args->nbd;
> struct nbd_config *config = nbd->config;
> + struct request_queue *q = nbd->disk->queue;
> struct nbd_sock *nsock;
> struct nbd_cmd *cmd;
> struct request *rq;
> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
> if (nbd_read_reply(nbd, args->index, &reply))
> break;
>
> + /*
> + * Grab .q_usage_counter so request pool won't go away, then no
> + * request use-after-free is possible during nbd_handle_reply().
> + * If queue is frozen, there won't be any inflight requests, we
> + * needn't to handle the incoming garbage message.
> + */
> + if (!percpu_ref_tryget(&q->q_usage_counter)) {
> + dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
> + __func__);
> + break;
> + }
> +
> cmd = nbd_handle_reply(nbd, args->index, &reply);
> + percpu_ref_put(&q->q_usage_counter);
> if (IS_ERR(cmd))
> break;

The refcount needs to be held when completing the request, because
the request may be completed from another code path, and then the
request pool could be freed from that code path while the request is
still being referenced.


Thanks,
Ming

2021-09-16 13:15:25

by Yu Kuai

Subject: Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()

On 2021/09/16 20:58, Ming Lei wrote:
> On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
>> There is a problem that nbd_handle_reply() might access freed request:
>>
>> 1) At first, a normal io is submitted and completed with scheduler:
>>
>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>> blk_mq_rq_ctx_init
>> sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>> ...
>> blk_mq_get_driver_tag
>> __blk_mq_get_driver_tag -> get tag from tags
>> tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>
>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>> io is finished.
>>
>> 2) nbd server send a reply with random tag directly:
>>
>> recv_work
>> nbd_handle_reply
>> blk_mq_tag_to_rq(tags, tag)
>> rq = tags->rq[tag]
>>
>> 3) if the sched_tags->static_rq is freed:
>>
>> blk_mq_sched_free_requests
>> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>> -> step 2) access rq before clearing rq mapping
>> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>> __free_pages() -> rq is freed here
>>
>> 4) Then, nbd continue to use the freed request in nbd_handle_reply
>>
>> Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
>> thus request is ensured not to be freed because 'q_usage_counter' is
>> not zero.
>>
>> Signed-off-by: Yu Kuai <[email protected]>
>> ---
>> drivers/block/nbd.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>> index 69dc5eac9ad3..b3a47fc6237f 100644
>> --- a/drivers/block/nbd.c
>> +++ b/drivers/block/nbd.c
>> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
>> work);
>> struct nbd_device *nbd = args->nbd;
>> struct nbd_config *config = nbd->config;
>> + struct request_queue *q = nbd->disk->queue;
>> struct nbd_sock *nsock;
>> struct nbd_cmd *cmd;
>> struct request *rq;
>> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
>> if (nbd_read_reply(nbd, args->index, &reply))
>> break;
>>
>> + /*
>> + * Grab .q_usage_counter so request pool won't go away, then no
>> + * request use-after-free is possible during nbd_handle_reply().
>> + * If queue is frozen, there won't be any inflight requests, we
>> + * needn't to handle the incoming garbage message.
>> + */
>> + if (!percpu_ref_tryget(&q->q_usage_counter)) {
>> + dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
>> + __func__);
>> + break;
>> + }
>> +
>> cmd = nbd_handle_reply(nbd, args->index, &reply);
>> + percpu_ref_put(&q->q_usage_counter);
>> if (IS_ERR(cmd))
>> break;
>
> The refcount needs to be grabbed when completing the request because
> the request may be completed from other code path, then the request pool
> will be freed from that code path when the request is referred.

Hi,

The request can't be completed concurrently, thus putting the ref
here is safe.

There used to be a comment here where I tried to explain it... It's
fine with me to move it later anyway.

Thanks,
Kuai

2021-09-16 13:58:31

by Ming Lei

Subject: Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()

On Thu, Sep 16, 2021 at 09:10:37PM +0800, yukuai (C) wrote:
> On 2021/09/16 20:58, Ming Lei wrote:
> > On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
> > > There is a problem that nbd_handle_reply() might access freed request:
> > >
> > > 1) At first, a normal io is submitted and completed with scheduler:
> > >
> > > internel_tag = blk_mq_get_tag -> get tag from sched_tags
> > > blk_mq_rq_ctx_init
> > > sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> > > ...
> > > blk_mq_get_driver_tag
> > > __blk_mq_get_driver_tag -> get tag from tags
> > > tags->rq[tag] = sched_tag->static_rq[internel_tag]
> > >
> > > So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> > > to the request: sched_tags->static_rq[internal_tag]. Even if the
> > > io is finished.
> > >
> > > 2) nbd server send a reply with random tag directly:
> > >
> > > recv_work
> > > nbd_handle_reply
> > > blk_mq_tag_to_rq(tags, tag)
> > > rq = tags->rq[tag]
> > >
> > > 3) if the sched_tags->static_rq is freed:
> > >
> > > blk_mq_sched_free_requests
> > > blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> > > -> step 2) access rq before clearing rq mapping
> > > blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> > > __free_pages() -> rq is freed here
> > >
> > > 4) Then, nbd continue to use the freed request in nbd_handle_reply
> > >
> > > Fix the problem by get 'q_usage_counter' before blk_mq_tag_to_rq(),
> > > thus request is ensured not to be freed because 'q_usage_counter' is
> > > not zero.
> > >
> > > Signed-off-by: Yu Kuai <[email protected]>
> > > ---
> > > drivers/block/nbd.c | 14 ++++++++++++++
> > > 1 file changed, 14 insertions(+)
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 69dc5eac9ad3..b3a47fc6237f 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
> > > work);
> > > struct nbd_device *nbd = args->nbd;
> > > struct nbd_config *config = nbd->config;
> > > + struct request_queue *q = nbd->disk->queue;
> > > struct nbd_sock *nsock;
> > > struct nbd_cmd *cmd;
> > > struct request *rq;
> > > @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
> > > if (nbd_read_reply(nbd, args->index, &reply))
> > > break;
> > > + /*
> > > + * Grab .q_usage_counter so request pool won't go away, then no
> > > + * request use-after-free is possible during nbd_handle_reply().
> > > + * If queue is frozen, there won't be any inflight requests, we
> > > + * needn't to handle the incoming garbage message.
> > > + */
> > > + if (!percpu_ref_tryget(&q->q_usage_counter)) {
> > > + dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
> > > + __func__);
> > > + break;
> > > + }
> > > +
> > > cmd = nbd_handle_reply(nbd, args->index, &reply);
> > > + percpu_ref_put(&q->q_usage_counter);
> > > if (IS_ERR(cmd))
> > > break;
> >
> > The refcount needs to be grabbed when completing the request because
> > the request may be completed from other code path, then the request pool
> > will be freed from that code path when the request is referred.
>
> Hi,
>
> The request can't complete concurrently, thus put ref here is safe.
>
> There used to be a commet here that I tried to explain it... It's fine
> to me to move it behind anyway.

I never saw such a comment. cmd->lock isn't held here, so I believe
concurrent completion is possible.


Thanks,
Ming

2021-09-16 14:09:07

by Yu Kuai

Subject: Re: [patch v8 7/7] nbd: fix uaf in nbd_handle_reply()

On 2021/09/16 21:55, Ming Lei wrote:
> On Thu, Sep 16, 2021 at 09:10:37PM +0800, yukuai (C) wrote:
>> On 2021/09/16 20:58, Ming Lei wrote:
>>> On Thu, Sep 16, 2021 at 05:33:50PM +0800, Yu Kuai wrote:
>>>> There is a problem that nbd_handle_reply() might access a freed request:
>>>>
>>>> 1) At first, a normal io is submitted and completed with scheduler:
>>>>
>>>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>>> blk_mq_rq_ctx_init
>>>> sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>>>> ...
>>>> blk_mq_get_driver_tag
>>>> __blk_mq_get_driver_tag -> get tag from tags
>>>> tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>>>
>>>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>>>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>>>> io is finished.
>>>>
>>>> 2) nbd server send a reply with random tag directly:
>>>>
>>>> recv_work
>>>> nbd_handle_reply
>>>> blk_mq_tag_to_rq(tags, tag)
>>>> rq = tags->rq[tag]
>>>>
>>>> 3) if the sched_tags->static_rq is freed:
>>>>
>>>> blk_mq_sched_free_requests
>>>> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>> -> step 2) access rq before clearing rq mapping
>>>> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>> __free_pages() -> rq is freed here
>>>>
>>>> 4) Then, nbd continues to use the freed request in nbd_handle_reply().
>>>>
>>>> Fix the problem by getting 'q_usage_counter' before blk_mq_tag_to_rq();
>>>> the request is then guaranteed not to be freed because 'q_usage_counter'
>>>> is not zero.
>>>>
>>>> Signed-off-by: Yu Kuai <[email protected]>
>>>> ---
>>>> drivers/block/nbd.c | 14 ++++++++++++++
>>>> 1 file changed, 14 insertions(+)
>>>>
>>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>>> index 69dc5eac9ad3..b3a47fc6237f 100644
>>>> --- a/drivers/block/nbd.c
>>>> +++ b/drivers/block/nbd.c
>>>> @@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
>>>> work);
>>>> struct nbd_device *nbd = args->nbd;
>>>> struct nbd_config *config = nbd->config;
>>>> + struct request_queue *q = nbd->disk->queue;
>>>> struct nbd_sock *nsock;
>>>> struct nbd_cmd *cmd;
>>>> struct request *rq;
>>>> @@ -835,7 +836,20 @@ static void recv_work(struct work_struct *work)
>>>> if (nbd_read_reply(nbd, args->index, &reply))
>>>> break;
>>>> + /*
>>>> + * Grab .q_usage_counter so request pool won't go away, then no
>>>> + * request use-after-free is possible during nbd_handle_reply().
>>>> + * If queue is frozen, there won't be any inflight requests, we
>>>> + * needn't to handle the incoming garbage message.
>>>> + */
>>>> + if (!percpu_ref_tryget(&q->q_usage_counter)) {
>>>> + dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
>>>> + __func__);
>>>> + break;
>>>> + }
>>>> +
>>>> cmd = nbd_handle_reply(nbd, args->index, &reply);
>>>> + percpu_ref_put(&q->q_usage_counter);
>>>> if (IS_ERR(cmd))
>>>> break;
>>>
>>> The refcount needs to be grabbed when completing the request because
>>> the request may be completed from another code path, and then the request
>>> pool can be freed from that code path while the request is still referenced.
>>
>> Hi,
>>
>> The request can't be completed concurrently, thus putting the ref here is safe.
>>
>> There used to be a comment here where I tried to explain it... It's fine
>> with me to move it later anyway.
>
> I never saw such a comment. cmd->lock isn't held here, so I believe
> concurrent completion is possible here.
>

After patch 2, __test_and_clear_bit(NBD_CMD_INFLIGHT) must succeed while
cmd->lock is held before the request is completed, thus request
completion can't happen concurrently...

Thanks,
Kuai

2021-09-16 18:02:31

by Yu Kuai

[permalink] [raw]
Subject: [PATCH v9] nbd: fix uaf in nbd_handle_reply()

There is a problem that nbd_handle_reply() might access a freed request:

1) At first, a normal io is submitted and completed with scheduler:

internel_tag = blk_mq_get_tag -> get tag from sched_tags
blk_mq_rq_ctx_init
sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
...
blk_mq_get_driver_tag
__blk_mq_get_driver_tag -> get tag from tags
tags->rq[tag] = sched_tag->static_rq[internel_tag]

So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
to the request: sched_tags->static_rq[internal_tag]. Even if the
io is finished.

2) nbd server send a reply with random tag directly:

recv_work
nbd_handle_reply
blk_mq_tag_to_rq(tags, tag)
rq = tags->rq[tag]

3) if the sched_tags->static_rq is freed:

blk_mq_sched_free_requests
blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
-> step 2) access rq before clearing rq mapping
blk_mq_clear_rq_mapping(set, tags, hctx_idx);
__free_pages() -> rq is freed here

4) Then, nbd continues to use the freed request in nbd_handle_reply().

Fix the problem by getting 'q_usage_counter' before blk_mq_tag_to_rq();
the request is then guaranteed not to be freed because 'q_usage_counter'
is not zero.

Signed-off-by: Yu Kuai <[email protected]>
---
Changes in v9:
- move percpu_ref_put() behind.

drivers/block/nbd.c | 18 +++++++++++++++++-
1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 69dc5eac9ad3..f9d63794275e 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -825,6 +825,7 @@ static void recv_work(struct work_struct *work)
work);
struct nbd_device *nbd = args->nbd;
struct nbd_config *config = nbd->config;
+ struct request_queue *q = nbd->disk->queue;
struct nbd_sock *nsock;
struct nbd_cmd *cmd;
struct request *rq;
@@ -835,13 +836,28 @@ static void recv_work(struct work_struct *work)
if (nbd_read_reply(nbd, args->index, &reply))
break;

+ /*
+ * Grab .q_usage_counter so request pool won't go away, then no
+ * request use-after-free is possible during nbd_handle_reply().
+ * If queue is frozen, there won't be any inflight requests, we
+ * needn't to handle the incoming garbage message.
+ */
+ if (!percpu_ref_tryget(&q->q_usage_counter)) {
+ dev_err(disk_to_dev(nbd->disk), "%s: no io inflight\n",
+ __func__);
+ break;
+ }
+
cmd = nbd_handle_reply(nbd, args->index, &reply);
- if (IS_ERR(cmd))
+ if (IS_ERR(cmd)) {
+ percpu_ref_put(&q->q_usage_counter);
break;
+ }

rq = blk_mq_rq_from_pdu(cmd);
if (likely(!blk_should_fake_timeout(rq->q)))
blk_mq_complete_request(rq);
+ percpu_ref_put(&q->q_usage_counter);
}

nsock = config->socks[args->index];
--
2.31.1

2021-09-17 16:54:49

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH v9] nbd: fix uaf in nbd_handle_reply()

On Thu, Sep 16, 2021 at 10:18:10PM +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access a freed request:
>
> 1) At first, a normal io is submitted and completed with scheduler:
>
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
> blk_mq_rq_ctx_init
> sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
> __blk_mq_get_driver_tag -> get tag from tags
> tags->rq[tag] = sched_tag->static_rq[internel_tag]
>
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
>
> 2) nbd server send a reply with random tag directly:
>
> recv_work
> nbd_handle_reply
> blk_mq_tag_to_rq(tags, tag)
> rq = tags->rq[tag]
>
> 3) if the sched_tags->static_rq is freed:
>
> blk_mq_sched_free_requests
> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> -> step 2) access rq before clearing rq mapping
> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> __free_pages() -> rq is freed here
>
> 4) Then, nbd continues to use the freed request in nbd_handle_reply().
>
> Fix the problem by getting 'q_usage_counter' before blk_mq_tag_to_rq();
> the request is then guaranteed not to be freed because 'q_usage_counter'
> is not zero.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> Changes in v9:
> - move percpu_ref_put() behind.

Looks fine:

Reviewed-by: Ming Lei <[email protected]>

--
Ming

2021-09-19 13:03:34

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On 2021/09/16 17:33, Yu Kuai wrote:
> The sock on which the client sends a request in nbd_send_cmd() and the
> one on which it receives the reply in nbd_read_stat() should be the same.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/block/nbd.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 614c6ab2b8fe..c724a5bd7fa4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> ret = -ENOENT;
> goto out;
> }
> + if (cmd->index != index) {
> + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> + tag, index, cmd->index);
> + }
> if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
>

Hi, Ming

Any suggestions about this patch?

Thanks,
Kuai

2021-09-22 09:23:57

by Ming Lei

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
> On 2021/09/16 17:33, Yu Kuai wrote:
> > The sock on which the client sends a request in nbd_send_cmd() and the
> > one on which it receives the reply in nbd_read_stat() should be the same.
> >
> > Signed-off-by: Yu Kuai <[email protected]>
> > ---
> > drivers/block/nbd.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > index 614c6ab2b8fe..c724a5bd7fa4 100644
> > --- a/drivers/block/nbd.c
> > +++ b/drivers/block/nbd.c
> > @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> > ret = -ENOENT;
> > goto out;
> > }
> > + if (cmd->index != index) {
> > + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> > + tag, index, cmd->index);
> > + }
> > if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> > dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> > req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> >
>
> Hi, Ming
>
> Any suggestions about this patch?

I think this one relies on nbd protocol between server and client, and
does the protocol require both request and reply xmitted via same
socket?


Thanks,
Ming

2021-09-22 12:14:28

by Eric Blake

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On Wed, Sep 22, 2021 at 05:22:07PM +0800, Ming Lei wrote:
>
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?

Yes, a reply must be transmitted on the same socket as the request
came over. This is because independent sockets are not required to
use distinct 64-bit handles, and there is no way for a server to tell
if independent clients are related to one another; sending a reply on
the wrong socket is thus not guaranteed to reach the intended client.
Thus, a compliant server will never send a reply over a different
socket than the original request, and if a client ever gets a reply
with a handle it was not expecting, then the server is buggy or
malicious.

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org

2021-09-22 12:24:00

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On 2021/09/22 17:22, Ming Lei wrote:
> On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
>> On 2021/09/16 17:33, Yu Kuai wrote:
>>> The sock on which the client sends a request in nbd_send_cmd() and the
>>> one on which it receives the reply in nbd_read_stat() should be the same.
>>>
>>> Signed-off-by: Yu Kuai <[email protected]>
>>> ---
>>> drivers/block/nbd.c | 4 ++++
>>> 1 file changed, 4 insertions(+)
>>>
>>> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
>>> index 614c6ab2b8fe..c724a5bd7fa4 100644
>>> --- a/drivers/block/nbd.c
>>> +++ b/drivers/block/nbd.c
>>> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
>>> ret = -ENOENT;
>>> goto out;
>>> }
>>> + if (cmd->index != index) {
>>> + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
>>> + tag, index, cmd->index);
>>> + }
>>> if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
>>> dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
>>> req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
>>>
>>
>> Hi, Ming
>>
>> Any suggestions about this patch?
>
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?
>

I searched the nbd-server source code and found that socket_read() and
send_reply->socket_write() always come in pairs and use the same
socket.

BTW, if the server replies to a read request on an unexpected sock,
nbd_read_stat() might get stuck receiving the read data. Worse,
nbd_read_stat() can afterwards mistake a normal reply message for the
read data and corrupt the client.

Thanks,
Kuai

2021-09-22 16:22:39

by Wouter Verhelst

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On Wed, Sep 22, 2021 at 05:22:07PM +0800, Ming Lei wrote:
> On Sun, Sep 19, 2021 at 06:34:28PM +0800, yukuai (C) wrote:
> > On 2021/09/16 17:33, Yu Kuai wrote:
> > > The sock on which the client sends a request in nbd_send_cmd() and the
> > > one on which it receives the reply in nbd_read_stat() should be the same.
> > >
> > > Signed-off-by: Yu Kuai <[email protected]>
> > > ---
> > > drivers/block/nbd.c | 4 ++++
> > > 1 file changed, 4 insertions(+)
> > >
> > > diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> > > index 614c6ab2b8fe..c724a5bd7fa4 100644
> > > --- a/drivers/block/nbd.c
> > > +++ b/drivers/block/nbd.c
> > > @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> > > ret = -ENOENT;
> > > goto out;
> > > }
> > > + if (cmd->index != index) {
> > > + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> > > + tag, index, cmd->index);
> > > + }
> > > if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> > > dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> > > req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> > >
> >
> > Hi, Ming
> >
> > Any suggestions about this patch?
>
> I think this one relies on nbd protocol between server and client, and
> does the protocol require both request and reply xmitted via same
> socket?

As Eric already answered: yes, the request and reply are specified such
that they must be transmitted over the same socket.

For more details, you can find the protocol specification at
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md

Please note that the protocol defined there has some options that are
not currently supported by the Linux nbd implementation -- specifically
the "structured reply" message format (and all that requires it) is not
implemented (yet?).

--
w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}

2021-09-23 00:33:19

by Ming Lei

[permalink] [raw]
Subject: Re: [patch v8 3/7] nbd: check sock index in nbd_read_stat()

On Thu, Sep 16, 2021 at 05:33:46PM +0800, Yu Kuai wrote:
> The sock on which the client sends a request in nbd_send_cmd() and the
> one on which it receives the reply in nbd_read_stat() should be the same.
>
> Signed-off-by: Yu Kuai <[email protected]>
> ---
> drivers/block/nbd.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 614c6ab2b8fe..c724a5bd7fa4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -746,6 +746,10 @@ static struct nbd_cmd *nbd_read_stat(struct nbd_device *nbd, int index)
> ret = -ENOENT;
> goto out;
> }
> + if (cmd->index != index) {
> + dev_err(disk_to_dev(nbd->disk), "Unexpected reply %d from different sock %d (expected %d)",
> + tag, index, cmd->index);
> + }
> if (cmd->cmd_cookie != nbd_handle_to_cookie(handle)) {
> dev_err(disk_to_dev(nbd->disk), "Double reply on req %p, cmd_cookie %u, handle cookie %u\n",
> req, cmd->cmd_cookie, nbd_handle_to_cookie(handle));
> --
> 2.31.1
>

Reviewed-by: Ming Lei <[email protected]>

--
Ming

2021-09-23 13:36:15

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On 2021/09/16 17:33, Yu Kuai wrote:

Hi, Jens

Any interest in applying this series?

Thanks,
Kuai
> This patch set tries to fix that client might oops if nbd server send
> unexpected message to client, for example, our syzkaller report a uaf
> in nbd_read_stat():
>
> Call trace:
> dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
> show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x144/0x1b4 lib/dump_stack.c:118
> print_address_description+0x68/0x2d0 mm/kasan/report.c:253
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x134/0x2f0 mm/kasan/report.c:409
> check_memory_region_inline mm/kasan/kasan.c:260 [inline]
> __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
> __read_once_size include/linux/compiler.h:193 [inline]
> blk_mq_rq_state block/blk-mq.h:106 [inline]
> blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
> nbd_read_stat drivers/block/nbd.c:670 [inline]
> recv_work+0x1bc/0x890 drivers/block/nbd.c:749
> process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
> worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
> kthread+0x1d8/0x1e0 kernel/kthread.c:255
> ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>
> 1) At first, a normal io is submitted and completed with scheduler:
>
> internel_tag = blk_mq_get_tag -> get tag from sched_tags
> blk_mq_rq_ctx_init
> sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
> ...
> blk_mq_get_driver_tag
> __blk_mq_get_driver_tag -> get tag from tags
> tags->rq[tag] = sched_tag->static_rq[internel_tag]
>
> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
> to the request: sched_tags->static_rq[internal_tag]. Even if the
> io is finished.
>
> 2) nbd server send a reply with random tag directly:
>
> recv_work
> nbd_read_stat
> blk_mq_tag_to_rq(tags, tag)
> rq = tags->rq[tag]
>
> 3) if the sched_tags->static_rq is freed:
>
> blk_mq_sched_free_requests
> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> -> step 2) access rq before clearing rq mapping
> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> __free_pages() -> rq is freed here
>
> 4) Then, nbd continue to use the freed request in nbd_read_stat()
>
> Changes in v8:
> - add patch 5 to this series.
> - modify some words.
> Changes in v7:
> - instead of exposing blk_queue_exit(), using percpu_ref_put()
> directly.
> - drop the ref right after nbd_handle_reply().
> Changes in v6:
> - don't set cmd->status to error if request is completed before
> nbd_clear_req().
> - get 'q_usage_counter' to prevent accessing freed request through
> blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
> Changes in v5:
> - move patch 1 & 2 in v4 (patch 4 & 5 in v5) behind
> - add some comment in patch 5
> Changes in v4:
> - change the name of the patchset, since uaf is not the only problem
> if server send unexpected reply message.
> - instead of adding new interface, use blk_mq_find_and_get_req().
> - add patch 5 to this series
> Changes in v3:
> - v2 can't fix the problem thoroughly, add patch 3-4 to this series.
> - modify descriptions.
> - patch 5 is just a cleanup
> Changes in v2:
> - as Bart suggested, add a new helper function for drivers to get
> request by tag.
>
> Yu Kuai (7):
> nbd: don't handle response without a corresponding request message
> nbd: make sure request completion won't concurrent
> nbd: check sock index in nbd_read_stat()
> nbd: don't start request if nbd_queue_rq() failed
> nbd: clean up return value checking of sock_xmit()
> nbd: partition nbd_read_stat() into nbd_read_reply() and
> nbd_handle_reply()
> nbd: fix uaf in nbd_handle_reply()
>
> drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
> 1 file changed, 96 insertions(+), 39 deletions(-)
>

2021-09-29 14:11:32

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On 2021/09/23 21:33, yukuai (C) wrote:
> On 2021/09/16 17:33, Yu Kuai wrote:
>
> Hi, jens
>
> Any interest to apply this series?

friendly ping ...
>
> Thanks,
> Kuai

2021-10-08 07:21:21

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On 2021/09/29 20:54, yukuai (C) wrote:
> On 2021/09/23 21:33, yukuai (C) wrote:
>> On 2021/09/16 17:33, Yu Kuai wrote:
>>
>> Hi, jens
>>
>> Any interest to apply this series?
>
> friendly ping ...

Hi, Jens

friendly ping again ...
>>
>> Thanks,
>> Kuai

2021-10-15 15:20:50

by Yu Kuai

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On 2021/10/08 15:17, yukuai (C) wrote:
> On 2021/09/29 20:54, yukuai (C) wrote:
>> On 2021/09/23 21:33, yukuai (C) wrote:
>>> On 2021/09/16 17:33, Yu Kuai wrote:
>>>
>>> Hi, jens
>>>
>>> Any interest to apply this series?
>>
>> friendly ping ...
>
> Hi, Jens
>
> friendly ping again ...

Hi, Jens

friendly ping again ...
>>>
>>> Thanks,
>>> Kuai
>>>> This patch set tries to fix that client might oops if nbd server send
>>>> unexpected message to client, for example, our syzkaller report a uaf
>>>> in nbd_read_stat():
>>>>
>>>> Call trace:
>>>>   dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
>>>>   show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
>>>>   __dump_stack lib/dump_stack.c:77 [inline]
>>>>   dump_stack+0x144/0x1b4 lib/dump_stack.c:118
>>>>   print_address_description+0x68/0x2d0 mm/kasan/report.c:253
>>>>   kasan_report_error mm/kasan/report.c:351 [inline]
>>>>   kasan_report+0x134/0x2f0 mm/kasan/report.c:409
>>>>   check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>>>>   __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
>>>>   __read_once_size include/linux/compiler.h:193 [inline]
>>>>   blk_mq_rq_state block/blk-mq.h:106 [inline]
>>>>   blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
>>>>   nbd_read_stat drivers/block/nbd.c:670 [inline]
>>>>   recv_work+0x1bc/0x890 drivers/block/nbd.c:749
>>>>   process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
>>>>   worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
>>>>   kthread+0x1d8/0x1e0 kernel/kthread.c:255
>>>>   ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>>>>
>>>> 1) At first, a normal io is submitted and completed with scheduler:
>>>>
>>>> internel_tag = blk_mq_get_tag -> get tag from sched_tags
>>>>   blk_mq_rq_ctx_init
>>>>    sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag]
>>>> ...
>>>> blk_mq_get_driver_tag
>>>>   __blk_mq_get_driver_tag -> get tag from tags
>>>>   tags->rq[tag] = sched_tag->static_rq[internel_tag]
>>>>
>>>> So, both tags->rq[tag] and sched_tags->rq[internel_tag] are pointing
>>>> to the request: sched_tags->static_rq[internal_tag]. Even if the
>>>> io is finished.
>>>>
>>>> 2) nbd server send a reply with random tag directly:
>>>>
>>>> recv_work
>>>>   nbd_read_stat
>>>>    blk_mq_tag_to_rq(tags, tag)
>>>>     rq = tags->rq[tag]
>>>>
>>>> 3) if the sched_tags->static_rq is freed:
>>>>
>>>> blk_mq_sched_free_requests
>>>>   blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
>>>>    -> step 2) access rq before clearing rq mapping
>>>>    blk_mq_clear_rq_mapping(set, tags, hctx_idx);
>>>>    __free_pages() -> rq is freed here
>>>>
>>>> 4) Then, nbd continue to use the freed request in nbd_read_stat()
>>>>
>>>> Changes in v8:
>>>>   - add patch 5 to this series.
>>>>   - modify some words.
>>>> Changes in v7:
>>>>   - instead of exposing blk_queue_exit(), using percpu_ref_put()
>>>>   directly.
>>>>   - drop the ref right after nbd_handle_reply().
>>>> Changes in v6:
>>>>   - don't set cmd->status to error if request is completed before
>>>>   nbd_clear_req().
>>>>   - get 'q_usage_counter' to prevent accessing freed request through
>>>>   blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
>>>> Changes in v5:
>>>>   - move patches 1 & 2 from v4 (patches 4 & 5 in v5) later in the series.
>>>>   - add some comments in patch 5.
>>>> Changes in v4:
>>>>   - change the name of the patch set, since a UAF is not the only
>>>>   problem if the server sends an unexpected reply message.
>>>>   - instead of adding new interface, use blk_mq_find_and_get_req().
>>>>   - add patch 5 to this series
>>>> Changes in v3:
>>>>   - v2 couldn't fix the problem thoroughly; add patches 3-4 to this series.
>>>>   - modify descriptions.
>>>>   - patch 5 is just a cleanup
>>>> Changes in v2:
>>>>   - as Bart suggested, add a new helper function for drivers to get a
>>>>   request by tag.
>>>>
>>>> Yu Kuai (7):
>>>>    nbd: don't handle response without a corresponding request message
>>>>    nbd: make sure request completion won't concurrent
>>>>    nbd: check sock index in nbd_read_stat()
>>>>    nbd: don't start request if nbd_queue_rq() failed
>>>>    nbd: clean up return value checking of sock_xmit()
>>>>    nbd: partition nbd_read_stat() into nbd_read_reply() and
>>>>      nbd_handle_reply()
>>>>    nbd: fix uaf in nbd_handle_reply()
>>>>
>>>>   drivers/block/nbd.c | 135 +++++++++++++++++++++++++++++++-------------
>>>>   1 file changed, 96 insertions(+), 39 deletions(-)
>>>>

2021-10-17 00:03:39

by Josef Bacik

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On Thu, Sep 16, 2021 at 05:33:43PM +0800, Yu Kuai wrote:
> This patch set tries to fix that the client might oops if the nbd server
> sends an unexpected message to the client; for example, our syzkaller
> reported a UAF in nbd_read_stat():
>
> Call trace:
> dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
> show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x144/0x1b4 lib/dump_stack.c:118
> print_address_description+0x68/0x2d0 mm/kasan/report.c:253
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x134/0x2f0 mm/kasan/report.c:409
> check_memory_region_inline mm/kasan/kasan.c:260 [inline]
> __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
> __read_once_size include/linux/compiler.h:193 [inline]
> blk_mq_rq_state block/blk-mq.h:106 [inline]
> blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
> nbd_read_stat drivers/block/nbd.c:670 [inline]
> recv_work+0x1bc/0x890 drivers/block/nbd.c:749
> process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
> worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
> kthread+0x1d8/0x1e0 kernel/kthread.c:255
> ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>
> 1) At first, a normal I/O is submitted and completed through the scheduler:
>
> internal_tag = blk_mq_get_tag -> get tag from sched_tags
> blk_mq_rq_ctx_init
> sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
> ...
> blk_mq_get_driver_tag
> __blk_mq_get_driver_tag -> get tag from tags
> tags->rq[tag] = sched_tags->static_rq[internal_tag]
>
> So both tags->rq[tag] and sched_tags->rq[internal_tag] point to the
> same request, sched_tags->static_rq[internal_tag], even after the
> I/O has finished.
>
> 2) The nbd server sends a reply with a random tag directly:
>
> recv_work
> nbd_read_stat
> blk_mq_tag_to_rq(tags, tag)
> rq = tags->rq[tag]
>
> 3) If sched_tags->static_rq is freed concurrently:
>
> blk_mq_sched_free_requests
> blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i)
> -> step 2) access rq before clearing rq mapping
> blk_mq_clear_rq_mapping(set, tags, hctx_idx);
> __free_pages() -> rq is freed here
>
> 4) Then nbd continues to use the freed request in nbd_read_stat().
>
> Changes in v8:
> - add patch 5 to this series.
> - modify some words.
> Changes in v7:
> - instead of exposing blk_queue_exit(), use percpu_ref_put()
> directly.
> - drop the ref right after nbd_handle_reply().
> Changes in v6:
> - don't set cmd->status to error if the request is completed before
> nbd_clear_req().
> - get 'q_usage_counter' to prevent accessing freed request through
> blk_mq_tag_to_rq(), instead of using blk_mq_find_and_get_req().
> Changes in v5:
> - move patches 1 & 2 from v4 (patches 4 & 5 in v5) later in the series.
> - add some comments in patch 5.
> Changes in v4:
> - change the name of the patch set, since a UAF is not the only
> problem if the server sends an unexpected reply message.
> - instead of adding new interface, use blk_mq_find_and_get_req().
> - add patch 5 to this series
> Changes in v3:
> - v2 couldn't fix the problem thoroughly; add patches 3-4 to this series.
> - modify descriptions.
> - patch 5 is just a cleanup
> Changes in v2:
> - as Bart suggested, add a new helper function for drivers to get a
> request by tag.
>

You can add

Reviewed-by: Josef Bacik <[email protected]>

to the series. Sorry Jens, this one slipped through the cracks. Thanks,

Josef

2021-10-18 03:44:35

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v9] nbd: fix uaf in nbd_handle_reply()

On Thu, 16 Sep 2021 22:18:10 +0800, Yu Kuai wrote:
> There is a problem that nbd_handle_reply() might access a freed request:
>
> 1) At first, a normal I/O is submitted and completed through the scheduler:
>
> internal_tag = blk_mq_get_tag -> get tag from sched_tags
> blk_mq_rq_ctx_init
> sched_tags->rq[internal_tag] = sched_tags->static_rq[internal_tag]
> ...
> blk_mq_get_driver_tag
> __blk_mq_get_driver_tag -> get tag from tags
> tags->rq[tag] = sched_tags->static_rq[internal_tag]
>
> [...]

Applied, thanks!

[1/1] nbd: fix uaf in nbd_handle_reply()
commit: 52c90e0184f67eecb00b53b79bfdf75e0274f8fd

Best regards,
--
Jens Axboe


2021-10-18 03:53:15

by Jens Axboe

[permalink] [raw]
Subject: Re: [patch v8 0/7] handle unexpected message from server

On Thu, 16 Sep 2021 17:33:43 +0800, Yu Kuai wrote:
> This patch set tries to fix that the client might oops if the nbd server
> sends an unexpected message to the client; for example, our syzkaller
> reported a UAF in nbd_read_stat():
>
> Call trace:
> dump_backtrace+0x0/0x310 arch/arm64/kernel/time.c:78
> show_stack+0x28/0x38 arch/arm64/kernel/traps.c:158
> __dump_stack lib/dump_stack.c:77 [inline]
> dump_stack+0x144/0x1b4 lib/dump_stack.c:118
> print_address_description+0x68/0x2d0 mm/kasan/report.c:253
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x134/0x2f0 mm/kasan/report.c:409
> check_memory_region_inline mm/kasan/kasan.c:260 [inline]
> __asan_load4+0x88/0xb0 mm/kasan/kasan.c:699
> __read_once_size include/linux/compiler.h:193 [inline]
> blk_mq_rq_state block/blk-mq.h:106 [inline]
> blk_mq_request_started+0x24/0x40 block/blk-mq.c:644
> nbd_read_stat drivers/block/nbd.c:670 [inline]
> recv_work+0x1bc/0x890 drivers/block/nbd.c:749
> process_one_work+0x3ec/0x9e0 kernel/workqueue.c:2147
> worker_thread+0x80/0x9d0 kernel/workqueue.c:2302
> kthread+0x1d8/0x1e0 kernel/kthread.c:255
> ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1174
>
> [...]

Applied, thanks!

[1/7] nbd: don't handle response without a corresponding request message
commit: b5644a3a79bf3be5f1238db1b2f241374b27b0f0
[2/7] nbd: make sure request completion won't concurrent
commit: d14b304f558f8c8f53da3a8d0c0b671f14a9c2f4
[3/7] nbd: check sock index in nbd_read_stat()
commit: dbd73178da676945d8bbcf6afe731623f683ce0a
[4/7] nbd: don't start request if nbd_queue_rq() failed
commit: a83fdc85365586dc5c0f3ff91680e18e37a66f19
[5/7] nbd: clean up return value checking of sock_xmit()
commit: 6157a8f489909db00151a4e361903b9099b03b75
[6/7] nbd: partition nbd_read_stat() into nbd_read_reply() and nbd_handle_reply()
commit: 961e9f50be9bb47835b0ac7e08d55d2d0a45e493
[7/7] nbd: fix uaf in nbd_handle_reply()
commit: 52c90e0184f67eecb00b53b79bfdf75e0274f8fd

Best regards,
--
Jens Axboe