2018-04-17 21:44:15

by Kees Cook

Subject: [PATCH] blk-mq: Clear out elevator private data

Some elevators may not correctly check rq->rq_flags & RQF_ELVPRIV, and
may attempt to read rq->elv fields. When requests got reused, this
caused BFQ to think it already had a bfqq (rq->elv.priv[1]) allocated.
This could lead to odd behaviors like having the sense buffer address
slowly start incrementing. This eventually tripped HARDENED_USERCOPY
and KASAN.

This patch wipes all of rq->elv instead of just rq->elv.icq. While
it shouldn't technically be needed, this ends up being a robustness
improvement that should lead to at least finding bugs in elevators faster.

Reported-by: Oleksandr Natalenko <[email protected]>
Fixes: bd166ef183c26 ("blk-mq-sched: add framework for MQ capable IO schedulers")
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
---
In theory, BFQ needs to also check the RQF_ELVPRIV flag, but I'll leave that
to Paolo to figure out. Also, my Fixes line is kind of a best-guess. This
is where icq was originally wiped, so it seemed as good a commit as any.
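
To make the reuse problem concrete, here is a minimal standalone C
sketch. It is illustrative only -- the toy structs just mimic the
rq->elv layout (an icq pointer followed by priv[2]) and none of the
names are kernel code -- but it shows how a stale priv[1] survives when
only the icq field of a recycled request is reset:

#include <stdio.h>
#include <string.h>

/* Toy stand-ins; only the layout of rq->elv is mimicked. */
struct toy_icq { int ioc; };

struct toy_request {
	struct {
		struct toy_icq *icq;
		void *priv[2];	/* a BFQ-like elevator keeps its queue in priv[1] */
	} elv;
};

/* A one-entry "pool", standing in for a reused tag. */
static struct toy_request pool_rq;

static struct toy_request *get_request(int clear_all)
{
	struct toy_request *rq = &pool_rq;

	if (clear_all)
		memset(&rq->elv, 0, sizeof(rq->elv));	/* this patch */
	else
		rq->elv.icq = NULL;			/* old behavior */
	return rq;
}

int main(void)
{
	int dummy_bfqq = 42;
	struct toy_request *rq;

	/* First user attaches private data, then "frees" the request
	 * without clearing priv[]. */
	rq = get_request(0);
	rq->elv.priv[1] = &dummy_bfqq;

	/* Second user gets the recycled request. */
	rq = get_request(0);
	printf("old behavior: priv[1]=%p (stale, looks allocated)\n",
	       rq->elv.priv[1]);

	rq = get_request(1);
	printf("with memset:  priv[1]=%p\n", rq->elv.priv[1]);
	return 0;
}

In the old-behavior case the second user sees a non-NULL priv[1] it
never allocated, which is exactly the confusion described above.
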
---
block/blk-mq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 0dc9e341c2a7..859df3160303 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -363,7 +363,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,

rq = blk_mq_rq_ctx_init(data, tag, op);
if (!op_is_flush(op)) {
- rq->elv.icq = NULL;
+ memset(&rq->elv, 0, sizeof(rq->elv));
if (e && e->type->ops.mq.prepare_request) {
if (e->type->icq_cache && rq_ioc(bio))
blk_mq_sched_assign_ioc(rq, bio);
@@ -461,7 +461,7 @@ void blk_mq_free_request(struct request *rq)
e->type->ops.mq.finish_request(rq);
if (rq->elv.icq) {
put_io_context(rq->elv.icq->ioc);
- rq->elv.icq = NULL;
+ memset(&rq->elv, 0, sizeof(rq->elv));
}
}

--
2.7.4


--
Kees Cook
Pixel Security


2018-04-17 21:47:08

by Jens Axboe

Subject: Re: [PATCH] blk-mq: Clear out elevator private data

On 4/17/18 3:42 PM, Kees Cook wrote:
> Some elevators may not correctly check rq->rq_flags & RQF_ELVPRIV, and
> may attempt to read rq->elv fields. When requests got reused, this
> caused BFQ to think it already had a bfqq (rq->elv.priv[1]) allocated.
> This could lead to odd behaviors like having the sense buffer address
> slowly start incrementing. This eventually tripped HARDENED_USERCOPY
> and KASAN.
>
> This patch wipes all of rq->elv instead of just rq->elv.icq. While
> it shouldn't technically be needed, this ends up being a robustness
> improvement that should lead to at least finding bugs in elevators faster.

Comments from the other email still apply: we should not need to do this
full memset() for every request. From a quick look, BFQ needs to straighten
out its usage of prepare_request and its interactions with insert_request.
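
Roughly, the idea is for the elevator's prepare hook to unconditionally
(re)initialize the private fields it owns, so that a later insert never
trusts leftover values. A toy standalone sketch of that pattern
(illustrative only; this is not BFQ code and not the patch in question):

#include <stdio.h>
#include <stdlib.h>

/* Toy model of an elevator's per-request private data. */
struct toy_request {
	void *priv[2];
};

/*
 * "prepare" hook: runs before the elevator sees the request.
 * Unconditionally (re)initializing priv[] means a later "insert"
 * can never pick up values left over from a previous user.
 */
static void toy_prepare_request(struct toy_request *rq)
{
	rq->priv[0] = NULL;
	rq->priv[1] = NULL;
}

/* "insert" hook: sets up its queue only if prepare left it unset. */
static void toy_insert_request(struct toy_request *rq)
{
	if (!rq->priv[1])
		rq->priv[1] = malloc(16);	/* allocate, don't trust stale data */
	printf("insert: priv[1]=%p\n", rq->priv[1]);
}

int main(void)
{
	int stale = 0;
	/* Pretend the request was recycled with leftovers in priv[]. */
	struct toy_request rq = { .priv = { &stale, &stale } };

	toy_prepare_request(&rq);	/* wipes the leftovers */
	toy_insert_request(&rq);
	free(rq.priv[1]);
	return 0;
}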

> Reported-by: Oleksandr Natalenko <[email protected]>
> Fixes: bd166ef183c26 ("blk-mq-sched: add framework for MQ capable IO schedulers")
> Cc: [email protected]
> Signed-off-by: Kees Cook <[email protected]>
> ---
> In theory, BFQ needs to also check the RQF_ELVPRIV flag, but I'll leave that
> to Paolo to figure out. Also, my Fixes line is kind of a best-guess. This
> is where icq was originally wiped, so it seemed as good a commit as any.

Yeah, that's probably a bit too broad for a Fixes tag :-)

--
Jens Axboe


2018-04-17 22:58:24

by Kees Cook

Subject: Re: [PATCH] blk-mq: Clear out elevator private data

On Tue, Apr 17, 2018 at 2:45 PM, Jens Axboe <[email protected]> wrote:
> On 4/17/18 3:42 PM, Kees Cook wrote:
>> Some elevators may not correctly check rq->rq_flags & RQF_ELVPRIV, and
>> may attempt to read rq->elv fields. When requests got reused, this
>> caused BFQ to think it already had a bfqq (rq->elv.priv[1]) allocated.
>> This could lead to odd behaviors like having the sense buffer address
>> slowly start incrementing. This eventually tripped HARDENED_USERCOPY
>> and KASAN.
>>
>> This patch wipes all of rq->elv instead of just rq->elv.icq. While
>> it shouldn't technically be needed, this ends up being a robustness
>> improvement that should lead to at least finding bugs in elevators faster.
>
> Comments from the other email still apply: we should not need to do this
> full memset() for every request. From a quick look, BFQ needs to straighten
> out its usage of prepare_request and its interactions with insert_request.

Sure, understood. I would point out, FWIW, that memset() gets unrolled
by the compiler and this is just two more XORs in the same cacheline
(the two words following icq). (And there is SO much more being
cleared during alloc that it seemed like hardly any extra cost vs the
robustness it provided.)
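
For the record, this is easy to check in isolation: with the size known
at compile time, the compiler lowers the memset() to a handful of
zeroing stores rather than a library call. A standalone sketch (the
struct only mimics the 64-bit rq->elv layout; exact codegen of course
varies by compiler and flags):

#include <string.h>

/* Same shape as rq->elv on a 64-bit build: one icq pointer plus priv[2]. */
struct toy_elv {
	void *icq;
	void *priv[2];
};

/*
 * sizeof(*elv) is 24 here, so gcc/clang at -O2 typically emit a few
 * zeroing stores (e.g. a 16-byte store plus an 8-byte store) instead
 * of calling memset() -- only marginally more work than icq = NULL.
 */
void clear_elv(struct toy_elv *elv)
{
	memset(elv, 0, sizeof(*elv));
}

Building this with "gcc -O2 -S" and looking at clear_elv shows the
inlined stores rather than a call to memset().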

-Kees

--
Kees Cook
Pixel Security

2018-04-17 23:02:10

by Jens Axboe

Subject: Re: [PATCH] blk-mq: Clear out elevator private data

On 4/17/18 4:57 PM, Kees Cook wrote:
> On Tue, Apr 17, 2018 at 2:45 PM, Jens Axboe <[email protected]> wrote:
>> On 4/17/18 3:42 PM, Kees Cook wrote:
>>> Some elevators may not correctly check rq->rq_flags & RQF_ELVPRIV, and
>>> may attempt to read rq->elv fields. When requests got reused, this
>>> caused BFQ to think it already had a bfqq (rq->elv.priv[1]) allocated.
>>> This could lead to odd behaviors like having the sense buffer address
>>> slowly start incrementing. This eventually tripped HARDENED_USERCOPY
>>> and KASAN.
>>>
>>> This patch wipes all of rq->elv instead of just rq->elv.icq. While
>>> it shouldn't technically be needed, this ends up being a robustness
>>> improvement that should lead to at least finding bugs in elevators faster.
>>
>> Comments from the other email still apply: we should not need to do this
>> full memset() for every request. From a quick look, BFQ needs to straighten
>> out its usage of prepare_request and its interactions with insert_request.
>
> Sure, understood. I would point out, FWIW, that memset() gets unrolled
> by the compiler and this is just two more XORs in the same cacheline
> (the two words following icq). (And there is SO much more being
> cleared during alloc that it seemed like hardly any extra cost vs the
> robustness it provided.)

Yeah, it's not super pricey, but it's not needed. BFQ is the user of
the members, and the one that assigns them. You're seeing leftover
assignments, since it doesn't always assign them. Hence I think fixing
that in BFQ is the better approach; I just sent out a test patch a few
minutes ago.

You did all the hard work, I'm just coasting on your findings.

--
Jens Axboe


2018-04-18 08:48:44

by Paolo Valente

Subject: Re: [PATCH] blk-mq: Clear out elevator private data



> On 17 Apr 2018, at 23:42, Kees Cook <[email protected]> wrote:
>
> Some elevators may not correctly check rq->rq_flags & RQF_ELVPRIV, and
> may attempt to read rq->elv fields. When requests got reused, this
> caused BFQ to think it already had a bfqq (rq->elv.priv[1]) allocated.

Hi Kees,
where does BFQ get confused and operate on a request not destined for
it? I'm asking because I was careful to always avoid exactly this kind
of mistake.

Thanks,
Paolo

> This could lead to odd behaviors like having the sense buffer address
> slowly start incrementing. This eventually tripped HARDENED_USERCOPY
> and KASAN.
>
> This patch wipes all of rq->elv instead of just rq->elv.icq. While
> it shouldn't technically be needed, this ends up being a robustness
> improvement that should lead to at least finding bugs in elevators faster.
>
> Reported-by: Oleksandr Natalenko <[email protected]>
> Fixes: bd166ef183c26 ("blk-mq-sched: add framework for MQ capable IO schedulers")
> Cc: [email protected]
> Signed-off-by: Kees Cook <[email protected]>
> ---
> In theory, BFQ needs to also check the RQF_ELVPRIV flag, but I'll leave that
> to Paolo to figure out. Also, my Fixes line is kind of a best-guess. This
> is where icq was originally wiped, so it seemed as good a commit as any.
> ---
> block/blk-mq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 0dc9e341c2a7..859df3160303 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -363,7 +363,7 @@ static struct request *blk_mq_get_request(struct request_queue *q,
>
> rq = blk_mq_rq_ctx_init(data, tag, op);
> if (!op_is_flush(op)) {
> - rq->elv.icq = NULL;
> + memset(&rq->elv, 0, sizeof(rq->elv));
> if (e && e->type->ops.mq.prepare_request) {
> if (e->type->icq_cache && rq_ioc(bio))
> blk_mq_sched_assign_ioc(rq, bio);
> @@ -461,7 +461,7 @@ void blk_mq_free_request(struct request *rq)
> e->type->ops.mq.finish_request(rq);
> if (rq->elv.icq) {
> put_io_context(rq->elv.icq->ioc);
> - rq->elv.icq = NULL;
> + memset(&rq->elv, 0, sizeof(rq->elv));
> }
> }
>
> --
> 2.7.4
>
>
> --
> Kees Cook
> Pixel Security