2017-08-07 19:38:23

by David Jeffery

[permalink] [raw]
Subject: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

There is a race between changing I/O elevator and request_queue removal
which can trigger the warning in kobject_add_internal. A program can
use sysfs to request a change of elevator at the same time another task
is unregistering the request_queue the elevator would be attached to.
The elevator's kobject will then attempt to be connected to the
request_queue in the object tree when the request_queue has just been
removed from sysfs. This triggers the warning in kobject_add_internal
as the request_queue no longer has a sysfs directory:

kobject_add_internal failed for iosched (error: -2 parent: queue)
------------[ cut here ]------------
WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0


To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when
changing the elevator and use the request_queue's sysfs_lock to
serialize between clearing the flag and the elevator testing the flag.

Signed-off-by: David Jeffery <[email protected]>
---
block/blk-sysfs.c | 2 ++
block/elevator.c | 4 ++++
2 files changed, 6 insertions(+)


diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 27aceab..b8362c0 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
if (WARN_ON(!q))
return;

+ mutex_lock(&q->sysfs_lock);
queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
+ mutex_unlock(&q->sysfs_lock);

wbt_exit(q);

diff --git a/block/elevator.c b/block/elevator.c
index 4bb2f0c..51da592 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -1055,6 +1055,10 @@ static int __elevator_change(struct request_queue *q, const char *name)
char elevator_name[ELV_NAME_MAX];
struct elevator_type *e;

+ /* Make sure queue is not in the middle of being removed */
+ if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
+ return -ENOENT;
+
/*
* Special case for mq, turn off scheduling
*/


2017-08-07 23:53:44

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:
> There is a race between changing I/O elevator and request_queue removal
> which can trigger the warning in kobject_add_internal. A program can
> use sysfs to request a change of elevator at the same time another task
> is unregistering the request_queue the elevator would be attached to.
> The elevator's kobject will then attempt to be connected to the
> request_queue in the object tree when the request_queue has just been
> removed from sysfs. This triggers the warning in kobject_add_internal
> as the request_queue no longer has a sysfs directory:
>
> kobject_add_internal failed for iosched (error: -2 parent: queue)
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 14075 at lib/kobject.c:244 kobject_add_internal+0x103/0x2d0
>
>
> To fix this warning, we can check the QUEUE_FLAG_REGISTERED flag when
> changing the elevator and use the request_queue's sysfs_lock to
> serialize between clearing the flag and the elevator testing the flag.

I remember I saw this issue too.

>
> Signed-off-by: David Jeffery <[email protected]>
> ---
> block/blk-sysfs.c | 2 ++
> block/elevator.c | 4 ++++
> 2 files changed, 6 insertions(+)
>
>
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 27aceab..b8362c0 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
> if (WARN_ON(!q))
> return;
>
> + mutex_lock(&q->sysfs_lock);
> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
> + mutex_unlock(&q->sysfs_lock);

Could you share why the lock of 'q->sysfs_lock' is needed here?

>
> wbt_exit(q);
>
> diff --git a/block/elevator.c b/block/elevator.c
> index 4bb2f0c..51da592 100644
> --- a/block/elevator.c
> +++ b/block/elevator.c
> @@ -1055,6 +1055,10 @@ static int __elevator_change(struct request_queue *q, const char *name)
> char elevator_name[ELV_NAME_MAX];
> struct elevator_type *e;
>
> + /* Make sure queue is not in the middle of being removed */
> + if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
> + return -ENOENT;
> +

I suggest to check 'e->registered' here, which should be more
reasonable or straightforward.

--
Ming Lei

2017-08-08 18:13:18

by David Jeffery

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

On 08/07/2017 07:53 PM, Ming Lei wrote:
> On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:

>>
>> Signed-off-by: David Jeffery <[email protected]>
>> ---
>> block/blk-sysfs.c | 2 ++
>> block/elevator.c | 4 ++++
>> 2 files changed, 6 insertions(+)
>>
>>
>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>> index 27aceab..b8362c0 100644
>> --- a/block/blk-sysfs.c
>> +++ b/block/blk-sysfs.c
>> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
>> if (WARN_ON(!q))
>> return;
>>
>> + mutex_lock(&q->sysfs_lock);
>> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>> + mutex_unlock(&q->sysfs_lock);
>
> Could you share why the lock of 'q->sysfs_lock' is needed here?

As the elevator change is initiated through a sysfs attr file in the
queue directory, the task doing the elevator change already acquires the
q->sysfs_lock before it can try and change the elevator. Adding the
lock around clearing QUEUE_FLAG_REGISTERED ensures that the queue state
will be stable while the elevator is being changed. It prevents a race
condition where the bit is checked but then cleared and queue removed
from sysfs before the elevator change completes.

>
>>
>> wbt_exit(q);
>>
>> diff --git a/block/elevator.c b/block/elevator.c
>> index 4bb2f0c..51da592 100644
>> --- a/block/elevator.c
>> +++ b/block/elevator.c
>> @@ -1055,6 +1055,10 @@ static int __elevator_change(struct request_queue *q, const char *name)
>> char elevator_name[ELV_NAME_MAX];
>> struct elevator_type *e;
>>
>> + /* Make sure queue is not in the middle of being removed */
>> + if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
>> + return -ENOENT;
>> +
>
> I suggest to check 'e->registered' here, which should be more
> reasonable or straightforward.
>

e->registered is not the state needing to be checked. We need to know
the state of the associated request queue.

Before changing the elevator, we need to ensure the request queue is
still connected to sysfs. i.e. We need to know that kobject_del has not
been called on the request queue. When QUEUE_FLAG_REGISTERED is not set
it means the request queue either has had kobject_del called or will
have it called soon, so we should fail the elevator change attempt.



2017-08-09 01:44:15

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

Hi David,

On Wed, Aug 9, 2017 at 2:13 AM, David Jeffery <[email protected]> wrote:
> On 08/07/2017 07:53 PM, Ming Lei wrote:
>> On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:
>
>>>
>>> Signed-off-by: David Jeffery <[email protected]>
>>> ---
>>> block/blk-sysfs.c | 2 ++
>>> block/elevator.c | 4 ++++
>>> 2 files changed, 6 insertions(+)
>>>
>>>
>>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>>> index 27aceab..b8362c0 100644
>>> --- a/block/blk-sysfs.c
>>> +++ b/block/blk-sysfs.c
>>> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
>>> if (WARN_ON(!q))
>>> return;
>>>
>>> + mutex_lock(&q->sysfs_lock);
>>> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>>> + mutex_unlock(&q->sysfs_lock);
>>
>> Could you share why the lock of 'q->sysfs_lock' is needed here?
>
> As the elevator change is initiated through a sysfs attr file in the
> queue directory, the task doing the elevator change already acquires the
> q->sysfs_lock before it can try and change the elevator. Adding the

It is e->sysfs_lock which is held in elv_attr_store(), instead of q->sysfs_lock.

> lock around clearing QUEUE_FLAG_REGISTERED ensures that the queue state
> will be stable while the elevator is being changed. It prevents a race
> condition where the bit is checked but then cleared and queue removed
> from sysfs before the elevator change completes.
>
>>
>>>
>>> wbt_exit(q);
>>>
>>> diff --git a/block/elevator.c b/block/elevator.c
>>> index 4bb2f0c..51da592 100644
>>> --- a/block/elevator.c
>>> +++ b/block/elevator.c
>>> @@ -1055,6 +1055,10 @@ static int __elevator_change(struct request_queue *q, const char *name)
>>> char elevator_name[ELV_NAME_MAX];
>>> struct elevator_type *e;
>>>
>>> + /* Make sure queue is not in the middle of being removed */
>>> + if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
>>> + return -ENOENT;
>>> +
>>
>> I suggest to check 'e->registered' here, which should be more
>> reasonable or straightforward.
>>
>
> e->registered is not the state needing to be checked. We need to know
> the state of the associated request queue.
>
> Before changing the elevator, we need to ensure the request queue is
> still connected to sysfs. i.e. We need to know that kobject_del has not
> been called on the request queue. When QUEUE_FLAG_REGISTERED is not set
> it means the request queue either has had kobject_del called or will
> have it called soon, so we should fail the elevator change attempt.

elv_unregister_queue() is called in blk_unregister_queue() too, that is why
I suggest to check e->registered.


--
Ming Lei

2017-08-23 03:34:45

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

On Wed, Aug 9, 2017 at 9:44 AM, Ming Lei <[email protected]> wrote:
> Hi David,
>
> On Wed, Aug 9, 2017 at 2:13 AM, David Jeffery <[email protected]> wrote:
>> On 08/07/2017 07:53 PM, Ming Lei wrote:
>>> On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:
>>
>>>>
>>>> Signed-off-by: David Jeffery <[email protected]>
>>>> ---
>>>> block/blk-sysfs.c | 2 ++
>>>> block/elevator.c | 4 ++++
>>>> 2 files changed, 6 insertions(+)
>>>>
>>>>
>>>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>>>> index 27aceab..b8362c0 100644
>>>> --- a/block/blk-sysfs.c
>>>> +++ b/block/blk-sysfs.c
>>>> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
>>>> if (WARN_ON(!q))
>>>> return;
>>>>
>>>> + mutex_lock(&q->sysfs_lock);
>>>> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>>>> + mutex_unlock(&q->sysfs_lock);
>>>
>>> Could you share why the lock of 'q->sysfs_lock' is needed here?
>>
>> As the elevator change is initiated through a sysfs attr file in the
>> queue directory, the task doing the elevator change already acquires the
>> q->sysfs_lock before it can try and change the elevator. Adding the
>
> It is e->sysfs_lock which is held in elv_attr_store(), instead of q->sysfs_lock.

Looks I was wrong, and the store is from queue_attr_store() instead of
elv_attr_store().

I can reproduce the issue and this patch can fix the issue in my test
on scsi_debug,
so:

Tested-by: Ming Lei <[email protected]>

And it is a typical race between removing queue kobj and adding children of
this queue kobj, we can't acquire q->sysfs_lock in blk_unregister_queue()
because of sysfs's limit(may cause deadlock), so one state has to be used
for this purpose, just like what register/unregister hctx kobjs does,
and it should
be fine to use QUEUE_FLAG_REGISTERED here. More changes are required if
we use e->registered, so this patch looks fine:

Reviewed-by: Ming Lei <[email protected]>


Thanks,
Ming Lei

2017-08-28 01:36:20

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

On Wed, Aug 23, 2017 at 11:34 AM, Ming Lei <[email protected]> wrote:
> On Wed, Aug 9, 2017 at 9:44 AM, Ming Lei <[email protected]> wrote:
>> Hi David,
>>
>> On Wed, Aug 9, 2017 at 2:13 AM, David Jeffery <[email protected]> wrote:
>>> On 08/07/2017 07:53 PM, Ming Lei wrote:
>>>> On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:
>>>
>>>>>
>>>>> Signed-off-by: David Jeffery <[email protected]>
>>>>> ---
>>>>> block/blk-sysfs.c | 2 ++
>>>>> block/elevator.c | 4 ++++
>>>>> 2 files changed, 6 insertions(+)
>>>>>
>>>>>
>>>>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>>>>> index 27aceab..b8362c0 100644
>>>>> --- a/block/blk-sysfs.c
>>>>> +++ b/block/blk-sysfs.c
>>>>> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
>>>>> if (WARN_ON(!q))
>>>>> return;
>>>>>
>>>>> + mutex_lock(&q->sysfs_lock);
>>>>> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>>>>> + mutex_unlock(&q->sysfs_lock);
>>>>
>>>> Could you share why the lock of 'q->sysfs_lock' is needed here?
>>>
>>> As the elevator change is initiated through a sysfs attr file in the
>>> queue directory, the task doing the elevator change already acquires the
>>> q->sysfs_lock before it can try and change the elevator. Adding the
>>
>> It is e->sysfs_lock which is held in elv_attr_store(), instead of q->sysfs_lock.
>
> Looks I was wrong, and the store is from queue_attr_store() instead of
> elv_attr_store().
>
> I can reproduce the issue and this patch can fix the issue in my test
> on scsi_debug,
> so:
>
> Tested-by: Ming Lei <[email protected]>
>
> And it is a typical race between removing queue kobj and adding children of
> this queue kobj, we can't acquire q->sysfs_lock in blk_unregister_queue()
> because of sysfs's limit(may cause deadlock), so one state has to be used
> for this purpose, just like what register/unregister hctx kobjs does,
> and it should
> be fine to use QUEUE_FLAG_REGISTERED here. More changes are required if
> we use e->registered, so this patch looks fine:
>
> Reviewed-by: Ming Lei <[email protected]>


Hi Jens,

Could you consider this patch for v4.13 or v4.14?


Thanks,
Ming Lei

2017-08-28 16:54:08

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH] block: Fix warning when I/O elevator is changed as request_queue is being removed

On 08/27/2017 07:36 PM, Ming Lei wrote:
> On Wed, Aug 23, 2017 at 11:34 AM, Ming Lei <[email protected]> wrote:
>> On Wed, Aug 9, 2017 at 9:44 AM, Ming Lei <[email protected]> wrote:
>>> Hi David,
>>>
>>> On Wed, Aug 9, 2017 at 2:13 AM, David Jeffery <[email protected]> wrote:
>>>> On 08/07/2017 07:53 PM, Ming Lei wrote:
>>>>> On Tue, Aug 8, 2017 at 3:38 AM, David Jeffery <[email protected]> wrote:
>>>>
>>>>>>
>>>>>> Signed-off-by: David Jeffery <[email protected]>
>>>>>> ---
>>>>>> block/blk-sysfs.c | 2 ++
>>>>>> block/elevator.c | 4 ++++
>>>>>> 2 files changed, 6 insertions(+)
>>>>>>
>>>>>>
>>>>>> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
>>>>>> index 27aceab..b8362c0 100644
>>>>>> --- a/block/blk-sysfs.c
>>>>>> +++ b/block/blk-sysfs.c
>>>>>> @@ -931,7 +931,9 @@ void blk_unregister_queue(struct gendisk *disk)
>>>>>> if (WARN_ON(!q))
>>>>>> return;
>>>>>>
>>>>>> + mutex_lock(&q->sysfs_lock);
>>>>>> queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
>>>>>> + mutex_unlock(&q->sysfs_lock);
>>>>>
>>>>> Could you share why the lock of 'q->sysfs_lock' is needed here?
>>>>
>>>> As the elevator change is initiated through a sysfs attr file in the
>>>> queue directory, the task doing the elevator change already acquires the
>>>> q->sysfs_lock before it can try and change the elevator. Adding the
>>>
>>> It is e->sysfs_lock which is held in elv_attr_store(), instead of q->sysfs_lock.
>>
>> Looks I was wrong, and the store is from queue_attr_store() instead of
>> elv_attr_store().
>>
>> I can reproduce the issue and this patch can fix the issue in my test
>> on scsi_debug,
>> so:
>>
>> Tested-by: Ming Lei <[email protected]>
>>
>> And it is a typical race between removing queue kobj and adding children of
>> this queue kobj, we can't acquire q->sysfs_lock in blk_unregister_queue()
>> because of sysfs's limit(may cause deadlock), so one state has to be used
>> for this purpose, just like what register/unregister hctx kobjs does,
>> and it should
>> be fine to use QUEUE_FLAG_REGISTERED here. More changes are required if
>> we use e->registered, so this patch looks fine:
>>
>> Reviewed-by: Ming Lei <[email protected]>
>
>
> Hi Jens,
>
> Could you consider this patch for v4.13 or v4.14?

Yep, added for 4.14, thanks.

--
Jens Axboe