2005-02-18 12:54:30

by Philip R Auld

Subject: bio refcount problem

Hi,
I think there are some potential issues with the reference
counting of bios as used in 2.6.10. The __make_request function,
which is the default block device routine, accesses the bio structure
after issuing the call to add_request. This means that the bio could
have completed before __make_request uses it.

The submit_bh path takes an extra reference with an explicit
bio_get/put pair around the submit_bio, but many other users of
submit_bio do not. Given that most of the end_io routines remove a
reference and hence could free the bio, this can lead at the very least to
__make_request mis-reading the sync flag. In more extreme cases it can
cause an oops when run with CONFIG_DEBUG_PAGEALLOC.
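
To make the extra-reference pattern concrete, here is a minimal userspace
sketch. It is not kernel code: struct toy_bio, the toy_* helpers, and the
TOY_BIO_SYNC bit position are hypothetical stand-ins, the counter is a plain
int rather than an atomic, and completion is modeled as running synchronously
inside the submit call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_BIO_SYNC (1UL << 0)   /* assumed bit position, for illustration */

/* Hypothetical stand-in for struct bio; not the kernel definition. */
struct toy_bio {
    int refcnt;         /* models the bio reference count */
    unsigned long rw;   /* models bi_rw */
    bool *freed;        /* set to true when the last reference is dropped */
};

static struct toy_bio *toy_bio_alloc(unsigned long rw, bool *freed)
{
    struct toy_bio *bio = malloc(sizeof(*bio));
    assert(bio != NULL);
    bio->refcnt = 1;    /* the submitter's reference */
    bio->rw = rw;
    bio->freed = freed;
    *freed = false;
    return bio;
}

static void toy_bio_get(struct toy_bio *bio)
{
    bio->refcnt++;
}

static void toy_bio_put(struct toy_bio *bio)
{
    assert(bio->refcnt > 0);
    if (--bio->refcnt == 0) {
        *bio->freed = true;
        free(bio);
    }
}

/* Models a submit whose end_io runs, and drops a reference, before return. */
static void toy_submit(struct toy_bio *bio)
{
    toy_bio_put(bio);
}

/* submit_bh-style caller: the get/put pair keeps the bio alive across submit. */
static bool toy_submit_with_extra_ref(struct toy_bio *bio)
{
    bool sync;

    toy_bio_get(bio);                       /* extra reference */
    toy_submit(bio);                        /* completion may already have run */
    sync = (bio->rw & TOY_BIO_SYNC) != 0;   /* still valid: we hold a reference */
    toy_bio_put(bio);                       /* last reference; bio is freed here */
    return sync;
}
```

Without the toy_bio_get() call, toy_submit() would drop the only reference
and the read of bio->rw would touch freed memory, which is the failure mode
described above.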

The question is what is the preferred fix? I think it may be to simply
have submit_bio take its own reference (and remove the extra one from
submit_bh).

Alternatively __make_request could be adjusted so that it does not
access the bio after calling add_request. All it is doing is checking
the bi_rw field for the sync bit.
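
A rough userspace sketch of this alternative, assuming the only thing needed
after the hand-off is the sync bit. The mini_* names and the MINI_BIO_SYNC
bit position are hypothetical, and mini_add_request() simply frees the
object to model the worst case where the bio completes immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MINI_BIO_SYNC (1UL << 0)   /* assumed bit position, for illustration */

/* Hypothetical stand-in for struct bio; only the field of interest. */
struct mini_bio {
    unsigned long rw;   /* models bi_rw */
};

/* Models add_request in the worst case: the bio completes and is freed. */
static void mini_add_request(struct mini_bio *bio)
{
    free(bio);
}

/* Reads the sync bit into a local before the hand-off, never after. */
static bool mini_make_request(struct mini_bio *bio)
{
    bool sync;

    assert(bio != NULL);
    sync = (bio->rw & MINI_BIO_SYNC) != 0;  /* sample while bio is still ours */
    mini_add_request(bio);                  /* bio must not be touched again */
    return sync;                            /* decision uses the saved copy */
}
```

The point of the design is that no reference counting change is needed at
all: the function only carries a copy of the flag past the hand-off. What the
real code does with that flag afterwards is not modeled here.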

Or make all users of submit_bio take and release an extra reference
like submit_bh.

Thoughts?


Cheers,

Phil


--
Philip R. Auld, Ph.D.          Egenera, Inc.
Software Architect             165 Forest St.
(508) 858-2628                 Marlboro, MA 01752


2005-02-18 13:59:38

by Jens Axboe

Subject: Re: bio refcount problem

On Fri, Feb 18 2005, Philip R Auld wrote:
> Hi,
> I think there are some potential issues with the reference
> counting of bios as used in 2.6.10. The __make_request function,
> which is the default block device routine, accesses the bio structure
> after issuing the call to add_request. This means that the bio could
> have completed before __make_request uses it.
>
> The submit_bh path takes an extra reference with an explicit
> bio_get/put pair around the submit_bio, but many other users of
> submit_bio do not. Given that most of the end_io routines remove a
> reference and hence could free the bio, this can lead at the very least to
> __make_request mis-reading the sync flag. In more extreme cases it can
> cause an oops when run with CONFIG_DEBUG_PAGEALLOC.
>
> The question is what is the preferred fix? I think it may be to simply
> have submit_bio take its own reference (and remove the extra one from
> submit_bh).
>
> Alternatively __make_request could be adjusted so that it does not
> access the bio after calling add_request. All it is doing is checking
> the bi_rw field for the sync bit.
>
> Or make all users of submit_bio take and release an extra reference
> like submit_bh.

The queue lock is still held at that point, so the driver hasn't had a
chance to process the request yet.

--
Jens Axboe

2005-02-18 14:26:47

by Philip R Auld

Subject: Re: bio refcount problem

Hi,

Rumor has it that on Fri, Feb 18, 2005 at 02:59:32PM +0100 Jens Axboe said:
> On Fri, Feb 18 2005, Philip R Auld wrote:

...
> > Or make all users of submit_bio take and release an extra reference
> > like submit_bh.
>
> The queue lock is still held at that point, so the driver hasn't had a
> chance to process the request yet.

Interesting. This is not a theoretical problem though. I've got traces of
the oops showing the bio getting freed before the bio_sync(bio) test.
When you say "driver" here, what level do you mean? scsi_request_fn at
least drops the queue lock.

What if it's merged instead of added directly? That could also get to
the same place.

The end_io callback _is_ getting called before __make_request
does its "if(bio_sync(bio))" test.


Cheers,

Phil


>
> --
> Jens Axboe

--
Philip R. Auld, Ph.D.          Egenera, Inc.
Software Architect             165 Forest St.
(508) 858-2628                 Marlboro, MA 01752

2005-02-18 14:36:19

by Jens Axboe

Subject: Re: bio refcount problem

On Fri, Feb 18 2005, Philip R Auld wrote:
> Hi,
>
> Rumor has it that on Fri, Feb 18, 2005 at 02:59:32PM +0100 Jens Axboe said:
> > On Fri, Feb 18 2005, Philip R Auld wrote:
>
> ...
> > > Or make all users of submit_bio take and release an extra reference
> > > like submit_bh.
> >
> > The queue lock is still held at that point, so the driver hasn't had a
> > chance to process the request yet.
>
> Interesting. This is not a theoretical problem though. I've got traces of
> the oops showing the bio getting freed before the bio_sync(bio) test.
> When you say "driver" here, what level do you mean? scsi_request_fn at
> least drops the queue lock.

But it must be holding the lock to retrieve the request in question, so
there should be no opening for a completion race there.

> What if it's merged instead of added directly? That could also get to
> the same place.

Same deal - if the request is already seen by the driver, merging is not
allowed. If not, then the same rules apply.

> The end_io callback _is_ getting called before __make_request
> does its "if(bio_sync(bio))" test.

Sounds strange; it sounds like a driver issue. If you have time, please
do poke some more at this. I'll try to be responsive, but I'm busy with
other things atm. Are you using a vanilla kernel?

--
Jens Axboe