Last September a fix was checked in for a memory leak problem in
bounce_end_io causing the entire bio to be checked. This ended up
causing some dm cloned bios that had bounce buffers to free NULL pages
because their bi_idx can be non-zero. This patch skips NULL pages in
the bio's bio_vec. I'm not sure if this is the most optimal fix but I
think that it is safe since bvec_alloc memsets the bio_vec to zero.
Mark.
===== mm/highmem.c 1.55 vs edited =====
--- 1.55/mm/highmem.c 2005-01-07 21:44:13 -08:00
+++ edited/mm/highmem.c 2005-02-25 07:54:21 -08:00
@@ -319,7 +319,7 @@
*/
__bio_for_each_segment(bvec, bio, i, 0) {
org_vec = bio_orig->bi_io_vec + i;
- if (bvec->bv_page == org_vec->bv_page)
+ if (!bvec->bv_page || bvec->bv_page == org_vec->bv_page)
continue;
mempool_free(bvec->bv_page, pool);
--
Mark Haverkamp <[email protected]>
On Fri, 2005-02-25 at 09:03 -0800, Mark Haverkamp wrote:
> Last September a fix was checked in for a memory leak problem in
> bounce_end_io causing the entire bio to be checked. This ended up
> causing some dm cloned bios that had bounce buffers to free NULL pages
> because their bi_idx can be non-zero. This patch skips NULL pages in
> the bio's bio_vec. I'm not sure if this is the most optimal fix but I
> think that it is safe since bvec_alloc memsets the bio_vec to zero.
>
> Mark.
I forgot a signed-off-by
Signed-off-by Mark Haverkamp <[email protected]>
===== mm/highmem.c 1.55 vs edited =====
--- 1.55/mm/highmem.c 2005-01-07 21:44:13 -08:00
+++ edited/mm/highmem.c 2005-02-25 07:54:21 -08:00
@@ -319,7 +319,7 @@
*/
__bio_for_each_segment(bvec, bio, i, 0) {
org_vec = bio_orig->bi_io_vec + i;
- if (bvec->bv_page == org_vec->bv_page)
+ if (!bvec->bv_page || bvec->bv_page == org_vec->bv_page)
continue;
mempool_free(bvec->bv_page, pool);
--
Mark Haverkamp <[email protected]>
Mark Haverkamp <[email protected]> wrote:
>
>
> Last September a fix was checked in for a memory leak problem in
> bounce_end_io causing the entire bio to be checked. This ended up
> causing some dm cloned bios that had bounce buffers to free NULL pages
> because their bi_idx can be non-zero. This patch skips NULL pages in
> the bio's bio_vec. I'm not sure if this is the most optimal fix but I
> think that it is safe since bvec_alloc memsets the bio_vec to zero.
>
Thanks, we should get fixed for 2.6.11.
It seems very weird for dm to be shoving NULL page*'s into the middle of a
bio's bvec array, so your fix might end up being a workaround pending a
closer look at what's going on in there.
>
> ===== mm/highmem.c 1.55 vs edited =====
> --- 1.55/mm/highmem.c 2005-01-07 21:44:13 -08:00
> +++ edited/mm/highmem.c 2005-02-25 07:54:21 -08:00
> @@ -319,7 +319,7 @@
> */
> __bio_for_each_segment(bvec, bio, i, 0) {
> org_vec = bio_orig->bi_io_vec + i;
> - if (bvec->bv_page == org_vec->bv_page)
> + if (!bvec->bv_page || bvec->bv_page == org_vec->bv_page)
> continue;
>
> mempool_free(bvec->bv_page, pool);
>
> --
> Mark Haverkamp <[email protected]>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
On Fri, 25 Feb 2005, Andrew Morton wrote:
>
> It seems very weird for dm to be shoving NULL page*'s into the middle of a
> bio's bvec array, so your fix might end up being a workaround pending a
> closer look at what's going on in there.
Yes. I don't see how this patch can be anything but bandaid to hide the
real bug. Where do these "non-page" bvec's originate?
Linus
On Fri, Feb 25 2005, Linus Torvalds wrote:
>
>
> On Fri, 25 Feb 2005, Andrew Morton wrote:
> >
> > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > bio's bvec array, so your fix might end up being a workaround pending a
> > closer look at what's going on in there.
>
> Yes. I don't see how this patch can be anything but bandaid to hide the
> real bug. Where do these "non-page" bvec's originate?
Yep that's the fishy part, there should not be NULL pages in the middle
(or empty bios, for that matter) submitted for io.
Mark, what was the bug that triggered you to write this patch?
--
Jens Axboe
On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> On Fri, Feb 25 2005, Linus Torvalds wrote:
> >
> >
> > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > >
> > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > bio's bvec array, so your fix might end up being a workaround pending a
> > > closer look at what's going on in there.
> >
> > Yes. I don't see how this patch can be anything but bandaid to hide the
> > real bug. Where do these "non-page" bvec's originate?
>
> Yep that's the fishy part, there should not be NULL pages in the middle
> (or empty bios, for that matter) submitted for io.
>
> Mark, what was the bug that triggered you to write this patch?
It happened when some pages of IO from a dm device were bounced. It
looks to me when bio's are cloned in the dm code to split it for
physical devices that only the pointers to pages that apply to that
device are copied and th bi_idx is adjusted to point to the start,
leaving some NULL pointers at the start of the bio_vec.
Mark,
>
--
Mark Haverkamp <[email protected]>
On Mon, 2005-02-28 at 07:32 -0800, Mark Haverkamp wrote:
> On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> > On Fri, Feb 25 2005, Linus Torvalds wrote:
> > >
> > >
> > > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > > >
> > > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > > bio's bvec array, so your fix might end up being a workaround pending a
> > > > closer look at what's going on in there.
> > >
> > > Yes. I don't see how this patch can be anything but bandaid to hide the
> > > real bug. Where do these "non-page" bvec's originate?
> >
> > Yep that's the fishy part, there should not be NULL pages in the middle
> > (or empty bios, for that matter) submitted for io.
> >
> > Mark, what was the bug that triggered you to write this patch?
>
> It happened when some pages of IO from a dm device were bounced. It
> looks to me when bio's are cloned in the dm code to split it for
> physical devices that only the pointers to pages that apply to that
> device are copied and th bi_idx is adjusted to point to the start,
> leaving some NULL pointers at the start of the bio_vec.
I don't think that I explained that correctly. The dm code clones the
bio and sets the bi_idx to point to where the IO should start. Then
when __blk_queue_bounce gets called, it only fills in its bio starting
at bi_idx (it uses bio_for_each_segment) and since it calls bio_alloc
instead of bio_clone, any pages it doesn't fill in are NULL. I suppose
we could call bio_clone instead of bio_alloc in __blk_queue_bounce and
fill in the whole bio.
Mark.
>
> Mark,
>
>
> >
--
Mark Haverkamp <[email protected]>
On Mon, Feb 28 2005, Mark Haverkamp wrote:
> On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> > On Fri, Feb 25 2005, Linus Torvalds wrote:
> > >
> > >
> > > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > > >
> > > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > > bio's bvec array, so your fix might end up being a workaround pending a
> > > > closer look at what's going on in there.
> > >
> > > Yes. I don't see how this patch can be anything but bandaid to hide the
> > > real bug. Where do these "non-page" bvec's originate?
> >
> > Yep that's the fishy part, there should not be NULL pages in the middle
> > (or empty bios, for that matter) submitted for io.
> >
> > Mark, what was the bug that triggered you to write this patch?
>
> It happened when some pages of IO from a dm device were bounced. It
> looks to me when bio's are cloned in the dm code to split it for
> physical devices that only the pointers to pages that apply to that
> device are copied and th bi_idx is adjusted to point to the start,
> leaving some NULL pointers at the start of the bio_vec.
This should fix it.
Signed-off-by: Jens Axboe <[email protected]>
===== mm/highmem.c 1.55 vs edited =====
--- 1.55/mm/highmem.c 2005-01-08 06:44:13 +01:00
+++ edited/mm/highmem.c 2005-02-28 16:50:59 +01:00
@@ -425,7 +425,7 @@
* at least one page was bounced, fill in possible non-highmem
* pages
*/
- bio_for_each_segment(from, *bio_orig, i) {
+ __bio_for_each_segment(from, *bio_orig, i) {
to = bio_iovec_idx(bio, i);
if (!to->bv_page) {
to->bv_page = from->bv_page;
--
Jens Axboe
On Mon, 2005-02-28 at 16:51 +0100, Jens Axboe wrote:
> On Mon, Feb 28 2005, Mark Haverkamp wrote:
> > On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> > > On Fri, Feb 25 2005, Linus Torvalds wrote:
> > > >
> > > >
> > > > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > > > >
> > > > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > > > bio's bvec array, so your fix might end up being a workaround pending a
> > > > > closer look at what's going on in there.
> > > >
> > > > Yes. I don't see how this patch can be anything but bandaid to hide the
> > > > real bug. Where do these "non-page" bvec's originate?
> > >
> > > Yep that's the fishy part, there should not be NULL pages in the middle
> > > (or empty bios, for that matter) submitted for io.
> > >
> > > Mark, what was the bug that triggered you to write this patch?
> >
> > It happened when some pages of IO from a dm device were bounced. It
> > looks to me when bio's are cloned in the dm code to split it for
> > physical devices that only the pointers to pages that apply to that
> > device are copied and th bi_idx is adjusted to point to the start,
> > leaving some NULL pointers at the start of the bio_vec.
>
> This should fix it.
Wouldn't this potentially create bounce pages that will never be used?
>
> Signed-off-by: Jens Axboe <[email protected]>
>
>
> ===== mm/highmem.c 1.55 vs edited =====
> --- 1.55/mm/highmem.c 2005-01-08 06:44:13 +01:00
> +++ edited/mm/highmem.c 2005-02-28 16:50:59 +01:00
> @@ -425,7 +425,7 @@
> * at least one page was bounced, fill in possible non-highmem
> * pages
> */
> - bio_for_each_segment(from, *bio_orig, i) {
> + __bio_for_each_segment(from, *bio_orig, i) {
> to = bio_iovec_idx(bio, i);
> if (!to->bv_page) {
> to->bv_page = from->bv_page;
>
--
Mark Haverkamp <[email protected]>
On Mon, 2005-02-28 at 08:13 -0800, Mark Haverkamp wrote:
> On Mon, 2005-02-28 at 16:51 +0100, Jens Axboe wrote:
> > On Mon, Feb 28 2005, Mark Haverkamp wrote:
> > > On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> > > > On Fri, Feb 25 2005, Linus Torvalds wrote:
> > > > >
> > > > >
> > > > > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > > > > >
> > > > > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > > > > bio's bvec array, so your fix might end up being a workaround pending a
> > > > > > closer look at what's going on in there.
> > > > >
> > > > > Yes. I don't see how this patch can be anything but bandaid to hide the
> > > > > real bug. Where do these "non-page" bvec's originate?
> > > >
> > > > Yep that's the fishy part, there should not be NULL pages in the middle
> > > > (or empty bios, for that matter) submitted for io.
> > > >
> > > > Mark, what was the bug that triggered you to write this patch?
> > >
> > > It happened when some pages of IO from a dm device were bounced. It
> > > looks to me when bio's are cloned in the dm code to split it for
> > > physical devices that only the pointers to pages that apply to that
> > > device are copied and th bi_idx is adjusted to point to the start,
> > > leaving some NULL pointers at the start of the bio_vec.
> >
> > This should fix it.
>
> Wouldn't this potentially create bounce pages that will never be used?
Sorry, never mind. I didn't notice that this was for the non-highmem
pages.
--
Mark Haverkamp <[email protected]>
On Mon, Feb 28 2005, Mark Haverkamp wrote:
> On Mon, 2005-02-28 at 16:51 +0100, Jens Axboe wrote:
> > On Mon, Feb 28 2005, Mark Haverkamp wrote:
> > > On Sat, 2005-02-26 at 13:39 +0100, Jens Axboe wrote:
> > > > On Fri, Feb 25 2005, Linus Torvalds wrote:
> > > > >
> > > > >
> > > > > On Fri, 25 Feb 2005, Andrew Morton wrote:
> > > > > >
> > > > > > It seems very weird for dm to be shoving NULL page*'s into the middle of a
> > > > > > bio's bvec array, so your fix might end up being a workaround pending a
> > > > > > closer look at what's going on in there.
> > > > >
> > > > > Yes. I don't see how this patch can be anything but bandaid to hide the
> > > > > real bug. Where do these "non-page" bvec's originate?
> > > >
> > > > Yep that's the fishy part, there should not be NULL pages in the middle
> > > > (or empty bios, for that matter) submitted for io.
> > > >
> > > > Mark, what was the bug that triggered you to write this patch?
> > >
> > > It happened when some pages of IO from a dm device were bounced. It
> > > looks to me when bio's are cloned in the dm code to split it for
> > > physical devices that only the pointers to pages that apply to that
> > > device are copied and th bi_idx is adjusted to point to the start,
> > > leaving some NULL pointers at the start of the bio_vec.
> >
> > This should fix it.
>
> Wouldn't this potentially create bounce pages that will never be used?
Well no, it just points them at the "top" pages from the original bio.
--
Jens Axboe
Just trivial, I think you're missing the final "0" argument
in the __bio_for_each_segment().
On Mon, Feb 28, 2005 at 04:51:28PM +0100, Jens Axboe wrote:
> This should fix it.
>
> Signed-off-by: Jens Axboe <[email protected]>
>
>
> ===== mm/highmem.c 1.55 vs edited =====
> --- 1.55/mm/highmem.c 2005-01-08 06:44:13 +01:00
> +++ edited/mm/highmem.c 2005-02-28 16:50:59 +01:00
> @@ -425,7 +425,7 @@
> * at least one page was bounced, fill in possible non-highmem
> * pages
> */
> - bio_for_each_segment(from, *bio_orig, i) {
> + __bio_for_each_segment(from, *bio_orig, i) {
> to = bio_iovec_idx(bio, i);
> if (!to->bv_page) {
> to->bv_page = from->bv_page;
>
> --
> Jens Axboe
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
On Mon, Feb 28 2005, Dave Olien wrote:
>
> Just trivial, I think you're missing the final "0" argument
> in the __bio_for_each_segment().
Yep indeed, was just a hasty edit...
>
> On Mon, Feb 28, 2005 at 04:51:28PM +0100, Jens Axboe wrote:
> > This should fix it.
> >
> > Signed-off-by: Jens Axboe <[email protected]>
> >
> >
> > ===== mm/highmem.c 1.55 vs edited =====
> > --- 1.55/mm/highmem.c 2005-01-08 06:44:13 +01:00
> > +++ edited/mm/highmem.c 2005-02-28 16:50:59 +01:00
> > @@ -425,7 +425,7 @@
> > * at least one page was bounced, fill in possible non-highmem
> > * pages
> > */
> > - bio_for_each_segment(from, *bio_orig, i) {
> > + __bio_for_each_segment(from, *bio_orig, i) {
> > to = bio_iovec_idx(bio, i);
> > if (!to->bv_page) {
> > to->bv_page = from->bv_page;
> >
> > --
> > Jens Axboe
> >
> > -
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to [email protected]
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at http://www.tux.org/lkml/
>
--
Jens Axboe