Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935641AbXJQRWc (ORCPT ); Wed, 17 Oct 2007 13:22:32 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1763385AbXJQRWF (ORCPT ); Wed, 17 Oct 2007 13:22:05 -0400 Received: from brick.kernel.dk ([87.55.233.238]:2905 "EHLO kernel.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1761775AbXJQRWD (ORCPT ); Wed, 17 Oct 2007 13:22:03 -0400 Date: Wed, 17 Oct 2007 19:21:58 +0200 From: Jens Axboe To: Linus Torvalds Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Andrew Morton Subject: Re: [bug] block subsystem related crash with latest -git Message-ID: <20071017172158.GH15552@kernel.dk> References: <20071017154655.GA13394@elte.hu> <20071017165949.GF15552@kernel.dk> <20071017170804.GG15552@kernel.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20071017170804.GG15552@kernel.dk> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3982 Lines: 127 On Wed, Oct 17 2007, Jens Axboe wrote: > On Wed, Oct 17 2007, Jens Axboe wrote: > > On Wed, Oct 17 2007, Linus Torvalds wrote: > > > > > > > > > On Wed, 17 Oct 2007, Ingo Molnar wrote: > > > > > > > > Jens, just got this crash on a testbox: > > > > > > The code in question is: > > > > > > mov %edx,0xc(%esp) > > > mov (%ebx),%edi > > > mov %edi,%edx > > > sub %eax,%edx > > > mov %edx,%eax > > > sar $0x5,%eax > > > shl $0xc,%eax > > > add 0x8(%ebx),%eax > > > cmp %eax,0xc(%esp) > > > je +126 > > > mov 0x10(%esi),%eax <----- Oops > > > lea 0x10(%esi),%edx > > > test $0x1,%al > > > jne +76 > > > mov %edi,(%esi) > > > mov %ebp,0xc(%esi) > > > mov 0x8(%ebx),%eax > > > mov %eax,0x4(%esi) > > > > > > > > > and it looks like %esi is overflowing from one page to the next one, ie: > > > > > > BUG: unable to handle kernel paging request at virtual address 7ca76000 > > > ESI: 7ca75ff0 > > > > > > and you caught this thanks to page-alloc debugging again. > > > > > > I think I can match that up with the source code: that's "sg_next()". It's > > > doing: > > > > > > sg++; > > > > > > if (unlikely(sg_is_chain(sg))) > > > sg = sg_chain_ptr(sg); > > > > > > return sg; > > > > > > and the oopsing instruction is that load of "sg->page" in the assembly > > > code: > > > > > > mov 0x10(%esi),%eax # %eax = sg->page > > > lea 0x10(%esi),%edx # %edx = sg+1; > > > test $0x1,%al # if (unlikely(sg_is_chain())) > > > jne +76 > > > > > > Jens? > > > > Yep, that's what I came up with as well - I asked Ingo for a dump in > > private, but ended up just using ksymoops to decode the line. > > > > The way blk_rq_map_sg() operates is that it ends up doing a > > > > next_sg = sg_next(sg); > > > > even though sg may be the last entry. Perhaps this is crapping out, > > although if sg is a valid address, then sg + 1 should be as well. > > next_sg may end up being crap, in fact it will, but we'll never use that > > unless there are more entries to fill. And if there is, then both sg and > > next_sg were valid. > > > > So nothing in for-linus should fix it, I'll try and come up with an > > alternate way to assign next_sg so it's always valid. > > OK, the below should actually be safe, I don't know why I talked myself > into the next_sg stuff in the beginning. It's always safe to zero sg, > since it's a valid entry - nothing to save in ->page. Ingo, does this > work for you? > > diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c > index 9e3f3cc..3935469 100644 > --- a/block/ll_rw_blk.c > +++ b/block/ll_rw_blk.c > @@ -1322,8 +1322,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, > struct scatterlist *sglist) > { > struct bio_vec *bvec, *bvprv; > - struct scatterlist *next_sg, *sg; > struct req_iterator iter; > + struct scatterlist *sg; > int nsegs, cluster; > > nsegs = 0; > @@ -1333,7 +1333,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, > * for each bio in rq > */ > bvprv = NULL; > - sg = next_sg = &sglist[0]; > + sg = NULL; > rq_for_each_segment(bvec, rq, iter) { > int nbytes = bvec->bv_len; > > @@ -1349,8 +1349,10 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, > sg->length += nbytes; > } else { > new_segment: > - sg = next_sg; > - next_sg = sg_next(sg); > + if (!sg) > + sg = sglist; > + else > + sg = sg_next(sg); > > memset(sg, 0, sizeof(*sg)); > sg->page = bvec->bv_page; > Scratch that, it cannot work... I'll think up a different approach. -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/