Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935460AbXJQRIe (ORCPT ); Wed, 17 Oct 2007 13:08:34 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1764181AbXJQRIL (ORCPT ); Wed, 17 Oct 2007 13:08:11 -0400 Received: from brick.kernel.dk ([87.55.233.238]:5647 "EHLO kernel.dk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1763107AbXJQRIK (ORCPT ); Wed, 17 Oct 2007 13:08:10 -0400 Date: Wed, 17 Oct 2007 19:08:05 +0200 From: Jens Axboe To: Linus Torvalds Cc: Ingo Molnar , linux-kernel@vger.kernel.org, Andrew Morton Subject: Re: [bug] block subsystem related crash with latest -git Message-ID: <20071017170804.GG15552@kernel.dk> References: <20071017154655.GA13394@elte.hu> <20071017165949.GF15552@kernel.dk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20071017165949.GF15552@kernel.dk> Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3643 Lines: 123 On Wed, Oct 17 2007, Jens Axboe wrote: > On Wed, Oct 17 2007, Linus Torvalds wrote: > > > > > > On Wed, 17 Oct 2007, Ingo Molnar wrote: > > > > > > Jens, just got this crash on a testbox: > > > > The code in question is: > > > > mov %edx,0xc(%esp) > > mov (%ebx),%edi > > mov %edi,%edx > > sub %eax,%edx > > mov %edx,%eax > > sar $0x5,%eax > > shl $0xc,%eax > > add 0x8(%ebx),%eax > > cmp %eax,0xc(%esp) > > je +126 > > mov 0x10(%esi),%eax <----- Oops > > lea 0x10(%esi),%edx > > test $0x1,%al > > jne +76 > > mov %edi,(%esi) > > mov %ebp,0xc(%esi) > > mov 0x8(%ebx),%eax > > mov %eax,0x4(%esi) > > > > > > and it looks like %esi is overflowing from one page to the next one, ie: > > > > BUG: unable to handle kernel paging request at virtual address 7ca76000 > > ESI: 7ca75ff0 > > > > and you caught this thanks to page-alloc debugging again. > > > > I think I can match that up with the source code: that's "sg_next()". It's > > doing: > > > > sg++; > > > > if (unlikely(sg_is_chain(sg))) > > sg = sg_chain_ptr(sg); > > > > return sg; > > > > and the oopsing instruction is that load of "sg->page" in the assembly > > code: > > > > mov 0x10(%esi),%eax # %eax = sg->page > > lea 0x10(%esi),%edx # %edx = sg+1; > > test $0x1,%al # if (unlikely(sg_is_chain())) > > jne +76 > > > > Jens? > > Yep, that's what I came up with as well - I asked Ingo for a dump in > private, but ended up just using ksymoops to decode the line. > > The way blk_rq_map_sg() operates is that it ends up doing a > > next_sg = sg_next(sg); > > even though sg may be the last entry. Perhaps this is crapping out, > although if sg is a valid address, then sg + 1 should be as well. > next_sg may end up being crap, in fact it will, but we'll never use that > unless there are more entries to fill. And if there is, then both sg and > next_sg were valid. > > So nothing in for-linus should fix it, I'll try and come up with an > alternate way to assign next_sg so it's always valid. OK, the below should actually be safe, I don't know why I talked myself into the next_sg stuff in the beginning. It's always safe to zero sg, since it's a valid entry - nothing to save in ->page. Ingo, does this work for you? diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c index 9e3f3cc..3935469 100644 --- a/block/ll_rw_blk.c +++ b/block/ll_rw_blk.c @@ -1322,8 +1322,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, struct scatterlist *sglist) { struct bio_vec *bvec, *bvprv; - struct scatterlist *next_sg, *sg; struct req_iterator iter; + struct scatterlist *sg; int nsegs, cluster; nsegs = 0; @@ -1333,7 +1333,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, * for each bio in rq */ bvprv = NULL; - sg = next_sg = &sglist[0]; + sg = NULL; rq_for_each_segment(bvec, rq, iter) { int nbytes = bvec->bv_len; @@ -1349,8 +1349,10 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq, sg->length += nbytes; } else { new_segment: - sg = next_sg; - next_sg = sg_next(sg); + if (!sg) + sg = sglist; + else + sg = sg_next(sg); memset(sg, 0, sizeof(*sg)); sg->page = bvec->bv_page; -- Jens Axboe - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/