To: apw@shadowen.org
Cc: jens.axboe@oracle.com, kamalesh@linux.vnet.ibm.com, linux-kernel@vger.kernel.org
Subject: Re: [BUG] 2.6.23-git18 Kernel oops in sg helpers
From: FUJITA Tomonori
In-Reply-To: <20071024115436.GT32058@shadowen.org>
References: <471E110C.20404@linux.vnet.ibm.com> <20071023184419.GD14671@kernel.dk> <20071024115436.GT32058@shadowen.org>
Message-Id: <20071024214014C.fujita.tomonori@lab.ntt.co.jp>
Date: Wed, 24 Oct 2007 21:40:14 +0900

On Wed, 24 Oct 2007 12:54:36 +0100
Andy Whitcroft wrote:

> On Tue, Oct 23, 2007 at 08:44:20PM +0200, Jens Axboe wrote:
> > On Tue, Oct 23 2007, Kamalesh Babulal wrote:
> > > Hi,
> > >
> > > Kernel oops is triggered while running fsx-linux test, followed by cpu softlock
> > > over the AMD box
> > >
> > > Unable to handle kernel NULL pointer dereference at 0000000000000018 RIP:
> > > [] gart_map_sg+0x26c/0x406
> > > PGD 10185b067 PUD 10075b067 PMD 0
> > > Oops: 0002 [1] SMP
> > > CPU 3
> > > Modules linked in:
> > > Pid: 18676, comm: fsx-linux Not tainted 2.6.23-git18-autokern1 #1
> > > RIP: 0010:[] [] gart_map_sg+0x26c/0x406
> > > RSP: 0000:ffff810181edf948 EFLAGS: 00010002
> >
> > Can you check where gart_map_sg+0x26c is at? Make sure you have
> > CONFIG_DEBUG_INFO defined, then do:
> >
> > $ gdb vmlinux
> > $ l *gart_map_sg+0x26c
>
> Ok, this problem still seems to be about in 2.6.24-rc1. Here is the gdb
> output from that version, the panic (also below) seems the same:
>
> (gdb) l *gart_map_sg+0x26c
> 0xffffffff8022011e is in gart_map_sg (arch/x86/kernel/pci-gart_64.c:433).
> 428                     goto error;
> 429             out++;
> 430             flush_gart();
> 431             if (out < nents) {
> 432                     sgmap = sg_next(sgmap);
> 433                     sgmap->dma_length = 0;
> 434             }
> 435             return out;
> 436
> 437     error:
>
> So it seems sg_next has returned 0.

Have you tried this?

http://marc.info/?l=linux-kernel&m=119317981406073&w=2
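
For anyone following along, here is a rough userspace sketch of the failure mode discussed above. It is not the kernel code and not the patch behind the link. struct sg_model, sg_model_next() and mark_end() are hypothetical names made up for illustration; the model assumes only that the "next" helper returns NULL once the end of the list is reached, which is what the gdb listing suggests happened at line 432. A write through that NULL pointer at a small field offset would fit the reported fault address of 0x18.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scatterlist: only what the sketch needs. */
struct sg_model {
    unsigned int dma_length;
    int is_last;                /* models the end-of-list marker */
};

/* Models sg_next(): returns NULL when called on the last entry. */
static struct sg_model *sg_model_next(struct sg_model *sg)
{
    return sg->is_last ? NULL : sg + 1;
}

/*
 * Models lines 431-435 of the listing, with a NULL check added.
 * Returns 'out' on success, or -1 if the list ended although
 * out < nents claimed that another entry exists.
 */
static int mark_end(struct sg_model *sgmap, int out, int nents)
{
    assert(sgmap != NULL);

    if (out < nents) {
        sgmap = sg_model_next(sgmap);
        if (sgmap == NULL)
            return -1;          /* the unchecked code would oops here */
        sgmap->dma_length = 0;
    }
    return out;
}
```

The check only turns the crash into an error return. If nents and the list's own end marker disagree, the real question is which of the two is wrong.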