2007-10-04 07:19:24

by Don Mullis

Subject: 2.6.23-rc8-mm2: OOPS in mmc on boot

OOPS followed by a 3 minute timeout, then completion of boot.
Not seen if card (Kingston microSD adapter) is ejected; not seen in 2.6.23-rc8.
Running on a Dell XPS M1330 laptop.

`dmesg` reports:

[ 13.695045] mmcblk0: mmc0:e95c SD02G 1966080KiB
[ 13.695155] mmcblk0: p1
[ 13.706907] BUG: unable to handle kernel paging request at virtual address 6b6b6b7a
[ 13.707026] printing eip: c01f09f0 *pde = 00000000
[ 13.707174] Oops: 0000 [#1] SMP
[ 13.707326] last sysfs file: /class/mmc_host/mmc0/mmc0:e95c/serial
[ 13.707389] Modules linked in: mmc_block sr_mod iwl4965 cdrom serio_raw mac80211 piix sdhci pcspkr psmouse ide_core iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev cfg80211 mmc_core shpchp pci_hotplug intel_agp agpgart battery ac power_supply button evdev ata_generic ext3 jbd mbcache sg sd_mod usbhid hid ahci ata_piix libata scsi_mod ohci1394 tg3 ieee1394 ehci_hcd uhci_hcd thermal processor fan fuse
[ 13.709649]
[ 13.709705] Pid: 4089, comm: mmcqd Not tainted (2.6.23-rc8-mm2 #27)
[ 13.709767] EIP: 0060:[<c01f09f0>] EFLAGS: 00010206 CPU: 0
[ 13.709831] EIP is at blk_rq_map_sg+0xc0/0x160
[ 13.709889] EAX: 04b6a000 EBX: c4a030e0 ECX: 04b6b000 EDX: c1000000
[ 13.709951] ESI: 6b6b6b6a EDI: c11535d0 EBP: c4971e30 ESP: c4971df4
[ 13.710013] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 13.710074] Process mmcqd (pid: 4089, ti=c4970000 task=c387b660 task.ti=c4970000)
[ 13.710137] last branch before last exception/interrupt
[ 13.710249] from c0129bf8 (vprintk+0x1d8/0x340)
[ 13.710359] to c0129c9c (vprintk+0x27c/0x340)
[ 13.710453] Stack: c4971e08 c013d85f c48a0440 00002000 04b6c000 c3914180 00000001 00000001
[ 13.710920] 00001000 00000000 c4923700 01000000 c48ca3a0 c48f6d88 c48f6d88 c4971e40
[ 13.711390] f8c39e58 c48f6d88 c4a9f020 c4971fbc f8c396c9 c017de04 00000004 c4971e84
[ 13.711857] Call Trace:
[ 13.711971] [<f8c39e58>] mmc_queue_map_sg+0x28/0xc0 [mmc_block]
[ 13.712085] [<f8c396c9>] mmc_blk_issue_rq+0x199/0x780 [mmc_block]
[ 13.712193] [<f8c3a168>] mmc_queue_thread+0x78/0xe0 [mmc_block]
[ 13.712309] [<c013d382>] kthread+0x42/0x70
[ 13.712415] [<c0104e73>] kernel_thread_helper+0x7/0x14
[ 13.712523] =======================
[ 13.712584] Code: 0c 03 4f 08 8b 7f 04 01 cf 89 7d d4 8b 3b 89 f8 29 d0 c1 f8 03 69 c0 39 8e e3 38 c1 e0 0c 03 43 08 39 45 d4 74 73 90 8d 74 26 00 <8b> 46 10 8d 4e 10 89 3e 89 c2 83 e2 fe a8 01 8b 45 e4 0f 45 ca
[ 13.715555] EIP: [<c01f09f0>] blk_rq_map_sg+0xc0/0x160 SS:ESP 0068:c4971df4
[ 13.845668] Synaptics Touchpad, model: 1, fw: 6.3, id: 0x1c0b1, caps: 0xa04753/0x200000
[ 13.879914] input: SynPS/2 Synaptics TouchPad as /class/input/input7
[ 192.162711] Adding 2731008k swap on /dev/sda7. Priority:-1 extents:1 across:2731008k


`lspci -vvv` reports:

03:01.1 Generic system peripheral [0805]: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 22) (prog-if 01)
Subsystem: Dell Unknown device 0209
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 22
Region 0: Memory at fe4ff400 (32-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

03:01.2 System peripheral: Ricoh Co Ltd R5C843 MMC Host Controller (rev 12)
Subsystem: Dell Unknown device 0209
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 4
Region 0: Memory at fe4ff500 (32-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

03:01.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host Adapter (rev 12)
Subsystem: Dell Unknown device 0209
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 4
Region 0: Memory at fe4ff600 (32-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-

03:01.4 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev 12)
Subsystem: Dell Unknown device 0209
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
Latency: 64, Cache Line Size: 64 bytes
Interrupt: pin B routed to IRQ 4
Region 0: Memory at fe4ff700 (32-bit, non-prefetchable) [size=256]
Capabilities: [80] Power Management version 2
Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=2 PME-




2007-10-04 06:17:38

by Andrew Morton

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Wed, 03 Oct 2007 23:11:02 -0700 Don Mullis <[email protected]> wrote:

> OOPS followed by a 3 minute timeout, then completion of boot.
> Not seen if card (Kingston microSD adapter) is ejected; not seen in 2.6.23-rc8.
> Running on a Dell XPS M1330 laptop.
>
> `dmesg` reports:
>
> [ 13.695045] mmcblk0: mmc0:e95c SD02G 1966080KiB
> [ 13.695155] mmcblk0: p1
> [ 13.706907] BUG: unable to handle kernel paging request at virtual address 6b6b6b7a
> [ 13.707026] printing eip: c01f09f0 *pde = 00000000
> [ 13.707174] Oops: 0000 [#1] SMP
> [ 13.707326] last sysfs file: /class/mmc_host/mmc0/mmc0:e95c/serial
> [ 13.707389] Modules linked in: mmc_block sr_mod iwl4965 cdrom serio_raw mac80211 piix sdhci pcspkr psmouse ide_core iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev cfg80211 mmc_core shpchp pci_hotplug intel_agp agpgart battery ac power_supply button evdev ata_generic ext3 jbd mbcache sg sd_mod usbhid hid ahci ata_piix libata scsi_mod ohci1394 tg3 ieee1394 ehci_hcd uhci_hcd thermal processor fan fuse
> [ 13.709649]
> [ 13.709705] Pid: 4089, comm: mmcqd Not tainted (2.6.23-rc8-mm2 #27)
> [ 13.709767] EIP: 0060:[<c01f09f0>] EFLAGS: 00010206 CPU: 0
> [ 13.709831] EIP is at blk_rq_map_sg+0xc0/0x160
> [ 13.709889] EAX: 04b6a000 EBX: c4a030e0 ECX: 04b6b000 EDX: c1000000
> [ 13.709951] ESI: 6b6b6b6a EDI: c11535d0 EBP: c4971e30 ESP: c4971df4
> [ 13.710013] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> [ 13.710074] Process mmcqd (pid: 4089, ti=c4970000 task=c387b660 task.ti=c4970000)
> [ 13.710137] last branch before last exception/interrupt
> [ 13.710249] from c0129bf8 (vprintk+0x1d8/0x340)
> [ 13.710359] to c0129c9c (vprintk+0x27c/0x340)
> [ 13.710453] Stack: c4971e08 c013d85f c48a0440 00002000 04b6c000 c3914180 00000001 00000001
> [ 13.710920] 00001000 00000000 c4923700 01000000 c48ca3a0 c48f6d88 c48f6d88 c4971e40
> [ 13.711390] f8c39e58 c48f6d88 c4a9f020 c4971fbc f8c396c9 c017de04 00000004 c4971e84
> [ 13.711857] Call Trace:
> [ 13.711971] [<f8c39e58>] mmc_queue_map_sg+0x28/0xc0 [mmc_block]
> [ 13.712085] [<f8c396c9>] mmc_blk_issue_rq+0x199/0x780 [mmc_block]
> [ 13.712193] [<f8c3a168>] mmc_queue_thread+0x78/0xe0 [mmc_block]
> [ 13.712309] [<c013d382>] kthread+0x42/0x70
> [ 13.712415] [<c0104e73>] kernel_thread_helper+0x7/0x14
> [ 13.712523] =======================
> [ 13.712584] Code: 0c 03 4f 08 8b 7f 04 01 cf 89 7d d4 8b 3b 89 f8 29 d0 c1 f8 03 69 c0 39 8e e3 38 c1 e0 0c 03 43 08 39 45 d4 74 73 90 8d 74 26 00 <8b> 46 10 8d 4e 10 89 3e 89 c2 83 e2 fe a8 01 8b 45 e4 0f 45 ca
> [ 13.715555] EIP: [<c01f09f0>] blk_rq_map_sg+0xc0/0x160 SS:ESP 0068:c4971df4
> [ 13.845668] Synaptics Touchpad, model: 1, fw: 6.3, id: 0x1c0b1, caps: 0xa04753/0x200000
> [ 13.879914] input: SynPS/2 Synaptics TouchPad as /class/input/input7
> [ 192.162711] Adding 2731008k swap on /dev/sda7. Priority:-1 extents:1 across:2731008k

This could be due to git-block changes (or a lack of them ;))

2007-10-04 07:23:48

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Wed, Oct 03 2007, Andrew Morton wrote:
> On Wed, 03 Oct 2007 23:11:02 -0700 Don Mullis <[email protected]> wrote:
>
> > OOPS followed by a 3 minute timeout, then completion of boot.
> > Not seen if card (Kingston microSD adapter) is ejected; not seen in 2.6.23-rc8.
> > Running on a Dell XPS M1330 laptop.
> >
> > `dmesg` reports:
> >
> > [ 13.695045] mmcblk0: mmc0:e95c SD02G 1966080KiB
> > [ 13.695155] mmcblk0: p1
> > [ 13.706907] BUG: unable to handle kernel paging request at virtual address 6b6b6b7a
> > [ 13.707026] printing eip: c01f09f0 *pde = 00000000
> > [ 13.707174] Oops: 0000 [#1] SMP
> > [ 13.707326] last sysfs file: /class/mmc_host/mmc0/mmc0:e95c/serial
> > [ 13.707389] Modules linked in: mmc_block sr_mod iwl4965 cdrom serio_raw mac80211 piix sdhci pcspkr psmouse ide_core iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev cfg80211 mmc_core shpchp pci_hotplug intel_agp agpgart battery ac power_supply button evdev ata_generic ext3 jbd mbcache sg sd_mod usbhid hid ahci ata_piix libata scsi_mod ohci1394 tg3 ieee1394 ehci_hcd uhci_hcd thermal processor fan fuse
> > [ 13.709649]
> > [ 13.709705] Pid: 4089, comm: mmcqd Not tainted (2.6.23-rc8-mm2 #27)
> > [ 13.709767] EIP: 0060:[<c01f09f0>] EFLAGS: 00010206 CPU: 0
> > [ 13.709831] EIP is at blk_rq_map_sg+0xc0/0x160
> > [ 13.709889] EAX: 04b6a000 EBX: c4a030e0 ECX: 04b6b000 EDX: c1000000
> > [ 13.709951] ESI: 6b6b6b6a EDI: c11535d0 EBP: c4971e30 ESP: c4971df4
> > [ 13.710013] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > [ 13.710074] Process mmcqd (pid: 4089, ti=c4970000 task=c387b660 task.ti=c4970000)
> > [ 13.710137] last branch before last exception/interrupt
> > [ 13.710249] from c0129bf8 (vprintk+0x1d8/0x340)
> > [ 13.710359] to c0129c9c (vprintk+0x27c/0x340)
> > [ 13.710453] Stack: c4971e08 c013d85f c48a0440 00002000 04b6c000 c3914180 00000001 00000001
> > [ 13.710920] 00001000 00000000 c4923700 01000000 c48ca3a0 c48f6d88 c48f6d88 c4971e40
> > [ 13.711390] f8c39e58 c48f6d88 c4a9f020 c4971fbc f8c396c9 c017de04 00000004 c4971e84
> > [ 13.711857] Call Trace:
> > [ 13.711971] [<f8c39e58>] mmc_queue_map_sg+0x28/0xc0 [mmc_block]
> > [ 13.712085] [<f8c396c9>] mmc_blk_issue_rq+0x199/0x780 [mmc_block]
> > [ 13.712193] [<f8c3a168>] mmc_queue_thread+0x78/0xe0 [mmc_block]
> > [ 13.712309] [<c013d382>] kthread+0x42/0x70
> > [ 13.712415] [<c0104e73>] kernel_thread_helper+0x7/0x14
> > [ 13.712523] =======================
> > [ 13.712584] Code: 0c 03 4f 08 8b 7f 04 01 cf 89 7d d4 8b 3b 89 f8 29 d0 c1 f8 03 69 c0 39 8e e3 38 c1 e0 0c 03 43 08 39 45 d4 74 73 90 8d 74 26 00 <8b> 46 10 8d 4e 10 89 3e 89 c2 83 e2 fe a8 01 8b 45 e4 0f 45 ca
> > [ 13.715555] EIP: [<c01f09f0>] blk_rq_map_sg+0xc0/0x160 SS:ESP 0068:c4971df4
> > [ 13.845668] Synaptics Touchpad, model: 1, fw: 6.3, id: 0x1c0b1, caps: 0xa04753/0x200000
> > [ 13.879914] input: SynPS/2 Synaptics TouchPad as /class/input/input7
> > [ 192.162711] Adding 2731008k swap on /dev/sda7. Priority:-1 extents:1 across:2731008k
>
> This could be due to git-block changes (or a lack of them ;))

It looks like missing init of the sg list in mmc, does this work?

--- linux-2.6.23-rc8/drivers/mmc/card/queue.c~ 2007-10-04 09:22:02.000000000 +0200
+++ linux-2.6.23-rc8/drivers/mmc/card/queue.c 2007-10-04 09:23:13.000000000 +0200
@@ -334,14 +334,18 @@ static void copy_sg(struct scatterlist *

 unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
 {
+	struct request *rq = mq->req;
 	unsigned int sg_len;

-	if (!mq->bounce_buf)
-		return blk_rq_map_sg(mq->queue, mq->req, mq->sg);
+	if (!mq->bounce_buf) {
+		memset(mq->sg, 0, rq->nr_hw_segments * sizeof(struct scatterlist));
+		return blk_rq_map_sg(mq->queue, rq, mq->sg);
+	}

 	BUG_ON(!mq->bounce_sg);

-	sg_len = blk_rq_map_sg(mq->queue, mq->req, mq->bounce_sg);
+	memset(mq->bounce_sg, 0, rq->nr_hw_segments * sizeof(struct scatterlist));
+	sg_len = blk_rq_map_sg(mq->queue, rq, mq->bounce_sg);

 	mq->bounce_sg_len = sg_len;


--
Jens Axboe

2007-10-04 07:28:55

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Wed, 3 Oct 2007 23:16:59 -0700
Andrew Morton <[email protected]> wrote:

> On Wed, 03 Oct 2007 23:11:02 -0700 Don Mullis <[email protected]> wrote:
>
> > OOPS followed by a 3 minute timeout, then completion of boot.
> > Not seen if card (Kingston microSD adapter) is ejected; not seen in
> > 2.6.23-rc8. Running on a Dell XPS M1330 laptop.
> >

Impossible! My code is bug free!

> > [ 13.709831] EIP is at blk_rq_map_sg+0xc0/0x160
> > [ 13.711857] Call Trace:
> > [ 13.711971] [<f8c39e58>] mmc_queue_map_sg+0x28/0xc0 [mmc_block]
> > [ 13.712085] [<f8c396c9>] mmc_blk_issue_rq+0x199/0x780 [mmc_block]
> > [ 13.712193] [<f8c3a168>] mmc_queue_thread+0x78/0xe0 [mmc_block]

Seems to be in the handling of the bounce buffer. I don't see how any
of the parameters to blk_rq_map_sg() could be incorrect though, so I
suspect the problem is not in the mmc layer.

Don, is MMC_BLOCK_BOUNCE enabled? Could you try toggling it and see if
things change?

>
> This could be due to git-block changes (or a lack of them ;))
>

There are no pending patches, or recent changes that mess about in any
real way in there.

Don, when did this work last?

Rgds
Pierre



2007-10-04 08:02:22

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 4 Oct 2007 09:25:15 +0200
Jens Axboe <[email protected]> wrote:

>
> It looks like missing init of the sg list in mmc, does this work?
>

Huh? Isn't the block layer supposed to fill in the entire thing? (i.e.
current contents shouldn't matter)

Rgds
Pierre



2007-10-04 08:05:15

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Pierre Ossman wrote:
> On Thu, 4 Oct 2007 09:25:15 +0200
> Jens Axboe <[email protected]> wrote:
>
> >
> > It looks like missing init of the sg list in mmc, does this work?
> >
>
> Huh? Isn't the block layer supposed to fill in the entire thing? (i.e.
> current contents shouldn't matter)

Yeah, but sg chaining requires that ->page be filled in properly or it
could confuse it. I think I'll add some debugging to catch that.

--
Jens Axboe
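
For readers following the register dump: the faulting address is consistent with slab poison being read as a chain pointer. 0x6b is the byte the slab debug code writes into freed objects (POISON_FREE), the `Code:` bytes at the fault decode to `mov 0x10(%esi),%eax` followed shortly by `and $0xfffffffe,%edx` and `test $0x1,%al`, and ESI holds 6b6b6b6a. The sketch below is a userspace model of that arithmetic only, assuming a 32-bit pointer field whose low bit marks a chain link. `struct toy_sg` and the `toy_*` helpers are hypothetical names, not the kernel's scatterlist code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte written into freed slab objects when slab debugging is on. */
#define TOY_POISON_FREE 0x6b

/* Hypothetical stand-in for a scatterlist entry; NOT the kernel layout. */
struct toy_sg {
	uint32_t page;		/* pointer-sized on i386; bit 0 tags a chain link */
	uint32_t offset;
	uint32_t length;
};

/* Model an entry that was never initialised after allocation. */
static void toy_sg_poison(struct toy_sg *sg)
{
	memset(sg, TOY_POISON_FREE, sizeof(*sg));
}

/* An entry is taken to be a chain link if the low bit of ->page is set. */
static int toy_sg_is_chain(const struct toy_sg *sg)
{
	return (sg->page & 1u) != 0;
}

/* Strip the tag bit to get the address of the "next table". */
static uint32_t toy_sg_chain_ptr(const struct toy_sg *sg)
{
	return sg->page & ~(uint32_t)1;
}

/* Address touched when a field at displacement 'disp' of the next entry is read. */
static uint32_t toy_fault_address(const struct toy_sg *sg, uint32_t disp)
{
	assert(toy_sg_is_chain(sg));
	return toy_sg_chain_ptr(sg) + disp;
}
```

With a poisoned entry, `toy_sg_chain_ptr()` yields 0x6b6b6b6a (the ESI value) and `toy_fault_address(&sg, 0x10)` yields 0x6b6b6b7a (the address in the BUG line). A zeroed entry has the low bit clear and is never followed as a chain link, which is the effect of the memset in the patch.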

2007-10-04 08:46:43

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 4 Oct 2007 10:06:32 +0200
Jens Axboe <[email protected]> wrote:

> On Thu, Oct 04 2007, Pierre Ossman wrote:
> >
> > Huh? Isn't the block layer supposed to fill in the entire thing?
> > (i.e. current contents shouldn't matter)
>
> Yeah, but sg chaining requires that ->page be filled in properly or it
> could confuse it. I think I'll add some debugging to catch that.
>

I assume sg_init_one() still can work on an uninitialized sg entry?

Rgds
Pierre



2007-10-04 09:38:49

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Pierre Ossman wrote:
> On Thu, 4 Oct 2007 10:06:32 +0200
> Jens Axboe <[email protected]> wrote:
>
> > On Thu, Oct 04 2007, Pierre Ossman wrote:
> > >
> > > Huh? Isn't the block layer supposed to fill in the entire thing?
> > > (i.e. current contents shouldn't matter)
> >
> > Yeah, but sg chaining requires that ->page be filled in properly or it
> > could confuse it. I think I'll add some debugging to catch that.
> >
>
> I assume sg_init_one() still can work on an uninitialized sg entry?

Yes, but only if that sg entry is not part of a chained list.

--
Jens Axboe

2007-10-04 10:24:53

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 4 Oct 2007 11:30:14 +0200
Jens Axboe <[email protected]> wrote:

> On Thu, Oct 04 2007, Pierre Ossman wrote:
> >
> > I assume sg_init_one() still can work on an uninitialized sg entry?
>
> Yes, but only if that sg entry is not part of a chained list.
>

Is that a yes or a no? You said that the ->page field was involved in
list chaining, so does or doesn't it have to be initialized before a
call to sg_init_one()?

Rgds
Pierre



2007-10-04 10:36:40

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Pierre Ossman wrote:
> On Thu, 4 Oct 2007 11:30:14 +0200
> Jens Axboe <[email protected]> wrote:
>
> > On Thu, Oct 04 2007, Pierre Ossman wrote:
> > >
> > > I assume sg_init_one() still can work on an uninitialized sg entry?
> >
> > Yes, but only if that sg entry is not part of a chained list.
> >
>
> Is that a yes or a no? You said that the ->page field was involved in

It's a conditional yes, re-read it :-)

> list chaining, so does or doesn't it have to be initialized before a
> call to sg_init_one()?

That's not the problem. It has to be initialized before calling
blk_rq_map_sg(). sg_init_one() will zero the entire sg entry, and that
breaks if that particular sg entry is part of a larger sg table AND that
sg entry happens to be the chain element.

--
Jens Axboe
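
The chain-element case described above can be modeled in userspace. This is a minimal sketch, assuming the link to the next table is stored in the entry's page field with the low bit set as a tag. `struct toy_sg` and all `toy_*` functions are hypothetical stand-ins, and `toy_sg_init_one()` only mimics the "zero the whole entry, then fill it in" behaviour mentioned in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a scatterlist entry; NOT the kernel layout. */
struct toy_sg {
	uintptr_t page;		/* buffer address, or tagged link to the next table */
	unsigned int offset;
	unsigned int length;
};

/* Turn 'last' into a chain element pointing at 'next_table'. */
static void toy_sg_chain(struct toy_sg *last, struct toy_sg *next_table)
{
	last->page = (uintptr_t)next_table | 1u;
	last->offset = 0;
	last->length = 0;
}

static int toy_sg_is_chain(const struct toy_sg *sg)
{
	return (sg->page & 1u) != 0;
}

static struct toy_sg *toy_sg_chain_ptr(const struct toy_sg *sg)
{
	return (struct toy_sg *)(sg->page & ~(uintptr_t)1);
}

/* Zero the entire entry, then describe a single (aligned) buffer. */
static void toy_sg_init_one(struct toy_sg *sg, const void *buf, unsigned int len)
{
	assert(((uintptr_t)buf & 1u) == 0);	/* low bit must stay free for the tag */
	memset(sg, 0, sizeof(*sg));
	sg->page = (uintptr_t)buf;
	sg->length = len;
}
```

In this model, calling `toy_sg_init_one()` on an ordinary entry is harmless, but calling it on the entry previously set up by `toy_sg_chain()` discards the link, so a walker would no longer reach the second table. That is the "conditional yes" in the exchange.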

2007-10-04 10:46:19

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 4 Oct 2007 12:38:05 +0200
Jens Axboe <[email protected]> wrote:

> On Thu, Oct 04 2007, Pierre Ossman wrote:
> >
> > Is that a yes or a no? You said that the ->page field was involved
> > in
>
> It's a conditional yes, re-read it :-)
>

I didn't get the memo about what chained sg entries entail.

> > list chaining, so does or doesn't it have to be initialized before a
> > call to sg_init_one()?
>
> That's not the problem. It has to be initialized before calling
> blk_rq_map_sg(). sg_init_one() will zero the entire sg entry, and that
> breaks if that particular sg entry is part of a larger sg table AND
> that sg entry happens to be the chain element.
>

Ok, then it shouldn't affect my world at least.

Rgds
Pierre

PS. Did someone forget to do a review of all blk_rq_map_sg() callers
before committing the chained list stuff? ;)



2007-10-04 10:58:11

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Pierre Ossman wrote:
> On Thu, 4 Oct 2007 12:38:05 +0200
> Jens Axboe <[email protected]> wrote:
>
> > On Thu, Oct 04 2007, Pierre Ossman wrote:
> > >
> > > Is that a yes or a no? You said that the ->page field was involved
> > > in
> >
> > It's a conditional yes, re-read it :-)
> >
>
> I didn't get the memo about what chained sg entries entail.

It's been posted here several times, but that's ok and it should not
matter. I just can't answer your question with a clear yes or no, since
it depends on certain situations.

> > > list chaining, so does or doesn't it have to be initialized before a
> > > call to sg_init_one()?
> >
> > That's not the problem. It has to be initialized before calling
> > blk_rq_map_sg(). sg_init_one() will zero the entire sg entry, and that
> > breaks if that particular sg entry is part of a larger sg table AND
> > that sg entry happens to be the chain element.
> >
>
> Ok, then it shouldn't affect my world at least.

No, I think mmc is fine, it just needed that memset.

> PS. Did someone forget to do a review of all blk_rq_map_sg() callers
> before committing the chained list stuff? ;)

Apparently this one got missed (and cciss), I'll do a new look just to
be on the safe side.

--
Jens Axboe

2007-10-04 16:21:47

by Don Mullis

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

This patch fixes the boot.


On Thu, 2007-10-04 at 09:25 +0200, Jens Axboe wrote:
> On Wed, Oct 03 2007, Andrew Morton wrote:
> > On Wed, 03 Oct 2007 23:11:02 -0700 Don Mullis <[email protected]> wrote:
> >
> > > OOPS followed by a 3 minute timeout, then completion of boot.
> > > Not seen if card (Kingston microSD adapter) is ejected; not seen in 2.6.23-rc8.
> > > Running on a Dell XPS M1330 laptop.
> > >
> > > `dmesg` reports:
> > >
> > > [ 13.695045] mmcblk0: mmc0:e95c SD02G 1966080KiB
> > > [ 13.695155] mmcblk0: p1
> > > [ 13.706907] BUG: unable to handle kernel paging request at virtual address 6b6b6b7a
> > > [ 13.707026] printing eip: c01f09f0 *pde = 00000000
> > > [ 13.707174] Oops: 0000 [#1] SMP
> > > [ 13.707326] last sysfs file: /class/mmc_host/mmc0/mmc0:e95c/serial
> > > [ 13.707389] Modules linked in: mmc_block sr_mod iwl4965 cdrom serio_raw mac80211 piix sdhci pcspkr psmouse ide_core iTCO_wdt iTCO_vendor_support watchdog_core watchdog_dev cfg80211 mmc_core shpchp pci_hotplug intel_agp agpgart battery ac power_supply button evdev ata_generic ext3 jbd mbcache sg sd_mod usbhid hid ahci ata_piix libata scsi_mod ohci1394 tg3 ieee1394 ehci_hcd uhci_hcd thermal processor fan fuse
> > > [ 13.709649]
> > > [ 13.709705] Pid: 4089, comm: mmcqd Not tainted (2.6.23-rc8-mm2 #27)
> > > [ 13.709767] EIP: 0060:[<c01f09f0>] EFLAGS: 00010206 CPU: 0
> > > [ 13.709831] EIP is at blk_rq_map_sg+0xc0/0x160
> > > [ 13.709889] EAX: 04b6a000 EBX: c4a030e0 ECX: 04b6b000 EDX: c1000000
> > > [ 13.709951] ESI: 6b6b6b6a EDI: c11535d0 EBP: c4971e30 ESP: c4971df4
> > > [ 13.710013] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> > > [ 13.710074] Process mmcqd (pid: 4089, ti=c4970000 task=c387b660 task.ti=c4970000)
> > > [ 13.710137] last branch before last exception/interrupt
> > > [ 13.710249] from c0129bf8 (vprintk+0x1d8/0x340)
> > > [ 13.710359] to c0129c9c (vprintk+0x27c/0x340)
> > > [ 13.710453] Stack: c4971e08 c013d85f c48a0440 00002000 04b6c000 c3914180 00000001 00000001
> > > [ 13.710920] 00001000 00000000 c4923700 01000000 c48ca3a0 c48f6d88 c48f6d88 c4971e40
> > > [ 13.711390] f8c39e58 c48f6d88 c4a9f020 c4971fbc f8c396c9 c017de04 00000004 c4971e84
> > > [ 13.711857] Call Trace:
> > > [ 13.711971] [<f8c39e58>] mmc_queue_map_sg+0x28/0xc0 [mmc_block]
> > > [ 13.712085] [<f8c396c9>] mmc_blk_issue_rq+0x199/0x780 [mmc_block]
> > > [ 13.712193] [<f8c3a168>] mmc_queue_thread+0x78/0xe0 [mmc_block]
> > > [ 13.712309] [<c013d382>] kthread+0x42/0x70
> > > [ 13.712415] [<c0104e73>] kernel_thread_helper+0x7/0x14
> > > [ 13.712523] =======================
> > > [ 13.712584] Code: 0c 03 4f 08 8b 7f 04 01 cf 89 7d d4 8b 3b 89 f8 29 d0 c1 f8 03 69 c0 39 8e e3 38 c1 e0 0c 03 43 08 39 45 d4 74 73 90 8d 74 26 00 <8b> 46 10 8d 4e 10 89 3e 89 c2 83 e2 fe a8 01 8b 45 e4 0f 45 ca
> > > [ 13.715555] EIP: [<c01f09f0>] blk_rq_map_sg+0xc0/0x160 SS:ESP 0068:c4971df4
> > > [ 13.845668] Synaptics Touchpad, model: 1, fw: 6.3, id: 0x1c0b1, caps: 0xa04753/0x200000
> > > [ 13.879914] input: SynPS/2 Synaptics TouchPad as /class/input/input7
> > > [ 192.162711] Adding 2731008k swap on /dev/sda7. Priority:-1 extents:1 across:2731008k
> >
> > This could be due to git-block changes (or a lack of them ;))
>
> It looks like missing init of the sg list in mmc, does this work?
>
> --- linux-2.6.23-rc8/drivers/mmc/card/queue.c~ 2007-10-04 09:22:02.000000000 +0200
> +++ linux-2.6.23-rc8/drivers/mmc/card/queue.c 2007-10-04 09:23:13.000000000 +0200
> @@ -334,14 +334,18 @@ static void copy_sg(struct scatterlist *
>
>  unsigned int mmc_queue_map_sg(struct mmc_queue *mq)
>  {
> +	struct request *rq = mq->req;
>  	unsigned int sg_len;
>
> -	if (!mq->bounce_buf)
> -		return blk_rq_map_sg(mq->queue, mq->req, mq->sg);
> +	if (!mq->bounce_buf) {
> +		memset(mq->sg, 0, rq->nr_hw_segments * sizeof(struct scatterlist));
> +		return blk_rq_map_sg(mq->queue, rq, mq->sg);
> +	}
>
>  	BUG_ON(!mq->bounce_sg);
>
> -	sg_len = blk_rq_map_sg(mq->queue, mq->req, mq->bounce_sg);
> +	memset(mq->bounce_sg, 0, rq->nr_hw_segments * sizeof(struct scatterlist));
> +	sg_len = blk_rq_map_sg(mq->queue, rq, mq->bounce_sg);
>
>  	mq->bounce_sg_len = sg_len;
>
>

2007-10-04 16:36:17

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 04 Oct 2007 09:19:40 -0700
Don Mullis <[email protected]> wrote:

> This patch fixes the boot.
>

Fantastic. I will try to get this upstream then.

> >
> > It looks like missing init of the sg list in mmc, does this work?
> >

Jens, is this zeroing needed for each invocation, or really just once
to get the list in a known state?

Also, is chaining already upstream so Linus should have this for 2.6.23?

Rgds
Pierre



2007-10-04 16:41:07

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Pierre Ossman wrote:
> On Thu, 04 Oct 2007 09:19:40 -0700
> Don Mullis <[email protected]> wrote:
>
> > This patch fixes the boot.
> >
>
> Fantastic. I will try to get this upstream then.

I already put it in the sgchain drivers part. If you could please ack
it, that would be nice :-). I have a bunch of driver
updates/work-arounds there.

> > > It looks like missing init of the sg list in mmc, does this work?
> > >
>
> Jens, is this zeroing needed for each invocation, or really just once
> to get the list in a known state?

Once should actually be enough, so you could move it to init as well.
Don, care to verify with the below patch as well?

> Also, is chaining already upstream so Linus should have this for 2.6.23?

No, it's for 2.6.24.

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index b0abc7d..a5d0354 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -153,14 +153,14 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
blk_queue_max_hw_segments(mq->queue, bouncesz / 512);
blk_queue_max_segment_size(mq->queue, bouncesz);

- mq->sg = kmalloc(sizeof(struct scatterlist),
+ mq->sg = kzalloc(sizeof(struct scatterlist),
GFP_KERNEL);
if (!mq->sg) {
ret = -ENOMEM;
goto cleanup_queue;
}

- mq->bounce_sg = kmalloc(sizeof(struct scatterlist) *
+ mq->bounce_sg = kzalloc(sizeof(struct scatterlist) *
bouncesz / 512, GFP_KERNEL);
if (!mq->bounce_sg) {
ret = -ENOMEM;
@@ -177,7 +177,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
blk_queue_max_hw_segments(mq->queue, host->max_hw_segs);
blk_queue_max_segment_size(mq->queue, host->max_seg_size);

- mq->sg = kmalloc(sizeof(struct scatterlist) *
+ mq->sg = kzalloc(sizeof(struct scatterlist) *
host->max_phys_segs, GFP_KERNEL);
if (!mq->sg) {
ret = -ENOMEM;

--
Jens Axboe
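
Why zeroing once at allocation should be enough can be sketched in userspace, with calloc() standing in for kzalloc(). This assumes, as in the earlier discussion, that a set low bit in the page field marks a chain link and that the mapping code only ever stores aligned addresses there. All `toy_*` names are hypothetical and do not correspond to functions in queue.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scatterlist entry; NOT the kernel layout. */
struct toy_sg {
	uintptr_t page;		/* low bit set would mean "chain link" */
	unsigned int offset;
	unsigned int length;
};

/* calloc() plays the role of kzalloc(): the table starts out all-zero. */
static struct toy_sg *toy_sg_table_alloc(size_t nents)
{
	return calloc(nents, sizeof(struct toy_sg));
}

/* Count entries that a walker would mistake for chain links. */
static size_t toy_sg_stray_chain_bits(const struct toy_sg *tbl, size_t nents)
{
	size_t i, stray = 0;

	for (i = 0; i < nents; i++)
		if (tbl[i].page & 1u)
			stray++;
	return stray;
}

/* Model one mapping pass: overwrite the first 'used' entries in place. */
static void toy_sg_map(struct toy_sg *tbl, size_t used,
		       const uint32_t *bufs, unsigned int len)
{
	size_t i;

	for (i = 0; i < used; i++) {
		assert(((uintptr_t)&bufs[i] & 1u) == 0);	/* aligned address */
		tbl[i].page = (uintptr_t)&bufs[i];
		tbl[i].offset = 0;
		tbl[i].length = len;
	}
}
```

Because each pass in this model writes only aligned addresses, a table that starts all-zero never acquires a stray chain bit, however many requests reuse it. Under that assumption the per-call memset from the first patch and the one-time kzalloc() behave the same.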

2007-10-04 16:47:09

by Pierre Ossman

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, 4 Oct 2007 18:42:25 +0200
Jens Axboe <[email protected]> wrote:

>
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index b0abc7d..a5d0354 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c

Acked-by: Pierre Ossman <[email protected]>

(Provided it works ;))

I have no patches touching queue.c in my tree, so there should be no
problems with merge conflicts.

Rgds
Pierre



2007-10-04 17:08:28

by Don Mullis

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

That patch boots without complaint as well.

BTW, the earlier failure messages did not make it
into /var/log/messages, only the dmesg buffer.
This is with standard Ubuntu Gutsy logging levels.


On Thu, 2007-10-04 at 18:42 +0200, Jens Axboe wrote:
> On Thu, Oct 04 2007, Pierre Ossman wrote:
> > On Thu, 04 Oct 2007 09:19:40 -0700
> > Don Mullis <[email protected]> wrote:
> >
> > > This patch fixes the boot.
> > >
> >
> > Fantastic. I will try to get this upstream then.
>
> I already put it in the sgchain drivers part. If you could please ack
> it, that would be nice :-). I have a bunch of driver
> updates/work-arounds there.
>
> > > > It looks like missing init of the sg list in mmc, does this work?
> > > >
> >
> > Jens, is this zeroing needed for each invocation, or really just once
> > to get the list in a known state?
>
> Once should actually be enough, so you could move it to init as well.
> Don, care to verify with the below patch as well?
>
> > Also, is chaining already upstream so Linus should have this for 2.6.23?
>
> No, it's for 2.6.24.
>
> diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
> index b0abc7d..a5d0354 100644
> --- a/drivers/mmc/card/queue.c
> +++ b/drivers/mmc/card/queue.c
> @@ -153,14 +153,14 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
> blk_queue_max_hw_segments(mq->queue, bouncesz / 512);
> blk_queue_max_segment_size(mq->queue, bouncesz);
>
> - mq->sg = kmalloc(sizeof(struct scatterlist),
> + mq->sg = kzalloc(sizeof(struct scatterlist),
> GFP_KERNEL);
> if (!mq->sg) {
> ret = -ENOMEM;
> goto cleanup_queue;
> }
>
> - mq->bounce_sg = kmalloc(sizeof(struct scatterlist) *
> + mq->bounce_sg = kzalloc(sizeof(struct scatterlist) *
> bouncesz / 512, GFP_KERNEL);
> if (!mq->bounce_sg) {
> ret = -ENOMEM;
> @@ -177,7 +177,7 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
> blk_queue_max_hw_segments(mq->queue, host->max_hw_segs);
> blk_queue_max_segment_size(mq->queue, host->max_seg_size);
>
> - mq->sg = kmalloc(sizeof(struct scatterlist) *
> + mq->sg = kzalloc(sizeof(struct scatterlist) *
> host->max_phys_segs, GFP_KERNEL);
> if (!mq->sg) {
> ret = -ENOMEM;
>

2007-10-04 18:08:55

by Jens Axboe

Subject: Re: 2.6.23-rc8-mm2: OOPS in mmc on boot

On Thu, Oct 04 2007, Don Mullis wrote:
> That patch boots without complaint as well.
>
> BTW, the earlier failure messages did not make it
> into /var/log/messages, only the dmesg buffer.
> This is with standard Ubuntu Gutsy logging levels.

Super, thanks for retesting!

--
Jens Axboe