2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
System is a Dell PowerEdge with a PERC 2/DC controller and a RAID1 volume.
From 2.6.23-rc9:
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PIIX4: IDE controller at PCI slot 0000:00:07.1
eth1: Optical link UP (Full Duplex, Flow Control: )
PIIX4: chipset revision 1
PIIX4: not 100% native mode: will probe irqs later
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
hdc: SAMSUNG SC-140B, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hdc: ATAPI 40X CD-ROM drive, 128kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ide-floppy driver 0.99.newide
ACPI: PCI Interrupt 0000:00:0d.1[A] -> GSI 17 (level, low) -> IRQ 18
megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
megaraid: [1.06:1p00] detected 1 logical drives.
megaraid: channel[0] is raid.
megaraid: channel[1] is raid.
scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
scsi 0:0:0:0: Direct-Access MegaRAID LD0 RAID1 8568R 1.06 PQ: 0 ANSI: 2
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
st: Version 20070203, fixed bufsize 32768, s/g segs 256
sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1
sda: p1 exceeds device capacity
sd 0:0:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input1
input: AT Translated Set 2 keyboard as /class/input/input2
i2c /dev entries driver
piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
NET: Registered protocol family 26
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
Starting balanced_irq
Using IPI Shortcut mode
attempt to access beyond end of device
sda: rw=0, want=67, limit=1
EXT3-fs: unable to read superblock
attempt to access beyond end of device
sda: rw=0, want=67, limit=1
EXT2-fs: unable to read superblock
attempt to access beyond end of device
sda: rw=0, want=129, limit=1
isofs_fill_super: bread failed, dev=sda1, iso_blknum=16, block=32
attempt to access beyond end of device
sda: rw=0, want=131, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542979, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541955, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541731, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542971, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541947, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541723, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542379, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541355, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541131, limit=1
attempt to access beyond end of device
sda: rw=0, want=17542371, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541347, limit=1
attempt to access beyond end of device
sda: rw=0, want=17541123, limit=1
attempt to access beyond end of device
sda: rw=0, want=14394267, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393243, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393019, limit=1
attempt to access beyond end of device
sda: rw=0, want=14394259, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393235, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393011, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393667, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392643, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392419, limit=1
attempt to access beyond end of device
sda: rw=0, want=14393659, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392635, limit=1
attempt to access beyond end of device
sda: rw=0, want=14392411, limit=1
attempt to access beyond end of device
sda: rw=0, want=1315, limit=1
attempt to access beyond end of device
sda: rw=0, want=1091, limit=1
UDF-fs: No partition found (1)
List of all partitions:
1600 4194302 hdc driver: ide-cdrom
0800 0 sda driver: sd
0801 8771458 sda1
No filesystem could mount root, tried: ext3 ext2 iso9660 udf
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,1)
From 2.6.22.9:
megaraid: found 0x8086:0x1960:bus 0:slot 13:func 1
scsi0:Found MegaRAID controller at 0xf8812000, IRQ:18
megaraid: [1.06:1p00] detected 1 logical drives.
megaraid: channel[0] is raid.
megaraid: channel[1] is raid.
scsi0 : LSI Logic MegaRAID 1.06 254 commands 16 targs 5 chans 7 luns
scsi0: scanning scsi channel 0 for logical drives.
scsi 0:0:0:0: Direct-Access MegaRAID LD0 RAID1 8568R 1.06 PQ: 0 ANSI: 2
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
st: Version 20070203, fixed bufsize 32768, s/g segs 256
sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1
sd 0:0:0:0: [sda] Attached SCSI disk
PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input1
input: AT Translated Set 2 keyboard as /class/input/input2
i2c /dev entries driver
piix4_smbus 0000:00:07.3: Found 0000:00:07.3 device
piix4_smbus 0000:00:07.3: Host SMBus controller not enabled!
NET: Registered protocol family 26
TCP cubic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
Starting balanced_irq
Using IPI Shortcut mode
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 260k freed
EXT3 FS on sda1, internal journal
00:0d.1 I2O: Intel Corporation 80960RP [i960RP Microprocessor] (rev 02) (prog-if 01)
Subsystem: Dell PowerEdge Expandable RAID Controller 2/DC
Flags: bus master, medium devsel, latency 64, IRQ 18
Memory at f7000000 (32-bit, prefetchable) [size=4M]
[virtual] Expansion ROM at 50000000 [disabled] [size=32K]
Capabilities: <access denied>
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y
CONFIG_SCSI_LOGGING=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_MEGARAID_LEGACY=y
--
Burton Windle [email protected]
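
The want/limit values in the 2.6.23-rc9 log above are consistent with a disk whose capacity was read as a single sector. The sketch below reproduces that arithmetic. It assumes that sda1 starts at sector 63, which the report does not state, and the function names are made up for illustration; this is not the kernel's code.

```c
#include <stdint.h>

/* Assumption: sda1 starts at sector 63 (not stated in the report). */
#define PART_START 63ULL
#define SECTOR_SIZE 512U

/*
 * First whole-disk sector past the end of a read of one filesystem
 * block, i.e. the "want" value printed in the log.
 */
unsigned long long want_sector(unsigned long long fs_block,
                               unsigned int block_size)
{
    unsigned int sectors_per_block = block_size / SECTOR_SIZE;

    return PART_START + fs_block * sectors_per_block + sectors_per_block;
}

/* A request is rejected when it ends past the device capacity ("limit"). */
int beyond_end(unsigned long long want, unsigned long long limit)
{
    return want > limit;
}
```

Under that assumption, the ext2/ext3 superblock (1 KiB block 1) gives want=67 and the isofs read of block=32 gives want=129, as in the log. The end of sda1 (8771458 blocks of 1 KiB) falls at 63 + 2 * 8771458 = 17542979, the largest want value logged. With limit=1 every one of these reads fails; with the 17547264 sectors reported by 2.6.22.9 none would.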
Cc's added, the complete bug report is at
http://lkml.org/lkml/2007/10/2/243
On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
>
> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
>...
Thanks for your report.
Diff'ing the dmesg's shows:
<-- snip -->
scsi0: scanning scsi channel 4 [P0] for physical devices.
scsi0: scanning scsi channel 5 [P1] for physical devices.
st: Version 20070203, fixed bufsize 32768, s/g segs 256
-sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
+sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
+sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
-sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
+sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
+sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Asking for cache data failed
sd 0:0:0:0: [sda] Assuming drive cache: write through
sda: sda1
+ sda: p1 exceeds device capacity
<-- snip -->
Does reverting the commit below fix the problem?
cu
Adrian
commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
Author: FUJITA Tomonori <[email protected]>
Date: Mon May 14 20:17:27 2007 +0900
[SCSI] megaraid_old: convert to use the data buffer accessors
- remove the unnecessary map_single path.
- convert to use the new accessors for the sg lists and the
parameters.
Jens Axboe <[email protected]> did the for_each_sg cleanup.
Signed-off-by: FUJITA Tomonori <[email protected]>
Acked-by: Sumant Patro <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 40ee07d..3907f67 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -523,10 +523,8 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)
/*
* filter the internal and ioctl commands
*/
- if((cmd->cmnd[0] == MEGA_INTERNAL_CMD)) {
- return cmd->request_buffer;
- }
-
+ if((cmd->cmnd[0] == MEGA_INTERNAL_CMD))
+ return (scb_t *)cmd->host_scribble;
/*
* We know what channels our logical drives are on - mega_find_card()
@@ -657,22 +655,14 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int *busy)
case MODE_SENSE: {
char *buf;
+ struct scatterlist *sg;
- if (cmd->use_sg) {
- struct scatterlist *sg;
+ sg = scsi_sglist(cmd);
+ buf = kmap_atomic(sg->page, KM_IRQ0) + sg->offset;
- sg = (struct scatterlist *)cmd->request_buffer;
- buf = kmap_atomic(sg->page, KM_IRQ0) +
- sg->offset;
- } else
- buf = cmd->request_buffer;
memset(buf, 0, cmd->cmnd[4]);
- if (cmd->use_sg) {
- struct scatterlist *sg;
+ kunmap_atomic(buf - sg->offset, KM_IRQ0);
- sg = (struct scatterlist *)cmd->request_buffer;
- kunmap_atomic(buf - sg->offset, KM_IRQ0);
- }
cmd->result = (DID_OK << 16);
cmd->scsi_done(cmd);
return NULL;
@@ -1551,23 +1541,15 @@ mega_cmd_done(adapter_t *adapter, u8 completed[], int nstatus, int status)
islogical = adapter->logdrv_chan[cmd->device->channel];
if( cmd->cmnd[0] == INQUIRY && !islogical ) {
- if( cmd->use_sg ) {
- sgl = (struct scatterlist *)
- cmd->request_buffer;
-
- if( sgl->page ) {
- c = *(unsigned char *)
+ sgl = scsi_sglist(cmd);
+ if( sgl->page ) {
+ c = *(unsigned char *)
page_address((&sgl[0])->page) +
(&sgl[0])->offset;
- }
- else {
- printk(KERN_WARNING
- "megaraid: invalid sg.\n");
- c = 0;
- }
- }
- else {
- c = *(u8 *)cmd->request_buffer;
+ } else {
+ printk(KERN_WARNING
+ "megaraid: invalid sg.\n");
+ c = 0;
}
if(IS_RAID_CH(adapter, cmd->device->channel) &&
@@ -1704,30 +1686,14 @@ mega_rundoneq (adapter_t *adapter)
static void
mega_free_scb(adapter_t *adapter, scb_t *scb)
{
- unsigned long length;
-
switch( scb->dma_type ) {
case MEGA_DMA_TYPE_NONE:
break;
- case MEGA_BULK_DATA:
- if (scb->cmd->use_sg == 0)
- length = scb->cmd->request_bufflen;
- else {
- struct scatterlist *sgl =
- (struct scatterlist *)scb->cmd->request_buffer;
- length = sgl->length;
- }
- pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
- length, scb->dma_direction);
- break;
-
case MEGA_SGLIST:
- pci_unmap_sg(adapter->dev, scb->cmd->request_buffer,
- scb->cmd->use_sg, scb->dma_direction);
+ scsi_dma_unmap(scb->cmd);
break;
-
default:
break;
}
@@ -1767,80 +1733,33 @@ __mega_busywait_mbox (adapter_t *adapter)
static int
mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
{
- struct scatterlist *sgl;
- struct page *page;
- unsigned long offset;
- unsigned int length;
+ struct scatterlist *sg;
Scsi_Cmnd *cmd;
int sgcnt;
int idx;
cmd = scb->cmd;
- /* Scatter-gather not used */
- if( cmd->use_sg == 0 || (cmd->use_sg == 1 &&
- !adapter->has_64bit_addr)) {
-
- if (cmd->use_sg == 0) {
- page = virt_to_page(cmd->request_buffer);
- offset = offset_in_page(cmd->request_buffer);
- length = cmd->request_bufflen;
- } else {
- sgl = (struct scatterlist *)cmd->request_buffer;
- page = sgl->page;
- offset = sgl->offset;
- length = sgl->length;
- }
-
- scb->dma_h_bulkdata = pci_map_page(adapter->dev,
- page, offset,
- length,
- scb->dma_direction);
- scb->dma_type = MEGA_BULK_DATA;
-
- /*
- * We need to handle special 64-bit commands that need a
- * minimum of 1 SG
- */
- if( adapter->has_64bit_addr ) {
- scb->sgl64[0].address = scb->dma_h_bulkdata;
- scb->sgl64[0].length = length;
- *buf = (u32)scb->sgl_dma_addr;
- *len = (u32)length;
- return 1;
- }
- else {
- *buf = (u32)scb->dma_h_bulkdata;
- *len = (u32)length;
- }
- return 0;
- }
-
- sgl = (struct scatterlist *)cmd->request_buffer;
-
/*
* Copy Scatter-Gather list info into controller structure.
*
* The number of sg elements returned must not exceed our limit
*/
- sgcnt = pci_map_sg(adapter->dev, sgl, cmd->use_sg,
- scb->dma_direction);
+ sgcnt = scsi_dma_map(cmd);
scb->dma_type = MEGA_SGLIST;
- BUG_ON(sgcnt > adapter->sglen);
+ BUG_ON(sgcnt > adapter->sglen || sgcnt < 0);
*len = 0;
- for( idx = 0; idx < sgcnt; idx++, sgl++ ) {
-
- if( adapter->has_64bit_addr ) {
- scb->sgl64[idx].address = sg_dma_address(sgl);
- *len += scb->sgl64[idx].length = sg_dma_len(sgl);
- }
- else {
- scb->sgl[idx].address = sg_dma_address(sgl);
- *len += scb->sgl[idx].length = sg_dma_len(sgl);
+ scsi_for_each_sg(cmd, sg, sgcnt, idx) {
+ if (adapter->has_64bit_addr) {
+ scb->sgl64[idx].address = sg_dma_address(sg);
+ *len += scb->sgl64[idx].length = sg_dma_len(sg);
+ } else {
+ scb->sgl[idx].address = sg_dma_address(sg);
+ *len += scb->sgl[idx].length = sg_dma_len(sg);
}
}
@@ -4494,7 +4413,7 @@ mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
scmd->device = sdev;
scmd->device->host = adapter->host;
- scmd->request_buffer = (void *)scb;
+ scmd->host_scribble = (void *)scb;
scmd->cmnd[0] = MEGA_INTERNAL_CMD;
scb->state |= SCB_ACTIVE;
On Tue, 2 Oct 2007, Adrian Bunk wrote:
> Cc's added, the complete bug report is at
> http://lkml.org/lkml/2007/10/2/243
>
> On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
>> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
>>
>> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
>> ...
>
> Thanks for your report.
>
> Does reverting the commit below fix the problem?
>
> cu
> Adrian
>
>
> commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
> Author: FUJITA Tomonori <[email protected]>
> Date: Mon May 14 20:17:27 2007 +0900
>
Confirmed; reverting the above (snipped) patch does fix the issue.
--
Burton Windle [email protected]
On Tuesday, 2 October 2007 20:46, Burton Windle wrote:
> On Tue, 2 Oct 2007, Adrian Bunk wrote:
>
> > Cc's added, the complete bug report is at
> > http://lkml.org/lkml/2007/10/2/243
> >
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> >> 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> >>
> >> System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> >> ...
> >
> > Thanks for your report.
> >
> > Does reverting the commit below fix the problem?
> >
> > cu
> > Adrian
> >
> >
> > commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e
> > Author: FUJITA Tomonori <[email protected]>
> > Date: Mon May 14 20:17:27 2007 +0900
> >
>
> Confirmed; reverting the above (snipped) patch does fix the issue.
I've created a bugzilla entry for your report at:
http://bugzilla.kernel.org/show_bug.cgi?id=9113
Please add a summary of your observations in there.
Greetings,
Rafael
On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> Cc's added, the complete bug report is at
> http://lkml.org/lkml/2007/10/2/243
>
> On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> >
> > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> >...
>
> Thanks for your report.
>
> Diff'ing the dmesg's shows:
>
> <-- snip -->
>
> scsi0: scanning scsi channel 4 [P0] for physical devices.
> scsi0: scanning scsi channel 5 [P1] for physical devices.
> st: Version 20070203, fixed bufsize 32768, s/g segs 256
> -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Asking for cache data failed
> sd 0:0:0:0: [sda] Assuming drive cache: write through
> -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Asking for cache data failed
> sd 0:0:0:0: [sda] Assuming drive cache: write through
> sda: sda1
> + sda: p1 exceeds device capacity
>
> <-- snip -->
>
> - case MEGA_BULK_DATA:
> - if (scb->cmd->use_sg == 0)
> - length = scb->cmd->request_bufflen;
> - else {
> - struct scatterlist *sgl =
> - (struct scatterlist *)scb->cmd->request_buffer;
> - length = sgl->length;
> - }
> - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> - length, scb->dma_direction);
> - break;
> -
This is the problem piece I think. We've reintroduced a very old bug:
commit 51c928c34fa7cff38df584ad01de988805877dba
Author: James Bottomley <[email protected]>
Date: Sat Oct 1 09:38:05 2005 -0500
[SCSI] Legacy MegaRAID: Fix READ CAPACITY
Some Legacy megaraid cards can't actually cope with the scatter/gather
version of the READ CAPACITY command (which is what we now send them
since altering all SCSI internal I/O to go via the block layer). Fix
this (and a few other broken megaraid driver assumptions) by sending
the non-sg version of the command if the sg list only has a single
element.
Signed-off-by: James Bottomley <[email protected]>
So what we have to do is put back the check for use_sg == 1 and send
that as a bulk transfer command.
James
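
For reference, the "Sector size 0 reported" and "1 512-byte hardware sectors" lines are what one would expect if the READ CAPACITY(10) response buffer came back all zeroes. The sketch below decodes the 8-byte response (last LBA, then block length, both big-endian). The struct and function names are hypothetical, and this is a user-space illustration rather than the sd driver's code.

```c
#include <stdint.h>

/* Hypothetical container for the decoded values. */
struct capacity {
    uint64_t sectors;      /* last LBA + 1 */
    uint32_t sector_size;  /* block length in bytes */
};

/* Load a big-endian 32-bit value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Decode READ CAPACITY(10) data: bytes 0-3 last LBA, bytes 4-7 block length. */
struct capacity decode_read_capacity10(const uint8_t data[8])
{
    struct capacity c;

    c.sectors = (uint64_t)be32(data) + 1;
    c.sector_size = be32(data + 4);
    return c;
}
```

An all-zero buffer decodes to 1 sector of size 0, matching the failing log. A buffer of 01 0B BF FF 00 00 02 00 decodes to 17547264 sectors of 512 bytes, matching the 2.6.22.9 log.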
On Tue, 02 Oct 2007 15:38:13 -0500
James Bottomley <[email protected]> wrote:
> On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > Cc's added, the complete bug report is at
> > http://lkml.org/lkml/2007/10/2/243
> >
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > >
> > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > >...
> >
> > Thanks for your report.
> >
> > Diff'ing the dmesg's shows:
> >
> > <-- snip -->
> >
> > scsi0: scanning scsi channel 4 [P0] for physical devices.
> > scsi0: scanning scsi channel 5 [P1] for physical devices.
> > st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Asking for cache data failed
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Asking for cache data failed
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sda: sda1
> > + sda: p1 exceeds device capacity
> >
> > <-- snip -->
> >
> > - case MEGA_BULK_DATA:
> > - if (scb->cmd->use_sg == 0)
> > - length = scb->cmd->request_bufflen;
> > - else {
> > - struct scatterlist *sgl =
> > - (struct scatterlist *)scb->cmd->request_buffer;
> > - length = sgl->length;
> > - }
> > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > - length, scb->dma_direction);
> > - break;
> > -
>
> This is the problem piece I think. We've reintroduced a very old bug:
>
> commit 51c928c34fa7cff38df584ad01de988805877dba
> Author: James Bottomley <[email protected]>
> Date: Sat Oct 1 09:38:05 2005 -0500
>
> [SCSI] Legacy MegaRAID: Fix READ CAPACITY
>
> Some Legacy megaraid cards can't actually cope with the scatter/gather
> version of the READ CAPACITY command (which is what we now send them
> since altering all SCSI internal I/O to go via the block layer). Fix
> this (and a few other broken megaraid driver assumptions) by sending
> the non-sg version of the command if the sg list only has a single
> element.
>
> Signed-off-by: James Bottomley <[email protected]>
>
> So what we have to do is put back the check for use_sg == 1 and send
> that as a bulk transfer command.
Sorry again. The sg count needs to be checked before the DMA mapping.
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3907f67..ae0b220 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -1737,9 +1737,12 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
Scsi_Cmnd *cmd;
int sgcnt;
int idx;
+ int bulkdata;
cmd = scb->cmd;
+ bulkdata = (scsi_sg_count(cmd) == 1) ? 1 : 0;
+
/*
* Copy Scatter-Gather list info into controller structure.
*
@@ -1753,6 +1756,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
*len = 0;
+ if (bulkdata && !adapter->has_64bit_addr) {
+ sg = scsi_sglist(cmd);
+ scb->dma_h_bulkdata = sg_dma_address(sg);
+ *buf = (u32)scb->dma_h_bulkdata;
+ *len = sg_dma_len(sg);
+ return 0;
+ }
+
scsi_for_each_sg(cmd, sg, sgcnt, idx) {
if (adapter->has_64bit_addr) {
scb->sgl64[idx].address = sg_dma_address(sg);
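
As a reading aid for the patch above, here is a simplified user-space model of the decision it restores: a single-element list on an adapter without 64-bit addressing is handed over as one plain buffer (return 0), anything else as a scatter/gather list. The types, MAX_SG, and the final "*buf = scb->sgl_dma_addr" step are assumptions for illustration; the real driver also keeps separate sgl and sgl64 arrays, which this model folds into one.

```c
#include <stdint.h>

#define MAX_SG 8 /* hypothetical limit, stands in for adapter->sglen */

/* Hypothetical stand-in for an already DMA-mapped scatterlist element. */
struct sg_entry {
    uint32_t dma_addr;
    uint32_t dma_len;
};

/* Hypothetical stand-in for the parts of scb_t used here. */
struct model_scb {
    struct sg_entry sgl[MAX_SG];
    uint32_t sgl_dma_addr; /* bus address of the sgl[] array itself */
};

/*
 * Returns 0 for a bulk (non-sg) transfer, the element count for an
 * sg transfer, or -1 if the list does not fit.
 */
int model_build_sglist(struct model_scb *scb, const struct sg_entry *sg,
                       int sg_count, int has_64bit_addr,
                       uint32_t *buf, uint32_t *len)
{
    int idx;

    if (sg_count < 1 || sg_count > MAX_SG)
        return -1;

    *len = 0;

    /* Single element, 32-bit adapter: point straight at the data. */
    if (sg_count == 1 && !has_64bit_addr) {
        *buf = sg[0].dma_addr;
        *len = sg[0].dma_len;
        return 0;
    }

    /* Otherwise copy every element into the controller's list. */
    for (idx = 0; idx < sg_count; idx++) {
        scb->sgl[idx] = sg[idx];
        *len += sg[idx].dma_len;
    }
    *buf = scb->sgl_dma_addr;
    return sg_count;
}
```

In this model a READ CAPACITY with one 8-byte element returns 0 with buf pointing at the data, which is the non-sg form the older cards are said to need.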
On Tue, 02 Oct 2007 15:38:13 -0500
James Bottomley <[email protected]> wrote:
> On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > Cc's added, the complete bug report is at
> > http://lkml.org/lkml/2007/10/2/243
> >
> > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > >
> > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > >...
> >
> > Thanks for your report.
> >
> > Diff'ing the dmesg's shows:
> >
> > <-- snip -->
> >
> > scsi0: scanning scsi channel 4 [P0] for physical devices.
> > scsi0: scanning scsi channel 5 [P1] for physical devices.
> > st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Asking for cache data failed
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > sd 0:0:0:0: [sda] Write Protect is off
> > sd 0:0:0:0: [sda] Asking for cache data failed
> > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > sda: sda1
> > + sda: p1 exceeds device capacity
> >
> > <-- snip -->
> >
> > - case MEGA_BULK_DATA:
> > - if (scb->cmd->use_sg == 0)
> > - length = scb->cmd->request_bufflen;
> > - else {
> > - struct scatterlist *sgl =
> > - (struct scatterlist *)scb->cmd->request_buffer;
> > - length = sgl->length;
> > - }
> > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > - length, scb->dma_direction);
> > - break;
> > -
>
> This is the problem piece I think. We've reintroduced a very old bug:
>
> commit 51c928c34fa7cff38df584ad01de988805877dba
> Author: James Bottomley <[email protected]>
> Date: Sat Oct 1 09:38:05 2005 -0500
>
> [SCSI] Legacy MegaRAID: Fix READ CAPACITY
>
> Some Legacy megaraid cards can't actually cope with the scatter/gather
> version of the READ CAPACITY command (which is what we now send them
> since altering all SCSI internal I/O to go via the block layer). Fix
> this (and a few other broken megaraid driver assumptions) by sending
> the non-sg version of the command if the sg list only has a single
> element.
>
> Signed-off-by: James Bottomley <[email protected]>
>
> So what we have to do is put back the check for use_sg == 1 and send
> that as a bulk transfer command.
Sorry about this. Does the patch below fix the problem?
Thanks,
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3907f67..da56163 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
*len = 0;
+ if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
+ sg = scsi_sglist(cmd);
+ scb->dma_h_bulkdata = sg_dma_address(sg);
+ *buf = (u32)scb->dma_h_bulkdata;
+ *len = sg_dma_len(sg);
+ return 0;
+ }
+
scsi_for_each_sg(cmd, sg, sgcnt, idx) {
if (adapter->has_64bit_addr) {
scb->sgl64[idx].address = sg_dma_address(sg);
> -----Original Message-----
> From: FUJITA Tomonori [mailto:[email protected]]
> Sent: Tuesday, October 02, 2007 5:01 PM
> To: [email protected]
> Cc: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; Patro, Sumant; DL-MegaRAID
> Linux; [email protected]
> Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
>
> On Tue, 02 Oct 2007 15:38:13 -0500
> James Bottomley <[email protected]> wrote:
>
> > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > > Cc's added, the complete bug report is at
> > > http://lkml.org/lkml/2007/10/2/243
> > >
> > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > > >
> > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > > >...
> > >
> > > Thanks for your report.
> > >
> > > Diff'ing the dmesg's shows:
> > >
> > > <-- snip -->
> > >
> > > scsi0: scanning scsi channel 4 [P0] for physical devices.
> > > scsi0: scanning scsi channel 5 [P1] for physical devices.
> > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > sd 0:0:0:0: [sda] Write Protect is off
> > > > sd 0:0:0:0: [sda] Asking for cache data failed
> > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > sd 0:0:0:0: [sda] Write Protect is off
> > > > sd 0:0:0:0: [sda] Asking for cache data failed
> > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > sda: sda1
> > > + sda: p1 exceeds device capacity
> > >
> > > <-- snip -->
> > >
> > > - case MEGA_BULK_DATA:
> > > - if (scb->cmd->use_sg == 0)
> > > - length = scb->cmd->request_bufflen;
> > > - else {
> > > - struct scatterlist *sgl =
> > > > - (struct scatterlist *)scb->cmd->request_buffer;
> > > - length = sgl->length;
> > > - }
> > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > > - length, scb->dma_direction);
> > > - break;
> > > -
> >
> > This is the problem piece I think. We've reintroduced a very old bug:
> >
> > commit 51c928c34fa7cff38df584ad01de988805877dba
> > Author: James Bottomley <[email protected]>
> > Date: Sat Oct 1 09:38:05 2005 -0500
> >
> > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
> >
> > Some Legacy megaraid cards can't actually cope with the scatter/gather
> > version of the READ CAPACITY command (which is what we now send them
> > since altering all SCSI internal I/O to go via the block layer). Fix
> > this (and a few other broken megaraid driver assumptions) by sending
> > the non-sg version of the command if the sg list only has a single
> > element.
> >
> > Signed-off-by: James Bottomley <[email protected]>
> >
> > So what we have to do is put back the check for use_sg == 1 and send
> > that as a bulk transfer command.
>
> Sorry about this. Can this fix the problem?
>
> Thanks,
>
>
> diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> index 3907f67..da56163 100644
> --- a/drivers/scsi/megaraid.c
> +++ b/drivers/scsi/megaraid.c
> @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
>
> *len = 0;
>
> + if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
> + sg = scsi_sglist(cmd);
> + scb->dma_h_bulkdata = sg_dma_address(sg);
> + *buf = (u32)scb->dma_h_bulkdata;
> + *len = sg_dma_len(sg);
> + return 0;
> + }
> +
> scsi_for_each_sg(cmd, sg, sgcnt, idx) {
> if (adapter->has_64bit_addr) {
> scb->sgl64[idx].address = sg_dma_address(sg);
>
With this patch I see the correct logical disk size reported.
Thanks.
Sumant
On Wed, 3 Oct 2007 17:32:55 -0600
"Patro, Sumant" <[email protected]> wrote:
>
>
> > -----Original Message-----
> > From: FUJITA Tomonori [mailto:[email protected]]
> > Sent: Tuesday, October 02, 2007 5:01 PM
> > To: [email protected]
> > Cc: [email protected]; [email protected];
> > [email protected]; [email protected];
> > [email protected]; Patro, Sumant; DL-MegaRAID
> > Linux; [email protected]
> > Subject: Re: 2.6.23-rc9 boot failure (megaraid?)
> >
> > On Tue, 02 Oct 2007 15:38:13 -0500
> > James Bottomley <[email protected]> wrote:
> >
> > > On Tue, 2007-10-02 at 20:15 +0200, Adrian Bunk wrote:
> > > > Cc's added, the complete bug report is at
> > > > http://lkml.org/lkml/2007/10/2/243
> > > >
> > > > On Tue, Oct 02, 2007 at 12:48:26PM -0400, Burton Windle wrote:
> > > > > 2.6.23-rc9 fails to boot for me; 2.6.22.9 works fine.
> > > > >
> > > > > System is a Dell Poweredge with PERC 2/DC with RAID1 volume.
> > > > >...
> > > >
> > > > Thanks for your report.
> > > >
> > > > Diff'ing the dmesg's shows:
> > > >
> > > > <-- snip -->
> > > >
> > > > scsi0: scanning scsi channel 4 [P0] for physical devices.
> > > > scsi0: scanning scsi channel 5 [P1] for physical devices.
> > > > > st: Version 20070203, fixed bufsize 32768, s/g segs 256
> > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > > sd 0:0:0:0: [sda] Write Protect is off
> > > > > sd 0:0:0:0: [sda] Asking for cache data failed
> > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > > > -sd 0:0:0:0: [sda] 17547264 512-byte hardware sectors (8984 MB)
> > > > > +sd 0:0:0:0: [sda] Sector size 0 reported, assuming 512.
> > > > > +sd 0:0:0:0: [sda] 1 512-byte hardware sectors (0 MB)
> > > > > sd 0:0:0:0: [sda] Write Protect is off
> > > > > sd 0:0:0:0: [sda] Asking for cache data failed
> > > > > sd 0:0:0:0: [sda] Assuming drive cache: write through
> > > > sda: sda1
> > > > + sda: p1 exceeds device capacity
> > > >
> > > > <-- snip -->
> > > >
> > > > - case MEGA_BULK_DATA:
> > > > - if (scb->cmd->use_sg == 0)
> > > > - length = scb->cmd->request_bufflen;
> > > > - else {
> > > > - struct scatterlist *sgl =
> > > > > - (struct scatterlist *)scb->cmd->request_buffer;
> > > > - length = sgl->length;
> > > > - }
> > > > - pci_unmap_page(adapter->dev, scb->dma_h_bulkdata,
> > > > - length, scb->dma_direction);
> > > > - break;
> > > > -
> > >
> > > This is the problem piece I think. We've reintroduced a very old bug:
> > >
> > > commit 51c928c34fa7cff38df584ad01de988805877dba
> > > Author: James Bottomley <[email protected]>
> > > Date: Sat Oct 1 09:38:05 2005 -0500
> > >
> > > [SCSI] Legacy MegaRAID: Fix READ CAPACITY
> > >
> > > Some Legacy megaraid cards can't actually cope with the scatter/gather
> > > version of the READ CAPACITY command (which is what we now send them
> > > since altering all SCSI internal I/O to go via the block layer). Fix
> > > this (and a few other broken megaraid driver assumptions) by sending
> > > the non-sg version of the command if the sg list only has a single
> > > element.
> > >
> > > Signed-off-by: James Bottomley <[email protected]>
> > >
> > > So what we have to do is put back the check for use_sg == 1
> > and send
> > > that as a bulk transfer command.
> >
> > Sorry about this. Can this fix the problem?
> >
> > Thanks,
> >
> >
> > diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
> > index 3907f67..da56163 100644
> > --- a/drivers/scsi/megaraid.c
> > +++ b/drivers/scsi/megaraid.c
> > @@ -1753,6 +1753,14 @@ mega_build_sglist(adapter_t *adapter, scb_t *scb, u32 *buf, u32 *len)
> >
> >  	*len = 0;
> >
> > +	if (scsi_sg_count(cmd) == 1 && !adapter->has_64bit_addr) {
> > +		sg = scsi_sglist(cmd);
> > +		scb->dma_h_bulkdata = sg_dma_address(sg);
> > +		*buf = (u32)scb->dma_h_bulkdata;
> > +		*len = sg_dma_len(sg);
> > +		return 0;
> > +	}
> > +
> >  	scsi_for_each_sg(cmd, sg, sgcnt, idx) {
> >  		if (adapter->has_64bit_addr) {
> >  			scb->sgl64[idx].address = sg_dma_address(sg);
> >
>
>
> With this patch I see the correct logical disk size reported.
> Thanks.
Great, thanks for testing!
Can you try the following patch instead of the above patch?
http://marc.info/?l=linux-scsi&m=119137033016550&w=2
I know the changes are pretty trivial and it should work...
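
For readers following the dmesg diff: the "+" lines are what one would expect if the 8-byte READ CAPACITY(10) response buffer came back all zeros, which is consistent with the card not filling in the buffer for the scatter/gather version of the command. Below is a minimal sketch of that decoding in plain C. The names decode_read_capacity10 and struct capacity are made up for illustration; this is not the sd driver's code, only the same arithmetic (big-endian last LBA plus one, then the big-endian block length, with a fallback to 512 when the length is zero).

```c
#include <stdint.h>

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical result type, for illustration only. */
struct capacity {
	uint64_t sectors;      /* last LBA + 1 */
	uint32_t sector_size;  /* bytes per sector after the fallback */
	int size_was_zero;     /* 1 if the device reported sector size 0 */
};

/*
 * Decode the 8-byte READ CAPACITY(10) parameter data:
 * bytes 0..3 = last logical block address, bytes 4..7 = block length.
 */
static struct capacity decode_read_capacity10(const uint8_t buf[8])
{
	struct capacity c;

	c.sectors = (uint64_t)be32(buf) + 1;
	c.sector_size = be32(buf + 4);
	c.size_was_zero = (c.sector_size == 0);
	if (c.size_was_zero)
		c.sector_size = 512;  /* "Sector size 0 reported, assuming 512." */
	return c;
}
```

Feeding it the bytes 01 0b bf ff 00 00 02 00 gives 17547264 sectors of 512 bytes, the size the working kernel reports. Eight zero bytes give 1 sector and a zero sector size that falls back to 512, which matches the "1 512-byte hardware sectors (0 MB)" line and explains why sda1 then exceeds the device capacity.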
On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> Great, thanks for testing!
>
> Can you try the following patch instead of the above patch?
>
> http://marc.info/?l=linux-scsi&m=119137033016550&w=2
>
>
> I know the changes are pretty trivial and it should work...
Tomo, this is the patch I added.
--
Jens Axboe
On Thu, 4 Oct 2007 09:28:34 +0200
Jens Axboe <[email protected]> wrote:
> Tomo, this is the patch I added.
Thanks. I thought that it will be sent via scsi-misc because the scsi
accessor patch introduced this bug. But either is ok with me.
BTW, please add my sign-off.
-
[SCSI] megaraid_old: fix scatter/gather for legacy megaraid cards
Some legacy megaraid cards (!has_64bit_addr case) can't cope with the
scatter/gather version of the READ CAPACITY command. We need to send
the non-sg version of the command if the sg list only has a single
element.
commit 3f6270ef76f2ce5c134615a470685d6c2a66c07e reintroduced this bug,
which was fixed long ago (commit 51c928c34fa7cff38df584ad01de988805877dba).
Signed-off-by: FUJITA Tomonori <[email protected]>
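
The rule in the commit message can be sketched outside the kernel as follows. This is a simplified user-space model, not the driver code: struct seg, build_transfer, hw_table and hw_table_addr are hypothetical names standing in for the DMA-mapped scatterlist, mega_build_sglist, the per-command sg table and its bus address. The return value follows the convention used by the patch in this thread: 0 means "no sg list, buf/len describe one contiguous buffer", anything else is the number of sg entries.

```c
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist element. */
struct seg {
	uint32_t addr;  /* bus address of the segment */
	uint32_t len;   /* length of the segment in bytes */
};

/*
 * Decide how to describe a transfer to the card.
 *
 * Single segment on a card without 64-bit addressing: bypass the sg
 * table and pass the buffer directly (the non-sg form of the command).
 * Otherwise: copy the segments into the hardware table and point the
 * card at the table.
 */
static int build_transfer(const struct seg *sgl, int count, int has_64bit_addr,
			  struct seg *hw_table, uint32_t hw_table_addr,
			  uint32_t *buf, uint32_t *len)
{
	int i;

	*len = 0;

	if (count == 1 && !has_64bit_addr) {
		*buf = sgl[0].addr;
		*len = sgl[0].len;
		return 0;  /* zero sg entries: plain bulk transfer */
	}

	for (i = 0; i < count; i++) {
		hw_table[i] = sgl[i];
		*len += sgl[i].len;
	}
	*buf = hw_table_addr;
	return count;
}
```

With one 8-byte segment and has_64bit_addr == 0 (the READ CAPACITY case on the affected cards) the function returns 0 and hands back the segment's own address, so the card never sees a scatter/gather list. The same segment with has_64bit_addr != 0, or any multi-segment request, still goes through the table.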
On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> Thanks. I thought that it will be sent via scsi-misc because the scsi
> accessor patch introduced this bug. But either is ok with me.
If it only affects the driver _after_ the scsi accessor patch and as
such doesn't screw over git-block, then I'll drop it for sure.
--
Jens Axboe
On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
>...
> Tomo, this is the patch I added.
Please excuse my comment in case this was already clear:
You are aware that this bug is a regression in 2.6.23-rc and the patch
should therefore go to Linus ASAP and not after the release of 2.6.23?
> Jens Axboe
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
On Thu, 4 Oct 2007 12:48:58 +0200
Adrian Bunk <[email protected]> wrote:
> On Thu, Oct 04, 2007 at 09:28:34AM +0200, Jens Axboe wrote:
> >...
> > Tomo, this is the patch I added.
>
> Please excuse my comment in case this was already clear:
>
> You are aware that this bug is a regression in 2.6.23-rc and the patch
> should therefore go to Linus ASAP and not after the release of 2.6.23?
Oops, you are right. This should go via scsi-rc-fixes tree ASAP.
On Thu, Oct 04 2007, FUJITA Tomonori wrote:
> Oops, you are right. This should go via scsi-rc-fixes tree ASAP.
Irk, the scsi accessor stuff is already in, I forgot and thought it was
pending for 2.6.24. So rush the patch upstream please!
--
Jens Axboe
On Thu, 2007-10-04 at 12:36 +0200, Jens Axboe wrote:
> If it only affects the driver _after_ the scsi accessor patch and as
> such doesn't screw over git-block, then I'll drop it for sure.
No, this is a release critical fix ... I'll roll it up and send it in
for 2.6.23.
James