Subject: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

I am having great difficulty getting a 3Ware 9550SX working.

I'm using 2.6.12 with EITHER the 2.6.14 driver, or the latest 3ware driver
(2.26.04.006) from their web site, plus the latest BIOS and firmware, on a
dual AMD64 Opteron 275 system.

The only reason I'm not using 2.6.14 is that compiling an
Ubuntu-boot-sequence-compatible kernel for AMD64 SMP is decidedly non-trivial
(i.e. I have spent the whole day doing it and failed dismally).

The RAID card is attached to a 4 x 512GB SATA-II disk array in a RAID-5
configuration (1.5TB total).

All seems to go well until I try and do mke2fs. This appears to work,
and tries to write the inode tables. However, at (about) 3400 inodes
(of 11176), it slows to a crawl, writing one table every 10 seconds.
strace shows it is still running, and no errors are being reported.
However, it seems very sick.
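
Note for readers of the archive (not part of the original message): the
progress counter that mke2fs prints while writing inode tables counts block
groups, one inode table per group, rather than individual inodes. The 11176
figure can therefore be sanity-checked against the array size. The sketch
below is only an estimate. It assumes 4 KiB blocks, the usual ext2 layout of
8 * block_size blocks per group, and drives of 500 GB (decimal) each; none of
these is stated in the thread, and block_groups is a made-up helper name.

```python
import math


def block_groups(device_bytes, block_size=4096):
    """Estimate how many ext2 block groups mke2fs would create.

    Assumes 8 * block_size blocks per group (one bitmap block's worth
    of bits). Real mke2fs output can differ slightly depending on the
    exact partition size and the options used.
    """
    blocks_per_group = 8 * block_size
    group_bytes = blocks_per_group * block_size
    return math.ceil(device_bytes / group_bytes)


# Assumption: RAID-5 over four 500 GB (decimal) drives leaves three
# drives' worth of usable space.
usable_bytes = 3 * 500 * 10**9
groups = block_groups(usable_bytes)

print(f"estimated block groups: {groups}")
print(f"3400 groups is about {3400 / groups:.0%} of the total")
```

Under these assumptions the estimate matches the 11176 tables reported
above, which suggests the slowdown began roughly 30% of the way through the
device.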

No debug messages indicating any errors.

The only other clue as to what may be wrong is in the boot sequence.
I see lots of bad LUN messages (detail below). However, it does appear
to be detecting the disks right in the end.

Anyone got any ideas?


Oct 30 20:09:09 localhost kernel: [ 138.688249] /dev/scsi/host4/bus0/target0/lun0: p1 < p5 >
Oct 30 20:09:09 localhost kernel: [ 138.712496] Attached scsi disk sdb at scsi4, channel 0, id 0, lun 0
Oct 30 20:09:09 localhost kernel: [ 138.712814] scsi: On host 4 channel 0 id 0 only 511 (max_scsi_report_luns) of 214715501 luns reported, try increasing max_scsi_report_luns.
Oct 30 20:09:09 localhost kernel: [ 138.712817] scsi: host 4 channel 0 id 0 lun 0x383438203636202d has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712822] scsi: host 4 channel 0 id 0 lun 0x204c697665203078 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712826] scsi: host 4 channel 0 id 0 lun 0x6666666666666666 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712830] scsi: host 4 channel 0 id 0 lun 0x3838303039303030 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712835] scsi: host 4 channel 0 id 0 lun 0x0a74696c65626c69 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712839] scsi: host 4 channel 0 id 0 lun 0x7420333332382031 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712843] scsi: host 4 channel 0 id 0 lun 0x206662636f6e2c20 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712848] scsi: host 4 channel 0 id 0 lun 0x4c69766520307866 has a LUN larger than currently supported.
Oct 30 20:09:09 localhost kernel: [ 138.712852] scsi: host 4 channel 0 id 0 lun 0x6666666666666638 has a LUN larger than currently supported.
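
Note for readers of the archive (not part of the original message): every
oversized LUN value in the log above is plain ASCII when read as eight bytes
in the order printed. That suggests the buffer parsed as a REPORT LUNS reply
held unrelated text rather than a LUN list. This is an inference from the
logged values only and is not confirmed anywhere in the thread. The sketch
below decodes the values exactly as logged; lun_to_text is a hypothetical
helper name.

```python
def lun_to_text(lun):
    """Interpret a 64-bit LUN value as 8 big-endian bytes of ASCII text."""
    raw = lun.to_bytes(8, "big")
    return raw.decode("ascii", errors="replace")


# LUN values copied from the kernel log, in the order they were printed.
LOGGED_LUNS = [
    0x383438203636202D,
    0x204C697665203078,
    0x6666666666666666,
    0x3838303039303030,
    0x0A74696C65626C69,
    0x7420333332382031,
    0x206662636F6E2C20,
    0x4C69766520307866,
    0x6666666666666638,
]

for lun in LOGGED_LUNS:
    # repr() keeps the embedded newline byte visible as \n.
    print(f"{lun:#018x} -> {lun_to_text(lun)!r}")

# The logged LUN count is also consistent with ASCII text: the bytes
# "fbco" read as a big-endian 32-bit byte length, divided by the
# 8-byte size of one LUN entry.
count_from_text = int.from_bytes(b"fbco", "big") // 8
print(count_from_text)
```

The decoded fragments (" Live 0x", "ffffffff", "\ntilebli", " fbcon, ")
resemble lines from a loaded-module listing such as /proc/modules, and the
count derived from "fbco" equals the 214715501 LUNs in the log.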


--
Alex Bligh


2005-11-01 12:45:30

by Florian Weimer

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

* Alex Bligh:

> All seems to go well until I try and do mke2fs. This appears to work,
> and tries to write the inode tables. However, at (about) 3400 inodes
> (of 11176), it slows to a crawl, writing one table every 10 seconds.
> strace shows it is still running, and no errors are being reported.
> However, it seems very sick.

In my experience, the 3ware SATA controllers which are not NCQ-capable
have very, very lousy write performance with some drives, unless you
enable the write cache (which is, of course, a bit dangerous without
UPS or battery backup on the controller).

2005-11-01 17:04:15

by Lawrence Walton

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

Florian Weimer [[email protected]] wrote:
> * Alex Bligh:
>
> > All seems to go well until I try and do mke2fs. This appears to work,
> > and tries to write the inode tables. However, at (about) 3400 inodes
> > (of 11176), it slows to a crawl, writing one table every 10 seconds.
> > strace shows it is still running, and no errors are being reported.
> > However, it seems very sick.
>
> In my experience, the 3ware SATA controllers which are not NCQ-capable
> have very, very lousy write performance with some drives, unless you
> enable the write cache (which is, of course, a bit dangerous without
> UPS or battery backup on the controller).
Not to sound like a 3ware cheerleader, but this card does support NCQ.
--
*--* Mail: [email protected]
*--* Voice: 425.739.4247
*--* Fax: 425.827.9577
*--* HTTP://the-penguin.otak.com/~lawrence
--------------------------------------
- - - - - - O t a k i n c . - - - - -




2005-11-01 17:14:13

by Florian Weimer

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

* Lawrence Walton:

>> In my experience, the 3ware SATA controllers which are not NCQ-capable
>> have very, very lousy write performance with some drives, unless you
>> enable the write cache (which is, of course, a bit dangerous without
>> UPS or battery backup on the controller).

> Not to sound like a 3ware cheerleader, but this card does support NCQ.

Oh. I didn't know whether this particular controller supported NCQ or
not.

BTW, does anybody know if NCQ support is available for the 9500S in a
firmware upgrade?

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables



--On 01 November 2005 18:13 +0100 Florian Weimer <[email protected]> wrote:

>>> In my experience, the 3ware SATA controllers which are not NCQ-capable
>>> have very, very lousy write performance with some drives, unless you
>>> enable the write cache (which is, of course, a bit dangerous without
>>> UPS or battery backup on the controller).
>
>> Not to sound like a 3ware cheerleader, but this card does support
>> NCQ.
>
> Oh. I didn't know whether this particular controller supported NCQ or
> not.

It even supports SATA-3, not much good that it does me.

I managed to format it as reiserfs in the end. dbench (yes I know it isn't a
great benchmark) gives me a write speed of 7MB/s, compared to 700MB/s when
one of the disks from the array is attached to the motherboard SATA
controller.

7MB/s is quite stunningly appalling. I realise the release notes warn of
slow writes, but that's just daft! I have a few bits in my setup to check
before I start pointing the finger comprehensively. It may (for instance)
be a large partition problem (suggested on the ext2 list).

I'm taking it that it works at least for some people (did you test write
speed Lawrence?).

--
Alex Bligh

2005-11-02 00:16:39

by jmerkey

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables



I see this problem if you have the array configured for RAID 5 and you
have not pushed the F8 key during array config after setting up
the array. Try rebooting and setting the RAID 5 for INIT (press F8); the
problem goes away, but the whole array will reinit itself. There seems to be
some sort of problem in their RAID 5 logic: you can set up a RAID 5
stripe set, but init doesn't finish or gets into a weird state during
reboot. It shows up mainly on 9500 series controllers, but I have also
seen this behavior with the 8000 series drivers. I don't know
if you are using RAID 5, but I have seen this problem on RAID 5 configs
only.

Jeff

Alex Bligh - linux-kernel wrote:

>
>
> --On 01 November 2005 18:13 +0100 Florian Weimer <[email protected]>
> wrote:
>
>>>> In my experience, the 3ware SATA controllers which are not NCQ-capable
>>>> have very, very lousy write performance with some drives, unless you
>>>> enable the write cache (which is, of course, a bit dangerous without
>>>> UPS or battery backup on the controller).
>>>
>>
>>> Not to sound like a 3ware cheerleader, but this card does support
>>> NCQ.
>>
>>
>> Oh. I didn't know whether this particular controller supported NCQ or
>> not.
>
>
> It even supports SATA-3, not much good that it does me.
>
> I managed to format it as reiserfs in the end. dbench (yes I know it isn't a
> great benchmark) gives me a write speed of 7MB/s, compared to 700MB/s when
> one of the disks from the array is attached to the motherboard SATA
> controller.
>
> 7MB/s is quite stunningly appalling. I realise the release notes warn of
> slow writes, but that's just daft! I have a few bits in my setup to check
> before I start pointing the finger comprehensively. It may (for instance)
> be a large partition problem (suggested on the ext2 list).
>
> I'm taking it that it works at least for some people (did you test write
> speed Lawrence?).
>
> --
> Alex Bligh

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables



--On 01 November 2005 15:54 -0700 "Jeff V. Merkey" <[email protected]>
wrote:

> I see this problem if you have the array configured for RAID 5 and you
> have not pushed the F8 key during array config after setting up
> the array. Try rebooting and setting the RAID 5 for INIT (press F8); the
> problem goes away, but the whole array will reinit itself. There seems to be
> some sort of problem in their RAID 5 logic: you can set up a RAID 5
> stripe set, but init doesn't finish or gets into a weird state during
> reboot. It shows up mainly on 9500 series controllers, but I have also
> seen this behavior with the 8000 series drivers. I don't know
> if you are using RAID 5, but I have seen this problem on RAID 5 configs
> only.

I don't know when it's meant to reinit itself, but here F8 simply exits
the BIOS and boots immediately. The manual says some RAID configs
need initialization in the BIOS, but that some (allegedly) initialize as the
OS loads. Then it contradicts itself and says RAID-5 is ready for
use without initialization. Whatever the case, it does nothing on boot
until a mkfs, and then the world grinds very slowly.

--
Alex Bligh

2005-11-02 22:59:17

by adam radford

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

On 10/30/05, Alex Bligh - linux-kernel <[email protected]> wrote:

> All seems to go well until I try and do mke2fs. This appears to work,
> and tries to write the inode tables. However, at (about) 3400 inodes
> (of 11176), it slows to a crawl, writing one table every 10 seconds.
> strace shows it is still running, and no errors are being reported.
> However, it seems very sick.

Do you have cache turned on or off? If it's off, try turning it on.

>
> No debug messages indicating any errors.
>
> The only other clue as to what may be wrong is in the boot sequence.
> I see lots of bad LUN messages (detail below). However, it does appear
> to be detecting the disks right in the end.
>
> Anyone got any ideas?
>
>
> Oct 30 20:09:09 localhost kernel: [ 138.688249] /dev/scsi/host4/bus0/target0/lun0: p1 < p5 >
> Oct 30 20:09:09 localhost kernel: [ 138.712496] Attached scsi disk sdb at scsi4, channel 0, id 0, lun 0
> Oct 30 20:09:09 localhost kernel: [ 138.712814] scsi: On host 4 channel 0 id 0 only 511 (max_scsi_report_luns) of 214715501 luns reported, try increasing max_scsi_report_luns.
> Oct 30 20:09:09 localhost kernel: [ 138.712817] scsi: host 4 channel 0 id 0 lun 0x383438203636202d has a LUN larger than currently supported.

There were some changes to scsi_scan.c between 2.6.12 and 2.6.14 that
seem to have fixed this issue. Please try to reproduce it with 2.6.14.

-Adam

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

Adam,

>> All seems to go well until I try and do mke2fs. This appears to work,
>> and tries to write the inode tables. However, at (about) 3400 inodes
>> (of 11176), it slows to a crawl, writing one table every 10 seconds.
>> strace shows it is still running, and no errors are being reported.
>> However, it seems very sick.
>
> Do you have cache turned on or off? If it's off, try turning it on.

On. I started again (deleted the units etc.), which I'd done before.
I am not sure quite what I did differently this time. Now I get 270MB/s
on dbench, 100MB/s (approx) on a solid contiguous write (dd), which
is well into the field of uninspiring but not as daft as 7MB/s. I had
rather expected that h/w RAID 5 would give me faster reads, and only
slightly degraded writes, compared to a single disk of the same type
plugged into the motherboard SATA.
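
Note for readers of the archive (not part of the original message): the exact
dd invocation is not given in the thread. The sketch below is one rough way
to take a comparable "solid contiguous write" measurement. It writes zeros
in 1 MiB chunks and includes the final fsync in the timing, so the page
cache does not flatter the result. The function name, the sizes and the
target path are illustrative assumptions.

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(path, total_mb=64, chunk_mb=1):
    """Write total_mb MiB of zeros to path and return the rate in MiB/s.

    The fsync is inside the timed region so that data still sitting in
    the page cache is not counted as written.
    """
    chunk = b"\0" * (chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


if __name__ == "__main__":
    # Assumption: a temporary directory stands in for a mount point on
    # the array under test.
    with tempfile.TemporaryDirectory() as scratch:
        target = os.path.join(scratch, "seqwrite.bin")
        print(f"{sequential_write_mb_per_s(target):.1f} MiB/s")
```

A controller write cache can still absorb a test this small, so a real
measurement would use a file several times larger than the cache.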

However, dbench puts the (dual opteron 275) machine into 99% system
state. Is that normal? Surely it should be in i/o wait.
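
Note for readers of the archive (not part of the original message): the
split between system time and I/O wait can be read from the aggregate "cpu"
line of /proc/stat on a Linux host. The sketch below samples that line
twice and reports both shares over the interval. The helper names are
illustrative, and only the first eight counters (user through steal) are
summed.

```python
import os
import time


def cpu_shares(before, after):
    """Return (system %, iowait %) between two 'cpu ...' /proc/stat lines.

    Field order after the 'cpu' label: user nice system idle iowait
    irq softirq steal. Later fields, if present, are ignored.
    """
    def counters(line):
        return [int(x) for x in line.split()[1:9]]

    delta = [b - a for a, b in zip(counters(before), counters(after))]
    total = sum(delta)
    if total == 0:
        return 0.0, 0.0
    return 100 * delta[2] / total, 100 * delta[4] / total


def read_cpu_line():
    """Read the aggregate CPU line (the first line) of /proc/stat."""
    with open("/proc/stat") as f:
        return f.readline()


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    first = read_cpu_line()
    time.sleep(1)  # run the benchmark in another terminal meanwhile
    system_pct, iowait_pct = cpu_shares(first, read_cpu_line())
    print(f"system {system_pct:.1f}%  iowait {iowait_pct:.1f}%")
```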

I /think/ what happened is this: when I pressed F8 to exit the
BIOS, it did not initialize the array (this is in accordance with the
manual, initialization being deferred). Despite the machine being left idle
in the O/S for 2 days, it didn't start initializing the array. Running the
mkfs started the initialization (would that make sense?). The second time
I ran mkfs, I may have already (somehow) triggered it to start earlier.

I shall try and work out some soak test I can run on it this w/e.

--
Alex Bligh

2005-11-05 17:28:05

by Jeffrey V. Merkey

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

Alex Bligh - linux-kernel wrote:

> Adam,
>
>>> All seems to go well until I try and do mke2fs. This appears to work,
>>> and tries to write the inode tables. However, at (about) 3400 inodes
>>> (of 11176), it slows to a crawl, writing one table every 10 seconds.
>>> strace shows it is still running, and no errors are being reported.
>>> However, it seems very sick.
>>
>>
>> Do you have cache turned on or off? If it's off, try turning it on.
>
>
> On. I started again (deleted the units etc.), which I'd done before.
> I am not sure quite what I did differently this time. Now I get 270MB/s
> on dbench, 100MB/s (approx) on a solid contiguous write (dd), which
> is well into the field of uninspiring but not as daft as 7MB/s. I had
> rather expected that h/w RAID 5 would give me faster reads, and only
> slightly degraded writes, compared to a single disk of the same type
> plugged into the motherboard SATA.
>
> However, dbench puts the (dual opteron 275) machine into 99% system
> state. Is that normal? Surely it should be in i/o wait.
>
> I /think/ what had happened is this: When I press F8 to exit the
> BIOS, it did not initialize the array (this is in accordance with the
> manual, it being deferred). Despite leaving the machine idle in the O/S
> for 2 days, it didn't start initializing the array. Running the mkfs
> started the initialization (would that make sense)? The second time
> I ran mkfs, I may have already (somehow) triggered it to start earlier.
>
> I shall try and work out some soak test I can run on it this w/e.


This is what I reported earlier, and it is the same behavior I am seeing
on the 9500. I don't think it's related to the firmware, but rather to
something in the driver, because I can reproduce it on 8000 series
controllers as well. What's common between the two is that both are on a
2.6.9 kernel with the drivers. Eventually the init proceeds and completes,
but the problem is related to the INIT state not getting tripped into
running.

Jeff

>
> --
> Alex Bligh

2005-11-07 21:54:25

by Florian Weimer

Subject: Re: 3ware 9550SX problems - mke2fs incredibly slow writing last third of inode tables

* Alex Bligh:

> I /think/ what had happened is this: When I press F8 to exit the
> BIOS, it did not initialize the array (this is in accordance with the
> manual, it being deferred). Despite leaving the machine idle in the O/S
> for 2 days, it didn't start initializing the array. Running the mkfs
> started the initialization (would that make sense)? The second time
> I ran mkfs, I may have already (somehow) triggered it to start earlier.
>
> I shall try and work out some soak test I can run on it this w/e.

Please check the write cache settings and report the results.
(There's a closed-source command line utility which can report the
status in an unambiguous way.)

If the system doesn't wait on I/O, this means that all I/O is cached
by the controller, which in turn suggests that the write cache is
turned on (with obvious consequences).