2002-01-07 23:40:17

by Francois Romieu

Subject: RAID1 oddity on 2.4.17+

Greetings,

hda: IC35L060AVER07-0, ATA DISK drive
hdc: IC35L060AVER07-0, ATA DISK drive
hde: IC35L060AVER07-0, ATA DISK drive
hdg: IC35L060AVER07-0, ATA DISK drive

$ lspci
00:00.0 Host bridge: Intel Corporation 82845 845 (Brookdale) Chipset Host Bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 82845 845 (Brookdale) Chipset AGP Bridge (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801BAM PCI (rev 12)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (ICH2) (rev 12)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 12)
01:00.0 VGA compatible controller: ATI Technologies Inc Rage 128 PF
02:06.0 Ethernet controller: Standard Microsystems Corp [SMC] 83C170QF (rev 09)
02:07.0 Ethernet controller: Winbond Electronics Corp W89C940
02:0a.0 Ethernet controller: 3Com Corporation 3c905C-TX [Fast Etherlink] (rev 78)
02:0d.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
02:0e.0 Unknown mass storage controller: Promise Technology, Inc. 20265 (rev 02)

$ cat /proc/interrupts
CPU0
0: 89814 XT-PIC timer
1: 4 XT-PIC keyboard
2: 0 XT-PIC cascade
5: 190 XT-PIC eth2
7: 1884 XT-PIC eth0
8: 1 XT-PIC rtc
9: 20487 XT-PIC ide2, ide3
10: 3851 XT-PIC eth1
12: 20 XT-PIC PS/2 Mouse
14: 21445 XT-PIC ide0
15: 22279 XT-PIC ide1
NMI: 0
LOC: 89777
ERR: 0
MIS: 0

$ cat /proc/mdstat
Personalities : [raid1]
read_ahead 1024 sectors
md0 : active raid1 hdc1[0] hda1[1]
1052160 blocks [2/2] [UU]

md1 : active raid1 hdc2[0] hda2[1]
2104448 blocks [2/2] [UU]

md2 : active raid1 hdc3[1] hda3[0]
56894080 blocks [2/2] [UU]

md3 : active raid1 hdg3[1] hde3[0]
56894080 blocks [2/2] [UU]

unused devices: <none>

$ dmesg | less
Linux version 2.4.17 ([email protected]) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #6 Sun Jan 6 17:03:37 CET 2002

$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 1
model name : Intel(R) Pentium(R) 4 CPU 1.50GHz
stepping : 2
cpu MHz : 1513.513
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
bogomips : 3021.20

# partition table of /dev/hde
unit: sectors

/dev/hde1 : start= 63, size= 2104452, Id=fd, bootable
/dev/hde2 : start= 2104515, size= 4209030, Id=fd
/dev/hde3 : start= 6313545, size=113788395, Id=fd
/dev/hde4 : start= 0, size= 0, Id= 0

Ditto for hda, hdc, hdg.
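
The partition sizes above can be cross-checked against the block counts shown in /proc/mdstat. What follows is a minimal sketch, not something taken from the original report. It assumes the arrays use the 0.90 persistent superblock, where the member size in 1 KiB blocks is rounded down to a multiple of 64 and a further 64 blocks are reserved for the superblock. The function name md_size_blocks is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: expected usable size (in 1 KiB blocks) of a RAID1
# member, given the partition size in 512-byte sectors.
# Assumption: 0.90 persistent superblock, i.e. round the size down to a
# 64 KiB boundary, then subtract the 64 KiB reserved for the superblock.
md_size_blocks() {
    kib=$(( $1 / 2 ))                 # 512-byte sectors -> 1 KiB blocks
    echo $(( (kib & ~63) - 64 ))
}

md_size_blocks 2104452      # size of partition 1
md_size_blocks 4209030      # size of partition 2
md_size_blocks 113788395    # size of partition 3
```

With the sizes listed above this gives 1052160, 2104448 and 56894080, which agree with the md0, md1 and md2/md3 block counts in the /proc/mdstat output.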

CD-ROM present as hdb.

The output of this script (run on 2.4.17) is in the first attachment:
#!/bin/sh

for i in hdc hdg; do
    hdparm -i /dev/$i
    hdparm -t /dev/$i
    sleep 1
done

for i in hdc3 hdg3; do
    time mke2fs -j -O sparse_super -T largefile -m 0 -R stride=8 -v /dev/$i
    sleep 1
done

sync

for i in hdc3 hdg3; do
    mkdir /mnt/$i
    mount /dev/$i /mnt/$i
    date
    time dd if=/dev/zero of=/mnt/$i/foo bs=1024k count=1000
    time sync
    sleep 1
done

So far, so good.

Replace hdc with hda and hdg with hde in the script: see the second attachment.

So far, so good.

Same output available on request for test-2.4.18-pre1-andrew-last-low-latency.

I issue a reboot.

Now I build RAID1 arrays as:

raiddev /dev/md2
    raid-level 1
    nr-raid-disks 2
    chunk-size 128k
    persistent-superblock 1
    nr-spare-disks 0
    device /dev/hda3
    raid-disk 0
    device /dev/hdc3
    raid-disk 1

raiddev /dev/md3
    raid-level 1
    nr-raid-disks 2
    chunk-size 128k
    persistent-superblock 1
    nr-spare-disks 0
    device /dev/hde3
    raid-disk 0
    device /dev/hdg3
    raid-disk 1

Please see the third attached file. Everything is fast.

/etc/fstab:
LABEL=/ / ext3 defaults 1 1
none /dev/pts devpts gid=5,mode=620 0 0
none /proc proc defaults 0 0
none /dev/shm tmpfs defaults 0 0
/dev/md0 swap swap defaults 0 0
/dev/cdrom /mnt/cdrom iso9660 noauto,owner,kudzu,ro 0


Now I run this script:
#!/bin/sh

for i in md2 md3; do
    time mke2fs -j -O sparse_super -T largefile -m 0 -R stride=8 -v /dev/$i
    sleep 1
done

sync

for i in md3 md2; do
    mkdir /mnt/$i
    mount /dev/$i /mnt/$i
    date
    time dd if=/dev/zero of=/mnt/$i/foo bs=1024k count=1000
    time sync
    sleep 1
done

Please see the fourth attached file for the output.

Question: where does the slowdown for operations on /dev/md3 come from?

--
Ueimor


Attachments:
(No filename) (5.15 kB)
log-hdc-hdg (15.08 kB)
log-hda-hde (15.08 kB)
log-md2-md3 (13.74 kB)
raid-build (3.37 kB)
vmstat-dd-md2 (1.72 kB)
vmstat-dd-md3 (1.72 kB)

2002-01-19 22:09:59

by Francois Romieu

Subject: RAID1/promise 20265 trouble on 2.4.18-pre3-ac2 (was: RAID1 oddity on 2.4.17+)


Francois Romieu <[email protected]> :
[see <URL:http://www.cs.helsinki.fi/linux/linux-kernel/2002-01/0511.html>]

I put the same partitions on the 4 disks and built two similar RAID arrays.

o md10 is on the Intel controller and md20 on the Promise 20265
(I've dedicated these two arrays to testing, you're welcome).

# cat /proc/mdstat
[...]
md10 : active raid1 hdc3[1] hda3[0]
10241344 blocks [2/2] [UU]
[...]
md20 : active raid1 hdg3[1] hde3[0]
10241344 blocks [2/2] [UU]


o Boot-time parameters:
Kernel command line: BOOT_IMAGE=2.4.18-pre3-ac2 ro root=901 \
BOOT_FILE=/boot/bzImage-2.4.18-pre3-ac2 nmi_watchdog=1 root=/dev/md1 \
ide0=autotune ide1=autotune ide2=autotune ide3=autotune profile=2 devfs=nomount

o A crude test:
sync; date; dd if=/dev/zero of=test bs=1024k count=500; sync; date
/dev/md10: 20s
/dev/md20: 240s+/-9s (<-- about 12 times slower *doh*)
Reproducibility: good.
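
To put the two timings on a common scale, they can be turned into average write throughput. This is only an illustrative sketch: it divides the 500 MiB written by dd (bs=1024k count=500) by the elapsed seconds quoted above, and the helper name throughput_mib_s is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: average throughput in MiB/s for a dd run, given
# the amount written (MiB) and the elapsed wall-clock time (seconds).
throughput_mib_s() {
    awk -v mib="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mib / s }'
}

throughput_mib_s 500 20     # /dev/md10, Intel controller
throughput_mib_s 500 240    # /dev/md20, Promise 20265
```

That works out to roughly 25 MiB/s for md10 against about 2 MiB/s for md20, sync time included.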

o The first column is the difference in profile output when the test is
run against /dev/md10, the second when it is run against /dev/md20.

1 1 __block_prepare_write
0 2 __brelse
2 1 __journal_file_buffer
2 0 __journal_remove_journal_head
0 1 __kfree_skb
6 12 __make_request
0 1 __put_unused_buffer_head
1 2 __rdtsc_delay
1 1 __refile_buffer
1 3 __wake_up
1 0 balance_dirty_state
1 0 block_prepare_write
0 5 bread
0 1 d_lookup
1 0 deactivate_page_nolock
5060 30630 default_idle
0 1 do_anonymous_page
4 7 do_get_write_access
32 23 do_rw_disk
0 5 do_softirq
0 1 do_wp_page
0 2 ext3_block_to_path
2 1 ext3_commit_write
1 1 ext3_dirty_inode
2 1 ext3_do_update_inode
1 0 ext3_get_block_handle
1 1 ext3_get_branch
2 0 ext3_get_group_desc
11 6 ext3_get_inode_loc
3 2 ext3_mark_inode_dirty
5 5 ext3_new_block
0 1 ext3_releasepage
0 2 ext3_reserve_inode_write
0 1 ext3_writepage_trans_blocks
122 129 generic_file_write
2 2 generic_make_request
8 7 get_hash_table
8 39 handle_IRQ_event
2 1 ide_build_dmatable
8 4 ide_build_sglist
5 8 ide_dma_intr
22 19 ide_dmaproc
0 1 ide_do_request
25 27 ide_end_request
2 1 ide_get_queue
4 52 ide_intr
11 17 ide_wait_stat
4 2 journal_add_journal_head
3 5 journal_cancel_revoke
6 4 journal_commit_transaction
1 0 journal_destroy_revoke_caches
2 0 journal_dirty_data
5 4 journal_dirty_metadata
5 1 journal_dirty_sync_data
1 0 journal_free_journal_head
2 5 journal_get_write_access
0 1 journal_start
3 1 journal_stop
0 1 journal_unlock_journal_head
10 13 kfree
17 18 kmalloc
29 19 kmem_cache_alloc
8 13 kmem_cache_free
2 1 kmem_cache_grow
0 1 kmem_slab_destroy
1 0 ll_rw_block
0 13 pdc202xx_dmaproc
0 1 prune_icache
1 0 raid1_alloc_bh
3 1 raid1_alloc_r1bh
3 2 raid1_make_request
3 0 rmqueue
1 0 schedule
0 1 set_bh_page
10 4 start_request
0 1 sys_read
1 0 sys_write
2 0 system_call
1 0 try_to_free_buffers
3 1 unlock_page
1 1 walk_page_buffers
2 0 write_some_buffers
0 1 zap_page_range
0 2 zeromap_page_range
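
A table like the one above is easier to read once the rows are ranked by how much the second column exceeds the first. The following is a rough sketch under the assumption that each input line has exactly three whitespace-separated fields (md10 count, md20 count, symbol). The function name rank_profile_delta is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: read "count_a count_b symbol" lines on stdin and
# print the N symbols (default 5) with the largest increase count_b - count_a.
rank_profile_delta() {
    awk 'NF == 3 { printf "%d %s\n", $2 - $1, $3 }' |
        sort -k1,1nr |
        head -n "${1:-5}"
}

# A few rows copied from the table above.
rank_profile_delta 3 <<'EOF'
5060 30630 default_idle
32 23 do_rw_disk
8 39 handle_IRQ_event
4 52 ide_intr
0 13 pdc202xx_dmaproc
EOF
```

Applied to the full table, the largest increases are default_idle (+25570), ide_intr (+48), handle_IRQ_event (+31) and pdc202xx_dmaproc (+13). In other words, nearly all of the extra samples in the md20 run are spent idle rather than in any kernel code path.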

o Nothing special in dmesg output.

o ide settings:
# diff -u /proc/ide/hda/settings /proc/ide/hde/settings
--- /proc/ide/hda/settings Sat Jan 19 22:48:47 2002
+++ /proc/ide/hde/settings Sat Jan 19 22:48:47 2002
@@ -2,8 +2,8 @@
---- ----- --- --- ----
acoustic 0 0 254 rw
address 0 0 2 rw
-bios_cyl 7476 0 65535 rw
-bios_head 255 0 255 rw
+bios_cyl 119150 0 65535 rw
+bios_head 16 0 255 rw
bios_sect 63 0 63 rw
breada_readahead 128 0 255 rw
bswap 0 0 1 r
@@ -11,7 +11,7 @@
failures 0 0 65535 rw
file_readahead 124 0 16384 rw
ide_scsi 0 0 1 rw
-init_speed 69 0 69 rw
+init_speed 12 0 69 rw
io_32bit 3 0 3 rw
keepsettings 0 0 1 rw
lun 0 0 7 rw

# diff -u /proc/ide/hdc/settings /proc/ide/hdg/settings
--- /proc/ide/hdc/settings Sat Jan 19 22:49:35 2002
+++ /proc/ide/hdg/settings Sat Jan 19 22:49:35 2002
@@ -2,8 +2,8 @@
---- ----- --- --- ----
acoustic 0 0 254 rw
address 0 0 2 rw
-bios_cyl 7476 0 65535 rw
-bios_head 255 0 255 rw
+bios_cyl 119150 0 65535 rw
+bios_head 16 0 255 rw
bios_sect 63 0 63 rw
breada_readahead 128 0 255 rw
bswap 0 0 1 r
@@ -11,7 +11,7 @@
failures 0 0 65535 rw
file_readahead 124 0 16384 rw
ide_scsi 0 0 1 rw
-init_speed 69 0 69 rw
+init_speed 12 0 69 rw
io_32bit 3 0 3 rw
keepsettings 0 0 1 rw
lun 0 0 7 rw
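
The init_speed values in the two diffs can be decoded into transfer-mode names. This is a sketch under one assumption: that the setting uses the ATA transfer-mode encoding (0x08 + n for PIO, 0x10 + n for single-word DMA, 0x20 + n for multiword DMA, 0x40 + n for Ultra DMA), which is also how the kernel's XFER_* constants are laid out. The function name xfer_mode_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: map a numeric IDE transfer-mode value to a name.
# Assumption: the value follows the ATA transfer-mode encoding.
xfer_mode_name() {
    v=$1
    if   [ "$v" -ge 64 ] && [ "$v" -le 71 ]; then echo "UDMA$(( v - 64 ))"
    elif [ "$v" -ge 32 ] && [ "$v" -le 34 ]; then echo "MWDMA$(( v - 32 ))"
    elif [ "$v" -ge 16 ] && [ "$v" -le 18 ]; then echo "SWDMA$(( v - 16 ))"
    elif [ "$v" -ge 8 ]  && [ "$v" -le 13 ]; then echo "PIO$(( v - 8 ))"
    else echo "unknown($v)"
    fi
}

xfer_mode_name 69    # init_speed of hda and hdc (Intel)
xfer_mode_name 12    # init_speed of hde and hdg (Promise)
```

Under that assumption, 69 decodes to UDMA5 and 12 to PIO4. This only describes the mode recorded at initialization; the diffs above report no other speed-related difference between the drives.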

o Thanks for your attention.

--
Ueimor