2000-11-21 20:22:37

by CMA

Subject: e2fs performance as function of block size

Sirs,
while performing extensive tests on Linux platform performance, optimized as a
database server, I got IMHO confusing results:
in particular, e2fs initialized to use a 1024 block/fragment size showed
significant I/O gains over a 4096 block/fragment size, while I expected the
opposite. I would appreciate some hints to understand this.
The test performed was a C-ISAM index rebuild for a large table (more than
300000 tuples, over 80 megs for data and 90 megs for indexes).
The disk configurations were freshly cloned (no fragmentation).
Optimal (for the specific purpose) hdparm and bdflush tuning was applied,
and the tests were run in single-user mode.
This behaviour was consistent across a broad range of kernel releases (up
to 2.2.17) and h/w configurations.
Please find attached info and test results for a reference platform.
BTW, similar tests running an Interbase 4.0 DB load showed > 4x the performance
of a fully tuned Win NT 4 config on the same platform ;-)
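
(The timing reports below are GNU time's verbose output; each test was run
along the lines of

/usr/bin/time -v bcheck -y prod

once per filesystem layout.)
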
Regards

Dr. Eng. Mauro Tassinari
Rome, Italy
[email protected]




dmesg

Linux version 2.2.6 (root@zap) (gcc version 2.7.2.3) #20 Tue Apr 27 15:23:25
CDT 1999
Detected 736483166 Hz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 734.00 BogoMIPS
Memory: 127708k/130944k available (1204k kernel code, 408k reserved, 1568k
data, 56k init)
VFS: Diskquotas version dquot_6.4.0 initialized
CPU: Intel 00/08 stepping 03
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
Checking for popad bug... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.26 (19981001) Richard Gooch ([email protected])
PCI: PCI BIOS revision 2.10 entry at 0xf08f0
PCI: Using configuration type 1
PCI: Probing PCI hardware
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP, IGMP
Initializing RT netlink socket
Starting kswapd v 1.5
Detected PS/2 Mouse Port.
Serial driver version 4.27 with HUB-6 MANY_PORTS MULTIPORT SHARE_IRQ enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS01 at 0x02f8 (irq = 3) is a 16550A
Real Time Clock Driver v1.09
RAM disk driver initialized: 16 RAM disks of 4096K size
loop: registered device at major 7
VP_IDE: IDE controller on PCI bus 00 dev 21
VP_IDE: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:pio, hdd:DMA
hda: QUANTUM FIREBALLlct15 20, ATA DISK drive
hdd: ATAPI-CD ROM-DRIVE-50MAX, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: QUANTUM FIREBALLlct15 20, 19470MB w/418kB Cache, CHS=2482/255/63
hdd: ATAPI 40X CD-ROM drive, 128kB Cache
Uniform CDROM driver Revision: 2.54
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.36.6 MAX_MD_DEV=4, MAX_REAL=8
linear personality registered
raid0 personality registered
scsi : 0 hosts.
scsi : detected total.
Partition check:
hda: hda1 hda2 hda3
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 56k freed
Adding Swap: 136516k swap-space (priority -1)
parport0: PC-style at 0x378 [SPP,PS2,EPP]
lp0: using parport0 (polling).
CSLIP: code copyright 1989 Regents of the University of California
SLIP: version 0.8.4-NET3.019-NEWTTY-MODULAR (dynamic channels, max=256).
SLIP linefill/keepalive option.
PPP: version 2.3.3 (demand dialling)
PPP line discipline registered.
es1371: version v0.11 time 23:59:18 Apr 28 1999
es1371: found adapter at io 0xa800 irq 5
es1371: features: joystick 0x0
es1371: codec vendor ??v revision 9
es1371: codec features 18bit DAC 18bit ADC
es1371: stereo enhancement: unknown
rtl8139.c:v1.07 5/6/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/rtl8139.html
eth0: RealTek RTL8139 Fast Ethernet at 0xa400, IRQ 10, 00:60:52:0b:ce:a4.
st: bufsize 32768, wrt 30720, max buffers 4, s/g segs 16.
scsi : 0 hosts.
Linux PCMCIA Card Services 3.0.9
kernel build: 2.2.6 #20 Sat Apr 17 23:17:12 CDT 1999
options: [pci] [cardbus]
Intel PCIC probe: not found.
Databook TCIC-2 PCMCIA probe: not found.
ds: no socket drivers loaded!






rc.local

#!/bin/sh
#
# /etc/rc.d/rc.local: Local system initialization script.
(...skipped...)
# enable DMA, 32-bit I/O, write caching, IRQ unmasking, 16-sector
# multcount and readahead; -k1/-K1 keep settings over drive reset
hdparm -a16A1c1d1k1K1m16u1W1 /dev/hda
# tune buffer cache parameters
bdflush -0 100 -1 5000 -2 2000 -3 32 -5 12000 -d
# start xdm daemon
(...skipped...)






test 1024 block/fragment size

BCHECK C-ISAM B-tree Checker version 1.07
Copyright (C) 1981-1989 Informix Software, Inc.


C-ISAM Files: prod

Checking dictionary and file sizes.
index file node size = 1024
current C-ISAM index node size = 1024
Checking data file records.
Checking indexes and key descriptions.
(...skipped...)
Recreating data record free list.
Recreating index 10.
Recreating index 9.
Recreating index 8.
Recreating index 7.
Recreating index 6.
Recreating index 5.
Recreating index 4.
Recreating index 3.
Recreating index 2.
Recreating index 1.
12 index node(s) used, 0 free --349724 data record(s) used, 4 free


Command being timed: "bcheck -y prod"
User time (seconds): 69.32
System time (seconds): 25.15
Percent of CPU this job got: 54%
Elapsed (wall clock) time (h:mm:ss or m:ss): 2:54.14
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 0
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 43861
Minor (reclaiming a frame) page faults: 207
Voluntary context switches: 0
Involuntary context switches: 0
Swaps: 95
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0





test 4096 block/fragment size

BCHECK C-ISAM B-tree Checker version 1.07
Copyright (C) 1981-1989 Informix Software, Inc.


C-ISAM Files: prod

Checking dictionary and file sizes.
index file node size = 1024
current C-ISAM index node size = 1024
Checking data file records.
Checking indexes and key descriptions.
(...skipped...)
Recreating data record free list.
Recreating index 10.
Recreating index 9.
Recreating index 8.
Recreating index 7.
Recreating index 6.
Recreating index 5.
Recreating index 4.
Recreating index 3.
Recreating index 2.
Recreating index 1.
12 index node(s) used, 0 free --349724 data record(s) used, 4 free


Command being timed: "bcheck -y prod"
User time (seconds): 68.43
System time (seconds): 24.79
Percent of CPU this job got: 39%
Elapsed (wall clock) time (h:mm:ss or m:ss): 3:54.87
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 0
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 41944
Minor (reclaiming a frame) page faults: 239
Voluntary context switches: 0
Involuntary context switches: 0
Swaps: 32
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0


2000-11-21 23:06:46

by Reto Baettig

Subject: Re: e2fs performance as function of block size

Hi

I think I have a possible explanation for your observations:

1) 1024B Block size:
> User time (seconds): 69.32
> System time (seconds): 25.15
> Percent of CPU this job got: 54%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:54.14
> Major (requiring I/O) page faults: 43861

2) 4096B Block size:
> User time (seconds): 68.43
> System time (seconds): 24.79
> Percent of CPU this job got: 39%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 3:54.87
> Major (requiring I/O) page faults: 41944

Your application seems to have a lot of small accesses (<4k). Each time
you try to read/write a single byte, the whole disk block has to be in
the buffer cache.
Now when you only have to read 1k from the disk to do this, you're
faster than when you have to read 4k.

You have to choose between good sequential I/O performance for large
I/O transfers, where you need a large block size, and good random small
I/O performance, where you want a small block size.
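
If you want to confirm this on your own data, rebuild the filesystem with
each block size and time the identical run; a minimal sketch (assuming
/dev/hdb1 is a scratch partition you can safely wipe -- mke2fs destroys its
contents -- and /mnt/test is just an example mount point):

mke2fs -b 1024 -f 1024 /dev/hdb1    # 1k blocks and fragments
mount /dev/hdb1 /mnt/test
time bcheck -y /mnt/test/prod       # same workload on each layout
umount /mnt/test
mke2fs -b 4096 -f 4096 /dev/hdb1    # then repeat with 4k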

Have fun

Reto


CMA wrote:
>
> Sirs,
> while performing extensive tests on Linux platform performance, optimized as a
> database server, I got IMHO confusing results:
> in particular, e2fs initialized to use a 1024 block/fragment size showed
> significant I/O gains over a 4096 block/fragment size, while I expected the
> opposite. I would appreciate some hints to understand this.
> (...rest of original message, dmesg output, and test results trimmed...)

2000-11-22 00:04:38

by Alan

Subject: Re: e2fs performance as function of block size

> Sirs,
> while performing extensive tests on Linux platform performance, optimized as a
> database server, I got IMHO confusing results:
> in particular, e2fs initialized to use a 1024 block/fragment size showed
> significant I/O gains over a 4096 block/fragment size, while I expected the
> opposite. I would appreciate some hints to understand this.

It may be that your database is writing out 1K sized blocks on random
boundaries. If so then the behaviour you describe would be quite reasonable.
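
A quick way to check whether that access pattern alone reproduces the
difference is to generate it synthetically and time it on each filesystem.
A rough sketch ($RANDOM is bash-specific; the file name and sizes are made
up for illustration):

# 5000 one-kilobyte writes at random 1k-aligned offsets in a ~32MB file
for i in `seq 1 5000`; do
    dd if=/dev/zero of=testfile bs=1k count=1 \
        seek=$RANDOM conv=notrunc 2>/dev/null
done

Run the loop under time(1) once on a 1k filesystem and once on a 4k one;
a gap there would point at the filesystem rather than the database.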

2000-11-22 00:41:05

by Jeff Merkey

Subject: Re: e2fs performance as function of block size



Alan Cox wrote:
>
> > Sirs,
> > while performing extensive tests on Linux platform performance, optimized as a
> > database server, I got IMHO confusing results:
> > in particular, e2fs initialized to use a 1024 block/fragment size showed
> > significant I/O gains over a 4096 block/fragment size, while I expected the
> > opposite. I would appreciate some hints to understand this.
>
> It may be that your database is writing out 1K sized blocks on random
> boundaries. If so then the behaviour you describe would be quite reasonable.

Alan,

Perhaps, but I have reported this before and seen something similar.
It's as though the disk drivers are optimized for this case (1024). I
get better performance running NWFS at 1024 block size vs. all the other
sizes, even with EXT2 configured to use 4096, etc. At first glance,
when I was changing block sizes, I did note that by default, EXT2 set to
1024 would mix buffer sizes in the buffer cache, which skewed caching
behavior, but there is clearly some optimization relative to this size
inherent in the design of Linux -- and it may be a pure accident. This
person may be mixing and matching block sizes in the buffer cache, which
would satisfy your explanation.

Jeff


2000-11-22 00:58:11

by Alan

Subject: Re: e2fs performance as function of block size

> It's as though the disk drivers are optimized for this case (1024). I

The disk drivers are not, and they normally see merged runs of blocks so they
will see big chunks rather than 1K then 1K then 1K etc.

> behavior, but there is clearly some optimization relative to this size
> inherent in the design of Linux -- and it may be a pure accident. This
> person may be mixing and matching block sizes in the buffer cache, which
> would satisfy your explanation.

I see higher performance with 4K block sizes. I should see higher latency too
but have never been able to measure it. Maybe it depends on the file system.
It certainly depends on the nature of requests.


2000-11-22 01:03:01

by Brian Pomerantz

Subject: Re: e2fs performance as function of block size

On Tue, Nov 21, 2000 at 05:06:20PM -0700, Jeff V. Merkey wrote:
>
>
> Alan Cox wrote:
> >
> > > Sirs,
> > > while performing extensive tests on Linux platform performance, optimized as a
> > > database server, I got IMHO confusing results:
> > > in particular, e2fs initialized to use a 1024 block/fragment size showed
> > > significant I/O gains over a 4096 block/fragment size, while I expected the
> > > opposite. I would appreciate some hints to understand this.
> >
> > It may be that your database is writing out 1K sized blocks on random
> > boundaries. If so then the behaviour you describe would be quite reasonable.
>
> Alan,
>
> Perhaps, but I have reported this before and seen something similar.
> It's as though the disk drivers are optimized for this case (1024). I
> get better performance running NWFS at 1024 block size vs. all the other
> sizes, even with EXT2 configured to use 4096, etc. At first glance,
> when I was changing block sizes, I did note that by default, EXT2 set to
> 1024 would mix buffer sizes in the buffer cache, which skewed caching
> behavior, but there is clearly some optimization relative to this size
> inherent in the design of Linux -- and it may be a pure accident. This
> person may be mixing and matching block sizes in the buffer cache, which
> would satisfy your explanation.
>

You may want to try using raw I/O to fully characterize the behavior
of your device. I found that when I used raw I/O I could get very good
performance characteristics for devices. If you use dd with a raw
device you can vary the block size to see what kind of performance you
get out of different sizes. This completely bypasses any effects
of the buffer cache and gives you the performance of the disk in question.
An example would be to run this sequence of commands, noting
the time each takes (all transfers are 100MB):

time dd if=/dev/zero of=/dev/raw1 bs=512 count=204800
time dd if=/dev/zero of=/dev/raw1 bs=1k count=102400
time dd if=/dev/zero of=/dev/raw1 bs=4k count=25600
...

The standard Stephen Tweedie raw I/O will do up to 64KB chunks; beyond
that you'll probably have to write one specific to your device (SGI
has one for SCSI which I've gotten up to 1MB reads/writes). Using dd
doesn't necessarily show you your real performance, as most access patterns
are not completely sequential in nature, but you can figure out
what your "sweet spot" is for your block size.
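
Note that with some versions of the raw I/O patches you may first have to
bind a raw character device to the block device before dd can use it; the
exact tool and device names vary by version, so treat this as a sketch only:

raw /dev/raw1 /dev/hda3    # bind the raw device to the partition under test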


BAPper

2000-11-22 01:12:37

by Jeff Merkey

Subject: Re: e2fs performance as function of block size



Alan Cox wrote:
>
> > It's as though the disk drivers are optimized for this case (1024). I
>
> The disk drivers are not, and they normally see merged runs of blocks so they
> will see big chunks rather than 1K then 1K then 1K etc.
>
> > behavior, but there is clearly some optimization relative to this size
> > inherent in the design of Linux -- and it may be a pure accident. This
> > person may be mixing and matching block sizes in the buffer cache, which
> > would satisfy your explanation.
>
> I see higher performance with 4K block sizes. I should see higher latency too

^^^^^^^^^^^^^^^^^
Since buffer heads are chained, this would make sense.


> but have never been able to measure it. Maybe it depends on the file system.
> It certainly depends on the nature of requests.

Could be. NWFS likes 4K block sizes -- this is its default. On Linux,
I've been emulating other block sizes beneath it. I see the best
performance at 1024-byte blocks, worst at 512. The overhead of buffer
chaining is clearly the culprit.

On the TCPIP oops on 2.2.18-22, I have not been able to reproduce it
reliably. It appears to be in the ppp code, however, and not the TCPIP
code. The problem only shows up after several pppd connections have
accessed the box and then terminated (which is why I think
it's ppp-related). I would rate this as a level IV bug due to the
difficulty in reproducing it, and the fact that you have to deliberately
misconfigure a TCPIP network to make it show up.

Jeff

2000-11-22 22:58:38

by Michael Marxmeier

Subject: Re: e2fs performance as function of block size

Alan Cox wrote:
> I see higher performance with 4K block sizes. I should see higher
> latency too but have never been able to measure it. Maybe it depends
> on the file system.
> It certainly depends on the nature of requests.

If the files get somewhat bigger (e.g. > 1GB), having a bigger block
size also greatly reduces the ext2 overhead. Especially fsync()
used to be really bad on big files, but choosing a bigger block
size changed a lot.

If the database used by the original poster is based on
something like C-ISAM then (AFAIR) it is in fact using
1k blocks, which may explain the better results with a 1k block
size. With a 100 MB file size, fs management overhead should
not be that visible.
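
As a back-of-the-envelope check (assuming classic ext2 indirect blocks
with 4-byte block pointers):

1GB at 1k blocks = ~1048576 data blocks; 1024/4 = 256 pointers per
indirect block => ~4096 indirect blocks to walk.
1GB at 4k blocks = ~262144 data blocks; 4096/4 = 1024 pointers per
indirect block => ~256 indirect blocks, a factor of 16 less metadata.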


Michael

--
Michael Marxmeier Marxmeier Software AG
E-Mail: [email protected] Besenbruchstrasse 9
Phone : +49 202 2431440 42285 Wuppertal, Germany
Fax : +49 202 2431420 http://www.marxmeier.com/

2000-11-24 14:19:28

by Stephen C. Tweedie

Subject: Re: e2fs performance as function of block size

Hi,

On Wed, Nov 22, 2000 at 11:28:12PM +0100, Michael Marxmeier wrote:
>
> If the files get somewhat bigger (e.g. > 1GB), having a bigger block
> size also greatly reduces the ext2 overhead. Especially fsync()
> used to be really bad on big files, but choosing a bigger block
> size changed a lot.

2.4 fsync should be better, but still dependent on file size. The
O_SYNC patches I posted the other day give you an fsync which is
independent of file size.

Cheers,
Stephen