2006-05-22 18:57:48

by fitzboy

Subject: tuning for large files in xfs

I've got a very large (2TB) proprietary database that is kept on an XFS
partition under a Debian 2.6.8 kernel. It seemed to work well, until I
recently did some of my own tests and found that the performance should
be better than it is...

Basically, treat the database as just a bunch of random seeks. The XFS
partition is sitting on top of a SCSI array (Dell PowerVault) which has
13+1 disks in a RAID5, stripe size = 64k. I have done a number of tests
that mimic my app's accesses and realized that I want the inode to be
as large as possible (which on an Intel box is only 2k), and played
with su and sw and got those to 64k and 13... and performance got better.

BUT... here is what I need to understand: the filesize has a drastic
effect on performance. If I am doing random reads from a 20GB file (the
system only has 2GB of RAM, so caching is not a factor), I get
performance about where I want it to be: about 5.7 - 6ms per block. But
if that file is 2TB then the time almost doubles, to 11ms. Why is this?
No other factors changed, only the filesize.

Another note: on this partition there is no other file than this one.

I am assuming that somewhere along the way the kernel now has to do an
additional read from the disk for some XFS metadata... perhaps the
btree for the file doesn't fit in the kernel's memory, so it actually
needs to do 2 seeks: one to find out where to go on disk and then one
to get the data. Is that the case? If so, how can I remedy it? How can
I tell the kernel to keep all of the file's XFS metadata in memory?

Tim


2006-05-22 19:15:11

by Avi Kivity

Subject: Re: tuning for large files in xfs

fitzboy wrote:
>
> BUT... here is what I need to understand, the filesize has a drastic
> effect on performance. If I am doing random reads from a 20GB file
> (system only has 2GB ram, so caching is not a factor), I get
> performance about where I want it to be: about 5.7 - 6ms per block. But
> if that file is 2TB then the time almost doubles, to 11ms. Why is this?
> No other factors changed, only the filesize.
>

With the 20GB file, the disk head is seeking over 1% of the tracks. With
the 2TB file, it is seeking over 100% of the tracks.
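
To put rough numbers on that, here is a trivial sketch; it only
restates the sizes from your mail and doesn't model actual seek times:

#include <iostream>

// Back-of-the-envelope: what fraction of the array's LBA span the
// random reads sweep for each file size.
int main() {
    const double arrayGB  = 2048.0;              // ~2TB array
    const double fileGB[] = { 20.0, 2048.0 };    // small test file vs full file
    for (int i = 0; i < 2; i++)
        std::cout << fileGB[i] << "GB file -> seeks span ~"
                  << 100.0 * fileGB[i] / arrayGB << "% of the platters"
                  << std::endl;
    return 0;
}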

>
> I am assuming that somewhere along the way, the kernel now has to do an
> additional read from the disk for some metadata for xfs... perhaps the
> btree for the file doesn't fit in the kernel's memory? so it actually
> needs to do 2 seeks, one to find out where to go on disk then one to

No, the btree should be completely cached fairly soon into the test.

> get the data. Is that the case? If so, how can I remedy this? How can I
> tell the kernel to keep all of the files xfs data in memory?

Add more disks. If you're writing, use RAID 1.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

2006-05-22 19:32:53

by fitzboy

Subject: Re: tuning for large files in xfs

That makes sense, but how come the numbers for the large file (2TB)
don't seem to match the average seek time that 15k drives have? The
average seek time for those drives is in the 5ms range, and I assume
they aren't just seeking within the first couple of tracks when they
come up with that number (and Bonnie++ confirms this too). Any thoughts
on why it is double for me when I use the drives?

Avi Kivity wrote:
> fitzboy wrote:
>
>>
>> BUT... here is what I need to understand, the filesize has a drastic
>> effect on performance. If I am doing random reads from a 20GB file
>> (system only has 2GB ram, so caching is not a factor), I get
>> performance about where I want it to be: about 5.7 - 6ms per block. But
>> if that file is 2TB then the time almost doubles, to 11ms. Why is this?
>> No other factors changed, only the filesize.
>>
>
> With the 20GB file, the disk head is seeking over 1% of the tracks. With
> the 2TB file, it is seeking over 100% of the tracks.
>
>>
>> I am assuming that somewhere along the way, the kernel now has to do an
>> additional read from the disk for some metadata for xfs... perhaps the
>> btree for the file doesn't fit in the kernel's memory? so it actually
>> needs to do 2 seeks, one to find out where to go on disk then one to
>
>
> No, the btree should be completely cached fairly soon into the test.
>
>> get the data. Is that the case? If so, how can I remedy this? How can I
>> tell the kernel to keep all of the files xfs data in memory?
>
>
> Add more disks. If you're writing, use RAID 1.
>

2006-05-22 19:36:25

by Avi Kivity

Subject: Re: tuning for large files in xfs

fitzboy wrote:
> That makes sense, but how come the numbers for the large file (2TB)
> doesn't seem to match with the Avg. Seek Time that 15k drives have? Avg
> Seek Time for those drives are in the 5ms range, and I assume they
> aren't just seeking in the first couple tracks when they come up with
> that number (and Bonnie++ confirms this too). Any thoughts on why it is
> double for me when I use the drives?
>

What's your testing methodology?

You can try to measure the number of seeks going to the disk using
iostat, and see if that matches your test program.
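
If it's easier to script, the same counters iostat reports live in
/proc/diskstats; here is a rough sketch that prints reads-completed and
sectors-read for one device (the device name "sdb" and the whole-disk
field layout are assumptions on my part, adjust for your array):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

// Snapshot the read counters for one block device from /proc/diskstats.
// Run it before and after the test and diff the numbers.
int main(int argc, char* argv[]) {
    std::string target = (argc > 1) ? argv[1] : "sdb";   // placeholder device
    std::ifstream ds("/proc/diskstats");
    std::string line;
    while (std::getline(ds, line)) {
        std::istringstream in(line);
        unsigned maj, mnr;
        std::string name;
        unsigned long long reads, readsMerged, sectorsRead;
        if (in >> maj >> mnr >> name >> reads >> readsMerged >> sectorsRead
            && name == target)
            std::cout << name << ": " << reads << " reads completed, "
                      << sectorsRead << " sectors read" << std::endl;
    }
    return 0;
}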

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

2006-05-22 22:22:37

by Miquel van Smoorenburg

Subject: Re: tuning for large files in xfs

In article <[email protected]>,
fitzboy <[email protected]> wrote:
>I've got a very large (2TB) proprietary database that is kept on an XFS
>partition

Ehh.. partition ? Let me tell you about a very common mistake:

>under a debian 2.6.8 kernel. It seemed to work well, until I
>recently did some of my own tests and found that the performance should
>be better then it is...
>
>basically, treat the database as just a bunch of random seeks. The XFS
>partition is sitting on top of a SCSI array (Dell PowerVault) which has
>13+1 disks in a RAID5, stripe size=64k. I have done a number of tests
>that mimic my app's accesses and realized that I want the inode to be
>as large as possible (which in an intel box is only 2k), played with su
>and sw and got those to 64k and 13... and performance got better.

If your RAID has 64K stripes, and the XFS partition is also tuned
for 64K, you expect those two to match, right ? But I bet the start
of the partition isn't aligned to a multiple of 64K..

In those cases I just use the whole disk instead of a partition,
or I dump the partition table with sfdisk, move the start of the
partition I'm using to a multiple of X, and re-read it. You need
to re-mkfs the fs though.
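
For illustration, the whole check is just "does start_sector * 512 land
on a multiple of the stripe unit". A minimal sketch, assuming 512-byte
sectors and the 64k stripe unit from this setup (63 is only the classic
msdos-label default start; pass in whatever sfdisk -d shows):

#include <cstdlib>
#include <iostream>

// Is a partition's first byte aligned to the RAID stripe unit?
int main(int argc, char* argv[]) {
    const unsigned long long sectorSize = 512;         // bytes per sector (assumed)
    const unsigned long long stripeUnit = 64 * 1024;   // 64k su, as in this array
    unsigned long long startSector = (argc > 1) ? std::strtoull(argv[1], 0, 10) : 63;
    unsigned long long startByte = startSector * sectorSize;
    if (startByte % stripeUnit == 0) {
        std::cout << "sector " << startSector << " is stripe-aligned" << std::endl;
    } else {
        unsigned long long next = ((startByte / stripeUnit) + 1) * stripeUnit / sectorSize;
        std::cout << "sector " << startSector << " is NOT aligned; next aligned"
                  << " start would be sector " << next << std::endl;
    }
    return 0;
}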

Not sure if it has any impact in your case, just thought I'd mention it.

Mike.

2006-05-22 22:30:41

by Matheus Izvekov

Subject: Re: tuning for large files in xfs

On 5/22/06, fitzboy <[email protected]> wrote:
> I've got a very large (2TB) proprietary database that is kept on an XFS
> partition under a debian 2.6.8 kernel. It seemed to work well, until I
> recently did some of my own tests and found that the performance should
> be better then it is...
>
> basically, treat the database as just a bunch of random seeks. The XFS
> partition is sitting on top of a SCSI array (Dell PowerVault) which has
> 13+1 disks in a RAID5, stripe size=64k. I have done a number of tests
> that mimic my app's accesses and realized that I want the inode to be
> as large as possible (which in an intel box is only 2k), played with su
> and sw and got those to 64k and 13... and performance got better.
>
> BUT... here is what I need to understand, the filesize has a drastic
> effect on performance. If I am doing random reads from a 20GB file
> (system only has 2GB ram, so caching is not a factor), I get
> performance about where I want it to be: about 5.7 - 6ms per block. But
> if that file is 2TB then the time almost doubles, to 11ms. Why is this?
> No other factors changed, only the filesize.
>
> Another note, on this partition there is no other file then this one
> file.
>

Why use a filesystem with just one file?? Why not use the device node
of the partition directly?
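
Roughly like this, i.e. pread() straight from the block device instead
of going through a file. Just a sketch; /dev/sdb1 is only a placeholder
for whatever the array's device node actually is:

#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <iostream>

// Read one 32k block straight off the block device: no filesystem, no
// extent maps.  For offsets past 2GB on a 32-bit box, build with
// -D_FILE_OFFSET_BITS=64.
int main() {
    char buffer[32 * 1024];
    int fd = open("/dev/sdb1", O_RDONLY);
    if (fd < 0) {
        std::cout << "open failed" << std::endl;
        return 1;
    }
    off_t offset = 1000LL * 32 * 1024;   // any 32k-aligned offset into the device
    if (pread(fd, buffer, sizeof(buffer), offset) != (ssize_t)sizeof(buffer))
        std::cout << "short read" << std::endl;
    close(fd);
    return 0;
}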

2006-05-22 22:51:32

by Nathan Scott

Subject: Re: tuning for large files in xfs

On Mon, May 22, 2006 at 11:57:44AM -0700, fitzboy wrote:
> I've got a very large (2TB) proprietary database that is kept on an XFS
> partition under a debian 2.6.8 kernel. It seemed to work well, until I
> recently did some of my own tests and found that the performance should
> be better then it is...
>
> basically, treat the database as just a bunch of random seeks. The XFS
> partition is sitting on top of a SCSI array (Dell PowerVault) which has
> 13+1 disks in a RAID5, stripe size=64k. I have done a number of tests
> that mimic my app's accesses and realized that I want the inode to be
> as large as possible (which in an intel box is only 2k), played with su
> and sw and got those to 64k and 13... and performance got better.
>
> BUT... here is what I need to understand, the filesize has a drastic
> effect on performance. If I am doing random reads from a 20GB file
> (system only has 2GB ram, so caching is not a factor), I get
> performance about where I want it to be: about 5.7 - 6ms per block. But
> if that file is 2TB then the time almost doubles, to 11ms. Why is this?
> No other factors changed, only the filesize.
>
> Another note, on this partition there is no other file then this one
> file.
>
> I am assuming that somewhere along the way, the kernel now has to do an
> additional read from the disk for some metadata for xfs... perhaps the
> btree for the file doesn't fit in the kernel's memory? so it actually
> needs to do 2 seeks, one to find out where to go on disk then one to
> get the data. Is that the case? If so, how can I remedy this? How can I
> tell the kernel to keep all of the files xfs data in memory?

Hi Tim,

[please CC this sort of question to [email protected]]

Can you send xfs_info output for the filesystem and the output
from xfs_bmap -vvp on this file?

The file size has zero effect on performance, but the number and
layout of the file's extents are very relevant. How was this file
created?

cheers.

--
Nathan

2006-05-23 00:40:16

by fitzboy

Subject: Re: tuning for large files in xfs

>
> What's your testing methodology?
>

Here is my code... pretty simple: it opens the file and reads a random
32k block each iteration (note: doing a 2k read instead doesn't make
much of a difference, only a marginal one, from 6.7ms per seek down to
6.1ms).

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstdlib>
#include <ctime>
#include <iostream>

// millisecond wall clock (stands in for my initTimers()/getTimeMS() helpers)
static long getTimeMS() {
    struct timeval tv;
    gettimeofday(&tv, 0);
    return tv.tv_sec * 1000L + tv.tv_usec / 1000;
}

int main(int argc, char* argv[]) {
    char buffer[256*1024];
    int fd = open(argv[1], O_RDONLY);
    struct stat s;
    fstat(fd, &s);
    s.st_size = s.st_size - (s.st_size % (256*1024));  // round down to buffer size
    srandom(time(0));                                   // seed (original used startSec)
    long long currentPos;
    int currentSize = 32*1024;                          // 32k per read
    long startTime = getTimeMS();
    for (int currentRead = 0; currentRead < 10000; currentRead++) {
        currentPos = random();
        currentPos = currentPos * currentSize;          // random 32k-aligned offset
        currentPos = currentPos % s.st_size;
        if (pread(fd, buffer, currentSize, currentPos) != currentSize)
            std::cout << "couldn't read in" << std::endl;
    }
    std::cout << "blocksize of " << currentSize << " took "
              << getTimeMS() - startTime << " ms" << std::endl;
    return 0;
}

> You can try to measure the amount of seeks going to the disk by using
> iostat, and see if that matches your test program.
>

I used iostat and found exactly what I was expecting: 10,000 rounds x
16 (the number of 2k blocks in a 32k read) x 4 (the number of 512-byte
blocks per 2k block) = 640,000 blocks read, which is what iostat told
me. So now my question remains: if each seek is supposed to average
3.5ms, how come I am seeing an average of 6-7ms?

2006-05-23 00:42:23

by fitzboy

Subject: Re: tuning for large files in xfs

Matheus Izvekov wrote:
>
> Why use a flesystem with just one file?? Why not use the device node
> of the partition directly?

I am not sure what you mean, could you elaborate?

2006-05-23 00:49:53

by fitzboy

Subject: Re: tuning for large files in xfs

/mnt/array/disk1/pidBase32k:
EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS
0: [0..4182527]: 384..4182911 0 (384..4182911) 4182528 00011
1: [4182528..8354943]: 4194432..8366847 1 (128..4172543) 4172416 00011
2: [8354944..12548479]: 8388736..12582271 2 (128..4193663) 4193536 00011
3: [12548480..16703487]: 12583040..16738047 3 (128..4155135) 4155008 00011
4: [16703488..20767487]: 16777344..20841343 4 (128..4064127) 4064000 00011
5: [20767488..24954623]: 20971648..25158783 5 (128..4187263) 4187136 00011
6: [24954624..29106431]: 25165952..29317759 6 (128..4151935) 4151808 00011
7: [29106432..33258111]: 29360256..33511935 7 (128..4151807) 4151680 00011
8: [33258112..37364735]: 33554560..37661183 8 (128..4106751) 4106624 00010
9: [37364736..41519359]: 37748864..41903487 9 (128..4154751) 4154624 00011
10: [41519360..45683455]: 41943168..46107263 10 (128..4164223) 4164096 00011
11: [45683456..49745919]: 46137472..50199935 11 (128..4062591) 4062464 00011
12: [49745920..53932031]: 50331776..54517887 12 (128..4186239) 4186112 00011
13: [53932032..58105087]: 54526080..58699135 13 (128..4173183) 4173056 00011
14: [58105088..62175103]: 58720384..62790399 14 (128..4070143) 4070016 00011
15: [62175104..66356479]: 62914688..67096063 15 (128..4181503) 4181376 00011
16: [66356480..70531967]: 67108992..71284479 16 (128..4175615) 4175488 00011
17: [70531968..74609791]: 71303296..75381119 17 (128..4077951) 4077824 00011
18: [74609792..78773887]: 75497600..79661695 18 (128..4164223) 4164096 00011
19: [78773888..82879359]: 79691904..83797375 19 (128..4105599) 4105472 00011
20: [82879360..87015167]: 83886208..88022015 20 (128..4135935) 4135808 00010
21: [87015168..91182719]: 88080512..92248063 21 (128..4167679) 4167552 00011
22: [91182720..95375615]: 92274816..96467711 22 (128..4193023) 4192896 00011
23: [95375616..99553023]: 96469120..100646527 23 (128..4177535) 4177408 00011
24: [99553024..103719295]: 100663424..104829695 24 (128..4166399) 4166272 00011
25: [103719296..107857791]: 104857728..108996223 25 (128..4138623) 4138496 00011
26: [107857792..112014719]: 109052032..113208959 26 (128..4157055) 4156928 00011
27: [112014720..116178559]: 113246336..117410175 27 (128..4163967) 4163840 00011
28: [116178560..120336255]: 117440640..121598335 28 (128..4157823) 4157696 00011
29: [120336256..124484095]: 121634944..125782783 29 (128..4147967) 4147840 00011
30: [124484096..128600703]: 125829248..129945855 30 (128..4116735) 4116608 00011
31: [128600704..132771071]: 130023552..134193919 31 (128..4170495) 4170368 00011
32: [132771072..136906367]: 134217856..138353151 32 (128..4135423) 4135296 00011
33: [136906368..141054207]: 138412160..142559999 33 (128..4147967) 4147840 00011
34: [141054208..145218815]: 142606464..146771071 34 (128..4164735) 4164608 00011
35: [145218816..149359999]: 146800768..150941951 35 (128..4141311) 4141184 00011
36: [149360000..153508735]: 150995072..155143807 36 (128..4148863) 4148736 00011
37: [153508736..157662207]: 155189376..159342847 37 (128..4153599) 4153472 00011
38: [157662208..161817215]: 159383680..163538687 38 (128..4155135) 4155008 00011
39: [161817216..166000255]: 163577984..167761023 39 (128..4183167) 4183040 00011
40: [166000256..170112511]: 167772288..171884543 40 (128..4112383) 4112256 00010
41: [170112512..174281343]: 171966592..176135423 41 (128..4168959) 4168832 00011
42: [174281344..178448127]: 176160896..180327679 42 (128..4166911) 4166784 00011
43: [178448128..182641023]: 180355200..184548095 43 (128..4193023) 4192896 00011
44: [182641024..186813951]: 184549504..188722431 44 (128..4173055) 4172928 00011
45: [186813952..190898047]: 188743808..192827903 45 (128..4084223) 4084096 00010
46: [190898048..195091967]: 192938112..197132031 46 (128..4194047) 4193920 00011
47: [195091968..199263231]: 197132416..201303679 47 (128..4171391) 4171264 00011
48: [199263232..203455615]: 201326720..205519103 48 (128..4192511) 4192384 00011
49: [203455616..204409343]: 205521024..206474751 49 (128..953855) 953728 00011
50: [204409344..204457343]: 206610560..206658559 49 (1089664..1137663) 48000 00011
51: [204457344..204504447]: 206800512..206847615 49 (1279616..1326719) 47104 00011
52: [204504448..204555263]: 206988288..207039103 49 (1467392..1518207) 50816 00001
53: [204555264..204604031]: 207179520..207228287 49 (1658624..1707391) 48768 00011
54: [204604032..204669567]: 207369728..207435263 49 (1848832..1914367) 65536 00010
55: [204669568..204718847]: 207577600..207626879 49 (2056704..2105983) 49280 00011
56: [204718848..204772351]: 207769344..207822847 49 (2248448..2301951) 53504 00011
57: [204772352..204820735]: 207963904..208012287 49 (2443008..2491391) 48384 00011
58: [204820736..204875007]: 208153216..208207487 49 (2632320..2686591) 54272 00011
59: [204875008..204914431]: 208364544..208403967 49 (2843648..2883071) 39424 00001
60: [204914432..204986751]: 208573056..208645375 49 (3052160..3124479) 72320 00011
61: [204986752..204999551]: 208653184..208665983 49 (3132288..3145087) 12800 00011
62: [204999552..205001215]: 208668032..208669695 49 (3147136..3148799) 1664 00011
63: [205001216..205070591]: 208818560..208887935 49 (3297664..3367039) 69376 00011
64: [205070592..205123327]: 209029504..209082239 49 (3508608..3561343) 52736 00011
65: [205123328..205177599]: 209223808..209278079 49 (3702912..3757183) 54272 00011
66: [205177600..205228927]: 209413888..209465215 49 (3892992..3944319) 51328 00011
67: [205228928..205328127]: 209605248..209704447 49 (4084352..4183551) 99200 00011
68: [205328128..205374335]: 209996160..210042367 50 (280960..327167) 46208 00011
69: [205374336..205429247]: 210189568..210244479 50 (474368..529279) 54912 00011
70: [205429248..205500543]: 210388864..210460159 50 (673664..744959) 71296 00011
71: [205500544..205549183]: 210608128..210656767 50 (892928..941567) 48640 00011
72: [205549184..205600511]: 210815616..210866943 50 (1100416..1151743) 51328 00011
73: [205600512..205652735]: 211047040..211099263 50 (1331840..1384063) 52224 00011
74: [205652736..205660671]: 211104640..211112575 50 (1389440..1397375) 7936 00011
75: [205660672..205684735]: 211114112..211138175 50 (1398912..1422975) 24064 00011
76: [205684736..205718783]: 211299200..211333247 50 (1584000..1618047) 34048 00011
77: [205718784..205726975]: 211350400..211358591 50 (1635200..1643391) 8192 00011
78: [205726976..205728895]: 211361152..211363071 50 (1645952..1647871) 1920 00011
79: [205728896..205761535]: 211363328..211395967 50 (1648128..1680767) 32640 00011
80: [205761536..205811327]: 211550336..211600127 50 (1835136..1884927) 49792 00011
81: [205811328..205827455]: 211699968..211716095 50 (1984768..2000895) 16128 00010
82: [205827456..205830399]: 211725824..211728767 50 (2010624..2013567) 2944 00011
83: [205830400..205831039]: 211730304..211730943 50 (2015104..2015743) 640 00011
84: [205831040..205890687]: 211879168..211938815 50 (2163968..2223615) 59648 00010
85: [205890688..205900543]: 211994624..212004479 50 (2279424..2289279) 9856 00011
86: [205900544..205902975]: 212009856..212012287 50 (2294656..2297087) 2432 00011
87: [205902976..205931007]: 212012800..212040831 50 (2297600..2325631) 28032 00011
88: [205931008..205971583]: 212197504..212238079 50 (2482304..2522879) 40576 00011
89: [205971584..205998591]: 212347136..212374143 50 (2631936..2658943) 27008 00011
90: [205998592..206002303]: 212377728..212381439 50 (2662528..2666239) 3712 00011
91: [206002304..206002943]: 212381952..212382591 50 (2666752..2667391) 640 00011
92: [206002944..206048639]: 212527360..212573055 50 (2812160..2857855) 45696 00011
93: [206048640..206082815]: 212639232..212673407 50 (2924032..2958207) 34176 00001
94: [206082816..206088575]: 212677760..212683519 50 (2962560..2968319) 5760 00011
95: [206088576..206089471]: 212684544..212685439 50 (2969344..2970239) 896 00011
96: [206089472..206115839]: 212685952..212712319 50 (2970752..2997119) 26368 00011
97: [206115840..206161535]: 212859008..212904703 50 (3143808..3189503) 45696 00011
98: [206161536..206181503]: 213002624..213022591 50 (3287424..3307391) 19968 00011
99: [206181504..206184831]: 213024000..213027327 50 (3308800..3312127) 3328 00011
100: [206184832..206185471]: 213028224..213028863 50 (3313024..3313663) 640 00011
101: [206185472..206210815]: 213028992..213054335 50 (3313792..3339135) 25344 00011
102: [206210816..206253311]: 213200128..213242623 50 (3484928..3527423) 42496 00011
103: [206253312..206269055]: 213312640..213328383 50 (3597440..3613183) 15744 00011
104: [206269056..206272895]: 213336576..213340415 50 (3621376..3625215) 3840 00001
105: [206272896..206273663]: 213340672..213341439 50 (3625472..3626239) 768 00011
106: [206273664..206324607]: 213489792..213540735 50 (3774592..3825535) 50944 00011
107: [206324608..206348031]: 213603328..213626751 50 (3888128..3911551) 23424 00011
108: [206348032..206352511]: 213629696..213634175 50 (3914496..3918975) 4480 00011
109: [206352512..206353407]: 213634816..213635711 50 (3919616..3920511) 896 00011
110: [206353408..206480511]: 213781888..213908991 50 (4066688..4193791) 127104 00011
111: [206480512..206542975]: 214230144..214292607 51 (320640..383103) 62464 00011
112: [206542976..206553343]: 214307200..214317567 51 (397696..408063) 10368 00011
113: [206553344..206555263]: 214319360..214321279 51 (409856..411775) 1920 00011
114: [206555264..206555647]: 214321792..214322175 51 (412288..412671) 384 00011
115: [206555648..206604799]: 214471040..214520191 51 (561536..610687) 49152 00011
116: [206604800..206625023]: 214577920..214598143 51 (668416..688639) 20224 00011
117: [206625024..206628991]: 214605568..214609535 51 (696064..700031) 3968 00011
118: [206628992..206629759]: 214610048..214610815 51 (700544..701311) 768 00011
119: [206629760..206630143]: 214611072..214611455 51 (701568..701951) 384 00010
120: [206630144..206687615]: 214767104..214824575 51 (857600..915071) 57472 00011
121: [206687616..206720895]: 214899072..214932351 51 (989568..1022847) 33280 00011
122: [206720896..206727551]: 214936576..214943231 51 (1027072..1033727) 6656 00010
123: [206727552..206728831]: 214944768..214946047 51 (1035264..1036543) 1280 00001
124: [206728832..206784127]: 215097472..215152767 51 (1187968..1243263) 55296 00011
125: [206784128..206814335]: 215225600..215255807 51 (1316096..1346303) 30208 00011
126: [206814336..206820479]: 215266816..215272959 51 (1357312..1363455) 6144 00011
127: [206820480..206821503]: 215273472..215274495 51 (1363968..1364991) 1024 00001
128: [206821504..206821759]: 215274624..215274879 51 (1365120..1365375) 256 00011
129: [206821760..206867839]: 215420416..215466495 51 (1510912..1556991) 46080 00011
130: [206867840..206897279]: 215554048..215583487 51 (1644544..1673983) 29440 00011
131: [206897280..206901887]: 215587072..215591679 51 (1677568..1682175) 4608 00011
132: [206901888..206902655]: 215592192..215592959 51 (1682688..1683455) 768 00010
133: [206902656..206937599]: 215593472..215628415 51 (1683968..1718911) 34944 00011
134: [206937600..206979839]: 215785344..215827583 51 (1875840..1918079) 42240 00011
135: [206979840..206999167]: 215910784..215930111 51 (2001280..2020607) 19328 00011
136: [206999168..207001727]: 215932672..215935231 51 (2023168..2025727) 2560 00011
137: [207001728..207001983]: 215935744..215935999 51 (2026240..2026495) 256 00011
138: [207001984..207079295]: 216076288..216153599 51 (2166784..2244095) 77312 00010
139: [207079296..207099263]: 216252160..216272127 51 (2342656..2362623) 19968 00011
140: [207099264..207102591]: 216273152..216276479 51 (2363648..2366975) 3328 00010
141: [207102592..207103871]: 216276992..216278271 51 (2367488..2368767) 1280 00011
142: [207103872..207149951]: 216427264..216473343 51 (2517760..2563839) 46080 00011
143: [207149952..207201919]: 216550144..216602111 51 (2640640..2692607) 51968 00010
144: [207201920..207259007]: 216754944..216812031 51 (2845440..2902527) 57088 00011
145: [207259008..207313023]: 216951808..217005823 51 (3042304..3096319) 54016 00011
146: [207313024..207371135]: 217144704..217202815 51 (3235200..3293311) 58112 00011
147: [207371136..207406335]: 217342592..217377791 51 (3433088..3468287) 35200 00010
148: [207406336..207455359]: 217523072..217572095 51 (3613568..3662591) 49024 00011
149: [207455360..207546495]: 217710464..217801599 51 (3800960..3892095) 91136 00011
150: [207546496..207691903]: 217953152..218098559 51 (4043648..4189055) 145408 00011
151: [207691904..207747071]: 218522752..218577919 52 (418944..474111) 55168 00011
152: [207747072..207815679]: 218719104..218787711 52 (615296..683903) 68608 00011
153: [207815680..207821055]: 218811776..218817151 52 (707968..713343) 5376 00011
154: [207821056..207822335]: 218819840..218821119 52 (716032..717311) 1280 00011
155: [207822336..207889407]: 218966400..219033471 52 (862592..929663) 67072 00011
156: [207889408..207941375]: 219173248..219225215 52 (1069440..1121407) 51968 00011
157: [207941376..207999999]: 219368960..219427583 52 (1265152..1323775) 58624 00011
158: [208000000..208075135]: 219568256..219643391 52 (1464448..1539583) 75136 00010
159: [208075136..208114175]: 219776256..219815295 52 (1672448..1711487) 39040 00011
160: [208114176..208165759]: 219956608..220008191 52 (1852800..1904383) 51584 00011
161: [208165760..208226047]: 220162560..220222847 52 (2058752..2119039) 60288 00001
162: [208226048..208268799]: 220365056..220407807 52 (2261248..2303999) 42752 00011
163: [208268800..208339967]: 220553344..220624511 52 (2449536..2520703) 71168 00011
164: [208339968..208387967]: 220763392..220811391 52 (2659584..2707583) 48000 00011
165: [208387968..208448895]: 220983424..221044351 52 (2879616..2940543) 60928 00011
166: [208448896..208500095]: 221193856..221245055 52 (3090048..3141247) 51200 00011
167: [208500096..208559231]: 221387008..221446143 52 (3283200..3342335) 59136 00011
168: [208559232..208647039]: 221586560..221674367 52 (3482752..3570559) 87808 00011
169: [208647040..208697727]: 221820032..221870719 52 (3716224..3766911) 50688 00011
170: [208697728..208751231]: 222025600..222079103 52 (3921792..3975295) 53504 00011
171: [208751232..208801663]: 222240512..222290943 52 (4136704..4187135) 50432 00011
172: [208801664..208858239]: 222443904..222500479 53 (145792..202367) 56576 00011
173: [208858240..208916351]: 222645120..222703231 53 (347008..405119) 58112 00011
174: [208916352..208972287]: 222842368..222898303 53 (544256..600191) 55936 00011
175: [208972288..209028479]: 223042816..223099007 53 (744704..800895) 56192 00011
176: [209028480..209125119]: 223264768..223361407 53 (966656..1063295) 96640 00011
177: [209125120..209182719]: 223497984..223555583 53 (1199872..1257471) 57600 00010
178: [209182720..209236735]: 223689856..223743871 53 (1391744..1445759) 54016 00011
179: [209236736..209295359]: 223890944..223949567 53 (1592832..1651455) 58624 00011
180: [209295360..209351295]: 224092160..224148095 53 (1794048..1849983) 55936 00011
181: [209351296..209408895]: 224297728..224355327 53 (1999616..2057215) 57600 00011
182: [209408896..209461887]: 224488448..224541439 53 (2190336..2243327) 52992 00011
183: [209461888..209508351]: 224687360..224733823 53 (2389248..2435711) 46464 00011
184: [209508352..209556095]: 224877952..224925695 53 (2579840..2627583) 47744 00010
185: [209556096..209609599]: 225071104..225124607 53 (2772992..2826495) 53504 00011
186: [209609600..209668735]: 225264896..225324031 53 (2966784..3025919) 59136 00011
187: [209668736..209789951]: 225479936..225601151 53 (3181824..3303039) 121216 00011
188: [209789952..209847039]: 225756032..225813119 53 (3457920..3515007) 57088 00011
189: [209847040..209904895]: 225957504..226015359 53 (3659392..3717247) 57856 00011
190: [209904896..209944959]: 226160512..226200575 53 (3862400..3902463) 40064 00010
191: [209944960..210092287]: 226339968..226487295 53 (4041856..4189183) 147328 00011
192: [210092288..210183167]: 226913280..227004159 54 (420864..511743) 90880 00001
193: [210183168..210301311]: 227145728..227263871 54 (653312..771455) 118144 00011
194: [210301312..213660415]: 227288576..230647679 54 (796160..4155263) 3359104 00011
195: [213660416..217807871]: 230686848..234834303 55 (128..4147583) 4147456 00011
196: [217807872..221920639]: 234881152..238993919 56 (128..4112895) 4112768 00010
197: [221920640..226062591]: 239075456..243217407 57 (128..4142079) 4141952 00011
198: [226062592..230209279]: 243269760..247416447 58 (128..4146815) 4146688 00011
199: [230209280..234403071]: 247464064..251657855 59 (128..4193919) 4193792 00011
200: [234403072..238592895]: 251658368..255848191 60 (128..4189951) 4189824 00011
201: [238592896..242750975]: 255852672..260010751 61 (128..4158207) 4158080 00011
202: [242750976..246860671]: 260046976..264156671 62 (128..4109823) 4109696 00010
203: [246860672..251017727]: 264241280..268398335 63 (128..4157183) 4157056 00011
204: [251017728..255157503]: 268435584..272575359 64 (128..4139903) 4139776 00011
205: [255157504..259307391]: 272629888..276779775 65 (128..4150015) 4149888 00011
206: [259307392..263501183]: 276824192..281017983 66 (128..4193919) 4193792 00011
207: [263501184..267641983]: 281018496..285159295 67 (128..4140927) 4140800 00011
208: [267641984..271771903]: 285212800..289342719 68 (128..4130047) 4129920 00011
209: [271771904..275953407]: 289407104..293588607 69 (128..4181631) 4181504 00011
210: [275953408..280124031]: 293601408..297772031 70 (128..4170751) 4170624 00010
211: [280124032..284303999]: 297795712..301975679 71 (128..4180095) 4179968 00011
212: [284304000..288458879]: 301990016..306144895 72 (128..4155007) 4154880 00011
213: [288458880..292621567]: 306184320..310347007 73 (128..4162815) 4162688 00011
214: [292621568..296794111]: 310378624..314551167 74 (128..4172671) 4172544 00011
215: [296794112..300919167]: 314572928..318697983 75 (128..4125183) 4125056 00011
216: [300919168..305046527]: 318767232..322894591 76 (128..4127487) 4127360 00011
217: [305046528..309240319]: 322961536..327155327 77 (128..4193919) 4193792 00011
218: [309240320..313411071]: 327155840..331326591 78 (128..4170879) 4170752 00011
219: [313411072..317591423]: 331350144..335530495 79 (128..4180479) 4180352 00011
220: [317591424..321744127]: 335544448..339697151 80 (128..4152831) 4152704 00010
221: [321744128..325919743]: 339738752..343914367 81 (128..4175743) 4175616 00011
222: [325919744..329981951]: 343933056..347995263 82 (128..4062335) 4062208 00011
223: [329981952..334172159]: 348127360..352317567 83 (128..4190335) 4190208 00011
224: [334172160..338297855]: 352321664..356447359 84 (128..4125823) 4125696 00011
225: [338297856..342473343]: 356515968..360691455 85 (128..4175615) 4175488 00011
226: [342473344..346622719]: 360710272..364859647 86 (128..4149503) 4149376 00011
227: [346622720..350778239]: 364904576..369060095 87 (128..4155647) 4155520 00011
228: [350778240..354899711]: 369098880..373220351 88 (128..4121599) 4121472 00010
229: [354899712..359093759]: 373293184..377487231 89 (128..4194175) 4194048 00011
230: [359093760..363223807]: 377487488..381617535 90 (128..4130175) 4130048 00011
231: [363223808..367413247]: 381681792..385871231 91 (128..4189567) 4189440 00011
232: [367413248..371531263]: 385876096..389994111 92 (128..4118143) 4118016 00011
233: [371531264..375688191]: 390070400..394227327 93 (128..4157055) 4156928 00011
234: [375688192..379867519]: 394264704..398444031 94 (128..4179455) 4179328 00011
235: [379867520..383962879]: 398459008..402554367 95 (128..4095487) 4095360 00011
236: [383962880..388142719]: 402653312..406833151 96 (128..4179967) 4179840 00011
237: [388142720..392279679]: 406847616..410984575 97 (128..4137087) 4136960 00011
238: [392279680..396412927]: 411041920..415175167 98 (128..4133375) 4133248 00011
239: [396412928..400469887]: 415236224..419293183 99 (128..4057087) 4056960 00011
240: [400469888..404607103]: 419430528..423567743 100 (128..4137343) 4137216 00011
241: [404607104..408657279]: 423624832..427675007 101 (128..4050303) 4050176 00011
242: [408657280..412818943]: 427819136..431980799 102 (128..4161791) 4161664 00011
243: [412818944..416984063]: 432013440..436178559 103 (128..4165247) 4165120 00011
244: [416984064..421143807]: 436207744..440367487 104 (128..4159871) 4159744 00011
245: [421143808..425291903]: 440402048..444550143 105 (128..4148223) 4148096 00011
246: [425291904..429485439]: 444596352..448789887 106 (128..4193663) 4193536 00011
247: [429485440..433672575]: 448790656..452977791 107 (128..4187263) 4187136 00011
248: [433672576..437763839]: 452984960..457076223 108 (128..4091391) 4091264 00011
249: [437763840..441864063]: 457179264..461279487 109 (128..4100351) 4100224 00011
250: [441864064..446052863]: 461373568..465562367 110 (128..4188927) 4188800 00011
251: [446052864..450201983]: 465567872..469716991 111 (128..4149247) 4149120 00011
252: [450201984..454391423]: 469762176..473951615 112 (128..4189567) 4189440 00011
253: [454391424..458519039]: 473956480..478084095 113 (128..4127743) 4127616 00011
254: [458519040..462705023]: 478150784..482336767 114 (128..4186111) 4185984 00011
255: [462705024..466874879]: 482345088..486514943 115 (128..4169983) 4169856 00011
256: [466874880..471003135]: 486539392..490667647 116 (128..4128383) 4128256 00011
257: [471003136..475196927]: 490733696..494927487 117 (128..4193919) 4193792 00011
258: [475196928..479390207]: 494928000..499121279 118 (128..4193407) 4193280 00011
259: [479390208..483533439]: 499122304..503265535 119 (128..4143359) 4143232 00011
260: [483533440..487663487]: 503316608..507446655 120 (128..4130175) 4130048 00011
261: [487663488..491798911]: 507510912..511646335 121 (128..4135551) 4135424 00011
262: [491798912..495974527]: 511705216..515880831 122 (128..4175743) 4175616 00011
263: [495974528..500075519]: 515899520..520000511 123 (128..4101119) 4100992 00010
264: [500075520..504258943]: 520093824..524277247 124 (128..4183551) 4183424 00011
265: [504258944..508416255]: 524288128..528445439 125 (128..4157439) 4157312 00010
266: [508416256..512558719]: 528482432..532624895 126 (128..4142591) 4142464 00010
267: [512558720..516728319]: 532676736..536846335 127 (128..4169727) 4169600 00011
268: [516728320..520905599]: 536871040..541048319 128 (128..4177407) 4177280 00010
269: [520905600..525092991]: 541065344..545252735 129 (128..4187519) 4187392 00011
270: [525092992..529264895]: 545259648..549431551 130 (128..4172031) 4171904 00011
271: [529264896..533445759]: 549453952..553634815 131 (128..4180991) 4180864 00011
272: [533445760..537575807]: 553648256..557778303 132 (128..4130175) 4130048 00011
273: [537575808..541753087]: 557842560..562019839 133 (128..4177407) 4177280 00011
274: [541753088..545893119]: 562036864..566176895 134 (128..4140159) 4140032 00011
275: [545893120..550024063]: 566231168..570362111 135 (128..4131071) 4130944 00011
276: [550024064..554175103]: 570425472..574576511 136 (128..4151167) 4151040 00011
277: [554175104..558367743]: 574619776..578812415 137 (128..4192767) 4192640 00010
278: [558367744..562557055]: 578814080..583003391 138 (128..4189439) 4189312 00011
279: [562557056..566690303]: 583008384..587141631 139 (128..4133375) 4133248 00011
280: [566690304..570842239]: 587202688..591354623 140 (128..4152063) 4151936 00011
281: [570842240..575021951]: 591396992..595576703 141 (128..4179839) 4179712 00011
282: [575021952..579213055]: 595591296..599782399 142 (128..4191231) 4191104 00011
283: [579213056..583351679]: 599785600..603924223 143 (128..4138751) 4138624 00011
284: [583351680..587518207]: 603979904..608146431 144 (128..4166655) 4166528 00011
285: [587518208..591687423]: 608174208..612343423 145 (128..4169343) 4169216 00011
286: [591687424..595877631]: 612368512..616558719 146 (128..4190335) 4190208 00011
287: [595877632..600049919]: 616562816..620735103 147 (128..4172415) 4172288 00011
288: [600049920..604233599]: 620757120..624940799 148 (128..4183807) 4183680 00011
289: [604233600..608418431]: 624951424..629136255 149 (128..4184959) 4184832 00011
290: [608418432..612551167]: 629145728..633278463 150 (128..4132863) 4132736 00011
291: [612551168..616726143]: 633340032..637515007 151 (128..4175103) 4174976 00011
292: [616726144..620891135]: 637534336..641699327 152 (128..4165119) 4164992 00010
293: [620891136..625079423]: 641728640..645916927 153 (128..4188415) 4188288 00011
294: [625079424..629269631]: 645922944..650113151 154 (128..4190335) 4190208 00011
295: [629269632..633461759]: 650117248..654309375 155 (128..4192255) 4192128 00011
296: [633461760..637645823]: 654311552..658495615 156 (128..4184191) 4184064 00011
297: [637645824..641836415]: 658505856..662696447 157 (128..4190719) 4190592 00010
298: [641836416..645958655]: 662700160..666822399 158 (128..4122367) 4122240 00011
299: [645958656..650018303]: 666894464..670954111 159 (128..4059775) 4059648 00011
300: [650018304..654164351]: 671088768..675234815 160 (128..4146175) 4146048 00010
301: [654164352..658356863]: 675283072..679475583 161 (128..4192639) 4192512 00011
302: [658356864..662434559]: 679477376..683555071 162 (128..4077823) 4077696 00011
303: [662434560..666608255]: 683671680..687845375 163 (128..4173823) 4173696 00010
304: [666608256..670750079]: 687865984..692007807 164 (128..4141951) 4141824 00011
305: [670750080..674939391]: 692060288..696249599 165 (128..4189439) 4189312 00011
306: [674939392..679046271]: 696254592..700361471 166 (128..4107007) 4106880 00011
307: [679046272..683219071]: 700448896..704621695 167 (128..4172927) 4172800 00011
308: [683219072..687398911]: 704643200..708823039 168 (128..4179967) 4179840 00011
309: [687398912..691576447]: 708837504..713015039 169 (128..4177663) 4177536 00011
310: [691576448..695760767]: 713031808..717216127 170 (128..4184447) 4184320 00011
311: [695760768..699920767]: 717226112..721386111 171 (128..4160127) 4160000 00011
312: [699920768..704114175]: 721420416..725613823 172 (128..4193535) 4193408 00011
313: [704114176..708248319]: 725614720..729748863 173 (128..4134271) 4134144 00011
314: [708248320..712438783]: 729809024..733999487 174 (128..4190591) 4190464 00011
315: [712438784..716579455]: 734003328..738143999 175 (128..4140799) 4140672 00011
316: [716579456..720708479]: 738197632..742326655 176 (128..4129151) 4129024 00011
317: [720708480..724859263]: 742391936..746542719 177 (128..4150911) 4150784 00011
318: [724859264..729044479]: 746586240..750771455 178 (128..4185343) 4185216 00011
319: [729044480..733213695]: 750780544..754949759 179 (128..4169343) 4169216 00011
320: [733213696..737321855]: 754974848..759083007 180 (128..4108287) 4108160 00011
321: [737321856..741421951]: 759169152..763269247 181 (128..4100223) 4100096 00011
322: [741421952..745530879]: 763363456..767472383 182 (128..4109055) 4108928 00011
323: [745530880..749712767]: 767557760..771739647 183 (128..4182015) 4181888 00011
324: [749712768..753891711]: 771752064..775931007 184 (128..4179071) 4178944 00011
325: [753891712..758060927]: 775946368..780115583 185 (128..4169343) 4169216 00011
326: [758060928..762244607]: 780140672..784324351 186 (128..4183807) 4183680 00011
327: [762244608..766380927]: 784334976..788471295 187 (128..4136447) 4136320 00011
328: [766380928..770563455]: 788529280..792711807 188 (128..4182655) 4182528 00011
329: [770563456..774704895]: 792723584..796865023 189 (128..4141567) 4141440 00011
330: [774704896..778889599]: 796917888..801102591 190 (128..4184831) 4184704 00011
331: [778889600..783025791]: 801112192..805248383 191 (128..4136319) 4136192 00011
332: [783025792..787211519]: 805306496..809492223 192 (128..4185855) 4185728 00011
333: [787211520..791354623]: 809500800..813643903 193 (128..4143231) 4143104 00011
334: [791354624..795512575]: 813695104..817853055 194 (128..4158079) 4157952 00011
335: [795512576..799618047]: 817889408..821994879 195 (128..4105599) 4105472 00011
336: [799618048..803780095]: 822083712..826245759 196 (128..4162175) 4162048 00011
337: [803780096..807963135]: 826278016..830461055 197 (128..4183167) 4183040 00011
338: [807963136..812131327]: 830472320..834640511 198 (128..4168319) 4168192 00011
339: [812131328..816184191]: 834666624..838719487 199 (128..4052991) 4052864 00011
340: [816184192..820326527]: 838860928..843003263 200 (128..4142463) 4142336 00011
341: [820326528..824463359]: 843055232..847192063 201 (128..4136959) 4136832 00011
342: [824463360..828646015]: 847249536..851432191 202 (128..4182783) 4182656 00011
343: [828646016..832839039]: 851443840..855636863 203 (128..4193151) 4193024 00011
344: [832839040..836996607]: 855638144..859795711 204 (128..4157695) 4157568 00011
345: [836996608..841062527]: 859897984..863963903 205 (65664..4131583) 4065920 00011
346: [841062528..845253759]: 864026752..868217983 206 (128..4191359) 4191232 00011
347: [845253760..849432191]: 868221056..872399487 207 (128..4178559) 4178432 00011
348: [849432192..853546879]: 872415360..876530047 208 (128..4114815) 4114688 00011
349: [853546880..857592191]: 876609664..880654975 209 (128..4045439) 4045312 00011
350: [857592192..861744127]: 880803968..884955903 210 (128..4152063) 4151936 00011
351: [861744128..865844607]: 884998272..889098751 211 (128..4100607) 4100480 00011
352: [865844608..870013439]: 889192576..893361407 212 (128..4168959) 4168832 00011
353: [870013440..874193663]: 893386880..897567103 213 (128..4180351) 4180224 00011
354: [874193664..878262143]: 897581184..901649663 214 (128..4068607) 4068480 00011
355: [878262144..882454143]: 901775488..905967487 215 (128..4192127) 4192000 00011
356: [882454144..886619519]: 905969792..910135167 216 (128..4165503) 4165376 00011
357: [886619520..890779647]: 910164096..914324223 217 (128..4160255) 4160128 00011
358: [890779648..894848639]: 914358400..918427391 218 (128..4069119) 4068992 00011
359: [894848640..898916479]: 918552704..922620543 219 (128..4067967) 4067840 00011
360: [898916480..903088383]: 922747008..926918911 220 (128..4172031) 4171904 00011
361: [903088384..907218687]: 926941312..931071615 221 (128..4130431) 4130304 00011
362: [907218688..911328255]: 931135616..935245183 222 (128..4109695) 4109568 00011
363: [911328256..915487871]: 935329920..939489535 223 (128..4159743) 4159616 00011
364: [915487872..919626623]: 939524224..943662975 224 (128..4138879) 4138752 00011
365: [919626624..923776383]: 943718528..947868287 225 (128..4149887) 4149760 00011
366: [923776384..927934207]: 947912832..952070655 226 (128..4157951) 4157824 00011
367: [927934208..932089727]: 952107136..956262655 227 (128..4155647) 4155520 00011
368: [932089728..936242175]: 956301440..960453887 228 (128..4152575) 4152448 00011
369: [936242176..940430719]: 960495744..964684287 229 (128..4188671) 4188544 00011
370: [940430720..944508927]: 964690048..968768255 230 (128..4078335) 4078208 00011
371: [944508928..948598143]: 968884352..972973567 231 (128..4089343) 4089216 00011
372: [948598144..952791167]: 973078656..977271679 232 (128..4193151) 4193024 00011
373: [952791168..956928127]: 977272960..981409919 233 (128..4137087) 4136960 00011
374: [956928128..961092607]: 981467264..985631743 234 (128..4164607) 4164480 00011
375: [961092608..965241727]: 985661568..989810687 235 (128..4149247) 4149120 00010
376: [965241728..969372031]: 989855872..993986175 236 (128..4130431) 4130304 00011
377: [969372032..973549951]: 994050176..998228095 237 (128..4178047) 4177920 00011
378: [973549952..977705727]: 998244480..1002400255 238 (128..4155903) 4155776 00011
379: [977705728..981881471]: 1002438784..1006614527 239 (128..4175871) 4175744 00010
380: [981881472..986022911]: 1006633088..1010774527 240 (128..4141567) 4141440 00011
381: [986022912..990156031]: 1010827392..1014960511 241 (128..4133247) 4133120 00011
382: [990156032..994333567]: 1015021696..1019199231 242 (128..4177663) 4177536 00011
383: [994333568..998504959]: 1019216000..1023387391 243 (128..4171519) 4171392 00011
384: [998504960..1002684287]: 1023410304..1027589631 244 (128..4179455) 4179328 00011
385: [1002684288..1006869119]: 1027604608..1031789439 245 (128..4184959) 4184832 00011
386: [1006869120..1011060351]: 1031798912..1035990143 246 (128..4191359) 4191232 00011
387: [1011060352..1015226623]: 1035993216..1040159487 247 (128..4166399) 4166272 00011
388: [1015226624..1019370879]: 1040187520..1044331775 248 (128..4144383) 4144256 00011
389: [1019370880..1023495807]: 1044381824..1048506751 249 (128..4125055) 4124928 00011
390: [1023495808..1027669631]: 1048576128..1052749951 250 (128..4173951) 4173824 00011
391: [1027669632..1031859455]: 1052770432..1056960255 251 (128..4189951) 4189824 00011
392: [1031859456..1036052607]: 1056964736..1061157887 252 (128..4193279) 4193152 00010
393: [1036052608..1040211967]: 1061159040..1065318399 253 (128..4159487) 4159360 00011
394: [1040211968..1044391167]: 1065353344..1069532543 254 (128..4179327) 4179200 00011
395: [1044391168..1048548095]: 1069547648..1073704575 255 (128..4157055) 4156928 00011
396: [1048548096..1052721279]: 1073741952..1077915135 256 (128..4173311) 4173184 00011
397: [1052721280..1056875263]: 1077936256..1082090239 257 (128..4154111) 4153984 00011
398: [1056875264..1061041663]: 1082130560..1086296959 258 (128..4166527) 4166400 00011
399: [1061041664..1065213823]: 1086324864..1090497023 259 (128..4172287) 4172160 00010
400: [1065213824..1069406335]: 1090519168..1094711679 260 (128..4192639) 4192512 00011
401: [1069406336..1073537023]: 1094713472..1098844159 261 (128..4130815) 4130688 00011
402: [1073537024..1077687039]: 1098907776..1103057791 262 (128..4150143) 4150016 00011
403: [1077687040..1081859071]: 1103102080..1107274111 263 (128..4172159) 4172032 00011
404: [1081859072..1085973759]: 1107296384..1111411071 264 (128..4114815) 4114688 00011
405: [1085973760..1090159999]: 1111490688..1115676927 265 (128..4186367) 4186240 00011
406: [1090160000..1094309631]: 1115684992..1119834623 266 (128..4149759) 4149632 00010
407: [1094309632..1098430591]: 1119879296..1124000255 267 (128..4121087) 4120960 00010
408: [1098430592..1102607999]: 1124073600..1128251007 268 (128..4177535) 4177408 00011
409: [1102608000..1106755583]: 1128267904..1132415487 269 (128..4147711) 4147584 00011
410: [1106755584..1110909695]: 1132462208..1136616319 270 (128..4154239) 4154112 00011
411: [1110909696..1115053183]: 1136656512..1140799999 271 (128..4143615) 4143488 00011
412: [1115053184..1119239679]: 1140850816..1145037311 272 (128..4186623) 4186496 00010
413: [1119239680..1123422719]: 1145045120..1149228159 273 (128..4183167) 4183040 00011
414: [1123422720..1127577343]: 1149239424..1153394047 274 (128..4154751) 4154624 00011
415: [1127577344..1131768063]: 1153433728..1157624447 275 (128..4190847) 4190720 00011
416: [1131768064..1135932031]: 1157628032..1161791999 276 (128..4164095) 4163968 00010
417: [1135932032..1140125055]: 1161822336..1166015359 277 (128..4193151) 4193024 00011
418: [1140125056..1144311167]: 1166016640..1170202751 278 (128..4186239) 4186112 00011
419: [1144311168..1148503551]: 1170210944..1174403327 279 (128..4192511) 4192384 00011
420: [1148503552..1152666239]: 1174405248..1178567935 280 (128..4162815) 4162688 00011
421: [1152666240..1156859263]: 1178599552..1182792575 281 (128..4193151) 4193024 00011
422: [1156859264..1161038719]: 1182793856..1186973311 282 (128..4179583) 4179456 00011
423: [1161038720..1165226367]: 1186988160..1191175807 283 (128..4187775) 4187648 00011
424: [1165226368..1169399935]: 1191182464..1195356031 284 (128..4173695) 4173568 00011
425: [1169399936..1173592063]: 1195376768..1199568895 285 (128..4192255) 4192128 00011
426: [1173592064..1177730943]: 1199571072..1203709951 286 (128..4139007) 4138880 00011
427: [1177730944..1181907455]: 1203765376..1207941887 287 (128..4176639) 4176512 00011
428: [1181907456..1186046975]: 1207959680..1212099199 288 (128..4139647) 4139520 00011
429: [1186046976..1190221055]: 1212153984..1216328063 289 (128..4174207) 4174080 00011
430: [1190221056..1194414847]: 1216348288..1220542079 290 (128..4193919) 4193792 00011
431: [1194414848..1198584447]: 1220542592..1224712191 291 (128..4169727) 4169600 00011
432: [1198584448..1202716031]: 1224736896..1228868479 292 (128..4131711) 4131584 00011
433: [1202716032..1206902911]: 1228931200..1233118079 293 (128..4187007) 4186880 00011
434: [1206902912..1211029759]: 1233125504..1237252351 294 (128..4126975) 4126848 00011
435: [1211029760..1215125119]: 1237319808..1241415167 295 (128..4095487) 4095360 00010
436: [1215125120..1219285247]: 1241514112..1245674239 296 (128..4160255) 4160128 00011
437: [1219285248..1223435519]: 1245708416..1249858687 297 (128..4150399) 4150272 00011
438: [1223435520..1227593855]: 1249902720..1254061055 298 (128..4158463) 4158336 00010
439: [1227593856..1231751039]: 1254097024..1258254207 299 (128..4157311) 4157184 00011
440: [1231751040..1235911423]: 1258291328..1262451711 300 (128..4160511) 4160384 00011
441: [1235911424..1240070015]: 1262485632..1266644223 301 (128..4158719) 4158592 00011
442: [1240070016..1244245887]: 1266679936..1270855807 302 (128..4175999) 4175872 00011
443: [1244245888..1248316159]: 1270874240..1274944511 303 (128..4070399) 4070272 00010
444: [1248316160..1252469119]: 1275068544..1279221503 304 (128..4153087) 4152960 00011
445: [1252469120..1256656383]: 1279262848..1283450111 305 (128..4187391) 4187264 00011
446: [1256656384..1260826623]: 1283457152..1287627391 306 (128..4170367) 4170240 00011
447: [1260826624..1264971903]: 1287651456..1291796735 307 (128..4145407) 4145280 00011
448: [1264971904..1269126783]: 1291845760..1296000639 308 (128..4155007) 4154880 00011
449: [1269126784..1273286527]: 1296040064..1300199807 309 (128..4159871) 4159744 00011
450: [1273286528..1277452671]: 1300234368..1304400511 310 (128..4166271) 4166144 00011
451: [1277452672..1281630079]: 1304428672..1308606079 311 (128..4177535) 4177408 00011
452: [1281630080..1285804287]: 1308622976..1312797183 312 (128..4174335) 4174208 00011
453: [1285804288..1289931775]: 1312817280..1316944767 313 (128..4127615) 4127488 00011
454: [1289931776..1294101503]: 1317011584..1321181311 314 (128..4169855) 4169728 00011
455: [1294101504..1298191103]: 1321205888..1325295487 315 (128..4089727) 4089600 00011
456: [1298191104..1302336895]: 1325400192..1329545983 316 (128..4145919) 4145792 00011
457: [1302336896..1306530687]: 1329594496..1333788287 317 (128..4193919) 4193792 00011
458: [1306530688..1310690943]: 1333788800..1337949055 318 (128..4160383) 4160256 00011
459: [1310690944..1314872191]: 1337983104..1342164351 319 (128..4181375) 4181248 00011
460: [1314872192..1318999935]: 1342177408..1346305151 320 (128..4127871) 4127744 00011
461: [1318999936..1323078143]: 1346371712..1350449919 321 (128..4078335) 4078208 00011
462: [1323078144..1327264895]: 1350566016..1354752767 322 (128..4186879) 4186752 00011
463: [1327264896..1331359743]: 1354760320..1358855167 323 (128..4094975) 4094848 00011
464: [1331359744..1335545855]: 1358954624..1363140735 324 (128..4186239) 4186112 00011
465: [1335545856..1339726591]: 1363148928..1367329663 325 (128..4180863) 4180736 00011
466: [1339726592..1343796991]: 1367343232..1371413631 326 (128..4070527) 4070400 00011
467: [1343796992..1347946367]: 1371537536..1375686911 327 (128..4149503) 4149376 00011
468: [1347946368..1352100735]: 1375731840..1379886207 328 (128..4154495) 4154368 00011
469: [1352100736..1356251903]: 1379926144..1384077311 329 (128..4151295) 4151168 00010
470: [1356251904..1360434687]: 1384120448..1388303231 330 (128..4182911) 4182784 00011
471: [1360434688..1364580607]: 1388314752..1392460671 331 (128..4146047) 4145920 00011
472: [1364580608..1368765695]: 1392509056..1396694143 332 (128..4185215) 4185088 00011
473: [1368765696..1372911359]: 1396703360..1400849023 333 (128..4145791) 4145664 00011
474: [1372911360..1377105535]: 1400897664..1405091839 334 (128..4194303) 4194176 00011
475: [1377105536..1377109759]: 8381696..8385919 1 (4187392..4191615) 4224 00011
476: [1377109760..1377159935]: 108996224..109046399 25 (4138624..4188799) 50176 00011
477: [1377159936..1377179647]: 113208960..113228671 26 (4157056..4176767) 19712 00011
478: [1377179648..1377220223]: 125782784..125823359 29 (4147968..4188543) 40576 00011
479: [1377220224..1377252351]: 129945856..129977983 30 (4116736..4148863) 32128 00011
480: [1377252352..1377310719]: 138353152..138411519 32 (4135424..4193791) 58368 00011
481: [1377310720..1377317631]: 142560000..142566911 33 (4147968..4154879) 6912 00010
482: [1377317632..1377393407]: 171884544..171960319 40 (4112384..4188159) 75776 00001
483: [1377393408..1377463295]: 192827904..192897791 45 (4084224..4154111) 69888 00001
484: [1377463296..1377534591]: 238993920..239065215 56 (4112896..4184191) 71296 00001
485: [1377534592..1377608319]: 264156672..264230399 62 (4109824..4183551) 73728
486: [1377608320..1377671935]: 289342720..289406335 68 (4130048..4193663) 63616 00011
487: [1377671936..1377676543]: 293588608..293593215 69 (4181632..4186239) 4608 00011
488: [1377676544..1377734783]: 318697984..318756223 75 (4125184..4183423) 58240 00011
489: [1377734784..1377748479]: 322894592..322908287 76 (4127488..4141183) 13696 00011
490: [1377748480..1377820799]: 347995264..348067583 82 (4062336..4134655) 72320 00011
491: [1377820800..1377888767]: 356447360..356515327 84 (4125824..4193791) 67968 00011
492: [1377888768..1377923327]: 364859648..364894207 86 (4149504..4184063) 34560 00011
493: [1377923328..1377955583]: 369060096..369092351 87 (4155648..4187903) 32256 00011
494: [1377955584..1378023551]: 373220352..373288319 88 (4121600..4189567) 67968 00001
495: [1378023552..1378073343]: 381617536..381667327 90 (4130176..4179967) 49792 00011
496: [1378073344..1378097279]: 389994112..390018047 92 (4118144..4142079) 23936 00010
497: [1378097280..1378171263]: 402554368..402628351 95 (4095488..4169471) 73984 00011
498: [1378171264..1378244351]: 419293184..419366271 99 (4057088..4130175) 73088 00011
499: [1378244352..1378386431]: 427675008..427817087 101 (4050304..4192383) 142080 00011
500: [1378386432..1378454527]: 457076224..457144319 108 (4091392..4159487) 68096 00010
501: [1378454528..1378524927]: 461279488..461349887 109 (4100352..4170751) 70400 00010
502: [1378524928..1378598399]: 520000512..520073983 123 (4101120..4174591) 73472 00001
503: [1378598400..1378668799]: 666822400..666892799 158 (4122368..4192767) 70400 00010
504: [1378668800..1378779007]: 670954112..671064319 159 (4059776..4169983) 110208 00011
505: [1378779008..1378817535]: 675234816..675273343 160 (4146176..4184703) 38528 00001
506: [1378817536..1378929535]: 683555072..683667071 162 (4077824..4189823) 112000 00011
507: [1378929536..1378959615]: 692007808..692037887 164 (4141952..4172031) 30080 00011
508: [1378959616..1379018879]: 700361472..700420735 166 (4107008..4166271) 59264 00011
509: [1379018880..1379023487]: 700420864..700425471 166 (4166400..4171007) 4608 00011
510: [1379023488..1379098879]: 759083008..759158399 180 (4108288..4183679) 75392 00011
511: [1379098880..1379166207]: 763269248..763336575 181 (4100224..4167551) 67328 00011
512: [1379166208..1379240447]: 767472384..767546623 182 (4109056..4183295) 74240 00011
513: [1379240448..1379263743]: 780115584..780138879 185 (4169344..4192639) 23296 00011
514: [1379263744..1379315711]: 788471296..788523263 187 (4136448..4188415) 51968 00011
515: [1379315712..1379386879]: 821994880..822066047 195 (4105600..4176767) 71168 00011
516: [1379386880..1379527423]: 838719488..838860031 199 (4052992..4193535) 140544 00011
517: [1379527424..1379531519]: 843003264..843007359 200 (4142464..4146559) 4096 00011
518: [1379531520..1379609343]: 876530048..876607871 208 (4114816..4192639) 77824 00011
519: [1379609344..1379750015]: 880654976..880795647 209 (4045440..4186111) 140672 00011
520: [1379750016..1383905023]: 1405091968..1409246975 335 (128..4155135) 4155008 00011
521: [1383905024..1388093695]: 1409286272..1413474943 336 (128..4188799) 4188672 00011
522: [1388093696..1392277887]: 1413480576..1417664767 337 (128..4184319) 4184192 00011
523: [1392277888..1396460159]: 1417674880..1421857151 338 (128..4182399) 4182272 00011
524: [1396460160..1400594943]: 1421869184..1426003967 339 (128..4134911) 4134784 00010
525: [1400594944..1404777727]: 1426063488..1430246271 340 (128..4182911) 4182784 00011
526: [1404777728..1408918783]: 1430257792..1434398847 341 (128..4141183) 4141056 00011
527: [1408918784..1413105407]: 1434452096..1438638719 342 (128..4186751) 4186624 00011
528: [1413105408..1417254911]: 1438646400..1442795903 343 (128..4149631) 4149504 00011
529: [1417254912..1421434495]: 1442840704..1447020287 344 (128..4179711) 4179584 00011
530: [1421434496..1425578239]: 1447035008..1451178751 345 (128..4143871) 4143744 00011
531: [1425578240..1429749375]: 1451229312..1455400447 346 (128..4171263) 4171136 00011
532: [1429749376..1433926655]: 1455423616..1459600895 347 (128..4177407) 4177280 00010
533: [1433926656..1438046463]: 1459617920..1463737727 348 (128..4119935) 4119808 00011
534: [1438046464..1442184319]: 1463812224..1467950079 349 (128..4137983) 4137856 00011
535: [1442184320..1446287871]: 1468006528..1472110079 350 (128..4103679) 4103552 00010
536: [1446287872..1450453887]: 1472200832..1476366847 351 (128..4166143) 4166016 00011
537: [1450453888..1454548351]: 1476395136..1480489599 352 (128..4094591) 4094464 00011
538: [1454548352..1458734463]: 1480589440..1484775551 353 (128..4186239) 4186112 00011
539: [1458734464..1462921983]: 1484783744..1488971263 354 (128..4187647) 4187520 00011
540: [1462921984..1467092735]: 1488978048..1493148799 355 (128..4170879) 4170752 00011
541: [1467092736..1471266175]: 1493172352..1497345791 356 (128..4173567) 4173440 00011
542: [1471266176..1475425279]: 1497366656..1501525759 357 (128..4159231) 4159104 00011
543: [1475425280..1479595519]: 1501560960..1505731199 358 (128..4170367) 4170240 00011
544: [1479595520..1482415743]: 1505755264..1508575487 359 (128..2820351) 2820224 00011
545: [1482415744..1482838655]: 1508853120..1509276031 359 (3097984..3520895) 422912 00011
546: [1482838656..1483369343]: 1509416832..1509947519 359 (3661696..4192383) 530688 00011
547: [1483369344..1487563007]: 1509949568..1514143231 360 (128..4193791) 4193664 00011
548: [1487563008..1491753471]: 1514143872..1518334335 361 (128..4190591) 4190464 00011
549: [1491753472..1495689215]: 1518338176..1522273919 362 (128..3935871) 3935744 00011
550: [1495689216..1499818495]: 1522532480..1526661759 363 (128..4129407) 4129280 00011
551: [1499818496..1503997951]: 1526726784..1530906239 364 (128..4179583) 4179456 00011
552: [1503997952..1508121983]: 1530921088..1535045119 365 (128..4124159) 4124032 00011
553: [1508121984..1512214911]: 1535115392..1539208319 366 (128..4093055) 4092928 00011
554: [1512214912..1516367743]: 1539309696..1543462527 367 (128..4152959) 4152832 00011
555: [1516367744..1520511999]: 1543504000..1547648255 368 (128..4144383) 4144256 00011
556: [1520512000..1524698367]: 1547698304..1551884671 369 (128..4186495) 4186368 00011
557: [1524698368..1528890111]: 1551892608..1556084351 370 (128..4191871) 4191744 00011
558: [1528890112..1533021439]: 1556086912..1560218239 371 (128..4131455) 4131328 00011
559: [1533021440..1537150975]: 1560281216..1564410751 372 (128..4129663) 4129536 00011
560: [1537150976..1541330943]: 1564475520..1568655487 373 (128..4180095) 4179968 00011
561: [1541330944..1545494527]: 1568669824..1572833407 374 (128..4163711) 4163584 00011
562: [1545494528..1549683327]: 1572864128..1577052927 375 (128..4188927) 4188800 00011
563: [1549683328..1553838335]: 1577058432..1581213439 376 (128..4155135) 4155008 00011
564: [1553838336..1558030975]: 1581252736..1585445375 377 (128..4192767) 4192640 00010
565: [1558030976..1562182399]: 1585447040..1589598463 378 (128..4151551) 4151424 00011
566: [1562182400..1566314367]: 1589641344..1593773311 379 (128..4132095) 4131968 00011
567: [1566314368..1570499455]: 1593835648..1598020735 380 (128..4185215) 4185088 00011
568: [1570499456..1574651647]: 1598029952..1602182143 381 (128..4152319) 4152192 00011
569: [1574651648..1578828159]: 1602224256..1606400767 382 (128..4176639) 4176512 00011
570: [1578828160..1582995199]: 1606418560..1610585599 383 (128..4167167) 4167040 00011
571: [1582995200..1587180031]: 1610612864..1614797695 384 (128..4184959) 4184832 00011
572: [1587180032..1591357183]: 1614807168..1618984319 385 (128..4177279) 4177152 00011
573: [1591357184..1595525631]: 1619001472..1623169919 386 (128..4168575) 4168448 00011
574: [1595525632..1599705215]: 1623195776..1627375359 387 (128..4179711) 4179584 00011
575: [1599705216..1603898239]: 1627390080..1631583103 388 (128..4193151) 4193024 00011
576: [1603898240..1606494015]: 1631584384..1634180159 389 (128..2595903) 2595776 00111
FLAG Values:
010000 Unwritten preallocated extent
001000 Doesn't begin on stripe unit
000100 Doesn't end on stripe unit
000010 Doesn't begin on stripe width
000001 Doesn't end on stripe width


Attachments:
map.txt (52.16 kB)

2006-05-23 01:02:38

by fitzboy

[permalink] [raw]
Subject: Re: Re: tuning for large files in xfs

the sweet spot for me would be a 32k block size on both the RAID and
the XFS partition (that is the best number for my application).
However, on the lower-level RAID there is only a nominal performance
difference between a 32k and a 64k stripe size (about 4%), so I just
stick with the default 64k. And for the XFS partition, since I am on an
intel machine with 2.6.8, I can only go up to a 2k blocksize...

But to answer your question, the RAID is on a 64k stripe size, and I
just changed my test app to do 2k reads and still get the same
performance (with only marginal improvement), so alignment can't be the
problem.


2006-05-23 01:07:50

by Matheus Izvekov

[permalink] [raw]
Subject: Re: tuning for large files in xfs

On 5/22/06, fitzboy <[email protected]> wrote:
> Matheus Izvekov wrote:
> >
> > Why use a filesystem with just one file?? Why not use the device node
> > of the partition directly?
>
> I am not sure what you mean, could you elaborate?
>

If you have, say, a partition of size 2GB (let's call it sdc3), you can
use the device node /dev/sdc3 as if it were a 2GB file. Just configure
your program to use /dev/sdc3 instead of the file that sits alone in
this xfs filesystem. You could try that and see how fast it goes.
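
A minimal sketch of sizing the device for such a test, assuming the
hypothetical /dev/sdc3 above: fstat() reports st_size as 0 for a block
device, so the size has to come from the BLKGETSIZE64 ioctl instead (the
same ioctl mentioned later in the thread); the random-read loop can then
use that value in place of s.st_size.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/fs.h>      /* BLKGETSIZE64 */
#include <stdint.h>
#include <unistd.h>
#include <cstdio>
#include <iostream>

int main(int argc, char* argv[]) {
    /* open the raw device node read-only, e.g. argv[1] = /dev/sdc3 */
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* a block device reports st_size == 0 through fstat(), so ask the
       block layer for its size in bytes instead */
    uint64_t size = 0;
    if (ioctl(fd, BLKGETSIZE64, &size) < 0) { perror("BLKGETSIZE64"); return 1; }

    std::cout << "device size: " << size << " bytes" << std::endl;
    close(fd);
    return 0;
}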

2006-05-23 01:59:57

by Nathan Scott

[permalink] [raw]
Subject: Re: tuning for large files in xfs

Hi Tim,

On Mon, May 22, 2006 at 05:49:43PM -0700, fitzboy wrote:
> Nathan Scott wrote:
> > Can you send xfs_info output for the filesystem and the output
> > from xfs_bmap -vvp on this file?
> xfs_info:
> meta-data=/mnt/array/disk1 isize=2048 agcount=410, agsize=524288

Thats odd - why such a large number of allocation groups?

> blks
> = sectsz=512
> data = bsize=4096 blocks=214670562, imaxpct=25
> = sunit=16 swidth=192 blks, unwritten=1
> naming =version 2 bsize=4096
> log =internal bsize=4096 blocks=8192, version=1
> = sectsz=512 sunit=0 blks
> realtime =none extsz=65536 blocks=0, rtextents=0
>
> it is mounted rw,noatime,nodiratime
> generally I am doing 32k reads from the application, so I would like
> larger blocksize (32k would be ideal), but can't go above 2k on intel...

4K you mean (thats what you've got above, and thats your max with
a 4K pagesize).

I thought you said you had a 2TB file? The filesystem above is
4096 * 214670562 blocks, i.e. 818GB. Perhaps its a sparse file?
I guess I could look closer at the bmap and figure that out for
myself. ;)

> I made the file by copying it over via dd from another machine onto a
> clean partition... then from that point we just append to the end of it,
> or modify existing data...

> I am attaching the extent map

Interesting. So, the allocator did an OK job for you, at least
initially - everything is contiguous (within the bounds of the
small agsize you set) until extent #475, and I guess that'd have
been the end of the initial copied file. After that it chops
about a bit (goes back to earlier AGs and uses the small amounts
of space in each I'm guessing), then gets back into nice long
contiguous extent allocations in the high AGs.

Anyway, you should be able to alleviate the problem by:

- Using a small number of larger AGs (say 32 or so) instead of
a large number of small AGs. this'll give you most bang for
your buck I expect.
[ If you use a mkfs.xfs binary from an xfsprogs anytime since
November 2003, this will automatically scale for you - did you
use a very old mkfs? Or set the agcount/size values by hand?
Current mkfs would give you this:
# mkfs.xfs -isize=2k -dfile,name=/dev/null,size=214670562b -N
meta-data=/dev/null isize=2048 agcount=32, agsize=6708455 blks
...which is just what you want here. ]

- Preallocate the space in the file - i.e. before running the
dd you can do an "xfs_io -c 'resvsp 0 2t' /mnt/array/disk1/xxx"
(preallocates 2 terabytes) and then overwrite that. This will
give you an optimal layout (a programmatic sketch of the same
reservation follows after this list).

- Not sure about your stripe unit/width settings, I would need
to know details about your RAID. But maybe theres tweaking that
could be done there too.

- Your extent map is fairly large, the 2.6.17 kernel will have
some improvements in the way the memory management is done here
which may help you a bit too.
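
As a rough sketch of doing the same reservation from a program rather
than from xfs_io, assuming the XFS_IOC_RESVSP64 ioctl and struct
xfs_flock64 from the xfsprogs development headers (header name and
availability may vary between versions; the xfs_io command above is the
simpler route):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <xfs/xfs.h>   /* assumed xfsprogs header providing XFS_IOC_RESVSP64 */

int main(int argc, char* argv[]) {
    /* create the (empty) file, then reserve space without writing it */
    int fd = open(argv[1], O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    struct xfs_flock64 fl;
    std::memset(&fl, 0, sizeof fl);
    fl.l_whence = SEEK_SET;     /* reserve relative to the start of the file */
    fl.l_start  = 0;
    fl.l_len    = 2ULL << 40;   /* 2 terabytes, as in the xfs_io example */

    /* roughly equivalent to: xfs_io -f -c 'resvsp 0 2t' <file> */
    if (ioctl(fd, XFS_IOC_RESVSP64, &fl) < 0) { perror("XFS_IOC_RESVSP64"); return 1; }

    close(fd);
    return 0;
}

The reserved range shows up as unwritten preallocated extents (the
010000 flag in the extent-map legend above) until it is overwritten.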

Have fun!

cheers.

--
Nathan

2006-05-23 08:05:58

by Avi Kivity

[permalink] [raw]
Subject: Re: tuning for large files in xfs

fitzboy wrote:
> >
> > What's your testing methodology?
> >
>
> here is my code... pretty simple, opens the file and reads around to a
> random block, 32k worth (note, doing a 2k read doesn't make much of a
> difference, only marginal, from 6.7ms per seek down to 6.1).
>
> int main (int argc, char* argv[]) {
> char buffer[256*1024];
> int fd = open(argv[1],O_RDONLY);
> struct stat s;
> fstat(fd,&s);
> s.st_size=s.st_size-(s.st_size%256*1024);
> initTimers();
> srandom(startSec);
> long long currentPos;
> int currentSize=32*1024;
> int startTime=getTimeMS();
> for (int currentRead = 0;currentRead<10000;currentRead++) {
> currentPos=random();
> currentPos=currentPos*currentSize;
This will overflow. I think that

currentPos = drand48() * s.st_size;

will give better results.

> currentPos=currentPos%s.st_size;

I'd suggest aligning currentPos to currentSize. Very likely your
database does the same. Won't matter much on a single-threaded test though.

> if (pread(fd,buffer,currentSize,currentPos) != currentSize)
> std::cout<<"couldn't read in"<<std::endl;
> }
> std::cout << "blocksize of
> "<<currentSize<<"took"<<getTimeMS()-startTime<<" ms"<<std::endl;
> }
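
A self-contained variant of the test above, as a sketch only: the
original's initTimers()/getTimeMS() helpers are replaced with
gettimeofday(), the file size is rounded down to whole 32k blocks, and
every offset stays 32k-aligned.

#include <fcntl.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <iostream>

/* build with -D_FILE_OFFSET_BITS=64 so open/fstat/pread handle
   files larger than 2GB on 32-bit x86 */

static long long now_ms() {
    /* wall-clock time in milliseconds */
    struct timeval tv;
    gettimeofday(&tv, 0);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

int main(int argc, char* argv[]) {
    const long long blockSize = 32 * 1024;
    const int rounds = 10000;
    char buffer[32 * 1024];

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat s;
    fstat(fd, &s);
    /* number of whole 32k blocks, so every aligned offset stays
       inside the file */
    long long blocks = s.st_size / blockSize;

    srandom(time(0));
    long long start = now_ms();
    for (int i = 0; i < rounds; i++) {
        /* pick a random 32k-aligned block and read it */
        long long pos = (random() % blocks) * blockSize;
        if (pread(fd, buffer, blockSize, pos) != blockSize)
            std::cout << "couldn't read in" << std::endl;
    }
    long long elapsed = now_ms() - start;
    std::cout << rounds << " reads of " << blockSize << " bytes took "
              << elapsed << " ms (" << (double)elapsed / rounds
              << " ms per read)" << std::endl;
    close(fd);
    return 0;
}
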
>
> > You can try to measure the amount of seeks going to the disk by using
> > iostat, and see if that matches your test program.
> >
>
> I used iostat and found exactly what I was expecting: 10,000 rounds x
> 16 (number of 2k blocks in a 32k read) x 4 (number of 512 blocks per
> 2k block) = 640,000 reads, which is what iostat told me. so now my
> question remains, if each seek is supposed to average 3.5ms, how come
> I am seeing an average of 6-7ms?
>

Sorry, I wasn't specific enough: please run iostat -x /dev/whatever 1
and look at the 'r/s' (reads per second) field. If that agrees with what
your test says, you have a block layer or lower problem, otherwise it's
a filesystem problem.

--
error compiling committee.c: too many arguments to function

2006-05-23 20:21:22

by fitzboy

[permalink] [raw]
Subject: Re: tuning for large files in xfs



Avi Kivity wrote:
>
> This will overflow. I think that
>
> currentPos = drand48() * s.st_size;
>
> will give better results
>
>> currentPos=currentPos%s.st_size;
>

why would it overflow? random() returns a 32-bit number, and if I
multiply that by 32k (basically the number random() returns is the block
number I am going to), that can never overflow 64 bits. It may be larger
than the size of the file, but that is why I do mod s.st_size... and a
random number mod something is still a random number. Also, with this
method it is already currentSize aligned...

>
> Sorry, I wasn't specific enough: please run iostat -x /dev/whatever 1
> and look at the 'r/s' (reads per second) field. If that agrees with what
> your test says, you have a block layer or lower problem, otherwise it's
> a filesystem problem.
>

I ran it and found an r/s of 165, which basically corresponds to my 6 ms
access time (1000 ms / 165 reads per second is roughly 6 ms per read)...
when it should be around 3.5ms... so it seems like the seeks themselves
are taking a long time, NOT that I am doing extra seeks...

2006-05-24 01:41:46

by fitzboy

[permalink] [raw]
Subject: Re: tuning for large files in xfs

Nathan Scott wrote:
> Hi Tim,
>
> On Mon, May 22, 2006 at 05:49:43PM -0700, fitzboy wrote:
>
>>Nathan Scott wrote:
>>
>>>Can you send xfs_info output for the filesystem and the output
>>>from xfs_bmap -vvp on this file?
>>
>>xfs_info:
>>meta-data=/mnt/array/disk1 isize=2048 agcount=410, agsize=524288
>
>
> Thats odd - why such a large number of allocation groups?

I read online in multiple places that the largest an allocation group
should get is 4g, so I made mine 2g. However, having said that, I did
test with different allocation group sizes and the effect was not that
dramatic. I will retest, though, just to verify.

I was also thinking that the more AGs the better, since I do a lot of
parallel reads/writes... granted, it doesn't change the file system all
that much (the file only grows or existing blocks get modified), so
I am not sure if the number of AGs matters, does it?

>>meta-data=/mnt/array/disk1 isize=2048 agcount=410, agsize=524288 blks
>> = sectsz=512
>>data = bsize=4096 blocks=214670562, imaxpct=25
>> = sunit=16 swidth=192 blks, unwritten=1
>>naming =version 2 bsize=4096
>>log =internal bsize=4096 blocks=8192, version=1
>> = sectsz=512 sunit=0 blks
>>realtime =none extsz=65536 blocks=0, rtextents=0
>>
>>it is mounted rw,noatime,nodiratime
>>generally I am doing 32k reads from the application, so I would like
>>larger blocksize (32k would be ideal), but can't go above 2k on intel...
>
>
> 4K you mean (thats what you've got above, and thats your max with
> a 4K pagesize).
>

Sorry, I meant that moving the inode size to 2k (up from 256 bytes) gave
me a sizeable increase in performance... I assume that is because the
extent map can be smaller now (since blocks are much larger, there are
fewer blocks to keep track of). Of course, the ideal would be a large
inode size and a 32k blocksize... but I hit the limits on both...

> I thought you said you had a 2TB file? The filesystem above is
> 4096 * 214670562 blocks, i.e. 818GB. Perhaps its a sparse file?
> I guess I could look closer at the bmap and figure that out for
> myself. ;)

On my production servers the file is 2TB, but on this testing environment
the file is only 767G of an 819G partition... This is sufficient to tell
because the performance is already hindered a lot even at 767G; going to
2TB just makes it worse...

>>I made the file by copying it over via dd from another machine onto a
>>clean partition... then from that point we just append to the end of it,
>>or modify existing data...
>
>
>>I am attaching the extent map
>
>
> Interesting. So, the allocator did an OK job for you, at least
> initially - everything is contiguous (within the bounds of the
> small agsize you set) until extent #475, and I guess that'd have
> been the end of the initial copied file. After that it chops
> about a bit (goes back to earlier AGs and uses the small amounts
> of space in each I'm guessing), then gets back into nice long
> contiguous extent allocations in the high AGs.
>
> Anyway, you should be able to alleviate the problem by:
>
> - Using a small number of larger AGs (say 32 or so) instead of
> a large number of small AGs. this'll give you most bang for
> your buck I expect.
> [ If you use a mkfs.xfs binary from an xfsprogs anytime since
> November 2003, this will automatically scale for you - did you
> use a very old mkfs? Or set the agcount/size values by hand?
> Current mkfs would give you this:
> # mkfs.xfs -isize=2k -dfile,name=/dev/null,size=214670562b -N
> meta-data=/dev/null isize=2048 agcount=32, agsize=6708455 blks
> ...which is just what you want here. ]

I set it by hand. I rebuilt the partition and am now copying over the
file again to see the results...

>
> - Preallocate the space in the file - i.e. before running the
> dd you can do an "xfs_io -c 'resvsp 0 2t' /mnt/array/disk1/xxx"
> > (preallocates 2 terabytes) and then overwrite that. This will
> give you an optimal layout.

I tried this a couple of times, but it seemed to wedge the machine... I
would do: 1) touch a file (just to create it), 2) run the above command,
which would then show its effect in du, though the file size was still 0,
3) then open that file (without O_TRUNC or O_APPEND) and start writing
out to it. It would work fine for a few minutes, but after about 5 or 7GB
the machine would freeze... nothing in syslog, only a brief message on the
console about some cpu state being bad...

>
> - Not sure about your stripe unit/width settings, I would need
> to know details about your RAID. But maybe theres tweaking that
> could be done there too.

stripe unit is 64k, the array is a RAID5 with 14 disks, so I set sw=13 (one
disk is parity). I set this when I made the array, though it doesn't
seem to matter much either.

>
> - Your extent map is fairly large, the 2.6.17 kernel will have
> some improvements in the way the memory management is done here
> which may help you a bit too.
>

we have plenty of memory on the machines, shouldn't be an issue... I am
a little cautious about moving to a new kernel though...

2006-05-24 02:24:11

by Nathan Scott

[permalink] [raw]
Subject: Re: tuning for large files in xfs

On Tue, May 23, 2006 at 06:41:36PM -0700, fitzboy wrote:
> I read online in multiple places that the largest allocation groups
> should get is 4g,

Thats not correct (for a few years now).

> I was also thinking that the more AGs the better since I do a lot of
> parallel reads/writes... granted it doesn't change the file system all
> that much (the file only grows or existing blocks get modified), so
> I am not sure if the number of AGs matter, does it?

Yes, it can matter. For large extents like you have here, AGs
introduce a discontinuity that you'd otherwise not have.

> Sorry, I meant that moving the Inode size to 2k (over 256bytes) gave me
> a sizeable increase in performance... I assume that is because the
> extent map can be smaller now (since blocks are much larger, less blocks
> to keep track of). Of course, ideal would be to have InodeSize be large
> and blocksize to be 32k... but I hit the limits on both...

It means that more extents/btree records fit inline in the inode,
as theres more space available after the stat data. 2k is your
best choice for inode size, stick with that.

> > - Preallocate the space in the file - i.e. before running the
> > dd you can do an "xfs_io -c 'resvsp 0 2t' /mnt/array/disk1/xxx"
> > (preallocates 2 terabytes) and then overwrite that. This will
> > give you an optimal layout.
>
> I tried this a couple of times, but it seemed to wedge the machine... I
> would do: 1) touch a file (just to create it), 2) do the above command

Oh, use the -f (create) option and you won't need a touch.
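For example, combining the -f flag with the command above:
  xfs_io -f -c 'resvsp 0 2t' /mnt/array/disk1/xxx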

> which would then show effect in du, but the file size was still 0 3) I
> then opened that file (without O_TRUNC or O_APPEND) and started to write
> out to it. It would work fine for a few minutes but after about 5 or 7GB
> the machine would freeze... nothing in syslog, only a brief message on
> console about some cpu state being bad...

Hmm - I'd be interested to hear if that happens with a recent
kernel.

> > - Your extent map is fairly large, the 2.6.17 kernel will have
> > some improvements in the way the memory management is done here
> > which may help you a bit too.
>
> we have plenty of memory on the machines, shouldn't be an issue... I am
> a little cautious about moving to a new kernel though...

Its not the amount of memory that was the issue here, its more the
way we were using it that was a problem for kernels of the vintage
you're using here. You will definitely see better performance in
a 2.6.17 kernel with that large extent map.

cheers.

--
Nathan

2006-05-24 08:12:04

by Avi Kivity

[permalink] [raw]
Subject: Re: tuning for large files in xfs

fitzboy wrote:
>
>
> Avi Kivity wrote:
>>
>> This will overflow. I think that
>
> why would it overflow? random() returns a 32-bit number, and if I
> multiply that by 32k (basically the number random() returns is the
> block number I am going to), that can never overflow 64 bits. It may
> be larger than the size of the file, but that is why I do mod
> s.st_size... and a random number mod something is still a random
> number. Also, with this method it is already currentSize aligned...
>

You're right, of course. Thinko on my part.

>>
>> Sorry, I wasn't specific enough: please run iostat -x /dev/whatever 1
>> and look at the 'r/s' (reads per second) field. If that agrees with
>> what your test says, you have a block layer or lower problem,
>> otherwise it's a filesystem problem.
>>
>
> I ran it and found an r/s at 165, which basically corresponds to my 6
> ms access time... when it should be around 3.5ms... so it seems like
> the seeks themselves are taking a long time, NOT that I am doing extra
> seeks...
>

I presume that was with the 20GB file?

If so, that rules out the filesystem as the cause.

I would do the following next:

- run the test on the device node (/dev/something), just to make sure.
You will need to issue an ioctl (BLKGETSIZE64) to get the size, as fstat
will not return the correct size.
- break out the array into individual disks and run the test on each
disk. That will show whether the controller is causing the problem or
one of the disks (is it possible the array is in degraded mode?).

--
error compiling committee.c: too many arguments to function

2006-05-25 19:15:42

by fitzboy

[permalink] [raw]
Subject: Re: tuning for large files in xfs

here are the results:

under 2.6.8 kernel with agsize=2g (440 some AGs): 6.9ms avg access
under 2.6.8 kernel with agcount=32: 6.2ms
under 2.6.17 kernel with partition made in 2.6.8 with agcount=32: 6.2ms
under 2.6.17 kernel just reading from /dev/sdb1: 6.2ms
under 2.6.17 kernel with new partition (made under 2.6.17) with
agcount=32, file created via the xfs_io reserve call: 6.9 ms
under 2.6.17 kernel just reading from /dev/sdb1: 6.9ms (not sure why
this changed from 6.2 the day before)...

So it seems like going to 32 AGs helped about 10%, but other than that
nothing else is making much of a difference... now I am moving on to
breaking the RAID up and testing the individual disks to see their
performance...

Nathan Scott wrote:
> On Tue, May 23, 2006 at 06:41:36PM -0700, fitzboy wrote:
>
>>I read online in multiple places that the largest allocation groups
>>should get is 4g,
>
>
> Thats not correct (for a few years now).
>
>
>>I was also thinking that the more AGs the better since I do a lot of
>>parallel reads/writes... granted it doesn't change the file system all
>>that much (the file only grows or existing blocks get modified), so
>>I am not sure if the number of AGs matter, does it?
>
>
> Yes, it can matter. For large extents like you have here, AGs
> introduce a discontinuity that you'd otherwise not have.
>
>
>>Sorry, I meant that moving the Inode size to 2k (over 256bytes) gave me
>>a sizeable increase in performance... I assume that is because the
>>extent map can be smaller now (since blocks are much larger, less blocks
>>to keep track of). Of course, ideal would be to have InodeSize be large
>>and blocksize to be 32k... but I hit the limits on both...
>
>
> It means that more extents/btree records fit inline in the inode,
> as theres more space available after the stat data. 2k is your
> best choice for inode size, stick with that.
>
>
>>>- Preallocate the space in the file - i.e. before running the
>>>dd you can do an "xfs_io -c 'resvsp 0 2t' /mnt/array/disk1/xxx"
>>>(preallocates 2 terabytes) and then overwrite that. This will
>>>give you an optimal layout.
>>
>>I tried this a couple of times, but it seemed to wedge the machine... I
>>would do: 1) touch a file (just to create it), 2) do the above command
>
>
> Oh, use the -f (create) option and you won't need a touch.
>
>
>>which would then show effect in du, but the file size was still 0 3) I
>>then opened that file (without O_TRUNC or O_APPEND) and started to write
>>out to it. It would work fine for a few minutes but after about 5 or 7GB
>>the machine would freeze... nothing in syslog, only a brief message on
>>console about some cpu state being bad...
>
>
> Hmm - I'd be interested to hear if that happens with a recent
> kernel.
>
>
>>>- Your extent map is fairly large, the 2.6.17 kernel will have
>>>some improvements in the way the memory management is done here
>>>which may help you a bit too.
>>
>>we have plenty of memory on the machines, shouldn't be an issue... I am
>>a little cautious about moving to a new kernel though...
>
>
> Its not the amount of memory that was the issue here, its more the
> way we were using it that was a problem for kernels of the vintage
> you're using here. You will definitely see better performance in
> a 2.6.17 kernel with that large extent map.
>
> cheers.
>

--
Timothy Fitz
Lead Programmer

iParadigms, LLC
1624 Franklin St., 7th Floor
Oakland, CA 94612

p. +1-510-287-9720 x233
f. +1-510-444-1952
e. [email protected]
