2013-01-30 05:46:50

by Bron Gondwana

Subject: fallocate creating fragmented files

Hi All,

I'm trying to understand why my ext4 filesystem is creating highly fragmented files even though it's only just over 50% full.

The hardware is 2 x 64GB Intel X25-E drives in software RAID1 as /dev/md0:

[brong@imap14 conf]$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdf[0] sdg[1]
62522624 blocks [2/2] [UU]

unused devices: <none>


[brong@imap14 conf]$ df | grep md0
/dev/md0 58594701 30996530 24472040 56% /mnt/ssd14

It's mounted data=ordered:

[brong@imap14 conf]$ mount | grep md0
/dev/md0 on /mnt/ssd14 type ext4 (rw,noatime,data=ordered,barrier=0,commit=120)


The filesystem was created with 1k blocks, nearly 2 years ago. It's seen a lot of usage since then, but never been super-full.


[brong@imap14 conf]$ dumpe2fs -h /dev/md0
dumpe2fs 1.42.4 (12-Jun-2012)
Filesystem volume name: <none>
Last mounted on: /mnt/ssd14
Filesystem UUID: f10d5f08-a4b5-476f-9568-586ca0b793b5
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 15632384
Block count: 62522624
Reserved block count: 3126131
Free blocks: 26483551
Free inodes: 13674200
First block: 1
Block size: 1024
Fragment size: 1024
Reserved GDT blocks: 256
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 2048
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Tue Mar 8 08:30:35 2011
Last mount time: Thu Jan 24 20:05:50 2013
Last write time: Thu Jan 24 20:05:50 2013
Mount count: 10
Maximum mount count: 24
Last checked: Fri May 13 09:10:27 2011
Check interval: 15552000 (6 months)
Next check after: Wed Nov 9 08:10:27 2011
Lifetime writes: 33 TB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
First orphan inode: 14256056
Default directory hash: half_md4
Directory Hash Seed: 2f49b7f2-f4f7-464a-b804-8598867da2ba
Journal backup: inode blocks
Journal features: journal_incompat_revoke
Journal size: 32M
Journal length: 32768
Journal sequence: 0xcfc0f41e
Journal start: 16226


----------------------------

Now that we've got some background, let's create two files using the fallocate command-line tool and see where the blocks wound up. This mirrors exactly the behaviour we have seen both when using posix_fallocate to pre-allocate space and when just using seek and writev to write individual records out to the file, except that with the slower writes of a real application we get closer to 9000 extents for a file this size.

[brong@imap14 conf]$ fallocate -l 20m testfile
[brong@imap14 conf]$ fallocate -l 20m testfile2
[brong@imap14 conf]$ filefrag testfile
testfile: 421 extents found
[brong@imap14 conf]$ filefrag testfile2
testfile2: 306 extents found

Now looking at the verbose output, we can see that there are many extents of just 3 or 4 blocks:

[brong@imap14 conf]$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | head
2
1 is
1 length
1 unwritten
6 3
10 4
6 5
5 6
3 7
1 8
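
The stray "is", "length" and "unwritten" rows in that count come from the awk field `$5` landing on header lines and on extent 0, which has no "expected" column. A minimal Python sketch of the same histogram that skips those lines is below. It assumes the column layout printed by the filefrag version in this thread (ext, logical, physical, optional expected, length, optional flags); other e2fsprogs versions format the output differently, and the function names are made up for illustration.

```python
from collections import Counter

# A few lines copied from the filefrag -v dump in this message.
SAMPLE = """\
Filesystem type is: ef53
File size of testfile is 20971520 (20480 blocks, blocksize 1024)
ext logical physical expected length flags
0 0 13545000 48 unwritten
1 48 13545134 13545048 48 unwritten
6 288 13542294 13546984 47 unwritten
420 20426 44786252 44788442 54 unwritten,eof
testfile: 421 extents found
"""


def extent_lengths(filefrag_output):
    """Return the extent lengths (in blocks) found in `filefrag -v` text."""
    lengths = []
    for line in filefrag_output.splitlines():
        fields = line.split()
        # Extent rows start with a numeric extent index; headers and the
        # trailing summary line do not.
        if not fields or not fields[0].isdigit():
            continue
        # Drop a trailing flags column such as "unwritten" or "unwritten,eof".
        if not fields[-1].isdigit():
            fields = fields[:-1]
        # Need at least ext, logical, physical, length ("expected" is optional).
        if len(fields) < 4 or not all(f.isdigit() for f in fields):
            continue
        lengths.append(int(fields[-1]))
    return lengths


def length_histogram(filefrag_output):
    """Map extent length -> number of extents with that length."""
    return Counter(extent_lengths(filefrag_output))


for length, count in sorted(length_histogram(SAMPLE).items()):
    print(f"{count:7d} {length}")
```

Feeding it the full dump instead of SAMPLE should give the same counts as the awk pipeline, minus the non-numeric rows.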

Yet looking at the next file,

[brong@imap14 conf]$ filefrag -v testfile2 | awk '{print $5}' | sort -n | uniq -c | tail
1 173
1 175
1 178
1 184
1 187
1 189
1 194
1 289
1 321
1 330


There are multiple extents of hundreds of blocks in length. Why weren't they used in allocating the first file?

This filesystem is quite busy all the time. There are hundreds of imapd processes all locking and writing to it, including a lot of fdatasync and fsync calls. During the time it took to run this command, there would have been multiple fsyncs. I can't see why that would affect the allocator in this way for a single fallocate call though.

Regards,

Bron.

(full filefrag dump for testfile follows:)

[brong@imap14 conf]$ filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 20971520 (20480 blocks, blocksize 1024)
ext logical physical expected length flags
0 0 13545000 48 unwritten
1 48 13545134 13545048 48 unwritten
2 96 13545357 13545182 48 unwritten
3 144 13545431 13545405 48 unwritten
4 192 13546319 13545479 48 unwritten
5 240 13546936 13546367 48 unwritten
6 288 13542294 13546984 47 unwritten
7 335 13542391 13542341 47 unwritten
8 382 13542478 13542438 47 unwritten
9 429 13542719 13542525 47 unwritten
10 476 13542927 13542766 47 unwritten
11 523 13543448 13542974 47 unwritten
12 570 13543595 13543495 47 unwritten
13 617 13543745 13543642 47 unwritten
14 664 13543994 13543792 47 unwritten
15 711 13544115 13544041 47 unwritten
16 758 13544396 13544162 47 unwritten
17 805 13544471 13544443 47 unwritten
18 852 13544608 13544518 47 unwritten
19 899 13547571 13544655 48 unwritten
20 947 13545648 13547619 47 unwritten
21 994 13545706 13545695 47 unwritten
22 1041 13544043 13545753 46 unwritten
23 1087 13547656 13544089 88 unwritten
24 1175 13544923 13547744 46 unwritten
25 1221 13547756 13544969 47 unwritten
26 1268 13545598 13547803 46 unwritten
27 1314 13546485 13545644 46 unwritten
28 1360 13546677 13546531 46 unwritten
29 1406 13547836 13546723 83 unwritten
30 1489 13547921 13547919 51 unwritten
31 1540 13547973 13547972 55 unwritten
32 1595 13546828 13548028 46 unwritten
33 1641 13548107 13546874 68 unwritten
34 1709 13548176 13548175 59 unwritten
35 1768 13546883 13548235 46 unwritten
36 1814 13542536 13546929 45 unwritten
37 1859 13546182 13542581 44 unwritten
38 1903 13548255 13546226 52 unwritten
39 1955 13541526 13548307 43 unwritten
40 1998 13545077 13541569 43 unwritten
41 2041 13545183 13545120 43 unwritten
42 2084 13541687 13545226 42 unwritten
43 2126 13543819 13541729 42 unwritten
44 2168 13548400 13543861 132 unwritten
45 2300 13548605 13548532 73 unwritten
46 2373 13548679 13548678 43 unwritten
47 2416 13545815 13548722 42 unwritten
48 2458 13548728 13545857 104 unwritten
49 2562 13548833 13548832 62 unwritten
50 2624 13548896 13548895 45 unwritten
51 2669 13548942 13548941 66 unwritten
52 2735 13546442 13549008 40 unwritten
53 2775 13542342 13546482 38 unwritten
54 2813 13544286 13542380 38 unwritten
55 2851 13549069 13544324 152 unwritten
56 3003 13541775 13549221 37 unwritten
57 3040 13545858 13541812 37 unwritten
58 3077 13549016 13545895 37 unwritten
59 3114 13544780 13549053 36 unwritten
60 3150 13547004 13544816 36 unwritten
61 3186 13546147 13547040 34 unwritten
62 3220 13549304 13546181 60 unwritten
63 3280 13549365 13549364 76 unwritten
64 3356 13548308 13549441 34 unwritten
65 3390 13542234 13548342 32 unwritten
66 3422 13548343 13542266 32 unwritten
67 3454 13542009 13548375 31 unwritten
68 3485 13542445 13542040 31 unwritten
69 3516 13542662 13542476 31 unwritten
70 3547 13545965 13542693 31 unwritten
71 3578 13549222 13545996 31 unwritten
72 3609 13543892 13549253 30 unwritten
73 3639 13544749 13543922 30 unwritten
74 3669 13545323 13544779 30 unwritten
75 3699 13546759 13545353 30 unwritten
76 3729 13546645 13546789 29 unwritten
77 3758 13547328 13546674 29 unwritten
78 3787 13544257 13547357 28 unwritten
79 3815 13549893 13544285 33 unwritten
80 3848 13550421 13549926 60 unwritten
81 3908 13554246 13550481 59 unwritten
82 3967 13550131 13554305 56 unwritten
83 4023 13551262 13550187 56 unwritten
84 4079 13551565 13551318 56 unwritten
85 4135 13551661 13551621 56 unwritten
86 4191 13553303 13551717 56 unwritten
87 4247 13555017 13553359 56 unwritten
88 4303 13550960 13555073 55 unwritten
89 4358 13554082 13551015 55 unwritten
90 4413 13554384 13554137 55 unwritten
91 4468 13554916 13554439 55 unwritten
92 4523 13551128 13554971 54 unwritten
93 4577 13552389 13551182 52 unwritten
94 4629 13555468 13552441 51 unwritten
95 4680 13554742 13555519 50 unwritten
96 4730 13555208 13554792 50 unwritten
97 4780 13554673 13555258 49 unwritten
98 4829 13555132 13554722 48 unwritten
99 4877 13551757 13555180 47 unwritten
100 4924 13552116 13551804 47 unwritten
101 4971 13552595 13552163 47 unwritten
102 5018 13554562 13552642 47 unwritten
103 5065 13550644 13554609 46 unwritten
104 5111 13550702 13550690 46 unwritten
105 5157 13552001 13550748 46 unwritten
106 5203 13552535 13552047 46 unwritten
107 5249 13552971 13552581 46 unwritten
108 5295 13553841 13553017 46 unwritten
109 5341 13557058 13553887 123 unwritten
110 5464 13550578 13557181 45 unwritten
111 5509 13551044 13550623 43 unwritten
112 5552 13557376 13551087 111 unwritten
113 5663 13557488 13557487 71 unwritten
114 5734 13551325 13557559 43 unwritten
115 5777 13552774 13551368 43 unwritten
116 5820 13553935 13552817 43 unwritten
117 5863 13557599 13553978 162 unwritten
118 6025 13552493 13557761 41 unwritten
119 6066 13550194 13552534 39 unwritten
120 6105 13550520 13550233 39 unwritten
121 6144 13553029 13550559 39 unwritten
122 6183 13552643 13553068 38 unwritten
123 6221 13555841 13552681 38 unwritten
124 6259 13555681 13555879 37 unwritten
125 6296 13551425 13555718 36 unwritten
126 6332 13550385 13551461 35 unwritten
127 6367 13552288 13550420 35 unwritten
128 6402 13552457 13552323 35 unwritten
129 6437 13555329 13552492 35 unwritten
130 6472 13550340 13555364 33 unwritten
131 6505 13550901 13550373 33 unwritten
132 6538 13552164 13550934 33 unwritten
133 6571 13551528 13552197 32 unwritten
134 6603 13551622 13551560 31 unwritten
135 6634 13557026 13551653 31 unwritten
136 6665 13552696 13557057 29 unwritten
137 6694 13549661 13552725 26 unwritten
138 6720 13552070 13549687 26 unwritten
139 6746 13557336 13552096 26 unwritten
140 6772 13549741 13557362 24 unwritten
141 6796 13551016 13549765 24 unwritten
142 6820 13551805 13551040 24 unwritten
143 6844 13551878 13551829 24 unwritten
144 6868 13553981 13551902 24 unwritten
145 6892 13554972 13554005 24 unwritten
146 6916 13556993 13554996 23 unwritten
147 6939 13554631 13557016 22 unwritten
148 6961 13551201 13554653 21 unwritten
149 6982 13553888 13551222 21 unwritten
150 7003 13554508 13553909 21 unwritten
151 7024 13557560 13554529 21 unwritten
152 7045 13554610 13557581 20 unwritten
153 7065 13555188 13554630 19 unwritten
154 7084 13553910 13555207 18 unwritten
155 7102 13549803 13553928 17 unwritten
156 7119 13550307 13549820 17 unwritten
157 7136 13550938 13550324 17 unwritten
158 7153 13551381 13550955 17 unwritten
159 7170 13553806 13551398 17 unwritten
160 7187 13556111 13553823 17 unwritten
161 7204 13556460 13556128 17 unwritten
162 7221 13549602 13556477 16 unwritten
163 7237 13549786 13549618 16 unwritten
164 7253 13550482 13549802 16 unwritten
165 7269 13552097 13550498 16 unwritten
166 7285 13556478 13552113 15 unwritten
167 7300 13556776 13556493 15 unwritten
168 7315 13549622 13556791 13 unwritten
169 7328 13549993 13549635 13 unwritten
170 7341 13550234 13550006 13 unwritten
171 7354 13550499 13550247 13 unwritten
172 7367 13551950 13550512 13 unwritten
173 7380 13554362 13551963 13 unwritten
174 7393 13557585 13554375 13 unwritten
175 7406 13550091 13557598 12 unwritten
176 7418 13550248 13550103 12 unwritten
177 7430 13551484 13550260 12 unwritten
178 7442 13551981 13551496 12 unwritten
179 7454 13552582 13551993 12 unwritten
180 7466 13554799 13552594 12 unwritten
181 7478 13554997 13554811 12 unwritten
182 7490 13555452 13555009 12 unwritten
183 7502 13556098 13555464 12 unwritten
184 7514 13557363 13556110 12 unwritten
185 7526 13550750 13557375 11 unwritten
186 7537 13550762 13550761 11 unwritten
187 7548 13551903 13550773 11 unwritten
188 7559 13554730 13551914 11 unwritten
189 7570 13550691 13554741 10 unwritten
190 7580 13551517 13550701 10 unwritten
191 7590 13552324 13551527 10 unwritten
192 7600 13554013 13552334 10 unwritten
193 7610 13554812 13554023 10 unwritten
194 7620 13549843 13554822 9 unwritten
195 7629 13551118 13549852 9 unwritten
196 7638 13551250 13551127 9 unwritten
197 7647 13551500 13551259 9 unwritten
198 7656 13552335 13551509 9 unwritten
199 7665 13553684 13552344 9 unwritten
200 7674 13554661 13553693 9 unwritten
201 7683 13556886 13554670 9 unwritten
202 7692 13559300 13556895 9 unwritten
203 7701 13561293 13559309 9 unwritten
204 7710 13561303 13561302 8 unwritten
205 7718 13560950 13561311 7 unwritten
206 7725 13561199 13560957 7 unwritten
207 7732 13561317 13561206 7 unwritten
208 7739 13559285 13561324 6 unwritten
209 7745 13561183 13559291 6 unwritten
210 7751 13561192 13561189 6 unwritten
211 7757 13561548 13561198 6 unwritten
212 7763 13558768 13561554 5 unwritten
213 7768 13558889 13558773 5 unwritten
214 7773 13559294 13558894 5 unwritten
215 7778 13559310 13559299 5 unwritten
216 7783 13560214 13559315 5 unwritten
217 7788 13561118 13560219 5 unwritten
218 7793 13558047 13561123 4 unwritten
219 7797 13558332 13558051 4 unwritten
220 7801 13558365 13558336 4 unwritten
221 7805 13558622 13558369 4 unwritten
222 7809 13558679 13558626 4 unwritten
223 7813 13558749 13558683 4 unwritten
224 7817 13559165 13558753 4 unwritten
225 7821 13564189 13559169 36 unwritten
226 7857 13564289 13564225 70 unwritten
227 7927 13559354 13564359 4 unwritten
228 7931 13564891 13559358 38 unwritten
229 7969 13565148 13564929 84 unwritten
230 8053 13565233 13565232 63 unwritten
231 8116 13565297 13565296 19 unwritten
232 8135 13565317 13565316 10 unwritten
233 8145 13565328 13565327 41 unwritten
234 8186 13559471 13565369 4 unwritten
235 8190 13565375 13559475 45 unwritten
236 8235 13565421 13565420 38 unwritten
237 8273 13565460 13565459 48 unwritten
238 8321 13565509 13565508 17 unwritten
239 8338 13565370 13565526 4 unwritten
240 8342 13557962 13565374 3 unwritten
241 8345 13565533 13557965 94 unwritten
242 8439 13558370 13565627 3 unwritten
243 8442 13565631 13558373 6 unwritten
244 8448 13565638 13565637 86 unwritten
245 8534 13558540 13565724 3 unwritten
246 8537 13565729 13558543 13 unwritten
247 8550 13565743 13565742 105 unwritten
248 8655 13565849 13565848 28 unwritten
249 8683 13565878 13565877 75 unwritten
250 8758 13558664 13565953 3 unwritten
251 8761 13558762 13558667 3 unwritten
252 8764 13558858 13558765 3 unwritten
253 8767 13565981 13558861 11 unwritten
254 8778 13570409 13565992 54 unwritten
255 8832 13571201 13570463 47 unwritten
256 8879 13575188 13571248 35 unwritten
257 8914 13579393 13575223 119 unwritten
258 9033 13582040 13579512 106 unwritten
259 9139 13582147 13582146 82 unwritten
260 9221 13582231 13582229 106 unwritten
261 9327 13575840 13582337 53 unwritten
262 9380 13582005 13575893 34 unwritten
263 9414 13579728 13582039 33 unwritten
264 9447 13579915 13579761 24 unwritten
265 9471 13574339 13579939 23 unwritten
266 9494 13577088 13574362 20 unwritten
267 9514 13577627 13577108 19 unwritten
268 9533 13579950 13577646 18 unwritten
269 9551 13574295 13579968 17 unwritten
270 9568 13582508 13574312 30 unwritten
271 9598 13588652 13582538 146 unwritten
272 9744 13585801 13588798 121 unwritten
273 9865 13585434 13585922 120 unwritten
274 9985 13588448 13585554 119 unwritten
275 10104 13588180 13588567 113 unwritten
276 10217 13588951 13588293 107 unwritten
277 10324 13586207 13589058 106 unwritten
278 10430 13589131 13586313 164 unwritten
279 10594 13586622 13589295 102 unwritten
280 10696 13589328 13586724 106 unwritten
281 10802 13586927 13589434 82 unwritten
282 10884 13585993 13587009 79 unwritten
283 10963 13587311 13586072 78 unwritten
284 11041 13589514 13587389 78 unwritten
285 11119 13587602 13589592 77 unwritten
286 11196 13589929 13587679 161 unwritten
287 11357 13589436 13590090 77 unwritten
288 11434 13590120 13589513 120 unwritten
289 11554 13590241 13590240 126 unwritten
290 11680 13586543 13590367 76 unwritten
291 11756 13590409 13586619 120 unwritten
292 11876 13589593 13590529 70 unwritten
293 11946 13586857 13589663 69 unwritten
294 12015 13586359 13586926 67 unwritten
295 12082 13588883 13586426 67 unwritten
296 12149 13589063 13588950 67 unwritten
297 12216 13586727 13589130 64 unwritten
298 12280 13586468 13586791 61 unwritten
299 12341 13585929 13586529 60 unwritten
300 12401 13586073 13585989 59 unwritten
301 12460 13585613 13586132 58 unwritten
302 12518 13585690 13585671 58 unwritten
303 12576 13584829 13585748 57 unwritten
304 12633 13589704 13584886 57 unwritten
305 12690 13588800 13589761 52 unwritten
306 12742 13588312 13588852 51 unwritten
307 12793 13586148 13588363 47 unwritten
308 12840 44779909 13586195 61 unwritten
309 12901 44780846 44779970 79 unwritten
310 12980 44780458 44780925 61 unwritten
311 13041 44779653 44780519 58 unwritten
312 13099 44777775 44779711 52 unwritten
313 13151 44778482 44777827 51 unwritten
314 13202 44779251 44778533 51 unwritten
315 13253 44781016 44779302 91 unwritten
316 13344 44778808 44781107 48 unwritten
317 13392 44779120 44778856 46 unwritten
318 13438 44781140 44779166 63 unwritten
319 13501 44778244 44781203 45 unwritten
320 13546 44778762 44778289 45 unwritten
321 13591 44781225 44778807 95 unwritten
322 13686 44778858 44781320 44 unwritten
323 13730 44778187 44778902 43 unwritten
324 13773 44779053 44778230 41 unwritten
325 13814 44781399 44779094 43 unwritten
326 13857 44778962 44781442 40 unwritten
327 13897 44779487 44779002 40 unwritten
328 13937 44778414 44779527 39 unwritten
329 13976 44778922 44778453 39 unwritten
330 14015 44781321 44778961 39 unwritten
331 14054 44781517 44781360 85 unwritten
332 14139 44780786 44781602 38 unwritten
333 14177 44777555 44780824 37 unwritten
334 14214 44780587 44777592 36 unwritten
335 14250 44778642 44780623 35 unwritten
336 14285 44781647 44778677 100 unwritten
337 14385 44781748 44781747 45 unwritten
338 14430 44780382 44781793 35 unwritten
339 14465 44778340 44780417 34 unwritten
340 14499 44779385 44778374 34 unwritten
341 14533 44781848 44779419 72 unwritten
342 14605 44779870 44781920 34 unwritten
343 14639 44780015 44779904 34 unwritten
344 14673 44777889 44780049 33 unwritten
345 14706 44781972 44777922 138 unwritten
346 14844 44782111 44782110 160 unwritten
347 15004 44780926 44782271 32 unwritten
348 15036 44778375 44780958 31 unwritten
349 15067 44778575 44778406 31 unwritten
350 15098 44780187 44778606 30 unwritten
351 15128 44782299 44780217 37 unwritten
352 15165 44778688 44782336 29 unwritten
353 15194 44782353 44778717 75 unwritten
354 15269 44782429 44782428 122 unwritten
355 15391 44782553 44782551 322 unwritten
356 15713 44782876 44782875 40 unwritten
357 15753 44782917 44782916 30 unwritten
358 15783 44782948 44782947 70 unwritten
359 15853 44783019 44783018 86 unwritten
360 15939 44783107 44783105 215 unwritten
361 16154 44783323 44783322 35 unwritten
362 16189 44778724 44783358 29 unwritten
363 16218 44783389 44778753 52 unwritten
364 16270 44783442 44783441 67 unwritten
365 16337 44783510 44783509 76 unwritten
366 16413 44783587 44783586 91 unwritten
367 16504 44783679 44783678 188 unwritten
368 16692 44783868 44783867 55 unwritten
369 16747 44783924 44783923 45 unwritten
370 16792 44783970 44783969 39 unwritten
371 16831 44780072 44784009 29 unwritten
372 16860 44784022 44780101 73 unwritten
373 16933 44784096 44784095 76 unwritten
374 17009 44780115 44784172 29 unwritten
375 17038 44780145 44780144 29 unwritten
376 17067 44784181 44780174 233 unwritten
377 17300 44784415 44784414 255 unwritten
378 17555 44783359 44784670 29 unwritten
379 17584 44779712 44783388 28 unwritten
380 17612 44779777 44779740 28 unwritten
381 17640 44784707 44779805 52 unwritten
382 17692 44781369 44784759 28 unwritten
383 17720 44777970 44781397 27 unwritten
384 17747 44778085 44777997 27 unwritten
385 17774 44778141 44778112 27 unwritten
386 17801 44778293 44778168 27 unwritten
387 17828 44784809 44778320 179 unwritten
388 18007 44784989 44784988 40 unwritten
389 18047 44785030 44785029 231 unwritten
390 18278 44785262 44785261 45 unwritten
391 18323 44785308 44785307 357 unwritten
392 18680 44778547 44785665 27 unwritten
393 18707 44779357 44778574 27 unwritten
394 18734 44779537 44779384 27 unwritten
395 18761 44780278 44779564 27 unwritten
396 18788 44781921 44780305 27 unwritten
397 18815 44779017 44781948 26 unwritten
398 18841 44779580 44779043 26 unwritten
399 18867 44780335 44779606 26 unwritten
400 18893 44780658 44780361 26 unwritten
401 18919 44780728 44780684 26 unwritten
402 18945 44785892 44780754 30 unwritten
403 18975 44789606 44785922 136 unwritten
404 19111 44788602 44789742 104 unwritten
405 19215 44789075 44788706 95 unwritten
406 19310 44789895 44789170 92 unwritten
407 19402 44787855 44789987 78 unwritten
408 19480 44787670 44787933 77 unwritten
409 19557 44791413 44787747 80 unwritten
410 19637 44788513 44791493 75 unwritten
411 19712 44790631 44788588 75 unwritten
412 19787 44787177 44790706 71 unwritten
413 19858 44791556 44787248 122 unwritten
414 19980 44790092 44791678 69 unwritten
415 20049 44786455 44790161 68 unwritten
416 20117 44790242 44786523 66 unwritten
417 20183 44791726 44790308 118 unwritten
418 20301 44791100 44791844 63 unwritten
419 20364 44788380 44791163 62 unwritten
420 20426 44786252 44788442 54 unwritten,eof
testfile: 421 extents found


--
Bron Gondwana



2013-01-30 06:05:45

by Eric Sandeen

Subject: Re: fallocate creating fragmented files

On 1/29/13 11:46 PM, Bron Gondwana wrote:
> Hi All,
>
> I'm trying to understand why my ext4 filesystem is creating highly fragmented files even though it's only just over 50% full.

It's at least possible that freespace is very fragmented; you could try the "e2freefrag" command to see.

<background snipped>

> Now we've got some background, let's create two files using the fallocate command line tool, and see where the blocks wound up. This mirrors exactly the behaviour we have seen with both using posix_fallocate to pre-allocate space, or just using seek and writev to write individual records out to the file. Except with the slower writes of a real application, we get closer to 9000 extents for a file this size.
>
> [brong@imap14 conf]$ fallocate -l 20m testfile
> [brong@imap14 conf]$ fallocate -l 20m testfile2
> [brong@imap14 conf]$ filefrag testfile
> testfile: 421 extents found
> [brong@imap14 conf]$ filefrag testfile2
> testfile2: 306 extents found
>
> Now looking at the verbose output, we can see that there are many extents of just 3 or 4 blocks:
>
> [brong@imap14 conf]$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | head
> 2
> 1 is
> 1 length
> 1 unwritten
> 6 3
> 10 4
> 6 5
> 5 6
> 3 7
> 1 8

But longer extents too, right:

$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | tail
1 162
1 164
1 179
1 188
1 215
1 231
1 233
1 255
1 322
1 357

> Yet looking at the next file,
>
> [brong@imap14 conf]$ filefrag -v testfile2 | awk '{print $5}' | sort -n | uniq -c | tail
> 1 173
> 1 175
> 1 178
> 1 184
> 1 187
> 1 189
> 1 194
> 1 289
> 1 321
> 1 330
>

and presumably shorter extents at the beginning?

So it sounds like both files are a mix of long & short extents.

> There are multiple extents of hundreds of blocks in length. Why weren't they used in allocating the first file?

I'm not sure, offhand. But just to be clear, while contiguous allocations are usually a nice side-effect of fallocate, nothing at all guarantees it. It only guarantees that you'll have that space available for future writes.
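
To illustrate that point, here is a small Python sketch, assuming a Linux system where `os.posix_fallocate` is available. The helper name `preallocate` and the 1 MiB size are illustrative only. It reserves space and reports how much was allocated; nothing in the call lets the caller ask for, or check, contiguity.

```python
import os
import tempfile


def preallocate(path, size):
    """Reserve `size` bytes for `path`; return (file size, bytes allocated)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Reserves space for the byte range [0, size). The call says nothing
        # about where the blocks land or how many extents they form.
        os.posix_fallocate(fd, 0, size)
        st = os.fstat(fd)
    finally:
        os.close(fd)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    size, allocated = preallocate(os.path.join(tmp, "testfile"), 1 << 20)
    print("size:", size, "allocated bytes:", allocated)
```

The extent layout still has to be inspected separately, for example with filefrag as in the original message.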

Still, it'd be interesting to figure out why the allocator is behaving this way.
It'd be interesting to see the freefrag info; the allocator might really be in scavenger mode.

-Eric

> This filesystem is quite busy all the time. There are hundreds of imapd processes all locking and writing to it, including a lot of fdatasync and fsync calls. During the time it took to run this command, there would have been multiple fsyncs. I can't see why that would affect the allocator in this way for a single fallocate call though.
>
> Regards,
>
> Bron.



2013-01-30 06:35:14

by Bron Gondwana

Subject: Re: fallocate creating fragmented files

On Wed, Jan 30, 2013, at 05:05 PM, Eric Sandeen wrote:
> On 1/29/13 11:46 PM, Bron Gondwana wrote:
> > Hi All,
> >
> > I'm trying to understand why my ext4 filesystem is creating highly fragmented files even though it's only just over 50% full.
>
> It's at least possible that freespace is very fragmented; you could try the "e2freefrag" command to see.

[brong@imap14 ~]$ e2freefrag /dev/md0
Device: /dev/md0
Blocksize: 1024 bytes
Total blocks: 62522624
Free blocks: 26483551 (42.4%)

Min. free extent: 1 KB
Max. free extent: 757 KB
Avg. free extent: 14 KB
Num. free extent: 1940838

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range : Free extents Free Blocks Percent
1K... 2K- : 538480 538480 2.03%
2K... 4K- : 362189 870860 3.29%
4K... 8K- : 321158 1681591 6.35%
8K... 16K- : 268848 2934959 11.08%
16K... 32K- : 210746 4697440 17.74%
32K... 64K- : 151755 6738418 25.44%
64K... 128K- : 63761 5512870 20.82%
128K... 256K- : 20563 3552580 13.41%
256K... 512K- : 3308 1047995 3.96%
512K... 1024K- : 30 17615 0.07%
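
As a rough way to read that histogram, the sketch below (Python, with the rows copied by hand from the output above) totals the two count columns and works out the mean free extent size and the share of free blocks sitting in extents smaller than 64K. The totals come from the histogram rows themselves rather than from the "Free blocks" header line, and the `summarize` helper is only an illustration.

```python
# (upper bound of the size range in KB, free extents, free blocks),
# copied from the e2freefrag histogram for /dev/md0 above.
HISTOGRAM = [
    (2, 538480, 538480),
    (4, 362189, 870860),
    (8, 321158, 1681591),
    (16, 268848, 2934959),
    (32, 210746, 4697440),
    (64, 151755, 6738418),
    (128, 63761, 5512870),
    (256, 20563, 3552580),
    (512, 3308, 1047995),
    (1024, 30, 17615),
]


def summarize(histogram, small_limit_kb):
    """Return (extents, blocks, mean blocks per extent, share in small extents).

    "Small" means every range whose upper bound is <= small_limit_kb.
    """
    total_extents = sum(extents for _, extents, _ in histogram)
    total_blocks = sum(blocks for _, _, blocks in histogram)
    small_blocks = sum(
        blocks for upper_kb, _, blocks in histogram if upper_kb <= small_limit_kb
    )
    return (
        total_extents,
        total_blocks,
        total_blocks / total_extents,
        small_blocks / total_blocks,
    )


extents, blocks, mean_blocks, small_share = summarize(HISTOGRAM, 64)
print(f"free extents: {extents}")
print(f"free blocks in histogram: {blocks}")
print(f"mean free extent: {mean_blocks:.1f} blocks")
print(f"share of free blocks in extents under 64K: {small_share:.1%}")
```

With 1K blocks, the mean in blocks is also the mean in KB, so it can be compared directly with the "Avg. free extent" line.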

> > Now looking at the verbose output, we can see that there are many extents of just 3 or 4 blocks:
> >
> > [brong@imap14 conf]$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | head
> > 2
> > 1 is
> > 1 length
> > 1 unwritten
> > 6 3
> > 10 4
> > 6 5
> > 5 6
> > 3 7
> > 1 8
>
> But longer extents too, right:
>
> $ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | tail
> 1 162
> 1 164
> 1 179
> 1 188
> 1 215
> 1 231
> 1 233
> 1 255
> 1 322
> 1 357
>
> > Yet looking at the next file,
> >
> > [brong@imap14 conf]$ filefrag -v testfile2 | awk '{print $5}' | sort -n | uniq -c | tail
> > 1 173
> > 1 175
> > 1 178
> > 1 184
> > 1 187
> > 1 189
> > 1 194
> > 1 289
> > 1 321
> > 1 330
> >
>
> and presumably shorter extents at the beginning?

Well, that output is sorted, so yes, there were shorter extents too.

> So it sounds like both files are a mix of long & short extents.

Definitely.

> > There are multiple extents of hundreds of blocks in length. Why weren't they used in allocating the first file?
>
> I'm not sure, offhand. But just to be clear, while contiguous allocations are usually a nice side-effect of fallocate, nothing at all guarantees it. It only guarantees that you'll have that space available for future writes.

Sure. I was hoping it would help though!

> Still, it'd be interesting to figure out why the allocator is behaving this way.
> It'd be interesting to see the freefrag info, the allocator might really be in scavenger mode.

What do you think of the output above? Is that reasonable? I'll check a more recently set-up machine.

[brong@imap30 ~]$ e2freefrag /dev/sdf1
Device: /dev/sdf1
Blocksize: 1024 bytes

Total blocks: 97124320
Free blocks: 68429391 (70.5%)

Min. free extent: 1 KB
Max. free extent: 1009 KB
Avg. free extent: 25 KB
Num. free extent: 2781696

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range : Free extents Free Blocks Percent
1K... 2K- : 705257 705257 1.03%
2K... 4K- : 553577 1348712 1.97%
4K... 8K- : 349406 1789755 2.62%
8K... 16K- : 289102 3185026 4.65%
16K... 32K- : 279061 6307452 9.22%
32K... 64K- : 271631 12321046 18.01%
64K... 128K- : 205191 18340308 26.80%
128K... 256K- : 110082 19121199 27.94%
256K... 512K- : 16962 5584384 8.16%
512K... 1024K- : 1427 882388 1.29%

This one is 100GB SSDs from some other vendor (can't remember which) on hardware RAID1. It's never been more than about 30% full. It looks like a similar histogram of extent sizes. Again it's a 1KB block size (piles of small files on these filesystems).

[brong@imap30 ~]$ dumpe2fs -h /dev/sdf1
dumpe2fs 1.42.4 (12-Jun-2012)
Filesystem volume name: ssd30
Last mounted on: /mnt/ssd30
Filesystem UUID: c2623b6a-b3f4-4a5a-99e3-495f29112ba6
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 12140544
Block count: 97124320
Reserved block count: 4856216
Free blocks: 68429391
Free inodes: 7157347
First block: 1
Block size: 1024
Fragment size: 1024
Reserved GDT blocks: 256
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 1024
Inode blocks per group: 256
Flex block group size: 16
Filesystem created: Tue Aug 2 07:39:40 2011
Last mount time: Thu Jan 24 23:15:41 2013
Last write time: Thu Jan 24 23:15:41 2013
Mount count: 10
Maximum mount count: 39
Last checked: Tue Aug 2 07:39:40 2011
Check interval: 15552000 (6 months)
Next check after: Sun Jan 29 06:39:40 2012
Lifetime writes: 13 TB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 0ecbfe75-57e3-4d4e-b4a8-bf0114dc0997
Journal backup: inode blocks
Journal features: journal_incompat_revoke
Journal size: 32M
Journal length: 32768
Journal sequence: 0x32367a0d
Journal start: 1537

Regards,

Bron.
--
Bron Gondwana


2013-01-30 15:56:55

by Eric Sandeen

Subject: Re: fallocate creating fragmented files

On 1/30/13 12:35 AM, Bron Gondwana wrote:
> On Wed, Jan 30, 2013, at 05:05 PM, Eric Sandeen wrote:
>> On 1/29/13 11:46 PM, Bron Gondwana wrote:
>>> Hi All,
>>>
>>> I'm trying to understand why my ext4 filesystem is creating highly fragmented files even though it's only just over 50% full.
>>
>> It's at least possible that freespace is very fragmented; you could try the "e2freefrag" command to see.
>
> [brong@imap14 ~]$ e2freefrag /dev/md0
> Device: /dev/md0
> Blocksize: 1024 bytes
> Total blocks: 62522624
> Free blocks: 26483551 (42.4%)
>
> Min. free extent: 1 KB
> Max. free extent: 757 KB
> Avg. free extent: 14 KB
> Num. free extent: 1940838
>
> HISTOGRAM OF FREE EXTENT SIZES:
> Extent Size Range : Free extents Free Blocks Percent
> 1K... 2K- : 538480 538480 2.03%
> 2K... 4K- : 362189 870860 3.29%
> 4K... 8K- : 321158 1681591 6.35%
> 8K... 16K- : 268848 2934959 11.08%
> 16K... 32K- : 210746 4697440 17.74%
> 32K... 64K- : 151755 6738418 25.44%
> 64K... 128K- : 63761 5512870 20.82%
> 128K... 256K- : 20563 3552580 13.41%
> 256K... 512K- : 3308 1047995 3.96%
> 512K... 1024K- : 30 17615 0.07%

OK, TBH I'm not certain why the allocator is doing just what it's doing.
There are quite a lot of larger-than-3-block free spaces. OTOH, it might be
trying for some kind of locality.

I think it'd take some digging into the allocator behavior; there may
be tracepoints that'd help.
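
One way to see which tracepoints a given kernel offers is to list the ext4 event directory under tracefs. The Python sketch below does only that. The default path is an assumption (tracefs is commonly reachable there, but some systems mount it at /sys/kernel/tracing and reading it usually needs root), the helper name is hypothetical, and the event names differ between kernel versions.

```python
import os

# Assumed location of the ext4 trace events; adjust for the local system.
EXT4_EVENTS_DIR = "/sys/kernel/debug/tracing/events/ext4"


def allocation_tracepoints(events_dir=EXT4_EVENTS_DIR, keywords=("alloc",)):
    """Return sorted event names under events_dir containing any keyword."""
    try:
        names = os.listdir(events_dir)
    except OSError:
        # tracefs not mounted, not readable, or not running as root.
        return []
    return sorted(
        name
        for name in names
        # Each event is a directory; control files such as "enable" are not.
        if os.path.isdir(os.path.join(events_dir, name))
        and any(keyword in name for keyword in keywords)
    )


for event in allocation_tracepoints():
    print(event)
```

Whichever events turn up would then be enabled through the usual tracing interface while a single fallocate is run.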

-Eric

>>> Now looking at the verbose output, we can see that there are many extents of just 3 or 4 blocks:
>>>
>>> [brong@imap14 conf]$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | head
>>> 2
>>> 1 is
>>> 1 length
>>> 1 unwritten
>>> 6 3
>>> 10 4
>>> 6 5
>>> 5 6
>>> 3 7
>>> 1 8
>>
>> But longer extents too, right:
>>
>> $ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | tail
>> 1 162
>> 1 164
>> 1 179
>> 1 188
>> 1 215
>> 1 231
>> 1 233
>> 1 255
>> 1 322
>> 1 357
>>
>>> Yet looking at the next file,
>>>
>>> [brong@imap14 conf]$ filefrag -v testfile2 | awk '{print $5}' | sort -n | uniq -c | tail
>>> 1 173
>>> 1 175
>>> 1 178
>>> 1 184
>>> 1 187
>>> 1 189
>>> 1 194
>>> 1 289
>>> 1 321
>>> 1 330
>>>
>>
>> and presumably shorter extents at the beginning?
>
> Well, that's sorted. Yes, there were shorter extents too.
>
>> So it sounds like both files are a mix of long & short extents.
>
> Definitely.
>
>>> There are multiple extents of hundreds of blocks in length. Why weren't they used in allocating the first file?
>>
>> I'm not sure, offhand. But just to be clear, while contiguous allocations are usually a nice side-effect of fallocate, nothing at all guarantees it. It only guarantees that you'll have that space available for future writes.
>
> Sure. I was hoping it would help though!
>
>> Still, it'd be interesting to figure out why the allocator is behaving this way.
>> It'd be interesting to see the freefrag info, the allocator might really be in scavenger mode.
>
> What do you think of the output above? Is that reasonable? I'll check a more recently set-up machine.
>
> [brong@imap30 ~]$ e2freefrag /dev/sdf1
> Device: /dev/sdf1
> Blocksize: 1024 bytes
>
> Total blocks: 97124320
> Free blocks: 68429391 (70.5%)
>
> Min. free extent: 1 KB
> Max. free extent: 1009 KB
> Avg. free extent: 25 KB
> Num. free extent: 2781696
>
> HISTOGRAM OF FREE EXTENT SIZES:
> Extent Size Range : Free extents Free Blocks Percent
> 1K... 2K- : 705257 705257 1.03%
> 2K... 4K- : 553577 1348712 1.97%
> 4K... 8K- : 349406 1789755 2.62%
> 8K... 16K- : 289102 3185026 4.65%
> 16K... 32K- : 279061 6307452 9.22%
> 32K... 64K- : 271631 12321046 18.01%
> 64K... 128K- : 205191 18340308 26.80%
> 128K... 256K- : 110082 19121199 27.94%
> 256K... 512K- : 16962 5584384 8.16%
> 512K... 1024K- : 1427 882388 1.29%
>
> This one is 100GB SSDs from some other vendor (can't remember which) on hardware RAID1. It's never been more than about 30% full. It looks like a similar histogram of extent sizes. Again it's a 1 KB block size (piles of small files on these filesystems).
>
> [brong@imap30 ~]$ dumpe2fs -h /dev/sdf1
> dumpe2fs 1.42.4 (12-Jun-2012)
> Filesystem volume name: ssd30
> Last mounted on: /mnt/ssd30
> Filesystem UUID: c2623b6a-b3f4-4a5a-99e3-495f29112ba6
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
> Filesystem flags: signed_directory_hash
> Default mount options: (none)
> Filesystem state: clean
> Errors behavior: Continue
> Filesystem OS type: Linux
> Inode count: 12140544
> Block count: 97124320
> Reserved block count: 4856216
> Free blocks: 68429391
> Free inodes: 7157347
> First block: 1
> Block size: 1024
> Fragment size: 1024
> Reserved GDT blocks: 256
> Blocks per group: 8192
> Fragments per group: 8192
> Inodes per group: 1024
> Inode blocks per group: 256
> Flex block group size: 16
> Filesystem created: Tue Aug 2 07:39:40 2011
> Last mount time: Thu Jan 24 23:15:41 2013
> Last write time: Thu Jan 24 23:15:41 2013
> Mount count: 10
> Maximum mount count: 39
> Last checked: Tue Aug 2 07:39:40 2011
> Check interval: 15552000 (6 months)
> Next check after: Sun Jan 29 06:39:40 2012
> Lifetime writes: 13 TB
> Reserved blocks uid: 0 (user root)
> Reserved blocks gid: 0 (group root)
> First inode: 11
> Inode size: 256
> Required extra isize: 28
> Desired extra isize: 28
> Journal inode: 8
> Default directory hash: half_md4
> Directory Hash Seed: 0ecbfe75-57e3-4d4e-b4a8-bf0114dc0997
> Journal backup: inode blocks
> Journal features: journal_incompat_revoke
> Journal size: 32M
> Journal length: 32768
> Journal sequence: 0x32367a0d
> Journal start: 1537
>
> Regards,
>
> Bron.
>


2013-01-30 20:14:18

by Theodore Ts'o

Subject: Re: fallocate creating fragmented files

On Wed, Jan 30, 2013 at 09:56:51AM -0600, Eric Sandeen wrote:
> Ok, TBH I'm not certain why the allocator is doing just what it's doing.
> There are quite a lot of larger-than-3-block free spaces. OTOH, it might be
> trying for some kind of locality.

Yeah, I'll bet that's what's going on.

Can you show us the inode number for each of the test files along
with the filefrag -v output? What I suspect is going on is that the
kernel is trying too hard to start the block allocation in the same
block group which was used for the inode number.

What we probably need is some heuristic: if we know there are plenty
of block groups with lots of large contiguous free space, and the
user requests a large fallocated region, then rather than sticking
with the block group we are preferring (either because it's the same
one as the inode number, or because it's where we did the last block
allocation for the file), it's better to switch over to one of the
other block groups with lots and lots of free space.

- Ted
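
To compare the inode's block group with the groups the data landed in, the standard ext4 group arithmetic is enough. This is a minimal sketch, assuming the imap14 geometry reported by dumpe2fs -h at the top of the thread (first block 1, 8192 blocks per group, 2048 inodes per group). The function names are hypothetical, and the inode number and physical block values are taken from the ls -i and filefrag -v output posted in the reply that follows.

```python
# Geometry from the imap14 "dumpe2fs -h" output at the top of the thread.
FIRST_BLOCK = 1
BLOCKS_PER_GROUP = 8192
INODES_PER_GROUP = 2048


def inode_group(ino):
    """Block group holding an inode (inode numbers start at 1)."""
    return (ino - 1) // INODES_PER_GROUP


def block_group(block):
    """Block group holding a physical block number."""
    return (block - FIRST_BLOCK) // BLOCKS_PER_GROUP


def group_first_block(group):
    """First physical block belonging to a block group."""
    return group * BLOCKS_PER_GROUP + FIRST_BLOCK


if __name__ == "__main__":
    # Values from the ls -i and filefrag -v output for testfile.
    print("inode 15238691 is in group", inode_group(15238691))
    print("block 13545000 is in group", block_group(13545000))
    print("block 44779909 is in group", block_group(44779909))
```

Note that flex_bg is enabled on this filesystem, which may affect which groups the allocator treats as close to the inode; the sketch ignores that and only does the plain per-group division.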

2013-01-30 21:21:51

by Robert Mueller

Subject: Re: fallocate creating fragmented files



> Can you show us the inode number for each of the test files along
> with the filefrag -v output? What I suspect is going on is that the
> kernel is trying too hard to start the block allocation in the same
> block group which was used for the inode number.

Since Bron isn't around right now and I know where the files are...

[robm@imap14 conf]$ ls -i testfile
15238691 testfile
[robm@imap14 conf]$ ls -i testfile2
15238707 testfile2
[robm@imap14 conf]$ filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 20971520 (20480 blocks, blocksize 1024)
ext logical physical expected length flags
0 0 13545000 48 unwritten
1 48 13545134 13545048 48 unwritten
2 96 13545357 13545182 48 unwritten
3 144 13545431 13545405 48 unwritten
4 192 13546319 13545479 48 unwritten
5 240 13546936 13546367 48 unwritten
6 288 13542294 13546984 47 unwritten
7 335 13542391 13542341 47 unwritten
8 382 13542478 13542438 47 unwritten
9 429 13542719 13542525 47 unwritten
10 476 13542927 13542766 47 unwritten
11 523 13543448 13542974 47 unwritten
12 570 13543595 13543495 47 unwritten
13 617 13543745 13543642 47 unwritten
14 664 13543994 13543792 47 unwritten
15 711 13544115 13544041 47 unwritten
16 758 13544396 13544162 47 unwritten
17 805 13544471 13544443 47 unwritten
18 852 13544608 13544518 47 unwritten
19 899 13547571 13544655 48 unwritten
20 947 13545648 13547619 47 unwritten
21 994 13545706 13545695 47 unwritten
22 1041 13544043 13545753 46 unwritten
23 1087 13547656 13544089 88 unwritten
24 1175 13544923 13547744 46 unwritten
25 1221 13547756 13544969 47 unwritten
26 1268 13545598 13547803 46 unwritten
27 1314 13546485 13545644 46 unwritten
28 1360 13546677 13546531 46 unwritten
29 1406 13547836 13546723 83 unwritten
30 1489 13547921 13547919 51 unwritten
31 1540 13547973 13547972 55 unwritten
32 1595 13546828 13548028 46 unwritten
33 1641 13548107 13546874 68 unwritten
34 1709 13548176 13548175 59 unwritten
35 1768 13546883 13548235 46 unwritten
36 1814 13542536 13546929 45 unwritten
37 1859 13546182 13542581 44 unwritten
38 1903 13548255 13546226 52 unwritten
39 1955 13541526 13548307 43 unwritten
40 1998 13545077 13541569 43 unwritten
41 2041 13545183 13545120 43 unwritten
42 2084 13541687 13545226 42 unwritten
43 2126 13543819 13541729 42 unwritten
44 2168 13548400 13543861 132 unwritten
45 2300 13548605 13548532 73 unwritten
46 2373 13548679 13548678 43 unwritten
47 2416 13545815 13548722 42 unwritten
48 2458 13548728 13545857 104 unwritten
49 2562 13548833 13548832 62 unwritten
50 2624 13548896 13548895 45 unwritten
51 2669 13548942 13548941 66 unwritten
52 2735 13546442 13549008 40 unwritten
53 2775 13542342 13546482 38 unwritten
54 2813 13544286 13542380 38 unwritten
55 2851 13549069 13544324 152 unwritten
56 3003 13541775 13549221 37 unwritten
57 3040 13545858 13541812 37 unwritten
58 3077 13549016 13545895 37 unwritten
59 3114 13544780 13549053 36 unwritten
60 3150 13547004 13544816 36 unwritten
61 3186 13546147 13547040 34 unwritten
62 3220 13549304 13546181 60 unwritten
63 3280 13549365 13549364 76 unwritten
64 3356 13548308 13549441 34 unwritten
65 3390 13542234 13548342 32 unwritten
66 3422 13548343 13542266 32 unwritten
67 3454 13542009 13548375 31 unwritten
68 3485 13542445 13542040 31 unwritten
69 3516 13542662 13542476 31 unwritten
70 3547 13545965 13542693 31 unwritten
71 3578 13549222 13545996 31 unwritten
72 3609 13543892 13549253 30 unwritten
73 3639 13544749 13543922 30 unwritten
74 3669 13545323 13544779 30 unwritten
75 3699 13546759 13545353 30 unwritten
76 3729 13546645 13546789 29 unwritten
77 3758 13547328 13546674 29 unwritten
78 3787 13544257 13547357 28 unwritten
79 3815 13549893 13544285 33 unwritten
80 3848 13550421 13549926 60 unwritten
81 3908 13554246 13550481 59 unwritten
82 3967 13550131 13554305 56 unwritten
83 4023 13551262 13550187 56 unwritten
84 4079 13551565 13551318 56 unwritten
85 4135 13551661 13551621 56 unwritten
86 4191 13553303 13551717 56 unwritten
87 4247 13555017 13553359 56 unwritten
88 4303 13550960 13555073 55 unwritten
89 4358 13554082 13551015 55 unwritten
90 4413 13554384 13554137 55 unwritten
91 4468 13554916 13554439 55 unwritten
92 4523 13551128 13554971 54 unwritten
93 4577 13552389 13551182 52 unwritten
94 4629 13555468 13552441 51 unwritten
95 4680 13554742 13555519 50 unwritten
96 4730 13555208 13554792 50 unwritten
97 4780 13554673 13555258 49 unwritten
98 4829 13555132 13554722 48 unwritten
99 4877 13551757 13555180 47 unwritten
100 4924 13552116 13551804 47 unwritten
101 4971 13552595 13552163 47 unwritten
102 5018 13554562 13552642 47 unwritten
103 5065 13550644 13554609 46 unwritten
104 5111 13550702 13550690 46 unwritten
105 5157 13552001 13550748 46 unwritten
106 5203 13552535 13552047 46 unwritten
107 5249 13552971 13552581 46 unwritten
108 5295 13553841 13553017 46 unwritten
109 5341 13557058 13553887 123 unwritten
110 5464 13550578 13557181 45 unwritten
111 5509 13551044 13550623 43 unwritten
112 5552 13557376 13551087 111 unwritten
113 5663 13557488 13557487 71 unwritten
114 5734 13551325 13557559 43 unwritten
115 5777 13552774 13551368 43 unwritten
116 5820 13553935 13552817 43 unwritten
117 5863 13557599 13553978 162 unwritten
118 6025 13552493 13557761 41 unwritten
119 6066 13550194 13552534 39 unwritten
120 6105 13550520 13550233 39 unwritten
121 6144 13553029 13550559 39 unwritten
122 6183 13552643 13553068 38 unwritten
123 6221 13555841 13552681 38 unwritten
124 6259 13555681 13555879 37 unwritten
125 6296 13551425 13555718 36 unwritten
126 6332 13550385 13551461 35 unwritten
127 6367 13552288 13550420 35 unwritten
128 6402 13552457 13552323 35 unwritten
129 6437 13555329 13552492 35 unwritten
130 6472 13550340 13555364 33 unwritten
131 6505 13550901 13550373 33 unwritten
132 6538 13552164 13550934 33 unwritten
133 6571 13551528 13552197 32 unwritten
134 6603 13551622 13551560 31 unwritten
135 6634 13557026 13551653 31 unwritten
136 6665 13552696 13557057 29 unwritten
137 6694 13549661 13552725 26 unwritten
138 6720 13552070 13549687 26 unwritten
139 6746 13557336 13552096 26 unwritten
140 6772 13549741 13557362 24 unwritten
141 6796 13551016 13549765 24 unwritten
142 6820 13551805 13551040 24 unwritten
143 6844 13551878 13551829 24 unwritten
144 6868 13553981 13551902 24 unwritten
145 6892 13554972 13554005 24 unwritten
146 6916 13556993 13554996 23 unwritten
147 6939 13554631 13557016 22 unwritten
148 6961 13551201 13554653 21 unwritten
149 6982 13553888 13551222 21 unwritten
150 7003 13554508 13553909 21 unwritten
151 7024 13557560 13554529 21 unwritten
152 7045 13554610 13557581 20 unwritten
153 7065 13555188 13554630 19 unwritten
154 7084 13553910 13555207 18 unwritten
155 7102 13549803 13553928 17 unwritten
156 7119 13550307 13549820 17 unwritten
157 7136 13550938 13550324 17 unwritten
158 7153 13551381 13550955 17 unwritten
159 7170 13553806 13551398 17 unwritten
160 7187 13556111 13553823 17 unwritten
161 7204 13556460 13556128 17 unwritten
162 7221 13549602 13556477 16 unwritten
163 7237 13549786 13549618 16 unwritten
164 7253 13550482 13549802 16 unwritten
165 7269 13552097 13550498 16 unwritten
166 7285 13556478 13552113 15 unwritten
167 7300 13556776 13556493 15 unwritten
168 7315 13549622 13556791 13 unwritten
169 7328 13549993 13549635 13 unwritten
170 7341 13550234 13550006 13 unwritten
171 7354 13550499 13550247 13 unwritten
172 7367 13551950 13550512 13 unwritten
173 7380 13554362 13551963 13 unwritten
174 7393 13557585 13554375 13 unwritten
175 7406 13550091 13557598 12 unwritten
176 7418 13550248 13550103 12 unwritten
177 7430 13551484 13550260 12 unwritten
178 7442 13551981 13551496 12 unwritten
179 7454 13552582 13551993 12 unwritten
180 7466 13554799 13552594 12 unwritten
181 7478 13554997 13554811 12 unwritten
182 7490 13555452 13555009 12 unwritten
183 7502 13556098 13555464 12 unwritten
184 7514 13557363 13556110 12 unwritten
185 7526 13550750 13557375 11 unwritten
186 7537 13550762 13550761 11 unwritten
187 7548 13551903 13550773 11 unwritten
188 7559 13554730 13551914 11 unwritten
189 7570 13550691 13554741 10 unwritten
190 7580 13551517 13550701 10 unwritten
191 7590 13552324 13551527 10 unwritten
192 7600 13554013 13552334 10 unwritten
193 7610 13554812 13554023 10 unwritten
194 7620 13549843 13554822 9 unwritten
195 7629 13551118 13549852 9 unwritten
196 7638 13551250 13551127 9 unwritten
197 7647 13551500 13551259 9 unwritten
198 7656 13552335 13551509 9 unwritten
199 7665 13553684 13552344 9 unwritten
200 7674 13554661 13553693 9 unwritten
201 7683 13556886 13554670 9 unwritten
202 7692 13559300 13556895 9 unwritten
203 7701 13561293 13559309 9 unwritten
204 7710 13561303 13561302 8 unwritten
205 7718 13560950 13561311 7 unwritten
206 7725 13561199 13560957 7 unwritten
207 7732 13561317 13561206 7 unwritten
208 7739 13559285 13561324 6 unwritten
209 7745 13561183 13559291 6 unwritten
210 7751 13561192 13561189 6 unwritten
211 7757 13561548 13561198 6 unwritten
212 7763 13558768 13561554 5 unwritten
213 7768 13558889 13558773 5 unwritten
214 7773 13559294 13558894 5 unwritten
215 7778 13559310 13559299 5 unwritten
216 7783 13560214 13559315 5 unwritten
217 7788 13561118 13560219 5 unwritten
218 7793 13558047 13561123 4 unwritten
219 7797 13558332 13558051 4 unwritten
220 7801 13558365 13558336 4 unwritten
221 7805 13558622 13558369 4 unwritten
222 7809 13558679 13558626 4 unwritten
223 7813 13558749 13558683 4 unwritten
224 7817 13559165 13558753 4 unwritten
225 7821 13564189 13559169 36 unwritten
226 7857 13564289 13564225 70 unwritten
227 7927 13559354 13564359 4 unwritten
228 7931 13564891 13559358 38 unwritten
229 7969 13565148 13564929 84 unwritten
230 8053 13565233 13565232 63 unwritten
231 8116 13565297 13565296 19 unwritten
232 8135 13565317 13565316 10 unwritten
233 8145 13565328 13565327 41 unwritten
234 8186 13559471 13565369 4 unwritten
235 8190 13565375 13559475 45 unwritten
236 8235 13565421 13565420 38 unwritten
237 8273 13565460 13565459 48 unwritten
238 8321 13565509 13565508 17 unwritten
239 8338 13565370 13565526 4 unwritten
240 8342 13557962 13565374 3 unwritten
241 8345 13565533 13557965 94 unwritten
242 8439 13558370 13565627 3 unwritten
243 8442 13565631 13558373 6 unwritten
244 8448 13565638 13565637 86 unwritten
245 8534 13558540 13565724 3 unwritten
246 8537 13565729 13558543 13 unwritten
247 8550 13565743 13565742 105 unwritten
248 8655 13565849 13565848 28 unwritten
249 8683 13565878 13565877 75 unwritten
250 8758 13558664 13565953 3 unwritten
251 8761 13558762 13558667 3 unwritten
252 8764 13558858 13558765 3 unwritten
253 8767 13565981 13558861 11 unwritten
254 8778 13570409 13565992 54 unwritten
255 8832 13571201 13570463 47 unwritten
256 8879 13575188 13571248 35 unwritten
257 8914 13579393 13575223 119 unwritten
258 9033 13582040 13579512 106 unwritten
259 9139 13582147 13582146 82 unwritten
260 9221 13582231 13582229 106 unwritten
261 9327 13575840 13582337 53 unwritten
262 9380 13582005 13575893 34 unwritten
263 9414 13579728 13582039 33 unwritten
264 9447 13579915 13579761 24 unwritten
265 9471 13574339 13579939 23 unwritten
266 9494 13577088 13574362 20 unwritten
267 9514 13577627 13577108 19 unwritten
268 9533 13579950 13577646 18 unwritten
269 9551 13574295 13579968 17 unwritten
270 9568 13582508 13574312 30 unwritten
271 9598 13588652 13582538 146 unwritten
272 9744 13585801 13588798 121 unwritten
273 9865 13585434 13585922 120 unwritten
274 9985 13588448 13585554 119 unwritten
275 10104 13588180 13588567 113 unwritten
276 10217 13588951 13588293 107 unwritten
277 10324 13586207 13589058 106 unwritten
278 10430 13589131 13586313 164 unwritten
279 10594 13586622 13589295 102 unwritten
280 10696 13589328 13586724 106 unwritten
281 10802 13586927 13589434 82 unwritten
282 10884 13585993 13587009 79 unwritten
283 10963 13587311 13586072 78 unwritten
284 11041 13589514 13587389 78 unwritten
285 11119 13587602 13589592 77 unwritten
286 11196 13589929 13587679 161 unwritten
287 11357 13589436 13590090 77 unwritten
288 11434 13590120 13589513 120 unwritten
289 11554 13590241 13590240 126 unwritten
290 11680 13586543 13590367 76 unwritten
291 11756 13590409 13586619 120 unwritten
292 11876 13589593 13590529 70 unwritten
293 11946 13586857 13589663 69 unwritten
294 12015 13586359 13586926 67 unwritten
295 12082 13588883 13586426 67 unwritten
296 12149 13589063 13588950 67 unwritten
297 12216 13586727 13589130 64 unwritten
298 12280 13586468 13586791 61 unwritten
299 12341 13585929 13586529 60 unwritten
300 12401 13586073 13585989 59 unwritten
301 12460 13585613 13586132 58 unwritten
302 12518 13585690 13585671 58 unwritten
303 12576 13584829 13585748 57 unwritten
304 12633 13589704 13584886 57 unwritten
305 12690 13588800 13589761 52 unwritten
306 12742 13588312 13588852 51 unwritten
307 12793 13586148 13588363 47 unwritten
308 12840 44779909 13586195 61 unwritten
309 12901 44780846 44779970 79 unwritten
310 12980 44780458 44780925 61 unwritten
311 13041 44779653 44780519 58 unwritten
312 13099 44777775 44779711 52 unwritten
313 13151 44778482 44777827 51 unwritten
314 13202 44779251 44778533 51 unwritten
315 13253 44781016 44779302 91 unwritten
316 13344 44778808 44781107 48 unwritten
317 13392 44779120 44778856 46 unwritten
318 13438 44781140 44779166 63 unwritten
319 13501 44778244 44781203 45 unwritten
320 13546 44778762 44778289 45 unwritten
321 13591 44781225 44778807 95 unwritten
322 13686 44778858 44781320 44 unwritten
323 13730 44778187 44778902 43 unwritten
324 13773 44779053 44778230 41 unwritten
325 13814 44781399 44779094 43 unwritten
326 13857 44778962 44781442 40 unwritten
327 13897 44779487 44779002 40 unwritten
328 13937 44778414 44779527 39 unwritten
329 13976 44778922 44778453 39 unwritten
330 14015 44781321 44778961 39 unwritten
331 14054 44781517 44781360 85 unwritten
332 14139 44780786 44781602 38 unwritten
333 14177 44777555 44780824 37 unwritten
334 14214 44780587 44777592 36 unwritten
335 14250 44778642 44780623 35 unwritten
336 14285 44781647 44778677 100 unwritten
337 14385 44781748 44781747 45 unwritten
338 14430 44780382 44781793 35 unwritten
339 14465 44778340 44780417 34 unwritten
340 14499 44779385 44778374 34 unwritten
341 14533 44781848 44779419 72 unwritten
342 14605 44779870 44781920 34 unwritten
343 14639 44780015 44779904 34 unwritten
344 14673 44777889 44780049 33 unwritten
345 14706 44781972 44777922 138 unwritten
346 14844 44782111 44782110 160 unwritten
347 15004 44780926 44782271 32 unwritten
348 15036 44778375 44780958 31 unwritten
349 15067 44778575 44778406 31 unwritten
350 15098 44780187 44778606 30 unwritten
351 15128 44782299 44780217 37 unwritten
352 15165 44778688 44782336 29 unwritten
353 15194 44782353 44778717 75 unwritten
354 15269 44782429 44782428 122 unwritten
355 15391 44782553 44782551 322 unwritten
356 15713 44782876 44782875 40 unwritten
357 15753 44782917 44782916 30 unwritten
358 15783 44782948 44782947 70 unwritten
359 15853 44783019 44783018 86 unwritten
360 15939 44783107 44783105 215 unwritten
361 16154 44783323 44783322 35 unwritten
362 16189 44778724 44783358 29 unwritten
363 16218 44783389 44778753 52 unwritten
364 16270 44783442 44783441 67 unwritten
365 16337 44783510 44783509 76 unwritten
366 16413 44783587 44783586 91 unwritten
367 16504 44783679 44783678 188 unwritten
368 16692 44783868 44783867 55 unwritten
369 16747 44783924 44783923 45 unwritten
370 16792 44783970 44783969 39 unwritten
371 16831 44780072 44784009 29 unwritten
372 16860 44784022 44780101 73 unwritten
373 16933 44784096 44784095 76 unwritten
374 17009 44780115 44784172 29 unwritten
375 17038 44780145 44780144 29 unwritten
376 17067 44784181 44780174 233 unwritten
377 17300 44784415 44784414 255 unwritten
378 17555 44783359 44784670 29 unwritten
379 17584 44779712 44783388 28 unwritten
380 17612 44779777 44779740 28 unwritten
381 17640 44784707 44779805 52 unwritten
382 17692 44781369 44784759 28 unwritten
383 17720 44777970 44781397 27 unwritten
384 17747 44778085 44777997 27 unwritten
385 17774 44778141 44778112 27 unwritten
386 17801 44778293 44778168 27 unwritten
387 17828 44784809 44778320 179 unwritten
388 18007 44784989 44784988 40 unwritten
389 18047 44785030 44785029 231 unwritten
390 18278 44785262 44785261 45 unwritten
391 18323 44785308 44785307 357 unwritten
392 18680 44778547 44785665 27 unwritten
393 18707 44779357 44778574 27 unwritten
394 18734 44779537 44779384 27 unwritten
395 18761 44780278 44779564 27 unwritten
396 18788 44781921 44780305 27 unwritten
397 18815 44779017 44781948 26 unwritten
398 18841 44779580 44779043 26 unwritten
399 18867 44780335 44779606 26 unwritten
400 18893 44780658 44780361 26 unwritten
401 18919 44780728 44780684 26 unwritten
402 18945 44785892 44780754 30 unwritten
403 18975 44789606 44785922 136 unwritten
404 19111 44788602 44789742 104 unwritten
405 19215 44789075 44788706 95 unwritten
406 19310 44789895 44789170 92 unwritten
407 19402 44787855 44789987 78 unwritten
408 19480 44787670 44787933 77 unwritten
409 19557 44791413 44787747 80 unwritten
410 19637 44788513 44791493 75 unwritten
411 19712 44790631 44788588 75 unwritten
412 19787 44787177 44790706 71 unwritten
413 19858 44791556 44787248 122 unwritten
414 19980 44790092 44791678 69 unwritten
415 20049 44786455 44790161 68 unwritten
416 20117 44790242 44786523 66 unwritten
417 20183 44791726 44790308 118 unwritten
418 20301 44791100 44791844 63 unwritten
419 20364 44788380 44791163 62 unwritten
420 20426 44786252 44788442 54 unwritten,eof
testfile: 421 extents found
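
A listing this long is easier to reason about once it is reduced to numbers. The following is a minimal sketch, assuming rows in the filefrag -v layout shown above, where the first extent has no "expected" column. `parse_extents` and `physical_gaps` are hypothetical helper names, and the sample is the first four extents of testfile copied from the output above.

```python
# First four extents of testfile, copied from the filefrag -v output above.
SAMPLE = """\
ext logical physical expected length flags
0 0 13545000 48 unwritten
1 48 13545134 13545048 48 unwritten
2 96 13545357 13545182 48 unwritten
3 144 13545431 13545405 48 unwritten
"""


def parse_extents(text):
    """Return (logical, physical, length) for each extent row."""
    extents = []
    for line in text.splitlines():
        nums = []
        for tok in line.split():
            if not tok.isdigit():
                break  # stop at the flags column (or any non-data line)
            nums.append(int(tok))
        if len(nums) == 4:    # first extent: no "expected" column
            _ext, logical, physical, length = nums
        elif len(nums) == 5:  # later extents: ext logical physical expected length
            _ext, logical, physical, _expected, length = nums
        else:
            continue          # header, summary, or unrelated line
        extents.append((logical, physical, length))
    return extents


def physical_gaps(extents):
    """Blocks between where each extent ended and where the next one starts."""
    return [
        nxt[1] - (prev[1] + prev[2])
        for prev, nxt in zip(extents, extents[1:])
    ]


if __name__ == "__main__":
    extents = parse_extents(SAMPLE)
    print("extents:", len(extents))
    print("blocks covered:", sum(length for _l, _p, length in extents))
    print("gaps:", physical_gaps(extents))
```

A negative gap means the next extent was placed at a lower physical address than the end of the previous one, which happens often in the full listing.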
[robm@imap14 conf]$ filefrag -v testfile2
Filesystem type is: ef53
File size of testfile2 is 20971520 (20480 blocks, blocksize 1024)
ext logical physical expected length flags
0 0 44789287 60 unwritten
1 60 44791893 44789347 65 unwritten
2 125 44789786 44791958 60 unwritten
3 185 44787297 44789846 59 unwritten
4 244 44788004 44787356 59 unwritten
5 303 44792057 44788063 83 unwritten
6 386 44791023 44792140 59 unwritten
7 445 44789420 44791082 58 unwritten
8 503 44791281 44789478 58 unwritten
9 561 44792216 44791339 83 unwritten
10 644 44787360 44792299 57 unwritten
11 701 44789198 44787417 57 unwritten
12 758 44788926 44789255 56 unwritten
13 814 44792388 44788982 189 unwritten
14 1003 44792578 44792577 157 unwritten
15 1160 44792736 44792735 65 unwritten
16 1225 44786531 44792801 55 unwritten
17 1280 44792817 44786586 56 unwritten
18 1336 44789500 44792873 54 unwritten
19 1390 44787126 44789554 50 unwritten
20 1440 44790188 44787176 49 unwritten
21 1489 44792929 44790237 132 unwritten
22 1621 44793062 44793061 78 unwritten
23 1699 44788086 44793140 47 unwritten
24 1746 44786733 44788133 46 unwritten
25 1792 44793167 44786779 92 unwritten
26 1884 44793260 44793259 131 unwritten
27 2015 44788841 44793391 45 unwritten
28 2060 44793415 44788886 128 unwritten
29 2188 44789029 44793543 45 unwritten
30 2233 44792331 44789074 45 unwritten
31 2278 44793562 44792376 110 unwritten
32 2388 44790741 44793672 44 unwritten
33 2432 44793684 44790785 173 unwritten
34 2605 44788795 44793857 42 unwritten
35 2647 44788311 44788837 40 unwritten
36 2687 44790962 44788351 40 unwritten
37 2727 44790511 44791002 39 unwritten
38 2766 44787249 44790550 38 unwritten
39 2804 44790015 44787287 38 unwritten
40 2842 44790370 44790053 38 unwritten
41 2880 44790472 44790408 38 unwritten
42 2918 44786792 44790510 37 unwritten
43 2955 44789364 44786829 37 unwritten
44 2992 44787938 44789401 36 unwritten
45 3028 44787810 44787974 35 unwritten
46 3063 44791494 44787845 35 unwritten
47 3098 44786380 44791529 33 unwritten
48 3131 44787440 44786413 33 unwritten
49 3164 44790409 44787473 33 unwritten
50 3197 44792010 44790442 32 unwritten
51 3229 44787750 44792042 31 unwritten
52 3260 44788887 44787781 31 unwritten
53 3291 44792874 44788918 31 unwritten
54 3322 44794131 44792905 31 unwritten
55 3353 44798983 44794162 178 unwritten
56 3531 44799949 44799161 147 unwritten
57 3678 44800354 44800096 175 unwritten
58 3853 44800531 44800529 149 unwritten
59 4002 44797734 44800680 99 unwritten
60 4101 44799648 44797833 95 unwritten
61 4196 44798074 44799743 88 unwritten
62 4284 44800953 44798162 113 unwritten
63 4397 44798549 44801066 85 unwritten
64 4482 44800268 44798634 85 unwritten
65 4567 44798645 44800353 79 unwritten
66 4646 44798323 44798724 78 unwritten
67 4724 44797840 44798401 76 unwritten
68 4800 44799181 44797916 76 unwritten
69 4876 44795475 44799257 74 unwritten
70 4950 44797386 44795549 72 unwritten
71 5022 44801325 44797458 71 unwritten
72 5093 44795922 44801396 70 unwritten
73 5163 44801683 44795992 142 unwritten
74 5305 44796623 44801825 70 unwritten
75 5375 44801881 44796693 168 unwritten
76 5543 44797584 44802049 67 unwritten
77 5610 44795206 44797651 66 unwritten
78 5676 44797133 44795272 66 unwritten
79 5742 44798912 44797199 60 unwritten
80 5802 44798163 44798972 59 unwritten
81 5861 44796879 44798222 58 unwritten
82 5919 44798455 44796937 58 unwritten
83 5977 44794654 44798513 57 unwritten
84 6034 44796544 44794711 57 unwritten
85 6091 44797954 44796601 57 unwritten
86 6148 44798752 44798011 57 unwritten
87 6205 44798223 44798809 55 unwritten
88 6260 44796237 44798278 54 unwritten
89 6314 44794524 44796291 53 unwritten
90 6367 44796788 44794577 51 unwritten
91 6418 44798020 44796839 51 unwritten
92 6469 44794321 44798071 48 unwritten
93 6517 44798402 44794369 47 unwritten
94 6564 44796433 44798449 44 unwritten
95 6608 44794187 44796477 43 unwritten
96 6651 44795577 44794230 43 unwritten
97 6694 44797079 44795620 43 unwritten
98 6737 44798279 44797122 43 unwritten
99 6780 44794884 44798322 42 unwritten
100 6822 44797316 44794926 42 unwritten
101 6864 44794833 44797358 41 unwritten
102 6905 44795780 44794874 41 unwritten
103 6946 44794995 44795821 39 unwritten
104 6985 44796504 44795034 39 unwritten
105 7024 44797502 44796543 39 unwritten
106 7063 44795040 44797541 37 unwritten
107 7100 44794259 44795077 36 unwritten
108 7136 44797239 44794295 36 unwritten
109 7172 44794578 44797275 35 unwritten
110 7207 44796355 44794613 35 unwritten
111 7242 44795367 44796390 34 unwritten
112 7276 44795633 44795401 34 unwritten
113 7310 44796292 44795667 34 unwritten
114 7344 44797549 44796326 34 unwritten
115 7378 44798871 44797583 34 unwritten
116 7412 44796976 44798905 33 unwritten
117 7445 44794736 44797009 32 unwritten
118 7477 44796694 44794768 32 unwritten
119 7509 44794791 44796726 31 unwritten
120 7540 44794963 44794822 31 unwritten
121 7571 44795405 44794994 31 unwritten
122 7602 44795443 44795436 31 unwritten
123 7633 44796152 44795474 31 unwritten
124 7664 44802226 44796183 31 unwritten
125 7695 44808125 44802257 101 unwritten
126 7796 44808240 44808226 83 unwritten
127 7879 44808324 44808323 62 unwritten
128 7941 44808018 44808386 60 unwritten
129 8001 44803262 44808078 37 unwritten
130 8038 44808408 44803299 123 unwritten
131 8161 44806517 44808531 30 unwritten
132 8191 44808559 44806547 74 unwritten
133 8265 44807545 44808633 30 unwritten
134 8295 44804990 44807575 29 unwritten
135 8324 44803712 44805019 28 unwritten
136 8352 44804103 44803740 28 unwritten
137 8380 44808676 44804131 97 unwritten
138 8477 44808774 44808773 38 unwritten
139 8515 44808813 44808812 92 unwritten
140 8607 44808906 44808905 55 unwritten
141 8662 44809177 44808961 105 unwritten
142 8767 44809283 44809282 62 unwritten
143 8829 44809438 44809345 35 unwritten
144 8864 44809596 44809473 41 unwritten
145 8905 44809638 44809637 140 unwritten
146 9045 44806785 44809778 28 unwritten
147 9073 44809783 44806813 194 unwritten
148 9267 44807795 44809977 28 unwritten
149 9295 44802258 44807823 26 unwritten
150 9321 44805759 44802284 26 unwritten
151 9347 44808532 44805785 26 unwritten
152 9373 44810002 44808558 129 unwritten
153 9502 44802419 44810131 25 unwritten
154 9527 44810135 44802444 106 unwritten
155 9633 44802500 44810241 25 unwritten
156 9658 44802977 44802525 25 unwritten
157 9683 44803076 44803002 25 unwritten
158 9708 44804791 44803101 25 unwritten
159 9733 44807082 44804816 25 unwritten
160 9758 44807224 44807107 25 unwritten
161 9783 44807586 44807249 25 unwritten
162 9808 44808079 44807611 25 unwritten
163 9833 44806291 44808104 24 unwritten
164 9857 44803491 44806315 23 unwritten
165 9880 44804325 44803514 23 unwritten
166 9903 44805519 44804348 22 unwritten
167 9925 44807354 44805541 21 unwritten
168 9946 44810397 44807375 22 unwritten
169 9968 44814980 44810419 167 unwritten
170 10135 44816191 44815147 167 unwritten
171 10302 44816650 44816358 143 unwritten
172 10445 44815247 44816793 135 unwritten
173 10580 44817213 44815382 321 unwritten
174 10901 44817535 44817534 289 unwritten
175 11190 44814060 44817824 131 unwritten
176 11321 44814816 44814191 130 unwritten
177 11451 44810857 44814946 124 unwritten
178 11575 44811990 44810981 122 unwritten
179 11697 44816050 44812112 122 unwritten
180 11819 44811116 44816172 120 unwritten
181 11939 44816829 44811236 113 unwritten
182 12052 44813614 44816942 110 unwritten
183 12162 44812501 44813724 107 unwritten
184 12269 44812150 44812608 103 unwritten
185 12372 44811438 44812253 98 unwritten
186 12470 44818103 44811536 330 unwritten
187 12800 44815800 44818433 93 unwritten
188 12893 52396657 44815893 141 unwritten
189 13034 52398902 52396798 135 unwritten
190 13169 52400131 52399037 86 unwritten
191 13255 52400651 52400217 83 unwritten
192 13338 52400803 52400734 90 unwritten
193 13428 52396921 52400893 79 unwritten
194 13507 52400896 52397000 109 unwritten
195 13616 52397287 52401005 75 unwritten
196 13691 52397837 52397362 71 unwritten
197 13762 52400006 52397908 71 unwritten
198 13833 52398016 52400077 66 unwritten
199 13899 52399533 52398082 65 unwritten
200 13964 52399895 52399598 65 unwritten
201 14029 52399736 52399960 64 unwritten
202 14093 52401119 52399800 79 unwritten
203 14172 52400322 52401198 64 unwritten
204 14236 52397657 52400386 63 unwritten
205 14299 52401217 52397720 147 unwritten
206 14446 52400490 52401364 63 unwritten
207 14509 52398224 52400553 59 unwritten
208 14568 52398547 52398283 59 unwritten
209 14627 52398607 52398606 59 unwritten
210 14686 52401410 52398666 80 unwritten
211 14766 52399651 52401490 59 unwritten
212 14825 52400232 52399710 59 unwritten
213 14884 52399330 52400291 58 unwritten
214 14942 52399425 52399388 58 unwritten
215 15000 52400397 52399483 58 unwritten
216 15058 52401564 52400455 60 unwritten
217 15118 52398802 52401624 57 unwritten
218 15175 52401634 52398859 82 unwritten
219 15257 52401718 52401716 97 unwritten
220 15354 52397001 52401815 55 unwritten
221 15409 52396448 52397056 54 unwritten
222 15463 52399159 52396502 54 unwritten
223 15517 52396209 52399213 49 unwritten
224 15566 52401871 52396258 146 unwritten
225 15712 52398410 52402017 49 unwritten
226 15761 52402031 52398459 82 unwritten
227 15843 52398306 52402113 48 unwritten
228 15891 52398460 52398354 46 unwritten
229 15937 52397555 52398506 45 unwritten
230 15982 52400759 52397600 43 unwritten
231 16025 52402222 52400802 106 unwritten
232 16131 52402329 52402328 102 unwritten
233 16233 52402432 52402431 184 unwritten
234 16417 52397740 52402616 40 unwritten
235 16457 52400590 52397780 40 unwritten
236 16497 52402638 52400630 46 unwritten
237 16543 52402686 52402684 82 unwritten
238 16625 52402769 52402768 42 unwritten
239 16667 52399966 52402811 39 unwritten
240 16706 52397212 52400005 38 unwritten
241 16744 52402831 52397250 47 unwritten
242 16791 52402879 52402878 108 unwritten
243 16899 52399108 52402987 37 unwritten
244 16936 52397933 52399145 36 unwritten
245 16972 52399852 52397969 36 unwritten
246 17008 52396834 52399888 35 unwritten
247 17043 52403035 52396869 67 unwritten
248 17110 52403103 52403102 48 unwritten
249 17158 52403152 52403151 84 unwritten
250 17242 52397781 52403236 35 unwritten
251 17277 52398667 52397816 35 unwritten
252 17312 52403250 52398702 154 unwritten
253 17466 52396870 52403404 34 unwritten
254 17500 52398189 52396904 34 unwritten
255 17534 52403406 52398223 33 unwritten
256 17567 52397387 52403439 32 unwritten
257 17599 52403508 52397419 38 unwritten
258 17637 52396259 52403546 31 unwritten
259 17668 52403554 52396290 141 unwritten
260 17809 52396546 52403695 31 unwritten
261 17840 52398151 52396577 31 unwritten
262 17871 52403725 52398182 187 unwritten
263 18058 52398711 52403912 31 unwritten
264 18089 52403915 52398742 39 unwritten
265 18128 52403955 52403954 58 unwritten
266 18186 52404014 52404013 157 unwritten
267 18343 52404172 52404171 53 unwritten
268 18396 52399606 52404225 31 unwritten
269 18427 52400558 52399637 31 unwritten
270 18458 52401006 52400589 31 unwritten
271 18489 52401087 52401037 31 unwritten
272 18520 52396580 52401118 30 unwritten
273 18550 52400087 52396610 30 unwritten
274 18580 52396333 52400117 29 unwritten
275 18609 52397457 52396362 29 unwritten
276 18638 52404302 52397486 37 unwritten
277 18675 52407521 52404339 108 unwritten
278 18783 52406311 52407629 107 unwritten
279 18890 52406629 52406418 107 unwritten
280 18997 52406894 52406736 107 unwritten
281 19104 52407283 52407001 104 unwritten
282 19208 52405118 52407387 81 unwritten
283 19289 52405498 52405199 63 unwritten
284 19352 52407688 52405561 63 unwritten
285 19415 52408078 52407751 51 unwritten
286 19466 52408164 52408129 51 unwritten
287 19517 52404451 52408215 50 unwritten
288 19567 52408301 52404501 53 unwritten
289 19620 52406750 52408354 50 unwritten
290 19670 52407775 52406800 49 unwritten
291 19719 52407960 52407824 46 unwritten
292 19765 52404632 52408006 40 unwritten
293 19805 52405251 52404672 40 unwritten
294 19845 52408425 52405291 40 unwritten
295 19885 52404778 52408465 38 unwritten
296 19923 52407244 52404816 38 unwritten
297 19961 52408247 52407282 37 unwritten
298 19998 52406133 52408284 36 unwritten
299 20034 52408566 52406169 54 unwritten
300 20088 52406491 52408620 36 unwritten
301 20124 52408651 52406527 51 unwritten
302 20175 52408703 52408702 137 unwritten
303 20312 52408841 52408840 107 unwritten
304 20419 52407194 52408948 36 unwritten
305 20455 52510893 52407230 25 unwritten,eof
testfile2: 306 extents found
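
For anyone wanting to summarise listings like the one above rather than eyeball them, here is a minimal Python sketch. It assumes the column layout shown in this thread (ext, logical, physical, optional expected, length, flags); `parse_extents` is a made-up helper name, not part of e2fsprogs.

```python
from itertools import takewhile
from statistics import median


def parse_extents(text):
    """Return (logical, physical, length) tuples from `filefrag -v` rows.

    Assumes the layout seen in this thread:
    ext logical physical [expected] length flags
    """
    extents = []
    for line in text.splitlines():
        fields = line.split()
        # Keep only the leading run of numeric fields; the flags come last.
        numbers = [int(f) for f in takewhile(str.isdigit, fields)]
        if len(numbers) == 5:      # 'expected' column present
            _, logical, physical, _, length = numbers
        elif len(numbers) == 4:    # 'expected' column blank
            _, logical, physical, length = numbers
        else:
            continue               # header, summary, or blank line
        extents.append((logical, physical, length))
    return extents


# The last rows of the testfile2 listing above.
sample = """\
303 20312 52408841 52408840 107 unwritten
304 20419 52407194 52408948 36 unwritten
305 20455 52510893 52407230 25 unwritten,eof
testfile2: 306 extents found
"""

extents = parse_extents(sample)
lengths = [length for _, _, length in extents]
print(len(extents), "extents; median length", median(lengths),
      "blocks; largest", max(lengths))
```

Feeding it the full `filefrag -v` output instead of the three sample rows gives the same summary for the whole file.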

> What we probably need to do is to have some heuristic where if we know
> there are plenty of block groups with lots of large contiguous free
> space, and the block group which we are preferring either because it's
> the same one as the inode number, or because it's where we had
> previously done the last block allocation for the file, and the
> user requests a large fallocated region, that it's better to switch
> over to one of the other block groups with lots and lots of free
> space.

In this particular case, those allocations look to be in 2 main areas
for each file, however I know that in other tests we saw data spread all
over the place.

For that matter, one big question I have is why each of these results is
so different.

[robm@imap14 conf]$ for i in 1 2 3 4 5 6 7 8 9 10; do fallocate -l 20m testfile3; filefrag testfile3; /bin/rm testfile3; done
testfile3: 496 extents found
testfile3: 284 extents found
testfile3: 714 extents found
testfile3: 285 extents found
testfile3: 632 extents found
testfile3: 269 extents found
testfile3: 701 extents found
testfile3: 290 extents found
testfile3: 349 extents found
testfile3: 337 extents found
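
To put numbers on the spread, a small Python sketch follows. `extent_counts` and `run_trial` are hypothetical helper names; `run_trial` just repeats one iteration of the shell loop above and is not called here, since it needs `fallocate` and `filefrag` on a real filesystem.

```python
import os
import re
import subprocess
from statistics import mean


def extent_counts(output):
    """Pull the N out of every 'name: N extents found' line."""
    return [int(n) for n in re.findall(r": (\d+) extents? found", output)]


def run_trial(path, size="20m"):
    """One iteration of the shell loop above: allocate, measure, delete."""
    subprocess.run(["fallocate", "-l", size, path], check=True)
    result = subprocess.run(["filefrag", path], check=True,
                            capture_output=True, text=True)
    os.remove(path)
    return extent_counts(result.stdout)[0]


# The ten results reported above.
reported = """\
testfile3: 496 extents found
testfile3: 284 extents found
testfile3: 714 extents found
testfile3: 285 extents found
testfile3: 632 extents found
testfile3: 269 extents found
testfile3: 701 extents found
testfile3: 290 extents found
testfile3: 349 extents found
testfile3: 337 extents found
"""

counts = extent_counts(reported)
print("min", min(counts), "max", max(counts), "mean", round(mean(counts), 1))
```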


--
Rob Mueller



2013-01-30 21:44:04

by Theodore Ts'o

Subject: Re: fallocate creating fragmented files

On Thu, Jan 31, 2013 at 08:21:50AM +1100, Robert Mueller wrote:
>
> For that matter, one big question I have is why each of these results is
> so different.
>
> [robm@imap14 conf]$ for i in 1 2 3 4 5 6 7 8 9 10; do fallocate -l 20m testfile3; filefrag testfile3; /bin/rm testfile3; done

The most likely reason is that it depends on transaction boundaries.
After a block has been released, we can't reuse it until after the
jbd2 transaction which contains the deletion of the inode has
committed. So even after you've deleted the file, we can't reuse the
blocks right away. The other thing which will influence the block
allocation is which block group the last allocation was for that
particular file. So if blocks become available after a commit
completes, if we've started allocating in another block group, we
won't go back to the initial block group.
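
The deferred-reuse rule can be illustrated with a toy model. This is only a sketch of the idea described above, not ext4's allocator: `ToyAllocator` is a made-up class, blocks are plain integers, and "commit" is a method call rather than a jbd2 transaction.

```python
class ToyAllocator:
    """Toy model: freed blocks become reusable only after a commit."""

    def __init__(self, nblocks):
        self.free = set(range(nblocks))
        self.pending = set()  # freed, but the freeing transaction is uncommitted

    def allocate(self, count):
        if count > len(self.free):
            raise RuntimeError("not enough free blocks")
        blocks = sorted(self.free)[:count]  # lowest-numbered free blocks first
        self.free.difference_update(blocks)
        return blocks

    def release(self, blocks):
        # Deleting a file parks its blocks until the next commit.
        self.pending.update(blocks)

    def commit(self):
        self.free |= self.pending
        self.pending.clear()


alloc = ToyAllocator(8)
first = alloc.allocate(4)    # file created
alloc.release(first)         # file deleted, no commit yet
second = alloc.allocate(4)   # cannot reuse the blocks just released
alloc.release(second)
alloc.commit()               # both sets of blocks are reusable again
third = alloc.allocate(4)
print(first, second, third)
```

In this model, whether a new file lands on the same blocks as the previous one depends purely on whether a commit happened in between, which is one way the ten runs could differ.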

Cheers,

- Ted


2013-01-30 22:40:12

by Bron Gondwana

Subject: Re: fallocate creating fragmented files

On Thu, Jan 31, 2013, at 08:43 AM, Theodore Ts'o wrote:
> On Thu, Jan 31, 2013 at 08:21:50AM +1100, Robert Mueller wrote:

(around now, was dropping the kids at school)

> > For that matter, one big question I have is why each of these results is
> > so different.
> >
> > [robm@imap14 conf]$ for i in 1 2 3 4 5 6 7 8 9 10; do fallocate -l 20m testfile3; filefrag testfile3; /bin/rm testfile3; done
>
> The most likely reason is that it depends on transaction boundaries.
> After a block has been released, we can't reuse it until after the
> jbd2 transaction which contains the deletion of the inode has
> committed. So even after you've deleted the file, we can't reuse the
> blocks right away. The other thing which will influence the block
> allocation is which block group the last allocation was for that
> particular file. So if blocks become available after a commit
> completes, if we've started allocating in another block group, we
> won't go back to the initial block group.

The particular directory we're doing this test in is a cyrus imapd "conf"
directory. It contains mostly symlinks and sub directories (some of them
quite hot) but it also contains mailboxes.db, which is a very active
database file. In this case it's twoskip, which is a skiplist-based file
format.

When any change is made to a twoskip file, the IO pattern is:

1) rewrite first 64 bytes (marking file dirty) and fdatasync
2) append new change/delete records and update back pointers (involves
between 1 and 20 random rewrites of between 32 and 200ish bytes per
change)
3) fsync
4) rewrite first 64 bytes (marking file clean again) and fdatasync

So we get two fdatasyncs, one fsync (to save the metadata about the
file being longer now) a bunch of random updates throughout the file,
and some amount of new data appended to the file.
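
As a rough illustration of that sequence, here is a Python sketch of the same four steps. It is not the Cyrus code (which is C and uses seek and writev); `apply_change` and the header contents are invented placeholders, and `os.pwrite` stands in for seek plus write.

```python
import os
import tempfile

HEADER_SIZE = 64  # the first 64 bytes, as described above


def apply_change(path, dirty_header, clean_header, record, rewrites=()):
    """Follow the four-step write pattern for one change."""
    fd = os.open(path, os.O_RDWR)
    try:
        # 1) mark the file dirty
        os.pwrite(fd, dirty_header, 0)
        os.fdatasync(fd)
        # 2) append the new record, then patch earlier records in place
        end = os.lseek(fd, 0, os.SEEK_END)
        os.pwrite(fd, record, end)
        for offset, data in rewrites:
            os.pwrite(fd, data, offset)
        # 3) fsync, so the new file length is persisted as well
        os.fsync(fd)
        # 4) mark the file clean again
        os.pwrite(fd, clean_header, 0)
        os.fdatasync(fd)
    finally:
        os.close(fd)


# Demo on a scratch file; "C"/"D" headers are stand-ins for clean/dirty.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    with open(path, "wb") as f:
        f.write(b"C" * HEADER_SIZE + b"old-record;")
    apply_change(path, b"D" * HEADER_SIZE, b"C" * HEADER_SIZE,
                 b"new-record;", rewrites=[(HEADER_SIZE, b"OLD")])
    with open(path, "rb") as f:
        result = f.read()

print(result[HEADER_SIZE:])
```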

Every so often the file contains too many obsolete records, and it gets
repacked. This involves creating a new database file (mailboxes.db.NEW)
and walking through the original database copying each record to the new
database. Finally, the new database is renamed over the old.

It uses flock on the entire file for serialisation, so there can only be
a single writer at a time.

Writes are done using seek and writev, reads are done by MMAPing the
entire file.

More detail about twoskip here if anyone cares:

http://opera.brong.fastmail.fm/talks/twoskip/

It's the twoskip files that we're particularly concerned about. Not so
much that they fragment during use, that's kind of expected - but that a
repack doesn't result in a single contiguous file. Apart from the header,
I can't see why it doesn't.

I could probably change the repack code to not do the first two fdatasyncs,
and just do a final fsync before renaming, if you think that initial fsync
of just a couple of hundred bytes (header plus initial dummy record) is
likely to mess up page allocation.
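
In outline, the simplified repack would look something like the Python sketch below. This is an assumption-laden illustration rather than the Cyrus implementation: `repack` is a made-up name, records are opaque byte strings, and whether a single sync changes how ext4 lays the file out is exactly the open question.

```python
import os
import tempfile


def repack(path, live_records):
    """Write live records to path + '.NEW', sync once, rename over the old file."""
    new_path = path + ".NEW"
    with open(new_path, "wb") as new_db:
        for record in live_records:
            new_db.write(record)
        new_db.flush()
        os.fsync(new_db.fileno())   # the only sync before the rename
    os.rename(new_path, path)       # atomically replaces the old database


# Demo on a scratch file.
with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "mailboxes.db")
    with open(db, "wb") as f:
        f.write(b"header;live-1;dead;live-2;")
    repack(db, [b"header;", b"live-1;", b"live-2;"])
    with open(db, "rb") as f:
        repacked = f.read()
    leftover = os.path.exists(db + ".NEW")

print(repacked, leftover)
```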

Bron.
--
Bron Gondwana


2013-01-30 22:49:37

by Robert Mueller

Subject: Re: fallocate creating fragmented files


> It's the twoskip files that we're particularly concerned about. Not
> so much that they fragment during use, that's kind of expected - but
> that a repack doesn't result in a single contiguous file. Apart from
> the header, I can't see why it doesn't.

Well yes, this is ultimately what we're worried about, but we did some
digging and found that even just fallocating a file creates a very
fragmented file.

I think it's better to try and work out why just fallocating space for
files is creating such fragmented files, it's certainly a simpler test
case to try and get information on first.

Rob

2013-01-30 22:51:23

by Robert Mueller

Subject: Re: fallocate creating fragmented files


> The most likely reason is that it depends on transaction boundaries.
> After a block has been released, we can't reuse it until after the
> jbd2 transaction which contains the deletion of the inode has
> committed. So even after you've deleted the file, we can't reuse the
> blocks right away. The other thing which will influence the block
> allocation is which block group the last allocation was for that
> particular file. So if blocks become available after a commit
> completes, if we've started allocating in another block group, we
> won't go back to the initial block group.

Ok, makes sense.

However it still doesn't answer the question about why the allocator is
choosing smaller extents over larger ones nearby.

For instance, looking at filefrag -v for testfile and testfile2 again.
Remember, these were created immediately one after another.

testfile:
...
398 18841 44779580 44779043 26 unwritten
399 18867 44780335 44779606 26 unwritten
400 18893 44780658 44780361 26 unwritten

testfile2:
...
13 814 44792388 44788982 189 unwritten
14 1003 44792578 44792577 157 unwritten

Those look quite near each other. So when testfile was being allocated,
there were some bigger extents right nearby that were ignored, and ended
up being used when the next file testfile2 was allocated. Why?
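
To quantify "near", a quick Python calculation from the rows quoted above. The block size and blocks-per-group values are taken from the dumpe2fs output at the start of the thread; the variable names are just for illustration.

```python
BLOCK_SIZE = 1024        # bytes, from dumpe2fs
BLOCKS_PER_GROUP = 8192  # from dumpe2fs

# (physical start, length) of the rows quoted above
testfile_ext400 = (44780658, 26)
testfile2_ext13 = (44792388, 189)

end_of_ext400 = testfile_ext400[0] + testfile_ext400[1]
gap_blocks = testfile2_ext13[0] - end_of_ext400

print(gap_blocks, "blocks apart")
print(round(gap_blocks / BLOCKS_PER_GROUP, 2), "block groups")
print(round(gap_blocks * BLOCK_SIZE / 2**20, 1), "MiB")
```

So the 189-block extent that testfile2 got starts under two block groups past the end of the 26-block extents that testfile was given.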

Also, while e4defrag will try and defrag a file (or multiple files), is
there any way to actually defrag the entire filesystem to try and move
files around more intelligently to make larger extents? I guess running
e4defrag on the entire filesystem multiple times would help, but it
still would not move small files that are breaking up large extents. Is
there any way to do that?

Rob

2013-02-01 11:33:22

by Bron Gondwana

Subject: Re: fallocate creating fragmented files

On Thu, Jan 31, 2013, at 09:51 AM, Robert Mueller wrote:
> Also, while e4defrag will try and defrag a file (or multiple files), is
> there any way to actually defrag the entire filesystem to try and move
> files around more intelligently to make larger extents? I guess running
> e4defrag on the entire filesystem multiple times would help, but it
> still would not move small files that are breaking up large extents. Is
> there any way to do that?

In particular, the way that Cyrus works seems entirely suboptimal for ext4.
The index and database files receive very small appends (108 bytes per message
for the index, and probably just a few hundred per write for most of the
twoskip databases), and they happen pretty much randomly to one of tens of
thousands of these little files, depending which mailbox received the message.

This causes allocation patterns which result in tons of tiny holes over time
as files get deleted, so the filesystem is kind of evenly scattered all over.

Here's the same experiment on a "fresh" filesystem. I created this by taking
a server down, copying the entire contents of the SSD to a spare piece of rust,
reformatting, and copying it all back (cp -a). So the data on there is the
same, just the allocations have changed.

[brong@imap15 conf]$ fallocate -l 20m testfile
[brong@imap15 conf]$ filefrag -v testfile
Filesystem type is: ef53
File size of testfile is 20971520 (20480 blocks, blocksize 1024)
ext logical physical expected length flags
0 0 22913025 8182 unwritten
1 8182 22921217 22921207 8182 unwritten
2 16364 22929409 22929399 4116 unwritten,eof
testfile: 3 extents found

As you can see, that's slightly more optimal. I'm assuming 8182 is the
maximum number of contiguous blocks before you hit an assigned metadata
location and have to skip over it.
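
A quick check of that assumption against the numbers, as a Python sketch. It uses "First block: 1" and "Blocks per group: 8192" from the dumpe2fs output of the old filesystem and assumes the reformatted one kept the same geometry (the 1024-byte block size above suggests it did).

```python
FIRST_BLOCK = 1          # from dumpe2fs
BLOCKS_PER_GROUP = 8192  # from dumpe2fs

# (physical start, length) for the three extents above
extents = [(22913025, 8182), (22921217, 8182), (22929409, 4116)]

layout = []
for physical, length in extents:
    group, offset = divmod(physical - FIRST_BLOCK, BLOCKS_PER_GROUP)
    unused_tail = BLOCKS_PER_GROUP - offset - length
    layout.append((group, offset, unused_tail))
    print(f"group {group}: starts at offset {offset}, "
          f"{unused_tail} blocks of the group not covered")
```

Each extent starts exactly on a block group boundary, and the first two stop 10 blocks short of the end of their group. That fits the idea that something else occupies those 10 blocks, though the numbers alone don't say what.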

So in other words, our 2 year old filesystems are shot. We need to do
this sort of "defrag" on a semi-regular basis. Joy.

Bron.
--
Bron Gondwana


2013-02-01 13:56:04

by Theodore Ts'o

Subject: Re: fallocate creating fragmented files

On Fri, Feb 01, 2013 at 10:33:21PM +1100, Bron Gondwana wrote:
>
> In particular, the way that Cyrus works seems entirely suboptimal for ext4.
> The index and database files receive very small appends (108 bytes per message
> for the index, and probably just a few hundred per write for most of the
> twoskip databases), and they happen pretty much randomly to one of tens of
> thousands of these little files, depending which mailbox received the message.

Are all of these files in a single directory? If so, that's part of
the problem, since ext[34] uses the directory structure to try to
spread apart unrelated files, so that heuristic can't be easily used
if all of the files are in a single directory.

> Here's the same experiment on a "fresh" filesystem. I created this by taking
> a server down, copying the entire contents of the SSD to a spare piece of rust,
> reformatting, and copying it all back (cp -a). So the data on there is the
> same, just the allocations have changed.
>
> [brong@imap15 conf]$ fallocate -l 20m testfile
> [brong@imap15 conf]$ filefrag -v testfile
> Filesystem type is: ef53
> File size of testfile is 20971520 (20480 blocks, blocksize 1024)
> ext logical physical expected length flags
> 0 0 22913025 8182 unwritten
> 1 8182 22921217 22921207 8182 unwritten
> 2 16364 22929409 22929399 4116 unwritten,eof
> testfile: 3 extents found
>
> As you can see, that's slightly more optimal. I'm assuming 8182 is the
> maximum number of contiguous blocks before you hit an assigned metadata
> location and have to skip over it.

Is there a reason why you are using a 1k block size? The size of a
block group is 8192 blocks for 1k blocks (or 8 megabytes), while with
a 4k block size, the size of a block group is 32768 blocks (or 128
megabytes). In general the ext4 file system is going to be far more
efficient with a 4k block size.
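
The group sizes follow from the block bitmap occupying a single block, with one bit per block in the group. A short Python check of the arithmetic, assuming the default layout (no bigalloc):

```python
def block_group(block_size):
    """Blocks per group and group size in bytes for a given block size."""
    blocks_per_group = 8 * block_size        # one bitmap block, 8 bits per byte
    return blocks_per_group, blocks_per_group * block_size


groups = {size: block_group(size) for size in (1024, 4096)}
for size, (blocks, nbytes) in groups.items():
    print(f"{size}-byte blocks: {blocks} blocks per group, {nbytes // 2**20} MiB")
```

Going from 1k to 4k blocks multiplies the group size by 16, since both the number of blocks per group and the size of each block grow by a factor of 4.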

Regards,

- Ted

2013-02-02 10:50:38

by Bron Gondwana

Subject: Re: fallocate creating fragmented files

On Sat, Feb 2, 2013, at 12:55 AM, Theodore Ts'o wrote:
> On Fri, Feb 01, 2013 at 10:33:21PM +1100, Bron Gondwana wrote:
> >
> > In particular, the way that Cyrus works seems entirely suboptimal for ext4.
> > The index and database files receive very small appends (108 bytes per message
> > for the index, and probably just a few hundred per write for most of the
> > twoskip databases), and they happen pretty much randomly to one of tens of
> > thousands of these little files, depending which mailbox received the message.
>
> Are all of these files in a single directory? If so, that's part of
> the problem, since ext[34] uses the directory structure to try to
> spread apart unrelated files, so that heuristic can't be easily used
> if all of the files are in a single directory.

No, but the vast majority of them are 2-3 files per directory which will be
appended to at the same time, so they probably interleave :(

> > Here's the same experiment on a "fresh" filesystem. I created this by taking
> > a server down, copying the entire contents of the SSD to a spare piece of rust,
> > reformatting, and copying it all back (cp -a). So the data on there is the
> > same, just the allocations have changed.
> >
> > [brong@imap15 conf]$ fallocate -l 20m testfile
> > [brong@imap15 conf]$ filefrag -v testfile
> > Filesystem type is: ef53
> > File size of testfile is 20971520 (20480 blocks, blocksize 1024)
> > ext logical physical expected length flags
> > 0 0 22913025 8182 unwritten
> > 1 8182 22921217 22921207 8182 unwritten
> > 2 16364 22929409 22929399 4116 unwritten,eof
> > testfile: 3 extents found
> >
> > As you can see, that's slightly more optimal. I'm assuming 8182 is the
> > maximum number of contiguous blocks before you hit an assigned metadata
> > location and have to skip over it.
>
> Is there a reason why you are using a 1k block size? The size of a
> block group is 8192 blocks for 1k blocks (or 8 megabytes), while with
> a 4k block size, the size of a block group is 32768 blocks (or 128
> megabytes). In general the ext4 file system is going to be far more
> efficient with a 4k block size.

Mostly because a lot of our files are quite small.

Here's a set of file sizes and counts for that filesystem.

72055 zero
501435 <=512
32004 <=1k
46447 <=4k
38411 <=16k
49435 >16k

As you can see, the vast majority are significantly less than 1k in size,
so a 4k block size would add significant space overhead. Basically, we
wouldn't be able to fit everything on there.
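
A partial back-of-the-envelope figure for the two smallest non-empty buckets, as a Python sketch. It assumes each of those files occupies exactly one data block at either block size, and it deliberately ignores the larger buckets, directories and symlinks, so it is only a lower bound on the extra space.

```python
# File counts from the table above.
counts = {"<=512": 501435, "<=1k": 32004}

OLD_BLOCK = 1024
NEW_BLOCK = 4096

small_files = sum(counts.values())
extra_bytes = small_files * (NEW_BLOCK - OLD_BLOCK)

print(small_files, "files of at most 1k")
print(round(extra_bytes / 2**30, 2), "GiB extra for these files alone")
```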

There are plans afoot to merge most of those smaller files into a single
larger per-user file, which should help eventually. Meanwhile, this is
what we have. We were actually considering 1k block size for our email
spools as well, which are currently 4k block size, because most emails
are smaller than 4k as well, so we would reduce the space wastage there.

Bron.
--
Bron Gondwana