2015-05-13 15:37:45

by U.Mutlu

[permalink] [raw]
Subject: Htree concept

Hi,
I'm writing a toy-fs, and discover a major shortcoming
(finding a given child (dir/file) as fast as possible),
which other developers (ie. ext3/4) had encountered long ago too.
They introduced HTree. The info on HTree on the web is scarce
or I couldn't find the right texts/papers yet.
I wonder how HTree works on a conceptual basis.
Could a kind soul enligten me pls. TIA.

--
cu
Uenal



2015-05-13 16:24:17

by U.Mutlu

[permalink] [raw]
Subject: Re: Htree concept

U.Mutlu wrote on 05/13/2015 05:37 PM:
> Hi,
> I'm writing a toy-fs, and discover a major shortcoming
> (finding a given child (dir/file) as fast as possible),
> which other developers (ie. ext3/4) had encountered long ago too.
> They introduced HTree. The info on HTree on the web is scarce
> or I couldn't find the right texts/papers yet.
> I wonder how HTree works on a conceptual basis.
> Could a kind soul enligten me pls. TIA.

Now I found this text "PHTree design refresh":
http://phunq.net/pipermail/tux3/2013-January/000026.html
Ok, this has helpful info and further links on HTree/PHTree.

--
cu
Uenal



2015-05-13 16:29:46

by Eric Sandeen

[permalink] [raw]
Subject: Re: Htree concept

On 5/13/15 10:37 AM, U.Mutlu wrote:
> Hi,
> I'm writing a toy-fs, and discover a major shortcoming
> (finding a given child (dir/file) as fast as possible),
> which other developers (ie. ext3/4) had encountered long ago too.
> They introduced HTree. The info on HTree on the web is scarce
> or I couldn't find the right texts/papers yet.
> I wonder how HTree works on a conceptual basis.
> Could a kind soul enligten me pls. TIA.

Regarding htree details, did you look at:

http://en.wikipedia.org/wiki/HTree

which points to:

http://ext2.sourceforge.net/2005-ols/paper-html/node3.html
and more specifically,
http://web.archive.org/web/20131203105316/http://www.linuxshowcase.org/2001/full_papers/phillips/phillips_html/index.html

?

-Eric


2015-05-13 17:23:18

by U.Mutlu

[permalink] [raw]
Subject: Re: Htree concept

Eric Sandeen wrote on 05/13/2015 06:29 PM:
> On 5/13/15 10:37 AM, U.Mutlu wrote:
>> Hi,
>> I'm writing a toy-fs, and discover a major shortcoming
>> (finding a given child (dir/file) as fast as possible),
>> which other developers (ie. ext3/4) had encountered long ago too.
>> They introduced HTree. The info on HTree on the web is scarce
>> or I couldn't find the right texts/papers yet.
>> I wonder how HTree works on a conceptual basis.
>> Could a kind soul enligten me pls. TIA.
>
> Regarding htree details, did you look at:
>
> http://en.wikipedia.org/wiki/HTree
>
> which points to:
>
> http://ext2.sourceforge.net/2005-ols/paper-html/node3.html
> and more specifically,
> http://web.archive.org/web/20131203105316/http://www.linuxshowcase.org/2001/full_papers/phillips/phillips_html/index.html
>
> ?

Thanks, the wiki page and its refs I knew, but needed some more info.

Ok, it is written that HTree uses 32bit (or 64?) hashes for keys.
I wonder if it wouldn't be better if one instead would use that space
(32/64 bit) for storing the first n chars of the key (ie. of the dir/file name)
and keeping the directory entries in a sorted order on the disk,
and then do a bsearch instead of doing sequential table lookup using HTree?
I wonder what the "Tree"-part of HTree stand for in this context.
Am I right in my assumption that HTree mainly means the hashing mechanism,
but does not use any binary search mechanism for searching the key?

--
Thx
Uenal



2015-05-13 17:37:52

by U.Mutlu

[permalink] [raw]
Subject: Re: Htree concept

U.Mutlu wrote on 05/13/2015 07:22 PM:
> Eric Sandeen wrote on 05/13/2015 06:29 PM:
>> On 5/13/15 10:37 AM, U.Mutlu wrote:
>>> Hi,
>>> I'm writing a toy-fs, and discover a major shortcoming
>>> (finding a given child (dir/file) as fast as possible),
>>> which other developers (ie. ext3/4) had encountered long ago too.
>>> They introduced HTree. The info on HTree on the web is scarce
>>> or I couldn't find the right texts/papers yet.
>>> I wonder how HTree works on a conceptual basis.
>>> Could a kind soul enligten me pls. TIA.
>>
>> Regarding htree details, did you look at:
>>
>> http://en.wikipedia.org/wiki/HTree
>>
>> which points to:
>>
>> http://ext2.sourceforge.net/2005-ols/paper-html/node3.html
>> and more specifically,
>> http://web.archive.org/web/20131203105316/http://www.linuxshowcase.org/2001/full_papers/phillips/phillips_html/index.html
>>
>>
>> ?
>
> Thanks, the wiki page and its refs I knew, but needed some more info.
>
> Ok, it is written that HTree uses 32bit (or 64?) hashes for keys.
> I wonder if it wouldn't be better if one instead would use that space
> (32/64 bit) for storing the first n chars of the key (ie. of the dir/file name)
> and keeping the directory entries in a sorted order on the disk,
> and then do a bsearch instead of doing sequential table lookup using HTree?
> I wonder what the "Tree"-part of HTree stand for in this context.
> Am I right in my assumption that HTree mainly means the hashing mechanism,
> but does not use any binary search mechanism for searching the key?

Addendum:
I think I slowly grasp how HTree works: it keeps a (rb/avl tree)
b*tree-db (I guess it stores it on disk) of the hashes (as keys).

In contrast to that here my idea: keep the hdr blocks (ie. where the
dir/file names are) always in a sorted order. Then a bsearch should be doable.
This would eliminate the need for any b*tree-db usage.

--
cu
Uenal



2015-05-13 21:18:57

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Htree concept

On Wed, May 13, 2015 at 07:37:36PM +0200, U.Mutlu wrote:
> I think I slowly grasp how HTree works: it keeps a (rb/avl tree)
> b*tree-db (I guess it stores it on disk) of the hashes (as keys).

The reason for using hashes is it keeps the fanout of the tree very
high, which in turn keeps the depth of the tree very short. This
means that we can do search a very large directory using at most three
disk reads (two levels of internal node, where each node can store up
to 340 hashes plus pointers the next level of the tree, plus a
directory leaf block).

> In contrast to that here my idea: keep the hdr blocks (ie. where the
> dir/file names are) always in a sorted order. Then a bsearch should be doable.
> This would eliminate the need for any b*tree-db usage.

The problem with using a binary search is (a) it's more expensive to
search each disk read divides the search space in half (in contrast,
in the best case using htree, the first disk read can divide the
search space by factor of 340), and (b) insertions are very expensive;
suppose you have a 400 megabyte directory, and you need to insert a
filename into the very beginning of the list. You will have to
performance 800 megabytes of I/O to make room for directory entry, if
you want to keep all of the directory entries sorted.

Cheers,

- Ted

2015-05-14 02:50:25

by U.Mutlu

[permalink] [raw]
Subject: Re: Htree concept

Theodore Ts'o wrote on 05/13/2015 11:18 PM:
> On Wed, May 13, 2015 at 07:37:36PM +0200, U.Mutlu wrote:
>> I think I slowly grasp how HTree works: it keeps a (rb/avl tree)
>> b*tree-db (I guess it stores it on disk) of the hashes (as keys).
>
> The reason for using hashes is it keeps the fanout of the tree very
> high, which in turn keeps the depth of the tree very short. This
> means that we can do search a very large directory using at most three
> disk reads (two levels of internal node, where each node can store up
> to 340 hashes plus pointers the next level of the tree, plus a
> directory leaf block).

Yes, I see. I'll do similar in my prj, perhaps adding one more level,
ie. 3 reads to locate an item among about 10 million items (as said
just a toy-fs for fun :-), plus 1 more read for the item itself.

>> In contrast to that here my idea: keep the hdr blocks (ie. where the
>> dir/file names are) always in a sorted order. Then a bsearch should be doable.
>> This would eliminate the need for any b*tree-db usage.
>
> The problem with using a binary search is (a) it's more expensive to
> search each disk read divides the search space in half (in contrast,
> in the best case using htree, the first disk read can divide the
> search space by factor of 340), and (b) insertions are very expensive;
> suppose you have a 400 megabyte directory, and you need to insert a
> filename into the very beginning of the list. You will have to
> performance 800 megabytes of I/O to make room for directory entry, if
> you want to keep all of the directory entries sorted.

Yes, my initial idea to use bsearch leads to much more disk i/o.
Thx for the info.

--
cu
Uenal