by Mathieu Desnoyers

[permalink] [raw]

Subject: Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable

* Sasha Levin ([email protected]) wrote:
> On 08/28/2012 12:11 PM, Mathieu Desnoyers wrote:
> > * Sasha Levin ([email protected]) wrote:
> >> On 08/25/2012 06:24 AM, Mathieu Desnoyers wrote:
> >>> * Tejun Heo ([email protected]) wrote:
> >>>> Hello,
> >>>>
> >>>> On Sat, Aug 25, 2012 at 12:59:25AM +0200, Sasha Levin wrote:
> >>>>> Thats the thing, the amount of things of things you can do with a given bucket
> >>>>> is very limited. You can't add entries to any point besides the head (without
> >>>>> walking the entire list).
> >>>>
> >>>> Kinda my point. We already have all the hlist*() interface to deal
> >>>> with such cases. Having something which is evidently the trivial
> >>>> hlist hashtable and advertises as such in the interface can be
> >>>> helpful. I think we need that more than we need anything fancy.
> >>>>
> >>>> Heh, this is a debate about which one is less insignificant. I can
> >>>> see your point. I'd really like to hear what others think on this.
> >>>>
> >>>> Guys, do we want something which is evidently trivial hlist hashtable
> >>>> which can use hlist_*() API directly or do we want something better
> >>>> encapsulated?
> >>>
> >>> My 2 cents, FWIW: I think this specific effort should target a trivially
> >>> understandable API and implementation, for use-cases where one would be
> >>> tempted to reimplement his own trivial hash table anyway. So here
> >>> exposing hlist internals, with which kernel developers are already
> >>> familiar, seems like a good approach in my opinion, because hiding stuff
> >>> behind new abstraction might make the target users go away.
> >>>
> >>> Then, as we see the need, we can eventually merge a more elaborate hash
> >>> table with poneys and whatnot, but I would expect that the trivial hash
> >>> table implementation would still be useful. There are of course very
> >>> compelling reasons to use a more featureful hash table: automatic
> >>> resize, RT-aware updates, scalable updates, etc... but I see a purpose
> >>> for a trivial implementation. Its primary strong points being:
> >>>
> >>> - it's trivially understandable, so anyone how want to be really sure
> >>> they won't end up debugging the hash table instead of their
> >>> work-in-progress code can have a full understanding of it,
> >>> - it has few dependencies, which makes it easier to understand and
> >>> easier to use in some contexts (e.g. early boot).
> >>>
> >>> So I'm in favor of not overdoing the abstraction for this trivial hash
> >>> table, and honestly I would rather prefer that this trivial hash table
> >>> stays trivial. A more elaborate hash table should probably come as a
> >>> separate API.
> >>>
> >>> Thanks,
> >>>
> >>> Mathieu
> >>>
> >>
> >> Alright, let's keep it simple then.
> >>
> >> I do want to keep the hash_for_each[rcu,safe] family though.
> >
> > Just a thought: if the API offered by the simple hash table focus on
> > providing a mechanism to find the hash bucket to which belongs the hash
> > chain containing the key looked up, and then expects the user to use the
> > hlist API to iterate on the chain (with or without the hlist _rcu
> > variant), then it might seem consistent that a helper providing
> > iteration over the entire table would actually just provide iteration on
> > all buckets, and let the user call the hlist for each iterator for each
> > node within the bucket, e.g.:
> >
> > struct hlist_head *head;
> > struct hlist_node *pos;
> >
> > hash_for_each_bucket(ht, head) {
> > hlist_for_each(pos, head) {
> > ...
> > }
> > }
> >
> > That way you only have to provide one single macro
> > (hash_for_each_bucket), and rely on the already existing:
> >
> > - hlist_for_each_entry
> > - hlist_for_each_safe
> > - hlist_for_each_entry_rcu
> > - hlist_for_each_safe_rcu
> > .....
> >
> > and various flavors that can appear in the future without duplicating
> > this API. So you won't even have to create _rcu, _safe, nor _safe_rcu
> > versions of the hash_for_each_bucket macro.
> >
> > Thoughts ?
>
> In my opinion, the downside here is that it'll require 2 function calls and 2
> levels of nesting for a simple hash iteration.

Those are macros, not functions. No function call is required. But I see
your point about nesting.

>
> hash_for_each_bucket() will always be followed by an iteration of that
> bucket, so splitting a hash_for_each() which does both into 2
> different functions which will almost always must be called in that
> given order sounds unintuitive to me.
>
> It's also just 3 different possible iterators:
>
> - hlist_for_each_entry
> - hlist_for_each_entry_safe
> - hlist_for_each_entry_rcu
>
> So I think that it's a good price to pay - 2 extra macro definitions
> in the header to save a macro call + nesting level in each place that
> uses a hashtable.

I must admin I don't care that much one way or another.

Thanks,

Mathieu

>
>
> Thanks,
> Sasha
>
> > Thanks,
> >
> > Mathieu
> >
>

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2012-08-28 23:00:56

by Mathieu Desnoyers

[permalink] [raw]

Subject: Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable

* Mathieu Desnoyers ([email protected]) wrote:
> * Sasha Levin ([email protected]) wrote:
> > On 08/28/2012 12:11 PM, Mathieu Desnoyers wrote:
> > > * Sasha Levin ([email protected]) wrote:
> > >> On 08/25/2012 06:24 AM, Mathieu Desnoyers wrote:
> > >>> * Tejun Heo ([email protected]) wrote:
> > >>>> Hello,
> > >>>>
> > >>>> On Sat, Aug 25, 2012 at 12:59:25AM +0200, Sasha Levin wrote:
> > >>>>> Thats the thing, the amount of things of things you can do with a given bucket
> > >>>>> is very limited. You can't add entries to any point besides the head (without
> > >>>>> walking the entire list).
> > >>>>
> > >>>> Kinda my point. We already have all the hlist*() interface to deal
> > >>>> with such cases. Having something which is evidently the trivial
> > >>>> hlist hashtable and advertises as such in the interface can be
> > >>>> helpful. I think we need that more than we need anything fancy.
> > >>>>
> > >>>> Heh, this is a debate about which one is less insignificant. I can
> > >>>> see your point. I'd really like to hear what others think on this.
> > >>>>
> > >>>> Guys, do we want something which is evidently trivial hlist hashtable
> > >>>> which can use hlist_*() API directly or do we want something better
> > >>>> encapsulated?
> > >>>
> > >>> My 2 cents, FWIW: I think this specific effort should target a trivially
> > >>> understandable API and implementation, for use-cases where one would be
> > >>> tempted to reimplement his own trivial hash table anyway. So here
> > >>> exposing hlist internals, with which kernel developers are already
> > >>> familiar, seems like a good approach in my opinion, because hiding stuff
> > >>> behind new abstraction might make the target users go away.
> > >>>
> > >>> Then, as we see the need, we can eventually merge a more elaborate hash
> > >>> table with poneys and whatnot, but I would expect that the trivial hash
> > >>> table implementation would still be useful. There are of course very
> > >>> compelling reasons to use a more featureful hash table: automatic
> > >>> resize, RT-aware updates, scalable updates, etc... but I see a purpose
> > >>> for a trivial implementation. Its primary strong points being:
> > >>>
> > >>> - it's trivially understandable, so anyone how want to be really sure
> > >>> they won't end up debugging the hash table instead of their
> > >>> work-in-progress code can have a full understanding of it,
> > >>> - it has few dependencies, which makes it easier to understand and
> > >>> easier to use in some contexts (e.g. early boot).
> > >>>
> > >>> So I'm in favor of not overdoing the abstraction for this trivial hash
> > >>> table, and honestly I would rather prefer that this trivial hash table
> > >>> stays trivial. A more elaborate hash table should probably come as a
> > >>> separate API.
> > >>>
> > >>> Thanks,
> > >>>
> > >>> Mathieu
> > >>>
> > >>
> > >> Alright, let's keep it simple then.
> > >>
> > >> I do want to keep the hash_for_each[rcu,safe] family though.
> > >
> > > Just a thought: if the API offered by the simple hash table focus on
> > > providing a mechanism to find the hash bucket to which belongs the hash
> > > chain containing the key looked up, and then expects the user to use the
> > > hlist API to iterate on the chain (with or without the hlist _rcu
> > > variant), then it might seem consistent that a helper providing
> > > iteration over the entire table would actually just provide iteration on
> > > all buckets, and let the user call the hlist for each iterator for each
> > > node within the bucket, e.g.:
> > >
> > > struct hlist_head *head;
> > > struct hlist_node *pos;
> > >
> > > hash_for_each_bucket(ht, head) {
> > > hlist_for_each(pos, head) {
> > > ...
> > > }
> > > }
> > >
> > > That way you only have to provide one single macro
> > > (hash_for_each_bucket), and rely on the already existing:
> > >
> > > - hlist_for_each_entry
> > > - hlist_for_each_safe
> > > - hlist_for_each_entry_rcu
> > > - hlist_for_each_safe_rcu
> > > .....
> > >
> > > and various flavors that can appear in the future without duplicating
> > > this API. So you won't even have to create _rcu, _safe, nor _safe_rcu
> > > versions of the hash_for_each_bucket macro.
> > >
> > > Thoughts ?
> >
> > In my opinion, the downside here is that it'll require 2 function calls and 2
> > levels of nesting for a simple hash iteration.
>
> Those are macros, not functions. No function call is required. But I see
> your point about nesting.
>
> >
> > hash_for_each_bucket() will always be followed by an iteration of that
> > bucket, so splitting a hash_for_each() which does both into 2
> > different functions which will almost always must be called in that
> > given order sounds unintuitive to me.
> >
> > It's also just 3 different possible iterators:
> >
> > - hlist_for_each_entry
> > - hlist_for_each_entry_safe
> > - hlist_for_each_entry_rcu
> >
> > So I think that it's a good price to pay - 2 extra macro definitions
> > in the header to save a macro call + nesting level in each place that
> > uses a hashtable.
>
> I must admin I don't care that much one way or another.

Looking again at:

+#define hash_for_each_size(name, bits, bkt, node, obj, member) \
+ for (bkt = 0; bkt < HASH_SIZE(bits); bkt++) \
+ hlist_for_each_entry(obj, node, &name[bkt], member)

you will notice that a "break" or "continue" in the inner loop will not
affect the outer loop, which is certainly not what the programmer would
expect!

I advise strongly against creating such error-prone construct.

Thanks,

Mathieu

>
> Thanks,
>
> Mathieu
>
> >
> >
> > Thanks,
> > Sasha
> >
> > > Thanks,
> > >
> > > Mathieu
> > >
> >
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com