2006-09-14 09:31:26

by Jörn Engel

[permalink] [raw]
Subject: [RFC] Alignment of fields in struct dentry

After taking a look at struct dentry, Arnd noted an alignment
problem. The first four fields currently are:
atomic_t d_count;
unsigned int d_flags; /* protected by d_lock */
spinlock_t d_lock; /* per dentry lock */
struct inode *d_inode; /* Where the name belongs to - NULL is
* negative */
On 64bit architectures, the first three take 12 bytes and d_inode is
not naturally aligned, so it can be aligned to byte 16. This grows a
struct dentry from 196 to 200 Bytes (assuming no funky config options
like DEBUG_*, PROFILING or PREEMT && SMP are set).

One possible solution would be to exchange d_inode with d_mounted, but
I fear that d_inode would move from a hot cacheline to a cold one,
reducing performance. Could there be a good solution or would any
rearrangement here only cause regressions?

Also, both 196 and 200 bytes are fairly close to 192 bytes, so I could
imagine performance improvements on 64bit machines with 64 Byte
cachelines. Might it make sense to trim DNAME_INLINE_LEN_MIN by 4 or
8 bytes for such machines?

J?rn

--
The wise man seeks everything in himself; the ignorant man tries to get
everything from somebody else.
-- unknown


2006-09-14 10:50:31

by Jörn Engel

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Thu, 14 September 2006 11:31:23 +0200, J?rn Engel wrote:
>
> After taking a look at struct dentry, Arnd noted an alignment
> problem. The first four fields currently are:
> atomic_t d_count;
> unsigned int d_flags; /* protected by d_lock */
> spinlock_t d_lock; /* per dentry lock */
> struct inode *d_inode; /* Where the name belongs to - NULL is
> * negative */
> On 64bit architectures, the first three take 12 bytes and d_inode is
> not naturally aligned, so it can be aligned to byte 16. This grows a
> struct dentry from 196 to 200 Bytes (assuming no funky config options
> like DEBUG_*, PROFILING or PREEMT && SMP are set).
>
> One possible solution would be to exchange d_inode with d_mounted, but
> I fear that d_inode would move from a hot cacheline to a cold one,
> reducing performance. Could there be a good solution or would any
> rearrangement here only cause regressions?
>
> Also, both 196 and 200 bytes are fairly close to 192 bytes, so I could
> imagine performance improvements on 64bit machines with 64 Byte
> cachelines. Might it make sense to trim DNAME_INLINE_LEN_MIN by 4 or
> 8 bytes for such machines?

And here is a patch for those who prefer talking code

J?rn

--
Joern's library part 8:
http://citeseer.ist.psu.edu/plank97tutorial.html

--- slab/include/linux/dcache.h~dentry_alignment 2006-09-14 10:52:51.000000000 +0200
+++ slab/include/linux/dcache.h 2006-09-14 12:44:56.000000000 +0200
@@ -77,14 +77,17 @@ full_name_hash(const unsigned char *name

struct dcookie_struct;

+#if BITS_PER_LONG == 64
+#define DNAME_INLINE_LEN_MIN 32
+#else
#define DNAME_INLINE_LEN_MIN 36
+#endif

struct dentry {
atomic_t d_count;
unsigned int d_flags; /* protected by d_lock */
spinlock_t d_lock; /* per dentry lock */
- struct inode *d_inode; /* Where the name belongs to - NULL is
- * negative */
+ int d_mounted;
/*
* The next three fields are touched by __d_lookup. Place them here
* so they all fit in a cache line.
@@ -93,6 +96,8 @@ struct dentry {
struct dentry *d_parent; /* parent directory */
struct qstr d_name;

+ struct inode *d_inode; /* Where the name belongs to - NULL is
+ * negative */
struct list_head d_lru; /* LRU list */
/*
* d_child and d_rcu can share memory
@@ -110,7 +115,6 @@ struct dentry {
#ifdef CONFIG_PROFILING
struct dcookie_struct *d_cookie; /* cookie, if any */
#endif
- int d_mounted;
unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */
};

2006-09-14 18:33:33

by Andreas Dilger

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Sep 14, 2006 12:50 +0200, J�rn Engel wrote:
> > After taking a look at struct dentry, Arnd noted an alignment
> > problem.
> > On 64bit architectures, the first three take 12 bytes and d_inode is
> > not naturally aligned, so it can be aligned to byte 16.
>
> struct dentry {
> atomic_t d_count;
> unsigned int d_flags; /* protected by d_lock */
> spinlock_t d_lock; /* per dentry lock */
> - struct inode *d_inode; /* Where the name belongs to - NULL is
> - * negative */
> + int d_mounted;
> /*
> * The next three fields are touched by __d_lookup. Place them here
> * so they all fit in a cache line.
> @@ -93,6 +96,8 @@ struct dentry {
> struct dentry *d_parent; /* parent directory */
> struct qstr d_name;
>
> + struct inode *d_inode; /* Where the name belongs to - NULL is

I think it makes sense to keep d_inode in the first part of the dentry
always, because it is by far the most referenced field in the dentry,
along with the critical fields from prune_dcache(), shrink_dcache_anon(),
dget(), dput(), d_lookup().

While not totally accurate in terms of runtime frequency of use, the counts
in the code:

fs/*.[ch] fs/*/*.[ch] size32 size64 prune_dc shrk_dc_anon d_lookup
d_inode 384 2131 4 8
d_lock 104 529 4 4 1 2
d_count 18 66 4 4 1 2
d_lru 18 18 4_ 8 1 1
d_hash 37 154 4 8_ 2 1
d_name 73 908 12_ 16 1
d_flags 26 104 4 4 2
d_mounted 7 7 4 4
d_parent 40 231 4 8_ 2
d_op 37 269 4 8
d_rcu/d_child 3+22 3+45 8 16

The '_' are potential cacheline boundaries.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2006-09-14 21:02:45

by Jörn Engel

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

Hello Andreas,

On Thu, 14 September 2006 12:33:25 -0600, Andreas Dilger wrote:
>
> I think it makes sense to keep d_inode in the first part of the dentry
> always, because it is by far the most referenced field in the dentry,
> along with the critical fields from prune_dcache(), shrink_dcache_anon(),
> dget(), dput(), d_lookup().

d_inode is definitely one of the hotter fields in there. It just
happens to cause the misalignment. Bah, I don't see a good solution.

> While not totally accurate in terms of runtime frequency of use, the counts
> in the code:
>
> fs/*.[ch] fs/*/*.[ch] size32 size64 prune_dc shrk_dc_anon d_lookup
> d_inode 384 2131 4 8
> d_lock 104 529 4 4 1 2
> d_count 18 66 4 4 1 2
> d_lru 18 18 4_ 8 1 1
> d_hash 37 154 4 8_ 2 1
> d_name 73 908 12_ 16 1
> d_flags 26 104 4 4 2
> d_mounted 7 7 4 4
> d_parent 40 231 4 8_ 2
> d_op 37 269 4 8
> d_rcu/d_child 3+22 3+45 8 16

[ d_hash is 8/16, actually ]

d_hash, d_name and d_parent belong way up to the top of the list, imo.
d_lookup() should be the hottest function of all, as the comment in
the structure definition already indicates. Maybe the solution is to
rearrange the fields with those going to the top?

Using your scheme (slightly reduced) we now have:
size32 size64 funky?
d_count 4 4
d_flags 4 4
d_lock 4 4_ y
d_inode 4_ 8
d_hash 8 16--
d_parent 4 8_
d_name 12-- 16___
d_lru 8_ 16_
d_rcu/d_child 8 16__
d_subdirs 8___ 16_
d_alias 8 16____
d_time 4 8
d_op 4_ 8_
d_sb 4 8
d_fsdata 4 8__
d_cookie 0 0 y
d_mounted 4 4
d_iname 36____ 36

With the two funky fields possibly growing, depending on kernel
config. [_-] mark 16-, 32- 64- and 128-byte boundaries, depending on
len. What really frightens me is that a 32-byte boundary goes right
through d_name on 32bit machines. Iirc, my PIII has 32-byte
cachelines. Not good.

How about moving [d_hash,d_parent,d_name] to the front? Something
like:
size32 size64 funky?
d_hash 8 16_
d_parent 4 8
d_name 12- 16--
d_inode 4 8_
d_count 4__ 4
d_flags 4 4
d_lock 4 4 y

d_mounted 4 4

d_lru 8 16
d_rcu/d_child 8 16
d_subdirs 8 16
d_alias 8 16
d_time 4 8
d_op 4 8
d_sb 4 8
d_fsdata 4 8
d_cookie 0 0 y
d_iname 36 36

Now d_lookup() should use a single cacheline, even on my aged
notebook, and the other hot fields remain at the top. d_mounted is
also moved up to remove the misalignment on 64bit. Might be worth
a benchmark or two to see whether it makes a difference...

J?rn

--
Joern's library part 1:
http://lwn.net/Articles/2.6-kernel-api/

2006-09-14 21:55:54

by Andreas Dilger

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Sep 14, 2006 23:02 +0200, J�rn Engel wrote:
> On Thu, 14 September 2006 12:33:25 -0600, Andreas Dilger wrote:
> > I think it makes sense to keep d_inode in the first part of the dentry
> > always, because it is by far the most referenced field in the dentry,
> > along with the critical fields from prune_dcache(), shrink_dcache_anon(),
> > dget(), dput(), d_lookup().
>
> d_inode is definitely one of the hotter fields in there. It just
> happens to cause the misalignment. Bah, I don't see a good solution.

As is d_lock, which didn't exist when those comments were made.

> > While not totally accurate in terms of runtime frequency of use, the counts
> > in the code:
> >
> > fs/*.[ch] fs/*/*.[ch] size32 size64 prune_dc shrk_dc_anon d_lookup
> > d_inode 384 2131 4 8
> > d_lock 104 529 4 4 1 2
> > d_count 18 66 4 4 1 2
> > d_lru 18 18 4_ 8 1 1
> > d_hash 37 154 4 8_ 2 1
> > d_name 73 908 12_ 16 1
> > d_flags 26 104 4 4 2
> > d_mounted 7 7 4 4
> > d_parent 40 231 4 8_ 2
> > d_op 37 269 4 8
> > d_rcu/d_child 3+22 3+45 8 16
>
> [ d_hash is 8/16, actually ]

Doh!

> d_hash, d_name and d_parent belong way up to the top of the list, imo.
> d_lookup() should be the hottest function of all, as the comment in
> the structure definition already indicates. Maybe the solution is to
> rearrange the fields with those going to the top?
>
> Using your scheme (slightly reduced) we now have:
> size32 size64 funky?
> d_count 4 4
> d_flags 4 4
> d_lock 4 4_ y
> d_inode 4_ 8
> d_hash 8 16--
> d_parent 4 8_
> d_name 12-- 16___
> d_lru 8_ 16_
> d_rcu/d_child 8 16__
> d_subdirs 8___ 16_
> d_alias 8 16____
> d_time 4 8
> d_op 4_ 8_
> d_sb 4 8
> d_fsdata 4 8__
> d_cookie 0 0 y
> d_mounted 4 4
> d_iname 36____ 36
>
> With the two funky fields possibly growing, depending on kernel
> config. [_-] mark 16-, 32- 64- and 128-byte boundaries, depending on
> len. What really frightens me is that a 32-byte boundary goes right
> through d_name on 32bit machines.

Actually, splitting d_name like this is not so bad (as long as the
compiler doesn't add padding) because the important fields (hash
and len) are first and are compared for all non-matching dentries
in __d_lookup().

> Iirc, my PIII has 32-byte cachelines. Not good.
>
> How about moving [d_hash,d_parent,d_name] to the front? Something
> like:
> size32 size64 funky?
> d_hash 8 16_
> d_parent 4 8
> d_name 12- 16--
> d_inode 4 8_
> d_count 4__ 4
> d_flags 4 4
> d_lock 4 4 y
>
> d_mounted 4 4
>
> d_lru 8 16
> d_rcu/d_child 8 16
> d_subdirs 8 16
> d_alias 8 16
> d_time 4 8
> d_op 4 8
> d_sb 4 8
> d_fsdata 4 8
> d_cookie 0 0 y
> d_iname 36 36
>
> Now d_lookup() should use a single cacheline, even on my aged
> notebook, and the other hot fields remain at the top. d_mounted is
> also moved up to remove the misalignment on 64bit. Might be worth
> a benchmark or two to see whether it makes a difference...

Might not be too hard (even if it temporarily kills performance)
to add atomic counters for each of these fields where they are
referenced in dcache.c, namei.c and e.g. fs/ext3/*.c (which is
only using d_inode, d_name, and d_parent). Run a find and a
kernel compile and dump the counters at shutdown.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.

2006-09-15 10:27:34

by Jörn Engel

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Thu, 14 September 2006 15:55:45 -0600, Andreas Dilger wrote:
> On Sep 14, 2006 23:02 +0200, J???rn Engel wrote:
> >
> > Using your scheme (slightly reduced) we now have:
> > size32 size64 funky?
> > d_count 4 4
> > d_flags 4 4
> > d_lock 4 4_ y
> > d_inode 4_ 8
> > d_hash 8 16--
> > d_parent 4 8_
> > d_name 12-- 16___
> > d_lru 8_ 16_
> > d_rcu/d_child 8 16__
> > d_subdirs 8___ 16_
> > d_alias 8 16____
> > d_time 4 8
> > d_op 4_ 8_
> > d_sb 4 8
> > d_fsdata 4 8__
> > d_cookie 0 0 y
> > d_mounted 4 4
> > d_iname 36____ 36
> >
> > With the two funky fields possibly growing, depending on kernel
> > config. [_-] mark 16-, 32- 64- and 128-byte boundaries, depending on
> > len. What really frightens me is that a 32-byte boundary goes right
> > through d_name on 32bit machines.
>
> Actually, splitting d_name like this is not so bad (as long as the
> compiler doesn't add padding) because the important fields (hash
> and len) are first and are compared for all non-matching dentries
> in __d_lookup().

Except that d_name is split between hash and len, not between len and
name. You said d_lock was added later? Looks like it broke careful
tuning for 32-byte cacheline machines. And it could also have caused
the misalignment on 64bit.

> > Now d_lookup() should use a single cacheline, even on my aged
> > notebook, and the other hot fields remain at the top. d_mounted is
> > also moved up to remove the misalignment on 64bit. Might be worth
> > a benchmark or two to see whether it makes a difference...
>
> Might not be too hard (even if it temporarily kills performance)
> to add atomic counters for each of these fields where they are
> referenced in dcache.c, namei.c and e.g. fs/ext3/*.c (which is
> only using d_inode, d_name, and d_parent). Run a find and a
> kernel compile and dump the counters at shutdown.

Might be even simpler by running gcov/lcov. That would not show which
fields are used at similar times, but it is a start.

Btw, I already mentioned how reducing the d_iname fields by a few
bytes could save a cacheline. And whenever I have a good idea, Arnd
usually comes up with a better one. How about the patch below to fix
dentries to 128/192 bytes on 32/64 bit machines, independently of
config options? It would shrink them from 132 bytes to 128 bytes in
my current config (although I don't quite remember why I turned
CONFIG_PROFILING on).

J?rn

--
Joern's library part 14:
http://www.sandpile.org/

--- slab/include/linux/dcache.h~dentry_size 2006-09-14 22:19:51.000000000 +0200
+++ slab/include/linux/dcache.h 2006-09-15 12:25:27.000000000 +0200
@@ -77,7 +77,7 @@ full_name_hash(const unsigned char *name

struct dcookie_struct;

-#define DNAME_INLINE_LEN_MIN 36
+#define DNAME_INLINE_LEN_MIN 16

struct dentry {
atomic_t d_count;
@@ -112,7 +112,10 @@ struct dentry {
#endif
int d_mounted;
unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */
-};
+}__attribute__((aligned(64))); /* make sure the dentry is 128/192 bytes
+ on 32/64 bit independently of config
+ options. d_iname will vary in length
+ a bit. */

/*
* dentry->d_lock spinlock nesting subclasses:

2006-09-15 13:41:54

by Jörn Engel

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Fri, 15 September 2006 12:27:36 +0200, J?rn Engel wrote:
>
> --- slab/include/linux/dcache.h~dentry_size 2006-09-14 22:19:51.000000000 +0200
> +++ slab/include/linux/dcache.h 2006-09-15 12:25:27.000000000 +0200
> @@ -77,7 +77,7 @@ full_name_hash(const unsigned char *name
>
> struct dcookie_struct;
>
> -#define DNAME_INLINE_LEN_MIN 36
> +#define DNAME_INLINE_LEN_MIN 16
>
> struct dentry {
> atomic_t d_count;
> @@ -112,7 +112,10 @@ struct dentry {
> #endif
> int d_mounted;
> unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */
> -};
> +}__attribute__((aligned(64))); /* make sure the dentry is 128/192 bytes
> + on 32/64 bit independently of config
> + options. d_iname will vary in length
> + a bit. */
>
> /*
> * dentry->d_lock spinlock nesting subclasses:

Here is the filename length distribution on my absolutely non-standard
system (thanks for the script, Arnd). The spikes are:
37: ccache
38: git
44: ccache

Could it make sense to set DNAME_INLINE_LEN_MIN to 44 for hackers? :)

limerick:~# for i in `seq 100` ; do echo $i `find / | grep "/[^/]\{$i\}$" | wc -l` ; done 2>/dev/null
1 3182
2 9375
3 27751
4 45573
5 107506
6 160836
7 186526
8 218420
9 169198
10 166416
11 114927
12 106676
13 85750
14 57269
15 41845
16 32014
17 22970
18 17139
19 15323
20 10334
21 7935
22 6780
23 6806
24 4716
25 3965
26 3152
27 3331
28 3045
29 2486
30 2586
31 2068
32 1907
33 1568
34 1549
35 1232
36 4971
37 19504
38 66679
39 606
40 586
41 723
42 483
43 4484
44 19026
45 297
46 245
47 272
48 275
49 200
50 192
51 193
52 148
53 140
54 152
55 137
56 102
57 100
58 164
59 81
60 92
61 93
62 54
63 68
64 112
65 65
66 44
67 43
68 62
69 39
70 29
71 41
72 33
73 36
74 23
75 19
76 31
77 23
78 13
79 23
80 10
81 12
82 8
83 12
84 4
85 7
86 7
87 10
88 10
89 11
90 3
91 5
92 5
93 5
94 4
95 1
96 2
97 1
98 2
99 4
100 8


J?rn

--
It is better to die of hunger having lived without grief and fear,
than to live with a troubled spirit amid abundance.
-- Epictetus

2006-09-15 20:44:17

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

Am Friday 15 September 2006 12:27 schrieb J?rn Engel:
> -};
> +}__attribute__((aligned(64)));?/* make sure the dentry is 128/192 bytes
> +??????????????????????????????? ? on 32/64 bit independently of config
> +??????????????????????????????? ? options. ?d_iname will vary in length
> +??????????????????????????????? ? a bit. */

I'd guess that a 32 byte alignment is much better here, 64 byte sounds
excessive. It should have the same effect with the current dentry layout
and default config options, but would keep the d_iname length in the
16-44 byte range instead of 16-76 byte as your patch does.

Since all important fields are supposed to be kept in 32 bytes anyway,
they are still either at the start or the end of a given cache line,
but never cross two.

Arnd <><

2006-09-18 21:24:19

by Jörn Engel

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Fri, 15 September 2006 22:44:07 +0200, Arnd Bergmann wrote:
> Am Friday 15 September 2006 12:27 schrieb J?rn Engel:
> > -};
> > +}__attribute__((aligned(64)));?/* make sure the dentry is 128/192 bytes
> > +??????????????????????????????? ? on 32/64 bit independently of config
> > +??????????????????????????????? ? options. ?d_iname will vary in length
> > +??????????????????????????????? ? a bit. */

Wow. Mutt really didn't like quoting this.

> I'd guess that a 32 byte alignment is much better here, 64 byte sounds
> excessive. It should have the same effect with the current dentry layout
> and default config options, but would keep the d_iname length in the
> 16-44 byte range instead of 16-76 byte as your patch does.
>
> Since all important fields are supposed to be kept in 32 bytes anyway,
> they are still either at the start or the end of a given cache line,
> but never cross two.

Another take would be to use a cacheline. But I guess the difference
between 32/64/cacheline is mostly academic, given the rate of changes
to struct dentry.

Unless the minimum length for inline names is configurable, as in the
patch below. (Note to self: I really should finish my test setup
before posting any further patches without performance numbers.)

J?rn

--
Don't worry about people stealing your ideas. If your ideas are any good,
you'll have to ram them down people's throats.
-- Howard Aiken quoted by Ken Iverson quoted by Jim Horning quoted by
Raph Levien, 1979

--- slab/include/linux/dcache.h~dentry_size 2006-09-14 22:19:51.000000000 +0200
+++ slab/include/linux/dcache.h 2006-09-18 23:22:44.000000000 +0200
@@ -77,8 +77,6 @@ full_name_hash(const unsigned char *name

struct dcookie_struct;

-#define DNAME_INLINE_LEN_MIN 36
-
struct dentry {
atomic_t d_count;
unsigned int d_flags; /* protected by d_lock */
@@ -111,8 +109,8 @@ struct dentry {
struct dcookie_struct *d_cookie; /* cookie, if any */
#endif
int d_mounted;
- unsigned char d_iname[DNAME_INLINE_LEN_MIN]; /* small names */
-};
+ unsigned char d_iname[CONFIG_DCACHE_INLINE_LEN]; /* small names */
+}__cacheline_aligned;

/*
* dentry->d_lock spinlock nesting subclasses:
--- slab/fs/Kconfig~dentry_size 2006-08-11 13:03:43.000000000 +0200
+++ slab/fs/Kconfig 2006-09-18 23:22:44.000000000 +0200
@@ -1931,5 +1931,18 @@ endmenu

source "fs/nls/Kconfig"

+config DCACHE_INLINE_LEN
+ int "Length of inline filenames"
+ depends on EXPERIMENTAL
+ default 20
+ help
+ Filenames up to this length are stored inline in a struct dentry.
+ For most people something short like 20 is ok. Developers with
+ extensive use of git or ccache may want to set this to 38 or 44,
+ respectively, as those programs create and often access thousands
+ of files of that length.
+
+ If unsure, choose 20.
+
endmenu

2006-09-18 21:54:12

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [RFC] Alignment of fields in struct dentry

On Monday 18 September 2006 23:24, J?rn Engel wrote:
> On Fri, 15 September 2006 22:44:07 +0200, Arnd Bergmann wrote:
> >
> > I'd guess that a 32 byte alignment is much better here, 64 byte sounds
> > excessive. It should have the same effect with the current dentry layout
> > and default config options, but would keep the d_iname length in the
> > 16-44 byte range instead of 16-76 byte as your patch does.
> >
> > Since all important fields are supposed to be kept in 32 bytes anyway,
> > they are still either at the start or the end of a given cache line,
> > but never cross two.
>
> Another take would be to use a cacheline. But I guess the difference
> between 32/64/cacheline is mostly academic, given the rate of changes
> to struct dentry.

There have been so many optimizations and misoptimizations regarding
the dentry struct over the years. See http://lkml.org/lkml/2004/5/8/117
for the almost exact opposite of this patch, along with the same discussion
that we're having now.

Arnd <><