2013-03-04 11:29:43

by Eric Wong

Subject: [PATCH] epoll: trim epitem by one cache line on x86_64

It is common for epoll users to have thousands of epitems, so saving a
cache line on every allocation leads to large memory savings.

Since epitem allocations are cache-aligned, reducing sizeof(struct
epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
cache line boundary on x86_64.

From /sys/kernel/slab/eventpoll_epi, I see the following changes on my
x86_64 Core2 Duo (which has 64-byte cache alignment):

object_size : 192 => 128
objs_per_slab: 21 => 32
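
As a minimal userspace sketch of the effect (a hypothetical simplified
layout, not the real struct epitem; nwait and rest are stand-ins for the
remaining members), packing the nested {pointer, int} struct drops its
4 bytes of tail padding and lets the int that follows absorb the slack:

#include <stdio.h>

/* 16 bytes on x86_64: 8 (pointer) + 4 (int) + 4 bytes tail padding */
struct filefd_padded {
        void *file;
        int fd;
};

/* 12 bytes: packed, so no tail padding */
struct filefd_packed {
        void *file;
        int fd;
} __attribute__((packed));

/* hypothetical stand-in for the rest of epitem's members */
struct item_padded {
        struct filefd_padded ffd;
        int nwait;              /* the int that follows ffd */
        void *rest[14];         /* 112 bytes of pointer-aligned members */
};

struct item_packed {
        struct filefd_packed ffd;
        int nwait;
        void *rest[14];
};

int main(void)
{
        printf("unpacked ffd: %zu\n", sizeof(struct item_padded)); /* 136 */
        printf("packed ffd:   %zu\n", sizeof(struct item_packed)); /* 128 */
        return 0;
}

The 192 => 128 object_size above follows from SLAB_HWCACHE_ALIGN: the
slab rounds each object up to a multiple of the 64-byte cache line, so a
136-byte struct occupies 192 bytes per object while a 128-byte one fits
exactly in two lines.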

I have no access to other 64-bit machines, so I am limiting this to
x86_64 with EPOLL_PACKED instead of __attribute__((packed)).

Signed-off-by: Eric Wong <[email protected]>
Cc: Davide Libenzi <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Andrew Morton <[email protected]>
---
fs/eventpoll.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index cfc4b16..06f3d0e 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -107,7 +107,7 @@
struct epoll_filefd {
struct file *file;
int fd;
-};
+} EPOLL_PACKED;

/*
* Structure used to track possible nested calls, for too deep recursions
--
Eric Wong


2013-03-06 22:37:45

by Andrew Morton

Subject: Re: [PATCH] epoll: trim epitem by one cache line on x86_64

On Mon, 4 Mar 2013 11:29:41 +0000 Eric Wong <[email protected]> wrote:

> It is common for epoll users to have thousands of epitems, so saving a
> cache line on every allocation leads to large memory savings.
>
> Since epitem allocations are cache-aligned, reducing sizeof(struct
> epitem) from 136 bytes to 128 bytes will allow it to squeeze under a
> cache line boundary on x86_64.
>
> From /sys/kernel/slab/eventpoll_epi, I see the following changes on my
> x86_64 Core2 Duo (which has 64-byte cache alignment):
>
> object_size : 192 => 128
> objs_per_slab: 21 => 32
>
> I have no access to other 64-bit machines, so I am limiting this to
> x86_64 with EPOLL_PACKED instead of __attribute__((packed)).
>
> ...
>
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -107,7 +107,7 @@
> struct epoll_filefd {
> struct file *file;
> int fd;
> -};
> +} EPOLL_PACKED;
>
> /*
> * Structure used to track possible nested calls, for too deep recursions

Yes, I see the same numbers on my gcc, x86_64 allmodconfig.

It's going to be hard to maintain this - someone will change something
sometime and break it. I suppose we could add a runtime check if we
cared enough. Adding a big fat comment to struct epitem might help.

I don't see much additional room to be saved. We could probably remove
epitem.nwait, but that wouldn't actually save anything because nwait
nestles with ffd.fd.
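
A rough userspace illustration of that point (simplified stand-in
structs, not the real epitem): with ffd packed down to 12 bytes, the int
that follows it sits in space that would otherwise be padding before the
next pointer-aligned member, so deleting it does not shrink the struct.

#include <stddef.h>
#include <stdio.h>

struct filefd {
        void *file;
        int fd;
} __attribute__((packed));              /* 12 bytes */

struct with_nwait {
        struct filefd ffd;              /* offsets 0..11 */
        int nwait;                      /* offset 12: fills the gap */
        void *next_member;              /* offset 16 */
};

struct without_nwait {
        struct filefd ffd;              /* offsets 0..11 */
        /* 4 bytes of padding here instead of nwait */
        void *next_member;              /* still offset 16 */
};

int main(void)
{
        printf("with nwait:    %zu\n", sizeof(struct with_nwait));          /* 24 */
        printf("without nwait: %zu\n", sizeof(struct without_nwait));       /* 24 */
        printf("nwait offset:  %zu\n", offsetof(struct with_nwait, nwait)); /* 12 */
        return 0;
}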

I tested your patch on powerpc and it reduced sizeof(epitem) from 136
to 128 for that arch as well, so I suggest we run with

--- a/fs/eventpoll.c~epoll-trim-epitem-by-one-cache-line-on-x86_64-fix
+++ a/fs/eventpoll.c
@@ -105,7 +105,7 @@
struct epoll_filefd {
struct file *file;
int fd;
-} EPOLL_PACKED;
+} __packed;

/*
* Structure used to track possible nested calls, for too deep recursions
_
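
For context on why switching to __packed also helps the other
architectures (paraphrasing the kernel headers of that era, so treat the
exact locations as approximate): EPOLL_PACKED only expands to the packed
attribute on x86_64, while __packed is the unconditional kernel-wide
shorthand.

/* include/uapi/linux/eventpoll.h (approximate) */
#ifdef __x86_64__
#define EPOLL_PACKED __attribute__((packed))
#else
#define EPOLL_PACKED
#endif

/* include/linux/compiler-gcc.h (approximate) */
#define __packed __attribute__((packed))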

2013-03-07 10:32:42

by Eric Wong

Subject: Re: [PATCH] epoll: trim epitem by one cache line on x86_64

Andrew Morton <[email protected]> wrote:
> It's going to be hard to maintain this - someone will change something
> sometime and break it. I suppose we could add a runtime check if we
> cared enough. Adding a big fat comment to struct epitem might help.

Thanks for looking at this patch. I'll send a patch with a comment
about keeping epitem size in check. Also, would adding (with comments):

BUILD_BUG_ON(sizeof(struct epitem) > 128);

...be too heavy-handed? I used that in my testing. I'll also check for
sizeof(void *) <= 8, in case 128-bit machines appear...
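
Outside the kernel, the same kind of compile-time guard can be sketched
with C11 static_assert, which is roughly what BUILD_BUG_ON gives us
in-tree (epitem_stub below is a placeholder, not the real struct):

#include <assert.h>     /* static_assert, C11 */

struct epitem_stub {
        char bytes[128];        /* placeholder for the real layout */
};

/* Refuses to compile if the struct ever outgrows two 64-byte cache lines
 * on machines whose pointers are 8 bytes or smaller. */
static_assert(sizeof(void *) > 8 || sizeof(struct epitem_stub) <= 128,
              "struct epitem_stub no longer fits in 128 bytes");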

> I don't see much additional room to be saved. We could probably remove
> epitem.nwait, but that wouldn't actually save anything because nwait
> nestles with ffd.fd.

If we remove nwait, we can move epoll_event up and have event.events
tucked in there. I have more and more code depending on epoll, so I'll
be around to comment on future epoll changes as they come up.

> I tested your patch on powerpc and it reduced sizeof(epitem) from 136
> to 128 for that arch as well, so I suggest we run with
>
> --- a/fs/eventpoll.c~epoll-trim-epitem-by-one-cache-line-on-x86_64-fix
> +++ a/fs/eventpoll.c
> @@ -105,7 +105,7 @@
> struct epoll_filefd {
> struct file *file;
> int fd;
> -} EPOLL_PACKED;
> +} __packed;

Thanks for testing on ppc. Looks good to me. For what it's worth:
Acked-by: Eric Wong <[email protected]>

2013-03-08 23:54:12

by Andrew Morton

Subject: Re: [PATCH] epoll: trim epitem by one cache line on x86_64

On Thu, 7 Mar 2013 10:32:40 +0000 Eric Wong <[email protected]> wrote:

> Andrew Morton <[email protected]> wrote:
> > It's going to be hard to maintain this - someone will change something
> > sometime and break it. I suppose we could add a runtime check if we
> > cared enough. Adding a big fat comment to struct epitem might help.
>
> Thanks for looking at this patch. I'll send a patch with a comment
> about keeping epitem size in check. Also, would adding (with comments):
>
> BUILD_BUG_ON(sizeof(struct epitem) > 128);
>
> ...be too heavy-handed? I used that in my testing. I'll also check for
> sizeof(void *) <= 8, in case 128-bit machines appear...

I guess such a check might avoid accidents in the future. If it
becomes a problem, we can always delete it.

2013-03-09 00:41:10

by Eric Wong

Subject: [PATCH] epoll: comment + BUILD_BUG_ON to prevent epitem bloat

This will prevent us from accidentally introducing a memory bloat
regression here in the future.

Signed-off-by: Eric Wong <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Davide Libenzi <[email protected]>
Cc: Al Viro <[email protected]>
---
Andrew Morton <[email protected]> wrote:
> On Thu, 7 Mar 2013 10:32:40 +0000 Eric Wong <[email protected]> wrote:
>
> > Andrew Morton <[email protected]> wrote:
> > > It's going to be hard to maintain this - someone will change something
> > > sometime and break it. I suppose we could add a runtime check if we
> > > cared enough. Adding a big fat comment to struct epitem might help.
> >
> > Thanks for looking at this patch. I'll send a patch with a comment
> > about keeping epitem size in check. Also, would adding (with comments):
> >
> > BUILD_BUG_ON(sizeof(struct epitem) > 128);
> >
> > ...be too heavy-handed? I used that in my testing. I'll also check for
> > sizeof(void *) <= 8, in case 128-bit machines appear...
>
> I guess such a check might avoid accidents in the future. If it
> becomes a problem, we can always delete it.

fs/eventpoll.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 9fec183..55028da 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -128,6 +128,8 @@ struct nested_calls {
/*
* Each file descriptor added to the eventpoll interface will
* have an entry of this type linked to the "rbr" RB tree.
+ * Avoid increasing the size of this struct, there can be many thousands
+ * of these on a server and we do not want this to take another cache line.
*/
struct epitem {
/* RB tree node used to link this structure to the eventpoll RB tree */
@@ -1964,6 +1966,12 @@ static int __init eventpoll_init(void)
/* Initialize the structure used to perform file's f_op->poll() calls */
ep_nested_calls_init(&poll_readywalk_ncalls);

+ /*
+ * We can have many thousands of epitems, so prevent this from
+ * using an extra cache line on 64-bit (and smaller) CPUs
+ */
+ BUILD_BUG_ON(sizeof(void *) <= 8 && sizeof(struct epitem) > 128);
+
/* Allocates slab cache used to allocate "struct epitem" items */
epi_cache = kmem_cache_create("eventpoll_epi", sizeof(struct epitem),
0, SLAB_HWCACHE_ALIGN | SLAB_PANIC, NULL);
--
Eric Wong