2010-08-14 17:44:42

by Andreas Schwab

Subject: struct fanotify_event_metadata

The pid field of struct fanotify_event_metadata has 64 bits which looks
excessive. Wouldn't it make sense to make it 32 bits and swap it with
the mask field? That would avoid the unaligned mask field, and remove
the need for the packed attribute.

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."


2010-08-19 15:44:34

by Tvrtko Ursulin

Subject: Re: struct fanotify_event_metadata

On Saturday 14 Aug 2010 18:44:38 Andreas Schwab wrote:
> The pid field of struct fanotify_event_metadata has 64 bits which looks
> excessive. Wouldn't it make sense to make it 32 bits and swap it with
> the mask field? That would avoid the unaligned mask field, and remove
> the need for the packed attribute.

No one seems to have picked up on this what I thought was an obvious good
idea.

So yes, 32-bit PID makes sense I think. If that was changed, would removing
the packed attribute get us anything more? If not it may stay for extra safety
if/when new fields are added.

Also, would it make sense to move mask in front of fd so the 32-bit pid can stay
at the end? I think mask logically should come earlier because it is more
properly metadata, while pid is auxiliary.

Tvrtko

Sophos Plc, The Pentagon, Abingdon Science Park, Abingdon, OX14 3YP, United Kingdom.
Company Reg No 2096520. VAT Reg No GB 348 3873 20.

2010-08-19 16:12:38

by Andreas Gruenbacher

Subject: Re: struct fanotify_event_metadata

On Thursday 19 August 2010 17:44:29 Tvrtko Ursulin wrote:
> On Saturday 14 Aug 2010 18:44:38 Andreas Schwab wrote:
> > The pid field of struct fanotify_event_metadata has 64 bits which looks
> > excessive. Wouldn't it make sense to make it 32 bits and swap it with
> > the mask field? That would avoid the unaligned mask field, and remove
> > the need for the packed attribute.
>
> No one seems to have picked up on this what I thought was an obvious good
> idea.

Yes, the pid field should be shrunk; it is a 32-bit value in user-space even
on 64-bit platforms. I also don't see that we'll ever need a 64-bit mask
actually.

Andreas

2010-08-19 16:35:47

by Tvrtko Ursulin

Subject: Re: struct fanotify_event_metadata

On Thursday 19 Aug 2010 17:07:01 Andreas Gruenbacher wrote:
> On Thursday 19 August 2010 17:44:29 Tvrtko Ursulin wrote:
> > On Saturday 14 Aug 2010 18:44:38 Andreas Schwab wrote:
> > > The pid field of struct fanotify_event_metadata has 64 bits which looks
> > > excessive. Wouldn't it make sense to make it 32 bits and swap it with
> > > the mask field? That would avoid the unaligned mask field, and remove
> > > the need for the packed attribute.
> >
> > No one seems to have picked up on this what I thought was an obvious good
> > idea.
>
> Yes, the pid field should be shrunk; it is a 32-bit value in user-space
> even on 64-bit platforms. I also don't see that we'll ever need a 64-bit
> mask actually.

You are probably right: even though 23 bits are already allocated at the
moment (a purely mechanical count), the protocol is future-proof enough to
expand the mask later if needed.

On the other hand, since it is the same mask as in the syscall (where we really
want to have enough width to begin with), it is nice that both user interfaces
agree on the field width, and it is probably a negligible overhead anyway.

Tvrtko


2010-08-19 17:54:06

by Eric Paris

Subject: Re: struct fanotify_event_metadata

On Thu, 2010-08-19 at 16:44 +0100, Tvrtko Ursulin wrote:
> On Saturday 14 Aug 2010 18:44:38 Andreas Schwab wrote:
> > The pid field of struct fanotify_event_metadata has 64 bits which looks
> > excessive. Wouldn't it make sense to make it 32 bits and swap it with
> > the mask field? That would avoid the unaligned mask field, and remove
> > the need for the packed attribute.

Wish this thought came up 2 weeks ago :) It's going to stay __packed__
no matter what, even if the alignment works out nicely and it doesn't do
anything.

I'm certainly willing to shrink the pid and switch some locations if
no one objects but it will definitely break userspace, in that it is
going to require a recompile of anyone's userspace listener (the
interface was only intended to grow, not get switched around) but it has
only been in there about a week so I'm not seeing a huge harm.

I would not be happy to see the mask shrink, we might not be there yet,
we might not ever get there, but it was part of the future proofing of
the interface.

Would anyone like to send a patch? Tvrtko?

-Eric

2010-08-20 09:02:21

by Tvrtko Ursulin

Subject: Re: struct fanotify_event_metadata

On Thursday 19 Aug 2010 18:53:46 Eric Paris wrote:
> On Thu, 2010-08-19 at 16:44 +0100, Tvrtko Ursulin wrote:
> > On Saturday 14 Aug 2010 18:44:38 Andreas Schwab wrote:
> > > The pid field of struct fanotify_event_metadata has 64 bits which looks
> > > excessive. Wouldn't it make sense to make it 32 bits and swap it with
> > > the mask field? That would avoid the unaligned mask field, and remove
> > > the need for the packed attribute.
>
> Wish this thought came up 2 weeks ago :) It's going to stay __packed__
> no matter what, even if the alignment works out nicely and it doesn't do
> anything.
>
> I'm certainly willing to shrink the pid and switch some locations if
> noone objects but it will definitely break userspace, in that it is
> going to require a recompile of anyone's userspace listener (the
> interface was only intended to grow, not get switched around) but it has
> only been in there about a week so I'm not seeing a huge harm.
>
> I would not be happy to see the mask shrink, we might not be there yet,
> we might not ever get there, but it was part of the future proofing of
> the interface.
>
> Would anyone like to send a patch? Tvrtko?

I think it is OK to break userspace while still in the merge window.
It is not even a big breakage but just a recompile.

So something like the below?
---
Shrink pid field in the fanotify_event_metadata to 32-bit to match
the kernel representation. Pull mask field up since it logically
comes before event auxiliary data and also makes for a nicer
alignment.

Signed-off-by: Tvrtko Ursulin <[email protected]>
---
include/linux/fanotify.h | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index f0949a5..0e7f1bb 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -70,9 +70,9 @@
 struct fanotify_event_metadata {
 	__u32 event_len;
 	__u32 vers;
-	__s32 fd;
 	__u64 mask;
-	__s64 pid;
+	__s32 fd;
+	__s32 pid;
 } __attribute__ ((packed));

 struct fanotify_response {



2010-08-20 09:16:28

by Andreas Schwab

Subject: Re: struct fanotify_event_metadata

Tvrtko Ursulin <[email protected]> writes:

> Shrink pid field in the fanotify_event_metadata to 32-bit to match
> the kernel representation. Pull mask field up since it logically
> comes before event auxiliary data and also makes for a nicer
> alignment.

That won't buy you much wrt. alignment though, due to the packed
attribute.

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2010-08-20 09:23:12

by Tvrtko Ursulin

Subject: Re: struct fanotify_event_metadata

On Friday 20 Aug 2010 10:16:22 Andreas Schwab wrote:
> Tvrtko Ursulin <[email protected]> writes:
> > Shrink pid field in the fanotify_event_metadata to 32-bit to match
> > the kernel representation. Pull mask field up since it logically
> > comes before event auxiliary data and also makes for a nicer
> > alignment.
>
> That won't buy you much wrt. alignment though, due to the packed
> attribute.

I know, it is primarily a more logical ordering of fields within the event. It
is only secondary that I thought it nicer to have 32-32-64-32-32 than
32-32-32-64-32; maybe there is some platform where it is nicer?

Tvrtko


2010-08-20 13:13:58

by Eric Paris

Subject: Re: struct fanotify_event_metadata

On Fri, 2010-08-20 at 11:16 +0200, Andreas Schwab wrote:
> Tvrtko Ursulin <[email protected]> writes:
>
> > Shrink pid field in the fanotify_event_metadata to 32-bit to match
> > the kernel representation. Pull mask field up since it logically
> > comes before event auxiliary data and also makes for a nicer
> > alignment.
>
> That won't buy you much wrt. alignment though, due to the packed
> attribute.

Can you help me understand the packed attribute and why it hurts in this
case? It's not going to change the alignment or placement of anything
and I saw it as just being needed to make sure everything would continue
to work in the future.

I'm going to compile a couple of test programs (I only have x86 and
x86_64) to see if I can find what the assembly is doing differently, but
maybe you can point me at the information more easily?

-Eric

2010-08-20 13:27:52

by Andreas Schwab

Subject: Re: struct fanotify_event_metadata

Eric Paris <[email protected]> writes:

> Can you help me understand the packed attribute and why it hurts in this
> case?

It changes the alignment of all applicable objects to 1 which means that
the compiler cannot assume _any_ alignment. Thus on STRICT_ALIGNMENT
targets it has to use more expensive access methods to avoid generating
unaligned loads and stores (unless it can infer proper alignment from
the context).

You can add an aligned attribute to raise the assumed alignment again.
Of course, this can only work correctly if the actual alignment of the
object matches the declared one.

> I'm going to compile a couple of test programs (I only have x86 and
> x86_64) to see if I can find what the assembly is doing different but
> maybe you can point me at the information more easily?

Neither x86 nor x86-64 are STRICT_ALIGNMENT targets.

Andreas.

--
Andreas Schwab, [email protected]
GPG Key fingerprint = D4E8 DBE3 3813 BB5D FA84 5EC7 45C6 250E 6F00 984E
"And now for something completely different."

2010-08-20 15:19:40

by Eric Paris

Subject: Re: struct fanotify_event_metadata

On Fri, 2010-08-20 at 15:27 +0200, Andreas Schwab wrote:
> Eric Paris <[email protected]> writes:
>
> > Can you help me understand the packed attribute and why it hurts in this
> > case?
>
> It changes the alignment of all applicable objects to 1 which means that
> the compiler cannot assume _any_ aligment. Thus on STRICT_ALIGNMENT
> targets it has to use more expensive access methods to avoid generating
> unaligned loads and stores (unless it can infer proper alignment from
> the context).

Andreas suggested (I accidentally dropped the list when I asked him) that I use
natural alignment and explicit padding rather than ((packed)).

I'm open to the idea but I want to make it idiot-proof (i.e. I won't
screw it up later). My best offhand idea would be to do something like
so:

Expose this to userspace:
struct fanotify_event_metadata {
	__u32 event_len;
	__u32 vers;
	__s32 fd;
	__u64 mask;
	__s64 pid;
};

Wrap this in #ifdef __KERNEL__
struct fanotify_event_metadata_packed {
	__u32 event_len;
	__u32 vers;
	__s32 fd;
	__u64 mask;
	__s64 pid;
} __attribute__ ((packed));

Then add:
BUILD_BUG_ON(sizeof(struct fanotify_event_metadata) !=
	     sizeof(struct fanotify_event_metadata_packed));

Is that a good way to make the actual object non-packed but keep myself
from ever letting it fail alignment and padding requirements?

-Eric

2010-08-20 17:47:56

by Eric Paris

Subject: Re: struct fanotify_event_metadata

On Fri, 2010-08-20 at 11:19 -0400, Eric Paris wrote:
> On Fri, 2010-08-20 at 15:27 +0200, Andreas Schwab wrote:
> > Eric Paris <[email protected]> writes:
> >
> > > Can you help me understand the packed attribute and why it hurts in this
> > > case?
> >
> > It changes the alignment of all applicable objects to 1 which means that
> > the compiler cannot assume _any_ aligment. Thus on STRICT_ALIGNMENT
> > targets it has to use more expensive access methods to avoid generating
> > unaligned loads and stores (unless it can infer proper alignment from
> > the context).
>
> Andreas suggested (I accidentally dropped the list when I ask him) I use
> natural alignment and explicit padding rather than ((packed))

What would anyone think of this patch (which is on top of Tvrtko's
reordering)?

>From 9629f0435bdd00b8338ab41bd078b3c625fd8804 Mon Sep 17 00:00:00 2001
From: Eric Paris <[email protected]>
Date: Fri, 20 Aug 2010 13:33:27 -0400
Subject: [PATCH] fanotify: drops the packed attribute from userspace event metadata

The userspace event metadata structure was packed so that, when sent from a
kernel with a certain set of alignment rules to a userspace listener with a
different set of alignment rules, the userspace process would be able to use
the structure. On some arches just using packed, even if it doesn't change
the field layout, can cause a severe performance hit. From now on we are
not going to set the packed attribute and will just need to be very careful
to make sure the structure is naturally aligned and that explicit padding is
used when necessary. To make sure no one gets this wrong in the future, we
enforce this fact, at build time, using a similar structure that is packed
and comparing their sizes.

Signed-off-by: Eric Paris <[email protected]>
---
fs/notify/fanotify/fanotify_user.c | 2 ++
include/linux/fanotify.h | 22 ++++++++++++++++++++--
2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index b966b72..c3a5742 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -778,6 +778,8 @@ SYSCALL_ALIAS(sys_fanotify_mark, SyS_fanotify_mark);
  */
 static int __init fanotify_user_setup(void)
 {
+	BUILD_BUG_ON(sizeof(struct fanotify_event_metadata) !=
+		     sizeof(struct fan_event_meta_packed));
 	fanotify_mark_cache = KMEM_CACHE(fsnotify_mark, SLAB_PANIC);
 	fanotify_response_event_cache = KMEM_CACHE(fanotify_response_event,
 						   SLAB_PANIC);
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 0535461..3d7a9b5 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -66,14 +66,21 @@
FAN_Q_OVERFLOW)

#define FANOTIFY_METADATA_VERSION 1
-
+/*
+ * This structure must be naturally aligned so that a 32-bit userspace process
+ * will find the offsets the same as a 64-bit process. If there would be padding
+ * in the structure, it must be added explicitly by hand. Please note that
+ * anything added to this structure must also be added to the fan_event_meta_packed
+ * struct, which is used to enforce the alignment and padding rules at build
+ * time.
+ */
 struct fanotify_event_metadata {
 	__u32 event_len;
 	__u32 vers;
 	__u64 mask;
 	__s32 fd;
 	__s32 pid;
-} __attribute__ ((packed));
+};

 struct fanotify_response {
 	__s32 fd;
@@ -95,4 +102,15 @@ struct fanotify_response {
(long)(meta)->event_len >= (long)FAN_EVENT_METADATA_LEN && \
(long)(meta)->event_len <= (long)(len))

+#ifdef __KERNEL__
+/* see struct fanotify_event_metadata for the reason this exists */
+struct fan_event_meta_packed {
+	__u32 event_len;
+	__u32 vers;
+	__u64 mask;
+	__s32 fd;
+	__s32 pid;
+} __attribute__ ((packed));
+
+#endif /* __KERNEL__ */
#endif /* _LINUX_FANOTIFY_H */
--
1.6.5.3