2022-04-29 16:01:54

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH v2 0/3] fanotify: Allow user space to pass back additional audit info

The Fanotify API can be used for access control by requesting permission
event notification. The user space tooling that uses it may have a
complicated policy that inherently contains additional context for the
decision. If this information were available in the audit trail, policy
writers can close the loop on debugging policy. Also, if this additional
information were available, it would enable the creation of tools that
can suggest changes to the policy similar to how audit2allow can help
refine labeled security.

This patch defines 2 additional fields within the response structure
returned from user space on a permission event. The first field is 16
bits for the context type. The context type will describe what the
meaning is of the second field. The audit system will separate the
pieces and log them individually.

The audit function was updated to log the additional information in the
AUDIT_FANOTIFY record. The following is an example of the new record
format:

type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_ctx=17

changelog:
v1:
- first version by Steve Grubb <[email protected]>
Link: https://lore.kernel.org/r/2042449.irdbgypaU6@x2

v2:
- enhancements suggested by Jan Kara <[email protected]>
- 1/3 change %d to %u in pr_debug
- 2/3 change response from __u32 to __u16
- mod struct fanotify_response and fanotify_perm_event add extra_info_type, extra_info_buf
- extra_info_buf size max FANOTIFY_MAX_RESPONSE_EXTRA_LEN, add struct fanotify_response_audit_rule
- extend debug statements
- remove unneeded macros
- [internal] change interface to finish_permission_event() and process_access_response()
- 3/3 update format of extra information
- [internal] change interface to audit_fanotify()
- change ctx_type= to fan_type=
Link: https://lore.kernel.org/r/[email protected]

Richard Guy Briggs (3):
fanotify: Ensure consistent variable type for response
fanotify: define struct members to hold response decision context
fanotify: Allow audit to use the full permission event response

fs/notify/fanotify/fanotify.c | 5 ++-
fs/notify/fanotify/fanotify.h | 4 +-
fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
include/linux/audit.h | 8 ++--
include/linux/fanotify.h | 3 ++
include/uapi/linux/fanotify.h | 27 +++++++++++++-
kernel/auditsc.c | 18 +++++++--
7 files changed, 94 insertions(+), 30 deletions(-)

--
2.27.0


2022-04-29 22:01:25

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

This patch adds 2 structure members to the response returned from user
space on a permission event. The first field is 16 bits for the context
type. The context type will describe what the meaning is of the second
field. The default is none. The patch defines one additional context
type which means that the second field is a 32-bit rule number. This
will allow for the creation of other context types in the future if
other users of the API identify different needs. The second field size
is defined by the context type and can be used to pass along the data
described by the context.

To support this, there is a macro for user space to check that the data
being sent is valid. Of course, without this check, anything that
overflows the bit field will trigger an EINVAL based on the use of
FAN_INVALID_RESPONSE_MASK in process_access_response().

Suggested-by: Steve Grubb <[email protected]>
Link: https://lore.kernel.org/r/2745105.e9J7NaK4W3@x2
Suggested-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Richard Guy Briggs <[email protected]>
Link: https://lore.kernel.org/r/17660b3f2817e5c0a19d1e9e5d40b53ff4561845.1651174324.git.rgb@redhat.com
---
fs/notify/fanotify/fanotify.c | 1 -
fs/notify/fanotify/fanotify.h | 4 +-
fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
include/linux/fanotify.h | 3 ++
include/uapi/linux/fanotify.h | 27 +++++++++++++-
5 files changed, 72 insertions(+), 22 deletions(-)

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 985e995d2a39..00aff6e29bf8 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -266,7 +266,6 @@ static int fanotify_get_response(struct fsnotify_group *group,
case FAN_ALLOW:
ret = 0;
break;
- case FAN_DENY:
default:
ret = -EPERM;
}
diff --git a/fs/notify/fanotify/fanotify.h b/fs/notify/fanotify/fanotify.h
index 70acfd497771..87d643deabce 100644
--- a/fs/notify/fanotify/fanotify.h
+++ b/fs/notify/fanotify/fanotify.h
@@ -425,9 +425,11 @@ FANOTIFY_PE(struct fanotify_event *event)
struct fanotify_perm_event {
struct fanotify_event fae;
struct path path;
- __u32 response; /* userspace answer to the event */
+ __u16 response; /* userspace answer to the event */
+ __u16 extra_info_type;
unsigned short state; /* state of the event */
int fd; /* fd we passed to userspace for this event */
+ char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
};

static inline struct fanotify_perm_event *
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 694516470660..f1ff4cf683fb 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -289,13 +289,19 @@ static int create_fd(struct fsnotify_group *group, struct path *path,
*/
static void finish_permission_event(struct fsnotify_group *group,
struct fanotify_perm_event *event,
- __u32 response)
+ struct fanotify_response *response)
__releases(&group->notification_lock)
{
bool destroy = false;

assert_spin_locked(&group->notification_lock);
- event->response = response;
+ event->response = response->response;
+ event->extra_info_type = response->extra_info_type;
+ switch (event->extra_info_type) {
+ case FAN_RESPONSE_INFO_AUDIT_RULE:
+ memcpy(event->extra_info_buf, response->extra_info_buf,
+ sizeof(struct fanotify_response_audit_rule));
+ }
if (event->state == FAN_EVENT_CANCELED)
destroy = true;
else
@@ -306,22 +312,29 @@ static void finish_permission_event(struct fsnotify_group *group,
}

static int process_access_response(struct fsnotify_group *group,
- struct fanotify_response *response_struct)
+ struct fanotify_response *response_struct,
+ size_t count)
{
struct fanotify_perm_event *event;
int fd = response_struct->fd;
- __u32 response = response_struct->response;
+ __u16 response = response_struct->response;

- pr_debug("%s: group=%p fd=%d response=%u\n", __func__, group,
- fd, response);
+ pr_debug("%s: group=%p fd=%d response=%u type=%u size=%lu\n", __func__,
+ group, fd, response, response_struct->extra_info_type, count);
/*
* make sure the response is valid, if invalid we do nothing and either
* userspace can send a valid response or we will clean it up after the
* timeout
*/
- switch (response & ~FAN_AUDIT) {
- case FAN_ALLOW:
- case FAN_DENY:
+ if (FAN_INVALID_RESPONSE_MASK(response))
+ return -EINVAL;
+ switch (response_struct->extra_info_type) {
+ case FAN_RESPONSE_INFO_AUDIT_NONE:
+ break;
+ case FAN_RESPONSE_INFO_AUDIT_RULE:
+ if (count < offsetof(struct fanotify_response, extra_info_buf)
+ + sizeof(struct fanotify_response_audit_rule))
+ return -EINVAL;
break;
default:
return -EINVAL;
@@ -340,7 +353,7 @@ static int process_access_response(struct fsnotify_group *group,
continue;

list_del_init(&event->fae.fse.list);
- finish_permission_event(group, event, response);
+ finish_permission_event(group, event, response_struct);
wake_up(&group->fanotify_data.access_waitq);
return 0;
}
@@ -802,9 +815,14 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
fsnotify_destroy_event(group, &event->fse);
} else {
if (ret <= 0) {
+ struct fanotify_response response = {
+ .fd = FAN_NOFD,
+ .response = FAN_DENY,
+ .extra_info_type = FAN_RESPONSE_INFO_AUDIT_NONE };
+
spin_lock(&group->notification_lock);
finish_permission_event(group,
- FANOTIFY_PERM(event), FAN_DENY);
+ FANOTIFY_PERM(event), &response);
wake_up(&group->fanotify_data.access_waitq);
} else {
spin_lock(&group->notification_lock);
@@ -827,26 +845,25 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,

static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
{
- struct fanotify_response response = { .fd = -1, .response = -1 };
+ struct fanotify_response response;
struct fsnotify_group *group;
int ret;
+ size_t size = min(count, sizeof(struct fanotify_response));

if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
return -EINVAL;

group = file->private_data;

- if (count < sizeof(response))
+ if (count < offsetof(struct fanotify_response, extra_info_buf))
return -EINVAL;

- count = sizeof(response);
-
pr_debug("%s: group=%p count=%zu\n", __func__, group, count);

- if (copy_from_user(&response, buf, count))
+ if (copy_from_user(&response, buf, size))
return -EFAULT;

- ret = process_access_response(group, &response);
+ ret = process_access_response(group, &response, count);
if (ret < 0)
count = ret;

@@ -857,6 +874,10 @@ static int fanotify_release(struct inode *ignored, struct file *file)
{
struct fsnotify_group *group = file->private_data;
struct fsnotify_event *fsn_event;
+ struct fanotify_response response = {
+ .fd = FAN_NOFD,
+ .response = FAN_ALLOW,
+ .extra_info_type = FAN_RESPONSE_INFO_AUDIT_NONE };

/*
* Stop new events from arriving in the notification queue. since
@@ -876,7 +897,7 @@ static int fanotify_release(struct inode *ignored, struct file *file)
event = list_first_entry(&group->fanotify_data.access_list,
struct fanotify_perm_event, fae.fse.list);
list_del_init(&event->fae.fse.list);
- finish_permission_event(group, event, FAN_ALLOW);
+ finish_permission_event(group, event, &response);
spin_lock(&group->notification_lock);
}

@@ -893,7 +914,7 @@ static int fanotify_release(struct inode *ignored, struct file *file)
fsnotify_destroy_event(group, fsn_event);
} else {
finish_permission_event(group, FANOTIFY_PERM(event),
- FAN_ALLOW);
+ &response);
}
spin_lock(&group->notification_lock);
}
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 419cadcd7ff5..dc3722749d52 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -113,6 +113,9 @@
#define ALL_FANOTIFY_EVENT_BITS (FANOTIFY_OUTGOING_EVENTS | \
FANOTIFY_EVENT_FLAGS)

+/* This mask is to check for invalid bits of a user space permission response */
+#define FAN_INVALID_RESPONSE_MASK(x) ((x) & ~(FAN_ALLOW | FAN_DENY | FAN_AUDIT))
+
/* Do not use these old uapi constants internally */
#undef FAN_ALL_CLASS_BITS
#undef FAN_ALL_INIT_FLAGS
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index e8ac38cc2fd6..efb5a3a6f814 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -179,9 +179,34 @@ struct fanotify_event_info_error {
__u32 error_count;
};

+/*
+ * User space may need to record additional information about its decision.
+ * The extra information type records what kind of information is included.
+ * The default is none. We also define an extra informaion buffer whose
+ * size is determined by the extra information type.
+ *
+ * If the context type is Rule, then the context following is the rule number
+ * that triggered the user space decision.
+ */
+
+#define FAN_RESPONSE_INFO_AUDIT_NONE 0
+#define FAN_RESPONSE_INFO_AUDIT_RULE 1
+
+struct fanotify_response_audit_rule {
+ __u32 rule;
+};
+
+#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
+ (sizeof(union { \
+ struct fanotify_response_audit_rule r; \
+ /* add other extra info structures here */ \
+ }))
+
struct fanotify_response {
__s32 fd;
- __u32 response;
+ __u16 response;
+ __u16 extra_info_type;
+ char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
};

/* Legit userspace responses to a _PERM event */
--
2.27.0

2022-05-02 09:02:05

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] fanotify: Allow user space to pass back additional audit info

On 2022-04-28 20:44, Richard Guy Briggs wrote:
> The Fanotify API can be used for access control by requesting permission
> event notification. The user space tooling that uses it may have a
> complicated policy that inherently contains additional context for the
> decision. If this information were available in the audit trail, policy
> writers can close the loop on debugging policy. Also, if this additional
> information were available, it would enable the creation of tools that
> can suggest changes to the policy similar to how audit2allow can help
> refine labeled security.
>
> This patch defines 2 additional fields within the response structure
> returned from user space on a permission event. The first field is 16
> bits for the context type. The context type will describe what the
> meaning is of the second field. The audit system will separate the
> pieces and log them individually.
>
> The audit function was updated to log the additional information in the
> AUDIT_FANOTIFY record. The following is an example of the new record
> format:
>
> type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_ctx=17

It might have been a good idea to tag this as RFC... I have a few
questions:

1. Where did "resp=" come from? It isn't in the field dictionary. It
seems like a needless duplication of "res=". If it isn't, maybe it
should have a "fan_" namespace prefix and become "fan_res="?

2. It appears I'm ok changing the "__u32 response" to "__u16" without
breaking old userspace. Is this true on all arches?

3. What should be the action if response contains unknown flags or
types? Is it reasonable to return -EINVAL?

4. Currently, struct fanotify_response has a fixed size, but if future
types get defined that have variable buffer sizes, how would that be
communicated or encoded?

> changelog:
> v1:
> - first version by Steve Grubb <[email protected]>
> Link: https://lore.kernel.org/r/2042449.irdbgypaU6@x2
>
> v2:
> - enhancements suggested by Jan Kara <[email protected]>
> - 1/3 change %d to %u in pr_debug
> - 2/3 change response from __u32 to __u16
> - mod struct fanotify_response and fanotify_perm_event add extra_info_type, extra_info_buf
> - extra_info_buf size max FANOTIFY_MAX_RESPONSE_EXTRA_LEN, add struct fanotify_response_audit_rule
> - extend debug statements
> - remove unneeded macros
> - [internal] change interface to finish_permission_event() and process_access_response()
> - 3/3 update format of extra information
> - [internal] change interface to audit_fanotify()
> - change ctx_type= to fan_type=
> Link: https://lore.kernel.org/r/[email protected]
>
> Richard Guy Briggs (3):
> fanotify: Ensure consistent variable type for response
> fanotify: define struct members to hold response decision context
> fanotify: Allow audit to use the full permission event response
>
> fs/notify/fanotify/fanotify.c | 5 ++-
> fs/notify/fanotify/fanotify.h | 4 +-
> fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
> include/linux/audit.h | 8 ++--
> include/linux/fanotify.h | 3 ++
> include/uapi/linux/fanotify.h | 27 +++++++++++++-
> kernel/auditsc.c | 18 +++++++--
> 7 files changed, 94 insertions(+), 30 deletions(-)
>
> --
> 2.27.0
>

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2022-05-03 01:28:11

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] fanotify: Allow user space to pass back additional audit info

On Thu, Apr 28, 2022 at 8:55 PM Richard Guy Briggs <[email protected]> wrote:
> On 2022-04-28 20:44, Richard Guy Briggs wrote:
> > The Fanotify API can be used for access control by requesting permission
> > event notification. The user space tooling that uses it may have a
> > complicated policy that inherently contains additional context for the
> > decision. If this information were available in the audit trail, policy
> > writers can close the loop on debugging policy. Also, if this additional
> > information were available, it would enable the creation of tools that
> > can suggest changes to the policy similar to how audit2allow can help
> > refine labeled security.
> >
> > This patch defines 2 additional fields within the response structure
> > returned from user space on a permission event. The first field is 16
> > bits for the context type. The context type will describe what the
> > meaning is of the second field. The audit system will separate the
> > pieces and log them individually.
> >
> > The audit function was updated to log the additional information in the
> > AUDIT_FANOTIFY record. The following is an example of the new record
> > format:
> >
> > type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_ctx=17
>
> It might have been a good idea to tag this as RFC... I have a few
> questions:
>
> 1. Where did "resp=" come from?

According to the git log, it came from Steve Grubb via de8cd83e91bc
("audit: Record fanotify access control decisions"). Steve should
have known what he was doing with respect to field names so I'm
assuming he had a reason.

> It isn't in the field dictionary. It
> seems like a needless duplication of "res=". If it isn't, maybe it
> should have a "fan_" namespace prefix and become "fan_res="?

Regardless of what it should have been, it is "resp" now and we can't
really change it. As far as the field dictionary is concerned, while
we should document these fields, it is important to note that when the
dictionary conflicts with the kernel, the kernel wins by definition.

> 2. It appears I'm ok changing the "__u32 response" to "__u16" without
> breaking old userspace. Is this true on all arches?

I can't answer that for you, the fanotify folks will need to look at
that, but you likely already know that. While I haven't gone through
the entire patchset yet, if it was me I probably would have left
response as a u32 and just added the extra fields; you are already
increasing the size of fanotify_response so why bother with shrinking
an existing field?

> 3. What should be the action if response contains unknown flags or
> types? Is it reasonable to return -EINVAL?

Once again, not a fanotify expert, but EINVAL is intended for invalid
input so it seems like a reasonable choice.

> 4. Currently, struct fanotify_response has a fixed size, but if future
> types get defined that have variable buffer sizes, how would that be
> communicated or encoded?

If that is a concern, you should probably include a length field in
the structure before the variable length field. You can't put it
before fd or response, so it's really a question of before or after
your new extra_info_type; I might suggest *after* extra_info_type, but
that's just me.


--
paul-moore.com

2022-05-03 01:28:15

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
>
> This patch adds 2 structure members to the response returned from user
> space on a permission event. The first field is 16 bits for the context
> type. The context type will describe what the meaning is of the second
> field. The default is none. The patch defines one additional context
> type which means that the second field is a 32-bit rule number. This
> will allow for the creation of other context types in the future if
> other users of the API identify different needs. The second field size
> is defined by the context type and can be used to pass along the data
> described by the context.
>
> To support this, there is a macro for user space to check that the data
> being sent is valid. Of course, without this check, anything that
> overflows the bit field will trigger an EINVAL based on the use of
> FAN_INVALID_RESPONSE_MASK in process_access_response().
>
> Suggested-by: Steve Grubb <[email protected]>
> Link: https://lore.kernel.org/r/2745105.e9J7NaK4W3@x2
> Suggested-by: Jan Kara <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Richard Guy Briggs <[email protected]>
> Link: https://lore.kernel.org/r/17660b3f2817e5c0a19d1e9e5d40b53ff4561845.1651174324.git.rgb@redhat.com
> ---
> fs/notify/fanotify/fanotify.c | 1 -
> fs/notify/fanotify/fanotify.h | 4 +-
> fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
> include/linux/fanotify.h | 3 ++
> include/uapi/linux/fanotify.h | 27 +++++++++++++-
> 5 files changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 985e995d2a39..00aff6e29bf8 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -266,7 +266,6 @@ static int fanotify_get_response(struct fsnotify_group *group,
> case FAN_ALLOW:
> ret = 0;
> break;
> - case FAN_DENY:

I personally would drop this from the patch if it was me, it doesn't
change the behavior so it falls under the "noise" category, which
could be a problem considering the lack of response on the original
posting and this one. Small, focused patches have a better shot of
review/merging.

> default:
> ret = -EPERM;
> }

...

> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index 694516470660..f1ff4cf683fb 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -289,13 +289,19 @@ static int create_fd(struct fsnotify_group *group, struct path *path,
> */
> static void finish_permission_event(struct fsnotify_group *group,
> struct fanotify_perm_event *event,
> - __u32 response)
> + struct fanotify_response *response)
> __releases(&group->notification_lock)
> {
> bool destroy = false;
>
> assert_spin_locked(&group->notification_lock);
> - event->response = response;
> + event->response = response->response;
> + event->extra_info_type = response->extra_info_type;
> + switch (event->extra_info_type) {
> + case FAN_RESPONSE_INFO_AUDIT_RULE:
> + memcpy(event->extra_info_buf, response->extra_info_buf,
> + sizeof(struct fanotify_response_audit_rule));

Since the fanotify_perm_event:extra_info_buf and
fanotify_response:extra_info_buf are the same type/length, and they
will be the same regardless of the extra_info_type field, why not
simply get rid of the above switch statement and do something like
this:

memcpy(event->extra_info_buf, response->extra_info_buf,
sizeof(response->extra_info_buf));

> + }
> if (event->state == FAN_EVENT_CANCELED)
> destroy = true;
> else

...

> @@ -827,26 +845,25 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
>
> static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> {
> - struct fanotify_response response = { .fd = -1, .response = -1 };
> + struct fanotify_response response;
> struct fsnotify_group *group;
> int ret;
> + size_t size = min(count, sizeof(struct fanotify_response));
>
> if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> return -EINVAL;
>
> group = file->private_data;
>
> - if (count < sizeof(response))
> + if (count < offsetof(struct fanotify_response, extra_info_buf))
> return -EINVAL;

Is this why you decided to shrink the fanotify_response:response field
from 32-bits to 16-bits? I hope not. I would suggest both keeping
the existing response field as 32-bits and explicitly checking for
writes that are either the existing/compat length as well as the
newer, longer length.

> - count = sizeof(response);
> -
> pr_debug("%s: group=%p count=%zu\n", __func__, group, count);
>
> - if (copy_from_user(&response, buf, count))
> + if (copy_from_user(&response, buf, size))
> return -EFAULT;
>
> - ret = process_access_response(group, &response);
> + ret = process_access_response(group, &response, count);
> if (ret < 0)
> count = ret;
>

...

> diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> index e8ac38cc2fd6..efb5a3a6f814 100644
> --- a/include/uapi/linux/fanotify.h
> +++ b/include/uapi/linux/fanotify.h
> @@ -179,9 +179,34 @@ struct fanotify_event_info_error {
> __u32 error_count;
> };
>
> +/*
> + * User space may need to record additional information about its decision.
> + * The extra information type records what kind of information is included.
> + * The default is none. We also define an extra informaion buffer whose
> + * size is determined by the extra information type.
> + *
> + * If the context type is Rule, then the context following is the rule number
> + * that triggered the user space decision.
> + */
> +
> +#define FAN_RESPONSE_INFO_AUDIT_NONE 0
> +#define FAN_RESPONSE_INFO_AUDIT_RULE 1
> +
> +struct fanotify_response_audit_rule {
> + __u32 rule;
> +};
> +
> +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> + (sizeof(union { \
> + struct fanotify_response_audit_rule r; \
> + /* add other extra info structures here */ \
> + }))
> +
> struct fanotify_response {
> __s32 fd;
> - __u32 response;
> + __u16 response;
> + __u16 extra_info_type;
> + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> };

Since both the kernel and userspace are going to need to agree on the
content and formatting of the fanotify_response:extra_info_buf field,
why is it hidden behind a char array? You might as well get rid of
that abstraction and put the union directly in the fanotify_response
struct. It is possible you could also get rid of the
fanotify_response_audit_rule struct this way too and just access the
rule scalar directly.


--
paul-moore.com

2022-05-04 01:49:53

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] fanotify: Allow user space to pass back additional audit info

On 2022-05-02 20:16, Paul Moore wrote:
> On Thu, Apr 28, 2022 at 8:55 PM Richard Guy Briggs <[email protected]> wrote:
> > On 2022-04-28 20:44, Richard Guy Briggs wrote:
> > > The Fanotify API can be used for access control by requesting permission
> > > event notification. The user space tooling that uses it may have a
> > > complicated policy that inherently contains additional context for the
> > > decision. If this information were available in the audit trail, policy
> > > writers can close the loop on debugging policy. Also, if this additional
> > > information were available, it would enable the creation of tools that
> > > can suggest changes to the policy similar to how audit2allow can help
> > > refine labeled security.
> > >
> > > This patch defines 2 additional fields within the response structure
> > > returned from user space on a permission event. The first field is 16
> > > bits for the context type. The context type will describe what the
> > > meaning is of the second field. The audit system will separate the
> > > pieces and log them individually.
> > >
> > > The audit function was updated to log the additional information in the
> > > AUDIT_FANOTIFY record. The following is an example of the new record
> > > format:
> > >
> > > type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_ctx=17
> >
> > It might have been a good idea to tag this as RFC... I have a few
> > questions:
> >
> > 1. Where did "resp=" come from?
>
> According to the git log, it came from Steve Grubb via de8cd83e91bc
> ("audit: Record fanotify access control decisions"). Steve should
> have known what he was doing with respect to field names so I'm
> assuming he had a reason.
>
> > It isn't in the field dictionary. It
> > seems like a needless duplication of "res=". If it isn't, maybe it
> > should have a "fan_" namespace prefix and become "fan_res="?
>
> Regardless of what it should have been, it is "resp" now and we can't
> really change it. As far as the field dictionary is concerned, while
> we should document these fields, it is important to note that when the
> dictionary conflicts with the kernel, the kernel wins by definition.

Agree on all counts. It was an open-ended question. It is also moot
since it is even expected in the audit-testsuite and would break that if
it were changed.

> > 2. It appears I'm ok changing the "__u32 response" to "__u16" without
> > breaking old userspace. Is this true on all arches?
>
> I can't answer that for you, the fanotify folks will need to look at
> that, but you likely already know that. While I haven't gone through
> the entire patchset yet, if it was me I probably would have left
> response as a u32 and just added the extra fields; you are already
> increasing the size of fanotify_response so why bother with shrinking
> an existing field?

I was thinking of that, but chose to follow the lead of the fanotify
mainainer.

> > 3. What should be the action if response contains unknown flags or
> > types? Is it reasonable to return -EINVAL?
>
> Once again, not a fanotify expert, but EINVAL is intended for invalid
> input so it seems like a reasonable choice.

The choice of the error code wasn't in question but rather the need to
fail rather than ignore unknown flags.

> > 4. Currently, struct fanotify_response has a fixed size, but if future
> > types get defined that have variable buffer sizes, how would that be
> > communicated or encoded?
>
> If that is a concern, you should probably include a length field in
> the structure before the variable length field. You can't put it
> before fd or response, so it's really a question of before or after
> your new extra_info_type; I might suggest *after* extra_info_type, but
> that's just me.

After extra_info_type is what I was thinking. The other possibility is
that a type with a variable length field could define its data size as
the first field within the variable field as set out in the format of
that varible length field so that all the fixed length fields would not
need to waste that space or bandwidth.

Thanks for the feedback.

> paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2022-05-04 15:54:37

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On 2022-05-02 20:16, Paul Moore wrote:
> On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > This patch adds 2 structure members to the response returned from user
> > space on a permission event. The first field is 16 bits for the context
> > type. The context type will describe what the meaning is of the second
> > field. The default is none. The patch defines one additional context
> > type which means that the second field is a 32-bit rule number. This
> > will allow for the creation of other context types in the future if
> > other users of the API identify different needs. The second field size
> > is defined by the context type and can be used to pass along the data
> > described by the context.
> >
> > To support this, there is a macro for user space to check that the data
> > being sent is valid. Of course, without this check, anything that
> > overflows the bit field will trigger an EINVAL based on the use of
> > FAN_INVALID_RESPONSE_MASK in process_access_response().
> >
> > Suggested-by: Steve Grubb <[email protected]>
> > Link: https://lore.kernel.org/r/2745105.e9J7NaK4W3@x2
> > Suggested-by: Jan Kara <[email protected]>
> > Link: https://lore.kernel.org/r/[email protected]
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > Link: https://lore.kernel.org/r/17660b3f2817e5c0a19d1e9e5d40b53ff4561845.1651174324.git.rgb@redhat.com
> > ---
> > fs/notify/fanotify/fanotify.c | 1 -
> > fs/notify/fanotify/fanotify.h | 4 +-
> > fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
> > include/linux/fanotify.h | 3 ++
> > include/uapi/linux/fanotify.h | 27 +++++++++++++-
> > 5 files changed, 72 insertions(+), 22 deletions(-)
> >
> > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> > index 985e995d2a39..00aff6e29bf8 100644
> > --- a/fs/notify/fanotify/fanotify.c
> > +++ b/fs/notify/fanotify/fanotify.c
> > @@ -266,7 +266,6 @@ static int fanotify_get_response(struct fsnotify_group *group,
> > case FAN_ALLOW:
> > ret = 0;
> > break;
> > - case FAN_DENY:
>
> I personally would drop this from the patch if it was me, it doesn't
> change the behavior so it falls under the "noise" category, which
> could be a problem considering the lack of response on the original
> posting and this one. Small, focused patches have a better shot of
> review/merging.

It was a harmless part of the original patch, but I agree it should go.

> > default:
> > ret = -EPERM;
> > }
>
> ...
>
> > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > index 694516470660..f1ff4cf683fb 100644
> > --- a/fs/notify/fanotify/fanotify_user.c
> > +++ b/fs/notify/fanotify/fanotify_user.c
> > @@ -289,13 +289,19 @@ static int create_fd(struct fsnotify_group *group, struct path *path,
> > */
> > static void finish_permission_event(struct fsnotify_group *group,
> > struct fanotify_perm_event *event,
> > - __u32 response)
> > + struct fanotify_response *response)
> > __releases(&group->notification_lock)
> > {
> > bool destroy = false;
> >
> > assert_spin_locked(&group->notification_lock);
> > - event->response = response;
> > + event->response = response->response;
> > + event->extra_info_type = response->extra_info_type;
> > + switch (event->extra_info_type) {
> > + case FAN_RESPONSE_INFO_AUDIT_RULE:
> > + memcpy(event->extra_info_buf, response->extra_info_buf,
> > + sizeof(struct fanotify_response_audit_rule));
>
> Since the fanotify_perm_event:extra_info_buf and
> fanotify_response:extra_info_buf are the same type/length, and they
> will be the same regardless of the extra_info_type field, why not
> simply get rid of the above switch statement and do something like
> this:
>
> memcpy(event->extra_info_buf, response->extra_info_buf,
> sizeof(response->extra_info_buf));

I've been wrestling with the possibility of doing a split between what
is presented to userspace and what's used in the kernel for struct
fanotify_response, while attempting to future-proof it.

At the moment, since the extra_info_buf is either zero or has a fixed
size for the "RULE" type, it seemed to be most efficient to do a static
allocation on the stack upon entry into fanotify_write() that was
only 4 octets more than the type "NONE" case. Later, if a new type has
a variable extra_info_buf size, it can be internally allocated
dynamically.

> > + }
> > if (event->state == FAN_EVENT_CANCELED)
> > destroy = true;
> > else
>
> ...
>
> > @@ -827,26 +845,25 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
> >
> > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > {
> > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > + struct fanotify_response response;
> > struct fsnotify_group *group;
> > int ret;
> > + size_t size = min(count, sizeof(struct fanotify_response));
> >
> > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > return -EINVAL;
> >
> > group = file->private_data;
> >
> > - if (count < sizeof(response))
> > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > return -EINVAL;
>
> Is this why you decided to shrink the fanotify_response:response field
> from 32-bits to 16-bits? I hope not. I would suggest both keeping
> the existing response field as 32-bits and explicitly checking for
> writes that are either the existing/compat length as well as the
> newer, longer length.

No. I shrank it at Jan's suggestion. I think I agree with you that
the response field should be kept at u32 as it is defined in userspace
and purge the doubt about what would happen with a new userspace with
an old kernel.

> > - count = sizeof(response);
> > -
> > pr_debug("%s: group=%p count=%zu\n", __func__, group, count);
> >
> > - if (copy_from_user(&response, buf, count))
> > + if (copy_from_user(&response, buf, size))
> > return -EFAULT;
> >
> > - ret = process_access_response(group, &response);
> > + ret = process_access_response(group, &response, count);
> > if (ret < 0)
> > count = ret;
> >
>
> ...
>
> > diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> > index e8ac38cc2fd6..efb5a3a6f814 100644
> > --- a/include/uapi/linux/fanotify.h
> > +++ b/include/uapi/linux/fanotify.h
> > @@ -179,9 +179,34 @@ struct fanotify_event_info_error {
> > __u32 error_count;
> > };
> >
> > +/*
> > + * User space may need to record additional information about its decision.
> > + * The extra information type records what kind of information is included.
> > + * The default is none. We also define an extra informaion buffer whose
> > + * size is determined by the extra information type.
> > + *
> > + * If the context type is Rule, then the context following is the rule number
> > + * that triggered the user space decision.
> > + */
> > +
> > +#define FAN_RESPONSE_INFO_AUDIT_NONE 0
> > +#define FAN_RESPONSE_INFO_AUDIT_RULE 1
> > +
> > +struct fanotify_response_audit_rule {
> > + __u32 rule;
> > +};
> > +
> > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > + (sizeof(union { \
> > + struct fanotify_response_audit_rule r; \
> > + /* add other extra info structures here */ \
> > + }))
> > +
> > struct fanotify_response {
> > __s32 fd;
> > - __u32 response;
> > + __u16 response;
> > + __u16 extra_info_type;
> > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > };
>
> Since both the kernel and userspace are going to need to agree on the
> content and formatting of the fanotify_response:extra_info_buf field,
> why is it hidden behind a char array? You might as well get rid of
> that abstraction and put the union directly in the fanotify_response
> struct. It is possible you could also get rid of the
> fanotify_response_audit_rule struct this way too and just access the
> rule scalar directly.

This does make sense and my only concern would be a variable-length
type. There isn't any reason to hide it. If userspace chooses to use
the old interface and omit the type field then it defaults to NONE.

If future types with variable data are defined, the first field could be
a u32 that unions with the rule number that won't change the struct
size.

Thanks for the feedback.

> paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


2022-05-04 18:18:37

by Steve Grubb

[permalink] [raw]
Subject: Re: [PATCH v2 0/3] fanotify: Allow user space to pass back additional audit info

On Thursday, April 28, 2022 8:55:33 PM EDT Richard Guy Briggs wrote:
> On 2022-04-28 20:44, Richard Guy Briggs wrote:
> > The Fanotify API can be used for access control by requesting permission
> > event notification. The user space tooling that uses it may have a
> > complicated policy that inherently contains additional context for the
> > decision. If this information were available in the audit trail, policy
> > writers can close the loop on debugging policy. Also, if this additional
> > information were available, it would enable the creation of tools that
> > can suggest changes to the policy similar to how audit2allow can help
> > refine labeled security.
> >
> > This patch defines 2 additional fields within the response structure
> > returned from user space on a permission event. The first field is 16
> > bits for the context type. The context type will describe what the
> > meaning is of the second field. The audit system will separate the
> > pieces and log them individually.
> >
> > The audit function was updated to log the additional information in the
> > AUDIT_FANOTIFY record. The following is an example of the new record
> > format:
> >
> > type=FANOTIFY msg=audit(1600385147.372:590): resp=2 fan_type=1 fan_ctx=17
>
> It might have been a good idea to tag this as RFC... I have a few
> questions:
>
> 1. Where did "resp=" come from?

This is an abbreviation for the response field of struct fanotify_response. I
wanted to keep it short to save bytes.

> It isn't in the field dictionary. It seems like a needless duplication of
> "res=". If it isn't, maybe it should have a "fan_" namespace prefix and
> become "fan_res="?

At this point it's been interpretted for years.

> 2. It appears I'm ok changing the "__u32 response" to "__u16" without
> breaking old userspace. Is this true on all arches?

If done carefully. Old user space won't try to use the new capabilities. The
only trick is new user space/old kernel.

> 3. What should be the action if response contains unknown flags or
> types? Is it reasonable to return -EINVAL?

The original patch was designed to allow the response metadata to take on
various different meanings based on new usage. The original patch only defined
a rule order numbering which if something else wanted to send it's own
metadata about a decision, a new patch could layer on top of this without
interfering.

If this is an access decision that is rejected with EINVAL (and assuming
future decisions will also be formed the same way), the whole system is
getting ready to lock up - even though a real answer is in the response.

> 4. Currently, struct fanotify_response has a fixed size, but if future
> types get defined that have variable buffer sizes, how would that be
> communicated or encoded?

I hadn't considered that as it would be too much of a change that I would be
uncomfortable doing. That could be a future evolution if it's ever needed.

-Steve



2022-05-06 04:26:08

by Steve Grubb

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

Hello Jan,

On Thursday, May 5, 2022 10:44:56 AM EDT Jan Kara wrote:
> On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> > On 2022-05-02 20:16, Paul Moore wrote:
> > > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]>
wrote:
> > > > This patch adds 2 structure members to the response returned from
> > > > user
> > > > space on a permission event. The first field is 16 bits for the
> > > > context
> > > > type. The context type will describe what the meaning is of the
> > > > second
> > > > field. The default is none. The patch defines one additional context
> > > > type which means that the second field is a 32-bit rule number. This
> > > > will allow for the creation of other context types in the future if
> > > > other users of the API identify different needs. The second field
> > > > size
> > > > is defined by the context type and can be used to pass along the data
> > > > described by the context.
> > > >
> > > > To support this, there is a macro for user space to check that the
> > > > data
> > > > being sent is valid. Of course, without this check, anything that
> > > > overflows the bit field will trigger an EINVAL based on the use of
> > > > FAN_INVALID_RESPONSE_MASK in process_access_response().
>
> ...
>
> > > > static ssize_t fanotify_write(struct file *file, const char __user
> > > > *buf, size_t count, loff_t *pos) {
> > > >
> > > > - struct fanotify_response response = { .fd = -1, .response =
> > > > -1 };
> > > > + struct fanotify_response response;
> > > >
> > > > struct fsnotify_group *group;
> > > > int ret;
> > > >
> > > > + size_t size = min(count, sizeof(struct fanotify_response));
> > > >
> > > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > >
> > > > return -EINVAL;
> > > >
> > > > group = file->private_data;
> > > >
> > > > - if (count < sizeof(response))
> > > > + if (count < offsetof(struct fanotify_response,
> > > > extra_info_buf))
> > > >
> > > > return -EINVAL;
> > >
> > > Is this why you decided to shrink the fanotify_response:response field
> > > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > > the existing response field as 32-bits and explicitly checking for
> > > writes that are either the existing/compat length as well as the
> > > newer, longer length.
> >
> > No. I shrank it at Jan's suggestion. I think I agree with you that
> > the response field should be kept at u32 as it is defined in userspace
> > and purge the doubt about what would happen with a new userspace with
> > an old kernel.
>
> Hum, for the life of me I cannot find my response you mention here. Can you
> send a link so that I can refresh my memory? It has been a long time...

It was this thread:

https://marc.info/?t=160148236400005&r=1&w=2

-Steve

> > > > +
> > > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > > + (sizeof(union { \
> > > > + struct fanotify_response_audit_rule r; \
> > > > + /* add other extra info structures here */ \
> > > > + }))
> > > > +
> > > >
> > > > struct fanotify_response {
> > > >
> > > > __s32 fd;
> > > >
> > > > - __u32 response;
> > > > + __u16 response;
> > > > + __u16 extra_info_type;
> > > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > >
> > > > };
> > >
> > > Since both the kernel and userspace are going to need to agree on the
> > > content and formatting of the fanotify_response:extra_info_buf field,
> > > why is it hidden behind a char array? You might as well get rid of
> > > that abstraction and put the union directly in the fanotify_response
> > > struct. It is possible you could also get rid of the
> > > fanotify_response_audit_rule struct this way too and just access the
> > > rule scalar directly.
> >
> > This does make sense and my only concern would be a variable-length
> > type. There isn't any reason to hide it. If userspace chooses to use
> > the old interface and omit the type field then it defaults to NONE.
> >
> > If future types with variable data are defined, the first field could be
> > a u32 that unions with the rule number that won't change the struct
> > size.
>
> Struct fanotify_response size must not change, it is part of the kernel
> ABI. In particular your above change would break userspace code that is
> currently working just fine (e.g. allocating 8 bytes and expecting struct
> fanotify_response fits there, or just writing sizeof(struct
> fanotify_response) as a response while initializing only first 8 bytes).
> How I'd suggest doing it now (and I'd like to refresh my memory from my
> past emails you mention because in the past I might have thought something
> else ;)) is that you add another flag to 'response' field similar to
> FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
> to be expected after struct fanotify_response. The extra info would always
> start with a header like:
>
> struct fanotify_response_info_header {
> __u8 info_type;
> __u8 pad;
> __u16 len; /* This is including the header itself */
> }
>
> And after such header there would be the 'blob' of data 'len - header size'
> long. We use this same scheme when passing fanotify events to userspace
> and it has proven to be lightweight and extensible. It covers the
> situation when in the future audit would decide it wants other data (just
> change data type), it would also cover the situation when some other
> subsystem wants its information passed as well - there can be more
> structures like this attached at the end, we can process the response up
> to the length of the write.
>
> Now these are just possible future extensions making sure we can extend the
> ABI without too much pain. In the current implementation I'd just return
> EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
> and do very strict checks on what gets passed in. It is also trivially
> backwards compatible (old userspace on new kernel works just fine).
>
> If you want to achieve compatibility of running new userspace on old kernel
> (I guess that's desirable), we have group flags for that - like we
> introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
> need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
> should expect an allow more info returning for permission events. At the
> same time this is the way for userspace to be able to tell whether the
> kernel supports this. I know this sounds tedious but that's the cost of
> extending the ABI in the compatible way. We've made various API mistakes
> in the past having to add weird workarounds to fanotify and we don't want
> to repeat those mistakes :).
>
> One open question I have is what should the kernel do with 'info_type' in
> response it does not understand (in the future when there are possibly more
> different info types). It could just skip it because this should be just
> additional info for introspection (the only mandatory part is in
> fanotify_response, however it could surprise userspace that passed info is
> just getting ignored. To solve this we would have to somewhere report
> supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> cross that bridge when we get to it.
>
> Amir, what do you think?
>
> Honza





2022-05-06 11:34:29

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> On 2022-05-02 20:16, Paul Moore wrote:
> > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > > This patch adds 2 structure members to the response returned from user
> > > space on a permission event. The first field is 16 bits for the context
> > > type. The context type will describe what the meaning is of the second
> > > field. The default is none. The patch defines one additional context
> > > type which means that the second field is a 32-bit rule number. This
> > > will allow for the creation of other context types in the future if
> > > other users of the API identify different needs. The second field size
> > > is defined by the context type and can be used to pass along the data
> > > described by the context.
> > >
> > > To support this, there is a macro for user space to check that the data
> > > being sent is valid. Of course, without this check, anything that
> > > overflows the bit field will trigger an EINVAL based on the use of
> > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > >

...

> > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > {
> > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > + struct fanotify_response response;
> > > struct fsnotify_group *group;
> > > int ret;
> > > + size_t size = min(count, sizeof(struct fanotify_response));
> > >
> > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > return -EINVAL;
> > >
> > > group = file->private_data;
> > >
> > > - if (count < sizeof(response))
> > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > return -EINVAL;
> >
> > Is this why you decided to shrink the fanotify_response:response field
> > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > the existing response field as 32-bits and explicitly checking for
> > writes that are either the existing/compat length as well as the
> > newer, longer length.
>
> No. I shrank it at Jan's suggestion. I think I agree with you that
> the response field should be kept at u32 as it is defined in userspace
> and purge the doubt about what would happen with a new userspace with
> an old kernel.

Hum, for the life of me I cannot find my response you mention here. Can you
send a link so that I can refresh my memory? It has been a long time...

> > > +
> > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > + (sizeof(union { \
> > > + struct fanotify_response_audit_rule r; \
> > > + /* add other extra info structures here */ \
> > > + }))
> > > +
> > > struct fanotify_response {
> > > __s32 fd;
> > > - __u32 response;
> > > + __u16 response;
> > > + __u16 extra_info_type;
> > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > };
> >
> > Since both the kernel and userspace are going to need to agree on the
> > content and formatting of the fanotify_response:extra_info_buf field,
> > why is it hidden behind a char array? You might as well get rid of
> > that abstraction and put the union directly in the fanotify_response
> > struct. It is possible you could also get rid of the
> > fanotify_response_audit_rule struct this way too and just access the
> > rule scalar directly.
>
> This does make sense and my only concern would be a variable-length
> type. There isn't any reason to hide it. If userspace chooses to use
> the old interface and omit the type field then it defaults to NONE.
>
> If future types with variable data are defined, the first field could be
> a u32 that unions with the rule number that won't change the struct
> size.

Struct fanotify_response size must not change, it is part of the kernel
ABI. In particular your above change would break userspace code that is
currently working just fine (e.g. allocating 8 bytes and expecting struct
fanotify_response fits there, or just writing sizeof(struct
fanotify_response) as a response while initializing only first 8 bytes).
How I'd suggest doing it now (and I'd like to refresh my memory from my
past emails you mention because in the past I might have thought something
else ;)) is that you add another flag to 'response' field similar to
FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
to be expected after struct fanotify_response. The extra info would always
start with a header like:

struct fanotify_response_info_header {
__u8 info_type;
__u8 pad;
__u16 len; /* This is including the header itself */
}

And after such header there would be the 'blob' of data 'len - header size'
long. We use this same scheme when passing fanotify events to userspace
and it has proven to be lightweight and extensible. It covers the situation
when in the future audit would decide it wants other data (just change data
type), it would also cover the situation when some other subsystem wants
its information passed as well - there can be more structures like this
attached at the end, we can process the response up to the length of the
write.

Now these are just possible future extensions making sure we can extend the
ABI without too much pain. In the current implementation I'd just return
EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
and do very strict checks on what gets passed in. It is also trivially
backwards compatible (old userspace on new kernel works just fine).

If you want to achieve compatibility of running new userspace on old kernel
(I guess that's desirable), we have group flags for that - like we
introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
should expect an allow more info returning for permission events. At the
same time this is the way for userspace to be able to tell whether the
kernel supports this. I know this sounds tedious but that's the cost of
extending the ABI in the compatible way. We've made various API mistakes in
the past having to add weird workarounds to fanotify and we don't want to
repeat those mistakes :).

One open question I have is what should the kernel do with 'info_type' in
response it does not understand (in the future when there are possibly more
different info types). It could just skip it because this should be just
additional info for introspection (the only mandatory part is in
fanotify_response, however it could surprise userspace that passed info is
just getting ignored. To solve this we would have to somewhere report
supported info types (maybe in fanotify fdinfo in proc). I guess we'll
cross that bridge when we get to it.

Amir, what do you think?

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-05-09 04:41:43

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On 2022-05-05 16:44, Jan Kara wrote:
> On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> > On 2022-05-02 20:16, Paul Moore wrote:
> > > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > > > This patch adds 2 structure members to the response returned from user
> > > > space on a permission event. The first field is 16 bits for the context
> > > > type. The context type will describe what the meaning is of the second
> > > > field. The default is none. The patch defines one additional context
> > > > type which means that the second field is a 32-bit rule number. This
> > > > will allow for the creation of other context types in the future if
> > > > other users of the API identify different needs. The second field size
> > > > is defined by the context type and can be used to pass along the data
> > > > described by the context.
> > > >
> > > > To support this, there is a macro for user space to check that the data
> > > > being sent is valid. Of course, without this check, anything that
> > > > overflows the bit field will trigger an EINVAL based on the use of
> > > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > > >
>
> ...
>
> > > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > > {
> > > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > > + struct fanotify_response response;
> > > > struct fsnotify_group *group;
> > > > int ret;
> > > > + size_t size = min(count, sizeof(struct fanotify_response));
> > > >
> > > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > > return -EINVAL;
> > > >
> > > > group = file->private_data;
> > > >
> > > > - if (count < sizeof(response))
> > > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > > return -EINVAL;
> > >
> > > Is this why you decided to shrink the fanotify_response:response field
> > > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > > the existing response field as 32-bits and explicitly checking for
> > > writes that are either the existing/compat length as well as the
> > > newer, longer length.
> >
> > No. I shrank it at Jan's suggestion. I think I agree with you that
> > the response field should be kept at u32 as it is defined in userspace
> > and purge the doubt about what would happen with a new userspace with
> > an old kernel.
>
> Hum, for the life of me I cannot find my response you mention here. Can you
> send a link so that I can refresh my memory? It has been a long time...

https://listman.redhat.com/archives/linux-audit/2020-October/017066.html
https://listman.redhat.com/archives/linux-audit/2020-October/017067.html

> > > > +
> > > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > > + (sizeof(union { \
> > > > + struct fanotify_response_audit_rule r; \
> > > > + /* add other extra info structures here */ \
> > > > + }))
> > > > +
> > > > struct fanotify_response {
> > > > __s32 fd;
> > > > - __u32 response;
> > > > + __u16 response;
> > > > + __u16 extra_info_type;
> > > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > > };
> > >
> > > Since both the kernel and userspace are going to need to agree on the
> > > content and formatting of the fanotify_response:extra_info_buf field,
> > > why is it hidden behind a char array? You might as well get rid of
> > > that abstraction and put the union directly in the fanotify_response
> > > struct. It is possible you could also get rid of the
> > > fanotify_response_audit_rule struct this way too and just access the
> > > rule scalar directly.
> >
> > This does make sense and my only concern would be a variable-length
> > type. There isn't any reason to hide it. If userspace chooses to use
> > the old interface and omit the type field then it defaults to NONE.
> >
> > If future types with variable data are defined, the first field could be
> > a u32 that unions with the rule number that won't change the struct
> > size.
>
> Struct fanotify_response size must not change, it is part of the kernel
> ABI. In particular your above change would break userspace code that is
> currently working just fine (e.g. allocating 8 bytes and expecting struct
> fanotify_response fits there, or just writing sizeof(struct
> fanotify_response) as a response while initializing only first 8 bytes).

Many kernel ABIs have been expanded without breaking them.

Is it reasonable for a userspace program to use a kernel structure
without also using its size for allocation and initialization?

> How I'd suggest doing it now (and I'd like to refresh my memory from my
> past emails you mention because in the past I might have thought something
> else ;)) is that you add another flag to 'response' field similar to
> FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
> to be expected after struct fanotify_response.

That's an interesting possibility... I'm trying to predict if that
would be a problem for an old kernel... In process_access_response() it
would fallthrough to the default case of -EINVAL but do so safely.

> The extra info would always start with a header like:
>
> struct fanotify_response_info_header {
> __u8 info_type;
> __u8 pad;
> __u16 len; /* This is including the header itself */
> }
>
> And after such header there would be the 'blob' of data 'len - header size'
> long. We use this same scheme when passing fanotify events to userspace
> and it has proven to be lightweight and extensible. It covers the situation
> when in the future audit would decide it wants other data (just change data
> type), it would also cover the situation when some other subsystem wants
> its information passed as well - there can be more structures like this
> attached at the end, we can process the response up to the length of the
> write.

This reminds me of the RFC2367 (PF_KEYv2, why is that still right there
at the tip of my fingers?)

> Now these are just possible future extensions making sure we can extend the
> ABI without too much pain. In the current implementation I'd just return
> EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
> and do very strict checks on what gets passed in. It is also trivially
> backwards compatible (old userspace on new kernel works just fine).

Old userspace on new kernel would work fine with this idea, the v2 patch
posted, or with leaving the response field of struct fanotify_response
as __u32.

> If you want to achieve compatibility of running new userspace on old kernel
> (I guess that's desirable), we have group flags for that - like we
> introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
> need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
> should expect an allow more info returning for permission events. At the
> same time this is the way for userspace to be able to tell whether the
> kernel supports this. I know this sounds tedious but that's the cost of
> extending the ABI in the compatible way. We've made various API mistakes in
> the past having to add weird workarounds to fanotify and we don't want to
> repeat those mistakes :).
>
> One open question I have is what should the kernel do with 'info_type' in
> response it does not understand (in the future when there are possibly more
> different info types). It could just skip it because this should be just
> additional info for introspection (the only mandatory part is in
> fanotify_response, however it could surprise userspace that passed info is
> just getting ignored. To solve this we would have to somewhere report
> supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> cross that bridge when we get to it.

Well, one possibility is to return -EINVAL to signal the kernel is out
of date.

> Amir, what do you think?
>
> Jan Kara <[email protected]>

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635


2022-05-09 05:09:34

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On Thu 05-05-22 20:34:06, Amir Goldstein wrote:
> > One open question I have is what should the kernel do with 'info_type' in
> > response it does not understand (in the future when there are possibly more
> > different info types). It could just skip it because this should be just
> > additional info for introspection (the only mandatory part is in
> > fanotify_response, however it could surprise userspace that passed info is
> > just getting ignored. To solve this we would have to somewhere report
> > supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> > cross that bridge when we get to it.
> >
> > Amir, what do you think?
>
> Regardless if and how we provide a way to enumerate supported info types,
> I would prefer to reject (EINVAL) unknown info types.

OK, agreed. I will be also calmer when we do that because then we can be
certain userspace does not pass bogus data for unknown info types.

> We can provide a command FAN_RESPONSE_TEST to write a test response with
> FAN_NOFD and some extra info so the program can test if certain info
> types are supported.

Hum, that would be an option as well. We don't even need the
FAN_RESPONSE_TEST command, do we? The write to fanotify fd for FAN_NOFD fd
would just perform validation of the response and either accept it (do
nothing) or return EINVAL.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-05-09 07:32:31

by Paul Moore

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On Tue, May 3, 2022 at 9:33 PM Richard Guy Briggs <[email protected]> wrote:
> On 2022-05-02 20:16, Paul Moore wrote:
> > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > > This patch adds 2 structure members to the response returned from user
> > > space on a permission event. The first field is 16 bits for the context
> > > type. The context type will describe what the meaning is of the second
> > > field. The default is none. The patch defines one additional context
> > > type which means that the second field is a 32-bit rule number. This
> > > will allow for the creation of other context types in the future if
> > > other users of the API identify different needs. The second field size
> > > is defined by the context type and can be used to pass along the data
> > > described by the context.
> > >
> > > To support this, there is a macro for user space to check that the data
> > > being sent is valid. Of course, without this check, anything that
> > > overflows the bit field will trigger an EINVAL based on the use of
> > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > >
> > > Suggested-by: Steve Grubb <[email protected]>
> > > Link: https://lore.kernel.org/r/2745105.e9J7NaK4W3@x2
> > > Suggested-by: Jan Kara <[email protected]>
> > > Link: https://lore.kernel.org/r/[email protected]
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > Link: https://lore.kernel.org/r/17660b3f2817e5c0a19d1e9e5d40b53ff4561845.1651174324.git.rgb@redhat.com
> > > ---
> > > fs/notify/fanotify/fanotify.c | 1 -
> > > fs/notify/fanotify/fanotify.h | 4 +-
> > > fs/notify/fanotify/fanotify_user.c | 59 ++++++++++++++++++++----------
> > > include/linux/fanotify.h | 3 ++
> > > include/uapi/linux/fanotify.h | 27 +++++++++++++-
> > > 5 files changed, 72 insertions(+), 22 deletions(-)

...

> > > diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> > > index 694516470660..f1ff4cf683fb 100644
> > > --- a/fs/notify/fanotify/fanotify_user.c
> > > +++ b/fs/notify/fanotify/fanotify_user.c
> > > @@ -289,13 +289,19 @@ static int create_fd(struct fsnotify_group *group, struct path *path,
> > > */
> > > static void finish_permission_event(struct fsnotify_group *group,
> > > struct fanotify_perm_event *event,
> > > - __u32 response)
> > > + struct fanotify_response *response)
> > > __releases(&group->notification_lock)
> > > {
> > > bool destroy = false;
> > >
> > > assert_spin_locked(&group->notification_lock);
> > > - event->response = response;
> > > + event->response = response->response;
> > > + event->extra_info_type = response->extra_info_type;
> > > + switch (event->extra_info_type) {
> > > + case FAN_RESPONSE_INFO_AUDIT_RULE:
> > > + memcpy(event->extra_info_buf, response->extra_info_buf,
> > > + sizeof(struct fanotify_response_audit_rule));
> >
> > Since the fanotify_perm_event:extra_info_buf and
> > fanotify_response:extra_info_buf are the same type/length, and they
> > will be the same regardless of the extra_info_type field, why not
> > simply get rid of the above switch statement and do something like
> > this:
> >
> > memcpy(event->extra_info_buf, response->extra_info_buf,
> > sizeof(response->extra_info_buf));
>
> I've been wrestling with the possibility of doing a split between what
> is presented to userspace and what's used in the kernel for struct
> fanotify_response, while attempting to future-proof it.

You really only need to worry about what is presented to userspace,
the kernel internals can always change if needed. Right now I would
focus on making sure the userspace visible data structures are done
properly: preserve the existing data offsets/lengths, and ensure that
the new additions do not make it harder to extend the structure again
in the future.

> > > @@ -827,26 +845,25 @@ static ssize_t fanotify_read(struct file *file, char __user *buf,
> > >
> > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > {
> > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > + struct fanotify_response response;
> > > struct fsnotify_group *group;
> > > int ret;
> > > + size_t size = min(count, sizeof(struct fanotify_response));
> > >
> > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > return -EINVAL;
> > >
> > > group = file->private_data;
> > >
> > > - if (count < sizeof(response))
> > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > return -EINVAL;
> >
> > Is this why you decided to shrink the fanotify_response:response field
> > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > the existing response field as 32-bits and explicitly checking for
> > writes that are either the existing/compat length as well as the
> > newer, longer length.
>
> No. I shrank it at Jan's suggestion. I think I agree with you that
> the response field should be kept at u32 as it is defined in userspace
> and purge the doubt about what would happen with a new userspace with
> an old kernel.

I'm struggling to think of why shrinking an existing field is a good
idea. Unfortunately, there is a possibility that any problems this
would cause might not be caught until it has been in a couple of
kernel releases and some applications have been written/updated to use
the new struct definition, at which point restoring the field to a u32
value will break all of these new applications.

I think changing the fanotify_response:response field is an
unnecessary risk, and I'll leave it at that.

> > > diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
> > > index e8ac38cc2fd6..efb5a3a6f814 100644
> > > --- a/include/uapi/linux/fanotify.h
> > > +++ b/include/uapi/linux/fanotify.h
> > > @@ -179,9 +179,34 @@ struct fanotify_event_info_error {
> > > __u32 error_count;
> > > };
> > >
> > > +/*
> > > + * User space may need to record additional information about its decision.
> > > + * The extra information type records what kind of information is included.
> > > + * The default is none. We also define an extra informaion buffer whose
> > > + * size is determined by the extra information type.
> > > + *
> > > + * If the context type is Rule, then the context following is the rule number
> > > + * that triggered the user space decision.
> > > + */
> > > +
> > > +#define FAN_RESPONSE_INFO_AUDIT_NONE 0
> > > +#define FAN_RESPONSE_INFO_AUDIT_RULE 1
> > > +
> > > +struct fanotify_response_audit_rule {
> > > + __u32 rule;
> > > +};
> > > +
> > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > + (sizeof(union { \
> > > + struct fanotify_response_audit_rule r; \
> > > + /* add other extra info structures here */ \
> > > + }))
> > > +
> > > struct fanotify_response {
> > > __s32 fd;
> > > - __u32 response;
> > > + __u16 response;
> > > + __u16 extra_info_type;
> > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > };
> >
> > Since both the kernel and userspace are going to need to agree on the
> > content and formatting of the fanotify_response:extra_info_buf field,
> > why is it hidden behind a char array? You might as well get rid of
> > that abstraction and put the union directly in the fanotify_response
> > struct. It is possible you could also get rid of the
> > fanotify_response_audit_rule struct this way too and just access the
> > rule scalar directly.
>
> This does make sense and my only concern would be a variable-length
> type. There isn't any reason to hide it. If userspace chooses to use
> the old interface and omit the type field then it defaults to NONE.

There is no reason you couldn't put flexible-array field in a union if
that is what was needed. Of you could have the flexible-array field
outside of the union and use a union field as the length value.
There's probably other clever solutions to this too.

--
paul-moore.com

2022-05-09 09:31:15

by Amir Goldstein

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

> One open question I have is what should the kernel do with 'info_type' in
> response it does not understand (in the future when there are possibly more
> different info types). It could just skip it because this should be just
> additional info for introspection (the only mandatory part is in
> fanotify_response, however it could surprise userspace that passed info is
> just getting ignored. To solve this we would have to somewhere report
> supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> cross that bridge when we get to it.
>
> Amir, what do you think?

Regardless if and how we provide a way to enumerate supported info types,
I would prefer to reject (EINVAL) unknown info types.

We can provide a command FAN_RESPONSE_TEST to write a test response with
FAN_NOFD and some extra info so the program can test if certain info
types are supported.

Thanks,
Amir.

2022-05-09 10:49:05

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On Fri 06-05-22 14:46:49, Richard Guy Briggs wrote:
> On 2022-05-05 16:44, Jan Kara wrote:
> > On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> > > On 2022-05-02 20:16, Paul Moore wrote:
> > > > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > > > > This patch adds 2 structure members to the response returned from user
> > > > > space on a permission event. The first field is 16 bits for the context
> > > > > type. The context type will describe what the meaning is of the second
> > > > > field. The default is none. The patch defines one additional context
> > > > > type which means that the second field is a 32-bit rule number. This
> > > > > will allow for the creation of other context types in the future if
> > > > > other users of the API identify different needs. The second field size
> > > > > is defined by the context type and can be used to pass along the data
> > > > > described by the context.
> > > > >
> > > > > To support this, there is a macro for user space to check that the data
> > > > > being sent is valid. Of course, without this check, anything that
> > > > > overflows the bit field will trigger an EINVAL based on the use of
> > > > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > > > >
> >
> > ...
> >
> > > > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > > > {
> > > > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > > > + struct fanotify_response response;
> > > > > struct fsnotify_group *group;
> > > > > int ret;
> > > > > + size_t size = min(count, sizeof(struct fanotify_response));
> > > > >
> > > > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > > > return -EINVAL;
> > > > >
> > > > > group = file->private_data;
> > > > >
> > > > > - if (count < sizeof(response))
> > > > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > > > return -EINVAL;
> > > >
> > > > Is this why you decided to shrink the fanotify_response:response field
> > > > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > > > the existing response field as 32-bits and explicitly checking for
> > > > writes that are either the existing/compat length as well as the
> > > > newer, longer length.
> > >
> > > No. I shrank it at Jan's suggestion. I think I agree with you that
> > > the response field should be kept at u32 as it is defined in userspace
> > > and purge the doubt about what would happen with a new userspace with
> > > an old kernel.
> >
> > Hum, for the life of me I cannot find my response you mention here. Can you
> > send a link so that I can refresh my memory? It has been a long time...
>
> https://listman.redhat.com/archives/linux-audit/2020-October/017066.html
> https://listman.redhat.com/archives/linux-audit/2020-October/017067.html

Thanks!

> > > > > +
> > > > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > > > + (sizeof(union { \
> > > > > + struct fanotify_response_audit_rule r; \
> > > > > + /* add other extra info structures here */ \
> > > > > + }))
> > > > > +
> > > > > struct fanotify_response {
> > > > > __s32 fd;
> > > > > - __u32 response;
> > > > > + __u16 response;
> > > > > + __u16 extra_info_type;
> > > > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > > > };
> > > >
> > > > Since both the kernel and userspace are going to need to agree on the
> > > > content and formatting of the fanotify_response:extra_info_buf field,
> > > > why is it hidden behind a char array? You might as well get rid of
> > > > that abstraction and put the union directly in the fanotify_response
> > > > struct. It is possible you could also get rid of the
> > > > fanotify_response_audit_rule struct this way too and just access the
> > > > rule scalar directly.
> > >
> > > This does make sense and my only concern would be a variable-length
> > > type. There isn't any reason to hide it. If userspace chooses to use
> > > the old interface and omit the type field then it defaults to NONE.
> > >
> > > If future types with variable data are defined, the first field could be
> > > a u32 that unions with the rule number that won't change the struct
> > > size.
> >
> > Struct fanotify_response size must not change, it is part of the kernel
> > ABI. In particular your above change would break userspace code that is
> > currently working just fine (e.g. allocating 8 bytes and expecting struct
> > fanotify_response fits there, or just writing sizeof(struct
> > fanotify_response) as a response while initializing only first 8 bytes).
>
> Many kernel ABIs have been expanded without breaking them.
>
> Is it reasonable for a userspace program to use a kernel structure
> without also using its size for allocation and initialization?

Well, I'm not sure whether to call it reasonable but it certainly happens
and we are generally obliged to keep backwards compatibility (the golden
"don't break userspace" rule).

> > How I'd suggest doing it now (and I'd like to refresh my memory from my
> > past emails you mention because in the past I might have thought something
> > else ;)) is that you add another flag to 'response' field similar to
> > FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
> > to be expected after struct fanotify_response.
>
> That's an interesting possibility... I'm trying to predict if that
> would be a problem for an old kernel... In process_access_response() it
> would fallthrough to the default case of -EINVAL but do so safely.

Yes, old kernel would just refuse such response so new userspace can adapt
to that.

> > The extra info would always start with a header like:
> >
> > struct fanotify_response_info_header {
> > __u8 info_type;
> > __u8 pad;
> > __u16 len; /* This is including the header itself */
> > }
> >
> > And after such header there would be the 'blob' of data 'len - header size'
> > long. We use this same scheme when passing fanotify events to userspace
> > and it has proven to be lightweight and extensible. It covers the situation
> > when in the future audit would decide it wants other data (just change data
> > type), it would also cover the situation when some other subsystem wants
> > its information passed as well - there can be more structures like this
> > attached at the end, we can process the response up to the length of the
> > write.
>
> This reminds me of the RFC2367 (PF_KEYv2, why is that still right there
> at the tip of my fingers?)
>
> > Now these are just possible future extensions making sure we can extend the
> > ABI without too much pain. In the current implementation I'd just return
> > EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
> > and do very strict checks on what gets passed in. It is also trivially
> > backwards compatible (old userspace on new kernel works just fine).
>
> Old userspace on new kernel would work fine with this idea, the v2 patch
> posted, or with leaving the response field of struct fanotify_response
> as __u32.

So I've realized the idea to reduce 'response' field to __u16 will not work
for big endian architectures (that was a thinko of my old suggestion from
2020). Old userspace binaries will break with that. So we have to keep
'response' to __u32 and just add a flag like I'm suggesting now or
something like that.

> > If you want to achieve compatibility of running new userspace on old kernel
> > (I guess that's desirable), we have group flags for that - like we
> > introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
> > need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
> > should expect an allow more info returning for permission events. At the
> > same time this is the way for userspace to be able to tell whether the
> > kernel supports this. I know this sounds tedious but that's the cost of
> > extending the ABI in the compatible way. We've made various API mistakes in
> > the past having to add weird workarounds to fanotify and we don't want to
> > repeat those mistakes :).
> >
> > One open question I have is what should the kernel do with 'info_type' in
> > response it does not understand (in the future when there are possibly more
> > different info types). It could just skip it because this should be just
> > additional info for introspection (the only mandatory part is in
> > fanotify_response, however it could surprise userspace that passed info is
> > just getting ignored. To solve this we would have to somewhere report
> > supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> > cross that bridge when we get to it.
>
> Well, one possibility is to return -EINVAL to signal the kernel is out
> of date.

Yes, I think we've settled with Amir on returning -EINVAL in case unknown
info is spotted.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2022-05-17 11:10:36

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] fanotify: define struct members to hold response decision context

On 2022-05-09 10:54, Jan Kara wrote:
> On Fri 06-05-22 14:46:49, Richard Guy Briggs wrote:
> > On 2022-05-05 16:44, Jan Kara wrote:
> > > On Tue 03-05-22 21:33:35, Richard Guy Briggs wrote:
> > > > On 2022-05-02 20:16, Paul Moore wrote:
> > > > > On Thu, Apr 28, 2022 at 8:45 PM Richard Guy Briggs <[email protected]> wrote:
> > > > > > This patch adds 2 structure members to the response returned from user
> > > > > > space on a permission event. The first field is 16 bits for the context
> > > > > > type. The context type will describe what the meaning is of the second
> > > > > > field. The default is none. The patch defines one additional context
> > > > > > type which means that the second field is a 32-bit rule number. This
> > > > > > will allow for the creation of other context types in the future if
> > > > > > other users of the API identify different needs. The second field size
> > > > > > is defined by the context type and can be used to pass along the data
> > > > > > described by the context.
> > > > > >
> > > > > > To support this, there is a macro for user space to check that the data
> > > > > > being sent is valid. Of course, without this check, anything that
> > > > > > overflows the bit field will trigger an EINVAL based on the use of
> > > > > > FAN_INVALID_RESPONSE_MASK in process_access_response().
> > > > > >
> > >
> > > ...
> > >
> > > > > > static ssize_t fanotify_write(struct file *file, const char __user *buf, size_t count, loff_t *pos)
> > > > > > {
> > > > > > - struct fanotify_response response = { .fd = -1, .response = -1 };
> > > > > > + struct fanotify_response response;
> > > > > > struct fsnotify_group *group;
> > > > > > int ret;
> > > > > > + size_t size = min(count, sizeof(struct fanotify_response));
> > > > > >
> > > > > > if (!IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS))
> > > > > > return -EINVAL;
> > > > > >
> > > > > > group = file->private_data;
> > > > > >
> > > > > > - if (count < sizeof(response))
> > > > > > + if (count < offsetof(struct fanotify_response, extra_info_buf))
> > > > > > return -EINVAL;
> > > > >
> > > > > Is this why you decided to shrink the fanotify_response:response field
> > > > > from 32-bits to 16-bits? I hope not. I would suggest both keeping
> > > > > the existing response field as 32-bits and explicitly checking for
> > > > > writes that are either the existing/compat length as well as the
> > > > > newer, longer length.
> > > >
> > > > No. I shrank it at Jan's suggestion. I think I agree with you that
> > > > the response field should be kept at u32 as it is defined in userspace
> > > > and purge the doubt about what would happen with a new userspace with
> > > > an old kernel.
> > >
> > > Hum, for the life of me I cannot find my response you mention here. Can you
> > > send a link so that I can refresh my memory? It has been a long time...
> >
> > https://listman.redhat.com/archives/linux-audit/2020-October/017066.html
> > https://listman.redhat.com/archives/linux-audit/2020-October/017067.html
>
> Thanks!
>
> > > > > > +
> > > > > > +#define FANOTIFY_RESPONSE_EXTRA_LEN_MAX \
> > > > > > + (sizeof(union { \
> > > > > > + struct fanotify_response_audit_rule r; \
> > > > > > + /* add other extra info structures here */ \
> > > > > > + }))
> > > > > > +
> > > > > > struct fanotify_response {
> > > > > > __s32 fd;
> > > > > > - __u32 response;
> > > > > > + __u16 response;
> > > > > > + __u16 extra_info_type;
> > > > > > + char extra_info_buf[FANOTIFY_RESPONSE_EXTRA_LEN_MAX];
> > > > > > };
> > > > >
> > > > > Since both the kernel and userspace are going to need to agree on the
> > > > > content and formatting of the fanotify_response:extra_info_buf field,
> > > > > why is it hidden behind a char array? You might as well get rid of
> > > > > that abstraction and put the union directly in the fanotify_response
> > > > > struct. It is possible you could also get rid of the
> > > > > fanotify_response_audit_rule struct this way too and just access the
> > > > > rule scalar directly.
> > > >
> > > > This does make sense and my only concern would be a variable-length
> > > > type. There isn't any reason to hide it. If userspace chooses to use
> > > > the old interface and omit the type field then it defaults to NONE.
> > > >
> > > > If future types with variable data are defined, the first field could be
> > > > a u32 that unions with the rule number that won't change the struct
> > > > size.
> > >
> > > Struct fanotify_response size must not change, it is part of the kernel
> > > ABI. In particular your above change would break userspace code that is
> > > currently working just fine (e.g. allocating 8 bytes and expecting struct
> > > fanotify_response fits there, or just writing sizeof(struct
> > > fanotify_response) as a response while initializing only first 8 bytes).
> >
> > Many kernel ABIs have been expanded without breaking them.
> >
> > Is it reasonable for a userspace program to use a kernel structure
> > without also using its size for allocation and initialization?
>
> Well, I'm not sure whether to call it reasonable but it certainly happens
> and we are generally obliged to keep backwards compatibility (the golden
> "don't break userspace" rule).

There's a lot of stupid things userspace could do that would render it
impossible to ever change or improve anything, including fixing security
bugs, with that absolutist approach.

> > > How I'd suggest doing it now (and I'd like to refresh my memory from my
> > > past emails you mention because in the past I might have thought something
> > > else ;)) is that you add another flag to 'response' field similar to
> > > FAN_AUDIT - like FAN_EXTRA_INFO. If that is present, it means extra info is
> > > to be expected after struct fanotify_response.
> >
> > That's an interesting possibility... I'm trying to predict if that
> > would be a problem for an old kernel... In process_access_response() it
> > would fallthrough to the default case of -EINVAL but do so safely.
>
> Yes, old kernel would just refuse such response so new userspace can adapt
> to that.

Fair enough.

> > > The extra info would always start with a header like:
> > >
> > > struct fanotify_response_info_header {
> > > __u8 info_type;
> > > __u8 pad;
> > > __u16 len; /* This is including the header itself */
> > > }
> > >
> > > And after such header there would be the 'blob' of data 'len - header size'
> > > long. We use this same scheme when passing fanotify events to userspace
> > > and it has proven to be lightweight and extensible. It covers the situation
> > > when in the future audit would decide it wants other data (just change data
> > > type), it would also cover the situation when some other subsystem wants
> > > its information passed as well - there can be more structures like this
> > > attached at the end, we can process the response up to the length of the
> > > write.
> >
> > This reminds me of the RFC2367 (PF_KEYv2, why is that still right there
> > at the tip of my fingers?)
> >
> > > Now these are just possible future extensions making sure we can extend the
> > > ABI without too much pain. In the current implementation I'd just return
> > > EINVAL whenever more than FANOTIFY_RESPONSE_MAX_LEN (16 bytes) is written
> > > and do very strict checks on what gets passed in. It is also trivially
> > > backwards compatible (old userspace on new kernel works just fine).
> >
> > Old userspace on new kernel would work fine with this idea, the v2 patch
> > posted, or with leaving the response field of struct fanotify_response
> > as __u32.
>
> So I've realized the idea to reduce 'response' field to __u16 will not work
> for big endian architectures (that was a thinko of my old suggestion from
> 2020). Old userspace binaries will break with that. So we have to keep
> 'response' to __u32 and just add a flag like I'm suggesting now or
> something like that.

Ok, this is the nagging suspicion I had. This decides it in favour of
not changing that u32 response.

> > > If you want to achieve compatibility of running new userspace on old kernel
> > > (I guess that's desirable), we have group flags for that - like we
> > > introduced FAN_ENABLE_AUDIT to allow for FAN_AUDIT flag in response we now
> > > need to add a flag like FAN_EXTENDED_PERMISSION_INFO telling the kernel it
> > > should expect an allow more info returning for permission events. At the
> > > same time this is the way for userspace to be able to tell whether the
> > > kernel supports this. I know this sounds tedious but that's the cost of
> > > extending the ABI in the compatible way. We've made various API mistakes in
> > > the past having to add weird workarounds to fanotify and we don't want to
> > > repeat those mistakes :).
> > >
> > > One open question I have is what should the kernel do with 'info_type' in
> > > response it does not understand (in the future when there are possibly more
> > > different info types). It could just skip it because this should be just
> > > additional info for introspection (the only mandatory part is in
> > > fanotify_response, however it could surprise userspace that passed info is
> > > just getting ignored. To solve this we would have to somewhere report
> > > supported info types (maybe in fanotify fdinfo in proc). I guess we'll
> > > cross that bridge when we get to it.
> >
> > Well, one possibility is to return -EINVAL to signal the kernel is out
> > of date.
>
> Yes, I think we've settled with Amir on returning -EINVAL in case unknown
> info is spotted.

Good.

> Honza
> --
> Jan Kara <[email protected]>
> SUSE Labs, CR

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635