2021-12-07 02:12:27

by Steven Rostedt

[permalink] [raw]
Subject: [RFC][PATCH] tracefs: Set all files to the same group ownership as the mount option

From: "Steven Rostedt (VMware)" <[email protected]>

As people have been asking to allow non-root processes to have access to
the tracefs directory, it was considered best to only allow groups to have
access to the directory, where it is easier to just set the tracefs file
system to a specific group (as other would be too dangerous), and that way
the admins could pick which processes would have access to tracefs.

Unfortunately, this broke tooling on Android that expected the other bit
to be set. For some special cases, for non-root tools to trace the system,
tracefs would be mounted and change the permissions of the top level
directory which gave access to all running tasks permission to the
tracing directory. Even though this would be dangerous to do in a
production environment, for testing environments this can be useful.

Now with the new changes to not allow other (which is still the proper
thing to do), it breaks the testing tooling. Now more code needs to be
loaded on the system to change ownership of the tracing directory.

The real solution is to have tracefs honor the gid=xxx option when
mounting. That is,

(tracing group tracing has value 1003)

mount -t tracefs -o gid=1003 tracefs /sys/kernel/tracing

should have it that all files in the tracing directory should be of the
given group.

Copy the logic from d_walk() from dcache.c and simplify it for the mount
case of tracefs if gid is set. All the files in tracefs will be walked and
their group will be set to the value passed in.

Reported-by: Kalesh Singh <[email protected]>
Reported-by: Yabin Cui <[email protected]>
Fixes: 49d67e445742 ("tracefs: Have tracefs directories not set OTH permission bits by default")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
---

I'm posting this as an RFC as I want to make sure this is the proper way
to handle this. It really makes sense. As tracefs is simply a file system
with a bunch of control knobs to control tracing, if you mount it with
gid=xxx then the control knobs should be controlled by group xxx.


fs/tracefs/inode.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)

diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
index 925a621b432e..f20f575cdaef 100644
+++ b/fs/tracefs/inode.c
@@ -161,6 +161,79 @@ struct tracefs_fs_info {
struct tracefs_mount_opts mount_opts;
};

+static void change_gid(struct dentry *dentry, kgid_t gid)
+{
+ if (!dentry->d_inode)
+ return;
+ dentry->d_inode->i_gid = gid;
+}
+
+/*
+ * Taken from d_walk, but without he need for handling renames.
+ * Nothing can be renamed while walking the list, as tracefs
+ * does not support renames. This is only called when mounting
+ * or remounting the file system, to set all the files to
+ * the given gid.
+ */
+static void set_gid(struct dentry *parent, kgid_t gid)
+{
+ struct dentry *this_parent;
+ struct list_head *next;
+
+ this_parent = parent;
+ spin_lock(&this_parent->d_lock);
+
+ change_gid(this_parent, gid);
+repeat:
+ next = this_parent->d_subdirs.next;
+resume:
+ while (next != &this_parent->d_subdirs) {
+ struct list_head *tmp = next;
+ struct dentry *dentry = list_entry(tmp, struct dentry, d_child);
+ next = tmp->next;
+
+ spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
+
+ change_gid(dentry, gid);
+
+ if (!list_empty(&dentry->d_subdirs)) {
+ spin_unlock(&this_parent->d_lock);
+ spin_release(&dentry->d_lock.dep_map, _RET_IP_);
+ this_parent = dentry;
+ spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
+ goto repeat;
+ }
+ spin_unlock(&dentry->d_lock);
+ }
+ /*
+ * All done at this level ... ascend and resume the search.
+ */
+ rcu_read_lock();
+ascend:
+ if (this_parent != parent) {
+ struct dentry *child = this_parent;
+ this_parent = child->d_parent;
+
+ spin_unlock(&child->d_lock);
+ spin_lock(&this_parent->d_lock);
+
+ /* go into the first sibling still alive */
+ do {
+ next = child->d_child.next;
+ if (next == &this_parent->d_subdirs)
+ goto ascend;
+ child = list_entry(next, struct dentry, d_child);
+ } while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED));
+ rcu_read_unlock();
+ goto resume;
+ }
+ rcu_read_unlock();
+
+out_unlock:
+ spin_unlock(&this_parent->d_lock);
+ return;
+}
+
static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
{
substring_t args[MAX_OPT_ARGS];
@@ -193,6 +266,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
if (!gid_valid(gid))
return -EINVAL;
opts->gid = gid;
+ set_gid(tracefs_mount->mnt_root, gid);
break;
case Opt_mode:
if (match_octal(&args[0], &option))
--
2.31.1



2021-12-07 17:04:46

by Kalesh Singh

[permalink] [raw]
Subject: Re: [RFC][PATCH] tracefs: Set all files to the same group ownership as the mount option

On Mon, Dec 6, 2021 at 6:12 PM Steven Rostedt <[email protected]> wrote:
>
> From: "Steven Rostedt (VMware)" <[email protected]>
>
> As people have been asking to allow non-root processes to have access to
> the tracefs directory, it was considered best to only allow groups to have
> access to the directory, where it is easier to just set the tracefs file
> system to a specific group (as other would be too dangerous), and that way
> the admins could pick which processes would have access to tracefs.
>
> Unfortunately, this broke tooling on Android that expected the other bit
> to be set. For some special cases, for non-root tools to trace the system,
> tracefs would be mounted and change the permissions of the top level
> directory which gave access to all running tasks permission to the
> tracing directory. Even though this would be dangerous to do in a
> production environment, for testing environments this can be useful.
>
> Now with the new changes to not allow other (which is still the proper
> thing to do), it breaks the testing tooling. Now more code needs to be
> loaded on the system to change ownership of the tracing directory.
>
> The real solution is to have tracefs honor the gid=xxx option when
> mounting. That is,
>
> (tracing group tracing has value 1003)
>
> mount -t tracefs -o gid=1003 tracefs /sys/kernel/tracing
>
> should have it that all files in the tracing directory should be of the
> given group.
>
> Copy the logic from d_walk() from dcache.c and simplify it for the mount
> case of tracefs if gid is set. All the files in tracefs will be walked and
> their group will be set to the value passed in.
>
> Reported-by: Kalesh Singh <[email protected]>
> Reported-by: Yabin Cui <[email protected]>
> Fixes: 49d67e445742 ("tracefs: Have tracefs directories not set OTH permission bits by default")
> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
> ---
>
> I'm posting this as an RFC as I want to make sure this is the proper way
> to handle this. It really makes sense. As tracefs is simply a file system
> with a bunch of control knobs to control tracing, if you mount it with
> gid=xxx then the control knobs should be controlled by group xxx.
>
>
> fs/tracefs/inode.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 74 insertions(+)
>
> diff --git a/fs/tracefs/inode.c b/fs/tracefs/inode.c
> index 925a621b432e..f20f575cdaef 100644
> +++ b/fs/tracefs/inode.c
> @@ -161,6 +161,79 @@ struct tracefs_fs_info {
> struct tracefs_mount_opts mount_opts;
> };
>
> +static void change_gid(struct dentry *dentry, kgid_t gid)
> +{
> + if (!dentry->d_inode)
> + return;
> + dentry->d_inode->i_gid = gid;
> +}
> +
> +/*
> + * Taken from d_walk, but without he need for handling renames.
> + * Nothing can be renamed while walking the list, as tracefs
> + * does not support renames. This is only called when mounting
> + * or remounting the file system, to set all the files to
> + * the given gid.
> + */

Hi Steve,

One thing that I missed before: There are files that can be generated
after the mount, for instance when a new synthetic event is added new
entries for that event are created under events/synthetic/ and when a
new instance is created the new entries generated under instances/.
These new entries don't honor the gid specified when mounting. Could
we make it so that they also respect the specified gid?

Thanks,
Kalesh

> +static void set_gid(struct dentry *parent, kgid_t gid)
> +{
> + struct dentry *this_parent;
> + struct list_head *next;
> +
> + this_parent = parent;
> + spin_lock(&this_parent->d_lock);
> +
> + change_gid(this_parent, gid);
> +repeat:
> + next = this_parent->d_subdirs.next;
> +resume:
> + while (next != &this_parent->d_subdirs) {
> + struct list_head *tmp = next;
> + struct dentry *dentry = list_entry(tmp, struct dentry, d_child);
> + next = tmp->next;
> +
> + spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
> +
> + change_gid(dentry, gid);
> +
> + if (!list_empty(&dentry->d_subdirs)) {
> + spin_unlock(&this_parent->d_lock);
> + spin_release(&dentry->d_lock.dep_map, _RET_IP_);
> + this_parent = dentry;
> + spin_acquire(&this_parent->d_lock.dep_map, 0, 1, _RET_IP_);
> + goto repeat;
> + }
> + spin_unlock(&dentry->d_lock);
> + }
> + /*
> + * All done at this level ... ascend and resume the search.
> + */
> + rcu_read_lock();
> +ascend:
> + if (this_parent != parent) {
> + struct dentry *child = this_parent;
> + this_parent = child->d_parent;
> +
> + spin_unlock(&child->d_lock);
> + spin_lock(&this_parent->d_lock);
> +
> + /* go into the first sibling still alive */
> + do {
> + next = child->d_child.next;
> + if (next == &this_parent->d_subdirs)
> + goto ascend;
> + child = list_entry(next, struct dentry, d_child);
> + } while (unlikely(child->d_flags & DCACHE_DENTRY_KILLED));
> + rcu_read_unlock();
> + goto resume;
> + }
> + rcu_read_unlock();
> +
> +out_unlock:
> + spin_unlock(&this_parent->d_lock);
> + return;
> +}
> +
> static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
> {
> substring_t args[MAX_OPT_ARGS];
> @@ -193,6 +266,7 @@ static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
> if (!gid_valid(gid))
> return -EINVAL;
> opts->gid = gid;
> + set_gid(tracefs_mount->mnt_root, gid);
> break;
> case Opt_mode:
> if (match_octal(&args[0], &option))
> --
> 2.31.1
>

2021-12-07 17:10:30

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC][PATCH] tracefs: Set all files to the same group ownership as the mount option

On Tue, 7 Dec 2021 09:04:30 -0800
Kalesh Singh <[email protected]> wrote:

> One thing that I missed before: There are files that can be generated
> after the mount, for instance when a new synthetic event is added new
> entries for that event are created under events/synthetic/ and when a
> new instance is created the new entries generated under instances/.
> These new entries don't honor the gid specified when mounting. Could
> we make it so that they also respect the specified gid?

They don't?

/me looks at code

Aw crap. I thought since I have this:

static int tracefs_parse_options(char *data, struct tracefs_mount_opts *opts)
{
[..]
case Opt_gid:
if (match_int(&args[0], &option))
return -EINVAL;
gid = make_kgid(current_user_ns(), option);
if (!gid_valid(gid))
return -EINVAL;

opts->gid = gid;

set_gid(tracefs_mount->mnt_root, gid);
[..]


That the new files would inherit the opts->gid. But I see that they do not.

I'll add that as a separate patch.

Thanks for bringing that to my attention.

-- Steve