2019-11-03 14:56:48

by Topi Miettinen

[permalink] [raw]
Subject: [PATCH] Allow restricting permissions in /proc/sys

Several items in /proc/sys need not be accessible to unprivileged
tasks. Let the system administrator change the permissions, but only
to more restrictive modes than what the sysctl tables allow.

Signed-off-by: Topi Miettinen <[email protected]>
---
fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index d80989b6c344..88c4ca7d2782 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode,
int mask)
if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
return -EACCES;

+ error = generic_permission(inode, mask);
+ if (error)
+ return error;
+
head = grab_header(inode);
if (IS_ERR(head))
return PTR_ERR(head);
@@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry,
struct iattr *attr)
struct inode *inode = d_inode(dentry);
int error;

- if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
+ if (attr->ia_valid & (ATTR_UID | ATTR_GID))
return -EPERM;

+ if (attr->ia_valid & ATTR_MODE) {
+ struct ctl_table_header *head = grab_header(inode);
+ struct ctl_table *table = PROC_I(inode)->sysctl_entry;
+ umode_t max_mode = 0777; /* Only these bits may change */
+
+ if (IS_ERR(head))
+ return PTR_ERR(head);
+
+ if (!table) /* global root - r-xr-xr-x */
+ max_mode &= ~0222;
+ else /*
+ * Don't allow permissions to become less
+ * restrictive than the sysctl table entry
+ */
+ max_mode &= table->mode;
+
+ sysctl_head_finish(head);
+
+ /* Execute bits only allowed for directories */
+ if (!S_ISDIR(inode->i_mode))
+ max_mode &= ~0111;
+
+ if (attr->ia_mode & ~S_IFMT & ~max_mode)
+ return -EPERM;
+ }
+
error = setattr_prepare(dentry, attr);
if (error)
return error;
@@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path
*path, struct kstat *stat,
u32 request_mask, unsigned int query_flags)
{
struct inode *inode = d_inode(path->dentry);
- struct ctl_table_header *head = grab_header(inode);
- struct ctl_table *table = PROC_I(inode)->sysctl_entry;
-
- if (IS_ERR(head))
- return PTR_ERR(head);

generic_fillattr(inode, stat);
- if (table)
- stat->mode = (stat->mode & S_IFMT) | table->mode;
-
- sysctl_head_finish(head);
return 0;
}

--
2.24.0.rc1


Attachments:
0001-Allow-restricting-permissions-in-proc-sys.patch (2.62 kB)

2019-11-03 17:58:55

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> Several items in /proc/sys need not be accessible to unprivileged
> tasks. Let the system administrator change the permissions, but only
> to more restrictive modes than what the sysctl tables allow.
>
> Signed-off-by: Topi Miettinen <[email protected]>

Why should restruct the system administrator from changing the
permissions to one which is more lax than what the sysctl tables?

The system administrator is already very much trusted. Why should we
take that discretion away from the system administrator?

- Ted

2019-11-03 18:52:03

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

Topi Miettinen <[email protected]> writes:

> Several items in /proc/sys need not be accessible to unprivileged
> tasks. Let the system administrator change the permissions, but only
> to more restrictive modes than what the sysctl tables allow.

This looks quite buggy. You neither update table->mode nor
do you ever read from table->mode to initialize the inode.
I am missing something in my quick reading of your patch?

The not updating table->mode almost certainly means that as soon as the
cached inode is invalidated the mode changes will disappear. Not to
mention they will fail to propogate between different instances of
proc.

Loosing all of your changes at cache invalidation seems to make this a
useless feature.

Eric


> Signed-off-by: Topi Miettinen <[email protected]>
> ---
> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
> 1 file changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index d80989b6c344..88c4ca7d2782 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
> mask)
> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
> return -EACCES;
>
> + error = generic_permission(inode, mask);
> + if (error)
> + return error;
> +
> head = grab_header(inode);
> if (IS_ERR(head))
> return PTR_ERR(head);
> @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry, struct
> iattr *attr)
> struct inode *inode = d_inode(dentry);
> int error;
>
> - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
> + if (attr->ia_valid & (ATTR_UID | ATTR_GID))
> return -EPERM;
>
> + if (attr->ia_valid & ATTR_MODE) {
> + struct ctl_table_header *head = grab_header(inode);
> + struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> + umode_t max_mode = 0777; /* Only these bits may change */
> +
> + if (IS_ERR(head))
> + return PTR_ERR(head);
> +
> + if (!table) /* global root - r-xr-xr-x */
> + max_mode &= ~0222;
> + else /*
> + * Don't allow permissions to become less
> + * restrictive than the sysctl table entry
> + */
> + max_mode &= table->mode;
> +
> + sysctl_head_finish(head);
> +
> + /* Execute bits only allowed for directories */
> + if (!S_ISDIR(inode->i_mode))
> + max_mode &= ~0111;
> +
> + if (attr->ia_mode & ~S_IFMT & ~max_mode)
> + return -EPERM;
> + }
> +
> error = setattr_prepare(dentry, attr);
> if (error)
> return error;
> @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path, struct
> kstat *stat,
> u32 request_mask, unsigned int query_flags)
> {
> struct inode *inode = d_inode(path->dentry);
> - struct ctl_table_header *head = grab_header(inode);
> - struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> -
> - if (IS_ERR(head))
> - return PTR_ERR(head);
>
> generic_fillattr(inode, stat);
> - if (table)
> - stat->mode = (stat->mode & S_IFMT) | table->mode;
> -
> - sysctl_head_finish(head);
> return 0;
> }

2019-11-03 19:25:51

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 3.11.2019 19.56, Theodore Y. Ts'o wrote:
> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
>> Several items in /proc/sys need not be accessible to unprivileged
>> tasks. Let the system administrator change the permissions, but only
>> to more restrictive modes than what the sysctl tables allow.
>>
>> Signed-off-by: Topi Miettinen <[email protected]>
>
> Why should restruct the system administrator from changing the
> permissions to one which is more lax than what the sysctl tables?
>
> The system administrator is already very much trusted. Why should we
> take that discretion away from the system administrator?

That could make sense, in addition changing UID/GID would allow even
more flexibility. The current checks and restrictions which prevent
those changes were already present in original code in 2007. I didn't
want to change the logic too much. Perhaps loosening the restrictions
could be a follow-up patch, as it may give chance to use more of generic
proc or fslib code and thus a larger restructuring.

-Topi

2019-11-03 19:39:55

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 3.11.2019 20.50, Eric W. Biederman wrote:
> Topi Miettinen <[email protected]> writes:
>
>> Several items in /proc/sys need not be accessible to unprivileged
>> tasks. Let the system administrator change the permissions, but only
>> to more restrictive modes than what the sysctl tables allow.
>
> This looks quite buggy. You neither update table->mode nor
> do you ever read from table->mode to initialize the inode.
> I am missing something in my quick reading of your patch?

inode->i_mode gets initialized in proc_sys_make_inode().

I didn't want to touch the table, so that the original permissions can
be used to restrict the changes made. In case the restrictions are
removed as suggested by Theodore Ts'o, table->mode could be changed.
Otherwise I'd rather add a new field to store the current mode and the
mode field can remain for reference. As the original author of the code
from 2007, would you let the administrator to chmod/chown the items in
/proc/sys without restrictions (e.g. 0400 -> 0777)?

> The not updating table->mode almost certainly means that as soon as the
> cached inode is invalidated the mode changes will disappear. Not to
> mention they will fail to propogate between different instances of
> proc.
>
> Loosing all of your changes at cache invalidation seems to make this a
> useless feature.

At least different proc instances seem to work just fine here (they show
the same changes), but I suppose you are right about cache invalidation.

-Topi

2019-11-04 15:45:23

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

Topi Miettinen <[email protected]> writes:

> On 3.11.2019 20.50, Eric W. Biederman wrote:
>> Topi Miettinen <[email protected]> writes:
>>
>>> Several items in /proc/sys need not be accessible to unprivileged
>>> tasks. Let the system administrator change the permissions, but only
>>> to more restrictive modes than what the sysctl tables allow.
>>
>> This looks quite buggy. You neither update table->mode nor
>> do you ever read from table->mode to initialize the inode.
>> I am missing something in my quick reading of your patch?
>
> inode->i_mode gets initialized in proc_sys_make_inode().
>
> I didn't want to touch the table, so that the original permissions can
> be used to restrict the changes made. In case the restrictions are
> removed as suggested by Theodore Ts'o, table->mode could be
> changed. Otherwise I'd rather add a new field to store the current
> mode and the mode field can remain for reference. As the original
> author of the code from 2007, would you let the administrator to
> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 ->
> 0777)?

At an architectural level I think we need to do this carefully and have
a compelling reason. The code has survived nearly the entire life of
linux without this capability.

I think right now the common solution is to mount another file over the
file you are trying to hide/limit. Changing the permissions might be
better but that is not at all clear.

Do you have specific examples of the cases where you would like to
change the permissions?

>> The not updating table->mode almost certainly means that as soon as the
>> cached inode is invalidated the mode changes will disappear. Not to
>> mention they will fail to propogate between different instances of
>> proc.
>>
>> Loosing all of your changes at cache invalidation seems to make this a
>> useless feature.
>
> At least different proc instances seem to work just fine here (they
> show the same changes), but I suppose you are right about cache
> invalidation.

It is going to take the creation of a pid namespace to see different
proc instances. All mounts of the proc within the same pid_namespace
return the same instance.

Eric

2019-11-04 18:02:30

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 4.11.2019 17.44, Eric W. Biederman wrote:
> Topi Miettinen <[email protected]> writes:
>
>> On 3.11.2019 20.50, Eric W. Biederman wrote:
>>> Topi Miettinen <[email protected]> writes:
>>>
>>>> Several items in /proc/sys need not be accessible to unprivileged
>>>> tasks. Let the system administrator change the permissions, but only
>>>> to more restrictive modes than what the sysctl tables allow.
>>>
>>> This looks quite buggy. You neither update table->mode nor
>>> do you ever read from table->mode to initialize the inode.
>>> I am missing something in my quick reading of your patch?
>>
>> inode->i_mode gets initialized in proc_sys_make_inode().
>>
>> I didn't want to touch the table, so that the original permissions can
>> be used to restrict the changes made. In case the restrictions are
>> removed as suggested by Theodore Ts'o, table->mode could be
>> changed. Otherwise I'd rather add a new field to store the current
>> mode and the mode field can remain for reference. As the original
>> author of the code from 2007, would you let the administrator to
>> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 ->
>> 0777)?
>
> At an architectural level I think we need to do this carefully and have
> a compelling reason. The code has survived nearly the entire life of
> linux without this capability.

I'd be happy with only allowing restrictions to access for now. Perhaps
later with more analysis, also relaxing changes and maybe UID/GID
changes can be allowed.

> I think right now the common solution is to mount another file over the
> file you are trying to hide/limit. Changing the permissions might be
> better but that is not at all clear.
>
> Do you have specific examples of the cases where you would like to
> change the permissions?

Unprivileged applications typically do not need to access most items in
/proc/sys, so I'd like to gradually find out which are needed. So far
I've seen no problems with 0500 mode for directories abi, crypto, debug,
dev, fs, user or vm.

I'm also using systemd's InaccessiblePaths to limit access (which mounts
an inaccessible directory over the path), but that's a bit too big
hammer. For example there are over 100 files in /proc/sys/kernel,
perhaps there will be issues when creating a mount for each, and that
multiplied by a number of services.

>>> The not updating table->mode almost certainly means that as soon as the
>>> cached inode is invalidated the mode changes will disappear. Not to
>>> mention they will fail to propogate between different instances of
>>> proc.
>>>
>>> Loosing all of your changes at cache invalidation seems to make this a
>>> useless feature.
>>
>> At least different proc instances seem to work just fine here (they
>> show the same changes), but I suppose you are right about cache
>> invalidation.
>
> It is going to take the creation of a pid namespace to see different
> proc instances. All mounts of the proc within the same pid_namespace
> return the same instance.

I see no problems by using Firejail (which uses PID namespacing) with
v2, the permissions in /proc/sys are the same as outside the namespace.

-Topi

2019-11-04 23:43:53

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

Topi Miettinen <[email protected]> writes:

> On 4.11.2019 17.44, Eric W. Biederman wrote:
>> Topi Miettinen <[email protected]> writes:
>>
>>> On 3.11.2019 20.50, Eric W. Biederman wrote:
>>>> Topi Miettinen <[email protected]> writes:
>>>>
>>>>> Several items in /proc/sys need not be accessible to unprivileged
>>>>> tasks. Let the system administrator change the permissions, but only
>>>>> to more restrictive modes than what the sysctl tables allow.
>>>>
>>>> This looks quite buggy. You neither update table->mode nor
>>>> do you ever read from table->mode to initialize the inode.
>>>> I am missing something in my quick reading of your patch?
>>>
>>> inode->i_mode gets initialized in proc_sys_make_inode().
>>>
>>> I didn't want to touch the table, so that the original permissions can
>>> be used to restrict the changes made. In case the restrictions are
>>> removed as suggested by Theodore Ts'o, table->mode could be
>>> changed. Otherwise I'd rather add a new field to store the current
>>> mode and the mode field can remain for reference. As the original
>>> author of the code from 2007, would you let the administrator to
>>> chmod/chown the items in /proc/sys without restrictions (e.g. 0400 ->
>>> 0777)?
>>
>> At an architectural level I think we need to do this carefully and have
>> a compelling reason. The code has survived nearly the entire life of
>> linux without this capability.
>
> I'd be happy with only allowing restrictions to access for
> now. Perhaps later with more analysis, also relaxing changes and maybe
> UID/GID changes can be allowed.

Let's find the use case where someone cares before we think about that.

>> I think right now the common solution is to mount another file over the
>> file you are trying to hide/limit. Changing the permissions might be
>> better but that is not at all clear.
>>
>> Do you have specific examples of the cases where you would like to
>> change the permissions?
>
> Unprivileged applications typically do not need to access most items
> in /proc/sys, so I'd like to gradually find out which are needed. So
> far I've seen no problems with 0500 mode for directories abi, crypto,
> debug, dev, fs, user or vm.

But if there is no problem in letting everyone access the information
why reduce the permissions?

> I'm also using systemd's InaccessiblePaths to limit access (which
> mounts an inaccessible directory over the path), but that's a bit too
> big hammer. For example there are over 100 files in /proc/sys/kernel,
> perhaps there will be issues when creating a mount for each, and that
> multiplied by a number of services.

My sense is that if there is any kind of compelling reason to make
world-readable values not world-readable, and it doesn't break anything
(except malicious applications) than a kernel patch is probably the way
to go.

Policy knobs like this on proc tend to break in normal maintenance
because they are not used enough so I am not a big fan of adding policy
knobs just because we can.

> I see no problems by using Firejail (which uses PID namespacing) with
> v2, the permissions in /proc/sys are the same as outside the
> namespace.

Thank you for testing.

Eric

2019-11-05 07:39:00

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 5.11.2019 1.41, Eric W. Biederman wrote:
> Topi Miettinen <[email protected]> writes:
>
>> On 4.11.2019 17.44, Eric W. Biederman wrote:
>>> Do you have specific examples of the cases where you would like to
>>> change the permissions?
>>
>> Unprivileged applications typically do not need to access most items
>> in /proc/sys, so I'd like to gradually find out which are needed. So
>> far I've seen no problems with 0500 mode for directories abi, crypto,
>> debug, dev, fs, user or vm.
>
> But if there is no problem in letting everyone access the information
> why reduce the permissions?

Because information could be useful to an attacker. If there is no
problem in not letting everyone access the information why not allow
reducing the permissions? There certainly is no need to know.

>> I'm also using systemd's InaccessiblePaths to limit access (which
>> mounts an inaccessible directory over the path), but that's a bit too
>> big hammer. For example there are over 100 files in /proc/sys/kernel,
>> perhaps there will be issues when creating a mount for each, and that
>> multiplied by a number of services.
>
> My sense is that if there is any kind of compelling reason to make
> world-readable values not world-readable, and it doesn't break anything
> (except malicious applications) than a kernel patch is probably the way
> to go.

With kernel patch, do you propose to change individual sysctls to not
world-readable? That surely would help everybody instead of just those
who care enough to change /proc/sys permissions. I guess it would also
be more effort by an order of magnitude or two to convince each owner of
a sysctl to accept the change.

> Policy knobs like this on proc tend to break in normal maintenance
> because they are not used enough so I am not a big fan of adding policy
> knobs just because we can.

But the rest of the /proc (except PID tree) allows changing permissions
(and also UID and GID), just /proc/sys doesn't. This doesn't seem very
logical to me.

These code paths have not changed much or at all since the initial
version in 2007, so I suppose the maintenance burden has not been
overwhelming.

By the way, /proc/sys still allows changing the {a,c,m}time. I think
those are not backed anywhere, so they probably suffer from same caching
problems as my first version of the patch.

-Topi

2019-11-12 23:16:50

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Sun, Nov 03, 2019 at 12:56:48PM -0500, Theodore Y. Ts'o wrote:
> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> > Several items in /proc/sys need not be accessible to unprivileged
> > tasks. Let the system administrator change the permissions, but only
> > to more restrictive modes than what the sysctl tables allow.
> >
> > Signed-off-by: Topi Miettinen <[email protected]>
>
> Why should restruct the system administrator from changing the
> permissions to one which is more lax than what the sysctl tables?
>
> The system administrator is already very much trusted. Why should we
> take that discretion away from the system administrator?

Generally speaking, they're there to provide some sense of boundary
between uid 0 and the kernel proper. I think it's correct to not allow
weakening of these permissions (which is the current state: no change at
all).

--
Kees Cook

2019-11-12 23:22:39

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Tue, Nov 05, 2019 at 09:35:46AM +0200, Topi Miettinen wrote:
> On 5.11.2019 1.41, Eric W. Biederman wrote:
> > My sense is that if there is any kind of compelling reason to make
> > world-readable values not world-readable, and it doesn't break anything
> > (except malicious applications) than a kernel patch is probably the way
> > to go.
>
> With kernel patch, do you propose to change individual sysctls to not
> world-readable? That surely would help everybody instead of just those who
> care enough to change /proc/sys permissions. I guess it would also be more
> effort by an order of magnitude or two to convince each owner of a sysctl to
> accept the change.

I would think of this as a two-stage process: provide a mechanism to
tighten permissions arbitrarily so that it is easier to gather evidence
about which could have their default changed in the future.

> These code paths have not changed much or at all since the initial version
> in 2007, so I suppose the maintenance burden has not been overwhelming.
>
> By the way, /proc/sys still allows changing the {a,c,m}time. I think those
> are not backed anywhere, so they probably suffer from same caching problems
> as my first version of the patch.

Is a v2 of this patch needed? It wasn't clear to me if the inode modes
were incorrectly cached...?

--
Kees Cook

2019-11-12 23:26:09

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

[Cc+ [email protected]]

since that's potentially relevant to quite a few people.

On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> Several items in /proc/sys need not be accessible to unprivileged
> tasks. Let the system administrator change the permissions, but only
> to more restrictive modes than what the sysctl tables allow.
>
> Signed-off-by: Topi Miettinen <[email protected]>
> ---
> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
> 1 file changed, 31 insertions(+), 10 deletions(-)
>
> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> index d80989b6c344..88c4ca7d2782 100644
> --- a/fs/proc/proc_sysctl.c
> +++ b/fs/proc/proc_sysctl.c
> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
> mask)
> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
> return -EACCES;
>
> + error = generic_permission(inode, mask);
> + if (error)
> + return error;
> +
> head = grab_header(inode);
> if (IS_ERR(head))
> return PTR_ERR(head);
> @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry,
> struct iattr *attr)
> struct inode *inode = d_inode(dentry);
> int error;
>
> - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
> + if (attr->ia_valid & (ATTR_UID | ATTR_GID))
> return -EPERM;
>
> + if (attr->ia_valid & ATTR_MODE) {
> + struct ctl_table_header *head = grab_header(inode);
> + struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> + umode_t max_mode = 0777; /* Only these bits may change */
> +
> + if (IS_ERR(head))
> + return PTR_ERR(head);
> +
> + if (!table) /* global root - r-xr-xr-x */
> + max_mode &= ~0222;
> + else /*
> + * Don't allow permissions to become less
> + * restrictive than the sysctl table entry
> + */
> + max_mode &= table->mode;
> +
> + sysctl_head_finish(head);
> +
> + /* Execute bits only allowed for directories */
> + if (!S_ISDIR(inode->i_mode))
> + max_mode &= ~0111;
> +
> + if (attr->ia_mode & ~S_IFMT & ~max_mode)
> + return -EPERM;
> + }
> +
> error = setattr_prepare(dentry, attr);
> if (error)
> return error;
> @@ -853,17 +883,8 @@ static int proc_sys_getattr(const struct path *path,
> struct kstat *stat,
> u32 request_mask, unsigned int query_flags)
> {
> struct inode *inode = d_inode(path->dentry);
> - struct ctl_table_header *head = grab_header(inode);
> - struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> -
> - if (IS_ERR(head))
> - return PTR_ERR(head);
>
> generic_fillattr(inode, stat);
> - if (table)
> - stat->mode = (stat->mode & S_IFMT) | table->mode;
> -
> - sysctl_head_finish(head);
> return 0;
> }
>
> --
> 2.24.0.rc1
>


2019-11-13 01:05:05

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Tue, Nov 12, 2019 at 03:19:00PM -0800, Kees Cook wrote:
> On Tue, Nov 05, 2019 at 09:35:46AM +0200, Topi Miettinen wrote:
> Is a v2 of this patch needed? It wasn't clear to me if the inode modes
> were incorrectly cached...?

I provided a review for it just now. The patch is cleaner but I believe
it can be split up, and also I am not yet sure it is correct. You were
CC'd on it but the subject was not clear that it was a V2.

Topi, if you send a V3 can you please prefix the subject with this?

Luis

2019-11-13 04:52:33

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Tue, Nov 12, 2019 at 3:22 PM Christian Brauner
<[email protected]> wrote:
>
> [Cc+ [email protected]]
>
> since that's potentially relevant to quite a few people.
>
> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> > Several items in /proc/sys need not be accessible to unprivileged
> > tasks. Let the system administrator change the permissions, but only
> > to more restrictive modes than what the sysctl tables allow.
> >
> > Signed-off-by: Topi Miettinen <[email protected]>
> > ---
> > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
> > 1 file changed, 31 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> > index d80989b6c344..88c4ca7d2782 100644
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
> > mask)
> > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
> > return -EACCES;
> >
> > + error = generic_permission(inode, mask);
> > + if (error)
> > + return error;
> > +
> > head = grab_header(inode);
> > if (IS_ERR(head))
> > return PTR_ERR(head);
> > @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry,
> > struct iattr *attr)
> > struct inode *inode = d_inode(dentry);
> > int error;
> >
> > - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
> > + if (attr->ia_valid & (ATTR_UID | ATTR_GID))
> > return -EPERM;

Supporting at least ATTR_GID would make this much more useful.

> >
> > + if (attr->ia_valid & ATTR_MODE) {
> > + struct ctl_table_header *head = grab_header(inode);
> > + struct ctl_table *table = PROC_I(inode)->sysctl_entry;
> > + umode_t max_mode = 0777; /* Only these bits may change */
> > +
> > + if (IS_ERR(head))
> > + return PTR_ERR(head);
> > +
> > + if (!table) /* global root - r-xr-xr-x */
> > + max_mode &= ~0222;
> > + else /*
> > + * Don't allow permissions to become less
> > + * restrictive than the sysctl table entry
> > + */
> > + max_mode &= table->mode;

Style nit: please put braces around multi-line if and else branches,
even if they're only multi-line because of comments.

> > +
> > + sysctl_head_finish(head);
> > +
> > + /* Execute bits only allowed for directories */
> > + if (!S_ISDIR(inode->i_mode))
> > + max_mode &= ~0111;

Why is this needed?

2019-11-13 10:54:04

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 13.11.2019 6.50, Andy Lutomirski wrote:
> On Tue, Nov 12, 2019 at 3:22 PM Christian Brauner
> <[email protected]> wrote:
>>
>> [Cc+ [email protected]]
>>
>> since that's potentially relevant to quite a few people.
>>
>> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
>>> Several items in /proc/sys need not be accessible to unprivileged
>>> tasks. Let the system administrator change the permissions, but only
>>> to more restrictive modes than what the sysctl tables allow.
>>>
>>> Signed-off-by: Topi Miettinen <[email protected]>
>>> ---
>>> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
>>> 1 file changed, 31 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
>>> index d80989b6c344..88c4ca7d2782 100644
>>> --- a/fs/proc/proc_sysctl.c
>>> +++ b/fs/proc/proc_sysctl.c
>>> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
>>> mask)
>>> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
>>> return -EACCES;
>>>
>>> + error = generic_permission(inode, mask);
>>> + if (error)
>>> + return error;
>>> +
>>> head = grab_header(inode);
>>> if (IS_ERR(head))
>>> return PTR_ERR(head);
>>> @@ -837,9 +841,35 @@ static int proc_sys_setattr(struct dentry *dentry,
>>> struct iattr *attr)
>>> struct inode *inode = d_inode(dentry);
>>> int error;
>>>
>>> - if (attr->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID))
>>> + if (attr->ia_valid & (ATTR_UID | ATTR_GID))
>>> return -EPERM;
>
> Supporting at least ATTR_GID would make this much more useful.

Yes, also XATTR/ACL support would be useful. But so far I've tried to
allow only tightening of permissions.

>>>
>>> + if (attr->ia_valid & ATTR_MODE) {
>>> + struct ctl_table_header *head = grab_header(inode);
>>> + struct ctl_table *table = PROC_I(inode)->sysctl_entry;
>>> + umode_t max_mode = 0777; /* Only these bits may change */
>>> +
>>> + if (IS_ERR(head))
>>> + return PTR_ERR(head);
>>> +
>>> + if (!table) /* global root - r-xr-xr-x */
>>> + max_mode &= ~0222;
>>> + else /*
>>> + * Don't allow permissions to become less
>>> + * restrictive than the sysctl table entry
>>> + */
>>> + max_mode &= table->mode;
>
> Style nit: please put braces around multi-line if and else branches,
> even if they're only multi-line because of comments.

OK, thanks.

>>> +
>>> + sysctl_head_finish(head);
>>> +
>>> + /* Execute bits only allowed for directories */
>>> + if (!S_ISDIR(inode->i_mode))
>>> + max_mode &= ~0111;
>
> Why is this needed?
>

In general, /proc/sys does not allow executable permissions for the
files, so I've continued this policy.

-Topi

2019-11-13 17:02:33

by Jann Horn

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner
<[email protected]> wrote:
> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> > Several items in /proc/sys need not be accessible to unprivileged
> > tasks. Let the system administrator change the permissions, but only
> > to more restrictive modes than what the sysctl tables allow.
> >
> > Signed-off-by: Topi Miettinen <[email protected]>
> > ---
> > fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
> > 1 file changed, 31 insertions(+), 10 deletions(-)
> >
> > diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
> > index d80989b6c344..88c4ca7d2782 100644
> > --- a/fs/proc/proc_sysctl.c
> > +++ b/fs/proc/proc_sysctl.c
> > @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
> > mask)
> > if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
> > return -EACCES;
> >
> > + error = generic_permission(inode, mask);
> > + if (error)
> > + return error;

In kernel/ucount.c, the ->permissions handler set_permissions() grants
access based on whether the caller has CAP_SYS_RESOURCE. And in
net/sysctl_net.c, the handler net_ctl_permissions() grants access
based on whether the caller has CAP_NET_ADMIN. This added check is
going to break those, right?

2019-11-13 17:08:17

by Topi Miettinen

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On 13.11.2019 18.00, Jann Horn wrote:
> On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner
> <[email protected]> wrote:
>> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
>>> Several items in /proc/sys need not be accessible to unprivileged
>>> tasks. Let the system administrator change the permissions, but only
>>> to more restrictive modes than what the sysctl tables allow.
>>>
>>> Signed-off-by: Topi Miettinen <[email protected]>
>>> ---
>>> fs/proc/proc_sysctl.c | 41 +++++++++++++++++++++++++++++++----------
>>> 1 file changed, 31 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
>>> index d80989b6c344..88c4ca7d2782 100644
>>> --- a/fs/proc/proc_sysctl.c
>>> +++ b/fs/proc/proc_sysctl.c
>>> @@ -818,6 +818,10 @@ static int proc_sys_permission(struct inode *inode, int
>>> mask)
>>> if ((mask & MAY_EXEC) && S_ISREG(inode->i_mode))
>>> return -EACCES;
>>>
>>> + error = generic_permission(inode, mask);
>>> + if (error)
>>> + return error;
>
> In kernel/ucount.c, the ->permissions handler set_permissions() grants
> access based on whether the caller has CAP_SYS_RESOURCE. And in
> net/sysctl_net.c, the handler net_ctl_permissions() grants access
> based on whether the caller has CAP_NET_ADMIN. This added check is
> going to break those, right?
>

Right. The comment above seems then a bit misleading:
/*
* sysctl entries that are not writeable,
* are _NOT_ writeable, capabilities or not.
*/

-Topi

2019-11-13 17:19:06

by Jann Horn

[permalink] [raw]
Subject: Re: [PATCH] Allow restricting permissions in /proc/sys

On Wed, Nov 13, 2019 at 5:19 PM Topi Miettinen <[email protected]> wrote:
> On 13.11.2019 18.00, Jann Horn wrote:
> > On Wed, Nov 13, 2019 at 12:22 AM Christian Brauner
> > <[email protected]> wrote:
> >> On Sun, Nov 03, 2019 at 04:55:48PM +0200, Topi Miettinen wrote:
> >>> Several items in /proc/sys need not be accessible to unprivileged
> >>> tasks. Let the system administrator change the permissions, but only
> >>> to more restrictive modes than what the sysctl tables allow.
[...]
> > In kernel/ucount.c, the ->permissions handler set_permissions() grants
> > access based on whether the caller has CAP_SYS_RESOURCE. And in
> > net/sysctl_net.c, the handler net_ctl_permissions() grants access
> > based on whether the caller has CAP_NET_ADMIN. This added check is
> > going to break those, right?
> >
>
> Right. The comment above seems then a bit misleading:
> /*
> * sysctl entries that are not writeable,
> * are _NOT_ writeable, capabilities or not.
> */

I don't see the problem. Those handlers never make a file writable
that doesn't have one of the three write bits (0222) set.