2005-04-10 15:39:10

by René Scharfe

[permalink] [raw]
Subject: [RFC][PATCH] Simple privacy enhancement for /proc/<pid>

Hi all,

sorry it took me so long before offering another patch for restricting
/proc permissions. Real life kept on intervening.

Albert, allowing access based on tty sounds nice, but it _is_ expansive.
More importantly, perhaps, it would "virtualize" /proc: every user would
see different permissions for certain files in there. That's too comlex
for my taste.

My previous patchset was complex, too, given its simple purpose: to
restrict regular users from spying on each other. So let's cut out what
we don't really need.

First, configuring via kernel parameters is sufficient. It simplifies
implementation a lot because we know the settings cannot change. And we
don't need the added flexibility of sysctls anyway -- I assume these
parameters are set at installation time and never touched again.

Then I suppose we don't need to be able to fine-tune the permissions for
each file in /proc/<pid>/. All that we need is a distinction between
"normal" users (which are to be restricted) and admins (which need to
see everything).

In a previous mail I asked how to identify tasks every user should be
able to see details of. Bodo came up with some nice ideas, like
checking if parent is init(1) to catch daemons and identifying kernel
threads by their unique ability to block SIGKILL. That's simple and
catches most interesting processes; it fails to catch the children of
forking servers like apache and, notably, sshd, however.

I have another idea: let's keep the details of _every_ process owned by
user root readable by anyone. The superuser doesn't deserve privacy.
No new hole is opened, as root's files don't change their permissions.
Admittedly, admins are people, too, and they can get privacy, too -- but
only if they use their own regular user ID. They should be doing that
anyway.

This catches the _important_ processes, those used to provide logins
(and then some :-). Tools like w, who and pstree keep on working just
fine, even for SSH logins.

This patch introduces two kernel parameters: proc.privacy and proc.gid.
The group ID attribute of all files below /proc/<pid> is set to
proc.gid, but only if you activate the feature by setting proc.privacy
to a non-zero value. 1 means to allow all users to see root's process
details and hide everyone else's (as described above), 2 gives you the
"shared nothing" semantics where even root's processes are "cloaked".

Patch is against 2.6.12-rc2-mm2, please give it a try and tell me how
you like it.

Thanks,
Rene



diff -pur l1/Documentation/kernel-parameters.txt l2/Documentation/kernel-parameters.txt
--- l1/Documentation/kernel-parameters.txt 2005-04-07 23:18:36.000000000 +0200
+++ l2/Documentation/kernel-parameters.txt 2005-04-10 13:52:56.000000000 +0200
@@ -1116,16 +1116,25 @@ running once the system is up.
[ISAPNP] Exclude memory regions for the autoconfiguration
Ranges are in pairs (memory base and size).

+ processor.max_cstate= [HW, ACPI]
+ Limit processor to maximum C-state
+ max_cstate=9 overrides any DMI blacklist limit.
+
+ proc.gid= [KNL] Group ID attribute of the files in /proc/<pid>,
+ default is 0. This parameter is ignored if
+ proc.privacy is 0.
+ proc.privacy= [KNL] Take away permissions for files in /proc/<pid>
+ from world (think "chmod o-rwx") and apply proc.gid
+ setting. 0 = disabled (default), 1 = restrict access
+ to all process details except those having a uid of 0,
+ 2 = restrict access to all process details.
+
profile= [KNL] Enable kernel profiling via /proc/profile
{ schedule | <number> }
(param: schedule - profile schedule points}
(param: profile step/bucket size as a power of 2 for
statistical time based profiling)

- processor.max_cstate= [HW, ACPI]
- Limit processor to maximum C-state
- max_cstate=9 overrides any DMI blacklist limit.
-
prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
before loading.
See Documentation/ramdisk.txt.
diff -pur l1/fs/proc/base.c l2/fs/proc/base.c
--- l1/fs/proc/base.c 2005-04-07 23:18:47.000000000 +0200
+++ l2/fs/proc/base.c 2005-04-10 14:00:28.000000000 +0200
@@ -35,8 +35,18 @@
#include <linux/seccomp.h>
#include <linux/cpuset.h>
#include <linux/audit.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
#include "internal.h"

+static int proc_privacy;
+module_param_named(privacy, proc_privacy, uint, 0);
+MODULE_PARM_DESC(umask, "restrict permissions of files in /proc/<pid>");
+
+static gid_t proc_gid;
+module_param_named(gid, proc_gid, uint, 0);
+MODULE_PARM_DESC(gid, "process admin group");
+
/*
* For hysterical raisins we keep the same inumbers as in the old procfs.
* Feel free to change the macro below - just keep the range distinct from
@@ -227,6 +237,8 @@ static struct pid_entry tid_attr_stuff[]

#undef E

+#define IS_PID_DIR(type) ((type) == PROC_TGID_INO || (type) == PROC_TID_INO)
+
static int proc_fd_link(struct inode *inode, struct dentry **dentry, struct vfsmount **mnt)
{
struct task_struct *task = proc_task(inode);
@@ -1145,7 +1157,7 @@ static struct inode *proc_pid_make_inode
ei->type = ino;
inode->i_uid = 0;
inode->i_gid = 0;
- if (ino == PROC_TGID_INO || ino == PROC_TID_INO || task_dumpable(task)) {
+ if (IS_PID_DIR(ino) || task_dumpable(task)) {
inode->i_uid = task->euid;
inode->i_gid = task->egid;
}
@@ -1175,13 +1187,15 @@ static int pid_revalidate(struct dentry
struct inode *inode = dentry->d_inode;
struct task_struct *task = proc_task(inode);
if (pid_alive(task)) {
- if (proc_type(inode) == PROC_TGID_INO || proc_type(inode) == PROC_TID_INO || task_dumpable(task)) {
+ if (IS_PID_DIR(proc_type(inode)) || task_dumpable(task)) {
inode->i_uid = task->euid;
inode->i_gid = task->egid;
} else {
inode->i_uid = 0;
inode->i_gid = 0;
}
+ if (proc_privacy && !IS_PID_DIR(proc_type(inode)))
+ inode->i_gid = proc_gid;
security_task_to_inode(task, inode);
return 1;
}
@@ -1454,6 +1468,11 @@ static struct dentry *proc_pident_lookup

ei = PROC_I(inode);
inode->i_mode = p->mode;
+ if (proc_privacy && !IS_PID_DIR(p->type)) {
+ inode->i_gid = proc_gid;
+ if (proc_privacy == 2 || task->euid != 0)
+ inode->i_mode &= ~S_IRWXO;
+ }
/*
* Yes, it does not scale. And it should not. Don't add
* new entries into /proc/<tgid>/ without very good reasons.


2005-04-10 18:07:56

by Bodo Eggert

[permalink] [raw]
Subject: Re: [RFC][PATCH] Simple privacy enhancement for /proc/<pid>

On Sun, 10 Apr 2005, Rene Scharfe wrote:

> First, configuring via kernel parameters is sufficient.

I don't remember: Would a mount option be equally easy to implement?
(Kernel parameters are OK for me, too.)

> I have another idea: let's keep the details of _every_ process owned by
> user root readable by anyone.

What about SUID processes acting on behalf of users?

> - processor.max_cstate= [HW, ACPI]
> - Limit processor to maximum C-state
> - max_cstate=9 overrides any DMI blacklist limit.
> -

This seems to belong into another patch



(in pid_revalidate:)
What about moving the things around? (just editing in the MUA)

> + if (IS_PID_DIR(proc_type(inode)) || task_dumpable(task)) {
> inode->i_uid = task->euid;
> + inode->i_gid = proc_gid;
> + if (!proc_privacy || IS_PID_DIR(proc_type(inode)))
> inode->i_gid = task->egid;
> } else {
> inode->i_uid = 0;
> inode->i_gid = 0;
> }
> security_task_to_inode(task, inode);
> return 1;
> }

BTW: You might be able to cache IS_PID_DIR(). It looks like being a gain.

> @@ -1454,6 +1468,11 @@ static struct dentry *proc_pident_lookup

> + if (proc_privacy == 2 || task->euid != 0)
^^^^^
redundand.
--
Funny quotes:
27. If people from Poland are called Poles, why aren't people from Holland
called Holes?
Fri?, Spammer: [email protected] [email protected]

2005-04-11 21:19:47

by René Scharfe

[permalink] [raw]
Subject: Re: [RFC][PATCH] Simple privacy enhancement for /proc/<pid>

Bodo Eggert schrieb:
> On Sun, 10 Apr 2005, Rene Scharfe wrote:
>
>
>>First, configuring via kernel parameters is sufficient.
>
>
> I don't remember: Would a mount option be equally easy to implement?
> (Kernel parameters are OK for me, too.)

A mount option for procfs would be changable at remount, making
implementation a bit more involved.

>>I have another idea: let's keep the details of _every_ process owned by
>>user root readable by anyone.
>
>
> What about SUID processes acting on behalf of users?

SUID root processes will we visible for all, too. That's fair enough, I
think. If it's a concern to you use proc.privacy=2.

>>- processor.max_cstate= [HW, ACPI]
>>- Limit processor to maximum C-state
>>- max_cstate=9 overrides any DMI blacklist limit.
>>-
>
>
> This seems to belong into another patch

Strictly speaking, yes, but it's just a trivial cleanup near my own
change. And I guarantee it has zero impact on any built kernel image. :]

> (in pid_revalidate:)
> What about moving the things around? (just editing in the MUA)
>
>
>>+ if (IS_PID_DIR(proc_type(inode)) || task_dumpable(task)) {
>> inode->i_uid = task->euid;
>>+ inode->i_gid = proc_gid;
>>+ if (!proc_privacy || IS_PID_DIR(proc_type(inode)))
>> inode->i_gid = task->egid;
>> } else {
>> inode->i_uid = 0;
>> inode->i_gid = 0;
>> }
>> security_task_to_inode(task, inode);
>> return 1;
>> }

I suppose you could do that, but I don't see any gain. I also find my
version easier to read because it keeps the two conditionals (having
different intents and purposes) apart.

> BTW: You might be able to cache IS_PID_DIR(). It looks like being a gain.

I'd rather let the compiler do that job. It's only a small macro, I
really doubt you would measure any speedup from putting it into a local
variable.

>>@@ -1454,6 +1468,11 @@ static struct dentry *proc_pident_lookup
>
>
>>+ if (proc_privacy == 2 || task->euid != 0)
>
> ^^^^^
> redundand.

You're right and it's a matter of taste, I guess. By the way, this is
also what the FreeBSD crowd calls a "bikeshed". :-)

Thanks for reviewing my patch!
Rene

2005-04-12 06:04:12

by Albert Cahalan

[permalink] [raw]
Subject: Re: [RFC][PATCH] Simple privacy enhancement for /proc/<pid>

On Sun, 2005-04-10 at 17:38 +0200, Rene Scharfe wrote:

> Albert, allowing access based on tty sounds nice, but it _is_ expansive.
> More importantly, perhaps, it would "virtualize" /proc: every user would
> see different permissions for certain files in there. That's too comlex
> for my taste.

If you really can't allow access based on tty, then at least allow
access if any UID value matches any UID value. Without this, a user
can not always see a setuid program they are running.

> First, configuring via kernel parameters is sufficient. It simplifies
> implementation a lot because we know the settings cannot change. And we
> don't need the added flexibility of sysctls anyway -- I assume these
> parameters are set at installation time and never touched again.

This means mucking with boot parameters, which can be a pain.
The various boot loaders do not all use the same config file.

> Then I suppose we don't need to be able to fine-tune the permissions for
> each file in /proc/<pid>/. All that we need is a distinction between
> "normal" users (which are to be restricted) and admins (which need to
> see everything).

The /proc/*/maps file sure is different from the /proc/*/status file.
The same for all the others, really.

> This patch introduces two kernel parameters: proc.privacy and proc.gid.
> The group ID attribute of all files below /proc/<pid> is set to
> proc.gid, but only if you activate the feature by setting proc.privacy
> to a non-zero value.

This is very bad. Please do not change the GID as seen by
the stat() call. This value is used.


2005-04-12 21:22:06

by René Scharfe

[permalink] [raw]
Subject: Re: [RFC][PATCH] Simple privacy enhancement for /proc/<pid>

On Tue, Apr 12, 2005 at 01:29:35AM -0400, Albert Cahalan wrote:
> If you really can't allow access based on tty, then at least allow
> access if any UID value matches any UID value. Without this, a user
> can not always see a setuid program they are running.

Yes, that's a bug. Below is a new version of the patch that keeps
setuid'd processes visible to any user.

> This means mucking with boot parameters, which can be a pain.
> The various boot loaders do not all use the same config file.

True, kernel parameters not as comfortable to use as sysctl's or mount
options. You only have to set it once, however.

> > Then I suppose we don't need to be able to fine-tune the permissions for
> > each file in /proc/<pid>/. All that we need is a distinction between
> > "normal" users (which are to be restricted) and admins (which need to
> > see everything).
>
> The /proc/*/maps file sure is different from the /proc/*/status file.
> The same for all the others, really.

The contents and purposes are different, but the intent of my patch is
to protect all information about the processes of one user from all
others. If you need to access that information you have at least three
options: a) don't enable proc.privacy, b) put the users who need the
info into the proc.gid group, c) chmod a+r /proc/$pid/$file. (Note:
that last option is a feature of the vanilla kernel.)

> > This patch introduces two kernel parameters: proc.privacy and proc.gid.
> > The group ID attribute of all files below /proc/<pid> is set to
> > proc.gid, but only if you activate the feature by setting proc.privacy
> > to a non-zero value.
>
> This is very bad. Please do not change the GID as seen by
> the stat() call. This value is used.

What is it used for? E.g. using the group ID of /proc/<pid>/stat to
determine the egid that process is running as is not reliable even
without my patch. It's better to use the group ID of the directory
/proc/<pid>/ instead or to look up the ID in /proc/<pid>/status. And
the permissions of the <pid> directories are not changed by the patch.

Thanks,
Rene


diff -pur l1/Documentation/kernel-parameters.txt l2/Documentation/kernel-parameters.txt
--- l1/Documentation/kernel-parameters.txt 2005-04-07 23:18:36.000000000 +0200
+++ l2/Documentation/kernel-parameters.txt 2005-04-10 16:56:58.000000000 +0200
@@ -1116,16 +1116,27 @@ running once the system is up.
[ISAPNP] Exclude memory regions for the autoconfiguration
Ranges are in pairs (memory base and size).

+ processor.max_cstate= [HW, ACPI]
+ Limit processor to maximum C-state
+ max_cstate=9 overrides any DMI blacklist limit.
+
+ proc.gid= [KNL] Group ID attribute of the files in /proc/<pid>,
+ default is 0. This parameter is ignored if
+ proc.privacy is 0.
+ proc.privacy= [KNL] Take away permissions for files in /proc/<pid>
+ from world (think "chmod o-rwx") and apply proc.gid
+ setting. 0 = disabled (default), 1 = restrict access
+ to all process details except those having a uid of 0,
+ 2 = restrict access to all process details except for
+ kernel threads and init and its children, 3 = restrict
+ access to all process details.
+
profile= [KNL] Enable kernel profiling via /proc/profile
{ schedule | <number> }
(param: schedule - profile schedule points}
(param: profile step/bucket size as a power of 2 for
statistical time based profiling)

- processor.max_cstate= [HW, ACPI]
- Limit processor to maximum C-state
- max_cstate=9 overrides any DMI blacklist limit.
-
prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
before loading.
See Documentation/ramdisk.txt.
diff -pur l1/fs/proc/base.c l2/fs/proc/base.c
--- l1/fs/proc/base.c 2005-04-07 23:18:47.000000000 +0200
+++ l2/fs/proc/base.c 2005-04-12 21:26:13.000000000 +0200
@@ -35,8 +35,18 @@
#include <linux/seccomp.h>
#include <linux/cpuset.h>
#include <linux/audit.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
#include "internal.h"

+static int proc_privacy;
+module_param_named(privacy, proc_privacy, uint, 0);
+MODULE_PARM_DESC(umask, "restrict permissions of files in /proc/<pid>");
+
+static gid_t proc_gid;
+module_param_named(gid, proc_gid, uint, 0);
+MODULE_PARM_DESC(gid, "process admin group");
+
/*
* For hysterical raisins we keep the same inumbers as in the old procfs.
* Feel free to change the macro below - just keep the range distinct from
@@ -227,6 +237,8 @@ static struct pid_entry tid_attr_stuff[]

#undef E

+#define IS_PID_DIR(type) ((type) == PROC_TGID_INO || (type) == PROC_TID_INO)
+
static int proc_fd_link(struct inode *inode, struct dentry **dentry, struct vfsmount **mnt)
{
struct task_struct *task = proc_task(inode);
@@ -1145,7 +1157,7 @@ static struct inode *proc_pid_make_inode
ei->type = ino;
inode->i_uid = 0;
inode->i_gid = 0;
- if (ino == PROC_TGID_INO || ino == PROC_TID_INO || task_dumpable(task)) {
+ if (IS_PID_DIR(ino) || task_dumpable(task)) {
inode->i_uid = task->euid;
inode->i_gid = task->egid;
}
@@ -1175,13 +1187,20 @@ static int pid_revalidate(struct dentry
struct inode *inode = dentry->d_inode;
struct task_struct *task = proc_task(inode);
if (pid_alive(task)) {
- if (proc_type(inode) == PROC_TGID_INO || proc_type(inode) == PROC_TID_INO || task_dumpable(task)) {
+ if (IS_PID_DIR(proc_type(inode)) || task_dumpable(task)) {
inode->i_uid = task->euid;
inode->i_gid = task->egid;
} else {
inode->i_uid = 0;
inode->i_gid = 0;
}
+ if (proc_privacy && !IS_PID_DIR(proc_type(inode))) {
+ inode->i_gid = proc_gid;
+ if (proc_privacy == 2 || inode->i_uid != 0)
+ inode->i_mode &= ~S_IRWXO;
+ else
+ inode->i_mode |= (inode->i_mode & S_IRWXG) >> 3;
+ }
security_task_to_inode(task, inode);
return 1;
}
@@ -1454,6 +1473,11 @@ static struct dentry *proc_pident_lookup

ei = PROC_I(inode);
inode->i_mode = p->mode;
+ if (proc_privacy && !IS_PID_DIR(p->type)) {
+ inode->i_gid = proc_gid;
+ if (proc_privacy == 2 || inode->i_uid != 0)
+ inode->i_mode &= ~S_IRWXO;
+ }
/*
* Yes, it does not scale. And it should not. Don't add
* new entries into /proc/<tgid>/ without very good reasons.