2018-12-30 13:30:32

by Lee, Chun-Yi

[permalink] [raw]
Subject: [PATCH 0/2] [RFC] sysfs: Add hook for checking the file capability of opener

There have some discussion in the following mail loop about checking
capability in sysfs write handler:
https://lkml.org/lkml/2018/9/13/978

Sometimes we check the capability in sysfs implementation by using
capable function. But the checking can be bypassed by opening sysfs
file within an unprivileged process then writing the file within a
privileged process. The tricking way has been exposed by Andy Lutomirski
for CVE-2013-1959.

Because the sysfs_ops does not forward the file descriptor to the
show/store callback, there doesn't have chance to check the capability
of file's opener. This patch adds the hook to sysfs_ops that allows
different implementation in object and attribute levels for checking
file capable before accessing sysfs interfaces.

The callback function of kobject sysfs_ops is the first implementation
of new hook. It casts attribute to kobj_attribute then calls the file
capability callback function of attribute level. The same logic can
be implemented in other sysfs file types, like: device, driver and
bus type.

The capability checking logic in wake_lock/wake_unlock sysfs interface
is the first example for kobject. It will check the opener's capability.

Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Chen Yu <[email protected]>
Cc: Giovanni Gherdovich <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Len Brown <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Bart Van Assche <[email protected]>
Signed-off-by: "Lee, Chun-Yi" <[email protected]>

Lee, Chun-Yi (2):
sysfs: Add hook for checking the file capable
PM / Sleep: Checking the file capability when writing wak lock
interface

fs/sysfs/file.c | 8 ++++++++
include/linux/kobject.h | 2 ++
include/linux/sysfs.h | 2 ++
kernel/power/main.c | 14 ++++++++++++++
kernel/power/wakelock.c | 6 ------
lib/kobject.c | 26 ++++++++++++++++++++++++++
6 files changed, 52 insertions(+), 6 deletions(-)

--
2.13.6



2018-12-30 13:30:38

by Lee, Chun-Yi

[permalink] [raw]
Subject: [PATCH 1/2] sysfs: Add hook for checking the file capable for opener

Base on the discussion in the following mail loop about checking
capability in sysfs write handler:
https://lkml.org/lkml/2018/9/13/978

Some sysfs write handler are checking the writer's capability by using
capable(). Base on CVE-2013-1959, those code should use file_ns_capable()
to check the opener's capability. Otherwise the capability checking logic
can be bypassed.

This patch adds hook to sysfs_ops that it allows different implementation
in object and attribute levels for checking file capable before
accessing sysfs interfaces.

The callback function in kobject sysfs_ops is the first implementation of
new hook. It casts attribute to kobj_attribute and calls the file
capability callback function in kobject attribute level.

Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Chen Yu <[email protected]>
Cc: Giovanni Gherdovich <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Len Brown <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Bart Van Assche <[email protected]>
Signed-off-by: "Lee, Chun-Yi" <[email protected]>
---
fs/sysfs/file.c | 8 ++++++++
include/linux/kobject.h | 2 ++
include/linux/sysfs.h | 2 ++
lib/kobject.c | 26 ++++++++++++++++++++++++++
4 files changed, 38 insertions(+)

diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index bb71db63c99c..cf98957babd0 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -58,6 +58,10 @@ static int sysfs_kf_seq_show(struct seq_file *sf, void *v)
* if @ops->show() isn't implemented.
*/
if (ops->show) {
+ if (ops->show_file_capable &&
+ !ops->show_file_capable(of->file, of->kn->priv))
+ return -EPERM;
+
count = ops->show(kobj, of->kn->priv, buf);
if (count < 0)
return count;
@@ -136,6 +140,10 @@ static ssize_t sysfs_kf_write(struct kernfs_open_file *of, char *buf,
if (!count)
return 0;

+ if (ops->store_file_capable &&
+ !ops->store_file_capable(of->file, of->kn->priv))
+ return -EPERM;
+
return ops->store(kobj, of->kn->priv, buf, count);
}

diff --git a/include/linux/kobject.h b/include/linux/kobject.h
index 1ab0d624fb36..f89fd692e812 100644
--- a/include/linux/kobject.h
+++ b/include/linux/kobject.h
@@ -166,6 +166,8 @@ struct kobj_attribute {
char *buf);
ssize_t (*store)(struct kobject *kobj, struct kobj_attribute *attr,
const char *buf, size_t count);
+ bool (*show_file_capable)(const struct file *);
+ bool (*store_file_capable)(const struct file *);
};

extern const struct sysfs_ops kobj_sysfs_ops;
diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
index 786816cf4aa5..0b107795bee4 100644
--- a/include/linux/sysfs.h
+++ b/include/linux/sysfs.h
@@ -214,6 +214,8 @@ struct bin_attribute bin_attr_##_name = __BIN_ATTR_RW(_name, _size)
struct sysfs_ops {
ssize_t (*show)(struct kobject *, struct attribute *, char *);
ssize_t (*store)(struct kobject *, struct attribute *, const char *, size_t);
+ bool (*show_file_capable)(const struct file *, struct attribute *);
+ bool (*store_file_capable)(const struct file *, struct attribute *);
};

#ifdef CONFIG_SYSFS
diff --git a/lib/kobject.c b/lib/kobject.c
index b72e00fd7d09..81699b2b7f72 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -800,6 +800,18 @@ static ssize_t kobj_attr_show(struct kobject *kobj, struct attribute *attr,
return ret;
}

+static bool kobj_attr_show_file_capable(const struct file *file,
+ struct attribute *attr)
+{
+ struct kobj_attribute *kattr;
+
+ kattr = container_of(attr, struct kobj_attribute, attr);
+ if (kattr->show_file_capable)
+ return kattr->show_file_capable(file);
+
+ return true;
+}
+
static ssize_t kobj_attr_store(struct kobject *kobj, struct attribute *attr,
const char *buf, size_t count)
{
@@ -812,9 +824,23 @@ static ssize_t kobj_attr_store(struct kobject *kobj, struct attribute *attr,
return ret;
}

+static bool kobj_attr_store_file_capable(const struct file *file,
+ struct attribute *attr)
+{
+ struct kobj_attribute *kattr;
+
+ kattr = container_of(attr, struct kobj_attribute, attr);
+ if (kattr->store_file_capable)
+ return kattr->store_file_capable(file);
+
+ return true;
+}
+
const struct sysfs_ops kobj_sysfs_ops = {
.show = kobj_attr_show,
.store = kobj_attr_store,
+ .show_file_capable = kobj_attr_show_file_capable,
+ .store_file_capable = kobj_attr_store_file_capable,
};
EXPORT_SYMBOL_GPL(kobj_sysfs_ops);

--
2.13.6


2018-12-30 13:31:26

by Lee, Chun-Yi

[permalink] [raw]
Subject: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

The wake lock/unlock sysfs interfaces check that the writer must has
CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
by opening sysfs file within an unprivileged process and then writing
the file within a privileged process. The tricking way has been exposed
by Andy Lutomirski in CVE-2013-1959.

Here is an example that it's a simplified version from CVE-2013-1959
to bypass the capability checking of wake_lock sysfs:

int main(int argc, char* argv[])
{
int fd, ret = 0;

fd = open("/sys/power/wake_lock", O_RDWR);
if (fd < 0)
err(1, "open wake_lock");

if (dup2(fd, 1) != 1)
err(1, "dup2");
sleep(1);
execl("./string", "string");

return ret;
}

The string is a privileged program that it can be used to write
string to wake_lock interface. The main unprivileged process opens
the sysfs interface and overwrites stdout. So the privileged
process will write to wake_lock.

This patch implemented the callback of file capable checking hook
in kobject attribute level. It will check the opener's capability.

Cc: Greg Kroah-Hartman <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Chen Yu <[email protected]>
Cc: Giovanni Gherdovich <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Pavel Machek <[email protected]>
Cc: Len Brown <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Bart Van Assche <[email protected]>
Signed-off-by: "Lee, Chun-Yi" <[email protected]>
---
kernel/power/main.c | 14 ++++++++++++++
kernel/power/wakelock.c | 6 ------
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/power/main.c b/kernel/power/main.c
index 98e76cad128b..265199efedc1 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -661,6 +661,11 @@ static ssize_t wake_lock_store(struct kobject *kobj,
return error ? error : n;
}

+static bool wake_lock_store_file_capable(const struct file *file)
+{
+ return file_ns_capable(file, &init_user_ns, CAP_BLOCK_SUSPEND);
+}
+
power_attr(wake_lock);

static ssize_t wake_unlock_show(struct kobject *kobj,
@@ -678,6 +683,11 @@ static ssize_t wake_unlock_store(struct kobject *kobj,
return error ? error : n;
}

+static bool wake_unlock_store_file_capable(const struct file *file)
+{
+ return file_ns_capable(file, &init_user_ns, CAP_BLOCK_SUSPEND);
+}
+
power_attr(wake_unlock);

#endif /* CONFIG_PM_WAKELOCKS */
@@ -803,6 +813,10 @@ static int __init pm_init(void)
power_kobj = kobject_create_and_add("power", NULL);
if (!power_kobj)
return -ENOMEM;
+#ifdef CONFIG_PM_WAKELOCKS
+ wake_lock_attr.store_file_capable = wake_lock_store_file_capable;
+ wake_unlock_attr.store_file_capable = wake_unlock_store_file_capable;
+#endif
error = sysfs_create_group(power_kobj, &attr_group);
if (error)
return error;
diff --git a/kernel/power/wakelock.c b/kernel/power/wakelock.c
index 4210152e56f0..52a4cfe55eb5 100644
--- a/kernel/power/wakelock.c
+++ b/kernel/power/wakelock.c
@@ -205,9 +205,6 @@ int pm_wake_lock(const char *buf)
size_t len;
int ret = 0;

- if (!capable(CAP_BLOCK_SUSPEND))
- return -EPERM;
-
while (*str && !isspace(*str))
str++;

@@ -251,9 +248,6 @@ int pm_wake_unlock(const char *buf)
size_t len;
int ret = 0;

- if (!capable(CAP_BLOCK_SUSPEND))
- return -EPERM;
-
len = strlen(buf);
if (!len)
return -EINVAL;
--
2.13.6


2018-12-30 14:46:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/2] [RFC] sysfs: Add hook for checking the file capability of opener

On Sun, Dec 30, 2018 at 09:28:54PM +0800, Lee, Chun-Yi wrote:
> There have some discussion in the following mail loop about checking
> capability in sysfs write handler:
> https://lkml.org/lkml/2018/9/13/978

A sysfs callback should not care about stuff like this.

Worst case, do a simple:
if (!capable(CAP_FOO))
return -EPERM

you don't care or need to worry about the file handle for that at all,
right?

> Sometimes we check the capability in sysfs implementation by using
> capable function.

Which should be fine, right?

> But the checking can be bypassed by opening sysfs
> file within an unprivileged process then writing the file within a
> privileged process. The tricking way has been exposed by Andy Lutomirski
> for CVE-2013-1959.

And who does this for a sysfs file? And why?

> Because the sysfs_ops does not forward the file descriptor to the
> show/store callback, there doesn't have chance to check the capability
> of file's opener.

Which is by design. If you care about open, you are using sysfs wrong.

> This patch adds the hook to sysfs_ops that allows
> different implementation in object and attribute levels for checking
> file capable before accessing sysfs interfaces.

No, please no.

> The callback function of kobject sysfs_ops is the first implementation
> of new hook. It casts attribute to kobj_attribute then calls the file
> capability callback function of attribute level. The same logic can
> be implemented in other sysfs file types, like: device, driver and
> bus type.
>
> The capability checking logic in wake_lock/wake_unlock sysfs interface
> is the first example for kobject. It will check the opener's capability.

Why doesn't the file permission of that sysfs file determine who can or
can not write to that file?

thanks,

greg k-h

2018-12-30 14:49:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
> The wake lock/unlock sysfs interfaces check that the writer must has
> CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
> by opening sysfs file within an unprivileged process and then writing
> the file within a privileged process. The tricking way has been exposed
> by Andy Lutomirski in CVE-2013-1959.

Don't you mean "open by privileged and then written by unprivileged?"
Or if not, exactly how is this a problem? You check the capabilities
when you do the write and if that is not allowed then, well

And you are checking the namespace of the person trying to do the write
when the write happens, which is correct here, right?

If you really want to mess with wake locks in a namespaced environment,
then put it in a real namespaced environment, which is {HUGE HINT} not
sysfs.

So no, this patch isn't ok...

thanks,

greg k-h

2018-12-31 09:40:42

by joeyli

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

Hi Greg,

On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote:
> On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
> > The wake lock/unlock sysfs interfaces check that the writer must has
> > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
> > by opening sysfs file within an unprivileged process and then writing
> > the file within a privileged process. The tricking way has been exposed
> > by Andy Lutomirski in CVE-2013-1959.
>
> Don't you mean "open by privileged and then written by unprivileged?"
> Or if not, exactly how is this a problem? You check the capabilities
> when you do the write and if that is not allowed then, well
>

Sorry for I didn't provide clear explanation.

The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission
has already relaxed for non-root user. Then the expected behavior is that non-root
process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs.

But, the CAP_BLOCK_SUSPEND restrict can be bypassed:

int main(int argc, char* argv[])
{
int fd, ret = 0;

fd = open("/sys/power/wake_lock", O_RDWR);
if (fd < 0)
err(1, "open wake_lock");

if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock
err(1, "dup2");
sleep(1);
execl("./string", "string"); //string has capability

return ret;
}

This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened
wake_lock sysfs and overwrited stdout. Then it executes the "string" program
that has CAP_BLOCK_SUSPEND. The string program writes to stdout, which means
that it writes to wake_lock. So an unpriviledged opener can trick an priviledged
writer for writing sysfs.

> And you are checking the namespace of the person trying to do the write
> when the write happens, which is correct here, right?
>
> If you really want to mess with wake locks in a namespaced environment,
> then put it in a real namespaced environment, which is {HUGE HINT} not
> sysfs.
>

I don't want to mess with wake locks in namespace.

> So no, this patch isn't ok...
>

Thanks a lot!
Joey Lee

2018-12-31 09:43:52

by joeyli

[permalink] [raw]
Subject: Re: [PATCH 0/2] [RFC] sysfs: Add hook for checking the file capability of opener

Hi Greg,

Thanks for your review!

On Sun, Dec 30, 2018 at 03:45:06PM +0100, Greg Kroah-Hartman wrote:
> On Sun, Dec 30, 2018 at 09:28:54PM +0800, Lee, Chun-Yi wrote:
> > There have some discussion in the following mail loop about checking
> > capability in sysfs write handler:
> > https://lkml.org/lkml/2018/9/13/978
>
> A sysfs callback should not care about stuff like this.
>
> Worst case, do a simple:
> if (!capable(CAP_FOO))
> return -EPERM
>
> you don't care or need to worry about the file handle for that at all,
> right?
>

The capable() can be bypassed. Unprivileged process may reads or writes
those sysfs if file permission be relaxed by root for non-root user.

> > Sometimes we check the capability in sysfs implementation by using
> > capable function.
>
> Which should be fine, right?
>

If file permission is enough to restrict sysfs that can only be used
by root. Why do some sysfs interfaces use capable()? It's not
redundancy?

> > But the checking can be bypassed by opening sysfs
> > file within an unprivileged process then writing the file within a
> > privileged process. The tricking way has been exposed by Andy Lutomirski
> > for CVE-2013-1959.
>
> And who does this for a sysfs file? And why?
>

Just want to bypass the capable() checking.

> > Because the sysfs_ops does not forward the file descriptor to the
> > show/store callback, there doesn't have chance to check the capability
> > of file's opener.
>
> Which is by design. If you care about open, you are using sysfs wrong.
>

OK~ So the sysfs doesn't care opener's capability.

> > This patch adds the hook to sysfs_ops that allows
> > different implementation in object and attribute levels for checking
> > file capable before accessing sysfs interfaces.
>
> No, please no.
>
> > The callback function of kobject sysfs_ops is the first implementation
> > of new hook. It casts attribute to kobj_attribute then calls the file
> > capability callback function of attribute level. The same logic can
> > be implemented in other sysfs file types, like: device, driver and
> > bus type.
> >
> > The capability checking logic in wake_lock/wake_unlock sysfs interface
> > is the first example for kobject. It will check the opener's capability.
>
> Why doesn't the file permission of that sysfs file determine who can or
> can not write to that file?
>

I agree that the file permission can restrict the writer of sysfs. But,
I still confused for why do some sysfs interface use capable()?

Thanks a lot!
Joey Lee

2018-12-31 10:40:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 0/2] [RFC] sysfs: Add hook for checking the file capability of opener

On Mon, Dec 31, 2018 at 05:41:05PM +0800, joeyli wrote:
> Hi Greg,
>
> Thanks for your review!
>
> On Sun, Dec 30, 2018 at 03:45:06PM +0100, Greg Kroah-Hartman wrote:
> > On Sun, Dec 30, 2018 at 09:28:54PM +0800, Lee, Chun-Yi wrote:
> > > There have some discussion in the following mail loop about checking
> > > capability in sysfs write handler:
> > > https://lkml.org/lkml/2018/9/13/978
> >
> > A sysfs callback should not care about stuff like this.
> >
> > Worst case, do a simple:
> > if (!capable(CAP_FOO))
> > return -EPERM
> >
> > you don't care or need to worry about the file handle for that at all,
> > right?
> >
>
> The capable() can be bypassed.

How?

> Unprivileged process may reads or writes those sysfs if file
> permission be relaxed by root for non-root user.

So if root says "I want a non-root user to be able to write to this
file" by changing its permissions, yes, a non-root user will be able to
write to them. But the capable() check will catch this, right?

And why is the kernel trying to protect userspace from itself? Root
explicitly asked for anyone to read/write this file, why do we think the
kernel knows better?

> > > Sometimes we check the capability in sysfs implementation by using
> > > capable function.
> >
> > Which should be fine, right?
> >
>
> If file permission is enough to restrict sysfs that can only be used
> by root. Why do some sysfs interfaces use capable()? It's not
> redundancy?

I don't know why some sysfs files use this, it was the choice of that
author it seems. Perhaps it is to "check" that the root user really
does have that capability. Some root users do not have all capabilities
if they were previously "dropped".

> > > But the checking can be bypassed by opening sysfs
> > > file within an unprivileged process then writing the file within a
> > > privileged process. The tricking way has been exposed by Andy Lutomirski
> > > for CVE-2013-1959.
> >
> > And who does this for a sysfs file? And why?
>
> Just want to bypass the capable() checking.

But why would you want to bypass this? I don't understand.

What is the use case you are trying to solve for here.

> > > Because the sysfs_ops does not forward the file descriptor to the
> > > show/store callback, there doesn't have chance to check the capability
> > > of file's opener.
> >
> > Which is by design. If you care about open, you are using sysfs wrong.
> >
>
> OK~ So the sysfs doesn't care opener's capability.

Exactly.

Except for the permission/mode that the file has on it. The vfs checks
this at open time and either allows it or forbids it.

> > > This patch adds the hook to sysfs_ops that allows
> > > different implementation in object and attribute levels for checking
> > > file capable before accessing sysfs interfaces.
> >
> > No, please no.
> >
> > > The callback function of kobject sysfs_ops is the first implementation
> > > of new hook. It casts attribute to kobj_attribute then calls the file
> > > capability callback function of attribute level. The same logic can
> > > be implemented in other sysfs file types, like: device, driver and
> > > bus type.
> > >
> > > The capability checking logic in wake_lock/wake_unlock sysfs interface
> > > is the first example for kobject. It will check the opener's capability.
> >
> > Why doesn't the file permission of that sysfs file determine who can or
> > can not write to that file?
> >
>
> I agree that the file permission can restrict the writer of sysfs. But,
> I still confused for why do some sysfs interface use capable()?

Go ask the individual authors who added those checks :)

thanks,

greg k-h

2018-12-31 10:43:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote:
> Hi Greg,
>
> On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote:
> > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
> > > The wake lock/unlock sysfs interfaces check that the writer must has
> > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
> > > by opening sysfs file within an unprivileged process and then writing
> > > the file within a privileged process. The tricking way has been exposed
> > > by Andy Lutomirski in CVE-2013-1959.
> >
> > Don't you mean "open by privileged and then written by unprivileged?"
> > Or if not, exactly how is this a problem? You check the capabilities
> > when you do the write and if that is not allowed then, well
> >
>
> Sorry for I didn't provide clear explanation.
>
> The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission
> has already relaxed for non-root user. Then the expected behavior is that non-root
> process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs.
>
> But, the CAP_BLOCK_SUSPEND restrict can be bypassed:
>
> int main(int argc, char* argv[])
> {
> int fd, ret = 0;
>
> fd = open("/sys/power/wake_lock", O_RDWR);
> if (fd < 0)
> err(1, "open wake_lock");
>
> if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock
> err(1, "dup2");
> sleep(1);
> execl("./string", "string"); //string has capability
>
> return ret;
> }
>
> This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened
> wake_lock sysfs and overwrited stdout. Then it executes the "string" program
> that has CAP_BLOCK_SUSPEND.

That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to
"string". If any user can run that program, there's nothing the kernel
can do about this, right? Just don't allow that program on the system :)

> The string program writes to stdout, which means that it writes to
> wake_lock. So an unpriviledged opener can trick an priviledged writer
> for writing sysfs.

That sounds like a userspace program that was somehow given incorrect
rights by the admin, and a user is taking advantage of it. That's not
the kernel's fault.

> > And you are checking the namespace of the person trying to do the write
> > when the write happens, which is correct here, right?
> >
> > If you really want to mess with wake locks in a namespaced environment,
> > then put it in a real namespaced environment, which is {HUGE HINT} not
> > sysfs.
> >
>
> I don't want to mess with wake locks in namespace.

Neither do I :)

so all should be fine, don't allow crazy executables with odd
permissions to be run by any user and you should be fine. That's always
been the case, right?

thanks,

greg k-h

2018-12-31 12:04:08

by Jann Horn

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote:
> > Hi Greg,
> >
> > On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote:
> > > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
> > > > The wake lock/unlock sysfs interfaces check that the writer must has
> > > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
> > > > by opening sysfs file within an unprivileged process and then writing
> > > > the file within a privileged process. The tricking way has been exposed
> > > > by Andy Lutomirski in CVE-2013-1959.
> > >
> > > Don't you mean "open by privileged and then written by unprivileged?"
> > > Or if not, exactly how is this a problem? You check the capabilities
> > > when you do the write and if that is not allowed then, well
> > >
> >
> > Sorry for I didn't provide clear explanation.
> >
> > The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission
> > has already relaxed for non-root user. Then the expected behavior is that non-root
> > process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs.
> >
> > But, the CAP_BLOCK_SUSPEND restrict can be bypassed:
> >
> > int main(int argc, char* argv[])
> > {
> > int fd, ret = 0;
> >
> > fd = open("/sys/power/wake_lock", O_RDWR);
> > if (fd < 0)
> > err(1, "open wake_lock");
> >
> > if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock
> > err(1, "dup2");
> > sleep(1);
> > execl("./string", "string"); //string has capability
> >
> > return ret;
> > }
> >
> > This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened
> > wake_lock sysfs and overwrited stdout. Then it executes the "string" program
> > that has CAP_BLOCK_SUSPEND.
>
> That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to
> "string". If any user can run that program, there's nothing the kernel
> can do about this, right? Just don't allow that program on the system :)
>
> > The string program writes to stdout, which means that it writes to
> > wake_lock. So an unpriviledged opener can trick an priviledged writer
> > for writing sysfs.
>
> That sounds like a userspace program that was somehow given incorrect
> rights by the admin, and a user is taking advantage of it. That's not
> the kernel's fault.

Isn't it? Pretty much any setuid program will write to stdout or
stderr; even the glibc linker code does so if you set LD_DEBUG.
(Normally the output isn't entirely attacker-controlled, but it is in
the case of stuff like "procmail", which I think Debian still ships as
setuid root.) setuid programs should always be able to safely call
read() and write() on caller-provided file descriptors. Also, you're
supposed to be able to receive file descriptors over unix domain
sockets and then write to them without trusting the sender. Basically,
the ->read and ->write VFS handlers should never look at the caller's
credentials, only the opener's (with the exception of LSMs, which tend
to do weird things to the system's security model).

2018-12-31 12:34:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface

On Mon, Dec 31, 2018 at 01:02:35PM +0100, Jann Horn wrote:
> On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote:
> > > Hi Greg,
> > >
> > > On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote:
> > > > On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
> > > > > The wake lock/unlock sysfs interfaces check that the writer must has
> > > > > CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
> > > > > by opening sysfs file within an unprivileged process and then writing
> > > > > the file within a privileged process. The tricking way has been exposed
> > > > > by Andy Lutomirski in CVE-2013-1959.
> > > >
> > > > Don't you mean "open by privileged and then written by unprivileged?"
> > > > Or if not, exactly how is this a problem? You check the capabilities
> > > > when you do the write and if that is not allowed then, well
> > > >
> > >
> > > Sorry for I didn't provide clear explanation.
> > >
> > > The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission
> > > has already relaxed for non-root user. Then the expected behavior is that non-root
> > > process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs.
> > >
> > > But, the CAP_BLOCK_SUSPEND restrict can be bypassed:
> > >
> > > int main(int argc, char* argv[])
> > > {
> > > int fd, ret = 0;
> > >
> > > fd = open("/sys/power/wake_lock", O_RDWR);
> > > if (fd < 0)
> > > err(1, "open wake_lock");
> > >
> > > if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock
> > > err(1, "dup2");
> > > sleep(1);
> > > execl("./string", "string"); //string has capability
> > >
> > > return ret;
> > > }
> > >
> > > This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened
> > > wake_lock sysfs and overwrited stdout. Then it executes the "string" program
> > > that has CAP_BLOCK_SUSPEND.
> >
> > That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to
> > "string". If any user can run that program, there's nothing the kernel
> > can do about this, right? Just don't allow that program on the system :)
> >
> > > The string program writes to stdout, which means that it writes to
> > > wake_lock. So an unpriviledged opener can trick an priviledged writer
> > > for writing sysfs.
> >
> > That sounds like a userspace program that was somehow given incorrect
> > rights by the admin, and a user is taking advantage of it. That's not
> > the kernel's fault.
>
> Isn't it? Pretty much any setuid program will write to stdout or
> stderr; even the glibc linker code does so if you set LD_DEBUG.
> (Normally the output isn't entirely attacker-controlled, but it is in
> the case of stuff like "procmail", which I think Debian still ships as
> setuid root.) setuid programs should always be able to safely call
> read() and write() on caller-provided file descriptors. Also, you're
> supposed to be able to receive file descriptors over unix domain
> sockets and then write to them without trusting the sender. Basically,
> the ->read and ->write VFS handlers should never look at the caller's
> credentials, only the opener's (with the exception of LSMs, which tend
> to do weird things to the system's security model).

So a root program gets the file handle to the sysfs file and then passes
it off to a setuid program and the kernel should somehow protect from
this?

I think that any sysfs file that is relying on the capable() check
should just set their permissions properly first, and then it should be
ok.

thanks,

greg k-h

2018-12-31 16:16:42

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH 2/2] PM / Sleep: Check the file capability when writing wake lock interface



> On Dec 31, 2018, at 5:33 AM, Greg Kroah-Hartman <[email protected]> wrote:
>
>> On Mon, Dec 31, 2018 at 01:02:35PM +0100, Jann Horn wrote:
>> On Mon, Dec 31, 2018 at 11:41 AM Greg Kroah-Hartman
>> <[email protected]> wrote:
>>>
>>>> On Mon, Dec 31, 2018 at 05:38:51PM +0800, joeyli wrote:
>>>> Hi Greg,
>>>>
>>>>> On Sun, Dec 30, 2018 at 03:48:35PM +0100, Greg Kroah-Hartman wrote:
>>>>>> On Sun, Dec 30, 2018 at 09:28:56PM +0800, Lee, Chun-Yi wrote:
>>>>>> The wake lock/unlock sysfs interfaces check that the writer must has
>>>>>> CAP_BLOCK_SUSPEND capability. But the checking logic can be bypassed
>>>>>> by opening sysfs file within an unprivileged process and then writing
>>>>>> the file within a privileged process. The tricking way has been exposed
>>>>>> by Andy Lutomirski in CVE-2013-1959.
>>>>>
>>>>> Don't you mean "open by privileged and then written by unprivileged?"
>>>>> Or if not, exactly how is this a problem? You check the capabilities
>>>>> when you do the write and if that is not allowed then, well
>>>>>
>>>>
>>>> Sorry for I didn't provide clear explanation.
>>>>
>>>> The privileged means CAP_BLOCK_SUSPEND but not file permission. The file permission
>>>> has already relaxed for non-root user. Then the expected behavior is that non-root
>>>> process must has CAP_BLOCK_SUSPEND capability for writing wake_lock sysfs.
>>>>
>>>> But, the CAP_BLOCK_SUSPEND restrict can be bypassed:
>>>>
>>>> int main(int argc, char* argv[])
>>>> {
>>>> int fd, ret = 0;
>>>>
>>>> fd = open("/sys/power/wake_lock", O_RDWR);
>>>> if (fd < 0)
>>>> err(1, "open wake_lock");
>>>>
>>>> if (dup2(fd, 1) != 1) // overwrite the stdout with wake_lock
>>>> err(1, "dup2");
>>>> sleep(1);
>>>> execl("./string", "string"); //string has capability
>>>>
>>>> return ret;
>>>> }
>>>>
>>>> This program is an unpriviledged process (has no CAP_BLOCK_SUSPEND), it opened
>>>> wake_lock sysfs and overwrited stdout. Then it executes the "string" program
>>>> that has CAP_BLOCK_SUSPEND.
>>>
>>> That's the problem right there, do not give CAP_BLOCK_SUSPEND rights to
>>> "string". If any user can run that program, there's nothing the kernel
>>> can do about this, right? Just don't allow that program on the system :)
>>>
>>>> The string program writes to stdout, which means that it writes to
>>>> wake_lock. So an unpriviledged opener can trick an priviledged writer
>>>> for writing sysfs.
>>>
>>> That sounds like a userspace program that was somehow given incorrect
>>> rights by the admin, and a user is taking advantage of it. That's not
>>> the kernel's fault.
>>
>> Isn't it? Pretty much any setuid program will write to stdout or
>> stderr; even the glibc linker code does so if you set LD_DEBUG.
>> (Normally the output isn't entirely attacker-controlled, but it is in
>> the case of stuff like "procmail", which I think Debian still ships as
>> setuid root.) setuid programs should always be able to safely call
>> read() and write() on caller-provided file descriptors. Also, you're
>> supposed to be able to receive file descriptors over unix domain
>> sockets and then write to them without trusting the sender. Basically,
>> the ->read and ->write VFS handlers should never look at the caller's
>> credentials, only the opener's (with the exception of LSMs, which tend
>> to do weird things to the system's security model).
>
> So a root program gets the file handle to the sysfs file and then passes
> it off to a setuid program and the kernel should somehow protect from
> this?

Yes, the kernel should. If the kernel wants to check caps, it should do it right.

Calling capable() from a .write handler is wrong, even in sysfs.

>
> I think that any sysfs file that is relying on the capable() check
> should just set their permissions properly first, and then it should be
> ok.
>

Probably true.


> thanks,
>
> greg k-h