2013-09-06 10:41:14

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 0/8] Getting rid of get_unused_fd() / use O_CLOEXEC

Hi,

With help from subsystem maintainers, I've managed to get some of
get_unused_fd() calls converted to use get_unused_fd_flags(O_CLOEXEC)
instead. [ANDROID][IB][VFIO]

KVM subsystem maintainers helped me to change calls to anon_inode_getfd()
to use O_CLOEXEC by default. [KVM]

Some maintainers applied my patches to convert get_unused_fd() to
get_unused_fd_flags(0) were using O_CLOEXEC wasn't possible without breaking ABI.
[SCTP][XFS]

So, still in the hope to get rid of get_unused_fd(), please find a another
attempt to remove get_unused_fd() macro and encourage subsystems to use
get_unused_fd_flags(O_CLOEXEC) or anon_inode_getfd(O_CLOEXEC) by default
were appropriate.

The patchset convert all calls to get_unused_fd()
to get_unused_fd_flags(0) before removing get_unused_fd() macro.

Without get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be teached to provide an API that let userspace
choose the opening flags of the file descriptor.

Additionally I'm suggesting Industrial IO (IIO) subsystem to use
anon_inode_getfd(O_CLOEXEC): it's the only subsystem using anon_inode_getfd()
with a fixed set of flags not including O_CLOEXEC.

Not using O_CLOEXEC by default or not letting userspace provide the "open" flags
should be considered bad practice from security point of view:
in most case O_CLOEXEC must be used to not leak file descriptor across exec().
Using O_CLOEXEC by default when flags are not provided by userspace allows it
to choose, without race, if the file descriptor is going to be inherited
across exec().

Status:

In linux-next tag 20130906, they're currently:

- 22 calls to get_unused_fd_flags() (+3)
not counting get_unused_fd() and anon_inode_getfd()
- 7 calls to get_unused_fd() (-3)
- 11 calls to anon_inode_getfd() (0)

Changes from patchset v2 [PATCHSETv2]:

- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream

- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream

- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied upstream.
Additionally subsystem maintainer applied another patch on top
to set the flags to O_CLOEXEC.

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
NEW: propose to use O_CLOEXEC as default flag.

Links:

[ANDROID]
http://lkml.kernel.org/r/CACSP8SjXGMk2_kX_+RgzqqQwqKernvF1Wt3K5tw991W5dfAnCA@mail.gmail.com
http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w@mail.gmail.com

[IB]
http://lkml.kernel.org/r/CAL1RGDXV1_BjSLrQDFdVQ1_D75+bMtOtikHOUp8VBiy_OJUf=w@mail.gmail.com

[VFIO]
http://lkml.kernel.org/r/[email protected]

[KVM]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]

[SCTP]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]

[XFS]
http://lkml.kernel.org/r/[email protected]

[PATCHSETv2]
http://lkml.kernel.org/r/[email protected]

Yann Droneaud (8):
ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
file: use get_unused_fd_flags(0) instead of get_unused_fd()
fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
events: use get_unused_fd_flags(0) instead of get_unused_fd()
file: remove get_unused_fd()
industrialio: use anon_inode_getfd() with O_CLOEXEC flag

arch/ia64/kernel/perfmon.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
drivers/iio/industrialio-event.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/file.c | 2 +-
fs/notify/fanotify/fanotify_user.c | 2 +-
include/linux/file.h | 1 -
kernel/events/core.c | 2 +-
8 files changed, 8 insertions(+), 9 deletions(-)

--
1.8.3.1


2013-09-06 10:41:36

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 1/8] ia64: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/ia64/kernel/perfmon.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 5a9ff1c..64757c1 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2668,7 +2668,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg

ret = -ENOMEM;

- fd = get_unused_fd();
+ fd = get_unused_fd_flags(0);
if (fd < 0)
return fd;

--
1.8.3.1

2013-09-06 10:41:49

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 4/8] file: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 4a78f98..1420d28 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -897,7 +897,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
struct file *file = fget_raw(fildes);

if (file) {
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret >= 0)
fd_install(ret, file);
else
--
1.8.3.1

2013-09-06 10:41:44

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 2/8] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf..51effce 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

--
1.8.3.1

2013-09-06 10:42:01

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 5/8] fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/notify/fanotify/fanotify_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e44cb64..644b9a7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -69,7 +69,7 @@ static int create_fd(struct fsnotify_group *group,

pr_debug("%s: group=%p event=%p\n", __func__, group, event);

- client_fd = get_unused_fd();
+ client_fd = get_unused_fd_flags(0);
if (client_fd < 0)
return client_fd;

--
1.8.3.1

2013-09-06 10:42:15

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 7/8] file: remove get_unused_fd()

Macro get_unused_fd() allocates a file descriptor without O_CLOEXEC flag.

This can be seen as an unsafe default: in most case O_CLOEXEC
must be used to not leak file descriptor across exec().

Using O_CLOEXEC by default allows userspace to choose, without race,
if the file descriptor is going to be inherited across exec().

This patch removes get_unused_fd() so that newer kernel code use
anon_inode_getfd() or get_unused_fd_flags() with flags provided
by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
include/linux/file.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/include/linux/file.h b/include/linux/file.h
index cbacf4f..8666002 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -63,7 +63,6 @@ extern void set_close_on_exec(unsigned int fd, int flag);
extern bool get_close_on_exec(unsigned int fd);
extern void put_filp(struct file *);
extern int get_unused_fd_flags(unsigned flags);
-#define get_unused_fd() get_unused_fd_flags(0)
extern void put_unused_fd(unsigned int fd);

extern void fd_install(unsigned int fd, struct file *file);
--
1.8.3.1

2013-09-06 10:42:08

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 6/8] events: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2207efc..b12fdd3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6935,7 +6935,7 @@ SYSCALL_DEFINE5(perf_event_open,
if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
return -EINVAL;

- event_fd = get_unused_fd();
+ event_fd = get_unused_fd_flags(0);
if (event_fd < 0)
return event_fd;

--
1.8.3.1

2013-09-06 10:42:22

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 8/8] industrialio: use anon_inode_getfd() with O_CLOEXEC flag

IIO uses anon_inode_get() to allocate file descriptors as part
of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file
descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

KVM usage of anon_inode_getfd() was fixed in a previous patchset [1],
so IIO is the only subsystem using anon_inode_getfd() with a fixed set
of flags not including O_CLOEXEC.

This patch set O_CLOEXEC flag on the event file descriptor created
with anon_inode_getfd() to not leak file descriptors across exec().

Links:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
http://danwalsh.livejournal.com/53603.html

- [1] kvm: use anon_inode_getfd() with O_CLOEXEC flag
http://lkml.kernel.org/r/[email protected]

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
drivers/iio/industrialio-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/industrialio-event.c b/drivers/iio/industrialio-event.c
index 10aa9ef..2390e3d 100644
--- a/drivers/iio/industrialio-event.c
+++ b/drivers/iio/industrialio-event.c
@@ -159,7 +159,7 @@ int iio_event_getfd(struct iio_dev *indio_dev)
}
spin_unlock_irq(&ev_int->wait.lock);
fd = anon_inode_getfd("iio:event",
- &iio_event_chrdev_fileops, ev_int, O_RDONLY);
+ &iio_event_chrdev_fileops, ev_int, O_RDONLY | O_CLOEXEC);
if (fd < 0) {
spin_lock_irq(&ev_int->wait.lock);
__clear_bit(IIO_BUSY_BIT_POS, &ev_int->flags);
--
1.8.3.1

2013-09-06 10:41:43

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v3 3/8] binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/binfmt_misc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 1c740e1..052f6dc 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -138,7 +138,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
/* if the binary should be opened on behalf of the
* interpreter than keep it open and assign descriptor
* to it */
- fd_binary = get_unused_fd();
+ fd_binary = get_unused_fd_flags(0);
if (fd_binary < 0) {
retval = fd_binary;
goto _ret;
--
1.8.3.1

2013-09-15 15:26:45

by Jonathan Cameron

[permalink] [raw]
Subject: Re: [PATCH v3 8/8] industrialio: use anon_inode_getfd() with O_CLOEXEC flag

On 09/06/13 11:39, Yann Droneaud wrote:
> IIO uses anon_inode_get() to allocate file descriptors as part
> of its ioctls. But those ioctls are lacking a flag argument
> allowing userspace to choose options for the newly opened file
> descriptor.
>
> In such case it's advised to use O_CLOEXEC by default so that
> userspace is allowed to choose, without race, if the file descriptor
> is going to be inherited across exec().
>
> KVM usage of anon_inode_getfd() was fixed in a previous patchset [1],
> so IIO is the only subsystem using anon_inode_getfd() with a fixed set
> of flags not including O_CLOEXEC.
>
> This patch set O_CLOEXEC flag on the event file descriptor created
> with anon_inode_getfd() to not leak file descriptors across exec().
>
> Links:
>
> - Secure File Descriptor Handling (Ulrich Drepper, 2008)
> http://udrepper.livejournal.com/20407.html
>
> - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
> http://danwalsh.livejournal.com/53603.html
>
> - [1] kvm: use anon_inode_getfd() with O_CLOEXEC flag
> http://lkml.kernel.org/r/[email protected]
>
> Signed-off-by: Yann Droneaud <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]

Thanks Yann. I had no idea about this issue but your well supported description
above made this easy to review.

Applied to the togreg branch of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
as it's not a regression but if there is support for pushing this back to stable
I personally am not against it.

Jonathan
>
---
> drivers/iio/industrialio-event.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iio/industrialio-event.c b/drivers/iio/industrialio-event.c
> index 10aa9ef..2390e3d 100644
> --- a/drivers/iio/industrialio-event.c
> +++ b/drivers/iio/industrialio-event.c
> @@ -159,7 +159,7 @@ int iio_event_getfd(struct iio_dev *indio_dev)
> }
> spin_unlock_irq(&ev_int->wait.lock);
> fd = anon_inode_getfd("iio:event",
> - &iio_event_chrdev_fileops, ev_int, O_RDONLY);
> + &iio_event_chrdev_fileops, ev_int, O_RDONLY | O_CLOEXEC);
> if (fd < 0) {
> spin_lock_irq(&ev_int->wait.lock);
> __clear_bit(IIO_BUSY_BIT_POS, &ev_int->flags);
>

2013-09-15 16:34:09

by Yann Droneaud

[permalink] [raw]
Subject: Re: [PATCH v3 8/8] industrialio: use anon_inode_getfd() with O_CLOEXEC flag

Hi,

Le dimanche 15 septembre 2013 à 17:26 +0100, Jonathan Cameron a écrit :
> On 09/06/13 11:39, Yann Droneaud wrote:

[...]

> > http://lkml.kernel.org/r/[email protected]
> >
> > Signed-off-by: Yann Droneaud <[email protected]>
> > Link: http://lkml.kernel.org/r/[email protected]
>
> Thanks Yann. I had no idea about this issue but your well supported description
> above made this easy to review.
>
> Applied to the togreg branch of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
> as it's not a regression but if there is support for pushing this back to stable
> I personally am not against it.
>

It's not a -stable materiel. So let it flow to v3.12 or v3.13.

Thanks for the review.

Regards.

--
Yann Droneaud
OPTEYA