Hi,
With help from subsystem maintainers, I've managed to get some of
get_unused_fd() calls converted to use get_unused_fd_flags(O_CLOEXEC)
instead. [ANDROID][IB][VFIO]
KVM subsystem maintainers helped me to change calls to anon_inode_getfd()
to use O_CLOEXEC by default. [KVM]
Some maintainers applied my patches to convert get_unused_fd() to
get_unused_fd_flags(0) were using O_CLOEXEC wasn't possible without breaking ABI.
[SCTP][XFS]
So, still in the hope to get rid of get_unused_fd(), please find a another
attempt to remove get_unused_fd() macro and encourage subsystems to use
get_unused_fd_flags(O_CLOEXEC) or anon_inode_getfd(O_CLOEXEC) by default
were appropriate.
The patchset convert all calls to get_unused_fd()
to get_unused_fd_flags(0) before removing get_unused_fd() macro.
Without get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be teached to provide an API that let userspace
choose the opening flags of the file descriptor.
Additionally I'm suggesting Industrial IO (IIO) subsystem to use
anon_inode_getfd(O_CLOEXEC): it's the only subsystem using anon_inode_getfd()
with a fixed set of flags not including O_CLOEXEC.
Not using O_CLOEXEC by default or not letting userspace provide the "open" flags
should be considered bad practice from security point of view:
in most case O_CLOEXEC must be used to not leak file descriptor across exec().
Using O_CLOEXEC by default when flags are not provided by userspace allows it
to choose, without race, if the file descriptor is going to be inherited
across exec().
Status:
In linux-next tag 20130906, they're currently:
- 22 calls to get_unused_fd_flags() (+3)
not counting get_unused_fd() and anon_inode_getfd()
- 7 calls to get_unused_fd() (-3)
- 11 calls to anon_inode_getfd() (0)
Changes from patchset v2 [PATCHSETv2]:
- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream
- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream
- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied upstream.
Additionally subsystem maintainer applied another patch on top
to set the flags to O_CLOEXEC.
- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
NEW: propose to use O_CLOEXEC as default flag.
Links:
[ANDROID]
http://lkml.kernel.org/r/CACSP8SjXGMk2_kX_+RgzqqQwqKernvF1Wt3K5tw991W5dfAnCA@mail.gmail.com
http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w@mail.gmail.com
[IB]
http://lkml.kernel.org/r/CAL1RGDXV1_BjSLrQDFdVQ1_D75+bMtOtikHOUp8VBiy_OJUf=w@mail.gmail.com
[VFIO]
http://lkml.kernel.org/r/[email protected]
[KVM]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]
[SCTP]
http://lkml.kernel.org/r/[email protected]
http://lkml.kernel.org/r/[email protected]
[XFS]
http://lkml.kernel.org/r/[email protected]
[PATCHSETv2]
http://lkml.kernel.org/r/[email protected]
Yann Droneaud (8):
ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
file: use get_unused_fd_flags(0) instead of get_unused_fd()
fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
events: use get_unused_fd_flags(0) instead of get_unused_fd()
file: remove get_unused_fd()
industrialio: use anon_inode_getfd() with O_CLOEXEC flag
arch/ia64/kernel/perfmon.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
drivers/iio/industrialio-event.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/file.c | 2 +-
fs/notify/fanotify/fanotify_user.c | 2 +-
include/linux/file.h | 1 -
kernel/events/core.c | 2 +-
8 files changed, 8 insertions(+), 9 deletions(-)
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/ia64/kernel/perfmon.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 5a9ff1c..64757c1 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2668,7 +2668,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg
ret = -ENOMEM;
- fd = get_unused_fd();
+ fd = get_unused_fd_flags(0);
if (fd < 0)
return fd;
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/file.c b/fs/file.c
index 4a78f98..1420d28 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -897,7 +897,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
struct file *file = fget_raw(fildes);
if (file) {
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret >= 0)
fd_install(ret, file);
else
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf..51effce 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
int ret;
struct file *filp;
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;
@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
int ret;
struct file *filp;
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/notify/fanotify/fanotify_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e44cb64..644b9a7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -69,7 +69,7 @@ static int create_fd(struct fsnotify_group *group,
pr_debug("%s: group=%p event=%p\n", __func__, group, event);
- client_fd = get_unused_fd();
+ client_fd = get_unused_fd_flags(0);
if (client_fd < 0)
return client_fd;
--
1.8.3.1
Macro get_unused_fd() allocates a file descriptor without O_CLOEXEC flag.
This can be seen as an unsafe default: in most case O_CLOEXEC
must be used to not leak file descriptor across exec().
Using O_CLOEXEC by default allows userspace to choose, without race,
if the file descriptor is going to be inherited across exec().
This patch removes get_unused_fd() so that newer kernel code use
anon_inode_getfd() or get_unused_fd_flags() with flags provided
by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
include/linux/file.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/include/linux/file.h b/include/linux/file.h
index cbacf4f..8666002 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -63,7 +63,6 @@ extern void set_close_on_exec(unsigned int fd, int flag);
extern bool get_close_on_exec(unsigned int fd);
extern void put_filp(struct file *);
extern int get_unused_fd_flags(unsigned flags);
-#define get_unused_fd() get_unused_fd_flags(0)
extern void put_unused_fd(unsigned int fd);
extern void fd_install(unsigned int fd, struct file *file);
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2207efc..b12fdd3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6935,7 +6935,7 @@ SYSCALL_DEFINE5(perf_event_open,
if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
return -EINVAL;
- event_fd = get_unused_fd();
+ event_fd = get_unused_fd_flags(0);
if (event_fd < 0)
return event_fd;
--
1.8.3.1
IIO uses anon_inode_get() to allocate file descriptors as part
of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file
descriptor.
In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().
KVM usage of anon_inode_getfd() was fixed in a previous patchset [1],
so IIO is the only subsystem using anon_inode_getfd() with a fixed set
of flags not including O_CLOEXEC.
This patch set O_CLOEXEC flag on the event file descriptor created
with anon_inode_getfd() to not leak file descriptors across exec().
Links:
- Secure File Descriptor Handling (Ulrich Drepper, 2008)
http://udrepper.livejournal.com/20407.html
- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
http://danwalsh.livejournal.com/53603.html
- [1] kvm: use anon_inode_getfd() with O_CLOEXEC flag
http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
drivers/iio/industrialio-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iio/industrialio-event.c b/drivers/iio/industrialio-event.c
index 10aa9ef..2390e3d 100644
--- a/drivers/iio/industrialio-event.c
+++ b/drivers/iio/industrialio-event.c
@@ -159,7 +159,7 @@ int iio_event_getfd(struct iio_dev *indio_dev)
}
spin_unlock_irq(&ev_int->wait.lock);
fd = anon_inode_getfd("iio:event",
- &iio_event_chrdev_fileops, ev_int, O_RDONLY);
+ &iio_event_chrdev_fileops, ev_int, O_RDONLY | O_CLOEXEC);
if (fd < 0) {
spin_lock_irq(&ev_int->wait.lock);
__clear_bit(IIO_BUSY_BIT_POS, &ev_int->flags);
--
1.8.3.1
Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().
Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.
In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.
This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.
The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.
Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/binfmt_misc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 1c740e1..052f6dc 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -138,7 +138,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
/* if the binary should be opened on behalf of the
* interpreter than keep it open and assign descriptor
* to it */
- fd_binary = get_unused_fd();
+ fd_binary = get_unused_fd_flags(0);
if (fd_binary < 0) {
retval = fd_binary;
goto _ret;
--
1.8.3.1
On 09/06/13 11:39, Yann Droneaud wrote:
> IIO uses anon_inode_get() to allocate file descriptors as part
> of its ioctls. But those ioctls are lacking a flag argument
> allowing userspace to choose options for the newly opened file
> descriptor.
>
> In such case it's advised to use O_CLOEXEC by default so that
> userspace is allowed to choose, without race, if the file descriptor
> is going to be inherited across exec().
>
> KVM usage of anon_inode_getfd() was fixed in a previous patchset [1],
> so IIO is the only subsystem using anon_inode_getfd() with a fixed set
> of flags not including O_CLOEXEC.
>
> This patch set O_CLOEXEC flag on the event file descriptor created
> with anon_inode_getfd() to not leak file descriptors across exec().
>
> Links:
>
> - Secure File Descriptor Handling (Ulrich Drepper, 2008)
> http://udrepper.livejournal.com/20407.html
>
> - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
> http://danwalsh.livejournal.com/53603.html
>
> - [1] kvm: use anon_inode_getfd() with O_CLOEXEC flag
> http://lkml.kernel.org/r/[email protected]
>
> Signed-off-by: Yann Droneaud <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
Thanks Yann. I had no idea about this issue but your well supported description
above made this easy to review.
Applied to the togreg branch of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
as it's not a regression but if there is support for pushing this back to stable
I personally am not against it.
Jonathan
>
---
> drivers/iio/industrialio-event.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iio/industrialio-event.c b/drivers/iio/industrialio-event.c
> index 10aa9ef..2390e3d 100644
> --- a/drivers/iio/industrialio-event.c
> +++ b/drivers/iio/industrialio-event.c
> @@ -159,7 +159,7 @@ int iio_event_getfd(struct iio_dev *indio_dev)
> }
> spin_unlock_irq(&ev_int->wait.lock);
> fd = anon_inode_getfd("iio:event",
> - &iio_event_chrdev_fileops, ev_int, O_RDONLY);
> + &iio_event_chrdev_fileops, ev_int, O_RDONLY | O_CLOEXEC);
> if (fd < 0) {
> spin_lock_irq(&ev_int->wait.lock);
> __clear_bit(IIO_BUSY_BIT_POS, &ev_int->flags);
>
Hi,
Le dimanche 15 septembre 2013 à 17:26 +0100, Jonathan Cameron a écrit :
> On 09/06/13 11:39, Yann Droneaud wrote:
[...]
> > http://lkml.kernel.org/r/[email protected]
> >
> > Signed-off-by: Yann Droneaud <[email protected]>
> > Link: http://lkml.kernel.org/r/[email protected]
>
> Thanks Yann. I had no idea about this issue but your well supported description
> above made this easy to review.
>
> Applied to the togreg branch of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
> as it's not a regression but if there is support for pushing this back to stable
> I personally am not against it.
>
It's not a -stable materiel. So let it flow to v3.12 or v3.13.
Thanks for the review.
Regards.
--
Yann Droneaud
OPTEYA