2013-10-30 19:49:10

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 0/7] Getting rid of get_unused_fd()

Hi,

Please find the fourth revision of my patchset to remove get_unused_fd()
macro in order to encourage subsystems to use get_unused_fd_flags() or
anon_inode_getfd() with open flags set to O_CLOEXEC were appropriate.

The patchset convert all calls to get_unused_fd() to
get_unused_fd_flags(0) before removing get_unused_fd() macro.

Without get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be teached to provide an API that let userspace
choose the opening flags of the file descriptor.

Not using O_CLOEXEC by default or not letting userspace provide the
"open" flags should be considered bad practice from security point
of view: in most case O_CLOEXEC must be used to not leak file descriptor
across exec().

Using O_CLOEXEC by default when flags are not provided by userspace
allows userspace to set, using fcntl(), without any risk of race,
if the file descriptor is going to be inherited or not across exec().

Status:

In linux-next tag 20131029, they're currently:

- 32 calls to fd_install()
- 23 calls to get_unused_fd_flags()
- 11 calls to anon_inode_getfd()
- 7 calls to get_unused_fd()

Changes from patchset v3 [PATCHSETv3]:

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
DROPPED: applied upstream

Changes from patchset v2 [PATCHSETv2]:

- android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream

- android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()
DROPPED: applied upstream

- vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied upstream.
Additionally subsystem maintainer applied another patch on top
to set the flags to O_CLOEXEC.

- industrialio: use anon_inode_getfd() with O_CLOEXEC flag
NEW: propose to use O_CLOEXEC as default flag.

Changes from patchset v1 [PATCHSETv1]:

- explicitly added subsystem maintainers as mail recepients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: subsystem maintainer applied another patch
using get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied asis by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied asis by subsystem maintainer.

Links:

[PATCHSETv3]
http://lkml.kernel.org/r/[email protected]

[PATCHSETv2]
http://lkml.kernel.org/r/[email protected]

[PATCHSETv1]
http://lkml.kernel.org/r/[email protected]

Yann Droneaud (7):
ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
file: use get_unused_fd_flags(0) instead of get_unused_fd()
fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
events: use get_unused_fd_flags(0) instead of get_unused_fd()
file: remove get_unused_fd()

arch/ia64/kernel/perfmon.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
fs/binfmt_misc.c | 2 +-
fs/file.c | 2 +-
fs/notify/fanotify/fanotify_user.c | 2 +-
include/linux/file.h | 1 -
kernel/events/core.c | 2 +-
7 files changed, 7 insertions(+), 8 deletions(-)

--
1.8.3.1


2013-10-30 19:49:32

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 1/7] ia64: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/ia64/kernel/perfmon.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 5a9ff1c..64757c1 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2668,7 +2668,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg

ret = -ENOMEM;

- fd = get_unused_fd();
+ fd = get_unused_fd_flags(0);
if (fd < 0)
return fd;

--
1.8.3.1

2013-10-30 19:49:44

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 2/7] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index 87ba7cf..51effce 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

--
1.8.3.1

2013-10-30 19:50:41

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 3/7] binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/binfmt_misc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 1c740e1..052f6dc 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -138,7 +138,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
/* if the binary should be opened on behalf of the
* interpreter than keep it open and assign descriptor
* to it */
- fd_binary = get_unused_fd();
+ fd_binary = get_unused_fd_flags(0);
if (fd_binary < 0) {
retval = fd_binary;
goto _ret;
--
1.8.3.1

2013-10-30 19:50:46

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 4/7] file: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 4a78f98..1420d28 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -897,7 +897,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
struct file *file = fget_raw(fildes);

if (file) {
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret >= 0)
fd_install(ret, file);
else
--
1.8.3.1

2013-10-30 19:50:58

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 5/7] fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
fs/notify/fanotify/fanotify_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e44cb64..644b9a7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -69,7 +69,7 @@ static int create_fd(struct fsnotify_group *group,

pr_debug("%s: group=%p event=%p\n", __func__, group, event);

- client_fd = get_unused_fd();
+ client_fd = get_unused_fd_flags(0);
if (client_fd < 0)
return client_fd;

--
1.8.3.1

2013-10-30 19:51:34

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 7/7] file: remove get_unused_fd() macro

Macro get_unused_fd() allocates a file descriptor without O_CLOEXEC flag.

This can be seen as an unsafe default: in most case O_CLOEXEC
must be used to not leak file descriptor across exec().

Using O_CLOEXEC by default allows userspace to choose, without race,
if the file descriptor is going to be inherited across exec(),
by calling fcntl() when needed.

This patch removes get_unused_fd() so that newer kernel code use
anon_inode_getfd() or get_unused_fd_flags() with flags provided
by userspace. If flags cannot be given by userspace, O_CLOEXEC must be
used by default.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
include/linux/file.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/include/linux/file.h b/include/linux/file.h
index cbacf4f..8666002 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -63,7 +63,6 @@ extern void set_close_on_exec(unsigned int fd, int flag);
extern bool get_close_on_exec(unsigned int fd);
extern void put_filp(struct file *);
extern int get_unused_fd_flags(unsigned flags);
-#define get_unused_fd() get_unused_fd_flags(0)
extern void put_unused_fd(unsigned int fd);

extern void fd_install(unsigned int fd, struct file *file);
--
1.8.3.1

2013-10-30 19:51:33

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()

Macro get_unused_fd() is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to not leak file descriptor
across exec().

Instead of macro get_unused_fd(), functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If not possible, flags should be set to O_CLOEXEC to provide userspace
with a default safe behavor.

In a further patch, get_unused_fd() will be removed so that
new code start using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent call to
get_unused_fd_flags(0) to preserve current behavor for existing code.

The hard coded flag value (0) should be reviewed on a per-subsystem basis,
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 354a59c..142d13c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6970,7 +6970,7 @@ SYSCALL_DEFINE5(perf_event_open,
if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
return -EINVAL;

- event_fd = get_unused_fd();
+ event_fd = get_unused_fd_flags(0);
if (event_fd < 0)
return event_fd;

--
1.8.3.1

2013-10-30 20:20:53

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()

On Wed, Oct 30, 2013 at 08:47:46PM +0100, Yann Droneaud wrote:
> This patch replaces calls to get_unused_fd() with equivalent call to
> get_unused_fd_flags(0) to preserve current behavor for existing code.
>
> The hard coded flag value (0) should be reviewed on a per-subsystem basis,
> and, if possible, set to O_CLOEXEC.

And how am I supposed to know if that is 'possible'? You provide a total
number of 0 useful clues on how to determine this.

2013-10-30 21:19:01

by Yann Droneaud

[permalink] [raw]
Subject: Re: [PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()

Hi,

Le 30.10.2013 21:20, Peter Zijlstra a écrit :
> On Wed, Oct 30, 2013 at 08:47:46PM +0100, Yann Droneaud wrote:
>> This patch replaces calls to get_unused_fd() with equivalent call to
>> get_unused_fd_flags(0) to preserve current behavor for existing code.
>>
>> The hard coded flag value (0) should be reviewed on a per-subsystem
>> basis,
>> and, if possible, set to O_CLOEXEC.
>
> And how am I supposed to know if that is 'possible'? You provide a
> total
> number of 0 useful clues on how to determine this.

Fair.

Short: Will it break kernel ABI ? If no, use O_CLOEXEC, if yes, use 0.

Long: If userspace expect to retrieve a file descriptor with plain old
Unix(tm)
semantics, O_CLOEXEC must not be the default, as it could break
some applications
expecting that the file descriptor will be inherited during
exec().

But for some subsystems, such as InfiniBand, KVM, VFIO, it make no
sense to have
file descriptors inherited since those are tied to resources that
will vanish
when a another program will replace the current one by mean of
exec(),
so it's safe to use O_CLOEXEC in such cases.

For others, like XFS, the file descriptor is retrieved by one
program and will
be used by a different program, executed as a child. In this case,
setting O_CLOEXEC
would break existing application, which do not expect to have to
call fcntl(fd, F_SETFD,
FD_CLOEXEC) to make it available across exec().

If file descriptor created by events subsystem are not tied to the
current process
resources, it's likely legal to use it in a different process
context, thus O_CLOEXEC
must not be the default.

Aside: If O_CLOEXEC cannot be made the default, it would be interesting
to think to extend
the API to have a (set of) function(s) taking a flags parameter
so that userspace can
set O_CLOEXEC if wanted. And I have a patch for this :)

PS: I like the title of this article: "Excuse me son, but your code is
leaking !!!" [1]
by Dan Walsh but one should probably read PEP-446 "Make newly
created file descriptors
non-inheritable" [2] instead since it has lot more background
information on file
descriptor leaking.

[1] http://danwalsh.livejournal.com/53603.html
[2] http://www.python.org/dev/peps/pep-0446/


Regards.

--
Yann Droneaud
OPTEYA

2013-10-30 21:36:24

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH] events: add a flag to perf_event_open() to set O_CLOEXEC

This patch adds PERF_FLAG_FD_CLOEXEC flag for
perf_event_open() syscall.

perf_event_open() creates a new file descriptor,
but unlike open() syscall, it lack a flag to atomicaly
set close-on-exec (O_CLOEXEC).

Not using O_CLOEXEC by default and not letting userspace
provide the "open" flags should be avoided: in most case
O_CLOEXEC must be used to not leak file descriptor across
exec().

Using O_CLOEXEC when creating a file descriptor allows
userspace to set latter, using fcntl(), without any risk
of race, if the file descriptor is going to be inherited
or not across exec().

Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Yann Droneaud <[email protected]>
---

Hi Peter,

This patch should replaces
"[PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()"

Please have a look.

Regards.

include/uapi/linux/perf_event.h | 1 +
kernel/events/core.c | 11 ++++++++---
2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 009a655..c217765 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -699,6 +699,7 @@ enum perf_callchain_context {
#define PERF_FLAG_FD_NO_GROUP (1U << 0)
#define PERF_FLAG_FD_OUTPUT (1U << 1)
#define PERF_FLAG_PID_CGROUP (1U << 2) /* pid=cgroup id, per-cpu mode only */
+#define PERF_FLAG_FD_CLOEXEC (1U << 3) /* O_CLOEXEC */

union perf_mem_data_src {
__u64 val;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 953c143..44de88c 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -119,7 +119,8 @@ static int cpu_function_call(int cpu, int (*func) (void *info), void *info)

#define PERF_FLAG_ALL (PERF_FLAG_FD_NO_GROUP |\
PERF_FLAG_FD_OUTPUT |\
- PERF_FLAG_PID_CGROUP)
+ PERF_FLAG_PID_CGROUP |\
+ PERF_FLAG_FD_CLOEXEC)

/*
* branch priv levels that need permission checks
@@ -6933,6 +6934,7 @@ SYSCALL_DEFINE5(perf_event_open,
int event_fd;
int move_group = 0;
int err;
+ int fd_flags;

/* for future expandability... */
if (flags & ~PERF_FLAG_ALL)
@@ -6961,7 +6963,9 @@ SYSCALL_DEFINE5(perf_event_open,
if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
return -EINVAL;

- event_fd = get_unused_fd();
+ fd_flags = (flags & PERF_FLAG_FD_CLOEXEC) ? O_CLOEXEC : 0;
+
+ event_fd = get_unused_fd_flags(fd_flags);
if (event_fd < 0)
return event_fd;

@@ -7083,7 +7087,8 @@ SYSCALL_DEFINE5(perf_event_open,
goto err_context;
}

- event_file = anon_inode_getfile("[perf_event]", &perf_fops, event, O_RDWR);
+ event_file = anon_inode_getfile("[perf_event]", &perf_fops, event,
+ O_RDWR | fd_flags);
if (IS_ERR(event_file)) {
err = PTR_ERR(event_file);
goto err_context;
--
1.8.1.4

2013-10-30 22:00:29

by Yann Droneaud

[permalink] [raw]
Subject: Re: [PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()

Le mercredi 30 octobre 2013 à 22:18 +0100, Yann Droneaud a écrit :
> Hi,
>
> Le 30.10.2013 21:20, Peter Zijlstra a écrit :
> > On Wed, Oct 30, 2013 at 08:47:46PM +0100, Yann Droneaud wrote:
> >> This patch replaces calls to get_unused_fd() with equivalent call to
> >> get_unused_fd_flags(0) to preserve current behavor for existing code.
> >>
> >> The hard coded flag value (0) should be reviewed on a per-subsystem
> >> basis,
> >> and, if possible, set to O_CLOEXEC.
> >
> > And how am I supposed to know if that is 'possible'? You provide a
> > total
> > number of 0 useful clues on how to determine this.
>

I'm sorry for sending this email so unreadable ... "unformatted" by
RoundCube webmail.
Please find something more readable below:

> Fair.
>
> Short: Will it break kernel ABI ?
> If no, use O_CLOEXEC, if yes, use 0.
>
> Long: If userspace expect to retrieve a file descriptor with plain
> old Unix(tm) semantics, O_CLOEXEC must not be the default, as it
> could break some applications expecting that the file descriptor
> will be inherited during exec().
>
> But for some subsystems, such as InfiniBand, KVM, VFIO, it make no
> sense to have file descriptors inherited since those are tied to
> resources that will vanish when a another program will replace the
> current one by mean of exec(), so it's safe to use O_CLOEXEC in
> such cases.
>
> For others, like XFS, the file descriptor is retrieved by one
> program and will be used by a different program, executed as a
> child. In this case, setting O_CLOEXEC would break existing
> application, which do not expect to have to call fcntl(fd,
> F_SETFD, FD_CLOEXEC) to make it available across exec().
>
> If file descriptor created by events subsystem are not tied to the
> current process resources, it's likely legal to use it in a
> different process context, thus O_CLOEXEC must not be the default.
>
> Aside: If O_CLOEXEC cannot be made the default, it would be interesting
> to think to extend the API to have a (set of) function(s) taking
> a flags parameter so that userspace can set O_CLOEXEC if wanted.
> And I have a patch for this :)
>
> PS: I like the title of this article: "Excuse me son, but your code is
> leaking !!!" [1] by Dan Walsh but one should probably read PEP-446
> "Make newly created file descriptors non-inheritable" [2] instead
> since it has lot more background information on file descriptor
> leaking.
>
> [1] http://danwalsh.livejournal.com/53603.html
> [2] http://www.python.org/dev/peps/pep-0446/
>
>
> Regards.
>

--
Yann Droneaud
OPTEYA

2013-10-31 18:12:14

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] events: add a flag to perf_event_open() to set O_CLOEXEC

On Wed, Oct 30, 2013 at 10:35:50PM +0100, Yann Droneaud wrote:
> This patch adds PERF_FLAG_FD_CLOEXEC flag for
> perf_event_open() syscall.
>
> perf_event_open() creates a new file descriptor,
> but unlike open() syscall, it lack a flag to atomicaly
> set close-on-exec (O_CLOEXEC).
>
> Not using O_CLOEXEC by default and not letting userspace
> provide the "open" flags should be avoided: in most case
> O_CLOEXEC must be used to not leak file descriptor across
> exec().
>
> Using O_CLOEXEC when creating a file descriptor allows
> userspace to set latter, using fcntl(), without any risk
> of race, if the file descriptor is going to be inherited
> or not across exec().
>
> Link: http://lkml.kernel.org/r/[email protected]
> Link: http://lkml.kernel.org/r/[email protected]
> Signed-off-by: Yann Droneaud <[email protected]>
> ---
>
> Hi Peter,
>
> This patch should replaces
> "[PATCH v4 6/7] events: use get_unused_fd_flags(0) instead of get_unused_fd()"
>
> Please have a look.

I'm still terminally confused as to all of this... Why does it matter
what the default is if you can change it with fcntl() ? Also, how can
you tell nobody relies on the current behaviour and its therefore safe
to change?