2013-08-15 13:12:22

by Yann Droneaud

Subject: [PATCH v2 00/10] Getting rid of get_unused_fd_flags()

Hi,

The get_unused_fd() macro is a shortcut for calling get_unused_fd_flags()
to allocate a file descriptor.

The macro uses 0 as flags, so the file descriptor is created
without the O_CLOEXEC flag.

This can be seen as an unsafe default: in most cases O_CLOEXEC
must be used to avoid leaking the file descriptor across exec().

Newer kernel code should use anon_inode_getfd() or get_unused_fd_flags()
with flags provided by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.

Using O_CLOEXEC by default allows userspace to choose, without race,
whether the file descriptor is going to be inherited across exec().
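
To illustrate the point from the userspace side, here is a minimal sketch
(not part of the patchset) of why a close-on-exec default leaves the choice
to the application: a descriptor that starts with FD_CLOEXEC set can be made
inheritable later with fcntl(), while one that starts without it may already
have leaked through a concurrent fork()+exec() before the flag gets set.
The helper names fd_is_cloexec(), fd_allow_inherit() and open_cloexec_fd()
are made up for this example, and /dev/null merely stands in for any
descriptor handed out by the kernel.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}

/* Hypothetical helper: opt in to inheritance by clearing FD_CLOEXEC. */
int fd_allow_inherit(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/*
 * Hypothetical helper: obtain a descriptor that is close-on-exec from
 * the moment it exists, so there is no window in which it can leak.
 */
int open_cloexec_fd(void)
{
	return open("/dev/null", O_RDONLY | O_CLOEXEC);
}
```

A program that wants a child to inherit the descriptor would call
fd_allow_inherit(fd) in the child, between fork() and exec(); every other
descriptor stays private without further work.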

There are two ways to move kernel code toward this default:

- make get_unused_fd() use O_CLOEXEC by default

It's difficult to get right: every user of get_unused_fd()
must take this change into account and be fixed as soon as
the get_unused_fd() macro makes the switch. Code that is not updated
will behave unexpectedly and is likely to break its API contract.

- remove get_unused_fd()

It's going to break some out-of-tree, not-yet-upstream kernel code,
but that is easy to notice and fix. In any case, newer code should use
anon_inode_getfd() or get_unused_fd_flags().

The latter option was chosen to ensure no unexpected behavior
for out-of-tree, not-yet-upstream code. Removing the macro is the safest
choice: it's better to break the build than to make get_unused_fd()
use O_CLOEXEC by default and hope that every user of get_unused_fd()
gets updated.

Additionally, since get_unused_fd() is only a macro, removing it is not
going to break the module ABI.

In linux-next tag 20130815, there are currently:

- 19 calls to get_unused_fd_flags() (+4),
  not counting get_unused_fd() and anon_inode_getfd()
- 10 calls to get_unused_fd() (-4)
- 11 calls to anon_inode_getfd() (0)

The following patchset tries to convert all calls to get_unused_fd()
to get_unused_fd_flags(0) before removing the get_unused_fd() macro.

Without the get_unused_fd() macro, more subsystems are likely to use
anon_inode_getfd() and be taught to provide an API that lets userspace
choose the opening flags of the file descriptor.
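
As an illustration of what such an API looks like from userspace, eventfd()
already takes a flags argument, and EFD_CLOEXEC asks for the close-on-exec
flag at creation time. The sketch below is only an example under that
assumption and is not taken from this patchset: make_eventfd() and
has_cloexec() are hypothetical helper names.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an eventfd and let the caller decide
 * whether it is close-on-exec, instead of relying on a kernel default.
 */
int make_eventfd(int cloexec)
{
	return eventfd(0, cloexec ? EFD_CLOEXEC : 0);
}

/* Hypothetical helper: 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
int has_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

A subsystem converted along these lines would let its callers pick either
behavior explicitly, rather than inherit whatever the kernel helper
happens to default to.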

Changes from v1 <http://lkml.kernel.org/r/[email protected]>:

- explicitly added subsystem maintainers as mail recipients.

- infiniband: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: subsystem maintainer applied another patch using
get_unused_fd_flags(O_CLOEXEC) as suggested.

- android/sw_sync: use get_unused_fd_flags(0) instead of get_unused_fd()
MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by
<http://lkml.kernel.org/r/CACSP8SjXGMk2_kX_+RgzqqQwqKernvF1Wt3K5tw991W5dfAnCA@mail.gmail.com>

- android/sync: use get_unused_fd_flags(0) instead of get_unused_fd()
MODIFIED: use get_unused_fd_flags(O_CLOEXEC) as suggested by
<http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w@mail.gmail.com>

- xfs: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied as is by subsystem maintainer.

- sctp: use get_unused_fd_flags(0) instead of get_unused_fd()
DROPPED: applied as is by subsystem maintainer.

Yann Droneaud (10):
ia64: use get_unused_fd_flags(0) instead of get_unused_fd()
ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()
android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of
get_unused_fd()
android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of
get_unused_fd()
vfio: use get_unused_fd_flags(0) instead of get_unused_fd()
binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()
file: use get_unused_fd_flags(0) instead of get_unused_fd()
fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()
events: use get_unused_fd_flags(0) instead of get_unused_fd()
file: remove get_unused_fd()

arch/ia64/kernel/perfmon.c | 2 +-
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
drivers/staging/android/sw_sync.c | 2 +-
drivers/staging/android/sync.c | 2 +-
drivers/vfio/vfio.c | 2 +-
fs/binfmt_misc.c | 2 +-
fs/file.c | 2 +-
fs/notify/fanotify/fanotify_user.c | 2 +-
include/linux/file.h | 1 -
kernel/events/core.c | 2 +-
10 files changed, 10 insertions(+), 11 deletions(-)

--
1.8.3.1


2013-08-15 13:12:12

by Yann Droneaud

Subject: [PATCH v2 01/10] ia64: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
arch/ia64/kernel/perfmon.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/ia64/kernel/perfmon.c b/arch/ia64/kernel/perfmon.c
index 5a9ff1c..64757c1 100644
--- a/arch/ia64/kernel/perfmon.c
+++ b/arch/ia64/kernel/perfmon.c
@@ -2668,7 +2668,7 @@ pfm_context_create(pfm_context_t *ctx, void *arg, int count, struct pt_regs *reg

ret = -ENOMEM;

- fd = get_unused_fd();
+ fd = get_unused_fd_flags(0);
if (fd < 0)
return fd;

--
1.8.3.1

2013-08-15 13:12:25

by Yann Droneaud

Subject: [PATCH v2 02/10] ppc/cell: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
arch/powerpc/platforms/cell/spufs/inode.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/inode.c b/arch/powerpc/platforms/cell/spufs/inode.c
index f390042..88df441 100644
--- a/arch/powerpc/platforms/cell/spufs/inode.c
+++ b/arch/powerpc/platforms/cell/spufs/inode.c
@@ -301,7 +301,7 @@ static int spufs_context_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

@@ -518,7 +518,7 @@ static int spufs_gang_open(struct path *path)
int ret;
struct file *filp;

- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0)
return ret;

--
1.8.3.1

2013-08-15 13:12:36

by Yann Droneaud

Subject: [PATCH v2 03/10] android/sw_sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with calls to
get_unused_fd_flags(O_CLOEXEC), following advice from Erik Gilling.


Signed-off-by: Yann Droneaud <[email protected]>
Cc: Erik Gilling <[email protected]>
Cc: Colin Cross <[email protected]>
Link: http://lkml.kernel.org/r/CACSP8SjZcpcpEtQHzcGYhf-MP7QGo0XpN7-uN7rmD=vNtopG=w@mail.gmail.com
Link: http://lkml.kernel.org/r/[email protected]

---
drivers/staging/android/sw_sync.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c
index 765c757..f24493a 100644
--- a/drivers/staging/android/sw_sync.c
+++ b/drivers/staging/android/sw_sync.c
@@ -163,7 +163,7 @@ static int sw_sync_release(struct inode *inode, struct file *file)
static long sw_sync_ioctl_create_fence(struct sw_sync_timeline *obj,
unsigned long arg)
{
- int fd = get_unused_fd();
+ int fd = get_unused_fd_flags(O_CLOEXEC);
int err;
struct sync_pt *pt;
struct sync_fence *fence;
--
1.8.3.1

2013-08-15 13:12:47

by Yann Droneaud

Subject: [PATCH v2 05/10] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
drivers/vfio/vfio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index d3cb342..75c16cc 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1109,7 +1109,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
* We can't use anon_inode_getfd() because we need to modify
* the f_mode flags directly to allow more than just ioctls
*/
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret < 0) {
device->ops->release(device->device_data);
break;
--
1.8.3.1

2013-08-15 13:12:52

by Yann Droneaud

Subject: [PATCH v2 06/10] binfmt_misc: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
fs/binfmt_misc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/binfmt_misc.c b/fs/binfmt_misc.c
index 1c740e1..052f6dc 100644
--- a/fs/binfmt_misc.c
+++ b/fs/binfmt_misc.c
@@ -138,7 +138,7 @@ static int load_misc_binary(struct linux_binprm *bprm)
/* if the binary should be opened on behalf of the
* interpreter than keep it open and assign descriptor
* to it */
- fd_binary = get_unused_fd();
+ fd_binary = get_unused_fd_flags(0);
if (fd_binary < 0) {
retval = fd_binary;
goto _ret;
--
1.8.3.1

2013-08-15 13:13:01

by Yann Droneaud

Subject: [PATCH v2 07/10] file: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
fs/file.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/file.c b/fs/file.c
index 4a78f98..1420d28 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -897,7 +897,7 @@ SYSCALL_DEFINE1(dup, unsigned int, fildes)
struct file *file = fget_raw(fildes);

if (file) {
- ret = get_unused_fd();
+ ret = get_unused_fd_flags(0);
if (ret >= 0)
fd_install(ret, file);
else
--
1.8.3.1

2013-08-15 13:12:39

by Yann Droneaud

Subject: [PATCH v2 04/10] android/sync: use get_unused_fd_flags(O_CLOEXEC) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with calls to
get_unused_fd_flags(O_CLOEXEC), following advice from Erik Gilling.

Signed-off-by: Yann Droneaud <[email protected]>
Cc: Erik Gilling <[email protected]>
Cc: Colin Cross <[email protected]>
Link: http://lkml.kernel.org/r/CACSP8SjXGMk2_kX_+RgzqqQwqKernvF1Wt3K5tw991W5dfAnCA@mail.gmail.com
Link: http://lkml.kernel.org/r/[email protected]

---
drivers/staging/android/sync.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
index 2996077..38e5d3b 100644
--- a/drivers/staging/android/sync.c
+++ b/drivers/staging/android/sync.c
@@ -697,7 +697,7 @@ static long sync_fence_ioctl_wait(struct sync_fence *fence, unsigned long arg)

static long sync_fence_ioctl_merge(struct sync_fence *fence, unsigned long arg)
{
- int fd = get_unused_fd();
+ int fd = get_unused_fd_flags(O_CLOEXEC);
int err;
struct sync_fence *fence2, *fence3;
struct sync_merge_data data;
--
1.8.3.1

2013-08-15 13:13:12

by Yann Droneaud

Subject: [PATCH v2 08/10] fanotify: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
fs/notify/fanotify/fanotify_user.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index e44cb64..644b9a7 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -69,7 +69,7 @@ static int create_fd(struct fsnotify_group *group,

pr_debug("%s: group=%p event=%p\n", __func__, group, event);

- client_fd = get_unused_fd();
+ client_fd = get_unused_fd_flags(0);
if (client_fd < 0)
return client_fd;

--
1.8.3.1

2013-08-15 13:13:31

by Yann Droneaud

Subject: [PATCH v2 10/10] file: remove get_unused_fd()

The get_unused_fd() macro allocates a file descriptor without the
O_CLOEXEC flag.

This can be seen as an unsafe default: in most cases O_CLOEXEC
must be used to avoid leaking file descriptors across exec().

Using O_CLOEXEC by default allows userspace to choose, without race,
whether the file descriptor is going to be inherited across exec().

This patch removes get_unused_fd() so that newer kernel code uses
anon_inode_getfd() or get_unused_fd_flags() with flags provided
by userspace. If flags cannot be given by userspace,
O_CLOEXEC must be the default flag.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
include/linux/file.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/include/linux/file.h b/include/linux/file.h
index cbacf4f..8666002 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -63,7 +63,6 @@ extern void set_close_on_exec(unsigned int fd, int flag);
extern bool get_close_on_exec(unsigned int fd);
extern void put_filp(struct file *);
extern int get_unused_fd_flags(unsigned flags);
-#define get_unused_fd() get_unused_fd_flags(0)
extern void put_unused_fd(unsigned int fd);

extern void fd_install(unsigned int fd, struct file *file);
--
1.8.3.1

2013-08-15 14:10:55

by Yann Droneaud

Subject: [PATCH v2 09/10] events: use get_unused_fd_flags(0) instead of get_unused_fd()

The get_unused_fd() macro is used to allocate a file descriptor with
default flags. Those default flags (0) can be "unsafe":
O_CLOEXEC must be used by default to avoid leaking file descriptors
across exec().

Instead of the get_unused_fd() macro, the functions anon_inode_getfd()
or get_unused_fd_flags() should be used with flags given by userspace.
If that is not possible, flags should be set to O_CLOEXEC to provide
userspace with a safe default behavior.

In a later patch, get_unused_fd() will be removed so that
new code starts using anon_inode_getfd() or get_unused_fd_flags()
with correct flags.

This patch replaces calls to get_unused_fd() with equivalent calls to
get_unused_fd_flags(0) to preserve the current behavior of existing code.

The hard-coded flag value (0) should be reviewed on a per-subsystem basis
and, if possible, set to O_CLOEXEC.

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index f25ce6d..a224a14 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -6891,7 +6891,7 @@ SYSCALL_DEFINE5(perf_event_open,
if ((flags & PERF_FLAG_PID_CGROUP) && (pid == -1 || cpu == -1))
return -EINVAL;

- event_fd = get_unused_fd();
+ event_fd = get_unused_fd_flags(0);
if (event_fd < 0)
return event_fd;

--
1.8.3.1

2013-08-22 15:53:37

by Alex Williamson

Subject: Re: [PATCH v2 05/10] vfio: use get_unused_fd_flags(0) instead of get_unused_fd()

On Thu, 2013-08-15 at 15:10 +0200, Yann Droneaud wrote:
> Macro get_unused_fd() is used to allocate a file descriptor with
> default flags. Those default flags (0) can be "unsafe":
> O_CLOEXEC must be used by default to not leak file descriptor
> across exec().
>
> Instead of macro get_unused_fd(), functions anon_inode_getfd()
> or get_unused_fd_flags() should be used with flags given by userspace.
> If not possible, flags should be set to O_CLOEXEC to provide userspace
> with a default safe behavor.
>
> In a further patch, get_unused_fd() will be removed so that
> new code start using anon_inode_getfd() or get_unused_fd_flags()
> with correct flags.
>
> This patch replaces calls to get_unused_fd() with equivalent call to
> get_unused_fd_flags(0) to preserve current behavor for existing code.
>
> The hard coded flag value (0) should be reviewed on a per-subsystem basis,
> and, if possible, set to O_CLOEXEC.
>
> Signed-off-by: Yann Droneaud <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]
>
> ---
> drivers/vfio/vfio.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index d3cb342..75c16cc 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -1109,7 +1109,7 @@ static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
> * We can't use anon_inode_getfd() because we need to modify
> * the f_mode flags directly to allow more than just ioctls
> */
> - ret = get_unused_fd();
> + ret = get_unused_fd_flags(0);
> if (ret < 0) {
> device->ops->release(device->device_data);
> break;

I don't see any reason why we shouldn't be adding O_CLOEXEC here. If
anyone disagrees, please speak up. I'll include this in my next tree
and post a follow-on patch to replace 0 with O_CLOEXEC. Thanks,

Alex