2013-08-24 20:21:42

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

Hi,

Following a patchset asking to change calls to get_unused_flag() [1]
to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO
to use the flag.

Since it's a related subsystem to KVM, using O_CLOEXEC for
file descriptors created by KVM might be applicable too.

I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC
as default flag.

This patchset should be reviewed to not break existing userspace program.

BTW, if it's not applicable, I would suggest that new ioctls be added to
KVM subsystem, those ioctls would have a "flag" field added to their arguments.
Such "flag" would let userspace choose the open flag to use.
See for example other APIs using anon_inode_getfd() such as fanotify,
inotify, signalfd and timerfd.

You might be interested to read:

- Secure File Descriptor Handling (Ulrich Drepper, 2008)
http://udrepper.livejournal.com/20407.html

- Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
http://danwalsh.livejournal.com/53603.html

Regards.

[1] http://lkml.kernel.org/r/[email protected]
[2] http://lkml.kernel.org/r/[email protected]
[3] http://lkml.kernel.org/r/[email protected]

Yann Droneaud (2):
kvm: use anon_inode_getfd() with O_CLOEXEC flag
ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
arch/powerpc/kvm/book3s_64_vio.c | 2 +-
arch/powerpc/kvm/book3s_hv.c | 2 +-
virt/kvm/kvm_main.c | 6 +++---
4 files changed, 6 insertions(+), 6 deletions(-)

--
1.8.3.1


2013-08-24 20:21:27

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
virt/kvm/kvm_main.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 89f74d1..d65cc0c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = {
*/
static int create_vcpu_fd(struct kvm_vcpu *vcpu)
{
- return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR);
+ return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC);
}

/*
@@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
return ret;
}

- ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
+ ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
if (ret < 0) {
ops->destroy(dev);
return ret;
@@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
return r;
}
#endif
- r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR);
+ r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC);
if (r < 0)
kvm_put_kvm(kvm);

--
1.8.3.1

2013-08-24 20:21:57

by Yann Droneaud

[permalink] [raw]
Subject: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

KVM uses anon_inode_get() to allocate file descriptors as part
of some of its ioctls. But those ioctls are lacking a flag argument
allowing userspace to choose options for the newly opened file descriptor.

In such case it's advised to use O_CLOEXEC by default so that
userspace is allowed to choose, without race, if the file descriptor
is going to be inherited across exec().

This patch set O_CLOEXEC flag on all file descriptors created
with anon_inode_getfd() to not leak file descriptors across exec().

Signed-off-by: Yann Droneaud <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]

---
arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
arch/powerpc/kvm/book3s_64_vio.c | 2 +-
arch/powerpc/kvm/book3s_hv.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 710d313..f7c9e8a 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -1579,7 +1579,7 @@ int kvm_vm_ioctl_get_htab_fd(struct kvm *kvm, struct kvm_get_htab_fd *ghf)
ctx->first_pass = 1;

rwflag = (ghf->flags & KVM_GET_HTAB_WRITE) ? O_WRONLY : O_RDONLY;
- ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag);
+ ret = anon_inode_getfd("kvm-htab", &kvm_htab_fops, ctx, rwflag | O_CLOEXEC);
if (ret < 0) {
kvm_put_kvm(kvm);
return ret;
diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index b2d3f3b..54cf9bc 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -136,7 +136,7 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
mutex_unlock(&kvm->lock);

return anon_inode_getfd("kvm-spapr-tce", &kvm_spapr_tce_fops,
- stt, O_RDWR);
+ stt, O_RDWR | O_CLOEXEC);

fail:
if (stt) {
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index e8d51cb..3503829 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1556,7 +1556,7 @@ long kvm_vm_ioctl_allocate_rma(struct kvm *kvm, struct kvm_allocate_rma *ret)
if (!ri)
return -ENOMEM;

- fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR);
+ fd = anon_inode_getfd("kvm-rma", &kvm_rma_fops, ri, O_RDWR | O_CLOEXEC);
if (fd < 0)
kvm_release_rma(ri);

--
1.8.3.1

2013-08-25 06:51:01

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 1/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

Il 24/08/2013 22:14, Yann Droneaud ha scritto:
> KVM uses anon_inode_get() to allocate file descriptors as part
> of some of its ioctls. But those ioctls are lacking a flag argument
> allowing userspace to choose options for the newly opened file descriptor.
>
> In such case it's advised to use O_CLOEXEC by default so that
> userspace is allowed to choose, without race, if the file descriptor
> is going to be inherited across exec().
>
> This patch set O_CLOEXEC flag on all file descriptors created
> with anon_inode_getfd() to not leak file descriptors across exec().
>
> Signed-off-by: Yann Droneaud <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]

Reviewed-by: Paolo Bonzini <[email protected]>

> ---
> virt/kvm/kvm_main.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 89f74d1..d65cc0c 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1896,7 +1896,7 @@ static struct file_operations kvm_vcpu_fops = {
> */
> static int create_vcpu_fd(struct kvm_vcpu *vcpu)
> {
> - return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR);
> + return anon_inode_getfd("kvm-vcpu", &kvm_vcpu_fops, vcpu, O_RDWR | O_CLOEXEC);
> }
>
> /*
> @@ -2306,7 +2306,7 @@ static int kvm_ioctl_create_device(struct kvm *kvm,
> return ret;
> }
>
> - ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR);
> + ret = anon_inode_getfd(ops->name, &kvm_device_fops, dev, O_RDWR | O_CLOEXEC);
> if (ret < 0) {
> ops->destroy(dev);
> return ret;
> @@ -2590,7 +2590,7 @@ static int kvm_dev_ioctl_create_vm(unsigned long type)
> return r;
> }
> #endif
> - r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR);
> + r = anon_inode_getfd("kvm-vm", &kvm_vm_fops, kvm, O_RDWR | O_CLOEXEC);
> if (r < 0)
> kvm_put_kvm(kvm);
>
>

2013-08-25 15:04:29

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag


On 24.08.2013, at 21:14, Yann Droneaud wrote:

> KVM uses anon_inode_get() to allocate file descriptors as part
> of some of its ioctls. But those ioctls are lacking a flag argument
> allowing userspace to choose options for the newly opened file descriptor.
>
> In such case it's advised to use O_CLOEXEC by default so that
> userspace is allowed to choose, without race, if the file descriptor
> is going to be inherited across exec().
>
> This patch set O_CLOEXEC flag on all file descriptors created
> with anon_inode_getfd() to not leak file descriptors across exec().
>
> Signed-off-by: Yann Droneaud <[email protected]>
> Link: http://lkml.kernel.org/r/[email protected]

Reviewed-by: Alexander Graf <[email protected]>

Would it make sense to simply inherit the O_CLOEXEC flag from the parent kvm fd instead? That would give user space the power to keep fds across exec() if it wants to.


Alex

2013-08-26 07:39:23

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

Il 25/08/2013 17:04, Alexander Graf ha scritto:
>
> On 24.08.2013, at 21:14, Yann Droneaud wrote:
>
>> KVM uses anon_inode_get() to allocate file descriptors as part
>> of some of its ioctls. But those ioctls are lacking a flag argument
>> allowing userspace to choose options for the newly opened file descriptor.
>>
>> In such case it's advised to use O_CLOEXEC by default so that
>> userspace is allowed to choose, without race, if the file descriptor
>> is going to be inherited across exec().
>>
>> This patch set O_CLOEXEC flag on all file descriptors created
>> with anon_inode_getfd() to not leak file descriptors across exec().
>>
>> Signed-off-by: Yann Droneaud <[email protected]>
>> Link: http://lkml.kernel.org/r/[email protected]
>
> Reviewed-by: Alexander Graf <[email protected]>
>
> Would it make sense to simply inherit the O_CLOEXEC flag from the
> parent kvm fd instead? That would give user space the power to keep
> fds across exec() if it wants to.

Does it make sense to use non-O_CLOEXEC file descriptors with KVM at
all? Besides fork() not being supported by KVM, as described in
Documentation/virtual/kvm/api.txt, the VMAs of the parent process go
away as soon as you exec(). I'm not sure how you can use the inherited
file descriptor in a sensible way after exec().

Paolo

2013-08-26 08:23:21

by Yann Droneaud

[permalink] [raw]
Subject: Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

Le 26.08.2013 09:39, Paolo Bonzini a écrit :
> Il 25/08/2013 17:04, Alexander Graf ha scritto:
>> On 24.08.2013, at 21:14, Yann Droneaud wrote:
>>
>>>
>>> This patch set O_CLOEXEC flag on all file descriptors created
>>> with anon_inode_getfd() to not leak file descriptors across exec().
>>>
>>> Signed-off-by: Yann Droneaud <[email protected]>
>>> Link:
>>> http://lkml.kernel.org/r/[email protected]
>>
>> Reviewed-by: Alexander Graf <[email protected]>
>>
>> Would it make sense to simply inherit the O_CLOEXEC flag from the
>> parent kvm fd instead? That would give user space the power to keep
>> fds across exec() if it wants to.
>
> Does it make sense to use non-O_CLOEXEC file descriptors with KVM at
> all? Besides fork() not being supported by KVM, as described in
> Documentation/virtual/kvm/api.txt, the VMAs of the parent process go
> away as soon as you exec(). I'm not sure how you can use the inherited
> file descriptor in a sensible way after exec().
>

Sounds a lot like InfiniBand subsystem behavor: IB file descriptors
are of no use accross exec() since memory mappings tied to those fds
won't be available in the new process:

https://lkml.org/lkml/2013/7/8/380
http://mid.gmane.org/[email protected]

Regards.

--
Yann Droneaud
OPTEYA

2013-08-26 08:29:11

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 2/2] ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag

Il 26/08/2013 10:23, Yann Droneaud ha scritto:
>
> Sounds a lot like InfiniBand subsystem behavor: IB file descriptors
> are of no use accross exec() since memory mappings tied to those fds
> won't be available in the new process:
>
> https://lkml.org/lkml/2013/7/8/380
> http://mid.gmane.org/[email protected]

Yes, it is very similar.

Paolo

2013-08-26 10:20:49

by Gleb Natapov

[permalink] [raw]
Subject: Re: [PATCH 0/2] kvm: use anon_inode_getfd() with O_CLOEXEC flag

On Sat, Aug 24, 2013 at 10:14:06PM +0200, Yann Droneaud wrote:
> Hi,
>
> Following a patchset asking to change calls to get_unused_flag() [1]
> to use O_CLOEXEC, Alex Williamson [2][3] decided to change VFIO
> to use the flag.
>
> Since it's a related subsystem to KVM, using O_CLOEXEC for
> file descriptors created by KVM might be applicable too.
>
> I'm suggesting to change calls to anon_inode_getfd() to use O_CLOEXEC
> as default flag.
>
> This patchset should be reviewed to not break existing userspace program.
>
> BTW, if it's not applicable, I would suggest that new ioctls be added to
> KVM subsystem, those ioctls would have a "flag" field added to their arguments.
> Such "flag" would let userspace choose the open flag to use.
> See for example other APIs using anon_inode_getfd() such as fanotify,
> inotify, signalfd and timerfd.
>
> You might be interested to read:
>
> - Secure File Descriptor Handling (Ulrich Drepper, 2008)
> http://udrepper.livejournal.com/20407.html
>
> - Excuse me son, but your code is leaking !!! (Dan Walsh, March 2012)
> http://danwalsh.livejournal.com/53603.html
>
Applied, thanks.

> Regards.
>
> [1] http://lkml.kernel.org/r/[email protected]
> [2] http://lkml.kernel.org/r/[email protected]
> [3] http://lkml.kernel.org/r/[email protected]
>
> Yann Droneaud (2):
> kvm: use anon_inode_getfd() with O_CLOEXEC flag
> ppc: kvm: use anon_inode_getfd() with O_CLOEXEC flag
>
> arch/powerpc/kvm/book3s_64_mmu_hv.c | 2 +-
> arch/powerpc/kvm/book3s_64_vio.c | 2 +-
> arch/powerpc/kvm/book3s_hv.c | 2 +-
> virt/kvm/kvm_main.c | 6 +++---
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> --
> 1.8.3.1

--
Gleb.