2013-10-29 15:14:10

by Tim Gardner

[permalink] [raw]
Subject: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

The x86 specific kvm init creates a new conflicting
debugfs directory which causes modprobe issues
with kvm_intel and kvm_amd. For example,

sudo modprobe kvm_amd
modprobe: ERROR: could not insert 'kvm_amd': Bad address

The simplest fix is to just rename the directory. The following
KVM config options are set:

CONFIG_KVM_GUEST=y
CONFIG_KVM_DEBUG_FS=y
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
CONFIG_HAVE_KVM_IRQ_ROUTING=y
CONFIG_HAVE_KVM_EVENTFD=y
CONFIG_KVM_APIC_ARCHITECTURE=y
CONFIG_KVM_MMIO=y
CONFIG_KVM_ASYNC_PF=y
CONFIG_HAVE_KVM_MSI=y
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
CONFIG_KVM=m
CONFIG_KVM_INTEL=m
CONFIG_KVM_AMD=m
# CONFIG_KVM_MMU_AUDIT is not set
CONFIG_KVM_DEVICE_ASSIGNMENT=y

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Gleb Natapov <[email protected]>
Cc: Raghavendra K T <[email protected]>
Cc: Marcelo Tosatti <[email protected]>
Signed-off-by: Tim Gardner <[email protected]>
---

There is likely a more elegant and complicated way to solve this
problem, but a simple rename for a debug feature seems best,
especially for an -rc7.

arch/x86/kernel/kvm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index a0e2a8a..d17895a 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -609,7 +609,7 @@ static struct dentry *d_kvm_debug;

struct dentry *kvm_init_debugfs(void)
{
- d_kvm_debug = debugfs_create_dir("kvm", NULL);
+ d_kvm_debug = debugfs_create_dir("kvm-x86", NULL);
if (!d_kvm_debug)
printk(KERN_WARNING "Could not create 'kvm' debugfs directory\n");

--
1.7.9.5


2013-10-29 16:45:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Ugh. I won't comment on the actual kvm part of this patch, somebody
who knows that code should do so.

But I reacted to this:

On Tue, Oct 29, 2013 at 8:13 AM, Tim Gardner <[email protected]> wrote:
>
> sudo modprobe kvm_amd
> modprobe: ERROR: could not insert 'kvm_amd': Bad address

"Bad address"? Christ people, are you guys making up error numbers
with some kind of dice-roll? I can just see it now, somebody sitting
there with a D20, playing some kind of kernel-specific D&D, and
rolling a ten means that you get to slay the orc, and pick an error
number of EFAULT for some random kernel function. Because quite
frankly, "random dice roll" is the _only_ thing that explains "Bad
address" sufficiently.

Please, whoever wrote virt/kvm/kvm_main.c:: kvm_init_debug(), WTF?
EFAULT means "user passed in an invalid virtual address pointer",
which is why the error string is "Bad address". It makes absolutely NO
SENSE here. Perhaps EEXIST or EBUSY.

Linus

2013-10-29 17:17:22

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Il 29/10/2013 16:13, Tim Gardner ha scritto:
> The x86 specific kvm init creates a new conflicting
> debugfs directory which causes modprobe issues
> with kvm_intel and kvm_amd. For example,
>
> sudo modprobe kvm_amd
> modprobe: ERROR: could not insert 'kvm_amd': Bad address
>
> The simplest fix is to just rename the directory. The following
> KVM config options are set:
>
> CONFIG_KVM_GUEST=y
> CONFIG_KVM_DEBUG_FS=y
> CONFIG_HAVE_KVM=y
> CONFIG_HAVE_KVM_IRQCHIP=y
> CONFIG_HAVE_KVM_IRQ_ROUTING=y
> CONFIG_HAVE_KVM_EVENTFD=y
> CONFIG_KVM_APIC_ARCHITECTURE=y
> CONFIG_KVM_MMIO=y
> CONFIG_KVM_ASYNC_PF=y
> CONFIG_HAVE_KVM_MSI=y
> CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT=y
> CONFIG_KVM=m
> CONFIG_KVM_INTEL=m
> CONFIG_KVM_AMD=m
> # CONFIG_KVM_MMU_AUDIT is not set
> CONFIG_KVM_DEVICE_ASSIGNMENT=y
>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: "H. Peter Anvin" <[email protected]>
> Cc: Gleb Natapov <[email protected]>
> Cc: Raghavendra K T <[email protected]>
> Cc: Marcelo Tosatti <[email protected]>
> Signed-off-by: Tim Gardner <[email protected]>
> ---
>
> There is likely a more elegant and complicated way to solve this
> problem, but a simple rename for a debug feature seems best,
> especially for an -rc7.

I agree, but I'd prefer "kvm-pv" to distinguish x86-specific KVM
features on the host (the debugfs directory that Linus was complaining
about) from KVM features that require guest cooperation (this case).

I'll send a pull request tomorrow.

Paolo

> arch/x86/kernel/kvm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index a0e2a8a..d17895a 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -609,7 +609,7 @@ static struct dentry *d_kvm_debug;
>
> struct dentry *kvm_init_debugfs(void)
> {
> - d_kvm_debug = debugfs_create_dir("kvm", NULL);
> + d_kvm_debug = debugfs_create_dir("kvm-x86", NULL);
> if (!d_kvm_debug)
> printk(KERN_WARNING "Could not create 'kvm' debugfs directory\n");
>
>

2013-10-29 19:26:36

by Raghavendra K T

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Adding Greg/AI too since we touch debugfs code.
[...]
>>
>> sudo modprobe kvm_amd
>> modprobe: ERROR: could not insert 'kvm_amd': Bad address
>
> "Bad address"? Christ people, are you guys making up error numbers
> with some kind of dice-roll? I can just see it now, somebody sitting
> there with a D20, playing some kind of kernel-specific D&D, and
> rolling a ten means that you get to slay the orc, and pick an error
> number of EFAULT for some random kernel function. Because quite
> frankly, "random dice roll" is the _only_ thing that explains "Bad
> address" sufficiently.
>
> Please, whoever wrote virt/kvm/kvm_main.c:: kvm_init_debug(), WTF?
> EFAULT means "user passed in an invalid virtual address pointer",
> which is why the error string is "Bad address". It makes absolutely NO
> SENSE here. Perhaps EEXIST or EBUSY.
>

Right. In current scenario it should have been EEXIST :(.

debugfs_create_dir() currently returns NULL dentry on both
EEXIST, ENOMEM ... cases.

Could one solution be cascading actual error
that is lost in fs/debugfs/inode.c:__create_file(), so that we could
take correct action in case of failure of debugfs_create_dir()?

(ugly side is we increase total number of params for __create_file to
6). or I hope there could be some better solution.

2013-10-29 19:33:40

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On Tue, Oct 29, 2013 at 12:27 PM, Raghavendra K T
<[email protected]> wrote:
>
> Could one solution be cascading actual error
> that is lost in fs/debugfs/inode.c:__create_file(), so that we could
> take correct action in case of failure of debugfs_create_dir()?
>
> (ugly side is we increase total number of params for __create_file to
> 6). or I hope there could be some better solution.

The solution to this would be to simply return an error-pointer. See
<linux/err.h>. That's what we do for most complex subsystems that
return a pointer to a struct: rather than returning "NULL" as an
error, return the actual error number encoded in the pointer itself.

But that would require every user of debugfs_create_dir() to be
updated to check errors using IS_ERR() instead of checking against
NULL, and there's quite a few of them.

So I think just making the error be EEXIST is a simpler solution right now.

Linus

2013-10-29 20:00:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On Wed, Oct 30, 2013 at 12:57:32AM +0530, Raghavendra K T wrote:
> Adding Greg/AI too since we touch debugfs code.

You do?

> [...]
> >>
> >>sudo modprobe kvm_amd
> >>modprobe: ERROR: could not insert 'kvm_amd': Bad address
> >
> >"Bad address"? Christ people, are you guys making up error numbers
> >with some kind of dice-roll? I can just see it now, somebody sitting
> >there with a D20, playing some kind of kernel-specific D&D, and
> >rolling a ten means that you get to slay the orc, and pick an error
> >number of EFAULT for some random kernel function. Because quite
> >frankly, "random dice roll" is the _only_ thing that explains "Bad
> >address" sufficiently.
> >
> >Please, whoever wrote virt/kvm/kvm_main.c:: kvm_init_debug(), WTF?
> >EFAULT means "user passed in an invalid virtual address pointer",
> >which is why the error string is "Bad address". It makes absolutely NO
> >SENSE here. Perhaps EEXIST or EBUSY.
> >
>
> Right. In current scenario it should have been EEXIST :(.
>
> debugfs_create_dir() currently returns NULL dentry on both
> EEXIST, ENOMEM ... cases.
>
> Could one solution be cascading actual error
> that is lost in fs/debugfs/inode.c:__create_file(), so that we could
> take correct action in case of failure of debugfs_create_dir()?

What would you do here? You shouldn't really care about debugfs files,
if there's an error, keep on going, no code path should really care,
right?

thanks,

greg k-h

2013-10-30 13:54:41

by Raghavendra K T

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On 10/30/2013 01:30 AM, Greg KH wrote:
[...]
>> debugfs_create_dir() currently returns NULL dentry on both
>> EEXIST, ENOMEM ... cases.
>>
>> Could one solution be cascading actual error
>> that is lost in fs/debugfs/inode.c:__create_file(), so that we could
>> take correct action in case of failure of debugfs_create_dir()?
>
> What would you do here? You shouldn't really care about debugfs files,
> if there's an error, keep on going, no code path should really care,
> right?
>
Thanks Greg.

Yes you are right. we can't do anything useful after that error.

If debugfs directories are not critical, may be we could
continue from there.

2013-10-30 14:00:18

by Raghavendra K T

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On 10/30/2013 01:03 AM, Linus Torvalds wrote:
> On Tue, Oct 29, 2013 at 12:27 PM, Raghavendra K T
> <[email protected]> wrote:
>>
>> Could one solution be cascading actual error
>> that is lost in fs/debugfs/inode.c:__create_file(), so that we could
>> take correct action in case of failure of debugfs_create_dir()?
>>
>> (ugly side is we increase total number of params for __create_file to
>> 6). or I hope there could be some better solution.
>
> The solution to this would be to simply return an error-pointer. See
> <linux/err.h>. That's what we do for most complex subsystems that
> return a pointer to a struct: rather than returning "NULL" as an
> error, return the actual error number encoded in the pointer itself.

Thank you Linus. Yes, this would have been ideal.

>
> But that would require every user of debugfs_create_dir() to be
> updated to check errors using IS_ERR() instead of checking against
> NULL, and there's quite a few of them.
>
> So I think just making the error be EEXIST is a simpler solution right now.
>

Agree.
I had below patch, and just before sending as formal mail I saw
Paolo's pull request which already took care of it.
---8<---



Attachments:
0001-Return-EEXIST-on-debugfs_create_dir-failure-in-kvm.patch (1.75 kB)

2013-10-30 14:22:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On Wed, Oct 30, 2013 at 07:31:05PM +0530, Raghavendra K T wrote:
> On 10/30/2013 01:03 AM, Linus Torvalds wrote:
> > On Tue, Oct 29, 2013 at 12:27 PM, Raghavendra K T
> > <[email protected]> wrote:
> >>
> >> Could one solution be cascading actual error
> >> that is lost in fs/debugfs/inode.c:__create_file(), so that we could
> >> take correct action in case of failure of debugfs_create_dir()?
> >>
> >> (ugly side is we increase total number of params for __create_file to
> >> 6). or I hope there could be some better solution.
> >
> > The solution to this would be to simply return an error-pointer. See
> > <linux/err.h>. That's what we do for most complex subsystems that
> > return a pointer to a struct: rather than returning "NULL" as an
> > error, return the actual error number encoded in the pointer itself.
>
> Thank you Linus. Yes, this would have been ideal.
>
> >
> > But that would require every user of debugfs_create_dir() to be
> > updated to check errors using IS_ERR() instead of checking against
> > NULL, and there's quite a few of them.
> >
> > So I think just making the error be EEXIST is a simpler solution right now.
> >
>
> Agree.
> I had below patch, and just before sending as formal mail I saw
> Paolo's pull request which already took care of it.
> ---8<---
>
>

> >From ac5d9f038c273f27bea7a54aab6af79b57f56317 Mon Sep 17 00:00:00 2001
> From: Raghavendra K T <[email protected]>
> Date: Wed, 30 Oct 2013 18:59:46 +0530
> Subject: [PATCH] Return EEXIST on debugfs_create_dir failure in kvm
>
> As quoted by Linus, EFAULT means "user passed in an invalid
> virtual address pointer", which is why the error string is Bad address.
> But when a debugfs directory creation fails, the above error is not valid.
>
> Signed-off-by: Raghavendra K T <[email protected]>
> ---
> I understand that Tim's patch that renames directory to something like
> kvm-pv would solve kvm-amd/kvm-intel modules insertion problem.
> This patch is to address error code change complained by Linus.
>
> arch/x86/kernel/kvm.c | 2 +-
> virt/kvm/kvm_main.c | 3 ++-
> 2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index a0e2a8a..e475fdb 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -622,7 +622,7 @@ static int __init kvm_spinlock_debugfs(void)
>
> d_kvm = kvm_init_debugfs();
> if (d_kvm == NULL)
> - return -ENOMEM;
> + return -EEXIST;

Why even error out at all here? You should never care about debugfs
file creation return values, so why pass any error back up the stack?

greg k-h

2013-10-30 15:38:58

by Raghavendra K T

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On 10/30/2013 07:53 PM, Greg KH wrote:
[...]
>>
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index a0e2a8a..e475fdb 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -622,7 +622,7 @@ static int __init kvm_spinlock_debugfs(void)
>>
>> d_kvm = kvm_init_debugfs();
>> if (d_kvm == NULL)
>> - return -ENOMEM;
>> + return -EEXIST;
>
> Why even error out at all here? You should never care about debugfs
> file creation return values, so why pass any error back up the stack?

We could change this to return 0, but we will still be left with
kvm_main.c: kvm_init_debug() function which creates the kvm debugfs
directory. (I thought Paolo and Gleb's ack would be needed for
that change since it would be a bigger decision for me)

So I just made the patch to fix only Linus's concern.

But sorry that I did not make that explicit. (infact module insertion
error was because of successful creation of kvm directory in above code
and then error coming from kvm_init_debug() in kvm_main.c).

2013-10-30 15:46:52

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Il 30/10/2013 16:39, Raghavendra K T ha scritto:
>>
>> Why even error out at all here? You should never care about debugfs
>> file creation return values, so why pass any error back up the stack?
>
> We could change this to return 0, but we will still be left with
> kvm_main.c: kvm_init_debug() function which creates the kvm debugfs
> directory. (I thought Paolo and Gleb's ack would be needed for
> that change since it would be a bigger decision for me)
>
> So I just made the patch to fix only Linus's concern.

Even if it is okay to exit and not create the files (and I think it's a
bit surprising), I'd like at least a printk to signal what's happening.
But there should be no reason for debugfs directory creation to fail in
the end, except for basic mistakes such as the one that Tim reported, so
I think it's better to report the failure.

> But sorry that I did not make that explicit. (infact module insertion
> error was because of successful creation of kvm directory in above code
> and then error coming from kvm_init_debug() in kvm_main.c).

Yes, and I'm keeping the error reporting there too, while fixing the
bogus EFAULT.

Paolo

2013-10-30 15:59:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On Wed, Oct 30, 2013 at 04:46:28PM +0100, Paolo Bonzini wrote:
> Il 30/10/2013 16:39, Raghavendra K T ha scritto:
> >>
> >> Why even error out at all here? You should never care about debugfs
> >> file creation return values, so why pass any error back up the stack?
> >
> > We could change this to return 0, but we will still be left with
> > kvm_main.c: kvm_init_debug() function which creates the kvm debugfs
> > directory. (I thought Paolo and Gleb's ack would be needed for
> > that change since it would be a bigger decision for me)
> >
> > So I just made the patch to fix only Linus's concern.
>
> Even if it is okay to exit and not create the files (and I think it's a
> bit surprising), I'd like at least a printk to signal what's happening.
> But there should be no reason for debugfs directory creation to fail in
> the end, except for basic mistakes such as the one that Tim reported, so
> I think it's better to report the failure.

Creation will "fail" if debugfs is not enabled, so spiting out printk
messages in that case is not a good idea.

greg k-h

2013-10-30 16:08:36

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Il 30/10/2013 16:59, Greg KH ha scritto:
>> > Even if it is okay to exit and not create the files (and I think it's a
>> > bit surprising), I'd like at least a printk to signal what's happening.
>> > But there should be no reason for debugfs directory creation to fail in
>> > the end, except for basic mistakes such as the one that Tim reported, so
>> > I think it's better to report the failure.
> Creation will "fail" if debugfs is not enabled, so spiting out printk
> messages in that case is not a good idea.

Interestingly, if debugfs is not enabled we are already returning an
error-valued pointer:

static inline struct dentry *debugfs_create_dir(const char *name,
struct dentry *parent)
{
return ERR_PTR(-ENODEV);
}

which would oops a lot of the current callers. Very few places use the
currently correct idiom

if (IS_ERR(root) || !root)

but it's very ugly... Perhaps debugfs_create_dir *should* return an
error-valued pointer after all.

Paolo

2013-10-30 16:23:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

On Wed, Oct 30, 2013 at 05:08:09PM +0100, Paolo Bonzini wrote:
> Il 30/10/2013 16:59, Greg KH ha scritto:
> >> > Even if it is okay to exit and not create the files (and I think it's a
> >> > bit surprising), I'd like at least a printk to signal what's happening.
> >> > But there should be no reason for debugfs directory creation to fail in
> >> > the end, except for basic mistakes such as the one that Tim reported, so
> >> > I think it's better to report the failure.
> > Creation will "fail" if debugfs is not enabled, so spiting out printk
> > messages in that case is not a good idea.
>
> Interestingly, if debugfs is not enabled we are already returning an
> error-valued pointer:
>
> static inline struct dentry *debugfs_create_dir(const char *name,
> struct dentry *parent)
> {
> return ERR_PTR(-ENODEV);
> }
>
> which would oops a lot of the current callers.

It will oops? Really? Where? That shouldn't happen at all.

> Very few places use the currently correct idiom
>
> if (IS_ERR(root) || !root)
>
> but it's very ugly... Perhaps debugfs_create_dir *should* return an
> error-valued pointer after all.

Or just don't care about the return value, and all will work out just
fine, right?

thanks,

greg k-h

2013-10-30 16:41:31

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 3.12-rc7] KVM: Fix modprobe failure for kvm_intel/kvm_amd

Il 30/10/2013 17:23, Greg KH ha scritto:
>> > static inline struct dentry *debugfs_create_dir(const char *name,
>> > struct dentry *parent)
>> > {
>> > return ERR_PTR(-ENODEV);
>> > }
>> >
>> > which would oops a lot of the current callers.
> It will oops? Really? Where? That shouldn't happen at all.

Doh, it obviously won't because the bogus pointer value will never be
dereferenced by the dummy debugfs_create_file.

>> > Very few places use the currently correct idiom
>> >
>> > if (IS_ERR(root) || !root)
>> >
>> > but it's very ugly... Perhaps debugfs_create_dir *should* return an
>> > error-valued pointer after all.
> Or just don't care about the return value, and all will work out just
> fine, right?

Debugfs files would then be created in the debugfs root, which is not nice.

/* If the parent is not specified, we create it in the root.
* We need the root dentry to do this, which is in the super
* block. A pointer to that is in the struct vfsmount that we
* have around.
*/
if (!parent)
parent = debugfs_mount->mnt_root;

This is all documented right:

/**
* debugfs_create_file - create a file in the debugfs filesystem
* ...
* This function will return a pointer to a dentry if it succeeds. This
* pointer must be passed to the debugfs_remove() function when the file is
* to be removed (no automatic cleanup happens if your module is unloaded,
* you are responsible here.) If an error occurs, %NULL will be returned.
*
* If debugfs is not enabled in the kernel, the value -%ENODEV will be
* returned.
*/

so I guess it's just part of the API.

Paolo