Hi.
Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
start a git-bisect, I thought I might ask if anyone knew of some
compilation option I might have missed.
Regards,
Nigel
On 5/1/07, Nigel Cunningham <[email protected]> wrote:
> Hi.
>
> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
I'm on i386 and noticed it makes the "guest" very unresponsive.
Unchecking "disable write-caching" solves the problem.
Jeff.
Hi.
On Tue, 2007-05-01 at 14:13 +0800, Jeff Chua wrote:
> On 5/1/07, Nigel Cunningham <[email protected]> wrote:
> > Hi.
> >
> > Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> > for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
>
>
> I'm on i386 and noticed it makes the "guest" very unresponsive.
>
> Uncheck "disable write-caching" solves the problem.
Thanks for the reply.
x86_64 seems to have a completely different issue. I think something in
the changes that went in for x86 might have broken x86_64 completely.
Guess I'll have to try a bisect.
Regards,
Nigel
I'm having the same problem. I think it has to do with timer problems...
Does /sbin/hwclock --show freeze your box? It "sometimes" works on
mine, but more often than not it locks the box hard (especially as a
regular user, not root), with no hints in syslog or anywhere else as to
what happened. This is a Turion X2 with an nVidia MCP51.
On 5/1/07, Nigel Cunningham <[email protected]> wrote:
> Hi.
>
> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
Hi,
On Tue, 2007-05-01 at 01:50 -0500, Marcos Pinto wrote:
> I'm having the same problem. I think it has to do with timer problems...
> Does /sbin/hwclock --show freeze your box? It "sometimes" works on
> mine, but it more often than not locks it hard (especially as user,
> not root) with no hints in syslog or anywhere else as to what
> happened. This is a turion x2 with nvidia mcp51
I just tried that four times in a row, and they all worked. Mine is a
single core (M-34) Turion - Mitac 8350 mobo.
I've attached my .config, if it helps.
Regards,
Nigel
> Does anyone have VMware working on x86_64 with 2.6.21?
VMware server 1.0.2 (free edition) is working fine here with 2.6.21.1
using vmware-any-any-update109.
System is Intel Core2 T7600 (x86_64)
Best,
Michael
Hi.
On Tue, 2007-05-01 at 09:24 +0200, Michael Gerdau wrote:
> > Does anyone have VMware working on x86_64 with 2.6.21?
>
> VMware server 1.0.2 (free edition) is working fine here with 2.6.21.1
> using vmware-any-any-update109.
>
> System is Intel Core2 T7600 (x86_64)
Ok. Maybe I'll try the any-any update. I'm using the Workstation 6 beta.
Could I get you to send me your .config, just in case?
Regards,
Nigel
On May 1 2007 14:13, Jeff Chua wrote:
> On 5/1/07, Nigel Cunningham <[email protected]> wrote:
>> Hi.
>>
>> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
>> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
>
>
> I'm on i386 and noticed it makes the "guest" very unresponsive.
>
> Uncheck "disable write-caching" solves the problem.
Why do you even have "disable wc" turned _on_? (Yes, it's a VMware 5 bug
where it gets randomly set upon machine creation; VMware WS 6 has it fixed.)
Jan
On May 1 2007 17:36, Nigel Cunningham wrote:
>On Tue, 2007-05-01 at 09:24 +0200, Michael Gerdau wrote:
>> > Does anyone have VMware working on x86_64 with 2.6.21?
>>
>> VMware server 1.0.2 (free edition) is working fine here with 2.6.21.1
>> using vmware-any-any-update109.
>>
>> System is Intel Core2 T7600 (x86_64)
>
>Ok. Maybe I'll try the any-any update. I'm using the Workstation 6 beta.
>Could I get you to send me your .config, just in case?
The AA updates are not for WS6, AFAICT (at least that was the case with aa105).
Jan
Hi.
On Tue, 2007-05-01 at 13:10 +0200, Jan Engelhardt wrote:
> On May 1 2007 17:36, Nigel Cunningham wrote:
> >On Tue, 2007-05-01 at 09:24 +0200, Michael Gerdau wrote:
> >> > Does anyone have VMware working on x86_64 with 2.6.21?
> >>
> >> VMware server 1.0.2 (free edition) is working fine here with 2.6.21.1
> >> using vmware-any-any-update109.
> >>
> >> System is Intel Core2 T7600 (x86_64)
> >
> >Ok. Maybe I'll try the any-any update. I'm using the Workstation 6 beta.
> >Could I get you to send me your .config, just in case?
>
> the AA updates are not for WS6 AFAICT (at least this was the case with aa105).
Yeah. It turns out that I was still on WS 6 beta 1. Upgrading to beta 2
made the problem go away.
Regards,
Nigel
On 5/1/07, Jan Engelhardt <[email protected]> wrote:
>
> On May 1 2007 14:13, Jeff Chua wrote:
> > On 5/1/07, Nigel Cunningham <[email protected]> wrote:
> >> Hi.
> >>
> >> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> >> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
> >
> >
> > I'm on i386 and noticed it makes the "guest" very unresponsive.
> >
> > Uncheck "disable write-caching" solves the problem.
>
> Why do you even have "disable wc" turned _on_? (Yes, it's a VMware 5 bug where
> it gets randomly set upon machine creation, VM WS 6 has it fixed.)
I thought so too - I never knew it defaulted to on until I poked around
and realized that it was the culprit.
I'll wait for the stable WS6 to be released. Right now, WS5 is good
enough for me.
Thanks,
Jeff.
Nigel Cunningham wrote:
..
> Yeah. It turns out that I was still on WS 6 beta 1. Upgrading to beta 2
> made the problem go away.
Then you're still behind. They've had a WS6 RC available for the past week.
Cheers
On Tue, 2007-05-01 at 15:42 +1000, Nigel Cunningham wrote:
> Hi.
>
> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
> start a git-bisect, I thought I might ask if anyone knew of some
> compilation option I might have missed.
If you want to ask questions about proprietary kernel stuff, you're
better off asking the vendor directly, not lkml.
Hi.
On Tue, 2007-05-01 at 09:46 -0400, Mark Lord wrote:
> Nigel Cunningham wrote:
> ..
> > Yeah. It turns out that I was still on WS 6 beta 1. Upgrading to beta 2
> > made the problem go away.
>
> Then you're still behind. They've had a WS6 RC available for the past week.
Or I'm confused about the name :) Build 44426 is what I have. Sorry :)
Nigel
Hi Arjan.
On Tue, 2007-05-01 at 07:57 -0700, Arjan van de Ven wrote:
> On Tue, 2007-05-01 at 15:42 +1000, Nigel Cunningham wrote:
> > Hi.
> >
> > Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
> > for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
> > start a git-bisect, I thought I might ask if anyone knew of some
> > compilation option I might have missed.
>
>
> if you want to ask questions about proprietary kernel stuff you're
> better off asking the vendor directly, not lkml
I did, but given that the failure only appeared with a change of vanilla
kernel version, I didn't think it was out of place to ask here too.
Regards,
Nigel
Marcos Pinto wrote:
> I'm having the same problem. I think it has to do with timer problems...
> Does /sbin/hwclock --show freeze your box? It "sometimes" works on
> mine, but it more often than not locks it hard (especially as user,
> not root) with no hints in syslog or anywhere else as to what
> happened. This is a turion x2 with nvidia mcp51
/sbin/hwclock freezes the whole box for me too (Turion X2 with nVidia MCP51,
HP Pavilion dv9210us), but only when syncing the hardware clock to system
time. I suspect it could be a hwclock IOPL bug, but I will have to check the
source to be sure. However, I see this with the 2.6.20 Gentoo kernel as well,
not just 2.6.21, so I think it is a different bug.
Zach
Nigel Cunningham wrote:
> Hi Arjan.
>
> On Tue, 2007-05-01 at 07:57 -0700, Arjan van de Ven wrote:
>> On Tue, 2007-05-01 at 15:42 +1000, Nigel Cunningham wrote:
>>> Hi.
>>>
>>> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
>>> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
>>> start a git-bisect, I thought I might ask if anyone knew of some
>>> compilation option I might have missed.
>>
>> if you want to ask questions about proprietary kernel stuff you're
>> better off asking the vendor directly, not lkml
>
> I did, but given that it the failure only appeared with a change of
> vanilla kernel version, I didn't think it was out of place to ask here
> too.
I thought I had already talked about this on VMware's forums, but apparently
I only discussed it in email. The culprit (if I can call it that) is
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=610142927b5bc149da92b03c7ab08b8b5f205b74
It changed the interrupt layout - before that change IRQs 0-15 used vectors
0x20-0x2F; after it they use vectors 0x30-0x3F. This has the unfortunate
effect that when hardware IRQ 8 arrives while a VM is running, the vmm
believes it internally used 'INT 0x38' to call some hypervisor service -
so (1) the hardware interrupt is never acknowledged, and (2) the hypervisor
performs a random operation depending on the contents of the registers at
the time the interrupt arrived. Both are quite bad, and the usual result is
that VMware panics; while writing the core dump the kernel hangs, because
the IOAPIC believes IRQ 8 is still in service and so never delivers IRQs
14/15 for the legacy IDE hard disks (which share the same priority level).
One possible fix (if you need to run products older than VMware
Workstation 6 on 64-bit 2.6.21+) is replacing
#define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x10
with
#define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x08
Then vector 0x38 will be skipped. The other option is to move only IRQ8_VECTOR
somewhere else (into the 0x21-0x2F range); see the sketch below.
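A minimal sketch of what that second option might look like, assuming the
IRQn_VECTOR defines of the 2.6.21-era include/asm-x86_64/hw_irq.h
(illustrative and untested - the surrounding defines are paraphrased, not
quoted from the tree):

/* Leave IRQ0_VECTOR at FIRST_EXTERNAL_VECTOR + 0x10 (0x30), but pull only
 * the RTC interrupt (IRQ 8) down into the 0x21-0x2F range so that vector
 * 0x38 is never handed out to a hardware interrupt. */
#define IRQ0_VECTOR	(FIRST_EXTERNAL_VECTOR + 0x10)
#define IRQ7_VECTOR	(IRQ0_VECTOR + 7)
#define IRQ8_VECTOR	(FIRST_EXTERNAL_VECTOR + 0x08)	/* 0x28 instead of 0x38 */
#define IRQ9_VECTOR	(IRQ0_VECTOR + 9)

Whether the rest of the vector allocation code copes with a single
out-of-order vector would need checking, so treat this as a workaround
sketch only.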
Petr Vandrovec
P.S.: Well, and obviously this has nothing to do with vmmon...
Petr Vandrovec <[email protected]> writes:
> Nigel Cunningham wrote:
>> Hi Arjan.
>>
>> On Tue, 2007-05-01 at 07:57 -0700, Arjan van de Ven wrote:
>>> On Tue, 2007-05-01 at 15:42 +1000, Nigel Cunningham wrote:
>>>> Hi.
>>>>
>>>> Does anyone have VMware working on x86_64 with 2.6.21? It's working fine
>>>> for me with 2.6.20, but freezes the whole computer with 2.6.21. Before I
>>>> start a git-bisect, I thought I might ask if anyone knew of some
>>>> compilation option I might have missed.
>>>
>>> if you want to ask questions about proprietary kernel stuff you're
>>> better off asking the vendor directly, not lkml
>>
>> I did, but given that it the failure only appeared with a change of
>> vanilla kernel version, I didn't think it was out of place to ask here
>> too.
>
> I thought I already talked about that on VMware's forums, but apparently I just
> discussed it in email only. Culprit (if I can say that) is
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=610142927b5bc149da92b03c7ab08b8b5f205b74
>
> It changed interrupt layout - before that change IRQ 0-15 were using vectors
> 0x20-0x2F, after change they use interrupts 0x30-0x3F. Which has unfortunate
> effect that when hardware IRQ 8 arrives while VM is running, vmm believes that
> it internally used 'INT 0x38' to call some hypervisor service - and (1) hardware
> interrupt is never acknowledged, and (2) hypervisor issues random operation
> depending on contents of registers at the time interrupt arrived. Both are
> quite bad, and usual result is that VMware panics, and while writing core dump
> kernel hangs as IOAPIC believes that there is IRQ 8 in service, and so it does
> not ever deliver IRQs 14/15 for legacy IDE harddisks (which are at same level).
>
> One of possible fixes (if you need to run older products than VMware Workstation
> 6 on 64bit 2.6.21+) is replacing
>
> #define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x10
>
> with
>
> #define IRQ0_VECTOR FIRST_EXTERNAL_VECTOR + 0x08
Nope. That will break irq migration - don't even think about it.
> Then IRQ 0x38 will be skipped. Other option is move only IRQ8_VECTOR somewhere
> else (into 0x21-0x2F range).
> Petr Vandrovec
>
> P.S.: Well, and obviously this has nothing to do with vmmon...
I don't even want to think about how a kernel module gets far enough
into the kernel to be affected by our vector layout. These are internal
implementation details, without anything exported to modules.
Can I please see the source of the code in vmware that is doing this?
Eric
Eric W. Biederman wrote:
> I don't even want to think about how a kernel module gets far enough
> into the kernel to be affected by our vector layout. These are internal
> implementation details, without anything exported to modules.
>
> Can I please see the source of the code in vmware that is doing this?
>
Sorry, that code is not part of the kernel or any kernel module. It is
part of a fixed set of assumptions about the platform coded in the
hypervisor, which is not open source. This code runs completely outside
the scope of Linux, and uses a platform dependent set of IDT software
vectors which are known not to collide with IDT IRQ vectors. We use
these software vectors for internal purposes; they are never visible to
any Linux software, but are handled and trapped by the hypervisor.
Nevertheless, since we must distinguish between software IRQs and
hardware IRQs, we must find vectors that do not collide with the set of
hardware IRQs or processor exceptions.
To avoid this dependence on fixed assumptions about vector layout, what
is needed is a mechanism to reserve and allocate software IDT vectors.
It may be a GPL'd interface; it is certainly interfacing with the kernel
at a low level.
An interface would likely look something like:
int idt_allocate_swirq(int best_irq);
void idt_release_swirq(int irq);
int __init idt_reserve_irqs(int count);
void idt_set_swirq_handler (int irq, int is_user, void (*handle)(struct
pt_regs *regs, unsigned long error_code));
EXPORT_SYMBOL_GPL(idt_allocate_swirq);
EXPORT_SYMBOL_GPL(idt_release_swirq);
EXPORT_SYMBOL_GPL(idt_set_swirq_handler);
Now you can set aside a fixed number of IRQs to be used for software
IRQs at boot time, and allocate them as required. You can even create
software IRQs which can be handled by userspace applications, or reserve
software IRQs for other uses - from within the kernel itself, or from
outside any kernel context (for example an IPI invoked from a non-kernel
CPU). There are cases where this would be a useful feature for us;
being able to issue IPIs directly to a hypervisor mode CPU would be a
significant speedup (alternatively, having a kernel module handle the
IPI when the CPU is in kernel mode and schedule the vmx process to run
and forward the IPI).
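A minimal sketch, purely hypothetical since the interface above is only a
proposal, of how a module such as vmmon might consume it (all of the idt_*
functions are the proposed ones, not existing kernel APIs):

/*
 * Hypothetical client of the proposed swirq interface.  None of the
 * idt_* functions exist in mainline; this only shows the intended usage.
 */
#include <linux/module.h>
#include <linux/init.h>
#include <asm/ptrace.h>

static int hv_swirq = -1;

/* Runs on the reserved software vector, with the signature proposed above. */
static void hv_swirq_handler(struct pt_regs *regs, unsigned long error_code)
{
	/* hand the request off to the hypervisor / vmx process here */
}

static int __init hv_init(void)
{
	/* Ask for a vector, preferring the one the hypervisor already assumes. */
	hv_swirq = idt_allocate_swirq(0x38);
	if (hv_swirq < 0)
		return hv_swirq;

	/* is_user = 0: only ring-0 code may raise this vector. */
	idt_set_swirq_handler(hv_swirq, 0, hv_swirq_handler);
	return 0;
}

static void __exit hv_exit(void)
{
	idt_release_swirq(hv_swirq);
}

module_init(hv_init);
module_exit(hv_exit);
MODULE_LICENSE("GPL");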
The thought of running non-kernel code in ring-0 on some CPU is scary,
certainly. Nevertheless it is required for running a hypervisor which
does not live in the kernel address space and must handle its own page
faults and other exceptions.
Zach
Zachary Amsden <[email protected]> writes:
> Eric W. Biederman wrote:
>> I don't even want to think about how a kernel module gets far enough
>> into the kernel to be affected by our vector layout. These are internal
>> implementation details, without anything exported to modules.
>>
>> Can I please see the source of the code in vmware that is doing this?
>>
>
> Sorry, that code is not part of the kernel or any kernel module. It is part of
> a fixed set of assumptions about the platform coded in the hypervisor, which is
> not open source. This code runs completely outside the scope of Linux, and uses
> a platform dependent set of IDT software vectors which are known not to collide
> with IDT IRQ vectors.
Is this Linux running on VMware, or VMware running on Linux?
> We use these software vectors for internal purposes; they
> are never visible to any Linux software, but are handled and trapped by the
> hypervisor. Nevertheless, since we must distinguish between software IRQs and
> hardware IRQs, we must find vectors that do not collide with the set of hardware
> IRQs or processor exceptions.
This sounds like playing with fire. Although I suppose you could do it generally
by making software irqs trigger a general protection fault.
> To avoid this dependence on fixed assumtions about vector layout, what is needed
> is a mechanism to reserve and allocate software IDT vectors. It may be a GPL'd
> interface; it certainly is interfacing with the kernel at a low-level.
>
> An interface that would likely looking something like:
>
> int idt_allocate_swirq(int best_irq);
> void idt_release_swirq(int irq);
> int __init idt_reserve_irqs(int count);
> void idt_set_swirq_handler (int irq, int is_user, void (*handle)(struct pt_regs
> *regs, unsigned long error_code));
> EXPORT_SYMBOL_GPL(idt_allocate_swirq);
> EXPORT_SYMBOL_GPL(idt_release_swirq);
> EXPORT_SYMBOL_GPL(idt_set_swirq_handler);
What we currently have is:
int assign_irq_vector(int irq, cpumask_t);
It has a number of interesting properties: you can change the vector
assignment at runtime, and we can migrate the irq between cpus.
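As a schematic sketch (simplified and paraphrased - the real 2.6.21
io_apic.c carries extra arguments, locking and per-cpu bookkeeping that are
omitted here), an in-kernel caller uses it roughly like this when
programming an IO-APIC pin:

static void example_route_pin(int apic, int pin, int irq)
{
	struct IO_APIC_route_entry entry;
	int vector;

	/* The allocator picks (and may later change) the vector for this irq. */
	vector = assign_irq_vector(irq, TARGET_CPUS);
	if (vector < 0)
		return;			/* out of vectors */

	memset(&entry, 0, sizeof(entry));
	entry.vector = vector;
	/* destination, trigger mode and polarity fields omitted here */
	ioapic_write_entry(apic, pin, entry);
}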
None of this is accessible to modules because we don't have any irq
controllers that it makes sense to write as modules. There is too much
platform magic involved to have portable irq controller code, and not
really any point in it.
> Now you can set aside a fixed number of IRQs to be used for software IRQs at
> boot time, and allocate them as required. You can even create software IRQs
> which can be handled by userspace applications, or reserve software IRQs for
> other uses - from within the kernel itself, or from outside any kernel context
> (for example an IPI invoked from a non-kernel CPU). There are cases where this
> would be a useful feature for us; being able to issue IPIs directly to a
> hypervisor mode CPU would be a significant speedup (alternatively, having a
> kernel module handle the IPI when the CPU is in kernel mode and schedule the vmx
> process to run and forward the IPI).
That is pretty much the architecture we have to support MSI. Although
irq != vector - not even at a fixed offset.
> The thought of running non-kernel code in ring-0 on some CPU is scary,
> certainly. Nevertheless it is required for running a hypervisor which does not
> live in the kernel address space and must handle its own page faults and other
> exceptions.
Yep. Untrusted binary blobs in ring-0 are very scary.
Eric
Eric W. Biederman wrote:
> Is this linux running on vmware or vmware running on linux?
>
VMware running on Linux.
> This sounds like playing with fire. Although I suppose you could do it generally
> by making software irqs trigger a general protection fault.
>
Better if you don't have to; the whole point of the swirq is faster
handling.
> What we currently have is:
> int assign_irq_vector(int irq, cpumask_t);
>
> It has a number of interesting properties such as you can change
> the vector assignment at runtime, and we can migrate the irq
> between cpus.
>
Yes, but still not enough here.
> That is pretty much the architecture we have to support msi. Although
> irq != vector not even at a fixed offset.
>
It doesn't look like you can safely allocate an exclusive IRQ here,
however - the IO-APIC could always route a hardware IRQ for the matching
vector right on top of you, unless I'm misreading something.
Zach
On Wed, May 02, 2007 at 01:14:16AM +1000, Nigel Cunningham wrote:
> > if you want to ask questions about proprietary kernel stuff you're
> > better off asking the vendor directly, not lkml
>
> I did, but given that it the failure only appeared with a change of
> vanilla kernel version, I didn't think it was out of place to ask here
> too.
No, it's still totally offtopic here.
> > > if you want to ask questions about proprietary kernel stuff you're
> > > better off asking the vendor directly, not lkml
> >
> > I did, but given that it the failure only appeared with a change of
> > vanilla kernel version, I didn't think it was out of place to ask here
> > too.
>
> No, it's still totally offtopic here.
And a change of vanilla kernel version leading to breakage is not a
sufficient condition to conclude the kernel is at fault. The entire history
of computing is saturated with examples of coders doing stupid things with
foolish assumptions and incidental interfaces of the operating system (early
games on *-DOS being a prime example), only to be broken by new versions of
that operating system. That's the programs being stupid, not the operating
system.
On Sat, 05 May 2007 10:56:09 BST, Christoph Hellwig said:
> On Wed, May 02, 2007 at 01:14:16AM +1000, Nigel Cunningham wrote:
> > > if you want to ask questions about proprietary kernel stuff you're
> > > better off asking the vendor directly, not lkml
> >
> > I did, but given that it the failure only appeared with a change of
> > vanilla kernel version, I didn't think it was out of place to ask here
> > too.
>
> No, it's still totally offtopic here.
I'm not convinced it's *totally* off-topic. I'll agree that third-party
binaries are on their own as far as active support goes, but I don't see
that it's off-topic to post a simple statement-of-fact like "2.6.mumble-rc1
breaks <popular-driver-FOO>" just so it's a *known* issue and people who
search the list archives don't spend forever re-inventing the wheel. Also,
it's quite *possible* that the binary module has tripped over a genuine
regression or bug in the kernel.
On Sun, May 06, 2007 at 03:16:13AM -0400, [email protected] wrote:
> I'm not convinced it's *totally* off-topic. I'll agree that third-party
> binaries are on their own as far as active support goes, but I don't see
> that it's off-topic to post a simple statement-of-fact like "2.6.mumble-rc1
> breaks <popular-driver-FOO>" just so it's a *known* issue and people who
> search the list archives don't spend forever re-inventing the wheel. Also,
> it's quite *possible* that the binary module has tripped over a geniune
> regression or bug in the kernel.
Actually it's totally offtopic. Not only are proprietary modules not
on the agenda at all here, but ones that poke deep into kernel
internals should be expected to break every time. Not to mention that
they are on the almost-black side of the legality scale for proprietary
modules.