Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932254AbcDDJHr (ORCPT ); Mon, 4 Apr 2016 05:07:47 -0400 Received: from smtp02.citrix.com ([66.165.176.63]:63036 "EHLO SMTP02.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932223AbcDDJHo (ORCPT ); Mon, 4 Apr 2016 05:07:44 -0400 X-IronPort-AV: E=Sophos;i="5.24,439,1454976000"; d="scan'208";a="351219619" Subject: Re: Getting rid of inside_vm in intel8x0 To: Andy Lutomirski , Takashi Iwai References: <20160331222618.GE1990@wotan.suse.de> <20160401222831.GX1990@wotan.suse.de> CC: "Luis R. Rodriguez" , Stefano Stabellini , Andrew Cooper , Stephen Hemminger , Jim Gettys , Dario Faggioli , , Ingo Molnar , "Thomas Gleixner" , Steven Rostedt , "Mel Gorman" , Konstantin Ozerkov , "linux-kernel@vger.kernel.org" , "ALSA development" , Matt Fleming , Borislav Petkov , xen-devel , Joerg Roedel From: George Dunlap Message-ID: <57022E67.7050907@citrix.com> Date: Mon, 4 Apr 2016 10:05:43 +0100 User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-DLP: MIA2 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4895 Lines: 100 On 02/04/16 13:57, Andy Lutomirski wrote: > On Fri, Apr 1, 2016 at 10:33 PM, Takashi Iwai wrote: >> On Sat, 02 Apr 2016 00:28:31 +0200, >> Luis R. Rodriguez wrote: >>> If the former, could a we somehow detect an emulated device other than through >>> this type of check ? Or could we *add* a capability of some sort to detect it >>> on the driver ? This would not address the removal, but it could mean finding a >>> way to address emulation issues. >>> >>> If its an IO issue -- exactly what is the causing the delays in IO ? >> >> Luis, there is no problem about emulation itself. It's rather an >> optimization to lighten the host side load, as I/O access on a VM is >> much heavier. >> >>>>>> This is satisfied mostly only on VM, and can't >>>>>> be measured easily unlike the IO read speed. >>>>> >>>>> Interesting, note the original patch claimed it was for KVM and >>>>> Parallels hypervisor only, but since the code uses: >>>>> >>>>> +#if defined(__i386__) || defined(__x86_64__) >>>>> + inside_vm = inside_vm || boot_cpu_has(X86_FEATURE_HYPERVISOR); >>>>> +#endif >>>>> >>>>> This makes it apply also to Xen as well, this makes this hack more >>>>> broad, but does is it only applicable when an emulated device is >>>>> used ? What about if a hypervisor is used and PCI passthrough is >>>>> used ? >>>> >>>> A good question. Xen was added there at the time from positive >>>> results by quick tests, but it might show an issue if it's running on >>>> a very old chip with PCI passthrough. But I'm not sure whether PCI >>>> passthrough would work on such old chipsets at all. >>> >>> If it did have an issue then that would have to be special cased, that >>> is the module parameter would not need to be enabled for such type of >>> systems, and heuristics would be needed. As you note, fortunately this >>> may not be common though... >> >> Actually this *is* module parametered. If set to a boolean value, it >> can be applied / skipped forcibly. So, if there has been a problem on >> Xen, this should have been reported. That's why I wrote it's no >> common case. This comes from the real experience. >> >>> but if this type of work around may be >>> taken as a precedent to enable other types of hacks in other drivers >>> I'm very fearful of more hacks later needing these considerations as >>> well. >>> >>>>>>> There are a pile of nonsensical "are we in a VM" checks of various >>>>>>> sorts scattered throughout the kernel, they're all a mess to maintain >>>>>>> (there are lots of kinds of VMs in the world, and Linux may not even >>>>>>> know it's a guest), and, in most cases, it appears that the correct >>>>>>> solution is to delete the checks. I just removed a nasty one in the >>>>>>> x86_32 entry asm, and this one is written in C so it should be a piece >>>>>>> of cake :) >>>>>> >>>>>> This cake looks sweet, but a worm is hidden behind the cream. >>>>>> The loop in the code itself is already a kludge for the buggy hardware >>>>>> where the inconsistent read happens not so often (only at the boundary >>>>>> and in a racy way). It would be nice if we can have a more reliably >>>>>> way to know the hardware buggyness, but it's difficult, >>>>>> unsurprisingly. >>>>> >>>>> The concern here is setting precedents for VM cases sprinkled in the kernel. >>>>> The assumption here is such special cases are really paper'ing over another >>>>> type of issue, so its best to ultimately try to root cause the issue in >>>>> a more generalized fashion. >>>> >>>> Well, it's rather bare metal that shows the buggy behavior, thus we >>>> need to paper over it. In that sense, it's other way round; we don't >>>> tune for VM. The VM check we're discussing is rather for skipping the >>>> strange workaround. >>> >>> What is it exactly about a VM that enables this work around to be skipped? >>> I don't quite get it yet. >> >> VM -- at least the full one with the sound hardware emulation -- >> doesn't have the hardware bug. So, the check isn't needed. > > Here's the issue, though: asking "am I in a VM" is not a good way to > learn properties of hardware. Just off the top of my head, here are > some types of VM and what they might imply about hardware: > > Intel Kernel Guard: your sound card is passed through from real hardware. > > Xen: could go either way. In dom0, it's likely passed through. In > domU, it could be passed through or emulated, and I believe this is > the case for all of the Xen variants. > > KVM: Probably emulated, but could be passed through. I'm not sure exactly why I was CC'd into this thread, but this is an important point -- even if you're running in a VM, you may actually have direct un-emulated IO access to a real (buggy) piece of hardware; in which case it sounds like you still need the work-around. So boot_cpu_has(X86_FEATURE_HYPERVISOR) is probably not the right check. -George