Message-ID: <557142C0.4070103@linux.intel.com>
Date: Fri, 05 Jun 2015 14:33:36 +0800
From: Xiao Guangrong <guangrong.xiao@linux.intel.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Paolo Bonzini <pbonzini@redhat.com>
CC: gleb@kernel.org, mtosatti@redhat.com, kvm@vger.kernel.org,
        linux-kernel@vger.kernel.org, "Zhang, Yang Z" <yang.z.zhang@intel.com>
Subject: Re: [PATCH 14/15] KVM: MTRR: do not map huage page for non-consistent
 range
References: <1432983566-15773-1-git-send-email-guangrong.xiao@linux.intel.com> <1432983566-15773-15-git-send-email-guangrong.xiao@linux.intel.com> <556C27A5.1040908@redhat.com> <556E6CF8.9070602@linux.intel.com> <556EB30F.8030100@redhat.com> <55700B0D.8080808@linux.intel.com> <55700E1F.9090803@redhat.com>
In-Reply-To: <55700E1F.9090803@redhat.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2665
Lines: 63


[ CCed Zhang Yang ]

On 06/04/2015 04:36 PM, Paolo Bonzini wrote:
>
>
> On 04/06/2015 10:23, Xiao Guangrong wrote:
>>>
>>> So, why do you need to always use IPAT=0?  Can patch 15 keep the current
>>> logic for RAM, like this:
>>>
>>>      if (is_mmio || kvm_arch_has_noncoherent_dma(vcpu->kvm))
>>>          ret = kvm_mtrr_get_guest_memory_type(vcpu, gfn) <<
>>>                VMX_EPT_MT_EPTE_SHIFT;
>>>      else
>>>          ret = (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT)
>>>              | VMX_EPT_IPAT_BIT;
>>
>> Yeah, it's okay, actually we considered this way, however
>> - it's light enough, it did not hurt guest performance based on our
>>    benchmark.
>> - the logic has always used for noncherent_dma case, extend it to
>>    normal case should have low risk and also help us to check the logic.
>
> But noncoherent_dma is not the common case, so it's not necessarily true
> that the risk is low.

I thought noncoherent_dma exists on 1st generation(s) IOMMU, it should
be fully tested at that time.

>
>> - completely follow MTRRS spec would be better than host hides it.
>
> We are a virtualization platform, we know well when MTRRs are necessary.
>
> Tis a risk from blindly obeying the guest MTRRs: userspace can see stale
> data if the guest's accesses bypass the cache.  AMD bypasses this by
> enabling snooping even in cases that ordinarily wouldn't snoop; for
> Intel the solution is that RAM-backed areas should always use IPAT.

Not sure if UC and other cacheable type combinations on guest and host
will cause problem. The SMD mentioned that snoop is not required only when
"The UC attribute comes from the MTRRs and the processors are not required
  to snoop their caches since the data could never have been cached."
(Vol 3. 11.5.2.2)
VMX do not touch hardware MTRR MSRs and i guess snoop works under this case.

I also noticed if SS (self-snooping) is supported we need not to invalidate
cache when programming memory type (Vol 3. 11.11.8), so that means CPU works
well on the page which has different cache types i guess.

After think it carefully, we (Zhang Yang) doubt if always set WB for DMA
memory is really a good idea because we can not assume WB DMA works well for
all devices. One example is that audio DMA (not a MMIO region) is required WC
to improve its performance.

However, we think the SDM is not clear enough so let's do full vMTRR on MMIO
and noncoherent_dma first.　:)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/