I just tracked down what was cutting performance 10x on one of my
systems on a microbenchmark I'd just written:
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -540,7 +540,7 @@ int drm_gem_mmap(struct file *filp, struct
vm_area_struct *vma)
/* FIXME: use pgprot_writecombine when available */
prot = pgprot_val(vma->vm_page_prot);
#ifdef CONFIG_X86
- prot |= _PAGE_CACHE_WC;
+ /*prot |= _PAGE_CACHE_WC;*/
#endif
vma->vm_page_prot = __pgprot(prot);
Turns out that setting PAGE_CACHE_WC disables the WC effect of the MTRR
on my non-PAT (disabled due to CPU errata) 945GM system, and this
workaround took GTT-mapped writes from 120MB/s to 1180MB/s.
What's the right way to be setting our PTEs? This is similar to the
workaround in ef5fa0ab24b87646c7bc98645acbb4b51fc2acd4, but to do it in
the driver as well means exporting pat_enabled, and it really seems like
PAT presence shouldn't be something the driver has to worry about --
I've got a WC MTRR and I'm asking for a WC mapping and I'm getting
uncached. If I failed to init the MTRR, the state of the aperture I'm
mapping would be uncached.
Test code is at
git://anongit.freedesktop.org/git/xorg/app/intel-gpu-tools
under benchmarks/intel_upload_blit_large_gtt but requires libdrm master
until we get a release out in the next week or so. Also includes a few
standalone regression tests, which may be useful for people making
changes that may affect i915.
--
Eric Anholt
[email protected] [email protected]
On Tue, 31 Mar 2009 17:10:47 -0700
Eric Anholt <[email protected]> wrote:
> I just tracked down what was cutting performance 10x on one of my
> systems on a microbenchmark I'd just written:
>
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -540,7 +540,7 @@ int drm_gem_mmap(struct file *filp, struct
> vm_area_struct *vma)
> /* FIXME: use pgprot_writecombine when available */
> prot = pgprot_val(vma->vm_page_prot);
> #ifdef CONFIG_X86
> - prot |= _PAGE_CACHE_WC;
> + /*prot |= _PAGE_CACHE_WC;*/
> #endif
> vma->vm_page_prot = __pgprot(prot);
>
> Turns out that setting PAGE_CACHE_WC disables the WC effect of the
> MTRR on my non-PAT (disabled due to CPU errata) 945GM system, and this
> workaround took GTT-mapped writes from 120MB/s to 1180MB/s.
What the... There's a pgprot_writecombine now, but it basically does
the same thing. Why is WC so broken? Venki is the fix for this
covered in your last patchset?
--
Jesse Barnes, Intel Open Source Technology Center
>-----Original Message-----
>From: Jesse Barnes [mailto:[email protected]]
>Sent: Tuesday, March 31, 2009 5:15 PM
>To: Eric Anholt
>Cc: lkml; Pallipadi, Venkatesh
>Subject: Re: PAGE_CACHE_WC strikes again
>
>On Tue, 31 Mar 2009 17:10:47 -0700
>Eric Anholt <[email protected]> wrote:
>
>> I just tracked down what was cutting performance 10x on one of my
>> systems on a microbenchmark I'd just written:
>>
>> --- a/drivers/gpu/drm/drm_gem.c
>> +++ b/drivers/gpu/drm/drm_gem.c
>> @@ -540,7 +540,7 @@ int drm_gem_mmap(struct file *filp, struct
>> vm_area_struct *vma)
>> /* FIXME: use pgprot_writecombine when available */
>> prot = pgprot_val(vma->vm_page_prot);
>> #ifdef CONFIG_X86
>> - prot |= _PAGE_CACHE_WC;
>> + /*prot |= _PAGE_CACHE_WC;*/
>> #endif
>> vma->vm_page_prot = __pgprot(prot);
>>
>> Turns out that setting PAGE_CACHE_WC disables the WC effect of the
>> MTRR on my non-PAT (disabled due to CPU errata) 945GM
>system, and this
>> workaround took GTT-mapped writes from 120MB/s to 1180MB/s.
>
>What the... There's a pgprot_writecombine now, but it basically does
>the same thing. Why is WC so broken? Venki is the fix for this
>covered in your last patchset?
>
The key point here is
> setting PAGE_CACHE_WC disables the WC effect of the
> MTRR on my non-PAT (disabled due to CPU errata)
When PAT is disabled, the default setting in PAT MSR is
00 - WB
01 - WT
10 - UC_MINUS
11 - UC
There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
(which is 01) the driver is basically selecting write-through!
And when MTRR says WC and PAT says WT, effective type is UC.
Basically, no one should be hard-coding the memory type. Please use
pgprot_writecombine() which does the right thing by using WC
(when PAT is enabled) or UC_MINUS (when PAT is disabled).
Thanks,
Venki-
On Tue, 31 Mar 2009 17:29:02 -0700
"Pallipadi, Venkatesh" <[email protected]> wrote:
>
>
> >-----Original Message-----
> >From: Jesse Barnes [mailto:[email protected]]
> >Sent: Tuesday, March 31, 2009 5:15 PM
> >To: Eric Anholt
> >Cc: lkml; Pallipadi, Venkatesh
> >Subject: Re: PAGE_CACHE_WC strikes again
> >
> >On Tue, 31 Mar 2009 17:10:47 -0700
> >Eric Anholt <[email protected]> wrote:
> >
> >> I just tracked down what was cutting performance 10x on one of my
> >> systems on a microbenchmark I'd just written:
> >>
> >> --- a/drivers/gpu/drm/drm_gem.c
> >> +++ b/drivers/gpu/drm/drm_gem.c
> >> @@ -540,7 +540,7 @@ int drm_gem_mmap(struct file *filp, struct
> >> vm_area_struct *vma)
> >> /* FIXME: use pgprot_writecombine when available */
> >> prot = pgprot_val(vma->vm_page_prot);
> >> #ifdef CONFIG_X86
> >> - prot |= _PAGE_CACHE_WC;
> >> + /*prot |= _PAGE_CACHE_WC;*/
> >> #endif
> >> vma->vm_page_prot = __pgprot(prot);
> >>
> >> Turns out that setting PAGE_CACHE_WC disables the WC effect of the
> >> MTRR on my non-PAT (disabled due to CPU errata) 945GM
> >system, and this
> >> workaround took GTT-mapped writes from 120MB/s to 1180MB/s.
> >
> >What the... There's a pgprot_writecombine now, but it basically does
> >the same thing. Why is WC so broken? Venki is the fix for this
> >covered in your last patchset?
> >
>
> The key point here is
>
> > setting PAGE_CACHE_WC disables the WC effect of the
> > MTRR on my non-PAT (disabled due to CPU errata)
>
> When PAT is disabled, the default setting in PAT MSR is
> 00 - WB
> 01 - WT
> 10 - UC_MINUS
> 11 - UC
>
> There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
> (which is 01) the driver is basically selecting write-through!
>
> And when MTRR says WC and PAT says WT, effective type is UC.
>
> Basically, no one should be hard-coding the memory type. Please use
> pgprot_writecombine() which does the right thing by using WC
> (when PAT is enabled) or UC_MINUS (when PAT is disabled).
Ah ok, that makes sense. I'm glad we have pgprot_writecombine now
(sorry Eric I should have sent that patch along with my patch
containing the vm_insert_pfn -EINVAL handling).
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 2009-03-31 at 17:29 -0700, Pallipadi, Venkatesh wrote:
> The key point here is
>
> > setting PAGE_CACHE_WC disables the WC effect of the
> > MTRR on my non-PAT (disabled due to CPU errata)
>
> When PAT is disabled, the default setting in PAT MSR is
> 00 - WB
> 01 - WT
> 10 - UC_MINUS
> 11 - UC
>
> There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
> (which is 01) the driver is basically selecting write-through!
>
> And when MTRR says WC and PAT says WT, effective type is UC.
>
> Basically, no one should be hard-coding the memory type. Please use
> pgprot_writecombine() which does the right thing by using WC
> (when PAT is enabled) or UC_MINUS (when PAT is disabled).
And the driver should use right API to track the underlying page frame
thats getting mapped by this vma, with the corresponding attribute.
API's like remap_pfn_range(), vm_insert_pfn() will setup the PTE's
aswell as track the pfn's attributes.
API's like set_memory_uc/wc() will explicitly setup the page attributes.
Jesse, As far as I see, the drm GEM fault handler routines don't seem to
do any of this. Am I missing something? We need to fix this so that we
can avoid potential aliasing issues.
thanks,
suresh
On Tue, 31 Mar 2009 17:03:10 -0800
Suresh Siddha <[email protected]> wrote:
> On Tue, 2009-03-31 at 17:29 -0700, Pallipadi, Venkatesh wrote:
> > The key point here is
> >
> > > setting PAGE_CACHE_WC disables the WC effect of the
> > > MTRR on my non-PAT (disabled due to CPU errata)
> >
> > When PAT is disabled, the default setting in PAT MSR is
> > 00 - WB
> > 01 - WT
> > 10 - UC_MINUS
> > 11 - UC
> >
> > There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
> > (which is 01) the driver is basically selecting write-through!
> >
> > And when MTRR says WC and PAT says WT, effective type is UC.
> >
> > Basically, no one should be hard-coding the memory type. Please use
> > pgprot_writecombine() which does the right thing by using WC
> > (when PAT is enabled) or UC_MINUS (when PAT is disabled).
>
> And the driver should use right API to track the underlying page frame
> thats getting mapped by this vma, with the corresponding attribute.
> API's like remap_pfn_range(), vm_insert_pfn() will setup the PTE's
> aswell as track the pfn's attributes.
>
> API's like set_memory_uc/wc() will explicitly setup the page
> attributes.
>
> Jesse, As far as I see, the drm GEM fault handler routines don't seem
> to do any of this. Am I missing something? We need to fix this so
> that we can avoid potential aliasing issues.
Right, the drm driver code went in before we had pgprot_writecombine.
Now that it's available we should definitely use it. I'm not sure
about the set_memory_* routines though; we create io mappings in
i915_dma.c at init time, and I thought we took care of things in
i915_gem.c but we may need updates there.
--
Jesse Barnes, Intel Open Source Technology Center
On Tue, 31 Mar 2009 17:03:10 -0800
Suresh Siddha <[email protected]> wrote:
> On Tue, 2009-03-31 at 17:29 -0700, Pallipadi, Venkatesh wrote:
> > The key point here is
> >
> > > setting PAGE_CACHE_WC disables the WC effect of the
> > > MTRR on my non-PAT (disabled due to CPU errata)
> >
> > When PAT is disabled, the default setting in PAT MSR is
> > 00 - WB
> > 01 - WT
> > 10 - UC_MINUS
> > 11 - UC
> >
> > There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
> > (which is 01) the driver is basically selecting write-through!
> >
> > And when MTRR says WC and PAT says WT, effective type is UC.
> >
> > Basically, no one should be hard-coding the memory type. Please use
> > pgprot_writecombine() which does the right thing by using WC
> > (when PAT is enabled) or UC_MINUS (when PAT is disabled).
>
> And the driver should use right API to track the underlying page frame
> thats getting mapped by this vma, with the corresponding attribute.
> API's like remap_pfn_range(), vm_insert_pfn() will setup the PTE's
> aswell as track the pfn's attributes.
>
> API's like set_memory_uc/wc() will explicitly setup the page
> attributes.
>
> Jesse, As far as I see, the drm GEM fault handler routines don't seem
> to do any of this. Am I missing something? We need to fix this so
> that we can avoid potential aliasing issues.
Ok so we use set_memory_wc/wb in the physical mapping stuff, which I
think is fine. We also use io_mapping_create_wc on the whole aperture
at load time. That will take care of any pages we map there right? So
along with this fix I think we'll be ok?
--
Jesse Barnes, Intel Open Source Technology Center
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index c1173d8..fa828a8 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -538,11 +538,7 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
vma->vm_ops = obj->dev->driver->gem_vm_ops;
vma->vm_private_data = map->handle;
/* FIXME: use pgprot_writecombine when available */
- prot = pgprot_val(vma->vm_page_prot);
-#ifdef CONFIG_X86
- prot |= _PAGE_CACHE_WC;
-#endif
- vma->vm_page_prot = __pgprot(prot);
+ vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
/* Take a ref for this mapping of the object, so that the fault
* handler can dereference the mmap offset's pointer to the object.
On Tue, 2009-03-31 at 18:12 -0700, Jesse Barnes wrote:
> Right, the drm driver code went in before we had pgprot_writecombine.
> Now that it's available we should definitely use it. I'm not sure
> about the set_memory_* routines though; we create io mappings in
> i915_dma.c at init time, and I thought we took care of things in
> i915_gem.c but we may need updates there.
Jesse, yes i915_gem.c seems to be doing the right thing.
What about various fault handlers in drm_vm.c like drm_do_vm_shm_fault()
etc. None of these fault handlers require special attributes like wc/uc
etc?
thanks,
suresh
On Tue, 31 Mar 2009 17:30:24 -0800
Suresh Siddha <[email protected]> wrote:
> On Tue, 2009-03-31 at 18:12 -0700, Jesse Barnes wrote:
> > Right, the drm driver code went in before we had
> > pgprot_writecombine. Now that it's available we should definitely
> > use it. I'm not sure about the set_memory_* routines though; we
> > create io mappings in i915_dma.c at init time, and I thought we
> > took care of things in i915_gem.c but we may need updates there.
>
> Jesse, yes i915_gem.c seems to be doing the right thing.
>
> What about various fault handlers in drm_vm.c like
> drm_do_vm_shm_fault() etc. None of these fault handlers require
> special attributes like wc/uc etc?
Those are legacy. I don't think they need anything changed... Dave
could correct me though.
--
Jesse Barnes, Intel Open Source Technology Center
>-----Original Message-----
>From: Jesse Barnes [mailto:[email protected]]
>Sent: Tuesday, March 31, 2009 6:13 PM
>To: Siddha, Suresh B
>Cc: Pallipadi, Venkatesh; Eric Anholt; lkml
>Subject: Re: PAGE_CACHE_WC strikes again
>
>On Tue, 31 Mar 2009 17:03:10 -0800
>Suresh Siddha <[email protected]> wrote:
>
>> On Tue, 2009-03-31 at 17:29 -0700, Pallipadi, Venkatesh wrote:
>> > The key point here is
>> >
>> > > setting PAGE_CACHE_WC disables the WC effect of the
>> > > MTRR on my non-PAT (disabled due to CPU errata)
>> >
>> > When PAT is disabled, the default setting in PAT MSR is
>> > 00 - WB
>> > 01 - WT
>> > 10 - UC_MINUS
>> > 11 - UC
>> >
>> > There is no way to set WC with PAT. By hardcoding _PAGE_CACHE_WC
>> > (which is 01) the driver is basically selecting write-through!
>> >
>> > And when MTRR says WC and PAT says WT, effective type is UC.
>> >
>> > Basically, no one should be hard-coding the memory type. Please use
>> > pgprot_writecombine() which does the right thing by using WC
>> > (when PAT is enabled) or UC_MINUS (when PAT is disabled).
>>
>> And the driver should use right API to track the underlying
>page frame
>> thats getting mapped by this vma, with the corresponding attribute.
>> API's like remap_pfn_range(), vm_insert_pfn() will setup the PTE's
>> aswell as track the pfn's attributes.
>>
>> API's like set_memory_uc/wc() will explicitly setup the page
>> attributes.
>>
>> Jesse, As far as I see, the drm GEM fault handler routines don't seem
>> to do any of this. Am I missing something? We need to fix this so
>> that we can avoid potential aliasing issues.
>
>Right, the drm driver code went in before we had pgprot_writecombine.
>Now that it's available we should definitely use it. I'm not sure
>about the set_memory_* routines though; we create io mappings in
>i915_dma.c at init time, and I thought we took care of things in
>i915_gem.c but we may need updates there.
>
There is one problem of too many single page mappings, most likely
coming from gem vm_insert_pfn. As in this mail
http://marc.info/?l=linux-kernel&m=123845244713183&w=2
The actual 'freeing invalid memtype' is a bug that I still trying
to zero in on. Regardless of that, having so many small WC
mappings in IO region will result it pretty slow vm_insert_pfn.
I think we should switch to io_mapping_create_wc kind of API
where we do bulk reserve and have a vm_insert_pfn_quick().
Thanks,
Venki-
On Wed, Apr 1, 2009 at 11:57 AM, Jesse Barnes <[email protected]> wrote:
> On Tue, 31 Mar 2009 17:30:24 -0800
> Suresh Siddha <[email protected]> wrote:
>
>> On Tue, 2009-03-31 at 18:12 -0700, Jesse Barnes wrote:
>> > Right, the drm driver code went in before we had
>> > pgprot_writecombine. Now that it's available we should definitely
>> > use it. ?I'm not sure about the set_memory_* routines though; we
>> > create io mappings in i915_dma.c at init time, and I thought we
>> > took care of things in i915_gem.c but we may need updates there.
>>
>> Jesse, yes i915_gem.c seems to be doing the right thing.
>>
>> What about various fault handlers in drm_vm.c like
>> drm_do_vm_shm_fault() etc. None of these fault handlers require
>> special attributes like wc/uc etc?
>
> Those are legacy. ?I don't think they need anything changed... ?Dave
> could correct me though.
Yeah just don't look at the top of drm_vm.c:drm_io_prot.
the attribute handling in there is one of the bits of the drm I hide away from.
and I think we can just move away from that with memory managed
drivers doing it properly.
Dave.
>
> --
> Jesse Barnes, Intel Open Source Technology Center
>