2008-08-04 06:43:32

by Shaohua Li

[permalink] [raw]
Subject: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation

To reduce tlb/cache flush, makes agp memory allocation do one flush
after all pages in a region are changed to uc.

All agp drivers except agp-sgi uses agp_generic_alloc_page()
for .agp_alloc_page, so the patch should work for them. agp-sgi is only
for ia64, so not a problem too.

Signed-off-by: Shaohua Li <[email protected]>
---
drivers/char/agp/agp.h | 4 ++++
drivers/char/agp/generic.c | 4 +++-
include/asm-x86/agp.h | 3 +++
3 files changed, 10 insertions(+), 1 deletion(-)

Index: linux/drivers/char/agp/generic.c
===================================================================
--- linux.orig/drivers/char/agp/generic.c 2008-08-04 12:03:37.000000000 +0800
+++ linux/drivers/char/agp/generic.c 2008-08-04 12:06:31.000000000 +0800
@@ -274,6 +274,7 @@ struct agp_memory *agp_allocate_memory(s
new->memory[i] = virt_to_gart(addr);
new->page_count++;
}
+ map_page_into_agp_global_flush();
new->bridge = bridge;

return new;
@@ -1186,7 +1187,8 @@ void *agp_generic_alloc_page(struct agp_
if (page == NULL)
return NULL;

- map_page_into_agp(page);
+ /* agp_allocate_memory will do flush */
+ map_page_into_agp_noflush(page);

get_page(page);
atomic_inc(&agp_bridge->current_memory_agp);
Index: linux/include/asm-x86/agp.h
===================================================================
--- linux.orig/include/asm-x86/agp.h 2008-08-04 12:03:37.000000000 +0800
+++ linux/include/asm-x86/agp.h 2008-08-04 12:06:31.000000000 +0800
@@ -15,6 +15,9 @@
#define map_page_into_agp(page) set_pages_uc(page, 1)
#define unmap_page_from_agp(page) set_pages_wb(page, 1)

+#define map_page_into_agp_noflush(page) set_pages_uc_noflush(page, 1)
+#define map_page_into_agp_global_flush() set_memory_flush_all()
+
/*
* Could use CLFLUSH here if the cpu supports it. But then it would
* need to be called for each cacheline of the whole page so it may
Index: linux/drivers/char/agp/agp.h
===================================================================
--- linux.orig/drivers/char/agp/agp.h 2008-08-04 12:03:37.000000000 +0800
+++ linux/drivers/char/agp/agp.h 2008-08-04 12:06:31.000000000 +0800
@@ -30,6 +30,10 @@
#define _AGP_BACKEND_PRIV_H 1

#include <asm/agp.h> /* for flush_agp_cache() */
+#ifndef map_page_into_agp_noflush
+#define map_page_into_agp_noflush(page) map_page_into_agp(page)
+#define map_page_into_agp_global_flush()
+#endif

#define PFX "agpgart: "



2008-08-15 14:32:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation


* Shaohua Li <[email protected]> wrote:

> To reduce tlb/cache flush, makes agp memory allocation do one flush
> after all pages in a region are changed to uc.
>
> All agp drivers except agp-sgi uses agp_generic_alloc_page() for
> .agp_alloc_page, so the patch should work for them. agp-sgi is only
> for ia64, so not a problem too.

applied to tip/x86/pat - thanks!

I've Cc:-ed more PAT folks - any objections?

Ingo

2008-08-15 14:40:50

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation

On Fri, 15 Aug 2008 16:31:31 +0200
Ingo Molnar <[email protected]> wrote:

>
> * Shaohua Li <[email protected]> wrote:
>
> > To reduce tlb/cache flush, makes agp memory allocation do one flush
> > after all pages in a region are changed to uc.
> >
> > All agp drivers except agp-sgi uses agp_generic_alloc_page() for
> > .agp_alloc_page, so the patch should work for them. agp-sgi is only
> > for ia64, so not a problem too.
>
> applied to tip/x86/pat - thanks!
>
> I've Cc:-ed more PAT folks - any objections?
>

it really needs something else instead; it needs airlied's array
allocator
otherwise you hit the second wall as well (the pat checks per page)

in reality an array version of ioremap (or set_pages_X) is what is
needed

2008-08-15 14:43:53

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation


* Arjan van de Ven <[email protected]> wrote:

> On Fri, 15 Aug 2008 16:31:31 +0200
> Ingo Molnar <[email protected]> wrote:
>
> >
> > * Shaohua Li <[email protected]> wrote:
> >
> > > To reduce tlb/cache flush, makes agp memory allocation do one flush
> > > after all pages in a region are changed to uc.
> > >
> > > All agp drivers except agp-sgi uses agp_generic_alloc_page() for
> > > .agp_alloc_page, so the patch should work for them. agp-sgi is only
> > > for ia64, so not a problem too.
> >
> > applied to tip/x86/pat - thanks!
> >
> > I've Cc:-ed more PAT folks - any objections?
> >
>
> it really needs something else instead; it needs airlied's array
> allocator otherwise you hit the second wall as well (the pat checks
> per page)
>
> in reality an array version of ioremap (or set_pages_X) is what is
> needed

ok, agreed.

Ingo

2008-08-18 01:21:25

by Shaohua Li

[permalink] [raw]
Subject: RE: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation



>-----Original Message-----
>From: Arjan van de Ven [mailto:[email protected]]
>Sent: Friday, August 15, 2008 10:41 PM
>To: Ingo Molnar
>Cc: Li, Shaohua; lkml; [email protected]; Andrew Morton; Ingo Molnar; Siddha,
>Suresh B; Pallipadi, Venkatesh; Thomas Gleixner; H. Peter Anvin
>Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory
>allocation
>
>On Fri, 15 Aug 2008 16:31:31 +0200
>Ingo Molnar <[email protected]> wrote:
>
>>
>> * Shaohua Li <[email protected]> wrote:
>>
>> > To reduce tlb/cache flush, makes agp memory allocation do one flush
>> > after all pages in a region are changed to uc.
>> >
>> > All agp drivers except agp-sgi uses agp_generic_alloc_page() for
>> > .agp_alloc_page, so the patch should work for them. agp-sgi is only
>> > for ia64, so not a problem too.
>>
>> applied to tip/x86/pat - thanks!
>>
>> I've Cc:-ed more PAT folks - any objections?
>>
>
>it really needs something else instead; it needs airlied's array
>allocator
>otherwise you hit the second wall as well (the pat checks per page)
Somebody should have a measurement. In my test, the real bottleneck is the cache flush. It appears flush cache page is slow if there are a lot of pages, In my patch, I use a wbinvd. This can be optimized to do wbinvd with a threshold. Maybe airlied can change his patch with this way.

Thanks,
Shaohua

2008-08-18 03:57:16

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation

On Mon, 18 Aug 2008 09:21:12 +0800
"Li, Shaohua" <[email protected]> wrote:

>
> >
> >it really needs something else instead; it needs airlied's array
> >allocator
> >otherwise you hit the second wall as well (the pat checks per page)
> Somebody should have a measurement. In my test, the real bottleneck
> is the cache flush. It appears flush cache page is slow if there are
> a lot of pages, In my patch, I use a wbinvd. This can be optimized to
> do wbinvd with a threshold. Maybe airlied can change his patch with
> this way.


it would be great if you had time to update his patch and this to
it...

and the logic probably should be "if there's more than X pags in the
the array, just use wbinvd".
Although wbinvd is very painful if you have 12Mb of cache and you wipe
it for all cores in the system ;-(



--
If you want to reach me at my work email, use [email protected]
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2008-08-18 06:19:53

by Shaohua Li

[permalink] [raw]
Subject: RE: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation



>-----Original Message-----
>From: Arjan van de Ven [mailto:[email protected]]
>Sent: Monday, August 18, 2008 11:57 AM
>To: Li, Shaohua
>Cc: Ingo Molnar; lkml; [email protected]; Andrew Morton; Ingo Molnar; Siddha,
>Suresh B; Pallipadi, Venkatesh; Thomas Gleixner; H. Peter Anvin
>Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory
>allocation
>
>On Mon, 18 Aug 2008 09:21:12 +0800
>"Li, Shaohua" <[email protected]> wrote:
>
>>
>> >
>> >it really needs something else instead; it needs airlied's array
>> >allocator
>> >otherwise you hit the second wall as well (the pat checks per page)
>> Somebody should have a measurement. In my test, the real bottleneck
>> is the cache flush. It appears flush cache page is slow if there are
>> a lot of pages, In my patch, I use a wbinvd. This can be optimized to
>> do wbinvd with a threshold. Maybe airlied can change his patch with
>> this way.
>
>
>it would be great if you had time to update his patch and this to
>it...
I'll do it soon.

>and the logic probably should be "if there's more than X pags in the
>the array, just use wbinvd".
>Although wbinvd is very painful if you have 12Mb of cache and you wipe
>it for all cores in the system ;-(
This might not be that bad, changing attribute isn't frequently used.

Thanks,
Shaohua

2008-08-18 08:04:48

by Ingo Molnar

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation


* Li, Shaohua <[email protected]> wrote:

> > it would be great if you had time to update his patch and this to
> > it...
>
> I'll do it soon.

great! Please do it as a delta patch against tip/master:

http://people.redhat.com/mingo/tip.git/README

as your first two patches are being tested already:

466ae83: reduce tlb/cache flush times of agpgart memory allocation
1ac2f7d: introduce two APIs for page attribute

[ but it can all only go upstream once the observations from Arjan have
been addressed. ]

> > and the logic probably should be "if there's more than X pags in the
> > the array, just use wbinvd". Although wbinvd is very painful if you
> > have 12Mb of cache and you wipe it for all cores in the system ;-(
>
> This might not be that bad, changing attribute isn't frequently used.

dont some X/DRM drivers do it for a wide range of GX ops?

Ingo

2008-08-18 17:26:50

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [patch 2/2] reduce tlb/cache flush times of agpgart memory allocation

Ingo Molnar wrote:
> * Li, Shaohua <[email protected]> wrote:
>
>>> it would be great if you had time to update his patch and this to
>>> it...
>> I'll do it soon.
>
> great! Please do it as a delta patch against tip/master:
>
> http://people.redhat.com/mingo/tip.git/README
>
> as your first two patches are being tested already:
>
> 466ae83: reduce tlb/cache flush times of agpgart memory allocation
> 1ac2f7d: introduce two APIs for page attribute
>
> [ but it can all only go upstream once the observations from Arjan have
> been addressed. ]
>
>>> and the logic probably should be "if there's more than X pags in the
>>> the array, just use wbinvd". Although wbinvd is very painful if you
>>> have 12Mb of cache and you wipe it for all cores in the system ;-(
>> This might not be that bad, changing attribute isn't frequently used.
>
> dont some X/DRM drivers do it for a wide range of GX ops?
>

I think so... at least it's likely to become more common.
Realistically, it means WBINVD, being uninterruptible, is probably
unsafe even for very large amounts.

-hpa