2006-02-22 00:20:59

by Kamezawa Hiroyuki

Subject: [PATCH] remove zone_mem_map

This patch removes zone_mem_map from struct zone.
With this change, the generic page_to_pfn and pfn_to_page can use the same logic.
This modifies the page_to_pfn implementation. Could anyone run a performance test on NUMA?

(ia64 is not affected by this because it doesn't use generic page_to_pfn.)

-- Kame


This patch removes zone_mem_map.

Currently pfn_to_page uses pgdat, while page_to_pfn uses zone.
page_to_pfn, which is the only user of zone_mem_map, can use pgdat
instead of zone. By modifying it, we can remove zone_mem_map.


Signed-Off-By: KAMEZAWA Hiroyuki <[email protected]>

Index: test/include/asm-generic/memory_model.h
===================================================================
--- test.orig/include/asm-generic/memory_model.h
+++ test/include/asm-generic/memory_model.h
@@ -47,9 +47,9 @@ extern unsigned long page_to_pfn(struct

#define page_to_pfn(pg) \
({ struct page *__pg = (pg); \
- struct zone *__zone = page_zone(__pg); \
- (unsigned long)(__pg - __zone->zone_mem_map) + \
- __zone->zone_start_pfn; \
+ struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg)); \
+ (unsigned long)(__pg - __pgdat->node_mem_map) + \
+ __pgdat->node_start_pfn; \
})

#elif defined(CONFIG_SPARSEMEM)
Index: test/include/linux/mmzone.h
===================================================================
--- test.orig/include/linux/mmzone.h
+++ test/include/linux/mmzone.h
@@ -225,7 +225,6 @@ struct zone {
* Discontig memory support fields.
*/
struct pglist_data *zone_pgdat;
- struct page *zone_mem_map;
/* zone_start_pfn == zone_start_paddr >> PAGE_SHIFT */
unsigned long zone_start_pfn;

Index: test/mm/page_alloc.c
===================================================================
--- test.orig/mm/page_alloc.c
+++ test/mm/page_alloc.c
@@ -2117,7 +2117,6 @@ static __meminit void init_currently_emp
zone_wait_table_init(zone, size);
pgdat->nr_zones = zone_idx(zone) + 1;

- zone->zone_mem_map = pfn_to_page(zone_start_pfn);
zone->zone_start_pfn = zone_start_pfn;

memmap_init(size, pgdat->node_id, zone_idx(zone), zone_start_pfn);
@@ -2844,8 +2843,8 @@ struct page *pfn_to_page(unsigned long p
}
unsigned long page_to_pfn(struct page *page)
{
- struct zone *zone = page_zone(page);
- return (page - zone->zone_mem_map) + zone->zone_start_pfn;
+ struct pglist_data *pgdat = NODE_DATA(page_to_nid(page));
+ return (page - pgdat->node_mem_map) + pgdat->node_start_pfn;

}
#elif defined(CONFIG_SPARSEMEM)
Index: test/include/asm-alpha/mmzone.h
===================================================================
--- test.orig/include/asm-alpha/mmzone.h
+++ test/include/asm-alpha/mmzone.h
@@ -83,8 +83,7 @@ PLAT_NODE_DATA_LOCALNR(unsigned long p,
pte_t pte; \
unsigned long pfn; \
\
- pfn = ((unsigned long)((page)-page_zone(page)->zone_mem_map)) << 32; \
- pfn += page_zone(page)->zone_start_pfn << 32; \
+ pfn = page_to_pfn(page) << 32; \
pte_val(pte) = pfn | pgprot_val(pgprot); \
\
pte; \
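
To illustrate why the node-based form can replace the zone-based one, here is a minimal userspace sketch, not kernel code. The structures are simplified stand-ins, and the helper names pfn_via_zone, pfn_via_node and schemes_agree are hypothetical. It assumes, as in the removed initialisation line, that zone_mem_map points at the struct page for zone_start_pfn inside the node's mem_map.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct page { int dummy; };

struct pglist_data {
	struct page *node_mem_map;     /* first struct page of the node */
	unsigned long node_start_pfn;  /* pfn of that first page */
};

struct zone {
	struct page *zone_mem_map;     /* the field the patch removes */
	unsigned long zone_start_pfn;
};

/* Old scheme: offset within the zone plus the zone's starting pfn. */
static unsigned long pfn_via_zone(const struct page *pg, const struct zone *z)
{
	return (unsigned long)(pg - z->zone_mem_map) + z->zone_start_pfn;
}

/* New scheme: offset within the node plus the node's starting pfn. */
static unsigned long pfn_via_node(const struct page *pg,
				  const struct pglist_data *n)
{
	return (unsigned long)(pg - n->node_mem_map) + n->node_start_pfn;
}

/* Returns 1 if both schemes agree for every page of a toy zone. */
static int schemes_agree(void)
{
	static struct page mem_map[64];              /* one node, 64 pages */
	struct pglist_data node = { mem_map, 1000 };
	/* A zone that starts 16 pages into the node. */
	struct zone zone = { &mem_map[16], 1016 };
	size_t i;

	for (i = 16; i < 64; i++)
		if (pfn_via_zone(&mem_map[i], &zone) !=
		    pfn_via_node(&mem_map[i], &node))
			return 0;
	return 1;
}
```

Both forms are the same pointer arithmetic with a different base, so the result only differs if a node's mem_map is not contiguous across its zones.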


2006-02-22 02:35:00

by Andrew Morton

Subject: Re: [PATCH] remove zone_mem_map

KAMEZAWA Hiroyuki <[email protected]> wrote:
>
> This patch removes zone_mem_map from struct zone.
> With this change, the generic page_to_pfn and pfn_to_page can use the same logic.

I assume this is dependent upon unify-pfn_to_page-*.patch?

> This modifies the page_to_pfn implementation. Could anyone run a performance test on NUMA?

Do you expect there to be NUMA performance problems? If so, how do they
arise and what sort of tests should be run?

2006-02-22 02:51:32

by Kamezawa Hiroyuki

Subject: Re: [PATCH] remove zone_mem_map

Andrew Morton wrote:
> KAMEZAWA Hiroyuki <[email protected]> wrote:
>> This patch removes zone_mem_map from struct zone.
>> With this change, the generic page_to_pfn and pfn_to_page can use the same logic.
>
> I assume this is dependent upon unify-pfn_to_page-*.patch?
>
Yes. Sorry for forgetting to mention it.

>> This modifies the page_to_pfn implementation. Could anyone run a performance test on NUMA?
>
> Do you expect there to be NUMA performance problems? If so, how do they
> arise and what sort of tests should be run?
>
I don't expect any. But when I posted this before (as an RFC), some people
(Martin J. Bligh and Dave Hansen) had concerns about it.

I think the heaviest users of page_to_pfn() are the page allocator and
mk_pte(page_to_pfn(page), hogehoge).

So a test like "mmap -> touch all pages -> unmap" would be a good test.

powerpc and ia64 are not good test environments, because they don't use
the generic DISCONTIGMEM definition of page_to_pfn().

Other NUMA architectures (i386, x86_64, etc.) would be good.

Thanks,
-- Kame
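
The "mmap -> touch all -> unmap" test mentioned above could look roughly like the following userspace sketch. It is only an assumption about what such a test would do: the function name map_touch_unmap is hypothetical, the mapping is anonymous and private, and timing (for example with clock_gettime() around repeated calls) is left to the caller.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `bytes` of anonymous memory, write one byte to every page so that
 * each page is faulted in, then unmap it again.
 * Returns the number of pages touched, or -1 on failure.
 */
static long map_touch_unmap(size_t bytes)
{
	long page_size = sysconf(_SC_PAGESIZE);
	volatile char *p;
	size_t off;
	long touched = 0;
	void *addr;

	if (page_size <= 0 || bytes == 0)
		return -1;

	addr = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED)
		return -1;

	p = addr;
	for (off = 0; off < bytes; off += (size_t)page_size) {
		p[off] = 1;   /* the first write faults the page in */
		touched++;
	}

	/* One touch per page, rounding a partial last page up. */
	assert(touched == (long)((bytes + (size_t)page_size - 1) /
				 (size_t)page_size));

	if (munmap(addr, bytes) != 0)
		return -1;
	return touched;
}
```

Calling map_touch_unmap() in a loop with a large size, on kernels with and without the patch, exercises the fault and unmap paths in which page-to-pfn conversions occur.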

2006-02-22 03:04:42

by Christoph Lameter

Subject: Re: [PATCH] remove zone_mem_map

On Wed, 22 Feb 2006, KAMEZAWA Hiroyuki wrote:

> This patch removes zone_mem_map.

Note that IA64 does not seem to depend on zone_mem_map...

> Index: test/include/asm-generic/memory_model.h
> ===================================================================
> --- test.orig/include/asm-generic/memory_model.h
> +++ test/include/asm-generic/memory_model.h
> @@ -47,9 +47,9 @@ extern unsigned long page_to_pfn(struct
>
> #define page_to_pfn(pg) \
> ({ struct page *__pg = (pg); \
> - struct zone *__zone = page_zone(__pg); \
> - (unsigned long)(__pg - __zone->zone_mem_map) + \
> - __zone->zone_start_pfn; \
> + struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg)); \
> + (unsigned long)(__pg - __pgdat->node_mem_map) + \
> + __pgdat->node_start_pfn; \
> })

NODE_DATA is an arch-specific lookup. If it is always a table lookup,
then the performance will be comparable to page_zone, because that also
involves one table lookup.
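
A toy model of the two lookups being compared, as a userspace sketch rather than kernel code. The bit layout of page->flags, the table sizes, and the helper names page_pgdat and lookups_agree are assumptions made for illustration; it also assumes NODE_DATA is a plain pointer-array lookup.

```c
#include <assert.h>

/* Toy model only: the field widths and flag layout are assumptions. */
#define ZONE_BITS  2
#define NODE_BITS  4
#define NODE_MASK  ((1UL << NODE_BITS) - 1)
#define ID_MASK    ((1UL << (ZONE_BITS + NODE_BITS)) - 1)
#define MAX_NODES  (1 << NODE_BITS)
#define MAX_ZONES  (1 << ZONE_BITS)

struct pglist_data { int node_id; };
struct zone { struct pglist_data *zone_pgdat; };
struct page { unsigned long flags; };  /* low bits: zone, next bits: node */

static struct pglist_data *node_data[MAX_NODES];        /* NODE_DATA table */
static struct zone *zone_table[MAX_NODES * MAX_ZONES];  /* page_zone table */

static unsigned long page_to_nid(const struct page *pg)
{
	return (pg->flags >> ZONE_BITS) & NODE_MASK;
}

/* Zone path: one mask plus one load from zone_table[]. */
static struct zone *page_zone(const struct page *pg)
{
	return zone_table[pg->flags & ID_MASK];
}

/* Node path: one shift and mask plus one load from node_data[]. */
static struct pglist_data *page_pgdat(const struct page *pg)
{
	return node_data[page_to_nid(pg)];
}

/* Returns 1 if both paths reach the same pgdat for a sample page. */
static int lookups_agree(void)
{
	static struct pglist_data pgdat = { 3 };
	static struct zone zone = { &pgdat };
	struct page pg = { (3UL << ZONE_BITS) | 1UL };   /* node 3, zone 1 */

	node_data[3] = &pgdat;
	zone_table[pg.flags & ID_MASK] = &zone;
	return page_zone(&pg)->zone_pgdat == page_pgdat(&pg);
}
```

In this model each path costs one indexed load from a table, which is the sense in which the two would be comparable.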

2006-02-22 03:24:21

by Kamezawa Hiroyuki

Subject: Re: [PATCH] remove zone_mem_map

Christoph Lameter wrote:
> On Wed, 22 Feb 2006, KAMEZAWA Hiroyuki wrote:
>
>> This patch removes zone_mem_map.
>
> Note that IA64 does not seem to depend on zone_mem_map...
>
Oh, yes. ia64 doesn't include asm-generic/memory_model.h when DISCONTIGMEM is configured.

>> Index: test/include/asm-generic/memory_model.h
>> ===================================================================
>> --- test.orig/include/asm-generic/memory_model.h
>> +++ test/include/asm-generic/memory_model.h
>> @@ -47,9 +47,9 @@ extern unsigned long page_to_pfn(struct
>>
>> #define page_to_pfn(pg) \
>> ({ struct page *__pg = (pg); \
>> - struct zone *__zone = page_zone(__pg); \
>> - (unsigned long)(__pg - __zone->zone_mem_map) + \
>> - __zone->zone_start_pfn; \
>> + struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg)); \
>> + (unsigned long)(__pg - __pgdat->node_mem_map) + \
>> + __pgdat->node_start_pfn; \
>> })
>
> NODE_DATA is an arch-specific lookup. If it is always a table lookup,
> then the performance will be comparable to page_zone, because that also
> involves one table lookup.
>
There are several types of NODE_DATA definitions:
1. #define NODE_DATA(node) (&node_data[node])          alpha, arm
2. #define NODE_DATA(node) (node_data[node])           i386, powerpc, x86_64, m32r
3. #define NODE_DATA(node) (&node_data[node]->pgdat)   parisc, mips
4. #define NODE_DATA(node) (per-cpu-page has node_data[nid] pointer array)   ia64

BTW, ia64 looks very special. Does it give a noticeable performance gain?

-- Kame
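
The first three styles in the list above can be modelled side by side in userspace C. This is a sketch under assumptions: the _V1/_V2/_V3 suffixes, struct node_wrapper and its arch_private field are hypothetical names, and the ia64 per-cpu variant is not modelled.

```c
#include <assert.h>

#define MAX_NUMNODES 4

struct pglist_data { int node_id; };

/* 1. Array of structures: the address is computed, no pointer is loaded. */
static struct pglist_data node_data_v1[MAX_NUMNODES];
#define NODE_DATA_V1(node) (&node_data_v1[node])

/* 2. Array of pointers: one load from the table. */
static struct pglist_data *node_data_v2[MAX_NUMNODES];
#define NODE_DATA_V2(node) (node_data_v2[node])

/* 3. Array of pointers to a wrapper structure that embeds the pgdat. */
struct node_wrapper {
	long arch_private;          /* hypothetical arch-specific data */
	struct pglist_data pgdat;
};
static struct node_wrapper *node_data_v3[MAX_NUMNODES];
#define NODE_DATA_V3(node) (&node_data_v3[node]->pgdat)

/* Fill all three tables so that node n maps to a pgdat with node_id n. */
static void setup_node_tables(void)
{
	static struct pglist_data pgdats[MAX_NUMNODES];
	static struct node_wrapper wrappers[MAX_NUMNODES];
	int n;

	for (n = 0; n < MAX_NUMNODES; n++) {
		node_data_v1[n].node_id = n;
		pgdats[n].node_id = n;
		node_data_v2[n] = &pgdats[n];
		wrappers[n].pgdat.node_id = n;
		node_data_v3[n] = &wrappers[n];
	}
}
```

Styles 2 and 3 each load one pointer from the table (style 3 then adds a constant offset), while style 1 needs only address arithmetic.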

2006-02-22 03:41:41

by Christoph Lameter

Subject: Re: [PATCH] remove zone_mem_map

On Wed, 22 Feb 2006, KAMEZAWA Hiroyuki wrote:

> BTW, ia64 looks very special. Does it give a noticeable performance gain?

Well yes, we actually have virtual mappings in kernel address space.
For example, the hotplug remove issues could be fixed there by remapping pages.
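
As a rough illustration of why a virtually mapped mem_map helps, here is a userspace sketch, not the ia64 implementation. The names vmem_map_model, page_to_pfn_virtual and pfn_to_page_virtual are hypothetical, and the sketch assumes the map is indexed directly by pfn, so that neither conversion needs a node or zone lookup.

```c
#include <assert.h>

struct page { unsigned long flags; };

/* Stand-in for a virtually contiguous mem_map indexed directly by pfn. */
static struct page vmem_map_model[128];
static struct page *vmem_map = vmem_map_model;

/* With one contiguous virtual array, both directions are plain arithmetic. */
static unsigned long page_to_pfn_virtual(const struct page *pg)
{
	return (unsigned long)(pg - vmem_map);
}

static struct page *pfn_to_page_virtual(unsigned long pfn)
{
	return vmem_map + pfn;
}
```

Because the array is addressed virtually, the physical pages backing a given part of it could in principle be changed without changing these conversions.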

2006-02-22 03:51:33

by Kamezawa Hiroyuki

Subject: Re: [PATCH] remove zone_mem_map

Christoph Lameter wrote:
> On Wed, 22 Feb 2006, KAMEZAWA Hiroyuki wrote:
>
>> BTW, ia64 looks very special. Does it give a noticeable performance gain?
>
> Well yes, we actually have virtual mappings in kernel address space.
> For example, the hotplug remove issues could be fixed there by remapping pages.
>
Ah, so if we place node_data[i] (an array of pointers to pgdat) in region 7,
is there no problem?
(Maybe zone_table[] should be placed there too.)

-- Kame