2009-10-02 07:03:18

by Christoph Lameter

Subject: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

Explicitly initialize the pagesets after the per cpu areas have been
initialized. This is necessary in order to be able to use per cpu
operations in later patches.

Cc: Mel Gorman <[email protected]>
Signed-off-by: Christoph Lameter <[email protected]>

---
 arch/ia64/kernel/setup.c       |    1 +
 arch/powerpc/kernel/setup_64.c |    1 +
 arch/sparc/kernel/smp_64.c     |    1 +
 arch/x86/kernel/setup_percpu.c |    2 ++
 include/linux/mm.h             |    1 +
 mm/page_alloc.c                |   40 +++++++++++++++++++++++++++++-----------
 mm/percpu.c                    |    2 ++
 7 files changed, 37 insertions(+), 11 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===================================================================
--- linux-2.6.orig/mm/page_alloc.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/mm/page_alloc.c	2009-10-01 09:36:19.000000000 -0500
@@ -3270,23 +3270,42 @@ void zone_pcp_update(struct zone *zone)
 	stop_machine(__zone_pcp_update, zone, NULL);
 }
 
-static __meminit void zone_pcp_init(struct zone *zone)
+/*
+ * Early setup of pagesets.
+ *
+ * In the NUMA case the pageset setup simply results in all zones pcp
+ * pointer being directed at a per cpu pageset with zero batchsize.
+ *
+ * This means that every free and every allocation occurs directly from
+ * the buddy allocator tables.
+ *
+ * The pageset never queues pages during early boot and is therefore usable
+ * for every type of zone.
+ */
+__meminit void setup_pagesets(void)
 {
 	int cpu;
-	unsigned long batch = zone_batchsize(zone);
+	struct zone *zone;
 
-	for (cpu = 0; cpu < NR_CPUS; cpu++) {
+	for_each_zone(zone) {
 #ifdef CONFIG_NUMA
-		/* Early boot. Slab allocator not functional yet */
-		zone_pcp(zone, cpu) = &boot_pageset[cpu];
-		setup_pageset(&boot_pageset[cpu],0);
+		unsigned long batch = 0;
+
+		for (cpu = 0; cpu < NR_CPUS; cpu++) {
+			/* Early boot. Slab allocator not functional yet */
+			zone_pcp(zone, cpu) = &boot_pageset[cpu];
+		}
 #else
-		setup_pageset(zone_pcp(zone,cpu), batch);
+		unsigned long batch = zone_batchsize(zone);
 #endif
+
+		for_each_possible_cpu(cpu)
+			setup_pageset(zone_pcp(zone, cpu), batch);
+
+		if (zone->present_pages)
+			printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%lu\n",
+				zone->name, zone->present_pages, batch);
 	}
-	if (zone->present_pages)
-		printk(KERN_DEBUG "  %s zone: %lu pages, LIFO batch:%lu\n",
-			zone->name, zone->present_pages, batch);
 }
 
 __meminit int init_currently_empty_zone(struct zone *zone,
@@ -3841,7 +3860,6 @@ static void __paginginit free_area_init_
 
 		zone->prev_priority = DEF_PRIORITY;
 
-		zone_pcp_init(zone);
 		for_each_lru(l) {
 			INIT_LIST_HEAD(&zone->lru[l].list);
 			zone->reclaim_stat.nr_saved_scan[l] = 0;
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/include/linux/mm.h	2009-10-01 09:36:19.000000000 -0500
@@ -1060,6 +1060,7 @@ extern void show_mem(void);
 extern void si_meminfo(struct sysinfo * val);
 extern void si_meminfo_node(struct sysinfo *val, int nid);
 extern int after_bootmem;
+extern void setup_pagesets(void);
 
 #ifdef CONFIG_NUMA
 extern void setup_per_cpu_pageset(void);
Index: linux-2.6/arch/ia64/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/ia64/kernel/setup.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/arch/ia64/kernel/setup.c	2009-10-01 09:35:39.000000000 -0500
@@ -864,6 +864,7 @@ void __init
 setup_per_cpu_areas (void)
 {
 	/* start_kernel() requires this... */
+	setup_pagesets();
 }
 #endif
 
Index: linux-2.6/arch/powerpc/kernel/setup_64.c
===================================================================
--- linux-2.6.orig/arch/powerpc/kernel/setup_64.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/arch/powerpc/kernel/setup_64.c	2009-10-01 09:35:39.000000000 -0500
@@ -578,6 +578,7 @@ static void ppc64_do_msg(unsigned int sr
 		snprintf(buf, 128, "%s", msg);
 		ppc_md.progress(buf, 0);
 	}
+	setup_pagesets();
 }
 
 /* Print a boot progress message. */
Index: linux-2.6/arch/sparc/kernel/smp_64.c
===================================================================
--- linux-2.6.orig/arch/sparc/kernel/smp_64.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/arch/sparc/kernel/smp_64.c	2009-10-01 09:35:39.000000000 -0500
@@ -1486,4 +1486,5 @@ void __init setup_per_cpu_areas(void)
 	of_fill_in_cpu_data();
 	if (tlb_type == hypervisor)
 		mdesc_fill_in_cpu_data(cpu_all_mask);
+	setup_pagesets();
 }
Index: linux-2.6/arch/x86/kernel/setup_percpu.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup_percpu.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/arch/x86/kernel/setup_percpu.c	2009-10-01 09:35:39.000000000 -0500
@@ -269,4 +269,6 @@ void __init setup_per_cpu_areas(void)
 
 	/* Setup cpu initialized, callin, callout masks */
 	setup_cpu_local_masks();
+
+	setup_pagesets();
 }
Index: linux-2.6/mm/percpu.c
===================================================================
--- linux-2.6.orig/mm/percpu.c	2009-10-01 08:54:19.000000000 -0500
+++ linux-2.6/mm/percpu.c	2009-10-01 09:35:39.000000000 -0500
@@ -2062,5 +2062,7 @@ void __init setup_per_cpu_areas(void)
 	delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
 	for_each_possible_cpu(cpu)
 		__per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
+
+	setup_pagesets();
 }
 #endif	/* CONFIG_HAVE_SETUP_PER_CPU_AREA */

--


2009-10-02 14:16:14

by Mel Gorman

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Thu, Oct 01, 2009 at 05:25:33PM -0400, [email protected] wrote:
> Explicitly initialize the pagesets after the per cpu areas have been
> initialized. This is necessary in order to be able to use per cpu
> operations in later patches.
>

Can you be more explicit about this? I think the reasoning is as follows:

A later patch will use DEFINE_PER_CPU which allocates memory later in
the boot-cycle after zones have already been initialised. Without this
patch, use of DEFINE_PER_CPU would result in invalid memory accesses
during pageset initialisation.

I have another question below as well.

> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> ---
>  arch/ia64/kernel/setup.c       |    1 +
>  arch/powerpc/kernel/setup_64.c |    1 +
>  arch/sparc/kernel/smp_64.c     |    1 +
>  arch/x86/kernel/setup_percpu.c |    2 ++
>  include/linux/mm.h             |    1 +
>  mm/page_alloc.c                |   40 +++++++++++++++++++++++++++++-----------
>  mm/percpu.c                    |    2 ++
>  7 files changed, 37 insertions(+), 11 deletions(-)
>
> Index: linux-2.6/mm/page_alloc.c
> ===================================================================
> --- linux-2.6.orig/mm/page_alloc.c	2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/mm/page_alloc.c	2009-10-01 09:36:19.000000000 -0500
> @@ -3270,23 +3270,42 @@ void zone_pcp_update(struct zone *zone)
>  	stop_machine(__zone_pcp_update, zone, NULL);
>  }
>
> -static __meminit void zone_pcp_init(struct zone *zone)
> +/*
> + * Early setup of pagesets.
> + *
> + * In the NUMA case the pageset setup simply results in all zones pcp
> + * pointer being directed at a per cpu pageset with zero batchsize.
> + *

The batchsize becomes 1, not 0 if you look at setup_pageset() but that aside,
it's unclear from the comment *why* the batchsize is 1 in the NUMA case.
Maybe something like the following?

=====
In the NUMA case, the boot_pageset is used until the slab allocator is
available to allocate per-zone pagesets as each CPU is brought up. At
this point, the batchsize is set to 1 to prevent pages "leaking" onto the
boot_pageset freelists.
=====
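
For reference, setup_pageset() currently looks something like the
following (quoting from memory, so treat this as a sketch rather than
the exact source); the max() clamp is what turns a batch argument of 0
into an effective batch of 1:

static void setup_pageset(struct per_cpu_pageset *p, unsigned long batch)
{
        struct per_cpu_pages *pcp;

        memset(p, 0, sizeof(*p));

        pcp = &p->pcp;
        pcp->count = 0;
        pcp->high = 6 * batch;                  /* 0 during early boot: never queue */
        pcp->batch = max(1UL, 1 * batch);       /* a batch of 0 is clamped up to 1 */
        INIT_LIST_HEAD(&pcp->list);
}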

Otherwise, nothing in the patch jumped out at me other than to double
check CPU-up events actually result in process_zones() being called and
that boot_pageset is not being accidentally used in the long term.

> + * This means that every free and every allocation occurs directly from
> + * the buddy allocator tables.
> + *
> + * The pageset never queues pages during early boot and is therefore usable
> + * for every type of zone.
> + */
> +__meminit void setup_pagesets(void)
> {
> int cpu;
> - unsigned long batch = zone_batchsize(zone);
> + struct zone *zone;
>
> - for (cpu = 0; cpu < NR_CPUS; cpu++) {
> + for_each_zone(zone) {
> #ifdef CONFIG_NUMA
> - /* Early boot. Slab allocator not functional yet */
> - zone_pcp(zone, cpu) = &boot_pageset[cpu];
> - setup_pageset(&boot_pageset[cpu],0);
> + unsigned long batch = 0;
> +
> + for (cpu = 0; cpu < NR_CPUS; cpu++) {
> + /* Early boot. Slab allocator not functional yet */
> + zone_pcp(zone, cpu) = &boot_pageset[cpu];
> + }
> #else
> - setup_pageset(zone_pcp(zone,cpu), batch);
> + unsigned long batch = zone_batchsize(zone);
> #endif
> +
> + for_each_possible_cpu(cpu)
> + setup_pageset(zone_pcp(zone, cpu), batch);
> +
> + if (zone->present_pages)
> + printk(KERN_DEBUG " %s zone: %lu pages, LIFO batch:%lu\n",
> + zone->name, zone->present_pages, batch);
> }
> - if (zone->present_pages)
> - printk(KERN_DEBUG " %s zone: %lu pages, LIFO batch:%lu\n",
> - zone->name, zone->present_pages, batch);
> }
>
> __meminit int init_currently_empty_zone(struct zone *zone,
> @@ -3841,7 +3860,6 @@ static void __paginginit free_area_init_
>
> zone->prev_priority = DEF_PRIORITY;
>
> - zone_pcp_init(zone);
> for_each_lru(l) {
> INIT_LIST_HEAD(&zone->lru[l].list);
> zone->reclaim_stat.nr_saved_scan[l] = 0;
> Index: linux-2.6/include/linux/mm.h
> ===================================================================
> --- linux-2.6.orig/include/linux/mm.h 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/include/linux/mm.h 2009-10-01 09:36:19.000000000 -0500
> @@ -1060,6 +1060,7 @@ extern void show_mem(void);
> extern void si_meminfo(struct sysinfo * val);
> extern void si_meminfo_node(struct sysinfo *val, int nid);
> extern int after_bootmem;
> +extern void setup_pagesets(void);
>
> #ifdef CONFIG_NUMA
> extern void setup_per_cpu_pageset(void);
> Index: linux-2.6/arch/ia64/kernel/setup.c
> ===================================================================
> --- linux-2.6.orig/arch/ia64/kernel/setup.c 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/arch/ia64/kernel/setup.c 2009-10-01 09:35:39.000000000 -0500
> @@ -864,6 +864,7 @@ void __init
> setup_per_cpu_areas (void)
> {
> /* start_kernel() requires this... */
> + setup_pagesets();
> }
> #endif
>
> Index: linux-2.6/arch/powerpc/kernel/setup_64.c
> ===================================================================
> --- linux-2.6.orig/arch/powerpc/kernel/setup_64.c 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/arch/powerpc/kernel/setup_64.c 2009-10-01 09:35:39.000000000 -0500
> @@ -578,6 +578,7 @@ static void ppc64_do_msg(unsigned int sr
> snprintf(buf, 128, "%s", msg);
> ppc_md.progress(buf, 0);
> }
> + setup_pagesets();
> }
>
> /* Print a boot progress message. */
> Index: linux-2.6/arch/sparc/kernel/smp_64.c
> ===================================================================
> --- linux-2.6.orig/arch/sparc/kernel/smp_64.c 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/arch/sparc/kernel/smp_64.c 2009-10-01 09:35:39.000000000 -0500
> @@ -1486,4 +1486,5 @@ void __init setup_per_cpu_areas(void)
> of_fill_in_cpu_data();
> if (tlb_type == hypervisor)
> mdesc_fill_in_cpu_data(cpu_all_mask);
> + setup_pagesets();
> }
> Index: linux-2.6/arch/x86/kernel/setup_percpu.c
> ===================================================================
> --- linux-2.6.orig/arch/x86/kernel/setup_percpu.c 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/arch/x86/kernel/setup_percpu.c 2009-10-01 09:35:39.000000000 -0500
> @@ -269,4 +269,6 @@ void __init setup_per_cpu_areas(void)
>
> /* Setup cpu initialized, callin, callout masks */
> setup_cpu_local_masks();
> +
> + setup_pagesets();
> }
> Index: linux-2.6/mm/percpu.c
> ===================================================================
> --- linux-2.6.orig/mm/percpu.c 2009-10-01 08:54:19.000000000 -0500
> +++ linux-2.6/mm/percpu.c 2009-10-01 09:35:39.000000000 -0500
> @@ -2062,5 +2062,7 @@ void __init setup_per_cpu_areas(void)
> delta = (unsigned long)pcpu_base_addr - (unsigned long)__per_cpu_start;
> for_each_possible_cpu(cpu)
> __per_cpu_offset[cpu] = delta + pcpu_unit_offsets[cpu];
> +
> + setup_pagesets();
> }
> #endif /* CONFIG_HAVE_SETUP_PER_CPU_AREA */
>
> --
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2009-10-02 17:36:20

by Christoph Lameter

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Fri, 2 Oct 2009, Mel Gorman wrote:

> On Thu, Oct 01, 2009 at 05:25:33PM -0400, [email protected] wrote:
> > Explicitly initialize the pagesets after the per cpu areas have been
> > initialized. This is necessary in order to be able to use per cpu
> > operations in later patches.
> >
>
> Can you be more explicit about this? I think the reasoning is as follows
>
> A later patch will use DEFINE_PER_CPU which allocates memory later in
> the boot-cycle after zones have already been initialised. Without this
> patch, use of DEFINE_PER_CPU would result in invalid memory accesses
> during pageset initialisation.

Nope. Pagesets are not statically allocated per cpu data. They are
allocated with the per cpu allocator.

The per cpu allocator is not initialized that early in boot, so we cannot
allocate the pagesets then. Therefore we use a fake single-item pageset
(as used now for NUMA boot) to take its place until the slab and percpu
allocators are up. Then we allocate the real pagesets.
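
In other words, the existing NUMA path, heavily simplified (the real
process_zones() also unwinds already-allocated zones on failure and
handles percpu_pagelist_fraction):

/* boot_pageset stands in until the slab allocator works */
static struct per_cpu_pageset boot_pageset[NR_CPUS];

static int __cpuinit process_zones(int cpu)
{
        struct zone *zone;

        for_each_populated_zone(zone) {
                /* replace the boot_pageset stand-in with a real pageset */
                zone_pcp(zone, cpu) = kmalloc_node(
                                sizeof(struct per_cpu_pageset),
                                GFP_KERNEL, cpu_to_node(cpu));
                if (!zone_pcp(zone, cpu))
                        return -ENOMEM;
                setup_pageset(zone_pcp(zone, cpu), zone_batchsize(zone));
        }
        return 0;
}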

> > -static __meminit void zone_pcp_init(struct zone *zone)
> > +/*
> > + * Early setup of pagesets.
> > + *
> > + * In the NUMA case the pageset setup simply results in all zones pcp
> > + * pointer being directed at a per cpu pageset with zero batchsize.
> > + *
>
> The batchsize becomes 1, not 0 if you look at setup_pageset() but that aside,
> it's unclear from the comment *why* the batchsize is 1 in the NUMA case.
> Maybe something like the following?
>
> =====
> In the NUMA case, the boot_pageset is used until the slab allocator is
> available to allocate per-zone pagesets as each CPU is brought up. At
> this point, the batchsize is set to 1 to prevent pages "leaking" onto the
> boot_pageset freelists.
> =====
>
> Otherwise, nothing in the patch jumped out at me other than to double
> check CPU-up events actually result in process_zones() being called and
> that boot_pageset is not being accidentally used in the long term.

This is already explained in a comment where boot_pageset is defined.
Should we add some more elaborate comments to zone_pcp_init()?

2009-10-03 10:29:28

by Tejun Heo

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

[email protected] wrote:
> Explicitly initialize the pagesets after the per cpu areas have been
> initialized. This is necessary in order to be able to use per cpu
> operations in later patches.
>
> Cc: Mel Gorman <[email protected]>
> Signed-off-by: Christoph Lameter <[email protected]>
>
> ---
> arch/ia64/kernel/setup.c | 1 +
> arch/powerpc/kernel/setup_64.c | 1 +
> arch/sparc/kernel/smp_64.c | 1 +
> arch/x86/kernel/setup_percpu.c | 2 ++
> include/linux/mm.h | 1 +
> mm/page_alloc.c | 40 +++++++++++++++++++++++++++++-----------
> mm/percpu.c | 2 ++

Hmmm... why not call this function from start_kernel() after calling
setup_per_cpu_areas() instead of modifying every implementation of
setup_per_cpu_areas()?
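
Something like the following, I mean (a sketch against init/main.c,
with unrelated boot calls elided):

asmlinkage void __init start_kernel(void)
{
        /* ... earlier boot init elided ... */
        setup_per_cpu_areas();
        /* one shared call site, right after the percpu areas exist */
        setup_pagesets();
        /* ... rest of boot elided ... */
}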

Thanks.

--
tejun

2009-10-05 09:36:11

by Mel Gorman

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Fri, Oct 02, 2009 at 01:30:58PM -0400, Christoph Lameter wrote:
> On Fri, 2 Oct 2009, Mel Gorman wrote:
>
> > On Thu, Oct 01, 2009 at 05:25:33PM -0400, [email protected] wrote:
> > > Explicitly initialize the pagesets after the per cpu areas have been
> > > initialized. This is necessary in order to be able to use per cpu
> > > operations in later patches.
> > >
> >
> > Can you be more explicit about this? I think the reasoning is as follows
> >
> > A later patch will use DEFINE_PER_CPU which allocates memory later in
> > the boot-cycle after zones have already been initialised. Without this
> > patch, use of DEFINE_PER_CPU would result in invalid memory accesses
> > during pageset initialisation.
>
> Nope. Pagesets are not statically allocated per cpu data. They are
> allocated with the per cpu allocator.
>

I don't think I said they were statically allocated.

> The per cpu allocator is not initialized that early in boot, so we cannot
> allocate the pagesets then. Therefore we use a fake single-item pageset
> (as used now for NUMA boot) to take its place until the slab and percpu
> allocators are up. Then we allocate the real pagesets.
>

Ok, that explanation matches my expectations. Thanks.

> > > -static __meminit void zone_pcp_init(struct zone *zone)
> > > +/*
> > > + * Early setup of pagesets.
> > > + *
> > > + * In the NUMA case the pageset setup simply results in all zones pcp
> > > + * pointer being directed at a per cpu pageset with zero batchsize.
> > > + *
> >
> > The batchsize becomes 1, not 0 if you look at setup_pageset() but that aside,
> > it's unclear from the comment *why* the batchsize is 1 in the NUMA case.
> > Maybe something like the following?
> >
> > =====
> > In the NUMA case, the boot_pageset is used until the slab allocator is
> > available to allocate per-zone pagesets as each CPU is brought up. At
> > this point, the batchsize is set to 1 to prevent pages "leaking" onto the
> > boot_pageset freelists.
> > =====
> >
> > Otherwise, nothing in the patch jumped out at me other than to double
> > check CPU-up events actually result in process_zones() being called and
> > that boot_pageset is not being accidentally used in the long term.
>
> This is already explained in a comment where boot_pageset is defined.
> Should we add some more elaborate comments to zone_pcp_init()?
>

I suppose not. Point them to the comment at boot_pageset's definition so
there is a chance the comment stays up to date.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2009-10-05 14:46:19

by Christoph Lameter

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Sat, 3 Oct 2009, Tejun Heo wrote:

> > arch/ia64/kernel/setup.c | 1 +
> > arch/powerpc/kernel/setup_64.c | 1 +
> > arch/sparc/kernel/smp_64.c | 1 +
> > arch/x86/kernel/setup_percpu.c | 2 ++
> > include/linux/mm.h | 1 +
> > mm/page_alloc.c | 40 +++++++++++++++++++++++++++++-----------
> > mm/percpu.c | 2 ++
>
> Hmmm... why not call this function from start_kernel() after calling
> setup_per_cpu_areas() instead of modifying every implementation of
> setup_per_cpu_areas()?

Because it has to be called immediately after per cpu areas become
available. Otherwise page allocator uses will fail.

2009-10-05 15:02:57

by Tejun Heo

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

Christoph Lameter wrote:
> On Sat, 3 Oct 2009, Tejun Heo wrote:
>
>>> arch/ia64/kernel/setup.c | 1 +
>>> arch/powerpc/kernel/setup_64.c | 1 +
>>> arch/sparc/kernel/smp_64.c | 1 +
>>> arch/x86/kernel/setup_percpu.c | 2 ++
>>> include/linux/mm.h | 1 +
>>> mm/page_alloc.c | 40 +++++++++++++++++++++++++++++-----------
>>> mm/percpu.c | 2 ++
>> Hmmm... why not call this function from start_kernel() after calling
>> setup_per_cpu_areas() instead of modifying every implementation of
>> setup_per_cpu_areas()?
>
> Because it has to be called immediately after per cpu areas become
> available. Otherwise page allocator uses will fail.

But... setup_per_cpu_areas() isn't supposed to call the page allocator,
and start_kernel() is the only caller of setup_per_cpu_areas().

--
tejun

2009-10-05 15:13:36

by Christoph Lameter

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Tue, 6 Oct 2009, Tejun Heo wrote:

> > Because it has to be called immediately after per cpu areas become
> > available. Otherwise page allocator uses will fail.
>
> But... setup_per_cpu_areas() isn't supposed to call the page allocator,
> and start_kernel() is the only caller of setup_per_cpu_areas().

setup_per_cpu_areas() does not call the page allocator itself. However,
any code that runs after it can call the page allocator.

There are various architectures that provide their own implementation of
setup_per_cpu_areas() and call it at their own point in boot (check
sparc and ia64, for example).

2009-10-05 15:22:57

by Tejun Heo

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

Christoph Lameter wrote:
> On Tue, 6 Oct 2009, Tejun Heo wrote:
>
>>> Because it has to be called immediately after per cpu areas become
>>> available. Otherwise page allocator uses will fail.
>> But... setup_per_cpu_areas() isn't supposed to call the page allocator,
>> and start_kernel() is the only caller of setup_per_cpu_areas().
>
> setup_per_cpu_areas() does not call the page allocator itself. However,
> any code that runs after it can call the page allocator.
>
> There are various architectures that provide their own implementation of
> setup_per_cpu_areas() and call it at their own point in boot (check
> sparc and ia64, for example).

sparc does everything in setup_per_cpu_areas(); ia64 is an
exception. Hmmm... I really don't like scattering a mostly unrelated
init call across every setup_per_cpu_areas() implementation. How about
adding "static bool initialized __initdata" to the function and allowing
it to be called earlier when necessary (only ia64 at the moment)?
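
I.e., roughly the following (a sketch; the guard is the only point):

void __init setup_pagesets(void)
{
        static bool initialized __initdata;

        /* tolerate an early arch call (ia64) plus the generic call site */
        if (initialized)
                return;
        initialized = true;

        /* ... pageset setup as in the patch ... */
}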

Thanks.

--
tejun

2009-10-05 15:35:45

by Christoph Lameter

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Tue, 6 Oct 2009, Tejun Heo wrote:

> > There are various arch implementations that do their own implementation of
> > setup_per_cpu_areas() at their own time (Check sparc and ia64 for
> > example).
>
> sparc does everything in setup_per_cpu_areas(); ia64 is an
> exception. Hmmm... I really don't like scattering a mostly unrelated
> init call across every setup_per_cpu_areas() implementation. How about
> adding "static bool initialized __initdata" to the function and allowing
> it to be called earlier when necessary (only ia64 at the moment)?

It would be best to consolidate all setup_per_cpu_areas() implementations
so they run at the same time during bootup. How about having a single
setup_per_cpu_areas() function and allowing arch-specific processing
within it? Try to share as much code as possible.
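
Roughly like this, perhaps (entirely hypothetical, including the
arch_setup_per_cpu_areas() hook name):

/*
 * One generic implementation with an arch hook, so that
 * setup_pagesets() keeps a single call site.
 */
void __init __weak arch_setup_per_cpu_areas(void)
{
}

void __init setup_per_cpu_areas(void)
{
        /* ... common percpu area setup shared by all architectures ... */
        arch_setup_per_cpu_areas();
        setup_pagesets();
}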

2009-10-05 15:42:43

by Tejun Heo

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

Christoph Lameter wrote:
> On Tue, 6 Oct 2009, Tejun Heo wrote:
>
>>> There are various architectures that provide their own implementation of
>>> setup_per_cpu_areas() and call it at their own point in boot (check
>>> sparc and ia64, for example).
>> sparc does everything in setup_per_cpu_areas(); ia64 is an
>> exception. Hmmm... I really don't like scattering a mostly unrelated
>> init call across every setup_per_cpu_areas() implementation. How about
>> adding "static bool initialized __initdata" to the function and allowing
>> it to be called earlier when necessary (only ia64 at the moment)?
>
> It would be best to consolidate all setup_per_cpu_areas() implementations
> so they run at the same time during bootup. How about having a single
> setup_per_cpu_areas() function and allowing arch-specific processing
> within it? Try to share as much code as possible.

I'm under the impression that ia64 needs its percpu areas set up
earlier during boot, so I'm not sure what you have in mind, but as
long as the call isn't scattered over different setup_per_cpu_areas()
implementations, I'm okay.

Thanks.

--
tejun

2009-10-05 15:45:55

by Christoph Lameter

Subject: Re: [this_cpu_xx V4 12/20] Move early initialization of pagesets out of zone_wait_table_init()

On Tue, 6 Oct 2009, Tejun Heo wrote:

> I'm under the impression that ia64 needs its percpu areas set up
> earlier during boot, so I'm not sure what you have in mind, but as
> long as the call isn't scattered over different setup_per_cpu_areas()
> implementations, I'm okay.

Currently it does. But there must be some way to untangle that mess.