2002-03-08 21:11:46

by Patricia Gaughen

Subject: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18


Hi,

I'm currently working on a discontigmem patch for IBM NUMAQ (an ia32
NUMA box) and want to reuse the standard i386 code as much as
possible. To achieve this, I've modularized setup_arch() and
mem_init(). This modularization is what the patch that I've included
in this email contains.

Here's a breakdown of the changes I've made and their justification:

- PFN_{UP,DOWN,PHYS}, MAXMEM_PFN, MAX_NONPAE_PFN macros have
been moved to include/asm-i386/setup.h to allow use of them in
my discontigmem code that is located in numa.c (part of my
discontigmem patch, which is not available yet)

- Created a structure of the pfns used during initialization
(struct pfns - see include/asm-i386/setup.h) for ease of
passing to functions, and to enable reuse of the structure
(1 for each node) in my discontigmem code for paging_init().

- Several blocks of code in mem_init() and setup_arch() were
moved into functions for reuse in the discontigmem patch
and also for readability.
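As a rough illustration of the first two items, here is a minimal userspace sketch of the PFN macros and struct pfns. It assumes 4 KiB pages; pfns_from_range() is a hypothetical helper that is not part of the patch and only shows the rounding rules.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on stock i386. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Same definitions the patch moves into include/asm-i386/setup.h. */
#define PFN_UP(x)   (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_PHYS(x) ((x) << PAGE_SHIFT)

struct pfns {
	unsigned long start_pfn;
	unsigned long max_pfn;
	unsigned long max_low_pfn;
};

/*
 * Hypothetical helper (not in the patch): describe one memory range
 * [start, end) given in bytes. A partially used first page is skipped
 * (round up) and a partial last page is dropped (round down).
 */
static struct pfns pfns_from_range(unsigned long start, unsigned long end)
{
	struct pfns p;

	assert(start <= end);
	p.start_pfn = PFN_UP(start);
	p.max_pfn = PFN_DOWN(end);
	p.max_low_pfn = p.max_pfn;	/* no lowmem clamp in this sketch */
	return p;
}
```

A discontigmem setup could keep one such struct per node and pass a pointer to each helper, which is the reuse the list above describes.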

Let me know if you have any comments.

Thanks,
Pat

--
Patricia Gaughen ([email protected])
Linux Technology Center

--- virgin-2.4.18/arch/i386/kernel/setup.c Mon Feb 25 11:37:53 2002
+++ linux-2.4.18-cleanup/arch/i386/kernel/setup.c Sun Mar 3 22:05:39 2002
@@ -113,6 +113,7 @@
#include <asm/dma.h>
#include <asm/mpspec.h>
#include <asm/mmu_context.h>
+#include <asm/setup.h>
/*
* Machine setup..
*/
@@ -779,69 +780,14 @@
}
}

-void __init setup_arch(char **cmdline_p)
-{
- unsigned long bootmap_size, low_mem_size;
- unsigned long start_pfn, max_pfn, max_low_pfn;
- int i;
-
-#ifdef CONFIG_VISWS
- visws_get_board_type_and_rev();
-#endif
-
- ROOT_DEV = to_kdev_t(ORIG_ROOT_DEV);
- drive_info = DRIVE_INFO;
- screen_info = SCREEN_INFO;
- apm_info.bios = APM_BIOS_INFO;
- if( SYS_DESC_TABLE.length != 0 ) {
- MCA_bus = SYS_DESC_TABLE.table[3] &0x2;
- machine_id = SYS_DESC_TABLE.table[0];
- machine_submodel_id = SYS_DESC_TABLE.table[1];
- BIOS_revision = SYS_DESC_TABLE.table[2];
- }
- aux_device_present = AUX_DEVICE_INFO;
-
-#ifdef CONFIG_BLK_DEV_RAM
- rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
- rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
- rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
-#endif
- setup_memory_region();
-
- if (!MOUNT_ROOT_RDONLY)
- root_mountflags &= ~MS_RDONLY;
- init_mm.start_code = (unsigned long) &_text;
- init_mm.end_code = (unsigned long) &_etext;
- init_mm.end_data = (unsigned long) &_edata;
- init_mm.brk = (unsigned long) &_end;
-
- code_resource.start = virt_to_bus(&_text);
- code_resource.end = virt_to_bus(&_etext)-1;
- data_resource.start = virt_to_bus(&_etext);
- data_resource.end = virt_to_bus(&_edata)-1;
-
- parse_mem_cmdline(cmdline_p);
-
-#define PFN_UP(x) (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
-#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
-#define PFN_PHYS(x) ((x) << PAGE_SHIFT)
-
/*
- * Reserved space for vmalloc and iomap - defined in asm/page.h
+ * Find the highest page frame number we have available
*/
-#define MAXMEM_PFN PFN_DOWN(MAXMEM)
-#define MAX_NONPAE_PFN (1 << 20)
-
- /*
- * partially used pages are not usable - thus
- * we are rounding upwards:
- */
- start_pfn = PFN_UP(__pa(&_end));
+void __init find_max_pfn(struct pfns *bootpfns)
+{
+ int i;

- /*
- * Find the highest page frame number we have available
- */
- max_pfn = 0;
+ bootpfns->max_pfn = 0;
for (i = 0; i < e820.nr_map; i++) {
unsigned long start, end;
/* RAM? */
@@ -851,51 +797,46 @@
end = PFN_DOWN(e820.map[i].addr + e820.map[i].size);
if (start >= end)
continue;
- if (end > max_pfn)
- max_pfn = end;
+ if (end > bootpfns->max_pfn)
+ bootpfns->max_pfn = end;
}
+}

- /*
- * Determine low and high memory ranges:
- */
- max_low_pfn = max_pfn;
- if (max_low_pfn > MAXMEM_PFN) {
- max_low_pfn = MAXMEM_PFN;
+/*
+ * Determine low and high memory ranges:
+ */
+void __init find_max_low_pfn(struct pfns *bootpfns)
+{
+ bootpfns->max_low_pfn = bootpfns->max_pfn;
+ if (bootpfns->max_low_pfn > MAXMEM_PFN) {
+ bootpfns->max_low_pfn = MAXMEM_PFN;
#ifndef CONFIG_HIGHMEM
/* Maximum memory usable is what is directly addressable */
printk(KERN_WARNING "Warning only %ldMB will be used.\n",
MAXMEM>>20);
- if (max_pfn > MAX_NONPAE_PFN)
+ if (bootpfns->max_pfn > MAX_NONPAE_PFN)
printk(KERN_WARNING "Use a PAE enabled kernel.\n");
else
printk(KERN_WARNING "Use a HIGHMEM enabled kernel.\n");
#else /* !CONFIG_HIGHMEM */
#ifndef CONFIG_X86_PAE
- if (max_pfn > MAX_NONPAE_PFN) {
- max_pfn = MAX_NONPAE_PFN;
+ if (bootpfns->max_pfn > MAX_NONPAE_PFN) {
+ bootpfns->max_pfn = MAX_NONPAE_PFN;
printk(KERN_WARNING "Warning only 4GB will be used.\n");
printk(KERN_WARNING "Use a PAE enabled kernel.\n");
}
#endif /* !CONFIG_X86_PAE */
#endif /* !CONFIG_HIGHMEM */
}
+}

-#ifdef CONFIG_HIGHMEM
- highstart_pfn = highend_pfn = max_pfn;
- if (max_pfn > MAXMEM_PFN) {
- highstart_pfn = MAXMEM_PFN;
- printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
- pages_to_mb(highend_pfn - highstart_pfn));
- }
-#endif
- /*
- * Initialize the boot-time allocator (with low memory only):
- */
- bootmap_size = init_bootmem(start_pfn, max_low_pfn);
+/*
+ * Register fully available low RAM pages with the bootmem allocator.
+ */
+static void __init register_bootmem_low_pages(struct pfns *bootpfns)
+{
+ int i;

- /*
- * Register fully available low RAM pages with the bootmem allocator.
- */
for (i = 0; i < e820.nr_map; i++) {
unsigned long curr_pfn, last_pfn, size;
/*
@@ -907,15 +848,15 @@
* We are rounding up the start address of usable memory:
*/
curr_pfn = PFN_UP(e820.map[i].addr);
- if (curr_pfn >= max_low_pfn)
+ if (curr_pfn >= bootpfns->max_low_pfn)
continue;
/*
* ... and at the end of the usable range downwards:
*/
last_pfn = PFN_DOWN(e820.map[i].addr + e820.map[i].size);

- if (last_pfn > max_low_pfn)
- last_pfn = max_low_pfn;
+ if (last_pfn > bootpfns->max_low_pfn)
+ last_pfn = bootpfns->max_low_pfn;

/*
* .. finally, did all the rounding and playing
@@ -927,13 +868,45 @@
size = last_pfn - curr_pfn;
free_bootmem(PFN_PHYS(curr_pfn), PFN_PHYS(size));
}
+}
+
+static void __init setup_memory(struct pfns *bootpfns)
+{
+ unsigned long bootmap_size;
+
+ /*
+ * partially used pages are not usable - thus
+ * we are rounding upwards:
+ */
+ bootpfns->start_pfn = PFN_UP(__pa(&_end));
+
+ find_max_pfn(bootpfns);
+
+ find_max_low_pfn(bootpfns);
+
+#ifdef CONFIG_HIGHMEM
+ highstart_pfn = highend_pfn = bootpfns->max_pfn;
+ if (bootpfns->max_pfn > MAXMEM_PFN) {
+ highstart_pfn = MAXMEM_PFN;
+ printk(KERN_NOTICE "%ldMB HIGHMEM available.\n",
+ pages_to_mb(highend_pfn - highstart_pfn));
+ }
+#endif
+
+ /*
+ * Initialize the boot-time allocator (with low memory only):
+ */
+ bootmap_size = init_bootmem(bootpfns->start_pfn, bootpfns->max_low_pfn);
+
+ register_bootmem_low_pages(bootpfns);
+
/*
* Reserve the bootmem bitmap itself as well. We do this in two
* steps (first step was init_bootmem()) because this catches
* the (very unlikely) case of us accidentally initializing the
* bootmem allocator with an invalid RAM area.
*/
- reserve_bootmem(HIGH_MEMORY, (PFN_PHYS(start_pfn) +
+ reserve_bootmem(HIGH_MEMORY, (PFN_PHYS(bootpfns->start_pfn) +
bootmap_size + PAGE_SIZE-1) - (HIGH_MEMORY));

/*
@@ -950,14 +923,11 @@
*/
reserve_bootmem(PAGE_SIZE, PAGE_SIZE);
#endif
+}

-#ifdef CONFIG_X86_LOCAL_APIC
- /*
- * Find and reserve possible boot-time SMP configuration:
- */
- find_smp_config();
-#endif
#ifdef CONFIG_BLK_DEV_INITRD
+static void __init setup_mem_initrd(struct pfns *bootpfns)
+{
if (LOADER_TYPE && INITRD_START) {
if (INITRD_START + INITRD_SIZE <= (max_low_pfn << PAGE_SHIFT)) {
reserve_bootmem(INITRD_START, INITRD_SIZE);
@@ -973,31 +943,18 @@
initrd_start = 0;
}
}
-#endif
-
- /*
- * NOTE: before this point _nobody_ is allowed to allocate
- * any memory using the bootmem allocator.
- */
-
-#ifdef CONFIG_SMP
- smp_alloc_memory(); /* AP processor realmode stacks in low memory*/
-#endif
- paging_init();
-#ifdef CONFIG_X86_LOCAL_APIC
- /*
- * get boot-time SMP configuration:
- */
- if (smp_found_config)
- get_smp_config();
- init_apic_mappings();
-#endif
+}
+#endif /* CONFIG_BLK_DEV_INITRD */

+/*
+ * Request address space for all standard RAM and ROM resources
+ * and also for regions reported as reserved by the e820.
+ */
+static void __init register_memory(struct pfns *bootpfns)
+{
+ unsigned long low_mem_size;
+ int i;

- /*
- * Request address space for all standard RAM and ROM resources
- * and also for regions reported as reserved by the e820.
- */
probe_roms();
for (i = 0; i < e820.nr_map; i++) {
struct resource *res;
@@ -1031,10 +988,85 @@
request_resource(&ioport_resource, standard_io_resources+i);

/* Tell the PCI layer not to allocate too close to the RAM area.. */
- low_mem_size = ((max_low_pfn << PAGE_SHIFT) + 0xfffff) & ~0xfffff;
+ low_mem_size = ((bootpfns->max_low_pfn << PAGE_SHIFT) + 0xfffff) & ~0xfffff;
if (low_mem_size > pci_mem_start)
pci_mem_start = low_mem_size;

+}
+
+void __init setup_arch(char **cmdline_p)
+{
+ struct pfns bootpfns;
+
+#ifdef CONFIG_VISWS
+ visws_get_board_type_and_rev();
+#endif
+
+ ROOT_DEV = to_kdev_t(ORIG_ROOT_DEV);
+ drive_info = DRIVE_INFO;
+ screen_info = SCREEN_INFO;
+ apm_info.bios = APM_BIOS_INFO;
+ if( SYS_DESC_TABLE.length != 0 ) {
+ MCA_bus = SYS_DESC_TABLE.table[3] &0x2;
+ machine_id = SYS_DESC_TABLE.table[0];
+ machine_submodel_id = SYS_DESC_TABLE.table[1];
+ BIOS_revision = SYS_DESC_TABLE.table[2];
+ }
+ aux_device_present = AUX_DEVICE_INFO;
+
+#ifdef CONFIG_BLK_DEV_RAM
+ rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
+ rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
+ rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
+#endif
+ setup_memory_region();
+
+ if (!MOUNT_ROOT_RDONLY)
+ root_mountflags &= ~MS_RDONLY;
+ init_mm.start_code = (unsigned long) &_text;
+ init_mm.end_code = (unsigned long) &_etext;
+ init_mm.end_data = (unsigned long) &_edata;
+ init_mm.brk = (unsigned long) &_end;
+
+ code_resource.start = virt_to_bus(&_text);
+ code_resource.end = virt_to_bus(&_etext)-1;
+ data_resource.start = virt_to_bus(&_etext);
+ data_resource.end = virt_to_bus(&_edata)-1;
+
+ parse_mem_cmdline(cmdline_p);
+
+ setup_memory(&bootpfns);
+
+#ifdef CONFIG_X86_LOCAL_APIC
+ /*
+ * Find and reserve possible boot-time SMP configuration:
+ */
+ find_smp_config();
+#endif
+#ifdef CONFIG_BLK_DEV_INITRD
+ setup_mem_initrd(&bootpfns);
+#endif
+
+ /*
+ * NOTE: before this point _nobody_ is allowed to allocate
+ * any memory using the bootmem allocator.
+ */
+
+#ifdef CONFIG_SMP
+ smp_alloc_memory(); /* AP processor realmode stacks in low memory*/
+#endif
+ paging_init();
+#ifdef CONFIG_X86_LOCAL_APIC
+ /*
+ * get boot-time SMP configuration:
+ */
+ if (smp_found_config)
+ get_smp_config();
+ init_apic_mappings();
+#endif
+
+ register_memory(&bootpfns);
+
#ifdef CONFIG_VT
#if defined(CONFIG_VGA_CONSOLE)
conswitchp = &vga_con;
--- virgin-2.4.18/include/asm-i386/setup.h Fri Nov 12 10:12:11 1999
+++ linux-2.4.18-cleanup/include/asm-i386/setup.h Mon Mar 4 17:04:21 2002
@@ -1,10 +1,20 @@
-/*
- * Just a place holder. We don't want to have to test x86 before
- * we include stuff
- */
-
#ifndef _i386_SETUP_H
#define _i386_SETUP_H

+struct pfns {
+ unsigned long start_pfn;
+ unsigned long max_pfn;
+ unsigned long max_low_pfn;
+};
+
+#define PFN_UP(x) (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
+#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
+#define PFN_PHYS(x) ((x) << PAGE_SHIFT)
+
+/*
+ * Reserved space for vmalloc and iomap - defined in asm/page.h
+ */
+#define MAXMEM_PFN PFN_DOWN(MAXMEM)
+#define MAX_NONPAE_PFN (1 << 20)

#endif /* _i386_SETUP_H */
--- virgin-2.4.18/arch/i386/mm/init.c Fri Dec 21 09:41:53 2001
+++ linux-2.4.18-cleanup/arch/i386/mm/init.c Tue Mar 5 11:52:22 2002
@@ -447,6 +447,49 @@
return 0;
}

+void __init init_one_highpage(struct page *page, int pfn, int bad_ppro)
+{
+ if (!page_is_ram(pfn)) {
+ SetPageReserved(page);
+ return;
+ }
+
+ if (bad_ppro && page_kills_ppro(pfn))
+ {
+ SetPageReserved(page);
+ return;
+ }
+ ClearPageReserved(page);
+ set_bit(PG_highmem, &page->flags);
+ atomic_set(&page->count, 1);
+ __free_page(page);
+ totalhigh_pages++;
+}
+
+static int __init mem_init_free_pages(int bad_ppro)
+{
+ int reservedpages;
+ int tmp;
+
+ /* this will put all low memory onto the freelists */
+ totalram_pages += free_all_bootmem();
+
+ reservedpages = 0;
+ for (tmp = 0; tmp < max_low_pfn; tmp++)
+ /*
+ * Only count reserved RAM pages
+ */
+ if (page_is_ram(tmp) && PageReserved(mem_map+tmp))
+ reservedpages++;
+#ifdef CONFIG_HIGHMEM
+ for (tmp = highstart_pfn; tmp < highend_pfn; tmp++) {
+ init_one_highpage((struct page *) (mem_map + tmp), tmp, bad_ppro);
+ }
+ totalram_pages += totalhigh_pages;
+#endif
+ return reservedpages;
+}
+
void __init mem_init(void)
{
extern int ppro_with_ram_bug(void);
@@ -470,37 +513,8 @@
/* clear the zero-page */
memset(empty_zero_page, 0, PAGE_SIZE);

- /* this will put all low memory onto the freelists */
- totalram_pages += free_all_bootmem();
-
- reservedpages = 0;
- for (tmp = 0; tmp < max_low_pfn; tmp++)
- /*
- * Only count reserved RAM pages
- */
- if (page_is_ram(tmp) && PageReserved(mem_map+tmp))
- reservedpages++;
-#ifdef CONFIG_HIGHMEM
- for (tmp = highstart_pfn; tmp < highend_pfn; tmp++) {
- struct page *page = mem_map + tmp;
+ reservedpages = mem_init_free_pages(bad_ppro);

- if (!page_is_ram(tmp)) {
- SetPageReserved(page);
- continue;
- }
- if (bad_ppro && page_kills_ppro(tmp))
- {
- SetPageReserved(page);
- continue;
- }
- ClearPageReserved(page);
- set_bit(PG_highmem, &page->flags);
- atomic_set(&page->count, 1);
- __free_page(page);
- totalhigh_pages++;
- }
- totalram_pages += totalhigh_pages;
-#endif
codesize = (unsigned long) &_etext - (unsigned long) &_text;
datasize = (unsigned long) &_edata - (unsigned long) &_etext;
initsize = (unsigned long) &__init_end - (unsigned long) &__init_begin;
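
For readers who want to try the split-out logic outside the kernel, the following is a userspace sketch of what find_max_pfn() and find_max_low_pfn() compute. The e820_entry layout, the explicit map argument, and the 896 MB value behind MAXMEM_PFN are assumptions made for illustration; the kernel versions read the global e820 map and also handle the HIGHMEM and PAE warnings.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE-1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

/* Assumption: 896 MB of directly mapped low memory. */
#define MAXMEM_PFN (896UL << (20 - PAGE_SHIFT))

#define E820_RAM 1

/* Simplified stand-in for one BIOS e820 map entry. */
struct e820_entry {
	unsigned long long addr;
	unsigned long long size;
	int type;
};

struct pfns {
	unsigned long start_pfn;
	unsigned long max_pfn;
	unsigned long max_low_pfn;
};

/* Find the highest page frame number covered by a usable RAM entry. */
static void find_max_pfn(struct pfns *p, const struct e820_entry *map, size_t n)
{
	size_t i;

	p->max_pfn = 0;
	for (i = 0; i < n; i++) {
		unsigned long start, end;

		if (map[i].type != E820_RAM)
			continue;
		start = (unsigned long)PFN_UP(map[i].addr);
		end = (unsigned long)PFN_DOWN(map[i].addr + map[i].size);
		if (start >= end)
			continue;
		if (end > p->max_pfn)
			p->max_pfn = end;
	}
}

/* Clamp low memory to what can be directly mapped. */
static void find_max_low_pfn(struct pfns *p)
{
	p->max_low_pfn = p->max_pfn;
	if (p->max_low_pfn > MAXMEM_PFN)
		p->max_low_pfn = MAXMEM_PFN;
	assert(p->max_low_pfn <= p->max_pfn);
}
```

Because both helpers take the struct by pointer, the same code can in principle be run once per node with a different struct pfns each time.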




2002-03-08 21:33:56

by Dave Jones

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

On Fri, Mar 08, 2002 at 01:08:18PM -0800, Patricia Gaughen wrote:
>
> Hi,
>
> I'm currently working on a discontigmem patch for IBM NUMAQ (an ia32
> NUMA box) and want to reuse the standard i386 code as much as
> possible. To achieve this, I've modularized setup_arch() and
> mem_init(). This modularization is what the patch that I've included
> in this email contains.

As a sidenote (sort of related topic) :
An idea being kicked around a little right now is x86 subarch
support for 2.5. With so many of the niche x86 spin-offs appearing
lately, all fighting for their own piece of various files in
arch/i386/kernel/, it may be time to do the same as the ARM folks did,
and have..

arch/i386/generic/
arch/i386/numaq/
arch/i386/visws
arch/i386/voyager/
etc..

I've been meaning to find some time to move the necessary bits around,
and jiggle configs to see how it would work out, but with a pending
house move, I haven't got around to it yet.. Maybe next week.

The downsides to this:
- Code duplication.
Some routines will likely be very similar if not identical.
- Bug propagation.
If something is fixed in one subarch, there's a high possibility
it needs fixing in other subarchs

The plus sides of this:
- Removal of #ifdef noise
With more and more of these subarchs appearing, this is getting
more of an issue.
- subarchs are free to do things 'their way' without affecting the
common case.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-08 21:43:08

by Greg KH

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

On Fri, Mar 08, 2002 at 10:33:30PM +0100, Dave Jones wrote:
> As a sidenote (sort of related topic) :
> An idea being kicked around a little right now is x86 subarch
> support for 2.5. With so many of the niche x86 spin-offs appearing
> lately, all fighting for their own piece of various files in
> arch/i386/kernel/, it may be time to do the same as the ARM folks did,
> and have..
>
> arch/i386/generic/
> arch/i386/numaq/
> arch/i386/visws
> arch/i386/voyager/
> etc..

YES!!!
I've been working on the Foster patches and keep thinking that this
would be the best solution to our current #ifdef hell.

> I've been meaning to find some time to move the necessary bits around,
> and jiggle configs to see how it would work out, but with a pending
> house move, I haven't got around to it yet.. Maybe next week.
>
> The downsides to this:
> - Code duplication.
> Some routines will likely be very similar if not identical.
> - Bug propagation.
> If something is fixed in one subarch, there's a high possibility
> it needs fixing in other subarchs

Making sure that every subarch has a maintainer (someone to blame) who
keeps their subarch up to date with the "generic" one would help out
a lot with this problem.

> The plus sides of this:
> - Removal of #ifdef noise
> With more and more of these subarchs appearing, this is getting
> more of an issue.
> - subarchs are free to do things 'their way' without affecting the
> common case.

I think Martin's recent CONFIG_MULTIQUAD patches prove that the plus
side would outweigh any possible downside :)

thanks,

greg k-h

2002-03-08 21:59:29

by Martin J. Bligh

Subject: Re: [Lse-tech] Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

> As a sidenote (sort of related topic) :
> An idea being kicked around a little right now is x86 subarch
> support for 2.5. With so many of the niche x86 spin-offs appearing
> lately, all fighting for their own piece of various files in
> arch/i386/kernel/, it may be time to do the same as the ARM folks did,
> and have..
>
> arch/i386/generic/
> arch/i386/numaq/
> arch/i386/visws
> arch/i386/voyager/
> etc..
>
> I've been meaning to find some time to move the necessary bits around,
> and jiggle configs to see how it would work out, but with a pending
> house move, I haven't got around to it yet.. Maybe next week.

I'm willing to help you out with this if you like (especially as I caused
some of the current ifdefs ;-)).

> The downsides to this:
> - Code duplication.
> Some routines will likely be very similar if not identical.
> - Bug propagation.
> If something is fixed in one subarch, there's a high possibility
> it needs fixing in other subarchs

The above are what I'm really afraid of. I think the best way to avoid
most of the downside is to split up some of the current monster functions
(like setup_arch) into generic and platform-specific parts ... exactly as
Pat's patch does.

It would be nice to see a "blessing in principle" from Marcelo and
Linus before we / you start spending lots of time on this.

M.

2002-03-08 22:16:33

by Dave Jones

Subject: Re: [Lse-tech] Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

On Fri, Mar 08, 2002 at 01:59:01PM -0800, Martin J. Bligh wrote:

> It would be nice to see a "blessing in principle" from Marcelo and
> Linus before we / you start spending lots of time on this.

When I first brought it up with hpa & Linus, I only got back
a reply from hpa. Whether Linus was in "I want to think about this"
mode or just random-drop I don't know, but I agree it's worth
making sure there's some degree of acceptance before doing such
a large change.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-08 22:55:18

by James Bottomley

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

> As a sidenote (sort of related topic) :
> An idea being kicked around a little right now is x86 subarch
> support for 2.5. With so many of the niche x86 spin-offs appearing
> lately, all fighting for their own piece of various files in
> arch/i386/kernel/, it may be time to do the same as the ARM folks
> did,
> and have..

> arch/i386/generic/
> arch/i386/numaq/
> arch/i386/visws
> arch/i386/voyager/
> etc..

I'll go for this (although it's probably a 2.5 thing rather than 2.4). The
key to making an effective split is to get the abstractions in the generic
part correct. I suspect that each of the different arch's has slightly
different abstraction requirements of the i386 routines, but if we begin the
split in one arch and pass it around to the others we'll end up with something
that is roughly correct.

I'll look at doing at least a generic and voyager in the next week (if I get
time).

James


2002-03-08 23:48:45

by Christer Weinigel

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Dave Jones <[email protected]> wrote:
> As a sidenote (sort of related topic) :
> An idea being kicked around a little right now is x86 subarch
> support for 2.5. With so many of the niche x86 spin-offs appearing
> lately, all fighting for their own piece of various files in
> arch/i386/kernel/, it may be time to do the same as the ARM folks did,
> and have..
>
> arch/i386/generic/
> arch/i386/numaq/
> arch/i386/visws
> arch/i386/voyager/
> etc..

Yes please. I've been working with at least 4 different National
Semiconductor Geode based designs so far, and they will get more and
more common I believe. It'd be nice not having to crap in the rest of
the i386 tree just because one system has its own bootloader or
special motherboard.

I just got my SC2200 based board booting with LinuxBIOS, so I'll
probably have to do a special kernel initialization that does some
board-specific setup since there is no BIOS to do that.

> The downsides to this:
> - Code duplication.
> Some routines will likely be very similar if not identical.
> - Bug propagation.
> If something is fixed in one subarch, there's a high possibility
> it needs fixing in other subarchs

Couldn't this be done with a common subroutine library, such as
arch/i386/common, that contains code to set up the interrupt controller
and such? The PC platform code just includes everything; other
platforms could be a bit more choosy, have their own bootloader and
memory detection code and just skip the BIOS calls.

/Christer

--
"Just how much can I get away with and still go to heaven?"

2002-03-09 00:01:05

by H. Peter Anvin

Subject: Re: [Lse-tech] Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Followup to: <[email protected]>
By author: Dave Jones <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Fri, Mar 08, 2002 at 01:59:01PM -0800, Martin J. Bligh wrote:
>
> > It would be nice to see a "blessing in principle" from Marcelo and
> > Linus before we / you start spending lots of time on this.
>
> When I first brought it up with hpa & Linus, I only got back
> a reply from hpa. Whether Linus was in "I want to think about this"
> mode or just random-drop I don't know, but I agree it's worth
> making sure there's some degree of acceptance before doing such
> a large change.
>

It seems like it's the "obviously right thing" to do. So far x86 (a
CPU architecture) has pretty much implied PC (a system architecture),
since building a PC was the *only* reason to use x86, but that's
changing quickly.

-hpa

--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-03-09 01:15:54

by Josh Fryman

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18


excuse me for intruding a bit, but in the restructuring of kernel 2.5.x, is
there any notion of separating the build directories from the source
directories? if you're all hacking up the tree org anyway, this would be a
nice feature... (somewhat like gcc, i guess)

i ask because there are some pci cards i'm tinkering with that run linux
themselves. it would be nice to go off to /usr/src/linux-x.y.z, and do
something like this:

mkdir host
cd host
../make config
make dep && make bzImage && ...
cd ..
mkdir ixp
cd ixp
../make config
make dep && make bzImage ....

this way i can keep both sets from one source tree. right now i either
get to make mrproper between builds, or keep dual trees. if i'm missing
something major in why this isn't practical, feel free to flame :)

just curious.

-josh

2002-03-09 01:22:25

by Dave Jones

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

On Fri, Mar 08, 2002 at 08:15:18PM -0500, Josh Fryman wrote:
> excuse me for intruding a bit, but in the restructuring of kernel 2.5.x, is
> there any notion of separating the build directories from the source
> directories? if you're all hacking up the tree org anyway, this would be a
> nice feature... (somewhat like gcc, i guess)

Sounds like you want the shadow tree feature of kbuild-2.5

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-09 01:23:07

by Christer Weinigel

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Josh Fryman <[email protected]> wrote:
> excuse me for intruding a bit, but in the restructuring of kernel 2.5.x, is
> there any notion of separating the build directories from the source
> directories? if you're all hacking up the tree org anyway, this would be a
> nice feature... (somewhat like gcc, i guess)

<emulates Keith Owens>kbuild-2.5 will allow you to do that</>

/Christer

--
"Just how much can I get away with and still go to heaven?"

2002-03-09 07:25:06

by Christoph Hellwig

Subject: Re: [Lse-tech] Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

On Fri, Mar 08, 2002 at 10:33:30PM +0100, Dave Jones wrote:
> arch/i386/generic/

I'd rather call it pc, x86at or something like that.
Just because 99% of the i386 machines are IBM's PC AT architecture it's
still not generic :)

2002-03-10 07:33:14

by Eric W. Biederman

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Christer Weinigel <[email protected]> writes:

> Dave Jones <[email protected]> wrote:
> > As a sidenote (sort of related topic) :
> > An idea being kicked around a little right now is x86 subarch
> > support for 2.5. With so many of the niche x86 spin-offs appearing
> > lately, all fighting for their own piece of various files in
> > arch/i386/kernel/, it may be time to do the same as the ARM folks did,
> > and have..
> >
> > arch/i386/generic/
> > arch/i386/numaq/
> > arch/i386/visws
> > arch/i386/voyager/
> > etc..
>
> Yes please. I've been working with at least 4 different National
> Semiconductor Geode based designs so far, and they will get more and
> more common I believe. It'd be nice not having to crap in the rest of
> the i386 tree just because one system has its own bootloader or
> special motherboard.
>
> I just got my SC2200 based board booting with LinuxBIOS, so I'll
> probably have to do a special kernel initialization that does some
> board-specific setup since there is no BIOS to do that.

O.k. On the LinuxBIOS front it will take a little more work but we should
be able to have the LinuxBIOS table report the presence of the devices
that need to be configured, and possibly some motherboard specific configuration
for those devices (Think of irq routing). All of which should reduce
strange motherboard configuration to the general device driver problem.

> > The downsides to this:
> > - Code duplication.
> > Some routines will likely be very similar if not identical.
> > - Bug propagation.
> > If something is fixed in one subarch, there's a high possibility
> > it needs fixing in other subarchs
>
> Couldn't this be done with a common subroutine library, such as
> arch/i386/common that contains code to set up the interrupt controller
> and such. The PC platform code just includes everything, other
> platforms could be a bit more choosy, have its own bootloader and
> memory detection code and just skip the BIOS calls.

I have just posted a patch that does the heavy lifting needed to make
using a 32bit entry point possible. It allows skipping of the 16bit
bios calls. With that patch in place it becomes trivial to query
the data from LinuxBIOS instead of the PCBIOS.

Eric

2002-03-10 07:47:58

by Eric W. Biederman

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Dave Jones <[email protected]> writes:

> On Fri, Mar 08, 2002 at 01:08:18PM -0800, Patricia Gaughen wrote:
> >
> > Hi,
> >
> > I'm currently working on a discontigmem patch for IBM NUMAQ (an ia32
> > NUMA box) and want to reuse the standard i386 code as much as
> > possible. To achieve this, I've modularized setup_arch() and
> > mem_init(). This modularization is what the patch that I've included
> > in this email contains.
>
> As a sidenote (sort of related topic) :
> An idea being kicked around a little right now is x86 subarch
> support for 2.5. With so many of the niche x86 spin-offs appearing
> lately, all fighting for their own piece of various files in
> arch/i386/kernel/, it may be time to do the same as the ARM folks did,
> and have..

I will tentatively vote in favor of this kind of action. There
are a couple of directions to consider. This is a two dimensional
problem.

Dimension 1. Different basic hardware architectures.
(pc,numaq,visws,voyager)
Dimension 2. Different firmware implementations.
(pcbios,linuxbios,openfirmware,acpi?)

And beyond that it is fairly important to be able to build a generic
kernel that works on everything. You might have to specify a
command line parameter to tell it which arch it is really running on,
but it should work.
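
A minimal sketch of that idea, assuming a table of subarch descriptors compiled into one kernel and a name taken from a hypothetical command line option. The struct, the setup functions, and their return values are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: one entry per hardware architecture. */
struct subarch {
	const char *name;
	int (*setup)(void);
};

/* Placeholder setup routines; each returns a distinct id for the demo. */
static int pc_setup(void)      { return 0; }
static int numaq_setup(void)   { return 1; }
static int visws_setup(void)   { return 2; }
static int voyager_setup(void) { return 3; }

/* Everything is built in; the first entry is the default. */
static const struct subarch subarchs[] = {
	{ "pc",      pc_setup },
	{ "numaq",   numaq_setup },
	{ "visws",   visws_setup },
	{ "voyager", voyager_setup },
};

/*
 * Pick a subarch by name. A NULL or unknown name falls back to the
 * first entry, so a kernel booted without the option still works.
 */
static const struct subarch *select_subarch(const char *name)
{
	size_t i;

	if (name != NULL)
		for (i = 0; i < sizeof(subarchs) / sizeof(subarchs[0]); i++)
			if (strcmp(subarchs[i].name, name) == 0)
				return &subarchs[i];
	return &subarchs[0];
}

/* Run the selected subarch's setup routine. */
static int run_subarch_setup(const struct subarch *sa)
{
	assert(sa != NULL && sa->setup != NULL);
	return sa->setup();
}
```

The cost of this approach is that every subarch's code stays linked in, which is the trade-off against a compile-time split.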

From working with the alpha I can say that it is just nasty when you
must have per motherboard information in your kernel. Generally life
is much more pleasant if a small handful of things like irq routing
information is provided by the firmware so you only have to code for a
specific hardware device, and not a specific motherboard.

And even if we get to the point of putting in motherboard specific
code I would suggest it just provide the information like irq routing,
and which superio chips are present and allow a more generic layer to
handle their setup.

Anyway on the multiplexing the firmware score I have just done the
heavy lifting needed so we can put the firmware switching logic
all in C code.

Eric

2002-03-10 12:53:26

by Alan

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

> I will tentatively vote in favor of this kind of action. There
> are a couple of directions to consider. This is a two dimensional
> problem.

That should not be surprising

> Dimension 1. Different basic hardware architectures.
> (pc,numaq,visws,voyager)
(and others upcoming)

> Dimension 2. Different firmware implementations.
> (pcbios,linuxbios,openfirmware,acpi?)

i386-pc-pcbios

Maybe autoconf got the concept right. You don't necessarily want to think
of it as a grid though. A lot of the stuff is i386-*-pcbios and i386-pc-*
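
One way to picture the triplet idea is a field-by-field match where "*" stands for any value. This is only a sketch; the function names are hypothetical and nothing like it appears in the patches discussed here.

```c
#include <assert.h>
#include <string.h>

/* Match one field; a pattern field of "*" matches anything. */
static int field_matches(const char *pat, size_t plen,
			 const char *s, size_t slen)
{
	if (plen == 1 && pat[0] == '*')
		return 1;
	return plen == slen && memcmp(pat, s, plen) == 0;
}

/*
 * Compare cpu-platform-firmware triplets field by field.
 * Returns 1 if every field matches and both have the same
 * number of fields, 0 otherwise.
 */
static int triplet_matches(const char *pattern, const char *triplet)
{
	assert(pattern != NULL && triplet != NULL);
	for (;;) {
		size_t plen = strcspn(pattern, "-");
		size_t tlen = strcspn(triplet, "-");

		if (!field_matches(pattern, plen, triplet, tlen))
			return 0;
		pattern += plen;
		triplet += tlen;
		if (*pattern == '\0' || *triplet == '\0')
			return *pattern == '\0' && *triplet == '\0';
		pattern++;	/* skip the '-' separators */
		triplet++;
	}
}
```

Code tagged i386-*-pcbios would then be shared by every platform that boots from a PC BIOS, and code tagged i386-pc-* by every firmware on PC hardware.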

Alan


2002-03-11 03:23:24

by Eric W. Biederman

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

Alan Cox <[email protected]> writes:

> > I will tentatively vote in favor of this kind of action. There
> > are a couple of directions to consider. This is a two dimensional
> > problem.
>
> That should not be surprising
>
> > Dimension 1. Different basic hardware architectures.
> > (pc,numaq,visws,voyager)
> (and others upcoming)
>
> > Dimension 2. Different firmware implementations.
> > (pcbios,linuxbios,openfirmware,acpi?)
>
> i386-pc-pcbios
>
> Maybe autoconf got the concept right. You don't necessarily want to think
> of it as a grid though. A lot of the stuff is i386-*-pcbios and i386-pc-*

Agreed, there is a lot of potential for sharing.

Eric



2002-03-11 16:52:01

by James Bottomley

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

[resend: vger ate the email with the diffs inline]

I've done a first pass at the separation Dave Jones suggested.

The patches are here (diff is 250k, BK patch 25k):

http://www.hansenpartnership.com/voyager/files/split-2.4.6.diff
http://www.hansenpartnership.com/voyager/files/split-2.4.6.BK

(The bitkeeper patch is much smaller because a lot of files are only moved).

This split introduces new generic and visws directories and pulls all of the
visws specific defines out of kernel (except for some tiny cases inside
smpboot.c which were rather difficult to hook out). It also contains
preparatory abstraction work for me doing the same split for my voyager patch.

The basics of the patch are:

- creating a set of hooks which all archs must supply (asm-i386/arch_hooks.h)
- adding some arch specific includes for inline functions (currently only for
hooks in the timer interrupt).
- using more fine grained CONFIG_X86_* options to turn on and off other pieces
of the compile.
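
To give a feel for the hook approach, here is a single-file sketch. The hook names, the CONFIG_X86_VISWS_SKETCH option and the timer function are hypothetical stand-ins, not the contents of asm-i386/arch_hooks.h; in a real tree the choice of implementation would come from which subarch directory gets built rather than from an #ifdef.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook set that every subarch must supply. */
const char *arch_name_hook(void);
void arch_timer_hook(void);

/* Counts how often the subarch did extra timer work (demo only). */
static unsigned long arch_timer_extra;

#ifdef CONFIG_X86_VISWS_SKETCH
/* Stand-in for a visws-specific implementation file. */
const char *arch_name_hook(void) { return "visws"; }
void arch_timer_hook(void) { arch_timer_extra++; }
#else
/* Stand-in for the generic implementation file. */
const char *arch_name_hook(void) { return "generic"; }
void arch_timer_hook(void) { /* nothing extra on a plain PC */ }
#endif

static unsigned long jiffies_sketch;

/*
 * Common timer path: there is no #ifdef at the call site, the
 * subarch-specific part is hidden behind the hook.
 */
static unsigned long timer_interrupt_sketch(void)
{
	assert(arch_name_hook() != NULL);
	arch_timer_hook();		/* subarch-specific work, if any */
	return ++jiffies_sketch;	/* common work */
}
```

Built without the sketch option this behaves as the generic case; defining it swaps in the other hook bodies while the common code stays untouched.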

Please provide feedback; I'm going to continue on now and try to place voyager
in this abstraction (so I'll probably be adding more hooks and things).

James


2002-03-12 03:44:08

by James Bottomley

Subject: Re: [RFC] modularization of i386 setup_arch and mem_init in 2.4.18

I've completed the split into generic, visws and voyager. The base abstraction
stuff is at:

http://www.hansenpartnership.com/voyager/files/split-2.5.6.diff
http://www.hansenpartnership.com/voyager/files/split-2.5.6.BK

And the voyager stuff which goes on top is at:

http://www.hansenpartnership.com/voyager/files/voyager-2.5.6.diff
http://www.hansenpartnership.com/voyager/files/voyager-2.5.6.BK

There are now no voyager ifdefs anywhere in the arch/i386 directories (there
are still one or two in the asm-i386, though).

That's all I need to do for this split up if anyone else would like to pick it
up.

James