2002-04-16 15:55:58

by James Bottomley

Subject: [PATCH] i386 arch subdivision into machine types for 2.5.8

This patch tries to split arch/i386 up into machine specific directories
(similar to the way arch/arm is done). The idea is to separate out those
machines which don't look like standard PCs (particularly from an SMP
standpoint). For the current kernel, all it really does is to get the visws
stuff into a separate directory (arch/i386/visws). I've also taken some files
which aren't going to be used by non-pc SMP machines (mainly related to mpbios
and ioapic) and placed them into arch/i386/generic.

The patch goes much further than visws needs, mainly because it now allows me
to add my voyager stuff in a separate arch/i386/voyager directory with
virtually no disturbance of the main line code. I'm afraid there are also
still four VISWS defines in arch/i386/kernel/smpboot.c because it wasn't
obvious to me how to get rid of them simply.

The 269k diff file (large because it has a lot of file moves) is at

http://www.hansenpartnership.com/voyager/files/arch-split-2.5.8.diff

There's also a BitKeeper repository containing all of this at

http://linux-voyager.bkbits.net/arch-split-2.5

I haven't done anything about the other half of the arch/i386 reform, which is
splitting the PC directory up into bus types, but I believe Patrick Mochel is
thinking about this.

Comments and suggestions welcome.

James Bottomley



2002-04-16 16:49:10

by Patrick Mochel

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8


> I haven't done anything about the other half of i386/arch reform which is
> splitting the PC directory up into bus types, but I believe Patrick Mochel is
> thinking about this.

Not necessarily bus types, but close.

I've done three sets of cleanups in the arch/i386/kernel/ directory:

- x86 CPU
- mtrr
- PCI

Each one does similar things to those drivers: moves the support into
subdirectories, and splits the monolithic files into platform-specific
modules.

Doing this has several advantages:

- Only the code for your platform gets compiled in
- Resulting code has fewer conditional compilation constructs
- Resulting code is more extensible and modular
- Fewer conflicting changes in files with multiple contributors
- It's easier to figure out what the heck is going on


The main motivation behind this has been the PCI driver, especially with
the numerous conflicting changes that I've seen both personally, and with
the various ACPI and NUMA changes. I've been wanting to do something like
this for about a year. About a month ago, I finally just sat down and did
it.

The patches can all be found at

http://kernel.org/pub/linux/kernel/people/mochel/patches/

Unfortunately, maintaining these massive changes is time consuming and
conflicts with other goals and timelines. The only one I really care
about is the PCI driver. I've had a chance to up-port it to 2.5.8, and it
should work for most people (though I've only tested it on single and dual
x86 boxes w/o ACPI support).

The CPU cleanups are against ~2.5.6, and most likely won't apply to the
current tree. Conflicts tend to be obvious, and easily fixable, if anyone
is willing to up-port it.

Ditto for the mtrr driver, though it's pretty stale (~1 month old), and
likely to have more conflicts.

If there is serious interest, I'll up-port them to the latest kernel and
export BK trees.


One issue that I encountered along the way was arch/i386/kernel/Makefile.
I found that you can't easily build multiple targets in the same
directory, and have dependencies for one target in subdirectories.
Typically, target objects have one or the other.

In order to make this work, I had to do:

-all: kernel.o head.o init_task.o
+all: first_rule kernel.o head.o init_task.o

...

+kernel-subdir-$(CONFIG_PCI) += pci
+subdir-y := $(kernel-subdir-y)
+obj-y += $(foreach dir,$(subdir-y),$(dir)/$(dir).o)


The last part is decent, but the explicit dependency on the first_rule
target is kinda gross. Is there a better way to do this? Will kbuild 2.5
make this nicer?


-pat



2002-04-16 19:37:29

by Eric W. Biederman

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

James Bottomley <[email protected]> writes:

> This patch tries to split arch/i386 up into machine specific directories
> (similar to the way arch/arm is done). The idea is to separate out those
> machines which don't look like standard PCs (particularly from an SMP
> standpoint). For the current kernel, all it really does is to get the visws
> stuff into a separate directory (arch/i386/visws). I've also taken some files
> which aren't going to be used by non-pc SMP machines (mainly related to mpbios
> and ioapic) and placed them into arch/i386/generic.

A couple of comments.
- There is no way to build a generic kernel that just needs
a command line to select the architecture. That is something that is important
for installers. Even better would be auto-detection of the platform from
firmware information, but you can't always do that.

- By just allowing redirecting setup_memory_region you don't allow for
architectures that don't have the 384K memory hole.

- The hooks you add aren't used and are so generic it isn't obvious what
they are supposed to do from their names.

- setup_arch.h is nasty. What code it provides depends on what is defined
when it is included. Couldn't two headers do this job better? Or better yet,
can't you just use function calls?

And of course you don't look at allowing different firmware implementations,
but I'm doing that, so it is covered. :)

Eric

2002-04-16 20:51:24

by James Bottomley

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

[email protected] said:
> - There is no way to build a generic kernel, that just needs
> a command line to select the architecture. Something that is
> important
> for installers. Even better would auto detection of the platform
> from
> firmware information, but you can't always do that.

The design is to do this from config.in, not to modularise so you can select
on boot. Is that what you were asking?

> - By just allowing redirecting setup_memory_region you don't allow for
> architectures that don't have the 384K memory hole.

True. The split has been evolved only far enough to let me slot in the
voyager port fairly easily, and it has a 384K hole too. The idea is more to
begin the framework, so others can adapt it as more machine types come along.

Like all abstractions, unless they're tightly bound to the actual use, they
can become unwieldy and unusable very quickly as you abstract out things that
no-one is ever going to want. I erred on the side of utility.

> - setup_arch.h is nasty. What code it has depends on what it is
> defined
> when it is included. Couldn't 2 headers to this job better? Or
> better yet
> can't you just use function calls?

I agree with both of these. The main problem with the memory setup calls is
that most of them are static. I could export them and do overrides, like I do
for everything else, but as someone who also debugs the kernel, I like static
functions because they tell me the use is tightly isolated. I could easily do
two files, it was just looking more messy.

I'll see if I can export some of the setup.c internals and re-arrange this in
a more orderly way.

> - The hooks you add aren't used and are so generic it isn't obvious
> what
> they are supposed do from their names.

All of them are used if you look at the additional voyager stuff. Which names
would you like to be more explicit?

> And of course you don't look at allowing different firmware
> implementations, but I'm doing that, so it is covered. :)

Actually, I've silently ignored all the boot problems as well.

James


2002-04-16 21:06:48

by Dave Jones

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

On Tue, Apr 16, 2002 at 03:51:12PM -0500, James Bottomley wrote:
> I agree with both of these. The main problem with the memory setup calls is
> that most of them are static. I could export them and do overrides, like I do
> for everything else, but as someone who also debugs the kernel, I like static
> functions because they tell me the use is tightly isolated. I could easily do
> two files, it was just looking more messy.
>
> I'll see if I can export some of the setup.c internals and re-arrange this in
> a more orderly way.

I think this is where Patrick Mochel's recent work in that area is going to
come in handy. setup.c has been nicely abstracted out into separate
parts, which should make things a little easier.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-04-16 21:51:48

by Eric W. Biederman

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

James Bottomley <[email protected]> writes:

> [email protected] said:
> > - There is no way to build a generic kernel, that just needs
> > a command line to select the architecture. Something that is
> > important
> > for installers. Even better would auto detection of the platform
> > from
> > firmware information, but you can't always do that.
>
> The design is to do this from config.in, not to modularise so you can select
> on boot. Is that what you were asking?

Yes. I'm totally for the ability to select from config.in. But at
the same time, being able to build a kernel that works in all
kinds of configurations comes in quite handy. I know the alpha does
this; I'm not quite certain about ARM.


> > - By just allowing redirecting setup_memory_region you don't allow for
> > architectures that don't have the 384K memory hole.
>
> True. The split has been evolved only far enough to let me slot in the
> voyager port fairly easily, and it has a 384K hole too. The idea is more to
> begin the framework, so others can adapt it as more machine types come along.
>
> Like all abstractions, unless they're tightly bound to the actual use, they
> can become unwieldy and unusable very quickly as you abstract out things that
> no-one is ever going to want. I erred on the side of utility.

True. It's just that I have a machine that doesn't have the 384K hole.
I found all I needed to export was add_memory_region and
print_memory_region, and then I could do whatever was needed.
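
A minimal sketch of why those two entry points are enough, assuming a
simplified region table. The struct layout, MAX_REGIONS, REGION_RAM and the
two setup_*_memory helpers are hypothetical stand-ins for illustration, not
the actual setup.c code:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the e820-style map kept in setup.c. */
#define MAX_REGIONS 32
#define REGION_RAM  1

struct mem_region {
    unsigned long long start;
    unsigned long long size;
    int type;
};

static struct mem_region region_map[MAX_REGIONS];
static int nr_regions;

/* Append one region; returns 0 on success, -1 if the table is full. */
int add_memory_region(unsigned long long start, unsigned long long size,
                      int type)
{
    if (nr_regions >= MAX_REGIONS)
        return -1;
    region_map[nr_regions].start = start;
    region_map[nr_regions].size = size;
    region_map[nr_regions].type = type;
    nr_regions++;
    return 0;
}

/* Print the map in a "start - end (type)" style similar to the boot log. */
void print_memory_region(const char *who)
{
    int i;

    for (i = 0; i < nr_regions; i++)
        printf(" %s: %016llx - %016llx (%s)\n", who,
               region_map[i].start,
               region_map[i].start + region_map[i].size,
               region_map[i].type == REGION_RAM ? "usable" : "reserved");
}

/* Standard PC layout: RAM below 640K, a 384K hole, then RAM from 1M up. */
void setup_pc_memory(unsigned long long top)
{
    assert(top > 0x100000ULL);
    add_memory_region(0, 0xa0000ULL, REGION_RAM);
    add_memory_region(0x100000ULL, top - 0x100000ULL, REGION_RAM);
}

/* A machine without the hole describes one contiguous range instead. */
void setup_flat_memory(unsigned long long top)
{
    assert(top > 0);
    add_memory_region(0, top, REGION_RAM);
}
```

With only the add and print routines exported, each machine type can build
whatever map it needs, and nothing else in the generic code has to know
whether a hole exists.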

> > - setup_arch.h is nasty. What code it has depends on what it is
> > defined
> > when it is included. Couldn't 2 headers to this job better? Or
> > better yet
> > can't you just use function calls?
>
> I agree with both of these. The main problem with the memory setup calls is
> that most of them are static. I could export them and do overrides, like I do
> for everything else, but as someone who also debugs the kernel, I like static
> functions because they tell me the use is tightly isolated. I could easily do
> two files, it was just looking more messy.
>
> I'll see if I can export some of the setup.c internals and re-arrange this in
> a more orderly way.
>
> > - The hooks you add aren't used and are so generic it isn't obvious
> > what
> > they are supposed do from their names.
>
> All of them are used if you look at the additional voyager stuff, what names
> would you like to be more explicit?

O.k. When I was looking I hadn't gotten that post yet.

The name pre_arch_setup_hook is my best example; it seems to
answer nothing.

And ARCH_SETUP looks nasty.

>
> > And of course you don't look at allowing different firmware
> > implementations, but I'm doing that, so it is covered. :)
>
> actually, I've silently ignored all the boot problems as well.

Do you have boot problems on the NCR voyagers? If so I'd be
interested in hearing what the issues are.

Eric

2002-04-16 23:27:35

by James Bottomley

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

[email protected] said:
> Yes. I'm totally for the ability to select from config.in. But at
> the same time having being able to build a kernel that works in all
> kinds of configurations comes in quite handy. I know the alpha does
> this I'm not quite certain about ARM.

The alpha uses a machine type function table switch to achieve this. It's
certainly possible, just slightly more than I bargained for.
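
A rough sketch of the function-table idea, under the assumption that each
machine type fills in one table and the kernel picks a table by name at boot.
The struct machine_ops type, its single low_mem_end hook, and select_machine
are hypothetical names for illustration; the alpha code is considerably
richer than this:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-machine operations table. */
struct machine_ops {
    const char *name;
    unsigned long (*low_mem_end)(void); /* end of conventional memory */
};

static unsigned long pc_low_mem_end(void)      { return 0xa0000UL; }
static unsigned long voyager_low_mem_end(void) { return 0x93000UL; }

/* One entry per supported machine type; entry 0 is the default. */
static const struct machine_ops machines[] = {
    { "pc",      pc_low_mem_end },
    { "voyager", voyager_low_mem_end },
};

/* Pick a table by name (e.g. from a command line option); fall back to pc. */
const struct machine_ops *select_machine(const char *name)
{
    size_t i;

    if (name != NULL) {
        for (i = 0; i < sizeof(machines) / sizeof(machines[0]); i++)
            if (strcmp(machines[i].name, name) == 0)
                return &machines[i];
    }
    return &machines[0];
}

/* Generic code calls through the table rather than using #ifdefs. */
unsigned long machine_low_mem_end(const struct machine_ops *m)
{
    assert(m != NULL && m->low_mem_end != NULL);
    return m->low_mem_end();
}
```

The cost compared with a config.in selection is that every machine's code is
linked in and each hook becomes an indirect call.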

The issue will become more interesting with Patrick's cpu/bus/mtrr switch,
where self configuration does become more of an issue. Can I just wait to see
what he comes up with and then copy it?

> Do you have boot problems on the NCR voyagers? If so I'd be
> interested in hearing what the issues are.

The 8 byte GDT alignment requirement in boot/setup.S was the biggest problem
(until I found it empirically). If that's not done, they crash when jumping to
protected mode.

Not all boot managers work on voyager: grub and syslinux don't, lilo does (for
now) but complains that EBDA is too big.

I think it's because they actually have a larger than 384k hole (low memory
seems to end at 588k instead of 640k), but I was just so relieved to get them
to boot finally that I've never explored the problems in detail.

This is the actual memory map:

BIOS-provided physical RAM map:
Voyager-SUS: 0000000000000000 - 0000000000093000 (usable)
                                           ^^^^^ usually around 9fffff
Voyager-SUS: 0000000000100000 - 000000003ffff000 (usable)
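
The arithmetic behind the 588k figure can be checked with a small sketch.
The addresses come from the map above; the macro and function names are
made up for illustration:

```c
#include <assert.h>

#define KIB 1024UL

/* Boundaries taken from the memory map above. */
#define VOYAGER_LOW_END 0x93000UL   /* end of low memory on voyager */
#define PC_LOW_END      0xa0000UL   /* conventional 640K boundary */
#define HIGH_START      0x100000UL  /* extended memory starts at 1M */

/* Size in KiB of the gap between low memory and extended memory. */
unsigned long hole_kib(unsigned long low_end, unsigned long high_start)
{
    assert(low_end <= high_start);
    return (high_start - low_end) / KIB;
}
```

Here VOYAGER_LOW_END / KIB is 588, hole_kib(PC_LOW_END, HIGH_START) is 384,
and hole_kib(VOYAGER_LOW_END, HIGH_START) is 436, so the voyager hole is 52K
larger than the standard one.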

James


2002-04-16 23:45:15

by H. Peter Anvin

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

Followup to: <[email protected]>
By author: James Bottomley <[email protected]>
In newsgroup: linux.dev.kernel
>
> Not all boot managers work on voyager: grub and syslinux don't, lilo does (for
> now) but complains that EBDA is too big.
>

If syslinux doesn't work, it's a bug.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-04-17 00:57:44

by Keith Owens

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

On Tue, 16 Apr 2002 09:46:09 -0700 (PDT),
Patrick Mochel <[email protected]> wrote:
>One issue that I encountered along the way was arch/i386/kernel/Makefile.
>I found that you can't easily build multiple targets in the same
>directory, and have dependencies for one target in subdirectories.
>Typically, target objects have one or the other.
>
>In order to make this work, I had to do:
>
>-all: kernel.o head.o init_task.o
>+all: first_rule kernel.o head.o init_task.o
>
>...
>
>+kernel-subdir-$(CONFIG_PCI) += pci
>+subdir-y := $(kernel-subdir-y)
>+obj-y += $(foreach dir,$(subdir-y),$(dir)/$(dir).o)
>
>
>The last part is decent, but the explicit dependency on the first_rule
>target is kinda gross. Is there a better way to do this? Will kbuild 2.5
>make this nicer?

Much nicer.

arch/i386/kernel/Makefile.in

link_subdirs(pci ...)
select(head.o init_task.o)

arch/i386/kernel/pci/Makefile.in

select(foo.o bar.o)

2002-04-17 07:07:21

by Eric W. Biederman

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

James Bottomley <[email protected]> writes:

> [email protected] said:
> > Yes. I'm totally for the ability to select from config.in. But at
> > the same time having being able to build a kernel that works in all
> > kinds of configurations comes in quite handy. I know the alpha does
> > this I'm not quite certain about ARM.
>
> The alpha uses a machine type function table switch to achieve this. It's
> certainly possible, just slightly more than I bargained for.
>
> The issue will become more interesting with Patrick's cpu/bus/mtrr switch,
> where self configuration does become more of an issue. Can I just wait to see
> what he comes up with and then copy it?

Sounds reasonable. What I care about is that we have the goals straight at least.

> > Do you have boot problems on the NCR voyagers? If so I'd be
> > interested in hearing what the issues are.
>
> The 8 byte GDT alignment requirement in boot/setup.S was the biggest problem
> (until I found it empirically), if that's not done, they crash when jumping to
> protected mode.

It sounds like we may have been getting lucky on that one. I guess an explicit
align directive fixes that.

> Not all boot managers work on voyager: grub and syslinux don't, lilo does (for
> now) but complains that EBDA is too big.

Interesting, so reading this and skimming your patch the voyager BIOS is a
descendant of the XT & AT BIOS. But it is a very weird one.

What was the gate a20 issue you fixed in setup.S?

> I think it's because they actually have a larger than 384k hole (low memory
> seems to end at 588k instead of 640k), but I was just so relieved to get them
> to boot finally that I've never explored the problems in detail.

That could be it. But there have been enough systems with that
problem I would have thought the various bootloaders would have
already handled it. syslinux especially.

> This is the actual memory map:
>
> BIOS-provided physical RAM map:
> Voyager-SUS: 0000000000000000 - 0000000000093000 (usable)
>                                            ^^^^^ usually around 9fffff
> Voyager-SUS: 0000000000100000 - 000000003ffff000 (usable)

Certainly a different one. I find it interesting how none of these
maps reserve the bios interrupt table, or the BIOS data area. Basically
the first 1280 bytes of memory... And they just assume everyone will
know better and not touch them :)
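
For reference, the 1280 byte figure is the real-mode interrupt vector table
(256 vectors of 4 bytes each) followed by the 256 byte BIOS data area at
0x400. A small sketch of that arithmetic, with illustrative names that do not
come from any kernel header:

```c
#include <assert.h>

#define IVT_BASE    0x000UL /* real-mode interrupt vector table */
#define IVT_VECTORS 256UL
#define IVT_ENTRY   4UL     /* 16-bit offset plus 16-bit segment */
#define BDA_SIZE    256UL   /* BIOS data area, directly after the IVT */

/* First address past the interrupt vector table (0x400). */
unsigned long ivt_end(void)
{
    return IVT_BASE + IVT_VECTORS * IVT_ENTRY;
}

/* First address past the BIOS data area (0x500, i.e. 1280). */
unsigned long bda_end(void)
{
    return ivt_end() + BDA_SIZE;
}

/* Nonzero if [start, start + len) touches the IVT or the BDA. */
int overlaps_bios_low_area(unsigned long start, unsigned long len)
{
    assert(start + len >= start); /* no wrap-around */
    return len != 0 && start < bda_end();
}
```

The first "usable" range in the map starts at 0, so it covers both
structures.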

Eric

2002-04-17 16:31:56

by James Bottomley

Subject: Re: [PATCH] i386 arch subdivision into machine types for 2.5.8

> The 8 byte GDT alignment requirement in boot/setup.S was the biggest problem
> (until I found it empirically), if that's not done, they crash when jumping to
> protected mode.

[email protected] said:
> It sounds like we may have been getting lucky on that one. I guess an
> explicit align directive fixes that.

No, most CPUs don't require this alignment. It was only a requirement of the
voyager quad processor cards. I can boot the system on 6 cpus (3 dyads)
perfectly happily with the gdt anywhere. I suspect it's because the Quad
cards use a clever memory cache line invalidation scheme to exchange
interprocessor interrupts, but I've never investigated.
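
A sketch of what the 8 byte requirement means for an address, written in C
rather than as the assembler align directive used in setup.S. The helper
names are illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if align is a power of two (the only case handled here). */
static int is_power_of_two(uintptr_t align)
{
    return align != 0 && (align & (align - 1)) == 0;
}

/* Nonzero if addr is a multiple of align. */
int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(is_power_of_two(align));
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(is_power_of_two(align));
    return (addr + align - 1) & ~(align - 1);
}
```

A GDT placed at an address where is_aligned(addr, 8) is false is what the
quad cards object to; align_up(addr, 8) gives the nearest acceptable
location at or above it.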

> Interesting, so reading this and skimming your patch the voyager BIOS
> is a descendant of the XT & AT BIOS. But it is a very weird one.

Yes, it tries to use the basic AT BIOS sequence. It's weird because the
initial BIOS (actually called SUS) setup is done by a small i386 that's part
of the baseboard (voyagers can actually boot up and tell you they don't have
any CPUs). This CPU does all the peripheral configuration too, so the BIOS
that the real CPUs see is very hacked down.

The only reason they have that much BIOS functionality is because the OS they
boot for reference disc configuration is an ancient version of DOS.

> What was the gate a20 issue, you fixed in setup.S?

Well, the a20 stuff worked pretty much OK until someone re-did the way we
started setting and checking it. All the #defines really do is ignore all the
fancy a20 gate setting stuff and just use the standard one (after all, if I'm
never going to use the code, there's not much point having it in the boot
sequence).


> BIOS-provided physical RAM map:
> Voyager-SUS: 0000000000000000 - 0000000000093000 (usable)
>                                            ^^^^^ usually around 9fffff
> Voyager-SUS: 0000000000100000 - 000000003ffff000 (usable)

[email protected] said:
> Certainly a different one. I find it interesting how none of these
> maps reserve the bios interrupt table, or the BIOS data area.
> Basically the first 1280 bytes of memory... And they just assume
> everyone will know better and not touch them :)

Well, technically the BIOS interrupt table isn't "reserved" memory because you
can and do relocate it and reuse the memory. There's no e820 classification
for "don't mess with this if you want BIOS to work but otherwise you're free
to trash it". "reserved" at least as far as voyager is concerned means "never
ever treat this as ordinary memory".

The missing 0xfff at the top is the memory used to send IPIs (it actually
overlays real memory and you can read and write it as normal memory, but doing
so will just cause havoc with SMP).

James