2006-05-01 12:44:55

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


FYI, even on 2.6.17-rc3 i get the one below. v2.6.17 showstopper i
guess?

Ingo

zone c1e04600 (HighMem):
pfn: 00037d00
zone->zone_start_pfn: 00037e00
zone->spanned_pages: 00007e00
zone->zone_start_pfn + zone->spanned_pages: 0003fc00
------------[ cut here ]------------
kernel BUG at mm/page_alloc.c:526!
invalid opcode: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 1
EIP: 0060:[<c0159183>] Not tainted VLI
EFLAGS: 00010002 (2.6.17-rc3-lockdep #143)
EIP is at __rmqueue+0x98/0xdd
eax: 00000001 ebx: c305d400 ecx: c012ac22 edx: 00000001
esi: c1e046dc edi: 00000008 ebp: f4ea6bd0 esp: f4ea6bb4
ds: 007b es: 007b ss: 0068
Process chroot02 (pid: 9970, threadinfo=f4ea6000 task=f448e030)
Stack: <0>00000000 c1e04600 c3058000 00000100 c6d42e50 c6d42e5c 00000296 f4ea6c20
c01592ec 00000000 f448e608 00000003 c1e04908 00000000 000201d2 c1e04908
00000014 00000000 00000003 00000001 00000001 c1e04600 00000377 00000000
Call Trace:
[<c0104e5c>] show_stack_log_lvl+0x8b/0x95
[<c0104ffc>] show_registers+0x147/0x1ad
[<c010531b>] die+0x179/0x24e
[<c0f48f9f>] do_trap+0x7c/0x96
[<c0105768>] do_invalid_op+0x89/0x93
[<c0104a03>] error_code+0x4f/0x54
[<c01592ec>] get_page_from_freelist+0x124/0x436
[<c015965e>] __alloc_pages+0x60/0x290
[<c016ba71>] alloc_pages_current+0x77/0x7c
[<c0154cec>] page_cache_alloc_cold+0x7f/0x83
[<c015aa4a>] __do_page_cache_readahead+0x9e/0x1b2
[<c015ac6d>] do_page_cache_readahead+0x40/0x4d
[<c0155fa8>] filemap_nopage+0x149/0x30a
[<c016141b>] __handle_mm_fault+0x3e9/0xb18
[<c0f49e4d>] do_page_fault+0x317/0x725
[<c0104a03>] error_code+0x4f/0x54
[<c019f63d>] padzero+0x19/0x28
[<c01a06fc>] load_elf_binary+0x904/0x15de
[<c0181896>] search_binary_handler+0xcf/0x29e
[<c0181bc0>] do_execve+0x15b/0x1fe
[<c01029df>] sys_execve+0x2a/0x6d
[<c0103ddb>] syscall_call+0x7/0xb
Code: 4d e8 29 01 88 d9 d3 e2 89 55 f0 eb 40 d1 6d f0 83 ee 0c 8b 5d ec 4f 6b 45 f0 54 01 c3 8b 45 e8 89 da e8 24 f8 ff ff 85 c0 74 08 <0f> 0b 0e 02 43 8f 04 c1 8b 16 8d 43 4c 89 42 04 89 53 4c 89 70


2006-05-02 06:48:34

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Ingo Molnar <[email protected]> writes:

> FYI, even on 2.6.17-rc3 i get the one below. v2.6.17 showstopper i
> guess?

Did you send a full boot log?

If it's using ACPI NUMA try numa=noacpi - it might be some problem
with the node discovery on your machine.

-Andi

2006-05-02 07:01:29

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Andi Kleen <[email protected]> wrote:

> Ingo Molnar <[email protected]> writes:
>
> > FYI, even on 2.6.17-rc3 i get the one below. v2.6.17 showstopper i
> > guess?
>
> Did you send a full boot log?

yes, in the previous mail, in the same thread. (maybe lkml ate it - it's
an allyesconfig bootup so a large bootlog and a large config) I've also
uploaded them to:

http://redhat.com/~mingo/misc/

debug-pagealloc.patch is the debug patch i made based on Nick's earlier
suggestions.

> If it's using ACPI NUMA try numa=noacpi - it might be some problem
> with the node discovery on your machine.

this is a non-NUMA box (Athlon64 X2 desktop machine).

Ingo

2006-05-02 07:05:35

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 09:06, Ingo Molnar wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > Ingo Molnar <[email protected]> writes:
> >
> > > FYI, even on 2.6.17-rc3 i get the one below. v2.6.17 showstopper i
> > > guess?
> >
> > Did you send a full boot log?
>
> yes, in the previous mail, in the same thread. (maybe lkml ate it - it's
> an allyesconfig bootup so a large bootlog and a large config) I've also
> uploaded them to:
>
> http://redhat.com/~mingo/misc/
>
> debug-pagealloc.patch is the debug patch i made based on Nick's earlier
> suggestions.
>
> > If it's using ACPI NUMA try numa=noacpi - it might be some problem
> > with the node discovery on your machine.
>
> this is a non-NUMA box (Athlon64 X2 desktop machine).

Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
anywhere but some Summit systems (at least every time I tried it it blew up
on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)

If you want NUMA use 64bit.

-Andi

2006-05-02 08:22:40

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Andi Kleen <[email protected]> wrote:

> On Tuesday 02 May 2006 09:06, Ingo Molnar wrote:
> >
> > * Andi Kleen <[email protected]> wrote:
> >
> > > Ingo Molnar <[email protected]> writes:
> > >
> > > > FYI, even on 2.6.17-rc3 i get the one below. v2.6.17 showstopper i
> > > > guess?
> > >
> > > Did you send a full boot log?
> >
> > yes, in the previous mail, in the same thread. (maybe lkml ate it - it's
> > an allyesconfig bootup so a large bootlog and a large config) I've also
> > uploaded them to:
> >
> > http://redhat.com/~mingo/misc/
> >
> > debug-pagealloc.patch is the debug patch i made based on Nick's earlier
> > suggestions.
> >
> > > If it's using ACPI NUMA try numa=noacpi - it might be some problem
> > > with the node discovery on your machine.
> >
> > this is a non-NUMA box (Athlon64 X2 desktop machine).
>
> Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> anywhere but some Summit systems (at least every time I tried it it
> blew up on me and nobody seems to use it regularly). Maybe it would be
> finally time to mark it CONFIG_BROKEN though or just remove it (even
> by design it doesn't work very well)

what you saw before could easily be this particular bug - the zones are
apparently mis-sized (aligned to 1MB while they need to be aligned to
4MB), which causes quick and nasty crashes under light user load.

Ingo
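
The alignment claim can be checked against the numbers in the oops at the top of the thread. The sketch below is only an illustration: it assumes 4 KB pages (so 1 MB is 256 pages and 4 MB is 1024 pages), and the helper name pfn_aligned_to_mb() is hypothetical, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 4 KB pages, as on i386, so one megabyte holds 256 pages. */
#define PAGE_SHIFT	12UL
#define PAGES_PER_MB	(1UL << (20 - PAGE_SHIFT))

/*
 * Hypothetical helper: return true if pfn lies on a boundary of `mb`
 * megabytes. `mb` must be a power of two so that a mask test is valid.
 */
static bool pfn_aligned_to_mb(unsigned long pfn, unsigned long mb)
{
	unsigned long pages = mb * PAGES_PER_MB;

	assert(pages != 0 && (pages & (pages - 1)) == 0);
	return (pfn & (pages - 1)) == 0;
}
```

With the reported zone_start_pfn of 0x37e00, pfn_aligned_to_mb(0x37e00, 1) holds but pfn_aligned_to_mb(0x37e00, 4) does not, which matches the "aligned to 1MB but not 4MB" description of the HighMem zone start.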

2006-05-02 14:02:54

by Martin Bligh

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

> Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> anywhere but some Summit systems (at least every time I tried it it blew up
> on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
> CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)

Bollocks. It works fine, and is tested every single day, on every git
release, and every -mm tree.

M.

2006-05-02 15:04:53

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 16:02, Martin J. Bligh wrote:
> > Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> > anywhere but some Summit systems (at least every time I tried it it blew up
> > on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
> > CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)
>
> Bollocks. It works fine,

On what kind of box? Some summit system, right?

> and is tested every single day, on every git
> release, and every -mm tree.

Well, it doesn't work for Ingo clearly. My own experiences every time
I tried it were similar.

I think I stand by my original statement.

-Andi

2006-05-02 15:18:00

by Martin Bligh

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Andi Kleen wrote:
> On Tuesday 02 May 2006 16:02, Martin J. Bligh wrote:
>
>>>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
>>>anywhere but some Summit systems (at least every time I tried it it blew up
>>>on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
>>>CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)
>>
>>Bollocks. It works fine,
>
> On what kind of box? Some summit system, right?

Summit and NUMA-Q, ie everything we originally created it for.

> Well, it doesn't work for Ingo clearly. My own experiences every time
> I tried it were similar.

What platform?

> I think I stand by my original statement.

If it works fine on some platforms and not on others, I would venture to
suggest it's a platform-specific issue, and marking the whole thing as
CONFIG_BROKEN would be an entirely inappropriate overreaction to what
is probably a simple bug.

M.

2006-05-02 15:45:38

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 17:17, Martin J. Bligh wrote:
> Andi Kleen wrote:
> > On Tuesday 02 May 2006 16:02, Martin J. Bligh wrote:
> >
> >>>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> >>>anywhere but some Summit systems (at least every time I tried it it blew up
> >>>on me and nobody seems to use it regularly). Maybe it would be finally time to mark it
> >>>CONFIG_BROKEN though or just remove it (even by design it doesn't work very well)
> >>
> >>Bollocks. It works fine,
> >
> > On what kind of box? Some summit system, right?
>
> Summit and NUMA-Q, ie everything we originally created it for.

That's my point - it usually crashes everywhere else.

>
> > I think I stand by my original statement.
>
> If it works fine on some platforms and not on others, I would venture to
> suggest it's a platform-specific issue, and marking the whole thing as
> CONFIG_BROKEN would be an entirely inappropriate overreaction to what
> is probably a simple bug.

The problem is that it's not regression tested and quite complex and tends
to break often and stay broken.

If you don't want to mark it CONFIG_BROKEN then i would suggest a panic
early when the system isn't SUMMIT (I think NUMAQ does this already)

Something like the appended patch

-Andi

i386: Panic the system early when a NUMA kernel doesn't run on IBM NUMA

It has been broken forever anywhere else and is not too useful
anyways so best to disable it.

Signed-off-by: Andi Kleen <[email protected]>

Index: linux/arch/i386/kernel/srat.c
===================================================================
--- linux.orig/arch/i386/kernel/srat.c
+++ linux/arch/i386/kernel/srat.c
@@ -327,6 +327,14 @@ int __init get_memcfg_from_srat(void)
int tables = 0;
int i = 0;

+ extern int use_cyclone;
+ if (use_cyclone == 0) {
+ /* Make sure user sees something */
+ static const char s[] __initdata = "Not an IBM x440/NUMAQ. Don't use i386 CONFIG_NUMA anywhere else.";
+ early_printk(s);
+ panic(s);
+ }
+
if (ACPI_FAILURE(acpi_find_root_pointer(ACPI_PHYSICAL_ADDRESSING,
rsdp_address))) {
printk("%s: System description tables not found\n",
Index: linux/arch/i386/Kconfig
===================================================================
--- linux.orig/arch/i386/Kconfig
+++ linux/arch/i386/Kconfig
@@ -517,6 +517,9 @@ config NUMA
depends on SMP && HIGHMEM64G && (X86_NUMAQ || X86_GENERICARCH || (X86_SUMMIT && ACPI))
default n if X86_PC
default y if (X86_NUMAQ || X86_SUMMIT)
+ help
+ NUMA support. Note this only works on IBM x440 or IBM NUMAQ.
+ Don't try to use it anywhere else.

comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
depends on X86_SUMMIT && (!HIGHMEM64G || !ACPI)


2006-05-02 15:47:57

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Martin J. Bligh <[email protected]> wrote:

> >Well, it doesn't work for Ingo clearly. My own experiences every time
> >I tried it were similar.
>
> What platform?

i only booted it on a non-NUMA PC. Most likely the instability is caused
by some sort of zone mis-sizing. (See more details in this same thread.)

Ingo

2006-05-02 16:02:44

by Martin Bligh

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

>>>>Bollocks. It works fine,
>>>
>>>On what kind of box? Some summit system, right?
>>
>>Summit and NUMA-Q, ie everything we originally created it for.
>
> Andi:
>
> That's my point - it usually crashes everywhere else.

Ingo Molnar wrote:
>
> i only booted it on a non-NUMA PC. Most likely the instability is
> caused by some sort of zone mis-sizing. (See more details in this
> same thread.)

Ooooh, on ordinary PCs. that makes more sense.

> The problem is that it's not regression tested and quite complex and
> tends to break often and stay broken.

OK, well the regression testing issue is easily fixed, but whether it's
worth it or not is a different issue. It was originally done for the
distros really, so they could have a single kernel that supported
everything.

> If you don't want to mark it CONFIG_BROKEN then i would suggest a panic
> early when the system isn't SUMMIT (I think NUMAQ does this already)
>
> Something like the appended patch

apw: this was your baby ... what do you want to do with it? Add it
to the automated regression testing, or kill it?

> -Andi
>
> i386: Panic the system early when a NUMA kernel doesn't run on IBM NUMA
>
> It has been broken forever anywhere else and is not too useful
> anyways so best to disable it.
>
> Signed-off-by: Andi Kleen <[email protected]>
>
> Index: linux/arch/i386/kernel/srat.c
> ===================================================================
> --- linux.orig/arch/i386/kernel/srat.c
> +++ linux/arch/i386/kernel/srat.c
> @@ -327,6 +327,14 @@ int __init get_memcfg_from_srat(void)
> int tables = 0;
> int i = 0;
>
> + extern int use_cyclone;
> + if (use_cyclone == 0) {
> + /* Make sure user sees something */
> + static const char s[] __initdata = "Not an IBM x440/NUMAQ. Don't use i386 CONFIG_NUMA anywhere else."
> + early_printk(s);
> + panic(s);
> + }
> +
> if (ACPI_FAILURE(acpi_find_root_pointer(ACPI_PHYSICAL_ADDRESSING,
> rsdp_address))) {
> printk("%s: System description tables not found\n",
> Index: linux/arch/i386/Kconfig
> ===================================================================
> --- linux.orig/arch/i386/Kconfig
> +++ linux/arch/i386/Kconfig
> @@ -517,6 +517,9 @@ config NUMA
> depends on SMP && HIGHMEM64G && (X86_NUMAQ || X86_GENERICARCH || (X86_SUMMIT && ACPI))
> default n if X86_PC
> default y if (X86_NUMAQ || X86_SUMMIT)
> + help
> + NUMA support. Note this only works on IBM x440 or IBM NUMAQ.
> + Don't try to use it anywhere else.
>
> comment "NUMA (Summit) requires SMP, 64GB highmem support, ACPI"
> depends on X86_SUMMIT && (!HIGHMEM64G || !ACPI)

2006-05-02 16:05:20

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 18:02, Martin J. Bligh wrote:

> >
> > i only booted it on a non-NUMA PC. Most likely the instability is
> > caused by some sort of zone mis-sizing. (See more details in this
> > same thread.)
>
> Ooooh, on ordinary PCs. that makes more sense.

It tends to crash on Opteron NUMA systems too.

-Andi

2006-05-02 19:42:48

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Martin J. Bligh <[email protected]> wrote:

> > i only booted it on a non-NUMA PC. Most likely the instability is
> > caused by some sort of zone mis-sizing. (See more details in this
> > same thread.)
>
> Ooooh, on ordinary PCs. that makes more sense.

btw., NUMA emulation on 64-bit had problems too on and off (no problems
currently), so i'm not surprised that 32-bit NUMA has problems on
non-NUMA boxes. It would still be useful to have it, because that way
the NUMA codepaths can be exercised.

Ingo

2006-05-02 19:43:38

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Andi Kleen <[email protected]> wrote:

> i386: Panic the system early when a NUMA kernel doesn't run on IBM
> NUMA

nah! Let's just fix the zone sizing bug ...

Ingo

2006-05-02 19:45:08

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 21:48, Ingo Molnar wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > i386: Panic the system early when a NUMA kernel doesn't run on IBM
> > NUMA
>
> nah! Lets just fix the zone sizing bug ...

The problem is that nobody regression tests it. So even if you fix it
now it will be likely broken again in a few months.

-Andi

2006-05-02 19:55:27

by Eric Dumazet

Subject: [RFC, PATCH] cond_resched() added to close_files()

--- a/kernel/exit.c 2006-05-02 17:31:39.000000000 +0200
+++ b/kernel/exit.c 2006-05-02 17:32:06.000000000 +0200
@@ -445,20 +445,21 @@
set = fdt->open_fds->fds_bits[j++];
while (set) {
if (set & 1) {
struct file * file = xchg(&fdt->fd[i], NULL);
if (file)
filp_close(file, files);
}
i++;
set >>= 1;
}
+ cond_resched();
}
}

struct files_struct *get_files_struct(struct task_struct *task)
{
struct files_struct *files;

task_lock(task);
files = task->files;
if (files)



2006-05-02 19:56:18

by Martin Bligh

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Andi Kleen wrote:
> On Tuesday 02 May 2006 21:48, Ingo Molnar wrote:
>
>>* Andi Kleen <[email protected]> wrote:
>>
>>
>>>i386: Panic the system early when a NUMA kernel doesn't run on IBM
>>>NUMA
>>
>>nah! Lets just fix the zone sizing bug ...
>
>
> The problem is that nobody regression tests it. So even if you fix it
> now it will be likely broken again in a few months.

We can add a box to the test.kernel.org harness easily enough, and
it will show up with an eerie red glow.

M.

2006-05-02 20:00:20

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 21:56, Martin Bligh wrote:
> Andi Kleen wrote:
> > On Tuesday 02 May 2006 21:48, Ingo Molnar wrote:
> >
> >>* Andi Kleen <[email protected]> wrote:
> >>
> >>
> >>>i386: Panic the system early when a NUMA kernel doesn't run on IBM
> >>>NUMA
> >>
> >>nah! Lets just fix the zone sizing bug ...
> >
> >
> > The problem is that nobody regression tests it. So even if you fix it
> > now it will be likely broken again in a few months.
>
> We can add a box to the test.kernel.org harness easily enough, and
> it will show up with an eerie red glow.

Single box is not enough - there are many possible combinations
(e.g. Opteron NUMA, IBM NUMA, no NUMA small box, big box with weird
mappings etc.). Basically you would need a real tester base.

-Andi

2006-05-02 20:09:16

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Andi Kleen <[email protected]> wrote:

> > > The problem is that nobody regression tests it. So even if you fix it
> > > now it will be likely broken again in a few months.
> >
> > We can add a box to the test.kernel.org harness easily enough, and
> > it will show up with an eerie red glow.
>
> Single box is not enough - there are many possible combinations (e.g.
> Opteron NUMA, IBM NUMA, no NUMA small box, big box with weird mappings
> etc.). Basically you would need a real tester base.

nah. And the fact that i could boot this on a non-NUMA box already
unearthed a weakness in the buddy allocator. (it should have much
clearer asserts about mis-sized zones - it's not the first time we had
them and they are hard to debug) So consider this a debugging feature.
It also found other bugs, so even if nobody but me uses it, it's useful.

ingo

2006-05-02 20:12:20

by Andi Kleen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Tuesday 02 May 2006 22:13, Ingo Molnar wrote:

> nah. And the fact that i could boot this on a non-NUMA box already
> unearthed a weakness in the buddy allocator. (it should have much
> clearer asserts about mis-sized zones - it's not the first time we had
> them and they are hard to debug)

GIGO.

> So consider this a debugging feature.
> It also found other bugs, so even if nobody but me uses it, it's useful.

It's an awful lot of ugly code for a debugging feature.

Also I never considered i386 NUMA to be particularly interesting
because it doesn't work for the kernel lowmem which is always on node 0.
So no matter what you try you have a nasty hotspot on node 0's memory.

-Andi

2006-05-02 23:36:29

by Nick Piggin

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Martin J. Bligh wrote:
>> Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
>> anywhere but some Summit systems (at least every time I tried it it
>> blew up on me and nobody seems to use it regularly). Maybe it would be
>> finally time to mark it CONFIG_BROKEN though or just remove it (even
>> by design it doesn't work very well)
>
>
> Bollocks. It works fine, and is tested every single day, on every git
> release, and every -mm tree.

Whatever the case, there definitely does not appear to be sufficient
zone alignment enforced for the buddy allocator. I cannot see how it
could work if zones are not aligned on 4MB boundaries.

Maybe some architectures / subarch code naturally does this for us,
but Ingo is definitely hitting this bug because his config does not
(align, that is).

I've randomly added a couple more cc's.

--
SUSE Labs, Novell Inc.

2006-05-03 06:57:04

by Ingo Molnar

Subject: Re: [RFC, PATCH] cond_resched() added to close_files()


* Eric Dumazet <[email protected]> wrote:

> This patch makes sure a cond_resched() call is done every 32 (or 64)
> files closed. This also helps reducing number of files waiting in RCU
> queues for final freeing as call_rcu() might have called
> force_quiescent_state()

the -rt tree already has this latency breaker (and had it for a long
time), it just somehow didnt get pushed upstream.

> Signed-off-by: Eric Dumazet <[email protected]>

Acked-by: Ingo Molnar <[email protected]>

Ingo
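
The shape of the latency breaker discussed here, one yield per bitmap word (so every 32 or 64 descriptors depending on word size), can be sketched in userspace as follows. This is a simplified model, not the kernel code: struct close_stats and close_all() are hypothetical names, and the counters merely mark the points where filp_close() and cond_resched() would run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the sketch. */
struct close_stats {
	size_t closed;	/* points where filp_close() would have run */
	size_t yields;	/* points where cond_resched() would have run */
};

/*
 * Walk an open-descriptor bitmap one word at a time. Each set bit is
 * "closed"; after every word, whether or not it had bits set, the loop
 * reaches one reschedule point.
 */
static struct close_stats close_all(const unsigned long *open_bits,
				    size_t nwords)
{
	struct close_stats st = { 0, 0 };

	assert(open_bits != NULL || nwords == 0);
	for (size_t j = 0; j < nwords; j++) {
		unsigned long set = open_bits[j];

		while (set) {
			if (set & 1)
				st.closed++;
			set >>= 1;
		}
		st.yields++;	/* one yield per bitmap word */
	}
	return st;
}
```

Yielding per word rather than per descriptor keeps the number of reschedule points small while still bounding how long the loop can run without giving up the CPU.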

2006-05-04 01:32:45

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Nick Piggin wrote: [Tue May 02 2006, 10:25:57AM EDT]
> Martin J. Bligh wrote:
> >>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
> >>anywhere but some Summit systems (at least every time I tried it it
> >>blew up on me and nobody seems to use it regularly). Maybe it would be
> >>finally time to mark it CONFIG_BROKEN though or just remove it (even
> >>by design it doesn't work very well)
> >
> >
> >Bollocks. It works fine, and is tested every single day, on every git
> >release, and every -mm tree.
>
> Whatever the case, there definitely does not appear to be sufficient
> zone alignment enforced for the buddy allocator. I cannot see how it
> could work if zones are not aligned on 4MB boundaries.
>
> Maybe some architectures / subarch code naturally does this for us,
> but Ingo is definitely hitting this bug because his config does not
> (align, that is).
>
> I've randomly added a couple more cc's.
>
The patch below isn't compile tested or correct for those cases where
alloc_remap is called or where arch code has allocated node_mem_map for
CONFIG_FLAT_NODE_MEM_MAP. It's just conveying what I believe the issue is.

Andy added code to buddy allocator which doesn't require the zone's endpoints
to be aligned to MAX_ORDER. I think the issue is that the buddy
allocator requires the node_mem_map's endpoints to be MAX_ORDER aligned.
Otherwise __page_find_buddy could compute a buddy not in node_mem_map
for partial MAX_ORDER regions at zone's endpoints. page_is_buddy will
detect that these pages at endpoints aren't PG_buddy (they were zeroed
out by bootmem allocator and not part of zone). Of course the negative
here is we could waste a little memory but the positive is eliminating
all the old checks for zone boundary conditions.

SPARSEMEM won't encounter this issue because of MAX_ORDER size
constraint when SPARSEMEM is configured. ia64 VIRTUAL_MEM_MAP doesn't
need the logic either because the holes and endpoints are handled
differently. This leaves checking alloc_remap and other arches which
privately allocate for node_mem_map.

Anyhow, I could be totally wrong but like I said this requires more
thought.

bob


Index: linux-2.6.17-rc3/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/page_alloc.c 2006-04-27 09:44:02.000000000 -0400
+++ linux-2.6.17-rc3/mm/page_alloc.c 2006-05-03 14:50:13.000000000 -0400
@@ -2123,14 +2123,23 @@ static void __init alloc_node_mem_map(st
#ifdef CONFIG_FLAT_NODE_MEM_MAP
/* ia64 gets its own node_mem_map, before this, without bootmem */
if (!pgdat->node_mem_map) {
- unsigned long size;
+ unsigned long size, start, end;
struct page *map;

- size = (pgdat->node_spanned_pages + 1) * sizeof(struct page);
+ /*
+ * The zone's endpoints aren't required to be MAX_ORDER
+ * aligned but the node_mem_map endpoints must be in order
+ * for the buddy allocator to function correctly.
+ */
+ start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
+ end = start + pgdat->node_spanned_pages;
+ end = (end + ((1 << (MAX_ORDER - 1)) - 1) &
+ ~((1 << (MAX_ORDER - 1)) - 1);
+ size = (end - start) * sizeof(struct page);
map = alloc_remap(pgdat->node_id, size);
if (!map)
map = alloc_bootmem_node(pgdat, size);
- pgdat->node_mem_map = map;
+ pgdat->node_mem_map = map + ( pgdat->node_start_pfn - start);
}
#ifdef CONFIG_FLATMEM
/*

2006-05-04 08:32:32

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Bob Picco <[email protected]> wrote:

> The patch below isn't compile tested or correct for those cases where
> alloc_remap is called or where arch code has allocated node_mem_map
> for CONFIG_FLAT_NODE_MEM_MAP. It's just conveying what I believe the
> issue is.

thx. One pair of parentheses were missing i think - see the delta fix
below. I'll try it.

Ingo

Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -2296,7 +2296,7 @@ static void __init alloc_node_mem_map(st
*/
start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
end = start + pgdat->node_spanned_pages;
- end = (end + ((1 << (MAX_ORDER - 1)) - 1) &
+ end = (end + ((1 << (MAX_ORDER - 1)) - 1)) &
~((1 << (MAX_ORDER - 1)) - 1);
size = (end - start) * sizeof(struct page);
map = alloc_remap(pgdat->node_id, size);

2006-05-04 08:37:50

by Andy Whitcroft

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Bob Picco wrote:
> Nick Piggin wrote: [Tue May 02 2006, 10:25:57AM EDT]
>
>>Martin J. Bligh wrote:
>>
>>>>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
>>>>anywhere but some Summit systems (at least every time I tried it it
>>>>blew up on me and nobody seems to use it regularly). Maybe it would be
>>>>finally time to mark it CONFIG_BROKEN though or just remove it (even
>>>>by design it doesn't work very well)
>>>
>>>
>>>Bollocks. It works fine, and is tested every single day, on every git
>>>release, and every -mm tree.
>>
>>Whatever the case, there definitely does not appear to be sufficient
>>zone alignment enforced for the buddy allocator. I cannot see how it
>>could work if zones are not aligned on 4MB boundaries.
>>
>>Maybe some architectures / subarch code naturally does this for us,
>>but Ingo is definitely hitting this bug because his config does not
>>(align, that is).
>>
>>I've randomly added a couple more cc's.
>>
>
> The patch below isn't compile tested or correct for those cases where
> alloc_remap is called or where arch code has allocated node_mem_map for
> CONFIG_FLAT_NODE_MEM_MAP. It's just conveying what I believe the issue is.
>
> Andy added code to buddy allocator which doesn't require the zone's endpoints
> to be aligned to MAX_ORDER. I think the issue is that the buddy
> allocator requires the node_mem_map's endpoints to be MAX_ORDER aligned.
> Otherwise __page_find_buddy could compute a buddy not in node_mem_map
> for partial MAX_ORDER regions at zone's endpoints. page_is_buddy will
> detect that these pages at endpoints aren't PG_buddy (they were zeroed
> out by bootmem allocator and not part of zone). Of course the negative
> here is we could waste a little memory but the positive is eliminating
> all the old checks for zone boundary conditions.

Yes this is correct. The buddy location algorithm uses the relative pfn
number to locate the buddy. Both the old and new free detect
algorithms require a struct page exist for that buddy regardless of
whether the page exists in memory. Thus the node_mem_map needs to exist
out to a MAX_ORDER boundary in both directions. With real machines we
would likely get this as memory is mostly in larger chunks than
MAX_ORDER and generally maximally aligned. From what I can see we could
potentially not be allocating the end correctly but the remap allocation
would likely be larger than the request so we'd get away with it.

>
> SPARSEMEM won't encounter this issue because of MAX_ORDER size
> constraint when SPARSEMEM is configured. ia64 VIRTUAL_MEM_MAP doesn't
> need the logic either because the holes and endpoints are handled
> differently. This leaves checking alloc_remap and other arches which
> privately allocate for node_mem_map.
>
> Any how I could be totally wrong but like I said this requires more
> thought.

I'll have a go at testing this to see what difference it makes. I think
there might be a problem with the buddy merging at zone boundaries as we
are anyhow (or the code commentary is incomplete), so am going to have a
look at that at the same time.

-apw
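
The buddy lookup described above can be illustrated with a small sketch. It is a simplification that works on absolute pfns rather than on the index relative to a MAX_ORDER block, and it assumes MAX_ORDER is 11 (largest order 10, i.e. 4 MB with 4 KB pages); buddy_pfn() is a hypothetical name.

```c
#include <assert.h>

/* Assumption: MAX_ORDER of 11, so valid orders are 0..10. */
#define MAX_ORDER 11

/*
 * Simplified buddy lookup: the buddy of a block of 2^order pages is
 * found by flipping bit `order` of the block's pfn. Nothing here checks
 * whether the result lies inside the zone or the node_mem_map.
 */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	assert(order < MAX_ORDER);
	return pfn ^ (1UL << order);
}
```

For a block starting at the reported zone_start_pfn of 0x37e00, the order-9 buddy is 0x37c00, which lies below the start of the zone. The arithmetic does not care, so a struct page has to exist at that pfn for the lookup to be safe. This shows the general hazard only; it does not reconstruct the exact path that hit the BUG.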

2006-05-04 09:09:34

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Ingo Molnar <[email protected]> wrote:

> > The patch below isn't compile tested or correct for those cases where
> > alloc_remap is called or where arch code has allocated node_mem_map
> > for CONFIG_FLAT_NODE_MEM_MAP. It's just conveying what I believe the
> > issue is.
>
> thx. One pair of parentheses were missing i think - see the delta fix
> below. I'll try it.

the same easy crash still happens if i enable CONFIG_NUMA:

zone c214e600 (HighMem):
pfn: 00037d00
zone->zone_start_pfn: 00037e00
zone->spanned_pages: 00007e00
zone->zone_start_pfn + zone->spanned_pages: 0003fc00

[<c010574a>] do_invalid_op+0x63/0x93
[<c0104a0b>] error_code+0x4f/0x54
[<c0164d48>] get_page_from_freelist+0x13e/0x565
[<c01651dd>] __alloc_pages+0x6e/0x325
[<c017a6c9>] alloc_page_vma+0x80/0x86
[<c016e2ae>] __handle_mm_fault+0x1e7/0xd00
[<c10fe9af>] do_page_fault+0x339/0x7c5
[<c0104a0b>] error_code+0x4f/0x54

see the debug patch below.

Ingo

----
From: Ingo Molnar <[email protected]>

do buddy zone size checks unconditionally.

Signed-off-by: Ingo Molnar <[email protected]>

----

mm/page_alloc.c | 31 ++++++++++++++++++++++++-------
1 files changed, 24 insertions(+), 7 deletions(-)

Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c
+++ linux/mm/page_alloc.c
@@ -101,17 +101,32 @@ static int page_outside_zone_boundaries(
ret = 1;
} while (zone_span_seqretry(zone, seq));

+#define P(x) printk("%s: %08lx\n", #x, x)
+
+ if (ret) {
+ printk("zone %p (%s):\n", zone, zone->name);
+ P(pfn);
+ P(zone->zone_start_pfn);
+ P(zone->spanned_pages);
+ P(zone->zone_start_pfn + zone->spanned_pages);
+ }
+
return ret;
}

static int page_is_consistent(struct zone *zone, struct page *page)
{
-#ifdef CONFIG_HOLES_IN_ZONE
- if (!pfn_valid(page_to_pfn(page)))
+ if (!pfn_valid(page_to_pfn(page))) {
+ printk("BUG: pfn: %08lx, page: %p\n",
+ page_to_pfn(page), page);
+ dump_stack();
return 0;
-#endif
- if (zone != page_zone(page))
+ }
+ if (zone != page_zone(page)) {
+ printk("zone: %p != %p == page_zone(%p)\n",
+ zone, page_zone(page), page);
return 0;
+ }

return 1;
}
@@ -309,10 +324,12 @@ __find_combined_index(unsigned long page
*/
static inline int page_is_buddy(struct page *page, int order)
{
-#ifdef CONFIG_HOLES_IN_ZONE
- if (!pfn_valid(page_to_pfn(page)))
+ if (!pfn_valid(page_to_pfn(page))) {
+ printk("BUG: pfn: %08lx, page: %p, order: %d\n",
+ page_to_pfn(page), page, order);
+ dump_stack();
return 0;
-#endif
+ }

if (PageBuddy(page) && page_order(page) == order) {
BUG_ON(page_count(page) != 0);

2006-05-04 09:21:25

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Ingo Molnar <[email protected]> wrote:

> the same easy crash still happens if i enable CONFIG_NUMA:

btw., with CONFIG_NUMA off i get this warning during bootup:

BUG: pfn: 0003fff0, page: c404d840, order: 4
[<c0104e7f>] show_trace+0xd/0xf
[<c0104e96>] dump_stack+0x15/0x17
[<c0163312>] free_pages_bulk+0x207/0x370
[<c01642f9>] free_hot_cold_page+0x127/0x17c
[<c016438d>] free_hot_page+0xa/0xc
[<c01643e5>] __free_pages+0x56/0x6f
[<c0172e14>] __vunmap+0xc1/0xed
[<c0172f02>] vfree+0x3b/0x3e
[<c0128865>] build_sched_domains+0xaf2/0xcde
[<c0128a6a>] arch_init_sched_domains+0x19/0x1b
[<c1bd3a67>] sched_init_smp+0x18/0x349
[<c01003c6>] init+0xb9/0x2cb
[<c0102005>] kernel_thread_helper+0x5/0xb

but this is nonfatal and the system is robust afterwards. (this warning
is not present if CONFIG_NUMA is on) [Btw., in the NUMA test i also had
CONFIG_MIGRATION enabled.]

Ingo

2006-05-04 15:22:06

by Dave Hansen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

I haven't thought through it completely, but these two lines worry me:

> + start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> + end = start + pgdat->node_spanned_pages;

Should the "end" be based off of the original "start", or the aligned
"start"?

(using decimal math to make it easy) ...

Let's say that MAX_ORDER comes out to be 10 pages. node_start_pfn is 9,
and the node's end pfn is 21. node_spanned_pages will be 12. "start"
will get rounded down to 0. "end" will be "start" (0) +
node_spanned_pages (12), so 12. "end" then gets rounded up to 20.
However, this is not sufficient space for the mem_map as the node
*actually* ended at 21.

I think that "end" needs to be calculated without rounding down the
start_pfn, or the node_spanned_pages number needs to be rounded up in
the same way that "end" is.
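
The arithmetic can be checked in user space. What follows is only a sketch: it assumes a power-of-two MAX_ORDER_NR_PAGES of 1024 (MAX_ORDER = 11) instead of the decimal 10 used above, and the helper names and pfn values are made up for illustration, not taken from the patch.

```c
/* Assumed value: MAX_ORDER = 11, so a maximal buddy block is 1024 pages. */
#define MAX_ORDER_NR_PAGES 1024UL

/* Round down to a power-of-two boundary. */
static unsigned long round_down_to(unsigned long x, unsigned long a)
{
	return x & ~(a - 1);
}

/* Round up to a power-of-two boundary. */
static unsigned long round_up_to(unsigned long x, unsigned long a)
{
	return (x + a - 1) & ~(a - 1);
}

/* "end" derived from the already rounded-down start, then rounded up. */
static unsigned long map_end_from_aligned_start(unsigned long start_pfn,
						unsigned long spanned)
{
	unsigned long start = round_down_to(start_pfn, MAX_ORDER_NR_PAGES);

	return round_up_to(start + spanned, MAX_ORDER_NR_PAGES);
}

/* "end" derived from the real node start, then rounded up. */
static unsigned long map_end_from_node_start(unsigned long start_pfn,
					     unsigned long spanned)
{
	return round_up_to(start_pfn + spanned, MAX_ORDER_NR_PAGES);
}
```

Take a hypothetical node with node_start_pfn = 1000 and node_spanned_pages = 1100, so it really ends at pfn 2100. The first helper returns 2048, which stops short of 2100; the second returns 3072, which covers the whole node.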

Does that sound right?

Also, it might look nicer if there was an intermediate variable
something like this:

#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))

Take a look at the loop below, I've also used ALIGN() from kernel.h for
the "end" alignment. I think it is just a drop-in replacement.

	/* ia64 gets its own node_mem_map, before this, without bootmem */
	if (!pgdat->node_mem_map) {
		unsigned long size, start, end;
		struct page *map;

		/*
		 * The zone's endpoints aren't required to be MAX_ORDER
		 * aligned but the node_mem_map endpoints must be in order
		 * for the buddy allocator to function correctly.
		 */
		start = pgdat->node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);
		end = start + pgdat->node_spanned_pages;
		end = ALIGN(end, MAX_ORDER_NR_PAGES);
		size = (end - start) * sizeof(struct page);
		map = alloc_remap(pgdat->node_id, size);
		if (!map)
			map = alloc_bootmem_node(pgdat, size);
		pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
	}

-- Dave

2006-05-04 15:46:57

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
> I haven't thought through it completely, but these two lines worry me:
>
> > + start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> > + end = start + pgdat->node_spanned_pages;
>
> Should the "end" be based off of the original "start", or the aligned
> "start"?
Yes. I failed to quilt refresh before sending. You mean end should be
end = pgdat->node_start_pfn + pgdat->node_spanned_pages before rounding
up.
>
> (using decimal math to make it easy) ...
>
> Let's say that MAX_ORDER comes out to be 10 pages. node_start_pfn is 9,
> and the node's end pfn is 21. node_spanned_pages will be 12. "start"
> will get rounded down to 0. "end" will be "start" (0) +
> node_spanned_pages (12), so 12. "end" then gets rounded up to 20.
> However, this is not sufficient space for the mem_map as the node
> *actually* ended at 21.
>
> I think that "end" needs to be calculated without rounding down the
> start_pfn, or the node_spanned_pages number needs to be rounded up in
> the same way that "end" is.
>
> Does that sound right?
Yes.
>
> Also, it might look nicer if there was an intermediate variable
> something like this:
>
> #define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))
Yes.
>
> Take a look at the loop below, I've also used ALIGN() from kernel.h for
> the "end" alignment. I think it is just a drop-in replacement.
>
> /* ia64 gets its own node_mem_map, before this, without bootmem */
> if (!pgdat->node_mem_map) {
> unsigned long size, start, end;
> struct page *map;
>
> /*
> * The zone's endpoints aren't required to be MAX_ORDER
> * aligned but the node_mem_map endpoints must be in order
> * for the buddy allocator to function correctly.
> */
> start = pgdat->node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);
end = pgdat->node_start_pfn + pgdat->node_spanned_pages;
> end = start + pgdat->node_spanned_pages;
> end = ALIGN(end, MAX_ORDER_NR_PAGES);
> size = (end - start) * sizeof(struct page);
> map = alloc_remap(pgdat->node_id, size);
> if (!map)
> map = alloc_bootmem_node(pgdat, size);
> pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
> }
>
> -- Dave
bob
>

2006-05-04 16:08:06

by Dave Hansen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Thu, 2006-05-04 at 11:46 -0400, Bob Picco wrote:
> Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
> > I haven't thought through it completely, but these two lines worry me:
> >
> > > + start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> > > + end = start + pgdat->node_spanned_pages;
> >
> > Should the "end" be based off of the original "start", or the aligned
> > "start"?
> Yes. I failed to quilt refresh before sending. You mean end should be
> end = pgdat->node_start_pfn + pgdat->node_spanned_pages before rounding
> up.

Yep. Looks good.

-- Dave

2006-05-04 19:20:57

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Bob Picco <[email protected]> wrote:

> Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
> > I haven't thought through it completely, but these two lines worry me:
> >
> > > + start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> > > + end = start + pgdat->node_spanned_pages;
> >
> > Should the "end" be based off of the original "start", or the aligned
> > "start"?
>
> Yes. I failed to quilt refresh before sending. You mean end should be
> end = pgdat->node_start_pfn + pgdat->node_spanned_pages before
> rounding up.

do you have an updated patch i should try?

Ingo

2006-05-04 19:43:37

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Ingo Molnar wrote: [Thu May 04 2006, 03:25:28PM EDT]
>
> * Bob Picco <[email protected]> wrote:
>
> > Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
> > > I haven't thought through it completely, but these two lines worry me:
> > >
> > > > + start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> > > > + end = start + pgdat->node_spanned_pages;
> > >
> > > Should the "end" be based off of the original "start", or the aligned
> > > "start"?
> >
> > Yes. I failed to quilt refresh before sending. You mean end should be
> > end = pgdat->node_start_pfn + pgdat->node_spanned_pages before
> > rounding up.
>
> do you have an updated patch i should try?
>
> Ingo
You can try this but don't believe it will change your outcome. I've
booted this on ia64 with slight modification to eliminate
VIRTUAL_MEM_MAP and have only DISCONTIGMEM. Your case is failing at the
front edge of the zone and not the ending edge which had a flaw in my
first post of the patch. I would have expected the first patch to handle
the front edge correctly.

I don't remember seeing your .config in the thread (or blind and unable
to see it). Would you please send it my way.

I'm also hoping Andy has time to look into this.

bob


Index: linux-2.6.17-rc3/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/page_alloc.c 2006-04-27 09:44:02.000000000 -0400
+++ linux-2.6.17-rc3/mm/page_alloc.c 2006-05-04 13:01:25.000000000 -0400
@@ -2123,14 +2123,22 @@ static void __init alloc_node_mem_map(st
#ifdef CONFIG_FLAT_NODE_MEM_MAP
/* ia64 gets its own node_mem_map, before this, without bootmem */
if (!pgdat->node_mem_map) {
- unsigned long size;
+ unsigned long size, start, end;
struct page *map;

- size = (pgdat->node_spanned_pages + 1) * sizeof(struct page);
+ /*
+ * The zone's endpoints aren't required to be MAX_ORDER
+ * aligned but the node_mem_map endpoints must be in order
+ * for the buddy allocator to function correctly.
+ */
+ start = pgdat->node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);
+ end = pgdat->node_start_pfn + pgdat->node_spanned_pages;
+ end = ALIGN(end, MAX_ORDER_NR_PAGES);
+ size = (end - start) * sizeof(struct page);
map = alloc_remap(pgdat->node_id, size);
if (!map)
map = alloc_bootmem_node(pgdat, size);
- pgdat->node_mem_map = map;
+ pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
}
#ifdef CONFIG_FLATMEM
/*
Index: linux-2.6.17-rc3/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/mmzone.h 2006-04-27 09:44:02.000000000 -0400
+++ linux-2.6.17-rc3/include/linux/mmzone.h 2006-05-04 13:01:39.000000000 -0400
@@ -22,6 +22,7 @@
#else
#define MAX_ORDER CONFIG_FORCE_MAX_ZONEORDER
#endif
+#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))

struct free_area {
struct list_head free_list;

2006-05-04 21:51:12

by Andy Whitcroft

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Bob Picco wrote:
> Ingo Molnar wrote: [Thu May 04 2006, 03:25:28PM EDT]
>
>>* Bob Picco <[email protected]> wrote:
>>
>>
>>>Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
>>>
>>>>I haven't thought through it completely, but these two lines worry me:
>>>>
>>>>
>>>>>+ start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
>>>>>+ end = start + pgdat->node_spanned_pages;
>>>>
>>>>Should the "end" be based off of the original "start", or the aligned
>>>>"start"?
>>>
>>>Yes. I failed to quilt refresh before sending. You mean end should be
>>>end = pgdat->node_start_pfn + pgdat->node_spanned_pages before
>>>rounding up.
>>
>>do you have an updated patch i should try?
>>
>> Ingo
>
> You can try this but don't believe it will change your outcome. I've
> booted this on ia64 with slight modification to eliminate
> VIRTUAL_MEM_MAP and have only DISCONTIGMEM. Your case is failing at the
> front edge of the zone and not the ending edge which had a flaw in my
> first post of the patch. I would have expected the first patch to handle
> the front edge correctly.
>
> I don't remember seeing your .config in the thread (or blind and unable
> to see it). Would you please send it my way.
>
> I'm also hoping Andy has time to look into this.
>
> bob

Yeah, will have a look tomorrow my time. Could you drop me the .config
too? There are definitely some unstated requirements on alignment, which
I was testing today. I presume it's one of those that's being violated.

-apw

2006-05-05 05:13:13

by Ingo Molnar

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA


* Andy Whitcroft <[email protected]> wrote:

> Yeah, will have a look tomorrow my time. Could you drop me the
> .config too? There are definitely some unstated requirements on
> alignment, which I was testing today. I presume it's one of those
> that's being violated.

the config is at:

http://redhat.com/~mingo/misc/config

the bootlog is at:

http://redhat.com/~mingo/misc/crash.log

Ingo

2006-05-05 13:55:15

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Andy Whitcroft wrote: [Thu May 04 2006, 05:50:29PM EDT]
> Bob Picco wrote:
> > Ingo Molnar wrote: [Thu May 04 2006, 03:25:28PM EDT]
> >
> >>* Bob Picco <[email protected]> wrote:
> >>
> >>
> >>>Dave Hansen wrote: [Thu May 04 2006, 11:21:06AM EDT]
> >>>
> >>>>I haven't thought through it completely, but these two lines worry me:
> >>>>
> >>>>
> >>>>>+ start = pgdat->node_start_pfn & ~((1 << (MAX_ORDER - 1)) - 1);
> >>>>>+ end = start + pgdat->node_spanned_pages;
> >>>>
> >>>>Should the "end" be based off of the original "start", or the aligned
> >>>>"start"?
> >>>
> >>>Yes. I failed to quilt refresh before sending. You mean end should be
> >>>end = pgdat->node_start_pfn + pgdat->node_spanned_pages before
> >>>rounding up.
> >>
> >>do you have an updated patch i should try?
> >>
> >> Ingo
> >
> > You can try this but don't believe it will change your outcome. I've
> > booted this on ia64 with slight modification to eliminate
> > VIRTUAL_MEM_MAP and have only DISCONTIGMEM. Your case is failing at the
> > front edge of the zone and not the ending edge which had a flaw in my
> > first post of the patch. I would have expected the first patch to handle
> > the front edge correctly.
> >
> > I don't remember seeing your .config in the thread (or blind and unable
> > to see it). Would you please send it my way.
> >
> > I'm also hoping Andy has time to look into this.
> >
> > bob
>
> Yeah, will have a look tomorrow my time. Could you drop me the .config
> too? There are definitely some unstated requirements on alignment, which
> I was testing today. I presume it's one of those that's being violated.
>
> -apw
I think the problem was my not looking closely at the full email thread.
I finally found time to read the entire thread (found Ingo's config and boot
logs). The patch below should fix Ingo's problem. It's probably only
required for ZONE_HIGHMEM. To be safe, I think we should apply it
generically.

Not only must the node_mem_map array be MAX_ORDER aligned, but the distance
between interior zones covered by node_mem_map must satisfy this alignment
too. In the buddy allocator, before checking for a valid buddy, the buddy
page must also reside in the parent's zone. ZONE_HIGHMEM doesn't satisfy the
zone alignment condition and requires this new check that the parent and its
buddy are within the same zone.

The other possible solution is aligning HIGHMEM zone to satisfy MAX_ORDER.
This I didn't pursue and possibly is what Andy refers to above.

Adding a printk for the line with the zonenum mismatch condition caught two
instances in boot up on my x86 which was configured similarly to Ingo's config.
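
To make the zone-crossing case concrete, here is a small user-space sketch. The zone start and span are copied from Ingo's debug output earlier in the thread; the order-9 block and the helper names are my assumptions, and the buddy is found by flipping bit "order" of the pfn, which I believe matches what __page_find_buddy() does with the page index.

```c
/* Values copied from Ingo's HighMem debug output in this thread. */
#define ZONE_START_PFN	0x37e00UL
#define ZONE_SPANNED	0x07e00UL

/* Buddy of a block at the given order: flip bit "order" of the pfn. */
static unsigned long buddy_pfn(unsigned long pfn, unsigned int order)
{
	return pfn ^ (1UL << order);
}

/* Range check against the zone span, in the spirit of bad_range(). */
static int pfn_in_zone(unsigned long pfn)
{
	return pfn >= ZONE_START_PFN &&
	       pfn < ZONE_START_PFN + ZONE_SPANNED;
}
```

For a free order-9 block at the very first HighMem pfn, buddy_pfn(0x37e00, 9) is 0x37c00, which pfn_in_zone() rejects because it lies below zone_start_pfn. If those two blocks were coalesced, the resulting order-10 block would start at 0x37c00 and would include pfn 0x37d00, the pfn printed in Ingo's output. Whether that is the exact sequence on his machine is a guess.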

bob

Index: linux-2.6.17-rc3/mm/page_alloc.c
===================================================================
--- linux-2.6.17-rc3.orig/mm/page_alloc.c 2006-04-27 09:44:02.000000000 -0400
+++ linux-2.6.17-rc3/mm/page_alloc.c 2006-05-05 07:42:40.000000000 -0400
@@ -280,6 +280,15 @@ __find_combined_index(unsigned long page
return (page_idx & ~(1 << order));
}

+static inline int page_in_zone_hole(struct page *page)
+{
+#ifdef CONFIG_HOLES_IN_ZONE
+ if (!pfn_valid(page_to_pfn(page)))
+ return 1;
+#endif
+ return 0;
+}
+
/*
* This function checks whether a page is free && is the buddy
* we can do coalesce a page and its buddy if
@@ -294,11 +303,6 @@ __find_combined_index(unsigned long page
*/
static inline int page_is_buddy(struct page *page, int order)
{
-#ifdef CONFIG_HOLES_IN_ZONE
- if (!pfn_valid(page_to_pfn(page)))
- return 0;
-#endif
-
if (PageBuddy(page) && page_order(page) == order) {
BUG_ON(page_count(page) != 0);
return 1;
@@ -351,7 +355,11 @@ static inline void __free_one_page(struc
struct page *buddy;

buddy = __page_find_buddy(page, page_idx, order);
- if (!page_is_buddy(buddy, order))
+ if (page_in_zone_hole(buddy))
+ break;
+ else if (page_zonenum(buddy) != page_zonenum(page))
+ break;
+ else if (!page_is_buddy(buddy, order))
break; /* Move the buddy up one level. */

list_del(&buddy->lru);
@@ -2123,14 +2131,22 @@ static void __init alloc_node_mem_map(st
#ifdef CONFIG_FLAT_NODE_MEM_MAP
/* ia64 gets its own node_mem_map, before this, without bootmem */
if (!pgdat->node_mem_map) {
- unsigned long size;
+ unsigned long size, start, end;
struct page *map;

- size = (pgdat->node_spanned_pages + 1) * sizeof(struct page);
+ /*
+ * The zone's endpoints aren't required to be MAX_ORDER
+ * aligned but the node_mem_map endpoints must be in order
+ * for the buddy allocator to function correctly.
+ */
+ start = pgdat->node_start_pfn & ~(MAX_ORDER_NR_PAGES - 1);
+ end = pgdat->node_start_pfn + pgdat->node_spanned_pages;
+ end = ALIGN(end, MAX_ORDER_NR_PAGES);
+ size = (end - start) * sizeof(struct page);
map = alloc_remap(pgdat->node_id, size);
if (!map)
map = alloc_bootmem_node(pgdat, size);
- pgdat->node_mem_map = map;
+ pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
}
#ifdef CONFIG_FLATMEM
/*
Index: linux-2.6.17-rc3/include/linux/mmzone.h
===================================================================
--- linux-2.6.17-rc3.orig/include/linux/mmzone.h 2006-04-27 09:44:02.000000000 -0400
+++ linux-2.6.17-rc3/include/linux/mmzone.h 2006-05-04 13:01:39.000000000 -0400
@@ -22,6 +22,7 @@
#else
#define MAX_ORDER CONFIG_FORCE_MAX_ZONEORDER
#endif
+#define MAX_ORDER_NR_PAGES (1 << (MAX_ORDER - 1))

struct free_area {
struct list_head free_list;

2006-05-05 14:34:10

by Dave Hansen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Fri, 2006-05-05 at 09:55 -0400, Bob Picco wrote:
> - if (!page_is_buddy(buddy, order))
> + if (page_in_zone_hole(buddy))
> + break;
> + else if (page_zonenum(buddy) != page_zonenum(page))
> + break;
> + else if (!page_is_buddy(buddy, order))
> break; /* Move the buddy up one level. */

The page_zonenum() checks look good, but I'm not sure I understand the
page_in_zone_hole() part. If a page is in a hole in a zone, it will
still have a valid mem_map entry, right? It should also never have been
put into the allocator, so it also won't ever be coalesced.

I'm a bit confused. :(

BTW, I like the idea of just aligning HIGHMEM's start because it has no
runtime cost. Buuuuut, it is still just a shift and compare of the two
page->flags, which should already be (or will soon anyway be) in the
cache.

-- Dave

2006-05-05 14:50:22

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Dave Hansen wrote: [Fri May 05 2006, 10:33:10AM EDT]
> On Fri, 2006-05-05 at 09:55 -0400, Bob Picco wrote:
> > - if (!page_is_buddy(buddy, order))
> > + if (page_in_zone_hole(buddy))
> > + break;
> > + else if (page_zonenum(buddy) != page_zonenum(page))
> > + break;
> > + else if (!page_is_buddy(buddy, order))
> > break; /* Move the buddy up one level. */
>
> The page_zonenum() checks look good, but I'm not sure I understand the
> page_in_zone_hole() part. If a page is in a hole in a zone, it will
> still have a valid mem_map entry, right? It should also never have been
> put into the allocator, so it also won't ever be coalesced.
This has always been subtle and not too revealing. It probably should
have a comment. The page_in_zone_hole check is for ia64
VIRTUAL_MEM_MAP. You might compute a page structure which is in a hole not
backed by memory; an unallocated page which covers page structures.
VIRTUAL_MEM_MAP uses a contiguous virtual region with virtual space holes
not backed by memory. Take a look at ia64_pfn_valid.
>
> I'm a bit confused. :(
>
> BTW, I like the idea of just aligning HIGHMEM's start because it has no
> runtime cost. Buuuuut, it is still just a shift and compare of the two
> page->flags, which should already be (or will soon anyway be) in the
> cache.
Yes. I'll defer to Andy whether he wants the zonenum check or to align
HIGHMEM correctly.
>
> -- Dave
>
bob

2006-05-05 14:58:53

by Dave Hansen

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

On Fri, 2006-05-05 at 10:50 -0400, Bob Picco wrote:
> Dave Hansen wrote: [Fri May 05 2006, 10:33:10AM EDT]
> > The page_zonenum() checks look good, but I'm not sure I understand the
> > page_in_zone_hole() part. If a page is in a hole in a zone, it will
> > still have a valid mem_map entry, right? It should also never have been
> > put into the allocator, so it also won't ever be coalesced.
> This has always been subtle and not too revealing. It probably should
> have a comment. The page_in_zone_hole check is for ia64
> VIRTUAL_MEM_MAP. You might compute a page structure which is in a hole not
> backed by memory; an unallocated page which covers page structures.
> VIRTUAL_MEM_MAP uses a contiguous virtual region with virtual space holes
> not backed by memory. Take a look at ia64_pfn_valid.

Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
somewhere. Might be worth at least a comment like this:

+ if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
+ break;
+ else if (page_zonenum(buddy) != page_zonenum(page))
+ break;
+ else if (!page_is_buddy(buddy, order))
break; /* Move the buddy up one level. */

BTW, wasn't the whole idea of discontig to have holes in zones (before
NUMA) without tricks like this? ;)

-- Dave

2006-05-05 15:03:13

by Martin Bligh

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

> Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
> making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
> somewhere. Might be worth at least a comment like this:
>
> + if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
> + break;
> + else if (page_zonenum(buddy) != page_zonenum(page))
> + break;
> + else if (!page_is_buddy(buddy, order))
> break; /* Move the buddy up one level. */
>
> BTW, wasn't the whole idea of discontig to have holes in zones (before
> NUMA) without tricks like this? ;)

Sparsemem should fix this - that was one of the things Andy designed it
for. Then we can remove the virtual memmap stuff (and discontig).
Indeed, I'd hope we're ready to do that real soon now ... has anyone
got an ia64 box that needed virtual memmap that they could test this
on?

M.

2006-05-05 16:18:34

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Dave Hansen wrote: [Fri May 05 2006, 10:57:44AM EDT]
> On Fri, 2006-05-05 at 10:50 -0400, Bob Picco wrote:
> > Dave Hansen wrote: [Fri May 05 2006, 10:33:10AM EDT]
> > > The page_zonenum() checks look good, but I'm not sure I understand the
> > > page_in_zone_hole() part. If a page is in a hole in a zone, it will
> > > still have a valid mem_map entry, right? It should also never have been
> > > put into the allocator, so it also won't ever be coalesced.
> > This has always been subtle and not too revealing. It probably should
> > have a comment. The page_in_zone_hole check is for ia64
> > VIRTUAL_MEM_MAP. You might compute a page structure which is in a hole not
> > backed by memory; an unallocated page which covers page structures.
> > VIRTUAL_MEM_MAP uses a contiguous virtual region with virtual space holes
> > not backed by memory. Take a look at ia64_pfn_valid.
>
> Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
> making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
> somewhere. Might be worth at least a comment like this:
>
> + if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
> + break;
> + else if (page_zonenum(buddy) != page_zonenum(page))
> + break;
> + else if (!page_is_buddy(buddy, order))
> break; /* Move the buddy up one level. */
>
> BTW, wasn't the whole idea of discontig to have holes in zones (before
> NUMA) without tricks like this? ;)
Sure you could boot ia64 with just DISCONTIGMEM and no VIRTUAL_MEM_MAP.
In fact that's exactly what I did to test code added in alloc_node_mem_map.
Unfortunately I was missing 1Gb from free memory after booting. The
missing 1Gb was consumed by reserved page structures :)
>
> -- Dave
>
bob

2006-05-05 16:22:13

by Bob Picco

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Martin J. Bligh wrote: [Fri May 05 2006, 11:03:02AM EDT]
> >Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
> >making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
> >somewhere. Might be worth at least a comment like this:
> >
> >+ if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
> >+ break;
> >+ else if (page_zonenum(buddy) != page_zonenum(page))
> >+ break;
> >+ else if (!page_is_buddy(buddy, order))
> > break; /* Move the buddy up one level. */
> >
> >BTW, wasn't the whole idea of discontig to have holes in zones (before
> >NUMA) without tricks like this? ;)
>
> Sparsemem should fix this - that was one of the things Andy designed it
> for. Then we can remove the virtual memmap stuff (and discontig).
> Indeed, I'd hope we're ready to do that real soon now ... has anyone
> got an ia64 box that needed virtual memmap that they could test this
> on?
>
> M.
I totally agree about SPARSEMEM. I believe most ia64 boxes use
VIRTUAL_MEM_MAP. I only know of Fujitsu and myself that use SPARSEMEM
for ia64 (perhaps Andy too in his testing). Dave and I have advocated its use
more than once.

bob

2006-05-06 10:52:20

by Nick Piggin

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Dave Hansen wrote:

> Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
> making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
> somewhere. Might be worth at least a comment like this:
>
> + if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
> + break;
> + else if (page_zonenum(buddy) != page_zonenum(page))
> + break;
> + else if (!page_is_buddy(buddy, order))
> break; /* Move the buddy up one level. */
>
> BTW, wasn't the whole idea of discontig to have holes in zones (before
> NUMA) without tricks like this? ;)

Yes.

I don't like the patch much, because all that logic should be moved
into page_is_buddy where I put it (surely it is more readable not to
have the checks spilling out -- a page which is not in the correct
zone or is a "hole" is by definition not a buddy, right?)

So, I agree with adding the zone check if any architecture needs it.
But it would be something under CONFIG_HOLES_IN_ZONE, and the arch
needs to *either* align zones correctly (as they've always had to),
or turn this option on.
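
Roughly what I mean, as a user-space toy rather than a patch. The struct and every field in it are hypothetical stand-ins for the real struct page accessors named in the comments, and this sketch ignores the question of which config option each check would sit under.

```c
/* Toy stand-in for struct page; every field here is hypothetical. */
struct toy_page {
	int valid;	/* stands in for pfn_valid(page_to_pfn(page)) */
	int zonenum;	/* stands in for page_zonenum(page) */
	int buddy;	/* stands in for PageBuddy(page) */
	int order;	/* stands in for page_order(page) */
};

/* Every "is this my buddy?" question is answered in one place. */
static int toy_page_is_buddy(const struct toy_page *page,
			     const struct toy_page *buddy, int order)
{
	if (!buddy->valid)			/* hole: nothing behind it */
		return 0;
	if (buddy->zonenum != page->zonenum)	/* lives in another zone */
		return 0;
	return buddy->buddy && buddy->order == order;
}
```

With the checks folded in like this, the merge loop in the caller keeps its single "if (!page_is_buddy(...)) break;" test, and a hole or a page from another zone simply never counts as a buddy.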

Thanks for working at this, everyone.

--
SUSE Labs, Novell Inc.

2006-05-07 13:08:29

by Andy Whitcroft

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Nick Piggin wrote:
> Dave Hansen wrote:
>
>> Ahhh. I hadn't made the ia64 connection. I wonder if it is worth
>> making CONFIG_HOLES_IN_ZONE say ia64 or something about vmem_map in it
>> somewhere. Might be worth at least a comment like this:
>>
>> + if (page_in_zone_hole(buddy)) /* noop on all but ia64 */
>> + break;
>> + else if (page_zonenum(buddy) != page_zonenum(page))
>> + break;
>> + else if (!page_is_buddy(buddy, order))
>> break; /* Move the buddy up one
>> level. */
>>
>> BTW, wasn't the whole idea of discontig to have holes in zones (before
>> NUMA) without tricks like this? ;)
>
>
> Yes.
>
> I don't like the patch much, because all that logic should be moved
> into page_is_buddy where I put it (surely it is more readable not to
> have the checks spilling out -- a page which is not in the correct
> zone or is a "hole" is by definition not a buddy, right?)
>
> So, I agree with adding the zone check if any architecture needs it.
> But it would be something under CONFIG_HOLES_IN_ZONE, and the arch
> needs to *either* align zones correctly (as they've always had to),
> or turn this option on.

I agree that there is no need for these checks to leak out of
page_is_buddy(). If it's not there or in another zone, it's not my buddy.
The allocator loop is nasty enough as it is.

I think we need to do a couple of things:

1) check the alignment of the zones matches the implied alignment
constraints and correct it as we go.
2) optionally allow an architecture to say it's not aligning and doesn't
want to have to align its zone -- providing a config option to add the
zone index checks

I think the latter is valuable for these test builds and potentially for
the embedded side where megabytes mean something.
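
As a rough idea of what is at stake for option 1, here is a user-space sketch of the check and of the cost of correcting a zone start upwards. MAX_ORDER_NR_PAGES = 1024 (MAX_ORDER = 11) is an assumption, and the helper names are invented for illustration.

```c
/* Assumed value: MAX_ORDER = 11, so a maximal buddy block is 1024 pages. */
#define MAX_ORDER_NR_PAGES 1024UL

/* Does the zone start on a maximal buddy block boundary? */
static int zone_start_is_aligned(unsigned long zone_start_pfn)
{
	return (zone_start_pfn & (MAX_ORDER_NR_PAGES - 1)) == 0;
}

/* Pages skipped if the zone start were moved up to the next boundary. */
static unsigned long pages_lost_by_aligning(unsigned long zone_start_pfn)
{
	unsigned long aligned = (zone_start_pfn + MAX_ORDER_NR_PAGES - 1) &
				~(MAX_ORDER_NR_PAGES - 1);

	return aligned - zone_start_pfn;
}
```

For the HighMem zone_start_pfn of 0x37e00 from Ingo's output, zone_start_is_aligned() returns 0 and pages_lost_by_aligning() returns 512, which is 2 MB if pages are 4 KB.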

I'm testing a patch for this at the moment and will drop it out when I'm
done.

-apw

2006-05-07 13:18:42

by Nick Piggin

Subject: Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

Andy Whitcroft wrote:

> I agree that there is no need for these checks to leak out of
> page_is_buddy(). If it's not there or in another zone, it's not my buddy.
> The allocator loop is nasty enough as it is.

OK, glad you agree.

>
> I think we need to do a couple of things:
>
> 1) check the alignment of the zones matches the implied alignment
> constraints and correct it as we go.

Yes. And preferably have checks in the generic page allocator setup
code, so we can do something sane if the arch code gets it wrong.

> 2) optionally allow an architecture to say it's not aligning and doesn't
> want to have to align its zone -- providing a config option to add the
> zone index checks
>
> I think the latter is valuable for these test builds and potentially for
> the embedded side where megabytes mean something.

Yes. Depends whether we fold it under the HOLES_IN_ZONE config. I guess
HOLES_IN_ZONE is potentially quite a bit more expensive than the plain
zone check, so having 2 config options may not be unreasonable.

Also, if the architecture doesn't align the ends of zones, *and* they are
not adjacent to another zone, they need either CONFIG_HOLES_IN_ZONE or
they need to provide dummy 'struct page's that never have PageBuddy set.


>
> I'm testing a patch for this at the moment and will drop it out when I'm
> done.

Great!

--
SUSE Labs, Novell Inc.

2006-05-12 09:48:09

by Andrew Morton

Subject: Re: [RFC, PATCH] cond_resched() added to close_files()

Ingo Molnar <[email protected]> wrote:
>
>
> * Eric Dumazet <[email protected]> wrote:
>
> > This patch makes sure a cond_resched() call is done every 32 (or 64)
> > files closed. This also helps reducing number of files waiting in RCU
> > queues for final freeing as call_rcu() might have called
> > force_quiescent_state()
>
> the -rt tree already has this latency breaker (and had it for a long
> time), it just somehow didnt get pushed upstream.
>

Makes my machine hang early during the startup of init.

The last process to pass through close_files() is `hostname', presumably
parented by init. `hostname' exits then everything stops. init is
left sleeping in select().

All very strange.

#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.17-rc3
# Fri May 12 02:35:30 2006
#
CONFIG_X86_32=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_IKCONFIG=y
# CONFIG_IKCONFIG_PROC is not set
# CONFIG_CPUSETS is not set
# CONFIG_RELAY is not set
CONFIG_INITRAMFS_SOURCE=""
CONFIG_UID16=y
CONFIG_VM86=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_EMBEDDED=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SHMEM=y
CONFIG_SLAB=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_SLOB is not set

#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y

#
# Block layer
#
# CONFIG_LBD is not set
# CONFIG_BLK_DEV_IO_TRACE is not set
# CONFIG_LSF is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Processor type and features
#
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_CMPXCHG64=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_NR_CPUS=8
# CONFIG_SCHED_SMT is not set
CONFIG_SCHED_MC=y
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_X86_REBOOTFIXUPS is not set
# CONFIG_MICROCODE is not set
CONFIG_X86_MSR=y
CONFIG_X86_CPUID=y

#
# Firmware Drivers
#
CONFIG_EDD=y
# CONFIG_DELL_RBU is not set
CONFIG_DCDBAS=m
# CONFIG_NOHIGHMEM is not set
# CONFIG_HIGHMEM4G is not set
CONFIG_HIGHMEM64G=y
CONFIG_PAGE_OFFSET=0xC0000000
CONFIG_HIGHMEM=y
CONFIG_X86_PAE=y
CONFIG_ARCH_FLATMEM_ENABLE=y
CONFIG_ARCH_SPARSEMEM_ENABLE=y
CONFIG_ARCH_SELECT_MEMORY_MODEL=y
CONFIG_SELECT_MEMORY_MODEL=y
CONFIG_FLATMEM_MANUAL=y
# CONFIG_DISCONTIGMEM_MANUAL is not set
# CONFIG_SPARSEMEM_MANUAL is not set
CONFIG_FLATMEM=y
CONFIG_FLAT_NODE_MEM_MAP=y
CONFIG_SPARSEMEM_STATIC=y
CONFIG_SPLIT_PTLOCK_CPUS=4
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
# CONFIG_MTRR is not set
# CONFIG_EFI is not set
CONFIG_IRQBALANCE=y
# CONFIG_REGPARM is not set
CONFIG_SECCOMP=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
CONFIG_PHYSICAL_START=0x100000
# CONFIG_HOTPLUG_CPU is not set

#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_PM_LEGACY=y
# CONFIG_PM_DEBUG is not set

#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_AC=m
CONFIG_ACPI_BATTERY=m
CONFIG_ACPI_BUTTON=m
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_HOTKEY=m
CONFIG_ACPI_FAN=m
CONFIG_ACPI_PROCESSOR=m
CONFIG_ACPI_THERMAL=m
CONFIG_ACPI_ASUS=m
CONFIG_ACPI_IBM=m
# CONFIG_ACPI_IBM_DOCK is not set
CONFIG_ACPI_TOSHIBA=m
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
CONFIG_ACPI_CONTAINER=m

#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set

#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_GOV_POWERSAVE is not set
# CONFIG_CPU_FREQ_GOV_USERSPACE is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_GOV_CONSERVATIVE is not set

#
# CPUFreq processor drivers
#
# CONFIG_X86_ACPI_CPUFREQ is not set
# CONFIG_X86_POWERNOW_K6 is not set
# CONFIG_X86_POWERNOW_K7 is not set
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_GX_SUSPMOD is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_SPEEDSTEP_ICH is not set
# CONFIG_X86_SPEEDSTEP_SMI is not set
CONFIG_X86_P4_CLOCKMOD=m
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
# CONFIG_X86_LONGRUN is not set

#
# shared options
#
CONFIG_X86_SPEEDSTEP_LIB=m

#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
# CONFIG_PCIEPORTBUS is not set
# CONFIG_PCI_MSI is not set
# CONFIG_PCI_DEBUG is not set
CONFIG_ISA_DMA_API=y
CONFIG_ISA=y
CONFIG_EISA=y
# CONFIG_EISA_VLB_PRIMING is not set
CONFIG_EISA_PCI_EISA=y
CONFIG_EISA_VIRTUAL_ROOT=y
CONFIG_EISA_NAMES=y
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set

#
# PCCARD (PCMCIA/CardBus) support
#
# CONFIG_PCCARD is not set

#
# PCI Hotplug Support
#
CONFIG_HOTPLUG_PCI=y
# CONFIG_HOTPLUG_PCI_FAKE is not set
# CONFIG_HOTPLUG_PCI_COMPAQ is not set
# CONFIG_HOTPLUG_PCI_IBM is not set
# CONFIG_HOTPLUG_PCI_ACPI is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
# CONFIG_HOTPLUG_PCI_SHPC is not set

#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y

#
# Networking
#
CONFIG_NET=y

#
# Networking options
#
# CONFIG_NETDEBUG is not set
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_UNIX=y
CONFIG_XFRM=y
# CONFIG_XFRM_USER is not set
CONFIG_NET_KEY=y
CONFIG_INET=y
# CONFIG_IP_MULTICAST is not set
# CONFIG_IP_ADVANCED_ROUTER is not set
CONFIG_IP_FIB_HASH=y
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_XFRM_TUNNEL is not set
# CONFIG_INET_TUNNEL is not set
CONFIG_INET_DIAG=y
CONFIG_INET_TCP_DIAG=y
# CONFIG_TCP_CONG_ADVANCED is not set
CONFIG_TCP_CONG_BIC=y
# CONFIG_IPV6 is not set
# CONFIG_INET6_XFRM_TUNNEL is not set
# CONFIG_INET6_TUNNEL is not set
# CONFIG_NETFILTER is not set

#
# DCCP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_DCCP is not set

#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set

#
# TIPC Configuration (EXPERIMENTAL)
#
# CONFIG_TIPC is not set
# CONFIG_ATM is not set
CONFIG_BRIDGE=y
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set

#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set

#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
# CONFIG_IEEE80211 is not set

#
# Device Drivers
#

#
# Generic Driver Options
#
# CONFIG_STANDALONE is not set
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=y
# CONFIG_DEBUG_DRIVER is not set

#
# Connector - unified userspace <-> kernelspace linker
#
CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y

#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set

#
# Parallel port support
#
# CONFIG_PARPORT is not set

#
# Plug and Play support
#
CONFIG_PNP=y
CONFIG_PNP_DEBUG=y

#
# Protocols
#
CONFIG_ISAPNP=y
CONFIG_PNPBIOS=y
# CONFIG_PNPBIOS_PROC_FS is not set
CONFIG_PNPACPI=y

#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
# CONFIG_BLK_DEV_COW_COMMON is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=16
CONFIG_BLK_DEV_RAM_SIZE=4000
CONFIG_BLK_DEV_INITRD=y
# CONFIG_CDROM_PKTCDVD is not set
# CONFIG_ATA_OVER_ETH is not set

#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y

#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
CONFIG_BLK_DEV_IDETAPE=y
CONFIG_BLK_DEV_IDEFLOPPY=y
# CONFIG_BLK_DEV_IDESCSI is not set
CONFIG_IDE_TASK_IOCTL=y

#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_CS5535 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_IT821X is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set

#
# SCSI device support
#
# CONFIG_RAID_ATTRS is not set
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y

#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=y
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=y
# CONFIG_CHR_DEV_SCH is not set

#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set

#
# SCSI Transport Attributes
#
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_FC_ATTRS=y
# CONFIG_SCSI_ISCSI_ATTRS is not set
# CONFIG_SCSI_SAS_ATTRS is not set

#
# SCSI low-level drivers
#
# CONFIG_ISCSI_TCP is not set
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_MEGARAID_SAS is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=64
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
CONFIG_SCSI_SYM53C8XX_MMIO=y
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLA_FC is not set
# CONFIG_SCSI_LPFC is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set

#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set

#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set

#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
# CONFIG_FUSION_SPI is not set
# CONFIG_FUSION_FC is not set
# CONFIG_FUSION_SAS is not set

#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set

#
# I2O device support
#
# CONFIG_I2O is not set

#
# Network device support
#
CONFIG_NETDEVICES=y
CONFIG_DUMMY=y
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set

#
# ARCnet devices
#
# CONFIG_ARCNET is not set

#
# PHY device support
#
# CONFIG_PHYLIB is not set

#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_CASSINI is not set
CONFIG_NET_VENDOR_3COM=y
# CONFIG_EL1 is not set
# CONFIG_EL2 is not set
# CONFIG_ELPLUS is not set
# CONFIG_EL16 is not set
CONFIG_EL3=m
# CONFIG_3C515 is not set
CONFIG_VORTEX=m
# CONFIG_TYPHOON is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set

#
# Tulip family network device support
#
CONFIG_NET_TULIP=y
# CONFIG_DE2104X is not set
# CONFIG_TULIP is not set
CONFIG_DE4X5=m
# CONFIG_WINBOND_840 is not set
# CONFIG_DM9102 is not set
# CONFIG_ULI526X is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
CONFIG_E100=m
# CONFIG_LNE390 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set

#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_R8169=y
# CONFIG_R8169_NAPI is not set
# CONFIG_SIS190 is not set
# CONFIG_SKGE is not set
# CONFIG_SKY2 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
# CONFIG_BNX2 is not set

#
# Ethernet (10000 Mbit)
#
# CONFIG_CHELSIO_T1 is not set
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set

#
# Token Ring devices
#
# CONFIG_TR is not set

#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set

#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
CONFIG_NETCONSOLE=y
CONFIG_NETPOLL=y
CONFIG_NETPOLL_RX=y
CONFIG_NETPOLL_TRAP=y
CONFIG_NET_POLL_CONTROLLER=y

#
# ISDN subsystem
#
# CONFIG_ISDN is not set

#
# Telephony Support
#
# CONFIG_PHONE is not set

#
# Input device support
#
CONFIG_INPUT=y

#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set

#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set

#
# Hardware I/O ports
#
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
CONFIG_GAMEPORT=y
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_FM801 is not set

#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set

#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
CONFIG_SERIAL_8250_PCI=y
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set

#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
# CONFIG_SERIAL_JSM is not set
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256

#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set

#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set

#
# Ftape, the floppy tape device driver
#
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_CS5535_GPIO is not set
CONFIG_RAW_DRIVER=m
CONFIG_MAX_RAW_DEVS=256
# CONFIG_HPET is not set
# CONFIG_HANGCHECK_TIMER is not set

#
# TPM devices
#
# CONFIG_TCG_TPM is not set
# CONFIG_TELCLOCK is not set

#
# I2C support
#
# CONFIG_I2C is not set

#
# SPI support
#
# CONFIG_SPI is not set
# CONFIG_SPI_MASTER is not set

#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set

#
# Hardware Monitoring support
#
# CONFIG_HWMON is not set
# CONFIG_HWMON_VID is not set

#
# Misc devices
#
# CONFIG_IBM_ASM is not set

#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set

#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set

#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y

#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_VGACON_SOFT_SCROLLBACK is not set
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y

#
# Sound
#
# CONFIG_SOUND is not set

#
# USB support
#
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
CONFIG_USB_ARCH_HAS_EHCI=y
# CONFIG_USB is not set

#
# NOTE: USB_STORAGE enables SCSI, and 'SCSI disk support'
#

#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set

#
# MMC/SD Card support
#
# CONFIG_MMC is not set

#
# LED devices
#
# CONFIG_NEW_LEDS is not set

#
# LED drivers
#

#
# LED Triggers
#

#
# InfiniBand support
#
# CONFIG_INFINIBAND is not set

#
# EDAC - error detection and reporting (RAS) (EXPERIMENTAL)
#
CONFIG_EDAC=y

#
# Reporting subsystems
#
CONFIG_EDAC_DEBUG=y
CONFIG_EDAC_MM_EDAC=y
CONFIG_EDAC_AMD76X=y
CONFIG_EDAC_E7XXX=y
CONFIG_EDAC_E752X=y
CONFIG_EDAC_I82875P=y
CONFIG_EDAC_I82860=y
CONFIG_EDAC_R82600=y
CONFIG_EDAC_POLL=y

#
# Real Time Clock
#
# CONFIG_RTC_CLASS is not set

#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
# CONFIG_EXT2_FS_XIP is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
CONFIG_JBD_DEBUG=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_OCFS2_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_INOTIFY=y
# CONFIG_QUOTA is not set
CONFIG_DNOTIFY=y
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
# CONFIG_FUSE_FS is not set

#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y

#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
CONFIG_NTFS_DEBUG=y
# CONFIG_NTFS_RW is not set

#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_RAMFS=y
# CONFIG_CONFIGFS_FS is not set

#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set

#
# Network File Systems
#
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V3_ACL is not set
CONFIG_NFS_V4=y
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V3_ACL is not set
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_NFS_COMMON=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_RPCSEC_GSS_KRB5=y
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
# CONFIG_9P_FS is not set

#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y

#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
# CONFIG_NLS_CODEPAGE_437 is not set
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ASCII=m
# CONFIG_NLS_ISO8859_1 is not set
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set

#
# Instrumentation Support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
# CONFIG_KPROBES is not set

#
# Kernel hacking
#
# CONFIG_PRINTK_TIME is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_KERNEL=y
CONFIG_LOG_BUF_SHIFT=17
CONFIG_DETECT_SOFTLOCKUP=y
# CONFIG_SCHEDSTATS is not set
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_HIGHMEM=y
CONFIG_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_VM=y
CONFIG_FRAME_POINTER=y
# CONFIG_UNWIND_INFO is not set
CONFIG_FORCED_INLINING=y
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_EARLY_PRINTK is not set
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_STACK_USAGE=y
CONFIG_STACK_BACKTRACE_COLS=2
CONFIG_DEBUG_RODATA=y
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_DOUBLEFAULT=y

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set

#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_TGR192 is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES is not set
# CONFIG_CRYPTO_AES_586 is not set
CONFIG_CRYPTO_CAST5=m
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set

#
# Hardware crypto devices
#
# CONFIG_CRYPTO_DEV_PADLOCK is not set

#
# Library routines
#
CONFIG_CRC_CCITT=m
# CONFIG_CRC16 is not set
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_GENERIC_PENDING_IRQ=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_KTIME_SCALAR=y

2006-05-12 10:21:19

by Ingo Molnar

Subject: Re: [RFC, PATCH] cond_resched() added to close_files()


* Andrew Morton <[email protected]> wrote:

> Makes my machine hang early during the startup of init.
>
> The last process to pass through close_file() is `hostname', presumably
> parented by init. `hostname' exits then everything stops. init is
> left sleeping in select().
>
> All very strange.

weird. This really shouldnt cause a hang - i think there must be a bug
hiding elsewhere, this cond_resched() ought to be fine.

Ingo
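
For readers following the thread, the latency breaker under discussion can be sketched in user space. This is not Eric's patch and not kernel code. It assumes, purely for illustration, that open descriptors are tracked in a bitmap of unsigned long words and that the yield happens once per word, which would give one yield per 32 or 64 descriptors depending on the word size. The names close_all and struct close_stats are hypothetical, clearing a bit stands in for closing a file, and sched_yield() stands in for cond_resched().

```c
#include <assert.h>
#include <limits.h>
#include <sched.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical bookkeeping so a caller can see what the loop did. */
struct close_stats {
    size_t closed;  /* descriptors "closed" (bits cleared) */
    size_t yields;  /* voluntary yields taken */
};

/*
 * Walk a bitmap of open descriptors one word at a time. Every set bit is
 * cleared (the stand-in for closing that file), and the CPU is offered
 * to other tasks once per word, i.e. once per 32 or 64 descriptors.
 */
struct close_stats close_all(unsigned long *open_fds, size_t nwords)
{
    struct close_stats st = { 0, 0 };

    assert(open_fds != NULL || nwords == 0);

    for (size_t w = 0; w < nwords; w++) {
        for (size_t bit = 0; bit < BITS_PER_WORD; bit++) {
            unsigned long mask = 1UL << bit;

            if (open_fds[w] & mask) {
                open_fds[w] &= ~mask;   /* "close" this descriptor */
                st.closed++;
            }
        }
        sched_yield();                  /* user-space stand-in for cond_resched() */
        st.yields++;
    }
    return st;
}
```

The point of batching is that the yield cost is paid once per word rather than once per file, while the time spent without a scheduling point stays bounded by the word size.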

2006-05-12 12:24:27

by Eric Dumazet

Subject: Re: [RFC, PATCH] cond_resched() added to close_files()

Ingo Molnar wrote:
> * Andrew Morton <[email protected]> wrote:
>
>
>> Makes my machine hang early during the startup of init.
>>
>> The last process to pass through close_file() is `hostname', presumably
>> parented by init. `hostname' exits then everything stops. init is
>> left sleeping in select().
>>
>> All very strange.
>>
>
> weird. This really shouldnt cause a hang - i think there must be a bug
> hiding elsewhere, this cond_resched() ought to be fine.
>
> Ingo
>
>
Maybe a process is now woken up by a closed pipe instead of by a SIGCHLD.

Process A is the parent of process B. They share a pipe (A reads from the
pipe, B writes to it).

Before:

Process B exits, "atomically" closing all of its files and sending a
SIGCHLD to its parent.
Process A catches the SIGCHLD.

After:

Process B exits: it closes its files (and the pipe), then reschedules
before the final SIGCHLD.
Process A gets the POLLIN/POLLHUP indication on the pipe -> another code
path is run and may trigger a user-side bug?


Eric
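
The ordering described above can be reproduced in user space. The following is a minimal sketch, not taken from the thread: the function name hup_seen_before_child_exit and the second "gate" pipe are hypothetical, and blocking the child on the gate pipe is only a stand-in for the child being rescheduled between closing its files and exiting. It assumes Linux-style poll() behaviour, where a pipe read end with no remaining writers reports POLLHUP.

```c
#include <assert.h>   /* for callers that assert on the result */
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the parent saw POLLHUP on the shared pipe while the child
 * was still alive (not yet reapable), 0 if not, -1 on setup failure.
 */
int hup_seen_before_child_exit(void)
{
    int data[2], gate[2];
    int status;

    if (pipe(data) < 0)
        return -1;
    if (pipe(gate) < 0) {
        close(data[0]);
        close(data[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(data[0]); close(data[1]);
        close(gate[0]); close(gate[1]);
        return -1;
    }

    if (pid == 0) {                     /* process B: the writer */
        char c;

        close(data[0]);
        close(gate[1]);
        close(data[1]);                 /* pipe closed before the exit */
        if (read(gate[0], &c, 1) < 0)   /* wait until the parent lets us go */
            _exit(1);
        _exit(0);                       /* only now is SIGCHLD generated */
    }

    /* process A: the reader */
    close(data[1]);
    close(gate[0]);

    struct pollfd pfd = { .fd = data[0], .events = POLLIN };
    int n = poll(&pfd, 1, 5000);
    int hup = (n == 1) && (pfd.revents & POLLHUP) != 0;

    /* WNOHANG returns 0 while the child exists but has not exited. */
    int alive = (waitpid(pid, &status, WNOHANG) == 0);

    close(gate[1]);                     /* release the child */
    waitpid(pid, &status, 0);           /* reap it */
    close(data[0]);

    return hup && alive;
}
```

A parent that treats the hangup as "the child is gone" and does not expect to wait for it afterwards would take a different path here than it does when SIGCHLD arrives first.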




2006-05-14 00:29:55

by Lee Revell

Subject: Re: [RFC, PATCH] cond_resched() added to close_files()

On Fri, 2006-05-12 at 12:20 +0200, Ingo Molnar wrote:
> * Andrew Morton <[email protected]> wrote:
>
> > Makes my machine hang early during the startup of init.
> >
> > The last process to pass through close_file() is `hostname', presumably
> > parented by init. `hostname' exits then everything stops. init is
> > left sleeping in select().
> >
> > All very strange.
>
> weird. This really shouldnt cause a hang - i think there must be a bug
> hiding elsewhere, this cond_resched() ought to be fine.
>

Ingo,

Would this be the latency it fixes? (seen with 2.6.16):

preemption latency trace v1.1.5 on 2.6.16
--------------------------------------------------------------------
latency: 9313 us, #12766/12766, CPU#0 | (M:rt VP:0, KP:0, SP:0 HP:0)
-----------------
| task: jackd-26580 (uid:1000 nice:0 policy:1 rt_prio:80)
-----------------

                 _------=> CPU#
                / _-----=> irqs-off
               | / _----=> need-resched
               || / _---=> hardirq/softirq
               ||| / _--=> preempt-depth
               |||| /
               |||||     delay
   cmd     pid ||||| time  |   caller
      \   /    |||||   \   |   /
ldconfig-26779 0d.h5 0us : __trace_start_sched_wakeup (try_to_wake_up)
ldconfig-26779 0d.h5 1us : __trace_start_sched_wakeup <<...>-26580> (13 0)
ldconfig-26779 0d.h. 2us : kill_fasync (snd_pcm_period_elapsed)
ldconfig-26779 0d.h. 3us : snd_emu10k1_voice_intr_ack (snd_emu10k1_interrupt)
ldconfig-26779 0d.h. 5us+: snd_emu10k1_ptr_read (snd_emu10k1_interrupt)
ldconfig-26779 0d.h. 8us+: snd_emu10k1_ptr_read (snd_emu10k1_interrupt)
ldconfig-26779 0d.h1 12us : note_interrupt (__do_IRQ)
ldconfig-26779 0d.h1 13us : end_8259A_irq (__do_IRQ)
ldconfig-26779 0d.h1 13us : enable_8259A_irq (end_8259A_irq)
ldconfig-26779 0dnh2 15us : irq_exit (do_IRQ)
ldconfig-26779 0dn.2 16us < (2097760)
ldconfig-26779 0dn.2 17us : preempt_schedule (find_get_page)
ldconfig-26779 0dn.2 18us : put_page (free_file)
ldconfig-26779 0dn.2 18us : free_file (unmap_vmas)
ldconfig-26779 0dn.2 19us : find_get_page (free_file)
ldconfig-26779 0dn.3 20us : radix_tree_lookup (find_get_page)
ldconfig-26779 0dn.2 20us : preempt_schedule (find_get_page)
ldconfig-26779 0dn.2 21us : put_page (free_file)
ldconfig-26779 0dn.2 22us : free_file (unmap_vmas)
ldconfig-26779 0dn.2 22us : find_get_page (free_file)
ldconfig-26779 0dn.3 23us : radix_tree_lookup (find_get_page)
ldconfig-26779 0dn.2 24us : preempt_schedule (find_get_page)
ldconfig-26779 0dn.2 24us : put_page (free_file)
ldconfig-26779 0dn.2 25us : free_file (unmap_vmas)
ldconfig-26779 0dn.2 26us : find_get_page (free_file)
ldconfig-26779 0dn.3 26us : radix_tree_lookup (find_get_page)
ldconfig-26779 0dn.2 27us : preempt_schedule (find_get_page)

(...)

ldconfig-26779 0dn.2 9256us : put_page (free_file)
ldconfig-26779 0dn.2 9256us : free_file (unmap_vmas)
ldconfig-26779 0dn.2 9257us : find_get_page (free_file)
ldconfig-26779 0dn.3 9258us : radix_tree_lookup (find_get_page)
ldconfig-26779 0dn.2 9259us : preempt_schedule (find_get_page)
ldconfig-26779 0dn.2 9259us : put_page (free_file)
ldconfig-26779 0dn.2 9260us : free_file (unmap_vmas)
ldconfig-26779 0dn.2 9261us : find_get_page (free_file)
ldconfig-26779 0dn.3 9261us : radix_tree_lookup (find_get_page)
ldconfig-26779 0dn.2 9262us : preempt_schedule (find_get_page)
ldconfig-26779 0dn.2 9263us : put_page (free_file)
ldconfig-26779 0dn.1 9263us+: preempt_schedule (unmap_vmas)
ldconfig-26779 0dn.1 9266us : free_pgtables (unmap_region)
ldconfig-26779 0dn.1 9267us : anon_vma_unlink (free_pgtables)
ldconfig-26779 0dn.1 9269us : unlink_file_vma (free_pgtables)
ldconfig-26779 0dn.2 9269us : __remove_shared_vm_struct (unlink_file_vma)
ldconfig-26779 0dn.2 9270us : vma_prio_tree_remove (__remove_shared_vm_struct)
ldconfig-26779 0dn.2 9271us : prio_tree_remove (vma_prio_tree_remove)
ldconfig-26779 0dn.1 9273us : preempt_schedule (unlink_file_vma)
ldconfig-26779 0dn.1 9274us : free_pgd_range (free_pgtables)
ldconfig-26779 0dn.1 9275us : free_page_and_swap_cache (free_pgd_range)
ldconfig-26779 0dn.1 9276us : put_page (free_page_and_swap_cache)
ldconfig-26779 0dn.1 9277us : __page_cache_release (put_page)
ldconfig-26779 0dn.1 9278us : preempt_schedule (__page_cache_release)
ldconfig-26779 0dn.1 9279us : free_hot_page (__page_cache_release)
ldconfig-26779 0dn.1 9279us : free_hot_cold_page (free_hot_page)
ldconfig-26779 0dn.2 9280us : __mod_page_state_offset (free_hot_cold_page)
ldconfig-26779 0dn.1 9281us : preempt_schedule (free_hot_cold_page)
ldconfig-26779 0dn.1 9282us : mod_page_state_offset (free_pgd_range)
ldconfig-26779 0dn.1 9283us : free_page_and_swap_cache (free_pgd_range)
ldconfig-26779 0dn.1 9284us : put_page (free_page_and_swap_cache)
ldconfig-26779 0dn.1 9285us : __page_cache_release (put_page)
ldconfig-26779 0dn.1 9285us : preempt_schedule (__page_cache_release)
ldconfig-26779 0dn.1 9286us : free_hot_page (__page_cache_release)
ldconfig-26779 0dn.1 9287us : free_hot_cold_page (free_hot_page)
ldconfig-26779 0dn.2 9287us : __mod_page_state_offset (free_hot_cold_page)
ldconfig-26779 0dn.1 9288us : preempt_schedule (free_hot_cold_page)
ldconfig-26779 0dn.1 9289us : mod_page_state_offset (free_pgd_range)
ldconfig-26779 0dn.1 9290us : free_page_and_swap_cache (free_pgd_range)
ldconfig-26779 0dn.1 9291us : put_page (free_page_and_swap_cache)
ldconfig-26779 0dn.1 9291us : __page_cache_release (put_page)
ldconfig-26779 0dn.1 9292us : preempt_schedule (__page_cache_release)
ldconfig-26779 0dn.1 9293us : free_hot_page (__page_cache_release)
ldconfig-26779 0dn.1 9293us : free_hot_cold_page (free_hot_page)
ldconfig-26779 0dn.2 9294us : __mod_page_state_offset (free_hot_cold_page)
ldconfig-26779 0dn.1 9295us : preempt_schedule (free_hot_cold_page)
ldconfig-26779 0dn.1 9295us : mod_page_state_offset (free_pgd_range)
ldconfig-26779 0dn.. 9297us : preempt_schedule (unmap_region)
ldconfig-26779 0dn.. 9298us : schedule (preempt_schedule)
ldconfig-26779 0dn.. 9298us : profile_hit (schedule)
ldconfig-26779 0dn.1 9299us+: sched_clock (schedule)
<...>-26580 0d..2 9306us+: __switch_to (schedule)
<...>-26580 0d..2 9309us : schedule <ldconfig-26779> (76 13)
<...>-26580 0d..1 9310us : trace_stop_sched_switched (schedule)
<...>-26580 0d..2 9310us : trace_stop_sched_switched <<...>-26580> (13 0)
<...>-26580 0d..2 9312us : schedule (schedule)

Lee