2006-11-16 04:21:09

by Linus Torvalds

Subject: Linux 2.6.19-rc6


Ok,
there's nothing earth-shattering here (and there shouldn't be), but we've
hopefully made good progress on the regression list (and thanks again to
Adrian Bunk for reminding people, especially when they thought *cough*
that some particular regression had already been fixed)..

So with -rc6, we hopefully should leave the irq-related regressions behind
us. There were issues both with devices that started enabling MSI (which
seem to trigger hardware bugs, although there's also been discussion about
what we should do to make things safer) and with the new genirq layer that
showed problems with edge-triggered irq's (notably legacy ISA interrupts,
or, more commonly these days, the 16-bit PCMCIA interrupts that are
basically just ISA in another formfactor).

Thanks to everybody involved in whittling down that regression list.

Also, apart from the regression tracking, we've had some other updates, e.g.
infiniband and DVB fixes, network driver fixes, some networking fixes etc.

The ShortLog is appended, and gives a mostly readable picture of what has
been going on. But the main thing to take away is: regressions fixed, and
not a whole lot of changes since -rc5 (it may not look that way, but a lot
of these things are essentially one-liners or close to it, so the total
diff between -rc5 and -rc6 is actually just about 5k lines, which is not
a whole lot, considering).

Linus

---
Aaron Durbin (1):
x86-64: Fix partial page check to ensure unusable memory is not being marked usable.

Adrian Bunk (2):
bcm43xx: Add error checking in bcm43xx_sprom_write()
drivers/telephony/ixj: fix an array overrun

Alan Cox (1):
hpt37x: Check the enablebits

Alan Stern (1):
SCSI core: always store >= 36 bytes of INQUIRY data

Alasdair G Kergon (2):
dm: fix find_device race
dm: suspend: fix error path

Alexey Dobriyan (4):
ipmi_si_intf.c: fix "&& 0xff" typos
V4L/DVB (4795): Tda826x: use correct max frequency
V4L/DVB (4818): Flexcop-usb: fix debug printk
pata_artop: fix "& (1 >>" typo

Andi Kleen (6):
Revert "MMCONFIG and new Intel motherboards"
x86-64: Fix PTRACE_[SG]ET_THREAD_AREA regression with ia32 emulation.
x86-64: Handle reserve_bootmem_generic beyond end_pfn
x86: Add acpi_user_timer_override option for Asus boards
x86-64: Fix vgetcpu when CONFIG_HOTPLUG_CPU is disabled
x86-64: Fix race in exit_idle

Andrew Morton (2):
setup_irq(): better mismatch debugging
revert "PCI: quirk for IBM Dock II cardbus controllers"

Arjan van de Ven (1):
Regression in 2.6.19-rc microcode driver

Benjamin Herrenschmidt (2):
[POWERPC] Fix cell "new style" mapping and add debug
powerpc: windfarm shall request it's sub modules

Brian King (1):
libata: Convert from module_init to subsys_initcall

Bryan O'Sullivan (1):
IB/ipath - program intconfig register using new HT irq hook

Chris Lalancette (1):
[NETPOLL]: Compute checksum properly in netpoll_send_udp().

Corey Minyard (3):
IPMI: Clean up the waiting message queue properly on unload
IPMI: retry messages on certain error returns
IPMI: Fix more && typos

Daniel Ritz (1):
fix via586 irq routing for pirq 5

Darrick J. Wong (1):
libata: fix double-completion on error

David Brownell (1):
usb: MAINTAINERS updates

David Chinner (3):
[XFS] Clean up i_flags and i_flags_lock handling.
[XFS] Prevent a deadlock when xfslogd unpins inodes.
[XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h

David Gibson (1):
hugetlb: check for brk() entering a hugepage region

David Miller (1):
pci: don't try to remove sysfs files before they are setup.

David Rientjes (1):
drivers cris: return on NULL dev_alloc_skb()

Eric Dumazet (1):
vmalloc: optimization, cleanup, bugfixes

Eric W. Biederman (4):
sysctl: Undeprecate sys_sysctl
htirq: refactor so we only have one function that writes to the chip
htirq: allow buggy drivers of buggy hardware to write the registers
Use delayed disable mode of ioapic edge triggered interrupts

Franck Bui-Huu (1):
.gitignore: add miscellaneous files

Geoff Levand (1):
[POWERPC] cell: set ARCH_SPARSEMEM_DEFAULT in Kconfig

Herbert Xu (1):
[NET]: Set truesize in pskb_copy

Hermann Pitton (1):
V4L/DVB (4802): Cx88: fix remote control on WinFast 2000XP Expert

Hoang-Nam Nguyen (3):
IB/ehca: Assure 4K alignment for firmware control blocks
IB/ehca: Use named constant for max mtu
IB/ehca: Activate scaling code by default

Hugh Dickins (2):
hugetlb: prepare_hugepage_range check offset too
hugetlb: fix error return for brk() entering a hugepage region

Ian Kent (1):
autofs4: panic after mount fail

J. Bruce Fields (3):
nfsd4: reindent do_open_lookup()
nfsd4: fix open-create permissions
nfsd: fix spurious error return from nfsd_create in async case

Jean Delvare (2):
V4L/DVB (4817): Fix uses of "&&" where "&" was intended
RDMA/amso1100: Fix && typo

Jeff Garzik (1):
[libata] sata_via: fix obvious typo

Jens Axboe (4):
Fix bad data direction in SG_IO
ide-cd: only set rq->errors SCSI style for block pc requests
cciss: fix iostat
cpqarray: fix iostat

Jes Sorensen (1):
mspec driver build fix

Jiri Slaby (2):
[NET]: kconfig, correct traffic shaper
Char: isicom, fix close bug

John Heffner (1):
[TCP]: Don't use highmem in tcp hash size calculation.

John Rose (1):
[POWERPC] pseries: Force 4k update_flash block and list sizes

Jonathan E Brassow (2):
dm: multipath: fix rr_add_path order
dm: raid1: fix waiting for io on suspend

Julian Anastasov (1):
[IPVS]: More endianness fixed.

Kalle Pokki (2):
[POWERPC] CPM_UART: Fix non-console transmit
[POWERPC] CPM_UART: Fix non-console initialisation

KAMEZAWA Hiroyuki (1):
ia64: select ACPI_NUMA if ACPI

Linus Torvalds (6):
Revert "i386: Add MMCFG resources to i386 too"
x86-64: clean up io-apic accesses
x86-64: write IO APIC irq routing entries in correct order
[dvb saa7134] Fix missing 'break' for avermedia card case
Revert "fix Data Acess error in dup_fd"
Linux 2.6.19-rc6

Magnus Damm (1):
x86-64: setup saved_max_pfn correctly (kdump)

Masami Hiramatsu (1):
kretprobe: fix kretprobe-booster to save regs and set status

Mauro Carvalho Chehab (1):
V4L/DVB (4804): Fix missing i2c dependency for saa7110

Michael Buesch (1):
bcm43xx: Drain TX status before starting IRQs

Michael Chan (1):
[TG3]: Fix array overrun in tg3_read_partno().

Nathan Lynch (1):
nvidiafb: fix unreachable code in nv10GetConfig

NeilBrown (2):
md: change ONLINE/OFFLINE events to a single CHANGE event
md: fix sizing problem with raid5-reshape and CONFIG_LBD=n

Nicolas Kaiser (1):
drivers/ide: stray bracket

Oleg Nesterov (1):
A minor fix for set_mb() in Documentation/memory-barriers.txt

[email protected] (3):
V4L/DVB (4814): Remote support for Avermedia 777
V4L/DVB (4815): Remote support for Avermedia A16AR
V4L/DVB (4816): Change tuner type for Avermedia A16AR

Paul Mackerras (1):
[POWERPC] Make sure initrd and dtb sections get into zImage correctly

Pavel Emelianov (1):
Fix misrouted interrupts deadlocks

Peter Zijlstra (1):
bonding: lockdep annotation

Rafael J. Wysocki (1):
md: do not freeze md threads for suspend

Randy Dunlap (1):
com20020 build fix

Roland Dreier (1):
IB/mad: Fix race between cancel and receive completion

Russell King (1):
Fix missing parens in set_personality()

Sharyathi Nagesh (1):
fix Data Acess error in dup_fd

Simon Horman (1):
[IPVS]: Compile fix for annotations in userland.

Stephen Hemminger (1):
[PKT_SCHED] sch_htb: Use hlist_del_init().

Stephen Rothwell (2):
[POWERPC] Add the thread_siblings files to sysfs
[POWERPC] Wire up sys_move_pages

Steve French (4):
[CIFS] NFS stress test generates flood of "close with pending write" messages
[CIFS] Explicitly set stat->blksize
[CIFS] Fix mount failure when domain not specified
[CIFS] Fix minor problem with previous patch

Steven Rostedt (1):
x86-64: shorten the x86_64 boot setup GDT to what the comment says

Steven Whitehouse (1):
[DECNET]: Endianess fixes (try #2)

Takashi Iwai (1):
ALSA: hda-intel - Disable MSI support by default

Tigran Aivazian (1):
Tigran has moved

Tim Shimmin (1):
[XFS] Keep lockdep happy.

Timo Teras (2):
MMC: Poll card status after rescanning cards
MMC: Do not set unsupported bits in OCR response

Tom Tucker (1):
RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device

Vivek Goyal (1):
i386: Force data segment to be 4K aligned

Vlad Apostolov (3):
[XFS] 956618: Linux crashes on boot with XFS-DMAPI filesystem when
[XFS] rename uio_read() to xfs_uio_read()
[XFS] 956664: dm_read_invis() changes i_atime

Wink Saville (1):
Patch for nvidia divide by zero error for 7600 pci-express card


2006-11-16 21:37:24

by Adrian Bunk

Subject: 2.6.19-rc6: known regressions

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter : Ray Lee <[email protected]>
Handled-By : Michael Buesch <[email protected]>
Larry Finger <[email protected]>
Status : problem is being debugged


Subject : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter : David Brownell <[email protected]>
Handled-By : Len Brown <[email protected]>
Alexey Starikovskiy <[email protected]>
Status : problem is being debugged


Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter : Ernst Herzberg <[email protected]>
Handled-By : Len Brown <[email protected]>
Status : problem is being debugged


Subject : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
http://lkml.org/lkml/2006/11/10/208
Submitter : Andre Noll <[email protected]>
Handled-By : Andi Kleen <[email protected]>
Status : Andi is investigating


Subject : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
http://lkml.org/lkml/2006/11/15/92
Submitter : Prakash Punnoor <[email protected]>
Status : problem is being discussed


Subject : x86_64 UP compile error
References : http://lkml.org/lkml/2006/11/16/29
Submitter : Ingo Molnar <[email protected]>
Caused-By : Andi Kleen <[email protected]>
commit 8c131af1db510793f87dc43edbc8950a35370df3
Handled-By : Andi Kleen <[email protected]>
Ingo Molnar <[email protected]>
Patch : http://lkml.org/lkml/2006/11/16/36
Status : patch available


Subject : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
References : http://lkml.org/lkml/2006/11/13/26
Submitter : Dennis Stosberg <[email protected]>
Caused-By : Greg Kroah-Hartman <[email protected]>
commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
Handled-By : Dennis Stosberg <[email protected]>
Patch : http://lkml.org/lkml/2006/11/13/26
Status : patch available


Subject : can't disable OHCI wakeup via sysfs
References : http://lkml.org/lkml/2006/11/11/33
Submitter : Andrey Borzenkov <[email protected]>
Handled-By : Alan Stern <[email protected]>
Patch : http://lkml.org/lkml/2006/11/13/261
Status : patch available

2006-11-16 21:44:35

by Greg KH

Subject: Re: 2.6.19-rc6: known regressions

On Thu, Nov 16, 2006 at 10:37:18PM +0100, Adrian Bunk wrote:
> Subject : aoe: Add forgotten NULL at end of attribute list in aoeblk.c
> References : http://lkml.org/lkml/2006/11/13/26
> Submitter : Dennis Stosberg <[email protected]>
> Caused-By : Greg Kroah-Hartman <[email protected]>
> commit 4ca5224f3ea4779054d96e885ca9b3980801ce13
> Handled-By : Dennis Stosberg <[email protected]>
> Patch : http://lkml.org/lkml/2006/11/13/26
> Status : patch available
>
>
> Subject : can't disable OHCI wakeup via sysfs
> References : http://lkml.org/lkml/2006/11/11/33
> Submitter : Andrey Borzenkov <[email protected]>
> Handled-By : Alan Stern <[email protected]>
> Patch : http://lkml.org/lkml/2006/11/13/261
> Status : patch available

I'll be sending Linus both of these patches later today.

thanks,

greg k-h

2006-11-17 20:40:59

by Adrian Bunk

Subject: 2.6.19-rc6: known regressions (v2)

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter : Thomas Gleixner <[email protected]>
Caused-By : Alan Stern <[email protected]>
commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <[email protected]>
Linus Torvalds <[email protected]>
Status : patches are being discussed


Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter : [email protected]
Caused-By : Alexey Starikovskiy <[email protected]>
commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <[email protected]>
Patch : http://lkml.org/lkml/2006/11/17/236
Status : patch available


Subject : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
http://lkml.org/lkml/2006/11/10/208
Submitter : Andre Noll <[email protected]>
Handled-By : Andi Kleen <[email protected]>
Status : Andi is investigating


Subject : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
http://lkml.org/lkml/2006/11/15/92
Submitter : Prakash Punnoor <[email protected]>
Status : problem is being discussed


Subject : bcm43xx: serious problems
References : http://lkml.org/lkml/2006/11/15/296
Submitter : Ray Lee <[email protected]>
Handled-By : Michael Buesch <[email protected]>
Larry Finger <[email protected]>
Status : problem is being debugged


Subject : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter : David Brownell <[email protected]>
Handled-By : Len Brown <[email protected]>
Alexey Starikovskiy <[email protected]>
Status : problem is being debugged


Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter : Ernst Herzberg <[email protected]>
Handled-By : Len Brown <[email protected]>
Status : problem is being debugged

2006-11-18 04:04:13

by Christian Kujau

Subject: Re: Linux 2.6.19-rc6 - NFSD working again

Hi,

I just wanted to report an 'it works again' for rc6: after encountering
the very same problems with -rc3 Jeff Garzik described in [0], I
upgraded to -rc5 and applied the proposed[1] patch[2].
Now, the knfsd behaved a bit better (nfs-mounted /home, X11
applications created thousands of empty 'configuration'-files),
however 'mkdir' and 'touch' still failed too often:

$ mkdir /mnt/nfs/compile-farm/foo
mkdir: /mnt/nfs/compile-farm/foo: Operation not permitted
$ mkdir /mnt/nfs/compile-farm/foo
mkdir: /mnt/nfs/compile-farm/foo: File exists

...and things like that.

With -rc6 this seems to be gone. However, I noticed this message in the
server's (192.168.10.10) syslog:

nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out
nfs4_cb: server 127.0.1.1/192.168.10.10 AUTH_UNIX 0 not responding, timed out

The NFS server is running on 0.0.0.0:2049, what does this mean?
The message occurs once in a while, not sure what triggers it, found
not much in the archives...

Thanks,
Christian.

[0] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1418.html
[1] http://uwsg.iu.edu/hypermail/linux/kernel/0611.0/1491.html
[2] http://www.citi.umich.edu/projects/nfsv4/linux/kernel-patches/2.6.19-rc3-2/linux-2.6.19-rc3-CITI_NFS4_ALL-2.diff
--
BOFH excuse #106:

The electrician didn't know what the yellow cable was so he yanked the ethernet out.

2006-11-18 08:02:37

by David Rientjes

Subject: [PATCH] mm: do not call bad_page on PG_reserved check

The return value of free_pages_check() indicates if PG_reserved was set.
If so, the calling functions return immediately and no pages are freed so
there is no need to call bad_page().

Cc: Andi Kleen <[email protected]>
Cc: Nick Piggin <[email protected]>
Signed-off-by: David Rientjes <[email protected]>
---
mm/page_alloc.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf2f6cf..99bc29d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -439,7 +439,6 @@ static inline int free_pages_check(struc
 			1 << PG_slab |
 			1 << PG_swapcache |
 			1 << PG_writeback |
-			1 << PG_reserved |
 			1 << PG_buddy ))))
 		bad_page(page);
 	if (PageDirty(page))

2006-11-18 17:14:43

by Hugh Dickins

Subject: Re: [PATCH] mm: do not call bad_page on PG_reserved check

On Sat, 18 Nov 2006, David Rientjes wrote:

> The return value of free_pages_check() indicates if PG_reserved was set.
> If so, the calling functions return immediately and no pages are freed so
> there is no need to call bad_page().
>
> Cc: Andi Kleen <[email protected]>
> Cc: Nick Piggin <[email protected]>
> Signed-off-by: David Rientjes <[email protected]>

NAK. You're missing the point. If an attempt is made to free a
reserved page, it implies that the page reference counting has
gone wrong: we want to hear about that (so call bad_page),
and we dare not reuse the page (so skip freeing it).

What might be a good change, is to avoid freeing a page which meets
_any_ of the criteria for calling bad_page: I often wonder whether
to do that, alongside abandoning that hopeless page_mapcount BUG in
page_remove_rmap, which has almost(?) never helped lead us to any fix.

Hugh

> ---
> mm/page_alloc.c | 1 -
> 1 files changed, 0 insertions(+), 1 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bf2f6cf..99bc29d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -439,7 +439,6 @@ static inline int free_pages_check(struc
>  			1 << PG_slab |
>  			1 << PG_swapcache |
>  			1 << PG_writeback |
> -			1 << PG_reserved |
>  			1 << PG_buddy ))))
>  		bad_page(page);
>  	if (PageDirty(page))
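
To make the objection in this reply concrete, here is a minimal userspace sketch in C of the behaviour Hugh describes: any page in an unexpected state is reported, and a reserved page is additionally kept away from the free lists. This is not the kernel's code. The flag values, the one-field struct page, the bad_page_reports counter and the free_one_page() helper are assumptions invented for illustration; only the names free_pages_check(), bad_page() and the PG_* flags come from the thread.

```c
#include <stdio.h>

/* Hypothetical flag bits; the real PG_* values live in the kernel headers. */
enum {
	PG_slab      = 1u << 0,
	PG_swapcache = 1u << 1,
	PG_writeback = 1u << 2,
	PG_reserved  = 1u << 3,
	PG_buddy     = 1u << 4,
};

/* Toy page descriptor: only the flags word matters for this sketch. */
struct page {
	unsigned int flags;
};

/* Illustrative counter so the effect of bad_page() can be observed. */
static int bad_page_reports;

/* Stand-in for the kernel's bad_page(): log the state and count the report. */
static void bad_page(struct page *page)
{
	bad_page_reports++;
	fprintf(stderr, "Bad page state, flags 0x%x\n", page->flags);
}

/*
 * Report every page that is in an unexpected state, and return nonzero
 * for a reserved page so that the caller skips freeing it.
 */
static int free_pages_check(struct page *page)
{
	if (page->flags & (PG_slab | PG_swapcache | PG_writeback |
			   PG_reserved | PG_buddy))
		bad_page(page);
	return (page->flags & PG_reserved) != 0;
}

/* Hypothetical caller: returns 1 if the page was freed, 0 if it was kept. */
static int free_one_page(struct page *page)
{
	if (free_pages_check(page))
		return 0;	/* we dare not reuse a reserved page */
	return 1;
}
```

In this model, freeing a page with PG_reserved set produces one report and the page is not freed. Dropping PG_reserved from the mask, as the proposed patch does, would still keep the page out of circulation but would lose the report, which is the information Hugh wants to keep.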

2006-11-20 19:53:14

by Adrian Bunk

Subject: 2.6.19-rc6: known regressions (v3)

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter : Vivek Goyal <[email protected]>
Status : unknown


Subject : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
http://lkml.org/lkml/2006/11/10/208
Submitter : Andre Noll <[email protected]>
Handled-By : David Rientjes <[email protected]>
Status : problem is being debugged


Subject : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
http://lkml.org/lkml/2006/11/15/92
Submitter : Prakash Punnoor <[email protected]>
Status : problem is being discussed


Subject : cpufreq notification broken
References : http://lkml.org/lkml/2006/11/16/177
Submitter : Thomas Gleixner <[email protected]>
Caused-By : Alan Stern <[email protected]>
commit b4dfdbb3c707474a2254c5b4d7e62be31a4b7da9
Handled-By : Ingo Molnar <[email protected]>
Linus Torvalds <[email protected]>
Oleg Nesterov <[email protected]>
Paul E. McKenney <[email protected]>
Status : patches are being discussed


Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter : [email protected]
Caused-By : Alexey Starikovskiy <[email protected]>
commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <[email protected]>
Patch : http://lkml.org/lkml/2006/11/17/236
Status : patch available


Subject : nasty ACPI regression, AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter : David Brownell <[email protected]>
Handled-By : Len Brown <[email protected]>
Alexey Starikovskiy <[email protected]>
Status : problem is being debugged


Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter : Ernst Herzberg <[email protected]>
Handled-By : Len Brown <[email protected]>
Status : problem is being debugged


2006-11-21 21:24:28

by Adrian Bunk

Subject: 2.6.19-rc6: known regressions (v4)

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject : kernel hangs when booting with irqpoll
References : http://lkml.org/lkml/2006/11/20/233
Submitter : Vivek Goyal <[email protected]>
Caused-By : Pavel Emelianov <[email protected]>
commit f72fa707604c015a6625e80f269506032d5430dc
Handled-By : Vivek Goyal <[email protected]>
Status : problem is being debugged


Subject : x86_64: Bad page state in process 'swapper'
References : http://lkml.org/lkml/2006/11/10/135
http://lkml.org/lkml/2006/11/10/208
Submitter : Andre Noll <[email protected]>
Handled-By : David Rientjes <[email protected]>
Status : problem is being debugged


Subject : x86_64: oprofile doesn't work
References : http://lkml.org/lkml/2006/10/27/3
http://lkml.org/lkml/2006/11/15/92
Submitter : Prakash Punnoor <[email protected]>
Status : problem is being discussed


Subject : ACPI: AE_TIME errors
References : http://lkml.org/lkml/2006/11/15/12
Submitter : David Brownell <[email protected]>
Handled-By : Len Brown <[email protected]>
Alexey Starikovskiy <[email protected]>
Status : problem is being debugged


Subject : ThinkPad R50p: boot fail with (lapic && on_battery)
References : http://lkml.org/lkml/2006/10/31/333
Submitter : Ernst Herzberg <[email protected]>
Handled-By : Len Brown <[email protected]>
Status : problem is being debugged


Subject : powerpc: serious RTC problems
References : http://lkml.org/lkml/2006/11/17/187
http://lkml.org/lkml/2006/11/18/99
Submitter : Kumar Gala <[email protected]>
Joakim Tjernlund <[email protected]>
Caused-By : Kim Phillips <[email protected]>
commit 7a69af63e788a324d162201a0b23df41bcf158dd
commit a8ed4f7ec3aa472134d7de6176f823b2667e450b
Handled-By : David Brownell <[email protected]>
Kim Phillips <[email protected]>
Patch : http://lkml.org/lkml/2006/11/20/320
http://lkml.org/lkml/2006/11/20/321
Status : patches available


Subject : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter : Randy Dunlap <[email protected]>
Handled-By : Roman Zippel <[email protected]>
Patch : http://lkml.org/lkml/2006/11/20/340
Status : patch available


Subject : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter : Phil Oester <[email protected]>
Caused-By : Sam Ravnborg <[email protected]>
commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <[email protected]>
Patch : http://lkml.org/lkml/2006/11/20/341
Status : patch available


Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
References : http://lkml.org/lkml/2006/11/17/198
Submitter : [email protected]
Caused-By : Alexey Starikovskiy <[email protected]>
commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
Handled-By : Mattia Dongili <[email protected]>
Patch : http://lkml.org/lkml/2006/11/17/236
Status : patch available


2006-11-21 21:32:55

by Dave Jones

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:

> Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
> References : http://lkml.org/lkml/2006/11/17/198
> Submitter : [email protected]
> Caused-By : Alexey Starikovskiy <[email protected]>
> commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> Handled-By : Mattia Dongili <[email protected]>
> Patch : http://lkml.org/lkml/2006/11/17/236
> Status : patch available

not a regression, easily worked around, queued for .20

Dave

--
http://www.codemonkey.org.uk

2006-11-21 21:34:05

by Vivek Goyal

Subject: Re: 2.6.19-rc6: known regressions (v4)

On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.
>
> Due to the huge amount of recipients, please trim the Cc when answering.
>
>
> Subject : kernel hangs when booting with irqpoll
> References : http://lkml.org/lkml/2006/11/20/233
> Submitter : Vivek Goyal <[email protected]>
> Caused-By : Pavel Emelianov <[email protected]>
> commit f72fa707604c015a6625e80f269506032d5430dc
> Handled-By : Vivek Goyal <[email protected]>
> Status : problem is being debugged
>

Adrian,

Pavel already provided a fix for this issue.

http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

Thanks
Vivek

2006-11-21 21:39:03

by Adrian Bunk

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
>
> > Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
> > References : http://lkml.org/lkml/2006/11/17/198
> > Submitter : [email protected]
> > Caused-By : Alexey Starikovskiy <[email protected]>
> > commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> > Handled-By : Mattia Dongili <[email protected]>
> > Patch : http://lkml.org/lkml/2006/11/17/236
> > Status : patch available
>
> not a regression, easily worked around, queued for .20

It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
was merged after 2.6.18.

Considering that the fix is trivial, why shouldn't it be merged before
2.6.19?

> Dave

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-11-21 21:41:41

by Adrian Bunk

Subject: Re: 2.6.19-rc6: known regressions (v4)

On Tue, Nov 21, 2006 at 04:33:35PM -0500, Vivek Goyal wrote:
> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> >
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affected subsystem or driver, a patch
> > of yours caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> >
> > Due to the huge amount of recipients, please trim the Cc when answering.
> >
> >
> > Subject : kernel hangs when booting with irqpoll
> > References : http://lkml.org/lkml/2006/11/20/233
> > Submitter : Vivek Goyal <[email protected]>
> > Caused-By : Pavel Emelianov <[email protected]>
> > commit f72fa707604c015a6625e80f269506032d5430dc
> > Handled-By : Vivek Goyal <[email protected]>
> > Status : problem is being debugged
> >
>
> Adrian,
>
> Pavel already provided a fix for this issue.
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

Thanks for the information, I missed this patch.

> Thanks
> Vivek

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-11-21 21:58:06

by Dave Jones

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Tue, Nov 21, 2006 at 10:39:00PM +0100, Adrian Bunk wrote:
> On Tue, Nov 21, 2006 at 04:31:39PM -0500, Dave Jones wrote:
> > On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> >
> > > Subject : CPU_FREQ_GOV_ONDEMAND=y compile error
> > > References : http://lkml.org/lkml/2006/11/17/198
> > > Submitter : [email protected]
> > > Caused-By : Alexey Starikovskiy <[email protected]>
> > > commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> > > Handled-By : Mattia Dongili <[email protected]>
> > > Patch : http://lkml.org/lkml/2006/11/17/236
> > > Status : patch available
> >
> > not a regression, easily worked around, queued for .20
>
> It is a regression since commit 05ca0350e8caa91a5ec9961c585c98005b6934ea
> was merged after 2.6.18.

Ah, I misinterpreted when that cset went in (I read the commit date
which was back in June, not the merge date, which was September).

> Considering that the fix is trivial, why shouldn't it be merged before
> 2.6.19?

Yes, I'll push it on.

Dave

--
http://www.codemonkey.org.uk

2006-11-21 22:23:00

by Linus Torvalds

Subject: Re: 2.6.19-rc6: known regressions (v4)



On Tue, 21 Nov 2006, Vivek Goyal wrote:

> On Tue, Nov 21, 2006 at 10:24:24PM +0100, Adrian Bunk wrote:
> > This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> >
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affected subsystem or driver, a patch
> > of yours caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> >
> > Due to the huge amount of recipients, please trim the Cc when answering.
> >
> >
> > Subject : kernel hangs when booting with irqpoll
> > References : http://lkml.org/lkml/2006/11/20/233
> > Submitter : Vivek Goyal <[email protected]>
> > Caused-By : Pavel Emelianov <[email protected]>
> > commit f72fa707604c015a6625e80f269506032d5430dc
> > Handled-By : Vivek Goyal <[email protected]>
> > Status : problem is being debugged
> >
>
> Adrian,
>
> Pavel already provided a fix for this issue.
>
> http://marc.theaimsgroup.com/?l=linux-kernel&m=116409933100117&w=2

I really think this is wrong.

The original patch was wrong, and the _real_ problem is in __do_IRQ() that
got the desc->lock too early.

I _think_ the correct fix is to simply revert the broken commit, and fix
the _one_ place that called "note_interrupt()" with the lock held.

Something like this..

I also think that the real fix will be to move the whole

	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);


into handle_IRQ_event itself, since every caller (except for
"misrouted_irq()" itself, and that should probably be done separately)
should always do it. Right now we have a lot of people that just do

	action_ret = handle_IRQ_event(irq, action);
	if (!noirqdebug)
		note_interrupt(irq, desc, action_ret);

explicitly.

The only thing that keeps us from doing that is that we don't pass in
"desc", but we should just do that.

But in the meantime, this appears to be the minimal fix. Can people please
test and verify?

Linus

---
diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 42aa6f1..a681912 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -231,10 +231,10 @@ fastcall unsigned int __do_IRQ(unsigned
 		spin_unlock(&desc->lock);
 
 		action_ret = handle_IRQ_event(irq, action);
-
-		spin_lock(&desc->lock);
 		if (!noirqdebug)
 			note_interrupt(irq, desc, action_ret);
+
+		spin_lock(&desc->lock);
 		if (likely(!(desc->status & IRQ_PENDING)))
 			break;
 		desc->status &= ~IRQ_PENDING;
diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index 9c7e2e4..543ea2e 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -147,11 +147,7 @@ void note_interrupt(unsigned int irq, st
if (unlikely(irqfixup)) {
/* Don't punish working computers */
if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) {
- int ok;
-
- spin_unlock(&desc->lock);
- ok = misrouted_irq(irq);
- spin_lock(&desc->lock);
+ int ok = misrouted_irq(irq);
if (action_ret == IRQ_NONE)
desc->irqs_unhandled -= ok;
}

2006-11-22 09:50:21

by Pavel Emelyanov

Subject: Re: 2.6.19-rc6: known regressions (v4)

> I really think this is wrong.
>
> The original patch was wrong, and the _real_ problem is in __do_IRQ() that
> got the desc->lock too early.
>
> I _think_ the correct fix is to simply revert the broken commit, and fix
> the _one_ place that called "misnote_interrupt()" with the lock held.
>
> Something like this..
>
> I also think that the real fix will be to move the whole
>
> if (!noirqdebug)
> note_interrupt(irq, desc, action_ret);
>
>
> into handle_IRQ_event itself, since every caller (except for
> "misrouted_irq()" itself, and that should probably be done separately)
> should always do it. Right now we have a lot of people that just do
>
> action_ret = handle_IRQ_event(irq, action);
> if (!noirqdebug)
> note_interrupt(irq, desc, action_ret);
>
> explicitly.
>
> The only thing that keeps us from doing that is that we don't pass in
> "desc", but we should just do that.
>
> But in the meantime, this appears to be the minimal fix. Can people please
> test and verify?

This works for me, but is it normal that desc's fields are
modified non-atomically in note_interrupt()?

And one more thing - report_bad_irq() traverses desc->action
list without any locking either.

2006-11-22 10:42:45

by Andi Kleen

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

> Subject : x86_64: Bad page state in process 'swapper'
> References : http://lkml.org/lkml/2006/11/10/135
> http://lkml.org/lkml/2006/11/10/208
> Submitter : Andre Noll <[email protected]>
> Handled-By : David Rientjes <[email protected]>
> Status : problem is being debugged

Does this still happen with -rc6?

It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
but the debugging information in the standard kernel unfortunately
doesn't give enough output to find out where it happens.

-Andi

2006-11-22 14:58:48

by Vivek Goyal

Subject: Re: 2.6.19-rc6: known regressions (v4)

On Wed, Nov 22, 2006 at 12:44:14PM +0300, Pavel Emelianov wrote:
> > I really think this is wrong.
> >
> > The original patch was wrong, and the _real_ problem is in __do_IRQ() that
> > got the desc->lock too early.
> >
> > I _think_ the correct fix is to simply revert the broken commit, and fix
> > the _one_ place that called "misnote_interrupt()" with the lock held.
> >
> > Something like this..
> >
> > I also think that the real fix will be to move the whole
> >
> > if (!noirqdebug)
> > note_interrupt(irq, desc, action_ret);
> >
> >
> > into handle_IRQ_event itself, since every caller (except for
> > "misrouted_irq()" itself, and that should probably be done separately)
> > should always do it. Right now we have a lot of people that just do
> >
> > action_ret = handle_IRQ_event(irq, action);
> > if (!noirqdebug)
> > note_interrupt(irq, desc, action_ret);
> >
> > explicitly.
> >
> > The only thing that keeps us from doing that is that we don't pass in
> > "desc", but we should just do that.
> >
> > But in the meantime, this appears to be the minimal fix. Can people please
> > test and verify?
>
> This works for me, but is it normal that desc's fields are
> modified non-atomically in note_interrupt()?
>
> And one more thing - report_bad_irq() traverses desc->action
> list without any locking either.

Works for me too. But Pavel's concerns look genuine. Maybe we should take
the lock again in note_interrupt()/report_bad_irq() whenever we are
accessing/modifying desc.

Thanks
Vivek

2006-11-22 15:52:37

by mel

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On (22/11/06 11:42), Andi Kleen didst pronounce:
> > Subject : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> > http://lkml.org/lkml/2006/11/10/208
> > Submitter : Andre Noll <[email protected]>
> > Handled-By : David Rientjes <[email protected]>
> > Status : problem is being debugged
>
> Does this still happen with -rc6?
>
> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.
>

Right, so I took a closer look to see what the story was.

According to the thread, this was the E820 map with the corresponding
PFNs appended to the usable regions.

BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) ( 0-159)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable) (256-1032176)
BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000200000000 (usable) (1048576-2097152)

This is what the PFN ranges look like to arch-independent zone-sizing
reading the map without node awareness

Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used
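
For reference, the PFN columns above follow from the byte addresses in the E820 map by a page shift. A small sketch, assuming 4 KiB pages (PAGE_SHIFT of 12) and assuming the start of a usable region is rounded up and its end rounded down to a page boundary; the helper names pfn_up and pfn_down are made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86_64 */

/* First whole page frame at or above a byte address (round up). */
uint64_t pfn_up(uint64_t addr)
{
	return (addr + ((UINT64_C(1) << PAGE_SHIFT) - 1)) >> PAGE_SHIFT;
}

/* Page frame that contains a byte address (round down). */
uint64_t pfn_down(uint64_t addr)
{
	return addr >> PAGE_SHIFT;
}
```

For example, pfn_down(0x9fc00) gives 159, pfn_up(0x100000) gives 256 and pfn_down(0xfbff0000) gives 1032176, which are the boundaries passed to add_active_range().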

That matches exactly. So far so good. Later with node awareness, we get

SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used

Unusual ordering, but the information is still correct. The final sorted
map looks like;

early_node_map[3] active PFN ranges
0: 0 -> 159
0: 256 -> 1032176
1: 1048576 -> 2097152

Again, everything there looks like what the E820 map reports, so I don't
believe this is the zone-sizing code's fault, although it may be exposing a
bug from elsewhere. According to bootmap, things look like

Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000

That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.

That is showing an additional 16 page frames that are not in the E820 map
(although I have seen this before and it didn't show up as a bad page). I
would be very interested in finding out what the bad_page PFNs are if this
bug still exists to see if it is those 16 frames. I've included a patch
below that might help.
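
The figure of 16 frames can be checked by comparing the end of the node 0 bootmem span with the end of the last usable E820 region. A minimal sketch, again assuming 4 KiB pages; the function name frames_past_last_usable is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Number of page frames between the end of the last usable region and
 * the end of the node span. Both arguments are byte addresses. */
uint64_t frames_past_last_usable(uint64_t node_end, uint64_t usable_end)
{
	return (node_end >> PAGE_SHIFT) - (usable_end >> PAGE_SHIFT);
}
```

Called with the node 0 end 0xfc000000 (PFN 1032192) and the usable end 0xfbff0000 (PFN 1032176), it returns 16; those frames correspond to the ACPI data and ACPI NVS entries in the map.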

Andre, if the bug still exists for you, can you apply Andi's patch to
reduce the log size and the following patch please and post us the
output with loglevel=8 please? Thanks

Signed-off-by: Mel Gorman <[email protected]>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c
--- linux-2.6.19-rc6-clean/arch/x86_64/mm/numa.c 2006-11-22 15:08:20.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/arch/x86_64/mm/numa.c 2006-11-22 15:07:47.000000000 +0000
@@ -192,6 +192,9 @@ void __init setup_node_zones(int nodeid)
memmapsize, SMP_CACHE_BYTES,
round_down(limit - memmapsize, PAGE_SIZE),
limit);
+ printk(KERN_DEBUG "Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+ nodeid, NODE_DATA(nodeid)->node_mem_map,
+ memmapsize, NODE_DATA(nodeid)->node_mem_map);
#endif
}

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c 2006-11-16 04:03:40.000000000 +0000
+++ linux-2.6.19-rc6-debug_bootmem_init_issues/mm/page_alloc.c 2006-11-22 14:16:46.000000000 +0000
@@ -2453,6 +2453,9 @@ static void __init alloc_node_mem_map(st
if (!map)
map = alloc_bootmem_node(pgdat, size);
pgdat->node_mem_map = map + (pgdat->node_start_pfn - start);
+ printk(KERN_DEBUG
+ "Node %d memmap at 0x%p size %lu first pfn 0x%p\n",
+ pgdat->node_id, map, size, pgdat->node_mem_map);
}
#ifdef CONFIG_FLATMEM
/*
@@ -2683,6 +2686,9 @@ void __init free_area_init_nodes(unsigne
/* Regions in the early_node_map can be in any order */
sort_node_map();

+ /* Print out the page size for debugging meminit problems */
+ printk(KERN_DEBUG "sizeof(struct page) = %d\n", sizeof(struct page));
+
/* Print out the zone ranges */
printk("Zone PFN ranges:\n");
for (i = 0; i < MAX_NR_ZONES; i++)

2006-11-22 16:09:59

by Andre Noll

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On 11:42, Andi Kleen wrote:
> > Subject : x86_64: Bad page state in process 'swapper'
> > References : http://lkml.org/lkml/2006/11/10/135
> > http://lkml.org/lkml/2006/11/10/208
> > Submitter : Andre Noll <[email protected]>
> > Handled-By : David Rientjes <[email protected]>
> > Status : problem is being debugged
>
> Does this still happen with -rc6?

Unfortunately, yes. I tried rc6, current git, and current git + David
Rientjes' patch. They all show the same behaviour.

> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> but the debugging information in the standard kernel unfortunately
> doesn't give enough output to find out where it happens.

Feel free to send me a debugging patch..

Andre
--
The only person who always got his work done by Friday was Robinson Crusoe



2006-11-22 17:03:47

by Mel Gorman

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Wed, 22 Nov 2006, Andre Noll wrote:

> On 11:42, Andi Kleen wrote:
>>> Subject : x86_64: Bad page state in process 'swapper'
>>> References : http://lkml.org/lkml/2006/11/10/135
>>> http://lkml.org/lkml/2006/11/10/208
>>> Submitter : Andre Noll <[email protected]>
>>> Handled-By : David Rientjes <[email protected]>
>>> Status : problem is being debugged
>>
>> Does this still happen with -rc6?
>
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.
>
>> It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
>> but the debugging information in the standard kernel unfortunately
>> doesn't give enough output to find out where it happens.
>
> Feel free to send me a debugging patch..
>



You should have received such a patch from me later in the thread. In
combination with the patch at http://lkml.org/lkml/2006/11/10/198 and a
copy of the dmesg, I might be able to guess what is going wrong. Thanks

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2006-11-22 17:09:10

by Andi Kleen

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote:
> Unfortunately, yes. I tried rc6, current git, and current git + David
> Rientjes' patch. They all show the same behaviour.

I must have missed that patch.
>
> > It's probably another bug in the memmap parsing rewrite (Mel cc'ed)
> > but the debugging information in the standard kernel unfortunately
> > doesn't give enough output to find out where it happens.
>
> Feel free to send me a debugging patch..

Here's one. Please send output (unless Mel finds the problem first..)

-Andi

Index: linux-2.6.19-rc6-hack/mm/page_alloc.c
===================================================================
--- linux-2.6.19-rc6-hack/mm/page_alloc.c
+++ linux-2.6.19-rc6-hack/mm/page_alloc.c
@@ -188,6 +188,10 @@ static inline int bad_range(struct zone

static void bad_page(struct page *page)
{
+ static int warned;
+ if (!warned) {
+ warned = 1;
+ printk(KERN_EMERG "page address %lx\n", page_address(page));
printk(KERN_EMERG "Bad page state in process '%s'\n"
KERN_EMERG "page:%p flags:0x%0*lx mapping:%p mapcount:%d count:%d\n"
KERN_EMERG "Trying to fix it up, but a reboot is needed\n"
@@ -196,6 +200,7 @@ static void bad_page(struct page *page)
(unsigned long)page->flags, page->mapping,
page_mapcount(page), page_count(page));
dump_stack();
+ }
page->flags &= ~(1 << PG_lru |
1 << PG_private |
1 << PG_locked |

2006-11-22 17:32:33

by Linus Torvalds

Subject: Re: 2.6.19-rc6: known regressions (v4)



On Wed, 22 Nov 2006, Pavel Emelianov wrote:
>
> This works for me, but is it normal that desc's fields are
> modified non-atomically in note_interrupt()?

This is all inside the normal interrupt handling logic, so it should be
exactly as safe as any interrupt is: we don't allow the _same_ interrupt
to be entered recursively at the same time.

So yes, the counts etc are done non-atomically, but the code around it all
guarantees that only one concurrent invocation happens per irq descriptor,
so it's all ok.
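
As an illustration of that guarantee, here is a single-threaded sketch of an in-progress guard of the kind __do_IRQ() applies (in the kernel the status check itself runs under desc->lock). The struct, the flag values, and the helper names irq_enter_once and irq_leave are invented for this example and are not kernel code:

```c
#include <assert.h>

/* Illustrative flag values; not the kernel's actual bit assignments. */
#define IRQ_INPROGRESS 0x1u
#define IRQ_PENDING    0x2u

struct irq_desc_sketch {
	unsigned int status;
	unsigned int irq_count;	/* plain counter, updated non-atomically */
};

/* Returns 1 if the caller may run the handlers, or 0 if another
 * invocation for this descriptor is already in flight. In the second
 * case the interrupt is only marked pending for the running invocation. */
int irq_enter_once(struct irq_desc_sketch *desc)
{
	if (desc->status & IRQ_INPROGRESS) {
		desc->status |= IRQ_PENDING;
		return 0;
	}
	desc->status |= IRQ_INPROGRESS;
	return 1;
}

/* Called by the invocation that won irq_enter_once() when it is done. */
void irq_leave(struct irq_desc_sketch *desc)
{
	desc->status &= ~IRQ_INPROGRESS;
}
```

Only the caller that gets 1 back goes on to touch irq_count, which is why a plain increment is enough in this model.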

(The one exception to that may be the "desc->status" modification in case
the irq is determined to have screamed, since "status" can be modified by
a recursive interrupt coming in, but (a) that's a "this irq is dead"
scenario _anyway_ and (b) if we ever care, we should lock it _there_, not
somewhere else).

Linus

2006-11-22 17:42:56

by Andre Noll

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On 15:52, Mel Gorman wrote:

> Right, so I took a closer look to see what the story was.

Thanks a lot, Mel.

> Bootmem setup node 0 0000000000000000-00000000fc000000
> Bootmem setup node 1 0000000100000000-0000000200000000
>
> That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.
>
> That is showing an additional 16 page frames that are not in the E820 map
> (although I have seen this before and it didn't show up as a bad page). I
> would be very interested in finding out what the bad_page PFNs are if this
> bug still exists to see if it is those 16 frames. I've included a patch
> below that might help.
>
> Andre, if the bug still exists for you, can you apply Andi's patch to
> reduce the log size and the following patch please and post us the
> output with loglevel=8 please? Thanks

Done. Here's the output of dmesg with your and Andi's patch applied.

Andre

Linux version 2.6.19-rc6-mel-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #11 SMP Wed Nov 22 17:11:44 CET 2006
Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used
end_pfn_map = 2097152
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000f6bc0
ACPI: RSDT (v001 A M I OEMRSDT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000
ACPI: FADT (v001 A M I OEMFACP 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200
ACPI: MADT (v001 A M I OEMAPIC 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380
ACPI: OEMB (v001 A M I OEMBIOS 0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040
ACPI: SRAT (v001 A M I OEMSRAT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0
ACPI: DSDT (v001 0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used
NUMA: Using 32 for the hash shift.
Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000
Node 0 memmap at 0xffff810000893000 size 57802752 first pfn 0xffff810000893000
Node 1 memmap at 0xffff8101fc800000 size 58720256 first pfn 0xffff8101fc800000
sizeof(struct page) = 56
Zone PFN ranges:
DMA 256 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 2097152
early_node_map[3] active PFN ranges
0: 0 -> 159
0: 256 -> 1032176
1: 1048576 -> 2097152
On node 0 totalpages: 1031920
DMA zone: 52 pages used for memmap
DMA zone: 1953 pages reserved
DMA zone: 1835 pages, LIFO batch:0
DMA32 zone: 14055 pages used for memmap
DMA32 zone: 1014025 pages, LIFO batch:31
Normal zone: 0 pages used for memmap
On node 1 totalpages: 1048576
DMA zone: 0 pages used for memmap
DMA32 zone: 0 pages used for memmap
Normal zone: 14336 pages used for memmap
Normal zone: 1034240 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x5008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000fbff0000 - 00000000fbfff000
Nosave address range: 00000000fbfff000 - 00000000fc000000
Nosave address range: 00000000fc000000 - 00000000ff780000
Nosave address range: 00000000ff780000 - 0000000100000000
Allocating PCI resources starting at fc400000 (gap: fc000000:3780000)
PERCPU: Allocating 25728 bytes of per cpu data
Built 2 zonelists. Total pages: 2050100
Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
CPU 0: aperture @ f5cc000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Bad page state in process 'swapper'
page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
[<ffffffff8014f1dd>] bad_page+0x71/0x9f
[<ffffffff8014f6be>] __free_pages_ok+0x78/0xf9
[<ffffffff805cd878>] free_all_bootmem_core+0xce/0x1c2
[<ffffffff805cad99>] numa_free_all_bootmem+0x39/0x78
[<ffffffff805ca603>] mem_init+0x59/0x16c
[<ffffffff805bb75c>] start_kernel+0x165/0x1e7
[<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130

Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init)
Calibrating delay using timer specific routine.. 4784.66 BogoMIPS (lpj=9569329)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12447006
Detected 12.447 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4780.00 BogoMIPS (lpj=9560010)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 1
AMD Opteron(tm) Processor 250 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -14 cycles, maxerr 1190 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2389.823 MHz processor.
migration_cost=569
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
AMD768 RNG detected
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:00:06.0
IO window: a000-bfff
MEM window: fc900000-feafffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
IO window: 9000-9fff
MEM window: fc600000-fc8fffff
PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
microcode: CPU0 not a capable Intel processor
microcode: CPU1 not a capable Intel processor
IA-32 Microcode Update Driver: v1.14a <[email protected]>
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Getting cpuindex for acpiid 0x3
ACPI: Getting cpuindex for acpiid 0x4
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
ipmi message handler version 39.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Unable to find any System Interface(s)
IPMI Watchdog: driver initialized
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 7.2.9-k4
Copyright (c) 1999-2006 Intel Corporation.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <[email protected]> and others
ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18
eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18.
Board assembly 567812-052, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0xd0a6c714).
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
tg3.c:v3.69 (November 15, 2006)
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25
eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
Linux video capture interface: v2.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

scsi 0:0:0:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:0: asynchronous
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:0: Ending Domain Validation
scsi 0:0:1:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:1: asynchronous
scsi0:A:1:0: Tagged Queuing enabled. Depth 32
target0:0:1: Beginning Domain Validation
target0:0:1: wide asynchronous
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:1: Ending Domain Validation
ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

3ware Storage Controller device driver for Linux v1.26.02.001.
3ware 9000 Storage Controller device driver for Linux v2.26.02.008.
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.04.02
usbmon: debugfs is not available
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
USB Universal Host Controller Interface driver v3.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
/.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
md: raid0 personality registered for level 0
md: multipath personality registered for level -4
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
CCID: Registered CCID 3 (ccid3)
CCID: Registered CCID 2 (ccid2)
SCTP: Hash tables configured (established 65536 bind 65536)
powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.0 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146)
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
., OK
IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
IP-Config: Complete:
device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254,
host=node120, domain=, nis-domain=(none),
bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
Freeing unused kernel memory: 2612k freed
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: unbind<sda2>
md: export_rdev(sda2)
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
Adding 16779852k swap on /dev/sda1. Priority:42 extents:1 across:16779852k
Adding 16779852k swap on /dev/sdb1. Priority:42 extents:1 across:16779852k
warning: process `sensors' used the removed sysctl system call with 7.2.1.
warning: process `sensors' used the removed sysctl system call with 7.2.1.
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT
--
The only person who always got his work done by Friday was Robinson Crusoe



2006-11-22 18:00:47

by Andre Noll

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On 18:08, Andi Kleen wrote:
> On Wed, Nov 22, 2006 at 05:05:49PM +0100, Andre Noll wrote:
> > Unfortunately, yes. I tried rc6, current git, and current git + David
> > Rientjes' patch. They all show the same behaviour.
>
> I must have missed that patch.

He sent it to me in private. In fact, he sent several patches. This is
the one I tried today and which didn't work:


Hi Andre,

Please try the following patch to your 2.6.19-rc5 and see if it corrects
the problem (it should also apply to 2.6.19-rc6 cleanly).

David
---
mm/memory.c | 33 ++++++++-------------------------
1 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 156861f..74aa08b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1483,29 +1483,14 @@ static int do_wp_page(struct mm_struct *
 {
 	struct page *old_page, *new_page;
 	pte_t entry;
-	int reuse = 0, ret = VM_FAULT_MINOR;
-	struct page *dirty_page = NULL;
+	int reuse, ret = VM_FAULT_MINOR;

 	old_page = vm_normal_page(vma, address, orig_pte);
 	if (!old_page)
 		goto gotten;

-	/*
-	 * Take out anonymous pages first, anonymous shared vmas are
-	 * not dirty accountable.
-	 */
-	if (PageAnon(old_page)) {
-		if (!TestSetPageLocked(old_page)) {
-			reuse = can_share_swap_page(old_page);
-			unlock_page(old_page);
-		}
-	} else if (unlikely((vma->vm_flags & (VM_WRITE|VM_SHARED)) ==
-					(VM_WRITE|VM_SHARED))) {
-		/*
-		 * Only catch write-faults on shared writable pages,
-		 * read-only shared pages can get COWed by
-		 * get_user_pages(.write=1, .force=1).
-		 */
+	if (unlikely((vma->vm_flags & (VM_SHARED | VM_WRITE)) ==
+				(VM_SHARED | VM_WRITE))) {
 		if (vma->vm_ops && vma->vm_ops->page_mkwrite) {
 			/*
 			 * Notify the address space that the page is about to
@@ -1534,10 +1519,12 @@ static int do_wp_page(struct mm_struct *
 			if (!pte_same(*page_table, orig_pte))
 				goto unlock;
 		}
-		dirty_page = old_page;
-		get_page(dirty_page);
 		reuse = 1;
-	}
+	} else if (PageAnon(old_page) && !TestSetPageLocked(old_page)) {
+		reuse = can_share_swap_page(old_page);
+		unlock_page(old_page);
+	} else
+		reuse = 0;

 	if (reuse) {
 		flush_cache_page(vma, address, pte_pfn(orig_pte));
@@ -1609,10 +1596,6 @@ gotten:
 		page_cache_release(old_page);
 unlock:
 	pte_unmap_unlock(page_table, ptl);
-	if (dirty_page) {
-		set_page_dirty_balance(dirty_page);
-		put_page(dirty_page);
-	}
 	return ret;
 oom:
 	if (old_page)



> > Feel free to send me a debugging patch..

> Here's one. Please send output (unless Mel finds the problem first..)

Here comes the output.
Andre


Linux version 2.6.19-rc6-andi-v2-tt64-6-g0f9005a6-dirty (maan@congo) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #12 SMP Wed Nov 22 18:54:11 CET 2006
Command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
Entering add_active_range(0, 0, 159) 0 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 1 entries of 3200 used
Entering add_active_range(0, 1048576, 2097152) 2 entries of 3200 used
end_pfn_map = 2097152
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM ) @ 0x00000000000f6bc0
ACPI: RSDT (v001 A M I OEMRSDT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0000
ACPI: FADT (v001 A M I OEMFACP 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0200
ACPI: MADT (v001 A M I OEMAPIC 0x01000510 MSFT 0x00000097) @ 0x00000000fbff0380
ACPI: OEMB (v001 A M I OEMBIOS 0x01000510 MSFT 0x00000097) @ 0x00000000fbfff040
ACPI: SRAT (v001 A M I OEMSRAT 0x01000510 MSFT 0x00000097) @ 0x00000000fbff34e0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0x00000000fbff35f0
ACPI: DSDT (v001 0AAAA 0AAAA000 0x00000000 INTL 0x02002026) @ 0x0000000000000000
SRAT: PXM 0 -> APIC 0 -> Node 0
SRAT: PXM 1 -> APIC 1 -> Node 1
SRAT: Node 0 PXM 0 100000-fc000000
Entering add_active_range(0, 256, 1032176) 0 entries of 3200 used
SRAT: Node 1 PXM 1 100000000-200000000
Entering add_active_range(1, 1048576, 2097152) 1 entries of 3200 used
SRAT: Node 0 PXM 0 0-fc000000
Entering add_active_range(0, 0, 159) 2 entries of 3200 used
Entering add_active_range(0, 256, 1032176) 3 entries of 3200 used
NUMA: Using 32 for the hash shift.
Bootmem setup node 0 0000000000000000-00000000fc000000
Bootmem setup node 1 0000000100000000-0000000200000000
Zone PFN ranges:
DMA 256 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 2097152
early_node_map[3] active PFN ranges
0: 0 -> 159
0: 256 -> 1032176
1: 1048576 -> 2097152
On node 0 totalpages: 1031920
DMA zone: 52 pages used for memmap
DMA zone: 1953 pages reserved
DMA zone: 1835 pages, LIFO batch:0
DMA32 zone: 14055 pages used for memmap
DMA32 zone: 1014025 pages, LIFO batch:31
Normal zone: 0 pages used for memmap
On node 1 totalpages: 1048576
DMA zone: 0 pages used for memmap
DMA32 zone: 0 pages used for memmap
Normal zone: 14336 pages used for memmap
Normal zone: 1034240 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x5008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009f000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000fbff0000 - 00000000fbfff000
Nosave address range: 00000000fbfff000 - 00000000fc000000
Nosave address range: 00000000fc000000 - 00000000ff780000
Nosave address range: 00000000ff780000 - 0000000100000000
Allocating PCI resources starting at fc400000 (gap: fc000000:3780000)
PERCPU: Allocating 25728 bytes of per cpu data
Built 2 zonelists. Total pages: 2050100
Kernel command line: vga=normal ip=dhcp BOOT_IMAGE=2.6.19-rc6-mel
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Checking aperture...
CPU 0: aperture @ f4cc000000 size 32 MB
Aperture too small (32 MB)
No AGP bridge found
Your BIOS doesn't leave a aperture memory hole
Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
page address ffff8100fbef0000
Bad page state in process 'swapper'
page:ffff810003faf480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
[<ffffffff8014f200>] bad_page+0x94/0xbe
[<ffffffff8014f6dd>] __free_pages_ok+0x78/0xf9
[<ffffffff805cd83c>] free_all_bootmem_core+0xce/0x1c2
[<ffffffff805cad5d>] numa_free_all_bootmem+0x39/0x78
[<ffffffff805ca603>] mem_init+0x59/0x16c
[<ffffffff805bb75c>] start_kernel+0x165/0x1e7
[<ffffffff805bb195>] x86_64_start_kernel+0x12b/0x130

Memory: 8122880k/8388608k available (3184k kernel code, 199740k reserved, 1490k data, 2612k init)
Calibrating delay using timer specific routine.. 4782.31 BogoMIPS (lpj=9564629)
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 0/0 -> Node 0
Freeing SMP alternatives: 32k freed
ACPI: Core revision 20060707
Using local APIC timer interrupts.
result 12441507
Detected 12.441 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4777.69 BogoMIPS (lpj=9555388)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU 1/1 -> Node 1
AMD Opteron(tm) Processor 250 stepping 0a
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -177 cycles, maxerr 928 cycles)
Brought up 2 CPUs
testing NMI watchdog ... OK.
Disabling vsyscall due to use of PM timer
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
time.c: Detected 2388.767 MHz processor.
migration_cost=574
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
PCI: Firmware left 0000:03:08.0 e100 interrupts enabled, disabling
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
AMD768 RNG detected
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI-DMA: Disabling AGP.
PCI-DMA: aperture base @ 8000000 size 65536 KB
PCI-DMA: using GART IOMMU.
PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
PCI: Bridge: 0000:00:06.0
IO window: a000-bfff
MEM window: fc900000-feafffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
IO window: 9000-9fff
MEM window: fc600000-fc8fffff
PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
microcode: CPU0 not a capable Intel processor
microcode: CPU1 not a capable Intel processor
IA-32 Microcode Update Driver: v1.14a <[email protected]>
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [PWRB]
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Getting cpuindex for acpiid 0x3
ACPI: Getting cpuindex for acpiid 0x4
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
ipmi message handler version 39.0
ipmi device interface
IPMI System Interface driver.
ipmi_si: Unable to find any System Interface(s)
IPMI Watchdog: driver initialized
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
loop: loaded (max 8 devices)
Intel(R) PRO/1000 Network Driver - version 7.2.9-k4
Copyright (c) 1999-2006 Intel Corporation.
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://www.scyld.com/network/eepro100.html
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <[email protected]> and others
ACPI: PCI Interrupt 0000:03:08.0[A] -> GSI 18 (level, low) -> IRQ 18
eth0: 0000:03:08.0, 00:E0:81:2E:78:F7, IRQ 18.
Board assembly 567812-052, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0xd0a6c714).
e100: Intel(R) PRO/100 Network Driver, 3.5.17-k2-NAPI
e100: Copyright(c) 1999-2006 Intel Corporation
tg3.c:v3.69 (November 15, 2006)
ACPI: PCI Interrupt 0000:02:09.0[A] -> GSI 24 (level, low) -> IRQ 24
eth1: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:26
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f4000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:02:09.1[B] -> GSI 25 (level, low) -> IRQ 25
eth2: Tigon3 [partno(BCM95704A7) rev 2003 PHY(5704)] (PCIX:100MHz:64-bit) 10/100/1000BaseT Ethernet 00:e0:81:2e:79:27
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth2: dma_rwctrl[769f4000] dma_mask[64-bit]
Linux video capture interface: v2.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:07.1
AMD8111: chipset revision 3
AMD8111: not 100% native mode: will probe irqs later
AMD8111: 0000:00:07.1 (rev 03) UDMA133 controller
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
Probing IDE interface ide0...
Probing IDE interface ide1...
ACPI: PCI Interrupt 0000:02:06.0[A] -> GSI 24 (level, low) -> IRQ 24
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

scsi 0:0:0:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:0: asynchronous
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:0: Ending Domain Validation
scsi 0:0:1:0: Direct-Access FUJITSU MAT3073NP 0105 PQ: 0 ANSI: 3
target0:0:1: asynchronous
scsi0:A:1:0: Tagged Queuing enabled. Depth 32
target0:0:1: Beginning Domain Validation
target0:0:1: wide asynchronous
target0:0:1: FAST-160 WIDE SCSI 320.0 MB/s DT IU QAS RDSTRM RTI WRFLOW PCOMP (6.25 ns, offset 127)
target0:0:1: Ending Domain Validation
ACPI: PCI Interrupt 0000:02:06.1[B] -> GSI 25 (level, low) -> IRQ 25
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs

3ware Storage Controller device driver for Linux v1.26.02.001.
3ware 9000 Storage Controller device driver for Linux v2.26.02.008.
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
SCSI device sda: 143638992 512-byte hdwr sectors (73543 MB)
sda: Write Protect is off
sda: Mode Sense: b3 00 00 08
SCSI device sda: drive cache: write back
sda: sda1 sda2
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
SCSI device sdb: 143638992 512-byte hdwr sectors (73543 MB)
sdb: Write Protect is off
sdb: Mode Sense: b3 00 00 08
SCSI device sdb: drive cache: write back
sdb: sdb1 sdb2
sd 0:0:1:0: Attached scsi disk sdb
sd 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:1:0: Attached scsi generic sg1 type 0
Fusion MPT base driver 3.04.02
Copyright (c) 1999-2005 LSI Logic Corporation
Fusion MPT SPI Host driver 3.04.02
usbmon: debugfs is not available
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt 0000:03:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.0: OHCI Host Controller
ohci_hcd 0000:03:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:03:00.0: irq 19, io mem 0xfeafc000
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt 0000:03:00.1[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:03:00.1: OHCI Host Controller
ohci_hcd 0000:03:00.1: new USB bus registered, assigned bus number 2
ohci_hcd 0000:03:00.1: irq 19, io mem 0xfeafd000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
USB Universal Host Controller Interface driver v3.0
Initializing USB Mass Storage driver...
usbcore: registered new interface driver usb-storage
USB Mass Storage support registered.
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
/.amd_mnt/huangho/export/kwaid0/home/maan/scm/torvalds/linux-2.6/drivers/usb/input/hid-core.c: v2.6:USB HID core driver
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
md: raid0 personality registered for level 0
md: multipath personality registered for level -4
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
NET: Registered protocol family 15
CCID: Registered CCID 3 (ccid3)
CCID: Registered CCID 2 (ccid2)
SCTP: Hash tables configured (established 65536 bind 65536)
powernow-k8: Found 2 AMD Opteron(tm) Processor 250 processors (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
PM: Writing back config space on device 0000:02:09.0 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.0 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.0 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.0 at offset 1 (was 2b00000, writing 2b00146)
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
Sending DHCP requests .<6>tg3: eth1: Link is up at 1000 Mbps, full duplex.
tg3: eth1: Flow control is on for TX and on for RX.
., OK
IP-Config: Got DHCP answer from 192.168.1.254, my address is 192.168.1.120
PM: Writing back config space on device 0000:02:09.1 at offset b (was 164814e4, writing 164414e4)
PM: Writing back config space on device 0000:02:09.1 at offset 3 (was 804000, writing 804010)
PM: Writing back config space on device 0000:02:09.1 at offset 2 (was 2000000, writing 2000003)
PM: Writing back config space on device 0000:02:09.1 at offset 1 (was 2b00000, writing 2b00106)
IP-Config: Complete:
device=eth1, addr=192.168.1.120, mask=255.255.0.0, gw=192.168.1.254,
host=node120, domain=, nis-domain=(none),
bootserver=192.168.1.254, rootserver=192.168.1.254, rootpath=
Freeing unused kernel memory: 2612k freed
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: md0 stopped.
md: unbind<sdb2>
md: export_rdev(sdb2)
md: unbind<sda2>
md: export_rdev(sda2)
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
program parted is using a deprecated SCSI ioctl, please convert it to SG_IO
md: bind<sda2>
md: bind<sdb2>
md0: setting max_sectors to 128, segment boundary to 32767
raid0: looking at sdb2
raid0: comparing sdb2(55038592) with sdb2(55038592)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at sda2
raid0: comparing sda2(55038592) with sdb2(55038592)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: done.
raid0 : md_size is 110077184 blocks.
raid0 : conf->hash_spacing is 110077184 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
Adding 16779852k swap on /dev/sda1. Priority:42 extents:1 across:16779852k
Adding 16779852k swap on /dev/sdb1. Priority:42 extents:1 across:16779852k
warning: process `sensors' used the removed sysctl system call with 7.2.1.
warning: process `sensors' used the removed sysctl system call with 7.2.1.
process `syslogd' is using obsolete setsockopt SO_BSDCOMPAT

--
The only person who always got his work done by Friday was Robinson Crusoe



2006-11-23 00:41:13

by David Brownell

Subject: Re: 2.6.19-rc6: known regressions (v4)

On Tuesday 21 November 2006 1:24 pm, Adrian Bunk wrote:

> Subject : ACPI: AE_TIME errors
> References : http://lkml.org/lkml/2006/11/15/12
> Submitter : David Brownell <[email protected]>
> Handled-By : Len Brown <[email protected]>
> Alexey Starikovskiy <[email protected]>
> Status : problem is being debugged

I've not seen this in over 3 days now, and am willing to believe that
the previous instance (after manually reverting the patch identified
by Linus) was a fluke ... it's certainly not the critical/blocking kind
of issue it had previously been.

- Dave

2006-11-23 00:54:56

by Adrian Bunk

Subject: 2.6.19-rc6: known regressions with patches available

This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
with patches available.

The first issue (for an unknown reason it never occurred before - it seems some
random Kconfig change has triggered this latent bug) seems to have the
potential of affecting more users.

The second issue is so exotic that I wouldn't have listed it if there
was no patch, but considering that the patch looks safe I don't see why
this regression shouldn't be fixed in 2.6.19.


Subject : xconfig crashes on x86_64
References : http://lkml.org/lkml/2006/11/19/177
Submitter : Randy Dunlap <[email protected]>
Handled-By : Roman Zippel <[email protected]>
Patch : http://lkml.org/lkml/2006/11/20/340
Status : patch available


Subject : menuconfig problems with TERM=vt100
References : http://lkml.org/lkml/2006/11/13/369
Submitter : Phil Oester <[email protected]>
Caused-By : Sam Ravnborg <[email protected]>
commit 350b5b76384e77bcc58217f00455fdbec5cac594
Handled-By : Roman Zippel <[email protected]>
Patch : http://lkml.org/lkml/2006/11/20/341
Status : patch available

2006-11-23 01:12:10

by Andrew Morton

Subject: Re: 2.6.19-rc6: known regressions with patches available

On Thu, 23 Nov 2006 01:54:57 +0100
Adrian Bunk <[email protected]> wrote:

> This email lists some known regressions in 2.6.19-rc6 compared to 2.6.18
> with patches available.
>
> The first issue (for an unknown reason it never occurred before - it seems some
> random Kconfig change has triggered this latent bug) seems to have the
> potential of affecting more users.
>
> The second issue is so exotic that I wouldn't have listed it if there
> was no patch, but considering that the patch looks safe I don't see why
> this regression shouldn't be fixed in 2.6.19.
>
>
> Subject : xconfig crashes on x86_64
> References : http://lkml.org/lkml/2006/11/19/177
> Submitter : Randy Dunlap <[email protected]>
> Handled-By : Roman Zippel <[email protected]>
> Patch : http://lkml.org/lkml/2006/11/20/340
> Status : patch available
>
>
> Subject : menuconfig problems with TERM=vt100
> References : http://lkml.org/lkml/2006/11/13/369
> Submitter : Phil Oester <[email protected]>
> Caused-By : Sam Ravnborg <[email protected]>
> commit 350b5b76384e77bcc58217f00455fdbec5cac594
> Handled-By : Roman Zippel <[email protected]>
> Patch : http://lkml.org/lkml/2006/11/20/341
> Status : patch available

I have both these queued for 2.6.19, thanks.

2006-11-23 12:01:45

by mel

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On (22/11/06 18:42), Andre Noll didst pronounce:
> On 15:52, Mel Gorman wrote:
>
> > Right, so I took a closer look to see what the story was.
>
> Thanks a lot, Mel.
>

Thank you for getting back promptly.

> > Bootmem setup node 0 0000000000000000-00000000fc000000
> > Bootmem setup node 1 0000000100000000-0000000200000000
> >
> > That's node 0 PFN 0->1032192 and node 1 PFN 1048576->2097152.
> >
> > That is showing an additional 16 page frames that are not in the E820 map
> > (although I have seen this before and it didn't show up as a bad page). I
> > would be very interested in finding out what the bad_page PFNs are if this
> > bug still exists to see if it is those 16 frames. I've included a patch
> > below that might help.
> >
> > Andre, if the bug still exists for you, can you apply Andi's patch to
> > reduce the log size and the following patch please and post us the
> > output with loglevel=8 please? Thanks
>
> Done. Here's the output of dmesg with your and Andi's patch applied.
>
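The 16-frame figure in the quoted analysis follows from the page-frame arithmetic on the addresses in the log. A minimal sketch, assuming 4 KiB pages (a page shift of 12, as on x86_64); the macro and helper names are hypothetical and are not taken from the kernel:

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, i.e. a page shift of 12 as on x86_64. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper: page frame number that contains a physical address. */
static uint64_t addr_to_pfn(uint64_t phys_addr)
{
	return phys_addr >> SKETCH_PAGE_SHIFT;
}

/*
 * Frames covered by the bootmem node range but lying past the end of
 * the E820 usable range.
 */
static uint64_t extra_frames(uint64_t bootmem_end, uint64_t e820_usable_end)
{
	return addr_to_pfn(bootmem_end) - addr_to_pfn(e820_usable_end);
}
```

With the values from the log, addr_to_pfn(0xfc000000) is 1032192 (the end of "Bootmem setup node 0") and addr_to_pfn(0xfbff0000) is 1032176 (the end of the E820 usable range), so extra_frames(0xfc000000, 0xfbff0000) is 16.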

ahhh, I believe I see the problem now. Please try out the following patch.

====

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
on a sorted early_node_map[]. However, sort_node_map() is being called after
find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
this is ok, but on at least one x86_64 machine, the SRAT table caused the E820
ranges to be registered out of order. This gave the wrong values for the min PFN
range, resulting in some pages not being initialised.

This patch sorts the early_node_map in find_min_pfn_for_node(). It has
been boot tested on x86, x86_64, ppc64 and ia64.

Signed-off-by: Mel Gorman <[email protected]>

diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc6-clean/mm/page_alloc.c 2006-11-15 20:03:40.000000000 -0800
+++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c 2006-11-23 02:23:57.000000000 -0800
@@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
 {
 	int i;

+	/* Regions in the early_node_map can be in any order */
+	sort_node_map();
+
 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
 		return early_node_map[i].start_pfn;
@@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
 			max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
 	}

-	/* Regions in the early_node_map can be in any order */
-	sort_node_map();
-
 	/* Print out the zone ranges */
 	printk("Zone PFN ranges:\n");
 	for (i = 0; i < MAX_NR_ZONES; i++)
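
For readers following along outside the kernel tree, the failure mode and the
fix can be sketched in plain userspace C. This is only an illustration under
stated assumptions: struct active_region, cmp_region(), first_match_pfn() and
sorted_min_pfn() are hypothetical stand-ins, not the kernel's definitions, and
the ordering by start_pfn is assumed from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for one early_node_map[] entry. */
struct active_region {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Order regions by start_pfn (assumed ordering, for illustration only). */
static int cmp_region(const void *a, const void *b)
{
	const struct active_region *ra = a;
	const struct active_region *rb = b;

	if (ra->start_pfn < rb->start_pfn)
		return -1;
	return ra->start_pfn > rb->start_pfn;
}

/*
 * Pre-patch behaviour: trust the map order and return the start of the
 * first region that belongs to nid. Wrong if the map is unsorted.
 */
unsigned long first_match_pfn(const struct active_region *map, size_t n,
			      int nid)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++)
		if (map[i].nid == nid)
			return map[i].start_pfn;
	return 0;	/* no region registered for this node */
}

/*
 * Patched behaviour: sort first, so that the first match really is the
 * lowest start_pfn for the node. The map is re-sorted on every call.
 */
unsigned long sorted_min_pfn(struct active_region *map, size_t n, int nid)
{
	assert(map != NULL);
	qsort(map, n, sizeof(*map), cmp_region);
	return first_match_pfn(map, n, nid);
}
```

With a map registered as { nid 0, PFN 256 }, { nid 0, PFN 0 }, the unsorted
lookup reports 256 for node 0, while the sorting variant reports 0. The PFN
values here are made up for the example.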

2006-11-23 13:12:27

by Andre Noll

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On 12:01, Mel Gorman wrote:

> > > Andre, if the bug still exists for you, can you apply Andi's patch to
> > > reduce the log size and the following patch please and post us the
> > > output with loglevel=8 please? Thanks
> >
> > Done. Here's the output of dmesg with your and Andi's patch applied.
> >
>
> ahhh, I believe I see the problem now. Please try out the following patch.

[...]

> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.

That did the trick, you're the man!

Thanks a lot
Andre

--
The only person who always got his work done by Friday was Robinson Crusoe



2006-11-23 13:28:16

by Mel Gorman

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Thu, 23 Nov 2006, Andre Noll wrote:

> On 12:01, Mel Gorman wrote:
>
>>>> Andre, if the bug still exists for you, can you apply Andi's patch to
>>>> reduce the log size and the following patch please and post us the
>>>> output with loglevel=8 please? Thanks
>>>
>>> Done. Here's the output of dmesg with your and Andi's patch applied.
>>>
>>
>> ahhh, I believe I see the problem now. Please try out the following patch.
>
> [...]
>
>> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
>> been boot tested on x86, x86_64, ppc64 and ia64.
>
> That did the trick, you're the man!
>

heh, I was also the problem. Thanks a lot for reporting and testing.


--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

2006-11-23 19:14:17

by Andrew Morton

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Thu, 23 Nov 2006 12:01:41 +0000
[email protected] (Mel Gorman) wrote:

> find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> on a sorted early_node_map[]. However, sort_node_map() is being called after
> find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> this is ok, but on at least one x86_64, the SRAT table caused the E820 ranges
> to be registered out of order. This gave the wrong values for the min PFN
> range resulting in some pages not being initialised.
>
> This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> been boot tested on x86, x86_64, ppc64 and ia64.
>
> Signed-off-by: Mel Gorman <[email protected]>
>
> diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> --- linux-2.6.19-rc6-clean/mm/page_alloc.c 2006-11-15 20:03:40.000000000 -0800
> +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c 2006-11-23 02:23:57.000000000 -0800
> @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
> {
> int i;
>
> + /* Regions in the early_node_map can be in any order */
> + sort_node_map();
> +
> /* Assuming a sorted map, the first range found has the starting pfn */
> for_each_active_range_index_in_nid(i, nid)
> return early_node_map[i].start_pfn;
> @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
> max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
> }
>
> - /* Regions in the early_node_map can be in any order */
> - sort_node_map();
> -
> /* Print out the zone ranges */
> printk("Zone PFN ranges:\n");
> for (i = 0; i < MAX_NR_ZONES; i++)

Doesn't this mean that we can sort that map multiple times?

Seems a bit ... ungainly?

2006-11-23 21:55:51

by mel

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On (23/11/06 11:09), Andrew Morton didst pronounce:
> On Thu, 23 Nov 2006 12:01:41 +0000
> [email protected] (Mel Gorman) wrote:
>
> > find_min_pfn_for_node() and find_min_pfn_with_active_regions() both depend
> > on a sorted early_node_map[]. However, sort_node_map() is being called after
> > find_min_pfn_with_active_regions() in free_area_init_nodes(). In most cases,
> > this is ok, but on at least one x86_64, the SRAT table caused the E820 ranges
> > to be registered out of order. This gave the wrong values for the min PFN
> > range resulting in some pages not being initialised.
> >
> > This patch sorts the early_node_map in find_min_pfn_for_node(). It has
> > been boot tested on x86, x86_64, ppc64 and ia64.
> >
> > Signed-off-by: Mel Gorman <[email protected]>
> >
> > diff -rup linux-2.6.19-rc6-clean/mm/page_alloc.c linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c
> > --- linux-2.6.19-rc6-clean/mm/page_alloc.c 2006-11-15 20:03:40.000000000 -0800
> > +++ linux-2.6.19-rc6-sort_in_find_min/mm/page_alloc.c 2006-11-23 02:23:57.000000000 -0800
> > @@ -2612,6 +2612,9 @@ unsigned long __init find_min_pfn_for_no
> > {
> > int i;
> >
> > + /* Regions in the early_node_map can be in any order */
> > + sort_node_map();
> > +
> > /* Assuming a sorted map, the first range found has the starting pfn */
> > for_each_active_range_index_in_nid(i, nid)
> > return early_node_map[i].start_pfn;
> > @@ -2680,9 +2683,6 @@ void __init free_area_init_nodes(unsigne
> > max(max_zone_pfn[i], arch_zone_lowest_possible_pfn[i]);
> > }
> >
> > - /* Regions in the early_node_map can be in any order */
> > - sort_node_map();
> > -
> > /* Print out the zone ranges */
> > printk("Zone PFN ranges:\n");
> > for (i = 0; i < MAX_NR_ZONES; i++)
>

yes, once per active node.

> Seems a bit ... ungainly?
>


It is, but this late in the cycle, I was going for the
obviously-correct-and-will-definitely-work solution.

It would be sufficient to call sort_node_map() in
find_min_pfn_with_active_regions(), but I was worried that someone would call
find_min_pfn_for_node() directly at some future time, causing another fun bug.

A slightly smarter, but not quite as obviously correct, patch is below if
you prefer it. It removes the assumption about early_node_map being sorted
for find_min_pfns and friends by always searching the whole map. The map
is then only sorted once when it is required. Andre, I'd appreciate it if
you could give it a spin to be 100% sure it's ok. It passed a boot-test on
a few machines here.

===========

find_min_pfn_for_node() and find_min_pfn_with_active_regions() both
depend on a sorted early_node_map[] to find the correct values. However,
sort_node_map() is being called after find_min_pfn_with_active_regions()
in free_area_init_nodes(). In most cases, this is ok, but on an x86_64,
the SRAT table caused the E820 ranges to be registered out of order. This gave
the wrong values for the min PFN range resulting in some pages not being
initialised.

This patch works by always searching the whole early_node_map[] in
find_min_pfn_for_node().

Signed-off-by: Mel Gorman <[email protected]>

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c
--- linux-2.6.19-rc5-mm2-clean/mm/page_alloc.c 2006-11-14 14:01:37.000000000 +0000
+++ linux-2.6.19-rc5-mm2-sort_in_find_min/mm/page_alloc.c 2006-11-23 20:37:18.000000000 +0000
@@ -2945,17 +2945,22 @@ static void __init sort_node_map(void)
 			cmp_node_active_region, NULL);
 }

-/* Find the lowest pfn for a node. This depends on a sorted early_node_map */
+/* Find the lowest pfn for a node */
 unsigned long __init find_min_pfn_for_node(unsigned long nid)
 {
 	int i;
+	unsigned long min_pfn = -1UL;

 	/* Assuming a sorted map, the first range found has the starting pfn */
 	for_each_active_range_index_in_nid(i, nid)
-		return early_node_map[i].start_pfn;
+		min_pfn = min(min_pfn, early_node_map[i].start_pfn);

-	printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
-	return 0;
+	if (min_pfn == -1UL) {
+		printk(KERN_WARNING "Could not find start_pfn for node %lu\n", nid);
+		return 0;
+	}
+
+	return min_pfn;
 }

 /**
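
The same idea can be tried outside the kernel with a small userspace sketch.
It is not the kernel code: struct active_region and scan_min_pfn() are
hypothetical names chosen for illustration, and the map is passed in as a
plain array rather than walked with for_each_active_range_index_in_nid().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for one early_node_map[] entry. */
struct active_region {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Scan every region and keep the lowest start_pfn seen for nid.
 * No assumption is made about the order of the map, and the map
 * itself is left untouched.
 */
unsigned long scan_min_pfn(const struct active_region *map, size_t n,
			   int nid)
{
	unsigned long min_pfn = -1UL;	/* sentinel: nothing found yet */
	size_t i;

	assert(map != NULL);
	for (i = 0; i < n; i++) {
		if (map[i].nid != nid)
			continue;
		if (map[i].start_pfn < min_pfn)
			min_pfn = map[i].start_pfn;
	}

	/* Mirror the patch: report 0 when the node has no regions. */
	if (min_pfn == -1UL)
		return 0;

	return min_pfn;
}
```

The trade-off matches the discussion in this thread: the scan costs one pass
over the map per call but never reorders it, so the single sort can stay where
it was. As in the patch, the sentinel means a region starting at the maximum
unsigned long value would be treated as "not found".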

2006-11-24 09:55:34

by Andre Noll

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On 21:55, Mel Gorman wrote:

> A slightly smarter, but not quite as obviously correct, patch is below if
> you prefer it. It removes the assumption about early_node_map being sorted
> for find_min_pfns and friends by always searching the whole map. The map
> is then only sorted once when it is required. Andre, I'd appreciate it if
> you could give it a spin to be 100% sure it's ok. It passed a boot-test on
> a few machines here.

Yes, this one also works for me.

Acked-by: Andre Noll <[email protected]>
--
The only person who always got his work done by Friday was Robinson Crusoe



2006-11-24 09:59:18

by Andi Kleen

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)


> A slightly smarter, but not quite as obviously correct,

I think it's better to go for the "obviously correct" approach right now.
Sorting multiple times should be fine.

-Andi

2006-11-24 20:48:34

by Andrew Morton

Subject: Re: [discuss] 2.6.19-rc6: known regressions (v4)

On Fri, 24 Nov 2006 10:58:55 +0100
Andi Kleen <[email protected]> wrote:

>
> > A slightly smarter, but not quite as obviously correct,
>
> I think it's better to go for the "obviously correct" approach right now.
> Sorting multiple times should be fine.
>

yup, that's what I'd decided.