2007-01-31 04:28:44

by Linus Torvalds

Subject: Linux 2.6.20-rc7


Yes, I know I said I would only do -rc6 and then the final 2.6.20, but the
thing is, the known regressions list didn't get whittled down as quickly
as I hoped, and as a result we now have a -rc7.

It's in good enough shape that I'd probably have been happy to just
release it as 2.6.20, but since I want 2.6.20 to be a stability release, I
didn't want to risk any stupid bugs while the regressions got fixed, so
here's a final -rc7.

In other words, please do give it a good testing. We should have fixed the
nasty stuff on Adrian's list (and here's another thanks to Adrian for
keeping me on my toes!) and it's all good. But please give it a quick
shake-down to make sure that nothing silly happened while fixing the bad
stuff.

The shortlog really does say most of it - this is just various fixes for a
number of mostly fairly inconsequential things, but the SG_IO timeout bug
that hit any NeroLinux user would quite possibly impact other DVD/CD
reader/writer programs too that used raw commands with timeouts.
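
As background on this class of bug (a sketch, not the actual fix): the SG_IO header carries its timeout in milliseconds, while kernel timers count in jiffies, which tick HZ times per second, so the value has to be converted between the two units. The helper name `ms_to_ticks` and the HZ values used below are made up for illustration; the kernel has its own conversion routines.

```c
#include <assert.h>

/*
 * Hypothetical helper, NOT the kernel's own conversion code:
 * turn a millisecond timeout into timer ticks for a tick rate
 * of hz ticks per second, rounding up so that a short non-zero
 * timeout never collapses to zero ticks.
 */
unsigned long ms_to_ticks(unsigned int ms, unsigned int hz)
{
    assert(hz > 0);
    /* Widen before multiplying so ms * hz cannot overflow 32 bits. */
    return (unsigned long)(((unsigned long long)ms * hz + 999ULL) / 1000ULL);
}
```

For a 60-second timeout, `ms_to_ticks(60000, 250)` gives 15000 ticks and `ms_to_ticks(60000, 1000)` gives 60000, so using the raw millisecond value as a tick count is only correct when HZ happens to be 1000.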

The diffstat just looks like line-noise: 244 files changed with an average
of less than 10 lines per file changed in 179 commits. In other words,
really no big diffs: it's just a lot of really small stuff.

Linus

---
Adam Litke (1):
Don't allow the stack to grow into hugetlb reserved regions

Adrian Bunk (1):
fs/lockd/clntlock.c: add missing newlines to dprintk's

Ahmed S. Darwish (1):
[CPUFREQ] check sysfs_create_link return value

Al Viro (9):
b44: src_desc->addr is little-endian
missing exports of pm_power_off() on alpha and sparc32
mtd/nand/cafe.c missing include of dma-mapping.h
sym53c500_cs: remove bogus call fo free_dma()
pata_platform: fallout from set_mode() change
missing dma_sync_single_range_for{cpu,device} on alpha
dma-mapping.h stubs fix
b44: src_desc->addr is little-endian
fix indentation-related breakage in Kconfig.i386

Alan Cox (5):
ide/generic: Jmicron has its own drivers now
libata cmd64x: whack into a shape that looks like the documentation
libata hpt3xn: Hopefully sort out the DPLL logic versus the vendor code
libata: set_mode, Fix the FIXME
libata-sff: Don't call bmdma_stop on non DMA capable controllers

Alexey Dobriyan (2):
Fix NULL ->nsproxy dereference in /proc/*/mounts
core-dumping unreadable binaries via PT_INTERP

Andrew Morton (5):
jmicron: fix warning
pata_platform: set_mode fix
82596 warning fixes
m68k: uaccess.h needs sched.h
ntfs: kmap_atomic() atomicity fix

Andrew Victor (6):
[ARM] 4084/1: Remove CONFIG_DEBUG_WAITQ
[ARM] 4085/1: AT91: Header fixes.
[ARM] 4086/1: AT91: Whitespace cleanup
[ARM] 4087/1: AT91: CPU reset for SAM9x processors
[ARM] 4088/1: AT91: Unbalanced IRQ in serial driver suspend/resume
[ARM] 4089/1: AT91: GPIO wake IRQ cleanup

Andy Gospodarek (1):
bonding: ARP monitoring broken on x86_64

Atsushi Nemoto (1):
SPI: alternative fix for spi_busnum_to_master

Auke Kok (1):
e100: fix irq leak on suspend/resume

Avi Kivity (3):
KVM: Emulate IA32_MISC_ENABLE msr
KVM: MMU: Perform access checks in walk_addr()
KVM: MMU: Report nx faults to the guest

Bartlomiej Zolnierkiewicz (3):
ide: update MAINTAINERS entry
ia64: add pci_get_legacy_ide_irq()
ide: add missing __init tags to IDE PCI host drivers

Baruch Even (1):
[TCP]: Fix sorting of SACK blocks.

Ben Dooks (4):
[ARM] 4095/1: S3C24XX: Fix GPIO set for Bank A
[ARM] 4096/1: S3C24XX: change return code form s3c2410_gpio_getcfg()
S3C24XX: fix passing spi chipselect to select routine
[ARM] 4117/1: S3C2412: Fix writel() usage in selection code

Benjamin Herrenschmidt (1):
[POWERPC] Fix sys_pciconfig_iobase bus matching

Catalin Marinas (2):
[ARM] 4112/1: Only ioremap to supersections if DOMAIN_IO is zero
[ARM] 4111/1: Allow VFP to work with thread migration on SMP

Conke Hu (3):
atiixp.c: remove unused code
atiixp.c: sb600 ide only has one channel
atiixp.c: add cable detection support for ATI IDE

Dan Williams (1):
[ARM] 4100/1: iop3xx: fix cpu mask for iop333

Dave Jones (5):
[AGPGART] Prevent (unlikely) memory leak in amd_create_gatt_pages()
[AGPGART] Remove pointless typedef in ati-agp
[AGPGART] Remove pointless assignment.
[AGPGART] Add new IDs to VIA AGP.
[CPUFREQ] Remove unneeded errata workaround from p4-clockmod.

David Barksdale (1):
IPMI: fix timeout list handling

David Milburn (1):
libata-scsi: ata_task_ioctl should return ATA registers from sense data

David S. Miller (4):
[AF_PACKET]: Fix BPF handling.
[AF_PACKET]: Check device down state before hard header callbacks.
[TCP]: Restore SKB socket owner setting in tcp_transmit_skb().
[SPARC64]: Set g4/g5 properly in sun4v dtlb-prot handling.

David Woodhouse (1):
Fix Maple PATA IRQ assignment.

Dmitriy Monakhov (1):
Broadcom 4400 resume small fix

Eric Van Hensbergen (5):
9p: fix bogus return code checks during initialization
9p: fix rename return code
9p: update documentation regarding server applications
9p: fix segfault caused by race condition in meta-data operations
9p: null terminate error strings for debug print

Eric W. Biederman (3):
[IPV4]: Fix the fib trie iterator to work with a single entry routing tables
[DECNET]: Handle a failure in neigh_parms_alloc (take 2)
i386: In assign_irq_vector look at all vectors before giving up

Evgeniy Dushistov (3):
ufs: alloc metadata null page fix
ufs: truncate negative to unsigned fix
ufs: reallocation fix

Francois Romieu (1):
netdev: add a MAINTAINERS entry for via-velocity and update my address

Geert Uytterhoeven (3):
[POWERPC] PS3: Fix uniprocessor kernel build
[POWERPC] ps3_free_io_irq: Fix inverted error check
`make help' in build tree doesn't show headers_* targets

Geoff Levand (1):
[POWERPC] PS3: add not complete comment to kconfig

H. Peter Anvin (1):
Boot loader ID for Gujin

Haavard Skinnemoen (2):
[AVR32] Export clear_page symbol
[AVR32] Update ATSTK1000 defconfig: Enable macb by default

Hugh Dickins (1):
mm: mremap correct rmap accounting

Ingo Molnar (1):
ACPI: fix cpufreq regression

Jan Altenberg (1):
Malta: Fix build if CONFIG_MTD is diabled.

Jan Engelhardt (1):
cdev.h: forward declarations

Jean Delvare (1):
Fix VIA quirks

Jeff Dike (2):
Fix UML on non-standard VM split hosts
uml: fix signal frame alignment

Jiri Kosina (3):
HID: fix memleaking of collection
USB HID: fix hid_blacklist clash for 0x08ca/0x0010
HID: fix pb_fnmode and move it to generic HID

Joerg Roedel (1):
KVM: SVM: Propagate cpu shutdown events to userspace

Johannes Stezenbach (1):
uml: fix mknod

Josepch Chan (1):
via82cxxx/pata_via: correct PCI_DEVICE_ID_VIA_SATA_EIDE ID and add support for CX700 and 8237S

Jun'ichi Nomura (1):
dm-multipath: fix stall on noflush suspend/resume

Justin Clacherty (1):
spi: fix error setting the spi mode in pxa2xx_spi.c

Lennert Buytenhek (1):
ata_if_xfermask() word 51 fix

Leonard Norrgard (1):
KVM: SVM: Fix SVM idt confusion

Linus Torvalds (6):
Resurrect 'try_to_free_buffers()' VM hackery
Write back inode data pages even when the inode itself is locked
Fix balance_dirty_page() calculations with CONFIG_HIGHMEM
Revert "[PATCH] namespaces: fix exit race by splitting exit"
Revert "net: ifb error path loop fix"
Linux 2.6.20-rc7

Linus Walleij (1):
[ARM] 4102/1: Allow for PHYS_OFFSET on any valid 2MiB address

Mariusz Kozlowski (1):
net: ifb error path loop fix

Mark Fasheh (1):
ocfs2: fix thinko in ocfs2_backup_super_blkno()

Masami Hiramatsu (1):
kprobes: replace magic numbers with enum

Matt Domsch (1):
Fix race in efi variable delete code

Matt Reimer (1):
[ARM] 4106/1: S3C2410: typo fixes in register definitions

Michael Chan (2):
[BNX2]: Fix 2nd port's MAC address.
b44: Fix frequent link changes

Mike Christie (1):
Fix SG_IO timeout jiffy conversion

Mike Frysinger (4):
remove __devinit markings from rtc_sysfs_add_device()
use __u8/__u32 in userspace ioctl defines for I2O
use __u8 rather than u8 in userspace SIZE defines in hdreg.h
translate dashes in filenames for headers install

Miklos Szeredi (1):
fuse: fix bug in control filesystem mount

Neil Brown (1):
Remove warning: VFS is out of sync with lock manager

NeilBrown (12):
knfsd: update email address and status for NFSD in MAINTAINERS
knfsd: fix setting of ACL server versions
knfsd: fix an NFSD bug with full sized, non-page-aligned reads
knfsd: replace some warning ins nfsfh.h with BUG_ON or WARN_ON
md: update email address and status for MD in MAINTAINERS
md: make 'repair' actually work for raid1
md: make sure the events count in an md array never returns to zero
md: avoid reading past the end of a bitmap file
knfsd: Fix type mismatch with filldir_t used by nfsd
md: fix potential memalloc deadlock in md
md: remove unnecessary printk when raid5 gets an unaligned read.
knfsd: ratelimit some nfsd messages that are triggered by external events

Nick Piggin (1):
Fix try_to_free_buffer() locking

Patrick McHardy (3):
[NETFILTER]: nf_nat: fix ICMP translation with statically linked conntrack
[NETFILTER]: nf_nat_pptp: fix expectation removal
[NETFILTER]: nf_conntrack_pptp: fix NAT setup of expected GRE connections

Pavel Pisa (1):
[ARM] 4092/1: i.MX/MX1 CPU Frequency scaling latency definition

Peter Staubach (1):
knfsd: Don't mess with the 'mode' when storing a exclusive-create cookie

Ralf Baechle (1):
[MIPS] Ocelot G: Fix a few misspellings of CONFIG_GALILEO_GT64240_ETH

Robert Hancock (1):
libata: fix translation for START STOP UNIT

Robert Olsson (1):
[IPV4]: Fix single-entry /proc/net/fib_trie output.

Robert P. J. Day (3):
fix various kernel-doc in header files
[MIPS] Fix typo of "CONFIG_MT_SMP".
Fix "CONFIG_X86_64_" typo in drivers/kvm/svm.c

Roland McGrath (8):
x86_64: fix put_user for 64-bit constant
Fix CONFIG_COMPAT_VDSO
Fix gate_vma.vm_flags
Add VM_ALWAYSDUMP
i386 vDSO: use VM_ALWAYSDUMP
x86_64 ia32 vDSO: use VM_ALWAYSDUMP
powerpc vDSO: use VM_ALWAYSDUMP
x86_64 ia32 vDSO: define arch_vma_name

Russell King (3):
[ARM] Fix show_mem() for discontigmem
[ARM] Update mach-types
[ARM] Fix AMBA serial drivers for non-first serial ports

Serge E. Hallyn (2):
namespaces: fix exit race by splitting exit
namespaces: fix task exit disaster

Sergei Shtylyov (1):
pata_sil680: PIO1 taskfile transfers overclocking fix (repost)

Simon Bennett (1):
HID: fix hid-input mapping for Firefly Mini Remote Control

Stephen Hemminger (1):
sky2: revert IRQ dance on suspend/resume

Takashi Iwai (1):
ALSA: Fix sysfs breakage

Tejun Heo (9):
sata_via: don't diddle with ATA_NIEN in ->freeze
ahci: improve and limit spurious interrupt messages, take#3
libata: implement ATA_FLAG_IGN_SIMPLEX and use it in sata_uli
ahci: fix endianness in spurious interrupt message
sata_via: style clean up, no indirect method call in LLD
ahci: use 0x80 as wait stat value instead of 0xff
ahci: port_no should be used when clearing IRQ in ahci_thaw()
libata: fix ata_eh_suspend() return value
ide: unregister idepnp driver on unload

Thomas Klein (2):
ehea: Fixed wrong jumbo frames status query
ehea: Fixed missing tasklet_kill() call

Tilman Schmidt (1):
Gigaset ISDN driver error handling fixes

Trond Myklebust (1):
MM: Remove [PATCH] invalidate_inode_pages2_range() debug

Venkat Yekkirala (1):
[SELINUX]: Fix 2.6.20-rc6 build when no xfrm

Vitaly Bordug (1):
FS_ENET: OF-related fixup for FEC and SCC MAC's

Wang Zhenyu (1):
[AGPGART] intel_agp: restore graphics device's pci space early in resume

[email protected] (1):
jmicron: 40/80pin primary detection


2007-01-31 16:09:17

by Linus Torvalds

Subject: Re: Linux 2.6.20-rc7



On Wed, 31 Jan 2007, Paweł Sikora wrote:
>
> The 2.6.20-rcX have the same nasty bug as 2.6.19.x.
>
> [ an oops inside kmem_get_pages ]
> http://bugzilla.kernel.org/show_bug.cgi?id=7889

Pawel, can you detail more exactly which kernels don't work, and which do?

From bugzilla:

- 2.6.18.x does work
- 2.6.19.2 doesn't work.
- what about plain 2.6.19?
- can you please test some of the 2.6.19-rcX kernels? Especially
2.6.19-rc1 would be good to test.

Since it apparently already happens in 2.6.19 (but it would be really good
to know exactly when it starts), and considering _where_ it happens, I'd
be inclined to blame commit d2e7b7d0: "fix potential stack overflow in
mm/slab.c" by Suresh.

When do_tune_cpucache() is called at bootup, I'm not sure how safe it is
to do the kzalloc() thing.

I've added a number of hopefully appropriate people to the Cc. Guys?
Apparently it only happens with MEMORY_HOTPLUG (and possibly with just an
SMP kernel on UP), which probably explains why it's been around without
people really complaining very loudly.

Linus

2007-01-31 17:11:19

by Josef 'Jeff' Sipek

Subject: [PATCH] x86_64: Fix preprocessor condition

Signed-off-by: Josef 'Jeff' Sipek <[email protected]>
---
include/asm-x86_64/io.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-x86_64/io.h b/include/asm-x86_64/io.h
index 6ee9fad..7d0b568 100644
--- a/include/asm-x86_64/io.h
+++ b/include/asm-x86_64/io.h
@@ -100,7 +100,7 @@ __OUTS(l)

#define IO_SPACE_LIMIT 0xffff

-#if defined(__KERNEL__) && __x86_64__
+#if defined(__KERNEL__) && defined(__x86_64__)

#include <linux/vmalloc.h>

--
1.5.0.rc1.g5355

2007-01-31 17:22:08

by Andi Kleen

Subject: Re: [PATCH] x86_64: Fix preprocessor condition


> -#if defined(__KERNEL__) && __x86_64__
> +#if defined(__KERNEL__) && defined(__x86_64__)

Undefined symbols are replaced with 0, so the old line was already ok.

-Andi
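
The point about undefined symbols can be checked with a small sketch. The macros `DEMO_KERNEL` and `DEMO_ARCH` are made-up stand-ins for `__KERNEL__` and `__x86_64__`, chosen so the result does not depend on the build machine; `DEMO_ARCH` is deliberately never defined.

```c
#include <assert.h>

#define DEMO_KERNEL 1   /* hypothetical stand-in for __KERNEL__ */
/* DEMO_ARCH (stand-in for __x86_64__) is intentionally left undefined. */

/* Old form: an identifier that is not a macro evaluates as 0 inside #if. */
#if defined(DEMO_KERNEL) && DEMO_ARCH
#define OLD_FORM_TAKEN 1
#else
#define OLD_FORM_TAKEN 0
#endif

/* New form: ask explicitly whether the macro is defined. */
#if defined(DEMO_KERNEL) && defined(DEMO_ARCH)
#define NEW_FORM_TAKEN 1
#else
#define NEW_FORM_TAKEN 0
#endif

int old_form_taken(void) { return OLD_FORM_TAKEN; }
int new_form_taken(void) { return NEW_FORM_TAKEN; }

/* Both spellings select the same branch when the macro is undefined. */
void check_forms_agree(void)
{
    assert(old_form_taken() == new_form_taken());
}
```

The two forms would only differ for a macro that is defined as 0 or defined with an empty body, and a compiler run with GCC's `-Wundef` option will warn about the old form when the macro is undefined.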

2007-01-31 17:40:35

by Josef Sipek

Subject: Re: [PATCH] x86_64: Fix preprocessor condition

On Wed, Jan 31, 2007 at 06:16:02PM +0100, Andi Kleen wrote:
>
> > -#if defined(__KERNEL__) && __x86_64__
> > +#if defined(__KERNEL__) && defined(__x86_64__)
>
> Undefined symbols are replaced with 0, so the old line was already ok.

Fair enough, however sparse is not very happy about undefined symbols.

Jeff.

--
Reality is merely an illusion, albeit a very persistent one.
- Albert Einstein

2007-01-31 18:15:27

by Sunil Naidu

Subject: Re: Linux 2.6.20-rc7

On 1/31/07, Linus Torvalds <[email protected]> wrote:
>
> It's in good enough shape that I'd probably have been happy to just
> release it as 2.6.20, but since I want 2.6.20 to be a stability release, I
> didn't want to risk any stupid bugs while the regressions got fixed, so
> here's a final -rc7.

It's a clean boot on my P4/HT, and the dmesg output makes more sense (to me)!
Shall test more...


Linux version 2.6.20-rc7-Akula-II (root@Typhoon) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #1 SMP Wed Jan 31 23:03:36 IST 2007
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000e6000 size: 000000000001a000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 000000001f62f800 end: 000000001f72f800 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000001f72f800 size: 0000000000000800 end: 000000001f730000 type: 4
copy_e820_map() start: 000000001f730000 size: 0000000000010000 end: 000000001f740000 type: 3
copy_e820_map() start: 000000001f740000 size: 00000000000b0000 end: 000000001f7f0000 type: 4
copy_e820_map() start: 000000001f7f0000 size: 0000000000010000 end: 000000001f800000 type: 2
copy_e820_map() start: 00000000e0000000 size: 0000000010000000 end: 00000000f0000000 type: 2
copy_e820_map() start: 00000000fed13000 size: 0000000000007000 end: 00000000fed1a000 type: 2
copy_e820_map() start: 00000000fed1c000 size: 0000000000084000 end: 00000000feda0000 type: 2
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e6000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001f72f800 (usable)
BIOS-e820: 000000001f72f800 - 000000001f730000 (ACPI NVS)
BIOS-e820: 000000001f730000 - 000000001f740000 (ACPI data)
BIOS-e820: 000000001f740000 - 000000001f7f0000 (ACPI NVS)
BIOS-e820: 000000001f7f0000 - 000000001f800000 (reserved)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fed13000 - 00000000fed1a000 (reserved)
BIOS-e820: 00000000fed1c000 - 00000000feda0000 (reserved)
503MB LOWMEM available.
found SMP MP-table at 000ff780
Entering add_active_range(0, 0, 128815) 0 entries of 256 used
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 128815
early_node_map[1] active PFN ranges
0: 0 -> 128815
On node 0 totalpages: 128815
DMA zone: 32 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 4064 pages, LIFO batch:0
Normal zone: 974 pages used for memmap
Normal zone: 123745 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP (v000 ACPIAM ) @ 0x000f4eb0
ACPI: RSDT (v001 INTEL D915GAV 0x20060222 MSFT 0x00000097) @ 0x1f730000
ACPI: FADT (v002 INTEL D915GAV 0x20060222 MSFT 0x00000097) @ 0x1f730200
ACPI: MADT (v001 INTEL D915GAV 0x20060222 MSFT 0x00000097) @ 0x1f730390
ACPI: MCFG (v001 INTEL D915GAV 0x20060222 MSFT 0x00000097) @ 0x1f730400
ACPI: ASF! (v016 LEGEND I865PASF 0x00000001 INTL 0x02002026) @ 0x1f736050
ACPI: TCPA (v001 INTEL TBLOEMID 0x00000001 MSFT 0x00000097) @ 0x1f7360f0
ACPI: WDDT (v001 INTEL OEMWDDT 0x00000001 INTL 0x02002026) @ 0x1f736122
ACPI: DSDT (v001 INTEL D915GAV 0x00000001 INTL 0x02002026) @ 0x00000000
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:4 APIC version 20
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at 20000000 (gap: 1f800000:c0800000)
Detected 3000.282 MHz processor.
Built 1 zonelists. Total pages: 127809
Kernel command line: ro root=LABEL=/1 rhgb quiet
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c0454000 soft=c0452000
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 504924k/515260k available (2060k kernel code, 9812k reserved, 1055k data, 244k init, 0k highmem)
virtual kernel memory layout:
fixmap : 0xfffb7000 - 0xfffff000 ( 288 kB)
vmalloc : 0xe0000000 - 0xfffb5000 ( 511 MB)
lowmem : 0xc0000000 - 0xdf72f000 ( 503 MB)
.init : 0xc0410000 - 0xc044d000 ( 244 kB)
.data : 0xc030320b - 0xc040b0f4 (1055 kB)
.text : 0xc0100000 - 0xc030320b (2060 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 6003.22 BogoMIPS (lpj=3001611)
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 00100000 00000000 00000000 0000441d 00000000 00000000
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00100000 00000000 00003180 0000441d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 11k freed
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 01
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c0455000 soft=c0453000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5999.31 BogoMIPS (lpj=2999656)
CPU: After generic identify, caps: bfebfbff 00100000 00000000 00000000 0000441d 00000000 00000000
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: After all inits, caps: bfebfbff 00100000 00000000 00003180 0000441d 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 01
Total of 2 processors activated (12002.53 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=5
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using MMCONFIG
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:00:02.0
PCI quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
PCI quirk: region 0500-053f claimed by ICH6 GPIO
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEGP._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.P0P2._PRT]
ACPI: Power Resource [URP1] (off)
ACPI: Power Resource [FDDP] (off)
ACPI: Power Resource [LPTP] (off)
ACPI: Power Resource [URP2] (off)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PEX4._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs *3 4 5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 13 devices
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
pnp: 00:0b: ioport range 0x400-0x47f could not be reserved
pnp: 00:0b: ioport range 0x680-0x6ff has been reserved
pnp: 00:0b: ioport range 0x500-0x53f has been reserved
PCI: Ignore bogus resource 6 [0:0] of 0000:00:02.0
PCI: Bridge: 0000:00:01.0
IO window: disabled.
MEM window: ffa00000-ffafffff
PREFETCH window: cff00000-cfffffff
PCI: Bridge: 0000:00:1c.0
IO window: disabled.
MEM window: ff600000-ff6fffff
PREFETCH window: cfb00000-cfbfffff
PCI: Bridge: 0000:00:1c.1
IO window: disabled.
MEM window: ff700000-ff7fffff
PREFETCH window: cfc00000-cfcfffff
PCI: Bridge: 0000:00:1c.2
IO window: disabled.
MEM window: ff800000-ff8fffff
PREFETCH window: cfd00000-cfdfffff
PCI: Bridge: 0000:00:1c.3
IO window: disabled.
MEM window: ff900000-ff9fffff
PREFETCH window: cfe00000-cfefffff
PCI: Bridge: 0000:00:1e.0
IO window: b000-bfff
MEM window: ff500000-ff5fffff
PREFETCH window: cfa00000-cfafffff
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:01.0 to 64
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1c.0 to 64
ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1c.1 to 64
ACPI: PCI Interrupt 0000:00:1c.2[C] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1c.2 to 64
ACPI: PCI Interrupt 0000:00:1c.3[D] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1c.3 to 64
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 6, 327680 bytes)
TCP bind hash table entries: 8192 (order: 5, 163840 bytes)
TCP: Hash tables configured (established 16384 bind 8192)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 1484k freed
audit: initializing netlink socket (disabled)
audit(1170285594.730:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
SELinux: Registering netfilter hooks
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
PCI: Setting latency timer of device 0000:00:01.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:01.0:pcie00]
Allocate Port Service[0000:00:01.0:pcie03]
PCI: Setting latency timer of device 0000:00:1c.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.0:pcie00]
Allocate Port Service[0000:00:1c.0:pcie02]
Allocate Port Service[0000:00:1c.0:pcie03]
PCI: Setting latency timer of device 0000:00:1c.1 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.1:pcie00]
Allocate Port Service[0000:00:1c.1:pcie02]
Allocate Port Service[0000:00:1c.1:pcie03]
PCI: Setting latency timer of device 0000:00:1c.2 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.2:pcie00]
Allocate Port Service[0000:00:1c.2:pcie02]
Allocate Port Service[0000:00:1c.2:pcie03]
PCI: Setting latency timer of device 0000:00:1c.3 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.3:pcie00]
Allocate Port Service[0000:00:1c.3:pcie02]
Allocate Port Service[0000:00:1c.3:pcie03]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
ACPI: Processor [CPU1] (supports 8 throttling states)
ACPI: Processor [CPU2] (supports 8 throttling states)
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected an Intel 915G Chipset.
agpgart: Detected 7932K stolen memory.
agpgart: AGP aperture is 256M @ 0xd0000000
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH6: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
ICH6: chipset revision 3
ICH6: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: HL-DT-ST DVDRAM GSA-4163B, ATAPI CD/DVD-ROM drive
hdb: HL-DT-ST GCE-8527B, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
Probing IDE interface ide1...
usbcore: registered new interface driver libusual
usbcore: registered new interface driver hiddev
usbcore: registered new interface driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1
PNP: PS/2 controller doesn't have AUX irq; using default 12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard as /class/input/input0
TCP bic registered
Initializing XFRM netlink socket
NET: Registered protocol family 1
NET: Registered protocol family 17
Starting balanced_irq
Using IPI Shortcut mode
Freeing unused kernel memory: 244k freed
Write protecting the kernel read-only data: 800k
Time: tsc clocksource has been installed.
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 23 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:1d.0: irq 20, io base 0x0000c800
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.1: irq 19, io base 0x0000cc00
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.2: irq 18, io base 0x0000d000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.3: irq 16, io base 0x0000d400
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
usb 2-1: new low speed USB device using uhci_hcd and address 2
ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 23 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 5
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 20, io mem 0xff43bc00
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 8 ports detected
SCSI subsystem initialized
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00ac7
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xE800 ctl 0xE402 bmdma 0xD800 irq 19
ata2: SATA max UDMA/133 cmd 0xE000 ctl 0xDC02 bmdma 0xD808 irq 19
scsi0 : ata_piix
ATA: abnormal status 0x7F on port 0xE807
scsi1 : ata_piix
ata2.00: ATA-6, max UDMA/133, 312581808 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi 1:0:0:0: Direct-Access ATA ST3160827AS 3.42 PQ: 0 ANSI: 5
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 312581808 512-byte hdwr sectors (160042 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: write cache: enabled, read cache: enabled, doesn't
support DPO or FUA
sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8<6>usb 2-1: new low speed
USB device using uhci_hcd and address 3
sda9 >
sda2: <solaris: [s0] sda10 [s1] sda11 [s2] sda12 [s3] sda13 [s4]
sda14 [s5] sda15 >
sd 1:0:0:0: Attached scsi disk sda
usb 2-1: configuration #1 chosen from 1 choice
input: Microsoft Basic Optical Mouse as /class/input/input1
input: USB HID v1.10 Mouse [Microsoft Basic Optical Mouse] on usb-0000:00:1d.1-1
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
audit(1170285599.902:2): selinux=0 auid=4294967295
hda: ATAPI 40X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
hdb: ATAPI 52X CD-ROM CD-R/RW drive, 1536kB Cache, UDMA(33)
input: PC Speaker as /class/input/input2
sd 1:0:0:0: Attached scsi generic sg0 type 0
fealnx.c:v2.52 Sep-11-2006
ACPI: PCI Interrupt 0000:06:00.0[A] -> GSI 21 (level, low) -> IRQ 21
eth0: 100/10M Ethernet PCI Adapter at 0001b800, 00:a1:b0:10:88:1a, IRQ 21.
iTCO_wdt: Intel TCO WatchDog Timer Driver v1.01 (11-Nov-2006)
iTCO_wdt: Found a ICH6 or ICH6R TCO device (Version=2, TCOBASE=0x0460)
iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)
intel_rng: Firmware space is locked read-only. If you can't or
intel_rng: don't want to disable this in firmware setup, and if
intel_rng: you are certain that your system has a functional
intel_rng: RNG, try using the 'no_fwh_detect' option.
ACPI: PCI Interrupt 0000:00:1f.3[B] -> GSI 19 (level, low) -> IRQ 19
ACPI: PCI Interrupt 0000:00:1b.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1b.0 to 64
floppy0: no floppy controllers found
parport: PnPBIOS parport detected.
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
lp0: using parport0 (interrupt-driven).
lp0: console ready
input: Power Button (FF) as /class/input/input3
ACPI: Power Button (FF) [PWRF]
input: Power Button (CM) as /class/input/input4
ACPI: Power Button (CM) [PWRB]
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: [email protected]
device-mapper: multipath: version 1.0.5 loaded
EXT3 FS on sda6, internal journal
kjournald starting. Commit interval 5 seconds
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on sda7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
fuse init (API version 7.8)
Adding 1052216k swap on /dev/sda5. Priority:-1 extents:1 across:1052216k
IA-32 Microcode Update Driver: v1.14a <[email protected]>
audit(1170265826.105:3): audit_pid=2038 old=0 by auid=4294967295
[drm] Initialized drm 1.1.0 20060810
ACPI: PCI Interrupt 0000:00:02.0[A] -> GSI 16 (level, low) -> IRQ 16
[drm] Initialized i915 1.6.0 20060119 on minor 0


~Akula2

2007-01-31 23:17:50

by Andrew Morton

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

On Wed, 31 Jan 2007 08:04:29 -0800 (PST)
Linus Torvalds <[email protected]> wrote:

>
>
> On Wed, 31 Jan 2007, Paweł Sikora wrote:
> >
> > The 2.6.20-rcX have the same nasty bug as 2.6.19.x.
> >
> > [ an oops inside kmem_get_pages ]
> > http://bugzilla.kernel.org/show_bug.cgi?id=7889
>
> Pawel, can you detail more exactly which kernels don't work, and which do?
>
> From bugzilla:
>
> - 2.6.18.x does work
> - 2.6.19.2 doesn't work.
> - what about plain 2.6.19?
> - can you please test some of the 2.6.19-rcX kernels? Especially
> 2.6.19-rc1 would be good to test.
>
> Since it apparently already happens in 2.6.19 (but it would be really good
> to know exactly when it starts), and considering _where_ it happens, I'd
> be inclined to blame commit d2e7b7d0: "fix potential stack overflow in
> mm/slab.c" by Suresh.
>
> When do_tune_cpucache() is called at bootup, I'm not sure how safe it is
> to do the kzalloc() thing.
>
> I've added a number of hopefully appropriate people to the Cc. Guys?
> Apparently it only happens with MEMORY_HOTPLUG (and possibly with just an
> SMP kernel on UP), which probably explains why it's been around without
> people really complaining very loudly.
>

I discussed this with Yasunori Goto <[email protected]> (memory
hot-add developer):

"But this config uses CONFIG_MEMORY_HOTPLUG_RESERVE which is made by
Andi Kleen-san and I don't know very well around here. And, I couldn't
reproduce this trouble on my box."

I cannot reproduce it with Pawel's config either.

2007-02-01 00:17:56

by Andrew Morton

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

On Thu, 1 Feb 2007 00:37:48 +0100
Paweł Sikora <[email protected]> wrote:

> On Wednesday 31 of January 2007 17:04:29 Linus Torvalds wrote:
> > On Wed, 31 Jan 2007, Paweł Sikora wrote:
> > > The 2.6.20-rcX have the same nasty bug as 2.6.19.x.
> > >
> > > [ an oops inside kmem_get_pages ]
> > > http://bugzilla.kernel.org/show_bug.cgi?id=7889
> >
> > Pawel, can you detail more exactly which kernels don't work, and which do?
>
> 2.6.18 works, 2.6.19-rc1 doesn't work.
> git bisect found this bad commit:
>
> commit e80ee884ae0e3794ef2b65a18a767d502ad712ee
> Author: Nick Piggin <[email protected]>
> Date: Wed Oct 4 02:15:23 2006 -0700
>
> [PATCH] mm: micro optimise zone_watermark_ok
>
> Having min be a signed quantity means gcc can't turn high latency divides
> into shifts. There happen to be two such divides for GFP_ATOMIC (ie.
> networking, ie. important) allocations, one of which depends on the
> other.
> Fixing this makes code smaller as a bonus.
>
> Shame on somebody (probably me).
>
> Signed-off-by: Nick Piggin <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> Signed-off-by: Linus Torvalds <[email protected]>
>
> ------------------------- mm/page_alloc.c -----------------------
> @@ -900,7 +900,8 @@ int zone_watermark_ok(struct zone *z, int order, unsigned
> long mark,
> int classzone_idx, int alloc_flags)
> {
> /* free_pages my go negative - that's OK */
> - long min = mark, free_pages = z->free_pages - (1 << order) + 1;
> + unsigned long min = mark;
> + long free_pages = z->free_pages - (1 << order) + 1;
> int o;
>
> if (alloc_flags & ALLOC_HIGH)
>
>
> > Apparently it only happens with MEMORY_HOTPLUG (and possibly with just an
> > SMP kernel on UP), which probably explains why it's been around without
> > people really complaining very loudly.
>
> reverting mentioned commit removes the oops.
>

urgh. zone->free_pages is very small - probably zero. We shouldn't have
got here at all, so something else is wrong.

But local `free_pages' can go negative in normal operation. I guess
that'll cause us to incorrectly return `true' from zone_watermark_ok, thus
ignoring the watermarks.

The below, I guess. But we still don't know why this got called against an
empty zone.




Subject: zone_watermark_ok: signedness fix
From: Andrew Morton <[email protected]>

Local `free_pages' can go negative in normal operation. I guess that'll cause
us to incorrectly return `true' from zone_watermark_ok, thus ignoring the
watermarks.

Cc: Christoph Lameter <[email protected]>
Cc: Nick Piggin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

 mm/page_alloc.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/page_alloc.c~zone_watermark_ok-signedness-fix mm/page_alloc.c
--- a/mm/page_alloc.c~zone_watermark_ok-signedness-fix
+++ a/mm/page_alloc.c
@@ -1013,7 +1013,7 @@ int zone_watermark_ok(struct zone *z, in
 		      int classzone_idx, int alloc_flags)
 {
 	/* free_pages my go negative - that's OK */
-	unsigned long min = mark;
+	long min = mark;
 	long free_pages = zone_page_state(z, NR_FREE_PAGES)
 				- (1 << order) + 1;
 	int o;
_

2007-02-01 00:22:58

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Thu, 1 Feb 2007, Paweł Sikora wrote:
>
> 2.6.18 works, 2.6.19-rc1 doesn't work.
> git bisect found this bad commit:

Git bisect rocks.

I'll give myself yet another pat on the back for writing it. You can never
encourage genius like that too much.

> commit e80ee884ae0e3794ef2b65a18a767d502ad712ee
> Author: Nick Piggin <[email protected]>
> Date: Wed Oct 4 02:15:23 2006 -0700
>
> [PATCH] mm: micro optimise zone_watermark_ok
>
> reverting mentioned commit removes the oops.

Ok, that commit is just totally broken.

If "free_pages" turns negative (which it can, since it's just doing a

	long free_pages = z->free_pages - (1 << order) + 1;

to initialize it, and for all we know, you have an empty or close-to-empty
zone or two), the whole test to do

	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
		return 0;

gets broken, because the negative 'free_pages' will look like a huge
unsigned positive number (and we'll make it unsigned because 'min' got
turned unsigned). There was a reason that thing was signed in the first
place, and neither me nor Andrew noticed.

Bad Nick. And bad me and Andrew for not noticing.

I should either revert that commit or just check for "free_pages" being
negative. The latter, in many ways, is probably better, because generally
we simply should never work with negative numbers in the kernel, so when
something potentially goes negative, we're probably just better off always
testing it explicitly anyway.

Nick, Andrew, any preferences?

Linus
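
The comparison problem described in this message can be reproduced in a few
lines of user-space C. This is only a sketch under assumptions: the two
functions are hypothetical stand-ins for the test before and after 'min' was
made unsigned, not the kernel's zone_watermark_ok(), and the lowmem reserve is
passed in as a plain signed number.

```c
/* Hypothetical stand-in for the test with a signed 'min': the whole
 * comparison is done in signed arithmetic, so a negative free_pages
 * stays negative and fails the watermark check. */
static int watermark_ok_signed(long free_pages, long mark, long reserve)
{
	long min = mark;

	if (free_pages <= min + reserve)
		return 0;
	return 1;
}

/* Hypothetical stand-in for the test after 'min' became unsigned: the
 * usual arithmetic conversions turn both 'reserve' and 'free_pages'
 * into unsigned long, so a negative free_pages compares as a huge
 * positive value and the check wrongly passes. */
static int watermark_ok_unsigned(long free_pages, unsigned long mark,
				 long reserve)
{
	unsigned long min = mark;

	if (free_pages <= min + reserve)
		return 0;
	return 1;
}
```

With free_pages = -1, mark = 128 and reserve = 0, the signed variant returns 0
(watermark not met) while the unsigned variant returns 1, which is the
"ignoring the watermarks" behaviour discussed in the thread.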

2007-02-01 00:30:17

by Andrew Morton

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

On Wed, 31 Jan 2007 16:19:06 -0800 (PST)
Linus Torvalds <[email protected]> wrote:

> I should either revert that commit or just check for "free_pages" being
> negative. The latter, in many ways, is probably better, because generally
> we simply should never work with negative numbers in the kernel, so when
> something potentially goes negative, we're probably just better off always
> testing it explicitly anyway.
>
> Nick, Andrew, any preferences?

It would be cleaner to check for negativity, but note that we keep
subtracting stuff from free_pages in the later loop, so we'd need to check
there as well.

2007-02-01 00:43:35

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Wed, 31 Jan 2007, Andrew Morton wrote:
>
> It would be cleaner to check for negativity, but note that we keep
> subtracting stuff from free_pages in the later loop, so we'd need to check
> there as well.

Yeah, not worth it. I'll just revert it.

If we really want to do the micro-optimization that Nick was after, we can
just do

// 'min' is always positive
min = (unsigned long) min >> 1;

or something.

Linus

2007-02-01 00:44:53

by Nick Piggin

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Linus Torvalds wrote:

> if (free_pages <= min + z->lowmem_reserve[classzone_idx])
> return 0;
>
> gets broken, because the negative 'free_pages' will look like a huge
> unsigned positive number (and we'll make it unsigned because 'min' got
> turned unsigned). There was a reason that thing was signed in the first
> place, and neither me nor Andrew noticed.
>
> Bad Nick. And bad me and Andrew for not noticing.

Sorry. I think I even wrote that comment at the top of the function.
And probably the function as well :(

> I should either revert that commit or just check for "free_pages" being
> negative. The latter, in many ways, is probably better, because generally
> we simply should never work with negative numbers in the kernel, so when
> something potentially goes negative, we're probably just better off always
> testing it explicitly anyway.
>
> Nick, Andrew, any preferences?

As Andrew says, it would need to be checked each time, because we have
nothing synchronising against free_pages at the top, or nr_free in the
loop.

We could make them both unsigned, and _add_ everything to min rather than
subtracting from free_pages?
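
A rough sketch of that idea, for illustration only. It assumes a simplified
model of the function: `struct toy_zone`, `TOY_MAX_ORDER`, `ok_signed()` and
`ok_unsigned()` are hypothetical names, and the ALLOC_HIGH/ALLOC_HARDER
adjustments and the lowmem reserve are left out. The unsigned variant keeps
everything that used to be subtracted from free_pages in a separate
accumulator that is added to the right-hand side, so nothing can go negative.

```c
#define TOY_MAX_ORDER 4

/* Hypothetical, simplified zone: total free pages plus the number of
 * free blocks at each order. */
struct toy_zone {
	unsigned long free_pages;
	unsigned long nr_free[TOY_MAX_ORDER];
};

/* Signed reference version: free_pages is allowed to go negative. */
static int ok_signed(const struct toy_zone *z, int order, unsigned long mark)
{
	long min = (long)mark;
	long free_pages = (long)z->free_pages - (1L << order) + 1;
	int o;

	if (free_pages <= min)
		return 0;
	for (o = 0; o < order; o++) {
		/* pages of this order are unusable for the next order */
		free_pages -= (long)(z->nr_free[o] << o);
		min >>= 1;
		if (free_pages <= min)
			return 0;
	}
	return 1;
}

/* Unsigned variant: instead of subtracting from free_pages, add the
 * same amounts to a separate accumulator on the 'min' side.
 *   free - 2^order + 1 - removed <= min
 * becomes
 *   free + 1 <= min + 2^order + removed                              */
static int ok_unsigned(const struct toy_zone *z, int order, unsigned long mark)
{
	unsigned long min = mark;
	unsigned long have = z->free_pages + 1;
	unsigned long correction = 1UL << order;
	int o;

	if (have <= min + correction)
		return 0;
	for (o = 0; o < order; o++) {
		correction += z->nr_free[o] << o;
		min >>= 1;	/* only 'min' is halved, not the correction */
		if (have <= min + correction)
			return 0;
	}
	return 1;
}
```

The accumulator has to stay separate from 'min' because 'min' is halved on
every iteration while the subtracted page counts are not.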

--
SUSE Labs, Novell Inc.

2007-02-01 00:58:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Thu, 1 Feb 2007, Nick Piggin wrote:
>
> We could make them both unsigned, and _add_ everything to min rather than
> subtracting from free_pages?

Yeah, that's the right thing to do, probably. However, since we do that
"min >>=1" thing, we'd have to do that to a separate "correction" factor
(also unsigned).

Not worth it for 2.6.20. I just reverted it. But feel free to send the
fixed/updated micro-optimization after 2.6.20 has been released.

Linus

2007-02-01 02:19:00

by S.Çağlar Onur

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Hi;

On Wed, 31 Jan 2007, you wrote:
> In other words, please do give it a good testing. We should have fixed the
> nasty stuff on Adrian's list (and here's another thanks to Adrian for
> keeping me on my toes!) and it's all good. But please give it a quick
> shake-down to make sure that nothing silly happened while fixing the bad
> stuff.

For me, both 2.6.20-rc6 and 2.6.20-rc7 fail while booting with an initramfs
(one that uses busybox), with a "request_module: runaway loop modprobe
binfmt-0000" error.

So I modified kmod.c with [1] to get a dump_stack; [2] is a picture taken
with the dump_stack-enabled kernel, described below.

And here are some observations that I made:

* Booting without an initramfs works,
* Booting with an initrd converted from the failing initramfs's content works,
* Booting with the "Hello world" type initramfs described
in "Documentation/filesystems/ramfs-rootfs-initramfs.txt" works.

What is not working is the busybox binaries inside the _initramfs_. Modifying
our init shell script to use a statically linked bash confirms that: the
script is interpreted without a problem by the statically linked bash until
busybox's mount binary is called inside the initramfs.

At first I thought busybox was somehow broken, but copying the initramfs's
content into an initrd works without a problem, and the same binaries also
work with 2.6.18, so I think it's a regression in the kernel, but of course
I'm not sure :). If you need more information, please ask.

[1] http://www.mail-archive.com/[email protected]/msg109266.html
[2] http://cekirdek.uludag.org.tr/~caglar/error.png

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-01 02:37:05

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Thu, 1 Feb 2007, S.Çağlar Onur wrote:
>
> For me, both 2.6.20-rc6 and 2.6.20-rc7 fail while booting with an initramfs
> (one that uses busybox), with a "request_module: runaway loop modprobe
> binfmt-0000" error.

That _usually_ just means that /sbin/modprobe is corrupt, or compiled with
a binfmt that itself needs a module to load.

Are you 100% certain that you didn't just happen to put an /sbin/modprobe
into your initramfs that happens to be a.out, with a.out being modular? Or
something similarly silly?

Actually, with that "binfmt-0000", I guess the most likely thing is a
corrupt /sbin/modprobe that isn't a valid binfmt format at all (but the
kernel won't know the difference between a missing binfmt thing and an
invalid one). It has bytes 2/3 being zero, which is neither ELF nor
a.out, methinks.

BUT! If that's not it, doing a "git bisect" to figure out exactly what
triggered it would be a wonderful idea..

Linus
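
To make the "bytes 2/3" remark concrete, here is a small user-space sketch of
how such a name could be derived from the start of a file. It rests on an
assumption that is not confirmed anywhere in this thread: that the alias is
"binfmt-" followed by the 16-bit value at offset 2, read as on a little-endian
machine and printed as four hex digits. `binfmt_alias()` is a hypothetical
helper, not a kernel function.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format "binfmt-XXXX" from bytes 2 and 3 of the
 * file header. The little-endian byte order is an assumption made for
 * this sketch. 'head' must point to at least 4 bytes. */
static void binfmt_alias(const unsigned char *head, char *out, size_t outlen)
{
	unsigned int magic = head[2] | ((unsigned int)head[3] << 8);

	snprintf(out, outlen, "binfmt-%04x", magic);
}
```

Under that assumption, an ELF file (which starts with 7f 45 4c 46) gives
"binfmt-464c", while a file whose bytes 2 and 3 read as zero gives the
"binfmt-0000" seen in the report.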

2007-02-01 03:03:11

by S.Çağlar Onur

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

On Thu, 01 Feb 2007, Linus Torvalds wrote:
> That _usually_ just means that /sbin/modprobe is corrupt, or compiled with
> a binfmt that itself needs a module to load.
>
> Are you 100% certain that you didn't just happen to put an /sbin/modprobe
> into your initramfs that happens to be a.out, with a.out being modular? Or
> something similarly silly?

Yep, I'm sure the config and the binary are OK.

zangetsu ~ # readelf -h /sbin/busybox
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x8048130
Start of program headers: 52 (bytes into file)
Start of section headers: 1010608 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 5
Size of section headers: 40 (bytes)
Number of section headers: 25
Section header string table index: 24

zangetsu ~ # zcat /proc/config.gz | grep BINFMT
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y

And as I said before, the same binaries work within an initrd image with the same kernel.

> Actually, with that "binfmt-0000", I guess the most likely thing is a
> corrupt /sbin/modprobe that isn't a valid binfmt format at all (but the
> kernel won't know the difference between a missing binfmt thing and an
> invalid one). It has bytes 2/3 being zero, which is neither ELF nor
> a.out, methinks.
>
> BUT! If that's not it, doing a "git bisect" to figure out exactly what
> triggered it would be a wonderful idea..

I think I found the cause of the problem: initramfs can't handle hardlinks
anymore (which worked with 2.6.18). Copying the same /sbin/busybox binary
under different names into the initramfs (which ends up as a 50 MB image),
or using symbolic links instead of hard ones, seems to work.

--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-01 03:32:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Thu, 1 Feb 2007, S.Çağlar Onur wrote:
>
> I think I found the cause of the problem: initramfs can't handle hardlinks
> anymore (which worked with 2.6.18). Copying the same /sbin/busybox binary
> under different names into the initramfs (which ends up as a 50 MB image),
> or using symbolic links instead of hard ones, seems to work.

Ok, it would still be interesting to hear what triggered this.

There's really only two initramfs-related changes in the whole 2.6.20
development series that I can see, and neither of them are even _remotely_
likely to cause this, afaik. We have:

- commit 8d610dd52dd1da696e199e4b4545f33a2a5de5c6

"Make sure we populate the initroot filesystem late enough"

(but "late enough" is still _way_ before we actually mount it, so ..) and

- commit 2e591bbc0d563e12f5a260fbbca0df7d5810910e

"Make initramfs printk a warning on incorrect cpio type"

and the latter is just a build-time sanity-check.

Since 2.6.18, there's been a couple more, but none of them look even
remotely likely either. Which is why it really would be good if you can
"git bisect" this behaviour.

There _was_ a hardlink-related thing some time ago, but that was back in
May, and before 2.6.18. So if it worked for you in 2.6.18, I don't see
that being it.

I've added some initramfs people to the Cc: in case somebody has better
ideas, but in the absence of that, it really would be good to do that
bisection. Also, you might want to double-check that your cpio actually
generates a good initramfs image (i.e. that it's not some user-mode upgrade
that caused it to stop working).

Linus

2007-02-01 05:45:33

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Linus Torvalds wrote:
>
> On Thu, 1 Feb 2007, S.Çağlar Onur wrote:
>> I think I found the cause of the problem: initramfs can't handle hardlinks
>> anymore (which worked with 2.6.18). Copying the same /sbin/busybox binary
>> under different names into the initramfs (which ends up as a 50 MB image),
>> or using symbolic links instead of hard ones, seems to work.
>
> Ok, it would still be interesting to hear what triggered this.
>

It would be interesting to know what the inode numbers are in the image;
also, what is the exact behaviour -- do you end up with a missing link,
or do both entries end up getting hard-linked to an empty file?

-hpa

2007-02-01 06:00:44

by Linus Torvalds

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7



On Wed, 31 Jan 2007, H. Peter Anvin wrote:
>
> It would be interesting to know what the inode numbers are in the image; also,
> what is the exact behaviour -- do you end up with a missing link, or do both
> entries end up getting hard-linked to an empty file?

Judging by the

request_module: runaway loop modprobe binfmt-0000

one or more of the hardlinked binaries (modprobe being one, but not
necessarily the one that initially triggers this) will read all zeroes.

Or at least bytes at offsets 2 and 3 will read as zero, causing it to not
be recognized as a proper binary, causing that "binfmt-0000" thing.

Linus

2007-02-01 06:11:08

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Linus Torvalds wrote:
>
> On Wed, 31 Jan 2007, H. Peter Anvin wrote:
>> It would be interesting to know what the inode numbers are in the image; also,
>> what is the exact behaviour -- do you end up with a missing link, or do both
>> entries end up getting hard-linked to an empty file?
>
> Judging by the
>
> request_module: runaway loop modprobe binfmt-0000
>
> one or more of the hardlinked binaries (modprobe being one, but not
> necessarily the one that initially triggers this) will read all zeroes.
>
> Or at least bytes at offsets 2 and 3 will read as zero, causing it to not
> be recognized as a proper binary, causing that "binfmt-0000" thing.
>

Or perhaps not read at all, which would explain the problem.

cpio represents a hard link as two headers with the same type and the
same file (inode) number and a link count that is > 1. Only the first
one contains data; the subsequent ones have length 0. It's fairly easy
for a bug in the decoder to truncate the file upon encountering the
second header, since this is somewhat of a special case (it would have
been better if the cpio format distinguished "hard link" explicitly, as
tar does.)

I will look into this as soon as I can, but as I'm currently in the
middle of job hunting it might take until the weekend.

-hpa
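
The special case described in this message can be illustrated with a toy
decoder. This is a sketch under stated assumptions, not the kernel's initramfs
code: the archive is modelled as an in-memory array of already-parsed headers,
and `struct entry`, `struct toy_fs`, `unpack()` and `lookup()` are all
hypothetical names. The point is only that a repeated inode number with a
link count above 1 and no data should add a name to the existing file rather
than create or truncate one.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 16

/* One already-parsed archive header (hypothetical, simplified). */
struct entry {
	const char *name;
	unsigned long ino;
	unsigned int nlink;
	const char *data;	/* NULL when the header carries no data */
	size_t len;
};

struct toy_file {
	unsigned long ino;
	const char *data;
	size_t len;
};

struct toy_name {
	const char *name;
	int file;		/* index into toy_fs.files */
};

struct toy_fs {
	struct toy_name names[MAX_NAMES];
	struct toy_file files[MAX_NAMES];
	int nnames, nfiles;
};

static int find_ino(const struct toy_fs *fs, unsigned long ino)
{
	int i;

	for (i = 0; i < fs->nfiles; i++)
		if (fs->files[i].ino == ino)
			return i;
	return -1;
}

/* Unpack n entries into a zero-initialised toy_fs; -1 if it is full. */
static int unpack(struct toy_fs *fs, const struct entry *e, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		int f = -1;

		if (fs->nnames == MAX_NAMES)
			return -1;
		if (e[i].nlink > 1)
			f = find_ino(fs, e[i].ino);
		if (f < 0) {
			/* first time we see this inode: create the file */
			f = fs->nfiles++;
			fs->files[f].ino = e[i].ino;
			fs->files[f].data = e[i].data;
			fs->files[f].len = e[i].len;
		} else if (e[i].len > 0) {
			/* later header that does carry data: take it */
			fs->files[f].data = e[i].data;
			fs->files[f].len = e[i].len;
		}
		/* a zero-length repeat must NOT truncate the file; it
		 * only adds another name for it */
		fs->names[fs->nnames].name = e[i].name;
		fs->names[fs->nnames].file = f;
		fs->nnames++;
	}
	return 0;
}

static const struct toy_file *lookup(const struct toy_fs *fs, const char *name)
{
	int i;

	for (i = 0; i < fs->nnames; i++)
		if (strcmp(fs->names[i].name, name) == 0)
			return &fs->files[fs->names[i].file];
	return NULL;
}
```

A decoder that instead treated the second, zero-length header as a new file
would leave that name pointing at empty content, which would fit the symptom
of a binary whose first bytes read as zero.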

2007-02-01 06:46:11

by Alexander E. Patrakov

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

H. Peter Anvin wrote:
> Linus Torvalds wrote:
>>
>> On Wed, 31 Jan 2007, H. Peter Anvin wrote:
>>> It would be interesting to know what the inode numbers are in the
>>> image; also,
>>> what is the exact behaviour -- do you end up with a missing link, or
>>> do both
>>> entries end up getting hard-linked to an empty file?
>>
>> Judging by the
>>
>> request_module: runaway loop modprobe binfmt-0000
>>
>> one or more of the hardlinked binaries (modprobe being one, but not
>> necessarily the one that initially triggers this) will read all zeroes.
>>
>> Or at least bytes at offsets 2 and 3 will read as zero, causing it to
>> not be recognized as a proper binary, causing that "binfmt-0000" thing.
>>
>
> Or perhaps not read at all, which would explain the problem.
>
> cpio represents a hard link as two headers with the same type and the
> same file (inode) number and a link count that is > 1. Only the first
> one contains data; the subsequent ones have length 0. It's fairly easy
> for a bug in the decoder to truncate the file upon encountering the
> second header, since this is somewhat of a special case (it would have
> been better if the cpio format distinguished "hard link" explicitly, as
> tar does.)
>
> I will look into this as soon as I can, but as I'm currently in the
> middle of job hunting it might take until the weekend.

What's the proper way to make sure that the fix, when it appears, ends up in
my inbox?

--
Alexander E. Patrakov

2007-02-01 06:48:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

On 1/31/07, Linus Torvalds <[email protected]> wrote:
> When do_tune_cpucache() is called at bootup, I'm not sure how safe it is
> to do the kzalloc() thing.

The kzalloc thing is safe as we have already successfully bootstrapped
all kmalloc caches at that point. The per-CPU caches that are replaced
by do_tune_cpucache() point to bootstrap-time array caches there (see
use of struct arraycache_init in setup_cpu_cache).

2007-02-01 06:54:04

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Alexander E. Patrakov wrote:
>
> What's the proper way to make sure that the fix, when it appears, ends
> up in my inbox?
>

Say "please", and give prompt feedback on any test patches that we send you.

-hpa

2007-02-01 20:54:13

by S.Çağlar Onur

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

Hi;

On Thu, 01 Feb 2007, Linus Torvalds wrote:
> I've added some initramfs people to the Cc: in case somebody has better
> ideas, but in the absence of that, it really would be good to do that
> bisection. Also, you might want to double-check that your cpio actually
> generates a good initramfs image (i.e. that it's not some user-mode upgrade
> that caused it to stop working).

I started a bisect at 2.6.19 and ended up at 2.6.20-rc7, and every step
worked without a problem with the problematic initramfs script. So I think
it was a compilation failure or something else; currently 2.6.20-rc7 works
without a problem, so please ignore that one. Sorry for the noise...

Cheers
--
S.Çağlar Onur <[email protected]>
http://cekirdek.pardus.org.tr/~caglar/

Linux is like living in a teepee. No Windows, no Gates and an Apache in house!



2007-02-01 21:07:11

by H. Peter Anvin

[permalink] [raw]
Subject: Re: Linux 2.6.20-rc7

S.Çağlar Onur wrote:
> Hi;
>
> On Thu, 01 Feb 2007, Linus Torvalds wrote:
>> I've added some initramfs people to the Cc: in case somebody has better
>> ideas, but in the absence of that, it really would be good to do that
>> bisection. Also, you might want to double-check that your cpio actually
>> generates a good initramfs image (i.e. that it's not some user-mode upgrade
>> that caused it to stop working).
>
> I started a bisect at 2.6.19 and ended up at 2.6.20-rc7, and every step
> worked without a problem with the problematic initramfs script. So I think
> it was a compilation failure or something else; currently 2.6.20-rc7 works
> without a problem, so please ignore that one. Sorry for the noise...
>

OK, well, if it crops up again, please holler. It could be a
nondeterministic bug.

-hpa

2007-02-02 05:49:13

by Adrian Bunk

[permalink] [raw]
Subject: 2.6.20-rc7: known regressions

This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : NULL pointer dereference at as_move_to_dispatch()
References : http://lkml.org/lkml/2007/1/22/141
Submitter  : Andrew Vasquez <[email protected]>
Status     : unknown


Subject    : pktcdvd doesn't work with libata pata drivers
References : http://bugzilla.kernel.org/show_bug.cgi?id=7810
             http://lkml.org/lkml/2007/1/25/128
             http://bugzilla.kernel.org/show_bug.cgi?id=7910
             http://lkml.org/lkml/2007/1/30/289
Submitter  : Gerhard Dirschl <[email protected]>
             Luca Tettamanti <[email protected]>
Caused-By  : Christoph Hellwig <[email protected]>
             commit 3b00315799d78f76531b71435fbc2643cd71ae4c
             commit 406c9b605cbc45151c03ac9a3f95e9acf050808c
Status     : unknown


Subject    : powerpc64: performance monitor exception
References : http://ozlabs.org/pipermail/linuxppc-dev/2007-January/030045.html
Submitter  : Livio Soares <[email protected]>
Caused-By  : Paul Mackerras <[email protected]>
             commit d04c56f73c30a5e593202ecfcf25ed43d42363a2
Status     : unknown


Subject    : reboot instead of powerdown (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2006/12/25/40
             http://bugzilla.kernel.org/show_bug.cgi?id=7828
Submitter  : Berthold Cogel <[email protected]>
             François Valenduc <[email protected]>
Handled-By : Alan Stern <[email protected]>
Status     : problem is being debugged


Subject    : usb somehow broken (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2007/1/11/146
Submitter  : Prakash Punnoor <[email protected]>
Handled-By : Oliver Neukum <[email protected]>
             Alan Stern <[email protected]>
Status     : problem is being debugged


Subject    : BUG: at fs/inotify.c:172 set_dentry_child_flags()
References : http://bugzilla.kernel.org/show_bug.cgi?id=7785
Submitter  : Cijoml Cijomlovic Cijomlov <[email protected]>
Handled-By : Nick Piggin <[email protected]>
Status     : problem is being debugged


Subject    : fix geode_configure()
References : http://lkml.org/lkml/2007/1/9/216
Submitter  : Lennart Sorensen <[email protected]>
Caused-By  : takada <[email protected]>
             commit e4f0ae0ea63caceff37a13f281a72652b7ea71ba
Handled-By : takada <[email protected]>
             Lennart Sorensen <[email protected]>
Status     : patches are being discussed

2007-02-02 08:48:22

by Fabio Erculiani

[permalink] [raw]
Subject: Re: 2.6.20-rc7: known regressions

Adrian,

and this one?
http://bugzilla.kernel.org/show_bug.cgi?id=7589

---
Fabio Erculiani
http://www.sabayonlinux.org
Sabayon Linux Founder

2007-02-02 13:58:08

by Adrian Bunk

[permalink] [raw]
Subject: Re: 2.6.20-rc7: known regressions

On Fri, Feb 02, 2007 at 08:45:42AM +0000, Fabio Erculiani wrote:

> Adrian,

Hi Fabio,

> and this one?
> http://bugzilla.kernel.org/show_bug.cgi?id=7589

Not a regression in 2.6.20-rc compared to 2.6.19 - it's already
broken in 2.6.19.

This bug should be fixed, but it's outside the scope of my list.

> Fabio Erculiani

cu
Adrian

BTW: Please don't strip me from the Cc when asking me a question.

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2007-02-03 00:44:47

by Adrian Bunk

[permalink] [raw]
Subject: 2.6.20-rc7: known regressions (v2) (part 1)

This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge amount of recipients, please trim the Cc when answering.


Subject    : NULL pointer dereference at as_move_to_dispatch()
References : http://lkml.org/lkml/2007/1/22/141
Submitter  : Andrew Vasquez <[email protected]>
Status     : unknown


Subject    : pktcdvd doesn't work with libata pata drivers
             (caused by scsi_lib and pktcdvd patches)
References : http://bugzilla.kernel.org/show_bug.cgi?id=7810
             http://lkml.org/lkml/2007/1/25/128
             http://bugzilla.kernel.org/show_bug.cgi?id=7910
             http://lkml.org/lkml/2007/1/30/289
Submitter  : Gerhard Dirschl <[email protected]>
             Luca Tettamanti <[email protected]>
Caused-By  : Christoph Hellwig <[email protected]>
             commit 3b00315799d78f76531b71435fbc2643cd71ae4c
             commit 406c9b605cbc45151c03ac9a3f95e9acf050808c
Status     : unknown


Subject    : e1000: 82571EB/82572EI PCI-E cards: link is always down
             (MSI related)
References : http://lkml.org/lkml/2007/1/16/27
             http://lkml.org/lkml/2007/1/17/182
Submitter  : Allen Parker <[email protected]>
             Adam Kropelin <[email protected]>
Handled-By : Auke Kok <[email protected]>
Status     : problem is being debugged


Subject    : powerpc64: performance monitor exception
References : http://ozlabs.org/pipermail/linuxppc-dev/2007-January/030045.html
Submitter  : Livio Soares <[email protected]>
Caused-By  : Paul Mackerras <[email protected]>
             commit d04c56f73c30a5e593202ecfcf25ed43d42363a2
Status     : unknown

2007-02-03 00:47:17

by Adrian Bunk

Subject: 2.6.20-rc7: known regressions (v2) (part 2)

This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
that are not yet fixed in Linus' tree.

If you find your name in the Cc header, you are either submitter of one
of the bugs, maintainer of an affected subsystem or driver, a patch
of yours caused a breakage or I'm considering you in any other way possibly
involved with one or more of these issues.

Due to the huge number of recipients, please trim the Cc when answering.


Subject : reboot instead of powerdown (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2006/12/25/40
http://bugzilla.kernel.org/show_bug.cgi?id=7828
Submitter : Berthold Cogel <[email protected]>
François Valenduc <[email protected]>
Handled-By : Alan Stern <[email protected]>
Status : problem is being debugged


Subject : usb somehow broken (CONFIG_USB_SUSPEND)
References : http://lkml.org/lkml/2007/1/11/146
Submitter : Prakash Punnoor <[email protected]>
Handled-By : Oliver Neukum <[email protected]>
Alan Stern <[email protected]>
Status : problem is being debugged


Subject : BUG: at fs/inotify.c:172 set_dentry_child_flags()
References : http://bugzilla.kernel.org/show_bug.cgi?id=7785
Submitter : Cijoml Cijomlovic Cijomlov <[email protected]>
Handled-By : Nick Piggin <[email protected]>
Status : problem is being debugged


Subject : ocfs2_link() journal credits update
References : http://lkml.org/lkml/2007/2/2/171
Submitter : Mark Fasheh <[email protected]>
Caused-By : Mark Fasheh <[email protected]>
commit 592282cf2eaa33409c6511ddd3f3ecaa57daeaaa
Handled-By : Mark Fasheh <[email protected]>
Patch : http://lkml.org/lkml/2007/2/2/171
Status : patch available


Subject : v9fs_vfs_mkdir(): fix a double free
References : http://lkml.org/lkml/2007/2/2/164
Submitter : Adrian Bunk <[email protected]>
Caused-By : Eric Van Hensbergen <[email protected]>
commit da977b2c7eb4d6312f063a7b486f2aad99809710
Handled-By : Adrian Bunk <[email protected]>
Patch : http://lkml.org/lkml/2007/2/2/164
Status : patch available

2007-02-03 01:58:49

by Andrew Morton

Subject: Re: 2.6.20-rc7: known regressions

On Fri, 2 Feb 2007 06:49:16 +0100
Adrian Bunk <[email protected]> wrote:

> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.

There are still a few things hanging around.

I have these queued:

aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
net-smc911x-match-up-spin-lock-unlock.patch
rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
alpha-fix-epoll-syscall-enumerations.patch
revert-blockdev-direct-io-back-to-2619-version.patch
scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
altix-more-acpi-prt-support.patch

which I'll get through to Linus later today. Plus:


- x86_64-irq-simplfy-__assign_irq_vector.patch and
x86_64-irq-handle-irqs-pending-in-irr-during-irq-migration.patch which
are big and scary. Am awaiting feedback from Andi and Eric on what to do
with these.

- A fix from Trond for http://bugzilla.kernel.org/show_bug.cgi?id=7923.
Am awaiting acks to merge that.

- sky2-flow-control-off.patch from shemminger which I assume Linus will
be merging anyway.

- v9fs_vfs_mkdir-fix-a-double-free.patch which I guess I'll merge unless
Eric suddenly nacks it.

- I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
about, but I forget its status.

- I have efi-x86-pass-firmware-call-parameters-on-the-stack.patch, but
I'm not sure it's right and unless something really rapid happens, we'll
ship with that bug unfixed.

- enable-mouse-button-23-emulation-for-x86-macs.patch looks simple
enough, but I'm waiting for Ben to wake up.

- x86-fix-vdso-mapping-for-aout-executables.patch probably works OK, but
Andi points out that it'd be better to implement this with
attribute-weak. So I guess 2.6.20 will ship with non-functional a.out on
i386, like 2.6.19.

2007-02-03 02:04:03

by Jeff Garzik

Subject: Re: 2.6.20-rc7: known regressions

Andrew Morton wrote:
> On Fri, 2 Feb 2007 06:49:16 +0100
> Adrian Bunk <[email protected]> wrote:
>
>> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
>> that are not yet fixed in Linus' tree.
>
> There are still a few things hanging around.
>
> I have these queued:
>
> aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
> kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
> net-smc911x-match-up-spin-lock-unlock.patch
> rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
> alpha-fix-epoll-syscall-enumerations.patch
> revert-blockdev-direct-io-back-to-2619-version.patch
> scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
> altix-more-acpi-prt-support.patch

Would you forward the x86-64 dma_noncoherent API build fix I posted?
Anything that uses that API won't build on x86-64 without my [simple and
obvious] patch.


> - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
> about, but I forget its status.

I posted a preferred patch (which someone then noted needs to use
setup_timer), and am waiting for an "it works" response of some sort.

Jeff


2007-02-03 02:28:30

by Andrew Morton

Subject: Re: 2.6.20-rc7: known regressions

On Fri, 02 Feb 2007 21:03:48 -0500
Jeff Garzik <[email protected]> wrote:

> Andrew Morton wrote:
> > On Fri, 2 Feb 2007 06:49:16 +0100
> > Adrian Bunk <[email protected]> wrote:
> >
> >> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> >> that are not yet fixed in Linus' tree.
> >
> > There are still a few things hanging around.
> >
> > I have these queued:
> >
> > aio-fix-buggy-put_ioctx-call-in-aio_complete-v2.patch
> > kexec-avoid-migration-of-already-disabled-irqs-ia64.patch
> > net-smc911x-match-up-spin-lock-unlock.patch
> > rtc-pcf8563-detect-polarity-of-century-bit-automatically.patch
> > alpha-fix-epoll-syscall-enumerations.patch
> > revert-blockdev-direct-io-back-to-2619-version.patch
> > scsi-sd-udev-accessing-an-uninitialized-scsi_disk-results-in-a-crash.patch
> > altix-more-acpi-prt-support.patch
>
> Would you forward the x86-64 dma_noncoherent API build fix I posted?
> Anything that uses that API won't build on x86-64 without my [simple and
> obvious] patch.

Yup. That's this:

--- a/include/asm-x86_64/dma-mapping.h~x86-64-define-dma-noncoherent-api-functions
+++ a/include/asm-x86_64/dma-mapping.h
@@ -63,6 +63,9 @@ static inline int dma_mapping_error(dma_
return (dma_addr == bad_dma_address);
}

+#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
+#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
+
extern void *dma_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp);
extern void dma_free_coherent(struct device *dev, size_t size, void *vaddr,
_
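
For readers unfamiliar with the pattern, here is a minimal user-space sketch
of what the two added macros do: where all DMA memory is coherent, the
noncoherent entry points can simply forward to the coherent ones. The type
definitions and the malloc-backed stub bodies below are stand-ins invented
for illustration; they are not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; illustration only. */
struct device { int id; };
typedef uint64_t dma_addr_t;
typedef unsigned int gfp_t;

/* Stub "coherent" allocator: backed by malloc, fakes the bus address. */
static void *dma_alloc_coherent(struct device *dev, size_t size,
                                dma_addr_t *dma_handle, gfp_t gfp)
{
    void *vaddr = malloc(size);

    (void)dev;
    (void)gfp;
    if (vaddr)
        *dma_handle = (dma_addr_t)(uintptr_t)vaddr;
    return vaddr;
}

/* Stub counterpart: just releases the malloc'ed buffer. */
static void dma_free_coherent(struct device *dev, size_t size, void *vaddr,
                              dma_addr_t dma_handle)
{
    (void)dev;
    (void)size;
    (void)dma_handle;
    free(vaddr);
}

/* Same aliasing as in the patch: noncoherent forwards to coherent. */
#define dma_alloc_noncoherent(d, s, h, f) dma_alloc_coherent(d, s, h, f)
#define dma_free_noncoherent(d, s, v, h) dma_free_coherent(d, s, v, h)
```

A caller written against the noncoherent names then builds unchanged and
ends up in the coherent allocator.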

>
> > - I have r8169-fix-a-race-between-pci-probe-and-dev_open.patch floating
> > about, but I forget its status.
>
> I posted a preferred patch (which someone then noted needs to use
> setup_timer), and am waiting for an "it works" response of some sort.

OK, thanks, I'll drop it.

2007-02-03 06:06:38

by Kok, Auke

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Adrian Bunk wrote:
> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affected subsystem or driver, a patch
> of yours caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.


> Subject : e1000: 82571EB/82572EI PCI-E cards: link is always down
> (MSI related)
> References : http://lkml.org/lkml/2007/1/16/27
> http://lkml.org/lkml/2007/1/17/182
> Submitter : Allen Parker <[email protected]>
> Adam Kropelin <[email protected]>
> Handled-By : Auke Kok <[email protected]>
> Status : problem is being debugged

I probably can't fix this bug. Not only do I doubt that the e1000 driver is at
fault here, I don't have a system with this particular chipset. Most likely the
regression comes from a combination of MSI layer rewrites and possibly platform
issues. We've seen many reports that are similar and all are on the platform
type mentioned here. I really don't want to point fingers here either.

None of the MSI code in e1000 has changed significantly either. As far as I can
see, the MSI code in e1000 has not changed since 2.6.18. Nonetheless there's no
way I can debug any of this without a system.

I will address the fact that we are lacking any of these systems to test on, but
that is not going to get this issue handled (not to mention soon) in the way it
needs to be.

I strongly encourage the people on the linux-pci list to help out, I'll trace
the e1000 driver for suspicious activity (again), but I run countless tests on
the latest trees and nothing has shown up recently, other than Eric Biederman's
msi irq reclaim leak fix.

Perhaps Adam can git-bisect this issue? Adam?

Cheers,

Auke

2007-02-03 07:43:09

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Auke Kok <[email protected]> writes:

> Adrian Bunk wrote:
>> This email lists some known regressions in 2.6.20-rc7 compared to 2.6.19
>> that are not yet fixed in Linus' tree.
>>
>> If you find your name in the Cc header, you are either submitter of one
>> of the bugs, maintainer of an affected subsystem or driver, a patch
>> of yours caused a breakage or I'm considering you in any other way possibly
>> involved with one or more of these issues.
>
>
>> Subject : e1000: 82571EB/82572EI PCI-E cards: link is always down
>> (MSI related)
>> References : http://lkml.org/lkml/2007/1/16/27
>> http://lkml.org/lkml/2007/1/17/182
>> Submitter : Allen Parker <[email protected]>
>> Adam Kropelin <[email protected]>
>> Handled-By : Auke Kok <[email protected]>
>> Status : problem is being debugged
>
> I probably can't fix this bug. Not only do I doubt that the e1000 driver is at
> fault here, I don't have a system with this particular chipset. Most likely the
> regression comes from a combination of MSI layer rewrites and possibly platform
> issues. We've seen many reports that are similar and all are on the platform
> type mentioned here. I really don't want to point fingers here either.
>
> None of the MSI code in e1000 has changed significantly either. as far as I can
> see, the msi code in e1000 has not changed since 2.6.18. Nonetheless there's no
> way I can debug any of this without a system.
>
> I will address the fact that we are lacking any of these systems to test on, but
> that is not going to get this issue handled (not to mention soon) in the way it
> needs to be.
>
> I strongly encourage the people on the linux-pci list to help out, I'll trace
> the e1000 driver for suspicious activity (again), but I run countless tests on
> the latest trees and nothing has shown up recently, other than Eric Biederman's
> msi irq reclaim leak fix.
>
> Perhaps Adam can git-bisect this issue? Adam?

Do we have any explanation about the weird /proc/interrupts output?
i.e. Multiple MSI irqs being assigned to the same card?

Does /sbin/ifconfig ethN down ; /sbin/ifconfig ethN up have anything to do
with the duplication in /proc/interrupts?

I can't see any way for a pci device that doesn't support msi-x to be assigned
multiple interrupts simultaneously.

I just skimmed through the code and there hasn't been any significant
generic MSI work since 2.6.19.

Did this device really work with MSI enabled in 2.6.19?

Eric

2007-02-03 09:19:57

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Friday 02 February 2007 at 17:55 -0800, Andrew Morton wrote:
> - I have efi-x86-pass-firmware-call-parameters-on-the-stack.patch, but
> I'm not sure it's right and unless something really rapid happens, we'll
> ship with that bug unfixed.

Things I can say:
- Works for me :-)
- When you look at the code, it's obvious that switching to -mregparm=3
changed the way we call into EFI runtime services. If you consider that
that old code was correct, then the patch is needed to keep the good
calling convention.
- It touches only arch/i386/kernel/efi.c which is compiled only with
CONFIG_EFI && X86
- It changes code that is called only when booted in EFI mode.

Last 2 points mean the user base is pretty limited, which can be taken
both as an argument to push it for the release or not to. I'd obviously
prefer that someone knowledgeable about EFI looks at it and ACKs before
it goes in.

Fred.

2007-02-03 09:28:26

by Andrew Morton

Subject: Re: 2.6.20-rc7: known regressions

On Sat, 03 Feb 2007 10:19:51 +0100 Frédéric Riss <[email protected]> wrote:

> On Friday 02 February 2007 at 17:55 -0800, Andrew Morton wrote:
> > - I have efi-x86-pass-firmware-call-parameters-on-the-stack.patch, but
> > I'm not sure it's right and unless something really rapid happens, we'll
> > ship with that bug unfixed.
>
> Things I can say:
> - Works for me :-)
> - When you look at the code, it's obvious that switching to -mregparm=3
> changed the way we call into EFI runtime services. If you consider that
> that old code was correct, then the patch is needed to keep the good
> calling convention.
> - It touches only arch/i386/kernel/efi.c which is compiled only with
> CONFIG_EFI && X86
> - It changes code that is called only when booted in EFI mode.
>
> Last 2 points mean the user base is pretty limited, which can be taken
> both as an argument to push it for the release or not to. I'd obviously
> prefer that someone knowledgeable about EFI looks at it and ACKs before
> it goes in.
>

OK, well here it is again for everyone's reviewing pleasure:


From: Frederic Riss <[email protected]>

When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: <[email protected]>
Cc: Matt Domsch <[email protected]>
Cc: Andi Kleen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

arch/i386/kernel/efi.c | 89 +++++++++++++++++++++++++++++++--------
1 files changed, 73 insertions(+), 16 deletions(-)

diff -puN arch/i386/kernel/efi.c~efi-x86-pass-firmware-call-parameters-on-the-stack arch/i386/kernel/efi.c
--- a/arch/i386/kernel/efi.c~efi-x86-pass-firmware-call-parameters-on-the-stack
+++ a/arch/i386/kernel/efi.c
@@ -473,6 +473,70 @@ static inline void __init check_range_fo
}

/*
+ * Wrap all the virtual calls in a way that forces the parameters on the stack.
+ */
+
+#define efi_call_virt(f, args...) \
+ ((efi_##f##_t __attribute__((regparm(0)))*)efi.systab->runtime->f)(args)
+
+static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
+{
+ return efi_call_virt(get_time, tm, tc);
+}
+
+static efi_status_t virt_efi_set_time (efi_time_t *tm)
+{
+ return efi_call_virt(set_time, tm);
+}
+
+static efi_status_t virt_efi_get_wakeup_time (efi_bool_t *enabled,
+ efi_bool_t *pending,
+ efi_time_t *tm)
+{
+ return efi_call_virt(get_wakeup_time, enabled, pending, tm);
+}
+
+static efi_status_t virt_efi_set_wakeup_time (efi_bool_t enabled,
+ efi_time_t *tm)
+{
+ return efi_call_virt(set_wakeup_time, enabled, tm);
+}
+
+static efi_status_t virt_efi_get_variable (efi_char16_t *name,
+ efi_guid_t *vendor, u32 *attr,
+ unsigned long *data_size, void *data)
+{
+ return efi_call_virt(get_variable, name, vendor, attr, data_size, data);
+}
+
+static efi_status_t virt_efi_get_next_variable (unsigned long *name_size,
+ efi_char16_t *name,
+ efi_guid_t *vendor)
+{
+ return efi_call_virt(get_next_variable, name_size, name, vendor);
+}
+
+static efi_status_t virt_efi_set_variable (efi_char16_t *name,
+ efi_guid_t *vendor,
+ unsigned long attr,
+ unsigned long data_size, void *data)
+{
+ return efi_call_virt(set_variable, name, vendor, attr, data_size, data);
+}
+
+static efi_status_t virt_efi_get_next_high_mono_count (u32 *count)
+{
+ return efi_call_virt(get_next_high_mono_count, count);
+}
+
+static void virt_efi_reset_system (int reset_type, efi_status_t status,
+ unsigned long data_size,
+ efi_char16_t *data)
+{
+ efi_call_virt(reset_system, reset_type, status, data_size, data);
+}
+
+/*
* This function will switch the EFI runtime services to virtual mode.
* Essentially, look through the EFI memmap and map every region that
* has the runtime attribute bit set in its memory descriptor and update
@@ -525,22 +589,15 @@ void __init efi_enter_virtual_mode(void)
* pointers in the runtime service table to the new virtual addresses.
*/

- efi.get_time = (efi_get_time_t *) efi.systab->runtime->get_time;
- efi.set_time = (efi_set_time_t *) efi.systab->runtime->set_time;
- efi.get_wakeup_time = (efi_get_wakeup_time_t *)
- efi.systab->runtime->get_wakeup_time;
- efi.set_wakeup_time = (efi_set_wakeup_time_t *)
- efi.systab->runtime->set_wakeup_time;
- efi.get_variable = (efi_get_variable_t *)
- efi.systab->runtime->get_variable;
- efi.get_next_variable = (efi_get_next_variable_t *)
- efi.systab->runtime->get_next_variable;
- efi.set_variable = (efi_set_variable_t *)
- efi.systab->runtime->set_variable;
- efi.get_next_high_mono_count = (efi_get_next_high_mono_count_t *)
- efi.systab->runtime->get_next_high_mono_count;
- efi.reset_system = (efi_reset_system_t *)
- efi.systab->runtime->reset_system;
+ efi.get_time = virt_efi_get_time;
+ efi.set_time = virt_efi_set_time;
+ efi.get_wakeup_time = virt_efi_get_wakeup_time;
+ efi.set_wakeup_time = virt_efi_set_wakeup_time;
+ efi.get_variable = virt_efi_get_variable;
+ efi.get_next_variable = virt_efi_get_next_variable;
+ efi.set_variable = virt_efi_set_variable;
+ efi.get_next_high_mono_count = virt_efi_get_next_high_mono_count;
+ efi.reset_system = virt_efi_reset_system;
}

void __init
_
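
The wrapper technique in the patch above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the names
fw_table, fake_fw_get_time, FW_STACK_ARGS and fw_call_virt are hypothetical,
the "firmware" is an ordinary C function, and the regparm attribute is
applied only when building for 32-bit x86 (elsewhere the macro expands to
nothing so the sketch still compiles).

```c
#include <stdint.h>

/*
 * regparm is only meaningful on 32-bit x86; on other targets this
 * hypothetical macro is empty.
 */
#if defined(__i386__)
#define FW_STACK_ARGS __attribute__((regparm(0)))
#else
#define FW_STACK_ARGS
#endif

typedef long fw_status_t;
typedef fw_status_t fw_get_time_t(int *year);

/* The table stores a raw address, much like efi_runtime_services_t. */
struct fw_runtime {
    uintptr_t get_time;
};

static struct fw_runtime fw_table;

/* Stand-in for a firmware routine that expects its arguments on the stack. */
static FW_STACK_ARGS fw_status_t fake_fw_get_time(int *year)
{
    *year = 2007;
    return 0;
}

/* Cast the raw address to a pointer carrying the stack-args convention. */
#define fw_call_virt(f, ...) \
    ((fw_##f##_t FW_STACK_ARGS *)fw_table.f)(__VA_ARGS__)

/* The wrapper uses the normal convention and forwards to the "firmware". */
static fw_status_t virt_fw_get_time(int *year)
{
    return fw_call_virt(get_time, year);
}

/* Publish the fake routine's address in the table. */
static void fw_install_fake(void)
{
    fw_table.get_time = (uintptr_t)fake_fw_get_time;
}
```

Callers only ever see virt_fw_get_time(), so the calling-convention detail
stays in one place.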

2007-02-03 09:34:05

by Andi Kleen

Subject: Re: 2.6.20-rc7: known regressions


> /*
> + * Wrap all the virtual calls in a way that forces the parameters on the stack.
> + */
> +
> +#define efi_call_virt(f, args...) \
> + ((efi_##f##_t __attribute__((regparm(0)))*)efi.systab->runtime->f)(args)
> +
> +static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
> +{
> + return efi_call_virt(get_time, tm, tc);

Wouldn't it be better to just declare the pointers in efi.systab with
the correct attribute? Then you wouldn't need all that ugly casting.

-Andi

2007-02-03 09:49:40

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 at 10:33 +0100, Andi Kleen wrote:
> > /*
> > + * Wrap all the virtual calls in a way that forces the parameters on the stack.
> > + */
> > +
> > +#define efi_call_virt(f, args...) \
> > + ((efi_##f##_t __attribute__((regparm(0)))*)efi.systab->runtime->f)(args)
> > +
> > +static efi_status_t virt_efi_get_time(efi_time_t *tm, efi_time_cap_t *tc)
> > +{
> > + return efi_call_virt(get_time, tm, tc);
>
> Wouldn't it be better to just declare the pointers in efi.systab with
> the correct attribute? Then you wouldn't need all that ugly casting.

Was what I did in the initial patch:
http://lkml.org/lkml/2007/1/30/258

The issue is that the structure definition is used on multiple
architectures (for now ia64 and i386) which might use different calling
conventions. As Bjorn Helgaas pointed out, ia64 already has such wrapper
functions. I agree that the casting isn't the nicest thing, but I prefer
that to writing asm stubs.

Fred

2007-02-03 09:58:25

by Andi Kleen

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 10:49, Frédéric Riss wrote:

> Was what I did in the initial patch:
> http://lkml.org/lkml/2007/1/30/258
>
> The issue is that the structure definition is used on multiple
> architectures (for now ia64 and i386) which might use different calling
> conventions. As Bjorn Helgaas pointed out, ia64 already has such wrapper
> functions. I agree that the casting isn't the nicest thing, but I prefer
> that to writing asm stubs.

Define an efilinkage macro, then, that expands to nothing on ia64.

Probably asmlinkage would work already, syscall_linkage as used on ia64 doesn't
seem to affect function pointers.

-Andi

2007-02-03 10:47:54

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 at 10:58 +0100, Andi Kleen wrote:
> Define an efilinkage macro, then, that expands to nothing on ia64.
>
> Probably asmlinkage would work already, syscall_linkage as used on ia64 doesn't
> seem to affect function pointers.

And here it goes:

When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <[email protected]>
---

diff --git a/include/linux/efi.h b/include/linux/efi.h
index f8ebd7c..596e806 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -157,22 +157,39 @@ typedef struct {
unsigned long reset_system;
} efi_runtime_services_t;

-typedef efi_status_t efi_get_time_t (efi_time_t *tm, efi_time_cap_t *tc);
-typedef efi_status_t efi_set_time_t (efi_time_t *tm);
-typedef efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled, efi_bool_t *pending,
- efi_time_t *tm);
-typedef efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled, efi_time_t *tm);
-typedef efi_status_t efi_get_variable_t (efi_char16_t *name, efi_guid_t *vendor, u32 *attr,
- unsigned long *data_size, void *data);
-typedef efi_status_t efi_get_next_variable_t (unsigned long *name_size, efi_char16_t *name,
- efi_guid_t *vendor);
-typedef efi_status_t efi_set_variable_t (efi_char16_t *name, efi_guid_t *vendor,
- unsigned long attr, unsigned long data_size,
- void *data);
-typedef efi_status_t efi_get_next_high_mono_count_t (u32 *count);
-typedef void efi_reset_system_t (int reset_type, efi_status_t status,
- unsigned long data_size, efi_char16_t *data);
-typedef efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
+#ifdef CONFIG_X86_32
+#define efilinkage asmlinkage
+#else
+#define efilinkage
+#endif
+
+typedef efilinkage efi_status_t efi_get_time_t (efi_time_t *tm,
+ efi_time_cap_t *tc);
+typedef efilinkage efi_status_t efi_set_time_t (efi_time_t *tm);
+typedef efilinkage efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled,
+ efi_bool_t *pending,
+ efi_time_t *tm);
+typedef efilinkage efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled,
+ efi_time_t *tm);
+typedef efilinkage efi_status_t efi_get_variable_t (efi_char16_t *name,
+ efi_guid_t *vendor,
+ u32 *attr,
+ unsigned long *data_size,
+ void *data);
+typedef efilinkage efi_status_t efi_get_next_variable_t (unsigned long *name_sz,
+ efi_char16_t *name,
+ efi_guid_t *vendor);
+typedef efilinkage efi_status_t efi_set_variable_t (efi_char16_t *name,
+ efi_guid_t *vendor,
+ unsigned long attr,
+ unsigned long data_size,
+ void *data);
+typedef efilinkage efi_status_t efi_get_next_high_mono_count_t (u32 *count);
+typedef efilinkage void efi_reset_system_t (int reset_type,
+ efi_status_t status,
+ unsigned long data_size,
+ efi_char16_t *data);
+typedef efilinkage efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
unsigned long descriptor_size,
u32 descriptor_version,
efi_memory_desc_t *virtual_map);
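
As a rough user-space illustration of this typedef-based variant: the
attribute becomes part of the function type, so pointers declared from the
typedef carry the calling convention and the call sites need no casts. The
names fwlinkage, fw_ops and fake_fw_set_time are hypothetical, and the
attribute is only applied when building for 32-bit x86; whether a given
compiler honours attributes placed on function typedefs is an assumption
that should be verified for the toolchain in use.

```c
/* Hypothetical linkage macro: stack arguments on i386, nothing elsewhere. */
#if defined(__i386__)
#define fwlinkage __attribute__((regparm(0)))
#else
#define fwlinkage
#endif

typedef long fw_status_t;

/* The calling convention is attached to the function type itself. */
typedef fwlinkage fw_status_t fw_set_time_t(const int *year);

/* Typed pointers: no casts are needed when calling through them. */
struct fw_ops {
    fw_set_time_t *set_time;
};

static int stored_year;

/* Stand-in for a firmware routine; declared with the same linkage. */
static fwlinkage fw_status_t fake_fw_set_time(const int *year)
{
    stored_year = *year;
    return 0;
}

static struct fw_ops fw = { .set_time = fake_fw_set_time };
```

A call then reads simply fw.set_time(&year), with the compiler choosing the
convention from the pointer's type.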



2007-02-03 10:51:41

by Andi Kleen

Subject: Re: 2.6.20-rc7: known regressions


> +#ifdef CONFIG_X86_32
> +#define efilinkage asmlinkage
> +#else
> +#define efilinkage
> +#endif

No ifdefs, this should be somewhere in the headers for the EFI supporting
architectures

But I suspect you could actually get away with just using asmlinkage
(after reviewing the ia64 asmlinkage I think it's ok)

x86-64, if it is ever implemented, will always need asm stubs though,
because it uses completely incompatible calling conventions.

> +
> +typedef efilinkage efi_status_t efi_get_time_t (efi_time_t *tm,
> + efi_time_cap_t *tc);

I assume you have double checked it actually works? (I vaguely recall some
issues with applying attributes to typedefs.) If not, you would need
to put them on the declarations.

-Andi

2007-02-03 10:57:30

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 at 11:51 +0100, Andi Kleen wrote:
> > +
> > +typedef efilinkage efi_status_t efi_get_time_t (efi_time_t *tm,
> > + efi_time_cap_t *tc);
>
> I assume you have double checked it actually works? (i vaguely recall some
> issues with applying attributes to typedefs). If not you would need
> to put them to the declarations.

Of course, I tested 10 suspend/resume cycles. This is with gcc 4.1.2, I
guess other compilers could mishandle it. Would you prefer the version
putting asmlinkage inside the struct definition?

Fred.


2007-02-03 11:08:10

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 at 11:57 +0100, Frédéric Riss wrote:
> On Saturday 03 February 2007 at 11:51 +0100, Andi Kleen wrote:
> > > +
> > > +typedef efilinkage efi_status_t efi_get_time_t (efi_time_t *tm,
> > > + efi_time_cap_t *tc);
> >
> > I assume you have double checked it actually works? (i vaguely recall some
> > issues with applying attributes to typedefs). If not you would need
> > to put them to the declarations.
>
> Of course, I tested 10 suspend/resume cycles. This is with gcc 4.1.2, I
> guess other compilers could mishandle it. Would you prefer the version
> putting asmlinkage inside the struct definition?

But then it throws 'assignment from incompatible pointer type'
warnings in efi.c. This would need fixing in arch/ia64 and arch/i386. Or
should I put it in the typedefs _and_ in the struct to be safe?

Fred

2007-02-03 18:07:29

by Adam Kropelin

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Eric W. Biederman wrote:
> Auke Kok <[email protected]> writes:
>> None of the MSI code in e1000 has changed significantly either. as
>> far as I can see, the msi code in e1000 has not changed since
>> 2.6.18. Nonetheless there's no way I can debug any of this without a
>> system.
>> [...]
>> Perhaps Adam can git-bisect this issue? Adam?
>
> Do we have any explanation about the weird /proc/interrupts output?
> i.e. Multiple MSI irqs being assigned to the same card?
>
> Does /sbin/ifconfig ethN down ; /sbin/ifconfig ethN up have anything
> to do with the duplication in /proc/interrupts?
>
> I can't see any way for a pci device that doesn't support msi-x to be
> assigned multiple interrupts simultaneously.
>
> I just skimmed through the code and there hasn't been any significant
> generic MSI work since 2.6.19.
>
> Did this device really work with MSI enabled in 2.6.19?

I've never had this device work 100% with MSI on any kernel version I've
tested so far. But I'm not the original reporter of the problem, and I
believe for him it was a true regression where a previous kernel worked
correctly.

The behavior I observe on 2.6.19 is better than 2.6.20-rc7. Link status
interrupts seem to work but rx/tx does not. A few more details here:
<http://www.kroptech.com/~adk0212/mailimport/showmsg.php?msg_id=3339092450&db_name=linux_kernel>

I'm going to test 2.6.16 thru 2.6.20-rc7 this weekend and will report
back any variations in behavior I notice.

--Adam

2007-02-03 20:43:33

by Kok, Auke

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Adam Kropelin wrote:
> Eric W. Biederman wrote:
>> Auke Kok <[email protected]> writes:
>>> None of the MSI code in e1000 has changed significantly either. as
>>> far as I can see, the msi code in e1000 has not changed since
>>> 2.6.18. Nonetheless there's no way I can debug any of this without a
>>> system.
>>> [...]
>>> Perhaps Adam can git-bisect this issue? Adam?
>> Do we have any explanation about the weird /proc/interrupts output?
>> i.e. Multiple MSI irqs being assigned to the same card?
>>
>> Does /sbin/ifconfig ethN down ; /sbin/ifconfig ethN up have anything
>> to do with the duplication in /proc/interrupts?
>>
>> I can't see any way for a pci device that doesn't support msi-x to be
>> assigned multiple interrupts simultaneously.
>>
>> I just skimmed through the code and there hasn't been any significant
>> generic MSI work since 2.6.19.
>>
>> Did this device really work with MSI enabled in 2.6.19?
>
> I've never had this device work 100% with MSI on any kernel version I've
> tested so far. But I'm not the original reporter of the problem, and I
> believe for him it was a true regression where a previous kernel worked
> correctly.

maybe I've been unclear, but here's how e1000 detects link changes:

1) by checking every 2 seconds in the watchdog by reading PHY registers
2) by receiving an interrupt from the NIC with the LSI bit in the interrupt
control register

if the link is down to start with, the watchdog will obviously spot a 'link up'
change since it doesn't use any interrupts.

The link interrupt (LSI) is a generic interrupt that comes over the same vector
(be it MSI or not) as RX interrupts, and in your case doesn't arrive at all,
which should be demonstrable if you set e.g. the watchdog interval to 30
seconds and unplug the cable - the driver won't spot the link change until the
watchdog fires a lot later than you unplugged the cable.
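
A toy model of the two detection paths, to make the suggested experiment
concrete. It assumes, purely for illustration, that the watchdog ticks at
exact multiples of its interval starting from time 0 and that a delivered
link interrupt is handled instantly; the function name is hypothetical and
nothing here is taken from the e1000 source.

```c
/*
 * Time (in seconds) at which the driver notices a link change that
 * happened at time `event`.
 *
 * interval:      watchdog period in seconds, must be > 0
 * irq_delivered: non-zero if the link interrupt actually arrives
 */
static unsigned int link_change_seen_at(unsigned int event,
                                        unsigned int interval,
                                        int irq_delivered)
{
    /* Interrupt path: the change is seen as soon as it happens. */
    if (irq_delivered)
        return event;

    /* Polling path: wait for the next watchdog tick at or after `event`. */
    return ((event + interval - 1) / interval) * interval;
}
```

With a 30 second interval and the cable pulled 5 seconds in, the model
reports the change at 5 seconds if the interrupt arrives and at 30 seconds
if it does not, which is the delay the test above is meant to expose.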

> The behavior I observe on 2.6.19 is better than 2.6.20-rc7. Link status
> interrupts seem to work but rx/tx does not. A few more details here:
>
<http://www.kroptech.com/~adk0212/mailimport/showmsg.php?msg_id=3339092450&db_name=linux_kernel>

> I'm going to test 2.6.16 thru 2.6.20-rc7 this weekend and will report
> back any variations in behavior I notice.

that would be a good start, but I still think that you might have a broken
bridge on that system. Anyway, thanks for digging into this.

Auke

2007-02-03 21:00:59

by Adam Kropelin

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Auke Kok wrote:
> Adam Kropelin wrote:
>> I've never had this device work 100% with MSI on any kernel version
>> I've tested so far. But I'm not the original reporter of the
>> problem, and I believe for him it was a true regression where a
>> previous kernel worked correctly.
>
> maybe I've been unclear, but here's how e1000 detects link changes:
>
> 1) by checking every 2 seconds in the watchdog by reading PHY
> registers

That would explain why I see link status changes but 0 interrupt count
in /proc/interrupts. However, on >= 2.6.19 the link state never changes.
Ever. It's always down. On <= 2.6.18 the link state does change but with
0 interrupt count.

> 2) by receiving an interrupt from the NIC with the LSI bit
> in the interrupt control register
>
> if the link is down to start with, the watchdog will obviously spot a
> 'link up' change since it doesn't use any interrupts.

This does not seem to work on 2.6.19+. Unless the watchdog interval is
tens of minutes. I've waited at least 5 minutes and link never went up.

>> The behavior I observe on 2.6.19 is better than 2.6.20-rc7. Link
>> status interrupts seem to work but rx/tx does not. A few more
>> details here:
> <http://www.kroptech.com/~adk0212/mailimport/showmsg.php?msg_id=3339092450&db_name=linux_kernel>
>
>> I'm going to test 2.6.16 thru 2.6.20-rc7 this weekend and will report
>> back any variations in behavior I notice.
>
> that would be a good start, but I still think that you might have a
> broken bridge on that system. Anyway, thanks for digging into this.

Will continue to dig.

--Adam

2007-02-03 21:14:11

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Auke Kok <[email protected]> writes:

> maybe I've been unclear, but here's how e1000 detects link changes:
>
> 1) by checking every 2 seconds in the watchdog by reading PHY registers
> 2) by receiving an interrupt from the NIC with the LSI bit in the interrupt
> control register
>
> if the link is down to start with, the watchdog will obviously spot a 'link up'
> change since it doesn't use any interrupts.
>
> The link interrupt (LSI) is a generic interrupt that comes over the same vector
> (be it MSI or not) as RX interrupts, and in your case doesn't arrive at all,
> which should be demonstrable if you set e.g. the watchdog interval to 30
> seconds and unplug the cable - the driver won't spot the link change until the
> watchdog fires a lot later than you unplugged the cable.
>
>> The behavior I observe on 2.6.19 is better than 2.6.20-rc7. Link status
>> interrupts seem to work but rx/tx does not. A few more details here:
>>
> <http://www.kroptech.com/~adk0212/mailimport/showmsg.php?msg_id=3339092450&db_name=linux_kernel>
>
>> I'm going to test 2.6.16 thru 2.6.20-rc7 this weekend and will report
>> back any variations in behavior I notice.
>
> that would be a good start, but I still think that you might have a broken
> bridge on that system. Anyway, thanks for digging into this.

Right. The basic question is whether, on a problem system, MSI interrupts
from the card are observed in /proc/interrupts. If interrupts are not
observed (as it sounds like is the case) it is most likely something
outside of the card and driver. If interrupts are observed but the
card does not work correctly it could be a card or driver bug.

Ok. In the archives I finally found the output of cat
/proc/interrupts and there were none.

This is a PCI-E to Hypertransport system from the lspci output.

Can I get the corresponding lspci -xxx output? I suspect the BIOS
did not program the hypertransport MSI mapping capabilities correctly.
All it has to do is set the enable bit, but still, occasionally BIOS
writers miss the most amazing things.

If that is the case with just a little creativity we should be able to
write a pci quirk that will enable the MSI mapping capability on this
class of systems and save ourselves a lot of trouble.

Although I thought I did see a quirk that disabled MSI if the enable
bit was not set. Hmm.

Anyway, please send the lspci -xxx output. Although if someone could teach
lspci to decode the hypertransport MSI mapping capability so that
lspci -vvv gave us this information that would be great too.

Eric

2007-02-03 21:27:35

by Kok, Auke

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Adam Kropelin wrote:
> Auke Kok wrote:
>> Adam Kropelin wrote:
>>> I've never had this device work 100% with MSI on any kernel version
>>> I've tested so far. But I'm not the original reporter of the
>>> problem, and I believe for him it was a true regression where a
>>> previous kernel worked correctly.
>> maybe I've been unclear, but here's how e1000 detects link changes:
>>
>> 1) by checking every 2 seconds in the watchdog by reading PHY
>> registers
>
> That would explain why I see link status changes but 0 interrupt count
> in /proc/interrupts. However, on >= 2.6.19 the link state never changes.
> Ever. It's always down. On <= 2.6.18 the link state does change but with
> 0 interrupt count.
>
>> 2) by receiving an interrupt from the NIC with the LSI bit
>> in the interrupt control register
>>
>> if the link is down to start with, the watchdog will obviously spot a
>> 'link up' change since it doesn't use any interrupts.
>
> This does not seem to work on 2.6.19+. Unless the watchdog interval is
> tens of minutes. I've waited at least 5 minutes and link never went up.

that's explained by a driver change that did that. Since at initialization we're
basically waiting for a link change to tell the stack that we're up, we decided
to change the order to have the hardware fire an LSI interrupt to trigger a
watchdog run. So no interrupts would immediately explain why the watchdog never
runs. That's nothing to worry about for this problem, as soon as interrupts are
seen in /proc/interrupts this all starts working for e1000.

Cheers,

Auke

2007-02-03 22:25:31

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Auke Kok <[email protected]> writes:

> that's explained by a driver change that did that. Since at initialization we're
> basically waiting for a link change to tell the stack that we're up, we decided
> to change the order to have the hardware fire an LSI interrupt to trigger a
> watchdog run. So no interrupts would immediately explain why the watchdog never
> runs. That's nothing to worry about for this problem, as soon as interrupts are
> seen in /proc/interrupts this all starts working for e1000.


While I think we need to fix this issue, and in general the issue of MSI
interrupts on PCI-Express buses downstream of hypertransport chains,
this e1000 issue is not a regression, so not fixing it for 2.6.20 is
not a big deal.

I have yet to see all of the pieces I'm trying to look at confirmed,
but I believe by at least looking at the hypertransport MSI mapping
capability's enable bit in general we should be able to do a much
better job of detecting if MSI works in a system or not.

I thought someone several months ago had made our MSI support detection logic
a lot smarter, with defaults that were generally correct, but looking
at the kernel that code apparently never made it anywhere. Instead
all I see are a handful of common chipsets special cased by the quirk logic.

We should be able to do a lot better but not in the 2.6.20 time frame.

As for the original problem report with duplicate MSI interrupts in
/proc/interrupts, that sounds like a regression and is probably
simple to fix if we can get some more details.

Eric

2007-02-03 23:21:33

by Adam Kropelin

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Eric W. Biederman wrote:
> Auke Kok <[email protected]> writes:
>
>> maybe I've been unclear, but here's how e1000 detects link changes:
>>
>> 1) by checking every 2 seconds in the watchdog by reading PHY
>> registers
>> 2) by receiving an interrupt from the NIC with the LSI bit
>> in the interrupt control register
>>
>> if the link is down to start with, the watchdog will obviously spot
>> a 'link up' change since it doesn't use any interrupts.
>>
>> The link interrupt (LSI) is a generic interrupt that comes over the
>> same vector (be it MSI or not) as RX interrupts, and in your case
>> doesn't arrive at all, which should be demonstrable if you set
>> e.g. the watchdog interval to 30 seconds and unplug the cable - the
>> driver won't spot the link change until the watchdog fires a lot
>> later than you unplugged the cable.
>>
>>> The behavior I observe on 2.6.19 is better than 2.6.20-rc7. Link
>>> status interrupts seem to work but rx/tx does not. A few more
>>> details here:
>>>
>> <http://www.kroptech.com/~adk0212/mailimport/showmsg.php?msg_id=3339092450&db_name=linux_kernel>
>
> Can I get the corresponding lspci -xxx output. I suspect the BIOS
> did not program the hypertransport MSI mapping capabilities correctly.
> All it has to do is set the enable but still, occasionally BIOS
> writers miss the most amazing things.

Here you go. This is from 2.6.20-rc7.

--Adam


Attachments:
lspci-2.6.20-rc7 (24.85 kB)

2007-02-04 01:16:30

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

"Adam Kropelin" <[email protected]> writes:

>> Can I get the corresponding lspci -xxx output. I suspect the BIOS
>> did not program the hypertransport MSI mapping capabilities correctly.
>> All it has to do is set the enable but still, occasionally BIOS
>> writers miss the most amazing things.
>
> Here you go. This is from 2.6.20-rc7.

Thanks. Conclusion. I could not find bit 16 (the enable bit) set in any of
your hypertransport msi mapping capabilities.

So MSI interrupts won't work until someone enables your chipset
to transform them into hypertransport interrupts.

Ideally it is a BIOS issue, but it may be the kind of thing we can
fix up with quirks in the kernel.

Eric
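
For readers following along, the check described above can be sketched against a raw 256-byte config-space dump such as the one lspci -xxx prints. This is only an illustration, not kernel code: the function name ht_msi_mapping_enabled and the constant names are made up for this sketch, and the offsets used (HyperTransport capability ID 0x08, MSI mapping type 0xA8 in the top five bits of byte 3, enable flag in bit 0 of byte 2, which is bit 16 of the capability dword) are an assumption based on my reading of the HyperTransport capability layout and should be checked against the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard PCI config-space offsets. */
#define PCI_STATUS           0x06
#define PCI_STATUS_CAP_LIST  0x10
#define PCI_CAPABILITY_LIST  0x34
#define PCI_CAP_ID_HT        0x08

/* Assumed HyperTransport MSI mapping layout (names are hypothetical). */
#define HT_CAPTYPE_MASK      0xF8  /* top 5 bits of byte 3 hold the type */
#define HT_CAPTYPE_MSI_MAP   0xA8
#define HT_MSI_FLAGS_OFFSET  2     /* byte 2 == bits 16..23 of the dword */
#define HT_MSI_FLAGS_ENABLE  0x01  /* bit 0 of byte 2 == bit 16 */

/*
 * Walk the capability list of one device's config space.
 * Returns 1 if an HT MSI mapping capability is found with the enable
 * bit set, 0 if one is found but disabled, -1 if none is found.
 */
int ht_msi_mapping_enabled(const uint8_t cfg[256])
{
    int ttl = 48;   /* guard against looping capability lists */
    uint8_t pos;

    assert(cfg != NULL);

    if (!(cfg[PCI_STATUS] & PCI_STATUS_CAP_LIST))
        return -1;

    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC;
    while (pos >= 0x40 && ttl-- > 0) {
        if (cfg[pos] == PCI_CAP_ID_HT &&
            (cfg[pos + 3] & HT_CAPTYPE_MASK) == HT_CAPTYPE_MSI_MAP)
            return (cfg[pos + HT_MSI_FLAGS_OFFSET] & HT_MSI_FLAGS_ENABLE) ? 1 : 0;
        pos = cfg[pos + 1] & 0xFC;  /* next capability pointer */
    }
    return -1;
}
```

Running this over each bridge in the dump and getting 0 for every MSI mapping capability would correspond to the finding above that the enable bit is not set anywhere.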

2007-02-04 04:45:14

by Adam Kropelin

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

Eric W. Biederman wrote:
> "Adam Kropelin" <[email protected]> writes:
>
>>> Can I get the corresponding lspci -xxx output. I suspect the BIOS
>>> did not program the hypertransport MSI mapping capabilities
>>> correctly. All it has to do is set the enable but still,
>>> occasionally BIOS writers miss the most amazing things.
>>
>> Here you go. This is from 2.6.20-rc7.
>
> Thanks. Conclusion. I could not find bit 16 (the enable bit) set in
> any of your hypertransport msi mapping capabilities.
>
> So MSI interrupts won't work until someone enables your chipset
> to transform them into hypertransport interrupts.

Naive question... Can the pci layer (or e1000) detect that MSI is not
enabled in the hardware and avoid using it in that case? With the number
of MSI problems showing up it seems risky to assume it's usable on any
given platform without some sort of sanity check.

--Adam

2007-02-04 05:14:10

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions (v2) (part 1)

"Adam Kropelin" <[email protected]> writes:

> Naive question... Can the pci layer (or e1000) detect that MSI is not enabled in
> the hardware and avoid using it in that case? With the number of MSI problems
> showing up it seems risky to assume it's usable on any given platform without
> some sort of sanity check.
>

Yes, that is what we should do. Start with the assumption MSI doesn't work
and enable it when we detect the hardware is set up properly.

Thing is that is going to take a little bit of work, and a little bit of
thinking on how to structure it properly. So in real time it is going
to be a couple of weeks before the code to do that is ready.

Right now the model is that piecemeal we put in the code to
conditionally turn off chipsets that are known to have problems.
Which for building a reliable system when MSI isn't mandatory for
operation seems backwards.

Probably in addition we should have a warning such as:
"Found devices supporting MSI but chipset is unknown".

Eric

2007-02-04 13:13:19

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Saturday 03 February 2007 at 11:51 +0100, Andi Kleen wrote:
> > +
> > +typedef efilinkage efi_status_t efi_get_time_t (efi_time_t *tm,
> > + efi_time_cap_t *tc);
>
> I assume you have double checked it actually works? (i vaguely recall some
> issues with applying attributes to typedefs). If not you would need
> to put them to the declarations.

OK, I tried the following patch (simply adding asmlinkage to the
typedefs) with gcc 3.3, 3.4, 4.0 and 4.1. I couldn't easily try older
compilers.

New patch:

When calling into the EFI firmware, the parameters need to be passed on
the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
This patch is needed to allow the new Intel-based Macs to suspend to ram
(efi.get_time is called during the suspend phase).

Signed-off-by: Frederic Riss <[email protected]>
---
diff --git a/include/linux/efi.h b/include/linux/efi.h
index f8ebd7c..578bc2a 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -157,22 +157,33 @@ typedef struct {
unsigned long reset_system;
} efi_runtime_services_t;

-typedef efi_status_t efi_get_time_t (efi_time_t *tm, efi_time_cap_t *tc);
-typedef efi_status_t efi_set_time_t (efi_time_t *tm);
-typedef efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled, efi_bool_t *pending,
- efi_time_t *tm);
-typedef efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled, efi_time_t *tm);
-typedef efi_status_t efi_get_variable_t (efi_char16_t *name, efi_guid_t *vendor, u32 *attr,
- unsigned long *data_size, void *data);
-typedef efi_status_t efi_get_next_variable_t (unsigned long *name_size, efi_char16_t *name,
- efi_guid_t *vendor);
-typedef efi_status_t efi_set_variable_t (efi_char16_t *name, efi_guid_t *vendor,
- unsigned long attr, unsigned long data_size,
- void *data);
-typedef efi_status_t efi_get_next_high_mono_count_t (u32 *count);
-typedef void efi_reset_system_t (int reset_type, efi_status_t status,
- unsigned long data_size, efi_char16_t *data);
-typedef efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
+typedef asmlinkage efi_status_t efi_get_time_t (efi_time_t *tm,
+ efi_time_cap_t *tc);
+typedef asmlinkage efi_status_t efi_set_time_t (efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_get_wakeup_time_t (efi_bool_t *enabled,
+ efi_bool_t *pending,
+ efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_set_wakeup_time_t (efi_bool_t enabled,
+ efi_time_t *tm);
+typedef asmlinkage efi_status_t efi_get_variable_t (efi_char16_t *name,
+ efi_guid_t *vendor,
+ u32 *attr,
+ unsigned long *data_size,
+ void *data);
+typedef asmlinkage efi_status_t efi_get_next_variable_t (unsigned long *name_sz,
+ efi_char16_t *name,
+ efi_guid_t *vendor);
+typedef asmlinkage efi_status_t efi_set_variable_t (efi_char16_t *name,
+ efi_guid_t *vendor,
+ unsigned long attr,
+ unsigned long data_size,
+ void *data);
+typedef asmlinkage efi_status_t efi_get_next_high_mono_count_t (u32 *count);
+typedef asmlinkage void efi_reset_system_t (int reset_type,
+ efi_status_t status,
+ unsigned long data_size,
+ efi_char16_t *data);
+typedef asmlinkage efi_status_t efi_set_virtual_address_map_t (unsigned long memory_map_size,
unsigned long descriptor_size,
u32 descriptor_version,
efi_memory_desc_t *virtual_map);


2007-02-04 14:46:10

by Andi Kleen

Subject: Re: 2.6.20-rc7: known regressions


>
> When calling into the EFI firmware, the parameters need to be passed on
> the stack. The recent change to use -mregparm=3 breaks x86 EFI support.
> This patch is needed to allow the new Intel-based Macs to suspend to ram
> (efi.get_time is called during the suspend phase).

Thanks, looks good.

-Andi

2007-02-04 17:38:25

by Linus Torvalds

Subject: Re: 2.6.20-rc7: known regressions



On Sun, 4 Feb 2007, Frédéric Riss wrote:
>
> New patch:

I didn't get how this would fix the ia64 issues? I thought ia64 needed
the standard calling convention?

My gut feel is that EFI should be handled exactly the same way that we
used to handle APM: never even make it look like it's callable from C, but
make architecture-specific wrapper functions that have bog-standard
calling conventions, and then possibly even use inline asm to actually do
the real call (but even if you don't, at that point it would be inside one
particular arch-specific EFI source file - nobody outside of that would
ever call into the firmware directly).

As it is, I don't think I dare apply this right now, which means that it
will miss 2.6.20, and we'll have to backport it to the stable tree when
everybody agrees and has acked it. I don't like having suspend broken on
EFI macs, but on the other hand, I would hate to have an ia64 regression
even more..

Linus

2007-02-04 18:18:22

by Frederic Riss

Subject: Re: 2.6.20-rc7: known regressions

On Sunday 04 February 2007 at 09:34 -0800, Linus Torvalds wrote:
>
> On Sun, 4 Feb 2007, Frédéric Riss wrote:
> >
> > New patch:
>
> I didn't get how this would fix the ia64 issues? I thought ia64 needed
> the standard calling convention?

I think Andi said that adding asmlinkage on the function pointers
shouldn't harm ia64. If you prefer wrapper functions, one of the patches
I sent ( http://lkml.org/lkml/2007/1/30/309 ) did that, but the casting
it uses looks clumsy.

Fred.


2007-02-04 18:33:10

by Linus Torvalds

Subject: Re: 2.6.20-rc7: known regressions



On Sun, 4 Feb 2007, Frédéric Riss wrote:
>
> I think Andi said that adding asmlinkage on the function pointers
> shouldn't harm ia64. If you prefer wrapper functions, one of the patches
> I sent ( http://lkml.org/lkml/2007/1/30/309 ) did that, but the casting
> it uses looks clumsy.

I'm more comfortable with that one, at least for now. It's guaranteed to
not break ia64, at least. Also, it does what I think is right: do the
calling convention conversion at the call-site rather than at a C compiler
level.

Will apply a whitespace-fixed version,

Linus
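
As a rough illustration of the wrapper approach discussed in this exchange (converting the calling convention at the call site instead of annotating shared typedefs), here is a minimal sketch. It is not the patch referenced above. Every name in it (FW_STACKCALL, fw_runtime_services, fw_call_get_time, fake_get_time) is hypothetical, and fake_get_time merely stands in for a firmware entry point so the example can run in user space. The assumption is that on 32-bit x86 the firmware expects stack-passed arguments, which GCC's regparm(0) attribute requests; on other architectures the macro expands to nothing and the normal calling convention is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* On i386, force stack argument passing even if built with -mregparm=3. */
#if defined(__i386__)
#define FW_STACKCALL __attribute__((regparm(0)))
#else
#define FW_STACKCALL
#endif

typedef unsigned long fw_status_t;

struct fw_time {
    uint16_t year;
    uint8_t month;
    uint8_t day;
};

/* The service table holds raw entry addresses, not typed pointers. */
struct fw_runtime_services {
    uintptr_t get_time;
};

/* Pointer type carrying the firmware calling convention. */
typedef fw_status_t (FW_STACKCALL *fw_get_time_fn)(struct fw_time *tm, void *tc);

/*
 * Arch-specific wrapper: the only place that knows how to call into
 * the firmware. Callers see an ordinary C function.
 */
fw_status_t fw_call_get_time(const struct fw_runtime_services *rt,
                             struct fw_time *tm)
{
    fw_get_time_fn fn;

    assert(rt != NULL && rt->get_time != 0);
    fn = (fw_get_time_fn)rt->get_time;  /* the cast is the clumsy part */
    return fn(tm, NULL);
}

/* Stand-in for a firmware routine, for demonstration only. */
FW_STACKCALL fw_status_t fake_get_time(struct fw_time *tm, void *tc)
{
    (void)tc;
    tm->year = 2007;
    tm->month = 2;
    tm->day = 4;
    return 0;
}
```

With this shape, nothing outside the wrapper needs to know the firmware's calling convention, which is why such an approach cannot affect an architecture like ia64 that never compiles the i386 branch.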

2007-02-05 08:27:19

by Andi Kleen

Subject: Re: 2.6.20-rc7: known regressions

On Sunday 04 February 2007 18:34, Linus Torvalds wrote:
>
> On Sun, 4 Feb 2007, Frédéric Riss wrote:
> >
> > New patch:
>
> I didn't get how this would fix the ia64 issues? I thought ia64 needed
> the standard calling convention?

asmlinkage is standard enough on ia64 as far as I can see.

It defines to an undocumented attribute that seems to only affect the
generated code for functions (basically it forces the compiler
to not reuse input arguments, similar to prevent_tail_call() on i386),
not for pointers.

-Andi

2007-02-05 09:37:31

by Eric W. Biederman

Subject: Re: 2.6.20-rc7: known regressions

Andi Kleen <[email protected]> writes:

> On Sunday 04 February 2007 18:34, Linus Torvalds wrote:
>>
>> On Sun, 4 Feb 2007, Frédéric Riss wrote:
>> >
>> > New patch:
>>
>> I didn't get how this would fix the ia64 issues? I thought ia64 needed
>> the standard calling convention?
>
> asmlinkage is standard enough on ia64 as far as I can see.
>
> It defines to an undocumented attribute that seems to only affect the
> generated code for functions (basically it forces the compiler
> to not reuse input arguments, similar to prevent_tail_call() on i386),
> not for pointers.

Regardless. If we are serious about supporting EFI instead of just
nursing it along for the rare person stuck with it, we are going
to need to remove the brain-dead relocate exactly once to virtual addresses
call, so we can use kexec.

Therefore we will need a trampoline to switch to physical mode.

Of course that presumes the Open (but no one is allowed to read or implement
the standard, or join the committee ) EFI is sufficiently interesting at
some point to bother supporting properly.

Eric