2006-08-28 04:30:55

by Linus Torvalds

Subject: Linux v2.6.18-rc5


Ok,
this was delayed three weeks due to a combination of vacations and a
funeral in Finland, but Greg and Andrew kept on top of things, and we were
fairly late in the release cycle anyway, so it hopefully caused no real
problems apart from obviously delaying the final release a tiny bit.

Linux 2.6.18-rc5 is out there now, both in git form and as patches and
tar-balls (the latter which I forgot for -rc4, but Greg covered for me -
blush).

The shortlog (appended) tells the story: various fixes all around.
Powerpc, V4L, networking, SCSI..

Pls test it out, and please remind all the appropriate people about any
regressions you find (including any found earlier if they haven't been
addressed yet).

Thanks,
Linus


---
Adam Litke:
[POWERPC] hugepage BUG fix

Adrian Bunk:
fs/ocfs2/dlm/dlmmaster.c: unexport dlm_migrate_lockres
drivers/net/e1000/: possible cleanups

Alan Cox:
PATCH: 2.6.18 oops on boot fix for IDE
tty layer comment the locking assumptions and functions somewhat
Fix tty layer DoS and comment relevant code

Alan Stern:
unusual_devs update for UCR-61S2B

Albert Lee:
libata: Use ATA_FLAG_PIO_POLLING for pdc_adma

Alexander Zarochentsev:
fuse: fix error case in fuse_readpages

Alexey Dobriyan:
xircom_cb: wire up errors from pci_register_driver()
Input: remove dead URLs from Documentation/input/joystick.txt
Fix docs for fs.suid_dumpable

Alexey Kuznetsov:
[IPV4]: severe locking bug in fib_semantics.c

Ananth N Mavinakayanahalli:
[POWERPC] kprobes: Fix possible system crash during out-of-line single-stepping

Andreas Herrmann:
[SCSI] zfcp: minor erp bug fixes
[SCSI] zfcp: bump version number

Andrew Morton:
adfs error message fix
panic.c build fix
workqueue: remove lock_cpu_hotplug()
[NETFILTER]: xt_physdev build fix
82596 section fixes
ac3200 section fixes
cops section fix
cs89x0 section fix
at1700 section fix
e2100 section fix
eepro section fix
eexpress section fix
es3210 section fix
eth16i section fix
lance section fix
lne390 section fix
ni52 section fix
ibmtr section fix
smctr section fix
wd section fix
ni65 section fix
seeq8005 section fix
winbond-840 section fix
fealnx section fix
sundance section fix
s2io build fix
/proc/meminfo: don't put spaces in names

Andrew Vasquez:
[SCSI] qla2xxx: Log Trace/Diagonostic asynchronous events.
[SCSI] qla2xxx: Update version number to 8.01.05-k4.
[SCSI] qla2xxx: Correct PLOGI retry logic.
[SCSI] qla2xxx: Properly re-enable EFT support after an ISP abort.
[SCSI] qla2xxx: Update version number to 8.01.07-k1.

Andries Brouwer:
Fix for minix crash
ext2: prevent div-by-zero on corrupted fs

Andy Fleming:
[POWERPC] Fix interrupts on 8540 ADS board
[POWERPC] Fix CDS IRQ handling and PCI code
[POWERPC] Add 85xx DTS files to powerpc
[POWERPC] Fix FEC node in 8540 ADS dts

ASANO Masahiro:
VFS: add lookup hint for network file systems

Badari Pulavarty:
Manage jbd allocations from its own slabs

Ben Dooks:
[ARM] 3753/1: S3C24XX: DMA fixes
[ARM] 3754/1: S3C24XX: tidy arch/arm/mach-s3c2410/Makefile
drivers/rtc: fix rtc-s3c.c
rtc-s3c.c: fix time setting checks

Benjamin Herrenschmidt:
[POWERPC] Fix irq radix tree remapping typo
[POWERPC] Fix BootX booting with an initrd

Bjorn Helgaas:
PCI: quirk to disable e100 interrupt if RESET failed to

Brandon Philips:
genhd.c reference in Documentation/kobjects.txt

Brice Goglin:
myri10ge: always re-enable dummy rdmas in myri10ge_resume

Catalin Marinas:
[ARM] 3757/1: Use PROCINFO_INITFUNC in head.S

Chen-Li Tien:
[PKTGEN]: Fix oops when used with balance-tlb bonding

Christoph Hellwig:
[SCSI] fix simscsi
[SCSI] hptiop: backout ioctl mess
[NET]: Fix alloc_skb comment typo
[NET]: Assign skb->dev in netdev_alloc_skb
[TG3]: skb->dev assignment is done by netdev_alloc_skb

Chuck Lever:
SUNRPC: avoid choosing an IPMI port for RPC traffic

Cornelia Huck:
[S390] retry after deferred condition code.

Dan Bastone:
initialize parts of udf inode earlier in create

Daniel Kobras:
dm: Fix deadlock under high i/o load in raid1 setup.

Daniel Ritz:
PCI: use PCBIOS as last fallback
PCI: i386 mmconfig: don't forget bus number when setting fallback_slots bits
PCI: fix ICH6 quirks

Danny Tholen:
1394: fix for recently added firewire patch that breaks things on ppc

Dave Jones:
PCI: remove dead HOTPLUG_PCI_SHPC_PHPRM_LEGACY option.
cpufreq: acpi-cpufreq: Ignore failure from acpi_cpufreq_early_init_acpi
fix up lockdep trace in fs/exec.c

Dave Kleikamp:
JFS: Quota support broken, no quota_read and quota_write
JFS: Fix bug in quota code. tmp_bh.b_size must be initialized

David Brownell:
build fixes: smc91x
i2c: tps65010 build fixes

David Howells:
NFS: Check lengths more thoroughly in NFS4 readdir XDR decode

David Kuehling:
USB: unusual_devs entry for A-VOX WSX-300ER MP3 player

David L Stevens:
[MCAST]: Fix filter leak on device removal.

David S. Miller:
[PKTGEN]: Make sure skb->{nh,h} are initialized in fill_packet_ipv6() too.
[RTNETLINK]: Fix IFLA_ADDRESS handling.
[IPX]: Fix typo, ipxhdr() --> ipx_hdr()
[TCP]: Fix botched memory leak fix to tcpprobe_read().
[IPSEC]: Validate properly in xfrm_dst_check()
[VLAN]: Make sure bonding packet drop checks get done in hwaccel RX path.
[NET]: Disallow whitespace in network device names.
[SPARC64]: Fix pfn_pte() build failure.
[SCSI] esp: Fix build on SUN4.
[SERIAL] sunzilog: Mirror the sunsab serial setup bug fix.

David Wilder:
[POWERPC] Make secondary CPUs call into kdump on reset exception

Deepak Saxena:
Update smc91x driver with ARM Versatile board info

Diego Calleja:
V4L/DVB (4430): Quickcam_messenger compilation fix

Dirk Eibach:
char/moxa.c: fix endianess and multiple-card issues

Dmitry Mishin:
[NET]: add_timer -> mod_timer() in dst_run_gc()

Dmitry Torokhov:
Input: wistron - fix crash due to referencing __initdata

Don Fry:
pcnet32: break in 2.6.18-rc1 identified

Douglas Gilbert:
[SCSI] sg: fix incorrect page problem

Edgar E. Iglesias:
skge: remember to run netif_poll_disable()

Edgar Hucek:
add imacfb documentation and detection

Eric Sesterhenn:
Signedness issue in drivers/net/3c515.c

Evgeniy Dushistov:
ufs: write to hole in big file
ufs: truncate correction

Evgeniy Polyakov:
[CONNECTOR]: Add userspace example code into Documentation/connector/

Florin Malita:
Input: atkbd - fix overrun in atkbd_set_repeat_rate()

George G. Davis:
[ARM] 3745/1: Add EXPORT_SYMBOL(rtc_next_alarm_time) to ARM rtctime.c

Gerald Schaefer:
[S390] add __cpuinit to appldata_cpu_notify

Grant Grundler:
[SCSI] sym2: claim only "Storage" class

Greg Kroah-Hartman:
USB: fix bug in cypress_cy7c63.c driver

Handle X:
ACPI: hotkey.c fixes, fix for potential crash of hotkey.c

Hans de Goede:
PATCH: 1 line 2.6.18 bugfix: modpost-64bit-fix.patch
hwmon: abituguru timeout fixes

Hans Verkuil:
V4L/DVB (4416): Cx25840_read4 has wrong endianness.
V4L/DVB (4418): Fix broken msp3400 module option 'standard'
V4L/DVB (4419): Turn on the Low Noise Amplifier of the Samsung tuners.

Haren Myneni:
[POWERPC] Fix might-sleep warning on removing cpus

Heiko Carstens:
[S390] tape class return value handling.
[S390] dasd slab cache alignment.
[S390] kernel page table allocation.
s390: fix arp_tbl lock usage in qeth

Henrik Kretzschmar:
PCI: kerneldoc correction in pci-driver

Herbert Xu:
Send wireless netlink events with a clean slate
[IPV6]: The ifa lock is a BH lock
[INET]: Use pskb_trim_unique when trimming paged unique skbs
[BRIDGE]: Disable SG/GSO if TX checksum is off

HighPoint Linux Team:
[SCSI] hptiop: wrong register used in hptiop_reset_hba()

Horms:
Change panic_on_oops message to "Fatal exception"

Horst Hummel:
[S390] dasd set offline kernel bug.
[S390] dasd calls kzalloc while holding a spinlock.
[S390] dasd PAV enabling.

Ian McDonald:
[DCCP]: Fix typo
[DCCP]: Update contact details and copyright
[DCCP]: Introduces follows48 function
[DCCP]: Introduce dccp_rx_hist_find_entry
[DCCP]: Fix CCID3

Ingo Molnar:
[IPV6] lockdep: annotate __icmpv6_socket
lockdep: annotate idescsi_pc_intr()
lockdep: annotate reiserfs

J. Bruce Fields:
NFSv4: increase client-provided nfs4 clientid size

Jack Morgenstein:
IB/core: Fix SM LID/LID change with client reregister set

James Smart:
[SCSI] lpfc 8.1.7 : Add statistics reset callback for FC transport
[SCSI] lpfc 8.1.7 : Fix failing firmware download due to mailbox delays needing to be longer
[SCSI] lpfc 8.1.7 : Fix race condition between lpfc_sli_issue_mbox and lpfc_online
[SCSI] lpfc 8.1.7 : Short bug fixes
[SCSI] lpfc 8.1.7 : ID String and Message fixes
[SCSI] lpfc 8.1.7 : Change version number to 8.1.8
[SCSI] lpfc 8.1.9 : Misc Bug Fixes
[SCSI] lpfc 8.1.9 : Stall eh handlers if resetting while rport blocked
[SCSI] lpfc 8.1.9 : Change version number to 8.1.9

Jan "Yenya" Kasprzak:
[NET]: Terminology in ip-sysctl.txt

Jan Blunck:
fix hrtimer percpu usage typo

Jan Kara:
Fix possible UDF deadlock and memory corruption (CVE-2006-4145)

Jean Delvare:
ACPI: fix kfree in i2c_ec error path

Jeff Garzik:
[libata] manually inline ata_host_remove()

Jeff Mahoney:
[DISKLABEL] SUN: Fix signed int usage for sector count

Jim Lewis:
Add ethtool -g support to Spidernet network driver

Joerg Ahrens:
xirc2ps_cs: Cannot reset card in atomic context

Johannes Berg:
USB: appletouch: fix atp_disconnect

john stultz:
futex_handle_fault always fails

Jon Loeliger:
[POWERPC] Convert to mac-address for ethernet MAC address data.
[POWERPC] Add MPC8641 HPCN Device Tree Source file.
[POWERPC] Offer PCI as a CONFIG choice for PPC_86xx.
[POWERPC] Fix the mpc8641_hpcn.dts file.
[POWERPC] Rewrite the PPC 86xx IRQ handling to use Flat Device Tree

Jonathan Davies:
USB: ftdi_sio driver - new PIDs

Jonathan McDowell:
MTD NAND: Fix ams-delta after core conversion

Ju, Seokmann:
[SCSI] megaraid_{mm,mbox}: 64-bit DMA capability checker
[SCSI] megaraid_{mm,mbox}: a fix on INQUIRY with EVPD
[SCSI] megaraid_{mm,mbox}: a fix on "kernel unaligned access address" issue

Juha Yrjölä:
[ARM] 3744/1: MMC: mmcqd gets stuck when block queue is plugged

KAMEZAWA Hiroyuki:
register_one_node() compile fix
CONFIG_ACPI_SRAT NUMA build fix
x86: NUMAQ Kconfig fix

Keith Owens:
Fix compile problem when sata debugging is on

Kevin Hao:
net: Add netconsole support to dm9000 driver

Kevin Hilman:
[ATM]: Compile error on ARM
[ARM] 3755/1: dmabounce: fix return value for find_safe_buffer

Kirill Korotaev:
[IPV4]: Limit rt cache size properly.
sys_getppid oopses on debug kernel

Kristen Carlson Accardi:
ACPI: add Dock Station driver to MAINTAINERS file
pciehp: make pciehp build for powerpc
ACPIPHP: allow acpiphp to build without ACPI_DOCK

Krzysztof Halasa:
WAN: fix C101 card carrier handling

Krzysztof Helt:
[SPARC]: enabling of the 2nd CPU in 2.6.18-rc4
[SPARC]: Small smp cleanup.

Kurt Hackel:
ocfs2: Fix lvb corruption
ocfs2: do not modify lksb->status in the unlock ast
ocfs2: fix check for locally granted state during dlmunlock()

Len Brown:
ACPI: restore some dmesg to DEBUG-only, ala 2.6.17
ACPI: skip smart battery init when acpi=off
ACPI: avoid irqrouter_resume might_sleep oops on resume from S4

Lennert Buytenhek:
smc91x: disable DMA mode on the logicpd pxa270

Li Yang:
Freescale QE UCC gigabit ethernet driver
[POWERPC] Fix compile problem without CONFIG_PCI

Linus Torvalds:
Linux v2.6.18-rc5

Luc Van Oostenryck:
V4L/DVB (4395): Restore compat_ioctl in pwc driver

Marc Zyngier:
[SERIAL] sunsab: Fix E250 console with RSC.

Mark Fasheh:
ocfs2: limit cluster bitmap information saved at mount
ocfs2: better group descriptor consistency checks
ocfs2: allocation hints

Mark Huang:
[NETFILTER]: ulog: fix panic on SMP kernels

Martin Hicks:
libata: PHY reset requires writing 0x4 to SControl

Martin Michlmayr:
[ARM] 3747/1: Fix compilation error in mach-ixp4xx/gtwx5715-setup.c

Martin Schwidefsky:
[S390] xpram system device class.

Masoud Asgharifard Sharbiani:
eventpoll.c compile fix

Matt LaPlante:
[WATCHDOG] Kconfig typos fix.

Mauro Carvalho Chehab:
V4L/DVB (4340): Videodev.h should be included also when V4L1_COMPAT is selected.
V4L/DVB (4371a): Fix V4L1 dependencies on compat_ioctl32
V4L/DVB (4371b): Fix V4L1 dependencies at drivers under sound/oss and sound/pci
V4L/DVB (4399): Fix a typo that caused some compat stuff to not work
V4L/DVB (4407): Driver dsbr100 is a radio device, not a video one!
V4L/DVB (4427): Fix V4L1 Compat for VIDIOCGPICT ioctl

Michael Chan:
[TG3]: Fix tx race condition
[BNX2]: Fix tx race condition.
[BNX2]: Convert to netdev_alloc_skb()

Michael Ellerman:
[POWERPC] Move some kexec logic into machine_kexec.c
[POWERPC] Make crash.c work on 32-bit and 64-bit

Michael Rash:
[TEXTSEARCH]: Fix Boyer Moore initialization bug

Michael Reed:
[SCSI] mptfc: properly wait for firmware target discovery to complete
[SCSI] mptfc: correct out of order event processing

Michael S. Tsirkin:
IB/mthca: Make fence flag work for send work requests
IB/mthca: Update HCA firmware revisions

Michal Januszewski:
fbdev: include backlight.h only when __KERNEL__ is defined

Michal Miroslaw:
dm: BUG/OOPS fix

Michal Ruzicka:
[IPV4]: Possible leak of multicast source filter sctructure

Mike Christie:
[SCSI] iscsi bugfixes: send correct error values to userspace
[SCSI] iscsi bugfixes: fix r2t handling
[SCSI] iscsi bugfixes: handle data rsp errors
[SCSI] iscsi bugfixes: fix abort handling
[SCSI] iscsi bugfixes: fix oops when iser is flushing io
[SCSI] iscsi bugfixes: fix oops when removing session
[SCSI] iscsi bugfixes: dont use GFP_KERNEL for sending errors
[SCSI] iscsi bugfixes: reduce memory allocations
[SCSI] iscsi bugfixes: pass errors from complete_pdu to caller
[SCSI] iscsi bugfixes: fix mem leaks in libiscsi
[SCSI] iscsi bugfixes: update and move version number
[SCSI] fix scsi_send_eh_cmnd regression

Mingming Cao:
ext3 filesystem bogus ENOSPC with reservation fix

Nathan Lynch:
[POWERPC] Fix gettimeofday inaccuracies

Nathan Scott:
[XFS] Fix xfs_free_extent related NULL pointer dereference.

NeilBrown:
md: avoid backward event updates in md superblock when degraded.
md: fix recent breakage of md/raid1 array checking

Nick Piggin:
cpuset: oom panic fix

Nicolas Pitre:
[ARM] 3746/2: Userspace helpers must be Thumb mode interworkable

Nikita Danilov:
NFS: Fix a potential deadlock in nfs_release_page

Norihiko Tomiyama:
USB: Additional PID for SHARP W-ZERO3

Oleg Nesterov:
sys_ioprio_set: minor do_each_thread+break fix
Fix current_io_context() vs set_task_ioprio() race
uninline ioprio_best()
cfq_cic_link: fix usage of wrong cfq_io_context
elv_unregister: fix possible crash on module unload
revert "Drop tasklist lock in do_sched_setscheduler"
futex_find_get_task(): remove an obscure EXIT_ZOMBIE check

Olof Johansson:
[POWERPC] powerpc: Clear HID0 attention enable on PPC970 at boot time

Orjan Friberg:
USB: usbtest.c: unsigned retval makes ctrl_out return 0 in case of error

Panagiotis Issaris:
[PPP]: handle kmalloc failures and convert to using kzalloc

Patrick McHardy:
[NETFILTER]: xt_hashlimit: fix limit off-by-one
[NETFILTER]: {arp,ip,ip6}_tables: proper error recovery in init path
[NETFILTER]: ctnetlink: fix deadlock in table dumping
[NETFILTER]: ip_tables: fix table locking in ipt_do_table
[NETFILTER]: arp_tables: fix table locking in arpt_do_table

Paul A. Clarke:
matroxfb: fix jittery display on non-ppc systems

Paul Gortmaker:
[ARM] 3756/1: Assign value for HWCAP_IWMMXT

Paul Jackson:
cpuset: top_cpuset tracks hotplug changes to cpu_online_map

Paul Mackerras:
[POWERPC] Correct masks used in emulating some instructions

Pavel Machek:
pr_debug() should not be used in drivers
ACPI: fix boot with acpi=off

Pavel Roskin:
spectrum_cs: Fix incorrect use of pcmcia_dev_present()
hostap: Restore antenna selection settings after port reset

Peter Korsgaard:
smc911x: Re-release spinlock on spurious interrupt

Peter Oberparleiter:
[S390] lost interrupt after chpid vary off/on cycle.
[S390] inaccessible PAV alias devices on LPAR.

Peter Zijlstra:
lockdep: fix blkdev_open() warning

Phil Oester:
[NETFILTER]: xt_string: fix negation

Pierre Ossman:
[MMC] Fix base address configuration in wbsd
[MMC] Another stray 'io' reference

Pozsar Balazs:
Input: psmouse - fix Intellimouse 4.0 initialization

Rafael J. Wysocki:
swsusp: Fix swap_type_of

Ralf Hildebrandt:
[PKT_SCHED] cls_u32: Fix typo.

Randy Dunlap:
ACPI: handle firmware_register init errors
ACPI: scan: handle kset/kobject errors
ACPI: add message if firmware_register() init fails
ACPI: verbose on kset/kobject_register errors
cdrom/gdsc: fix printk format warning

Richard Purdie:
spectrum_cs: Fix firmware uploading errors
mtd corruption fix

Roger Luethi:
via-rhine: NAPI support
via-rhine: add option avoid_D3 (work around broken BIOSes)

Roland Dreier:
IB/mthca: Fix potential AB-BA deadlock with CQ locks
IB/mthca: No userspace SRQs if HCA doesn't have SRQ support

Rolf Eike Beer:
tty: remove bogus call to cdev_del()

Russell King:
[ARM] Fix pci export warnings
[ARM] Fix NCR5380-based SCSI card build
[ARM] Fix Acorn platform SCSI driver build failures
lockdep: fix smc91x

Sam Ravnborg:
kbuild: do not try to build content of initramfs
kbuild: external modules shall not check config consistency
kbuild: correct assingment to CFLAGS with CROSS_COMPILE

Samuel Thibault:
vcsa attribute bits -> ioctl(VT_GETHIFONTMASK)

Scott Murray:
CPCI hotplug: fix resource assignment

Shyam Sundar:
[SCSI] qla2xxx: Correct endianess problem while issuing a Marker IOCB on ISP24xx.

Sonny Rao:
[POWERPC] fix PMU initialization on pseries lpar

Sridhar Samudrala:
Fix sctp privilege elevation (CVE-2006-3745)

Starikovskiy, Alexey Y:
ACPI: relax BAD_MADT_ENTRY check to allow LSAPIC variable length string UIDs

Stephen Hemminger:
[IPX]: Header length validation needed
[IPX]: Another nonlinear receive fix
sky2: phy power problems on 88e805X chips
[LLC]: multicast receive device match
via-rhine: NAPI poll enable
[TCP]: Limit window scaling if window is clamped.
[IPV6]: Segmentation offload not set correctly on TCP children
[BRIDGE] netfilter: memory corruption fix

Steven Rostedt:
Add stable branch to maintainers file

Suresh Siddha:
[NET]: Fix potential stack overflow in net/core/utils.c

Tejun Heo:
libata: fix ata_port_detach() for old EH ports
ata_piix: fix host_set private_data intialization
sata_sil24: don't set probe_ent->mmio_base
libata: fix ata_device_add() error path
libata: clear sdev->locked on door lock failure
ata_piix: fix ghost device probing by honoring PCS present bits
ata_piix: ignore PCS on ICH5
ata_piix: implement force_pcs module parameter
sata_via: use old SCR access pattern on vt6420

Thomas Meyer:
x86: Fix dmi detection of MacBookPro and iMac

Tom Zanussi:
Documentation update for relay interface

Tomasz Kazmierczak:
USB: pl2303: removed support for OTi's DKU-5 clone cable

Trent Piepho:
V4L/DVB (4411): Fix minor errors in build files

Trond Myklebust:
fcntl(F_SETSIG) fix
SUNRPC: make rpc_unlink() take a dentry argument instead of a path
NFS: clean up rpc_rmdir
SUNRPC: rpc_unlink() must check for unhashed dentries
SUNRPC: Fix dentry refcounting issues with users of rpc_pipefs
LOCKD: Fix a deadlock in nlm_traverse_files()
NFS: Fix issue with EIO on NFS read
NFSv4: Add v4 exception handling for the ACL functions.
VFS: Fix access("file", X_OK) in the presence of ACLs
VFS: Remove redundant open-coded mode bit check in prepare_binfmt().
VFS: Remove redundant open-coded mode bit checks in open_exec().

Vitaly Bordug:
PAL: Support of the fixed PHY
FS_ENET: use PAL for mii management
ppc32: board-specific part of fs_enet update

Vladislav Bolkhovitin:
[SCSI] qla2xxx: Fix to allow to reset devices using sg interface (sg_reset).

Volker Sameske:
[SCSI] zfcp: improve management of request IDs

Wei Yongjun:
[TCP]: SNMPv2 tcpOutSegs counter error

Will Schmidt:
[POWERPC] update {g5,iseries,pseries}_defconfigs

William Morrrow:
ACPI: Handle BIOS that resumes from S3 to suspend routine rather than resume vector

Yasunori Goto:
ACPI: memory hotplug: remove useless message at boot time

Yeasah Pell:
V4L/DVB (4431): Add several error checks to dst

Yingchao Zhou:
Remove redundant up() in stop_machine()

Yoav Steinberg:
[ARM] 3752/1: fix versatile flash resource map

Yoichi Yuasa:
USB: removed a unbalanced #endif from ohci-au1xxx.c

Zang Roy-r61911:
[POWERPC] Update mpc7448hpc2 board irq support using device tree
[POWERPC] Pass UPIO_TSI flag to 8259 serial driver


2006-08-28 05:17:34

by Marc Perkel

Subject: Re: Linux v2.6.18-rc5

You might want to look at this bug.

http://bugzilla.kernel.org/show_bug.cgi?id=6975

The current kernel doesn't run on Asus Motherboards that use the new AM2
CPUs. Should this be addressed before 2.6.18 is finished?

2006-08-28 05:52:25

by Linus Torvalds

Subject: Re: Linux v2.6.18-rc5



On Sun, 27 Aug 2006, Marc Perkel wrote:
>
> You might want to look at this bug.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6975
>
> The current kernel doesn't run on Asus Motherboards that use the new AM2 CPUs.
> Should this be addressed before 2.6.18 is finished?

Hmm. Can you verify that the system boots fine if you get rid of
acpi_skip_timer_override as per the hint from Prakash Punnoor?

That is, in the file arch/x86_64/kernel/io_apic.c, find the place where it
does something like

	...
	if (nvidia_hpet_detected == 0) {
		acpi_skip_timer_override = 1;
		printk(KERN_INFO "Nvidia board "
		       "detected. Ignoring ACPI "
		       "timer override.\n");
	}
	...

and just comment that whole thing out (or at least the assignment that
sets the "acpi_skip_timer_override" variable to 1).

Andi? You were talking about how the 64-bit machines don't have some of
the cruft that the old PC's have.. It looks like they are accumulating
_more_ cruft than regular x86 ever had...

Linus

2006-08-28 06:16:17

by Andrew Morton

Subject: Re: Linux v2.6.18-rc5

On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
Linus Torvalds <[email protected]> wrote:

> Linux 2.6.18-rc5 is out there now

(Reporters Bcc'ed: please provide updates)

Serious-looking regressions include:


http://bugzilla.kernel.org/show_bug.cgi?id=7062 (HPET)

From: Chuck Ebbert <[email protected]>
Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0

From: "Uwe Bugla" <[email protected]>
Subject: keyboard errors with module atkbd.c in Kernel 2.6.18-rc4

From: Olaf Hering <[email protected]>
Subject: oops in __delayacct_blkio_ticks with 2.6.18-rc4

From: "Catalin Marinas" <[email protected]>
Subject: Possible memory leak in kernel/delayacct.c

From: walt <[email protected]>
Subject: Sound not working correctly as of 2.6.15-rc1

From: Johan Rutgeerts <[email protected]>
Subject: Acpi oops 2.6.17.7 vanilla

From: Andrew Benton <[email protected]>
Subject: ALSA problems with 2.6.18-rc3

From: Sean Bruno <[email protected]>
Subject: [BUG] Kernel Panic from AHD when power cycling external Disk/Array

From: "Beschorner Daniel" <[email protected]>
Subject: fctnl(F_SETSIG) no longer works in 2.6.17, does in 2.6.16.

(I think we fixed this?)

From: Keith Owens <[email protected]>
Subject: 2.6.18-rc4 Intermittent failures to detect sata disks

From: "Zephaniah E. Hull" <[email protected]>
Subject: [patch] Crash on evdev disconnect.

From: Andreas Barth <[email protected]>
Subject: Re: Fw: gdth SCSI driver(?) fails with more than 4GB of memory

(Long saga - attempts were made to fix it but I think we're stumped?)

From: Andi Kleen <[email protected]>
Subject: Futex BUG in 2.6.18rc2-git7

From: Elias Holman <[email protected]>
Subject: PROBLEM: PCI/Intel 82945 trouble on Toshiba M400 notebook

From: "Alex Polvi" <[email protected]>
Subject: [PATCH] sunrpc/auth_gss: NULL pointer deref in gss_pipe_release()

From: Hubert Tonneau <[email protected]>
Subject: Re: Linux v2.6.18-rc3

(USB Audio regression)


That list is maybe a quarter of my list of "recently reported regressions
which haven't been pushed into bugzilla yet". There are many more in
bugzilla. We have a lot of regressions.

2006-08-28 06:13:06

by Andi Kleen

Subject: Re: Linux v2.6.18-rc5

On Sun, Aug 27, 2006 at 10:52:06PM -0700, Linus Torvalds wrote:
>
>
> On Sun, 27 Aug 2006, Marc Perkel wrote:
> >
> > You might want to look at this bug.
> >
> > http://bugzilla.kernel.org/show_bug.cgi?id=6975
> >
> > The current kernel doesn't run on Asus Motherboards that use the new AM2 CPUs.


That sounds like an overly broad statement. How do you know
it affects all Asus boards and not just your specific BIOS version?

> > Should this be addressed before 2.6.18 is finished?
>
> Hmm. Can you verify that the system boots fine if you get rid of
> acpi_skip_timer_override as per the hint from Prakash Punnoor?

We already should disable it on NF5 automatically. Timer override was all
broken on NF3/NF4, but apparently works on NF5 again.

But the check relies on HPET being present. Maybe Asus "forgot"
to set up the HPET table again and the test fails.

[In general Asus BIOS writers seem to have issues. They completely
broke all the MCFG tables too]

I can't say from the URL above if it's that because it's missing a complete
boot log. Marc, please add that.

Andy, I guess the timer override check just needs to be tightened to check
the specific PCI IDs of NF3/NF4 only and not rely on HPET being right.
Do you have a list of them?

I suppose we also need a no_acpi_skip_override setup option for future
cases.


>
> Andi? You were talking about how the 64-bit machines don't have some of
> the cruft that the old PC's have.. It looks like they are accumulating
> _more_ cruft than regular x86 ever had...

Just by logic it's impossible because all 64bit systems are regular
PCs too @)

Anyways, there is cruft, but it is new cruft replacing the old cruft,
so overall there is less cruft.

-Andi
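
[Note: the tightened check proposed above, matching specific NF3/NF4 PCI IDs
instead of relying on HPET detection and adding an opt-out option, can be
sketched in plain C. This is a minimal userspace illustration, not kernel
code. The struct, the function name should_skip_timer_override(), the no_skip
flag (standing in for the suggested no_acpi_skip_override option) and both
device IDs are assumptions made up for the example; only the NVIDIA vendor ID
0x10de is real.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_VENDOR_ID_NVIDIA 0x10de	/* real NVIDIA PCI vendor ID */

struct pci_id {
	uint16_t vendor;
	uint16_t device;
};

/*
 * Hypothetical list of bridges with a broken timer override.
 * The device IDs are placeholders, NOT the real NF3/NF4 IDs.
 */
static const struct pci_id broken_override_bridges[] = {
	{ PCI_VENDOR_ID_NVIDIA, 0xf003 },	/* stands in for an NF3 bridge */
	{ PCI_VENDOR_ID_NVIDIA, 0xf004 },	/* stands in for an NF4 bridge */
};

/*
 * Return true when the ACPI timer override should be ignored.
 * no_skip models a boot option that disables the quirk entirely.
 */
static bool should_skip_timer_override(const struct pci_id *bridge,
					bool no_skip)
{
	size_t n = sizeof(broken_override_bridges) /
		   sizeof(broken_override_bridges[0]);
	size_t i;

	assert(bridge != NULL);

	if (no_skip)
		return false;

	/* Decide purely on the ID list; HPET presence is not consulted. */
	for (i = 0; i < n; i++) {
		if (broken_override_bridges[i].vendor == bridge->vendor &&
		    broken_override_bridges[i].device == bridge->device)
			return true;
	}
	return false;
}
```

[With this shape, a board whose BIOS omits the HPET table but whose bridge is
not on the list keeps its timer override, and the opt-out flag covers future
chipsets that are listed by mistake.]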

2006-08-28 06:19:55

by Olaf Hering

Subject: Re: Linux v2.6.18-rc5

On Sun, Aug 27, Andrew Morton wrote:

> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)

> Subject: oops in __delayacct_blkio_ticks with 2.6.18-rc4

This patch is supposed to fix it.

http://lkml.org/lkml/2006/8/22/245
http://lkml.org/lkml/2006/8/24/299

2006-08-28 06:24:44

by Andrew Morton

Subject: Re: Linux v2.6.18-rc5

On Mon, 28 Aug 2006 08:19:40 +0200
Olaf Hering <[email protected]> wrote:

> On Sun, Aug 27, Andrew Morton wrote:
>
> > On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> > Linus Torvalds <[email protected]> wrote:
> >
> > > Linux 2.6.18-rc5 is out there now
> >
> > (Reporters Bcc'ed: please provide updates)
>
> > Subject: oops in __delayacct_blkio_ticks with 2.6.18-rc4
>
> This patch is supposed to fix it.
>
> http://lkml.org/lkml/2006/8/22/245
> http://lkml.org/lkml/2006/8/24/299

Yes, there are two delay-accounting fixes pending - this and a memory leak.
Shailabh is off preparing the final versions (I hope).

2006-08-28 06:29:47

by Jeff Garzik

Subject: Re: Linux v2.6.18-rc5

Andrew Morton wrote:
> From: Keith Owens <[email protected]>
> Subject: 2.6.18-rc4 Intermittent failures to detect sata disks

Should already be fixed in -rc5, by

commit f3745a3f9fa39fa3c62f7d5b8549ee787d2c6848
Author: Tejun Heo <[email protected]>
Date: Tue Aug 22 21:06:46 2006 +0900

[PATCH] ata_piix: ignore PCS on ICH5

BTW, as of 9dd9c16465c82d1385f97d2a245641464fcb7894 we have a force_pcs
module (& command line) parameter which can tune the driver a bit, if
people are having problems.

Jeff


2006-08-28 07:25:41

by Andi Kleen

Subject: Re: Linux v2.6.18-rc5

On Monday 28 August 2006 08:14, Andrew Morton wrote:

> From: Andi Kleen <[email protected]>
> Subject: Futex BUG in 2.6.18rc2-git7

I don't think I saw a fix for that, but Thomas and Ingo should know.

-Andi

2006-08-28 07:25:05

by Prakash Punnoor

Subject: Re: Linux v2.6.18-rc5

Am Montag 28 August 2006 08:13 schrieb Andi Kleen:
> On Sun, Aug 27, 2006 at 10:52:06PM -0700, Linus Torvalds wrote:
> > On Sun, 27 Aug 2006, Marc Perkel wrote:
> > > You might want to look at this bug.
> > >
> > > http://bugzilla.kernel.org/show_bug.cgi?id=6975
> > >
> > > The current kernel doesn't run on Asus Motherboards that use the new
> > > AM2 CPUs.
>
> That sounds like an overly broad statement. How do you know
> it affects all Asus boards and not just your specific BIOS version?
>
> > > Should this be addressed before 2.6.18 is finished?
> >
> > Hmm. Can you verify that the system boots fine if you get rid of
> > acpi_skip_timer_override as per the hint from Prakash Punnoor?
>
> We already should disable it on NF5 automatically. Timer override was all
> broken on NF3/NF4, but apparently works on NF5 again.
>
> But the check relies on HPET being present. Maybe Asus "forgot"
> to set up the HPET table again and the test fails.

At least my dmesg says nothing about hpet and thus wants to enable the quirk.
It is a nforce430 (thus nf4) chipset, though. You can find my bootlog here:

http://marc.theaimsgroup.com/?l=linux-kernel&m=115545986619977&w=2

Cheers,
--
(?= =?)
//\ Prakash Punnoor /\\
V_/ \_V



2006-08-28 08:05:05

by Catalin Marinas

Subject: Re: Linux v2.6.18-rc5

On 28/08/06, Andrew Morton <[email protected]> wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > Linux 2.6.18-rc5 is out there now
[...]
> From: "Catalin Marinas" <[email protected]>
> Subject: Possible memory leak in kernel/delayacct.c

Michal (cc'ed) reported that the leak no longer shows with a patch
from Shailabh - http://lkml.org/lkml/2006/8/22/246. It doesn't seem
that the patch was merged yet so I suspect the problem is still there.

--
Catalin

2006-08-28 08:05:15

by Thomas Gleixner

Subject: Re: Linux v2.6.18-rc5

On Mon, 2006-08-28 at 09:25 +0200, Andi Kleen wrote:
> On Monday 28 August 2006 08:14, Andrew Morton wrote:
>
> > From: Andi Kleen <[email protected]>
> > Subject: Futex BUG in 2.6.18rc2-git7
>
> I don't think I saw a fix for that, but Thomas and Ingo should know.

You should know too :)

tglx


-------- Forwarded Message --------
From: Olaf Hering <[email protected]>
To: Andi Kleen <[email protected]>
Cc: Olaf Hering <[email protected]>, Thomas Gleixner <[email protected]>,
[email protected]
Subject: Re: Futex BUG in 2.6.18rc2-git7
Date: Sat, 5 Aug 2006 10:07:14 +0200

On Sat, Aug 05, 2006 at 01:09:54AM +0200, Andi Kleen wrote:
> On Friday 04 August 2006 22:26, Olaf Hering wrote:
> > On Fri, Aug 04, 2006 at 10:12:15PM +0200, Thomas Gleixner wrote:
> >
> > > Is the glibc the latest CVS version ?
> >
> > It's a snapshot from 2006073023.
>
> Olaf, wagner is running that kernel+Thomas' patch now (although I
> didn't think any compat was involved) now. Can you please restart
> the glibc test?

This patch fixes it, also the ppc32 and ppc64 glibc make check.


2006-08-28 08:25:09

by Catalin Marinas

Subject: Re: Linux v2.6.18-rc5

On 28/08/06, Andrew Morton <[email protected]> wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:

Probably not as serious but it looks like a real memory leak in
drivers/usb/input/hid-core.c. I reported it about 2 weeks ago on the
linux-usb-devel list -
http://article.gmane.org/gmane.linux.usb.devel/45691.

--
Catalin

2006-08-28 08:35:44

by Keith Owens

Subject: Re: Linux v2.6.18-rc5

Andrew Morton (on Sun, 27 Aug 2006 23:14:21 -0700) wrote:
>(Reporters Bcc'ed: please provide updates)
>
>Serious-looking regressions include:
>From: Keith Owens <[email protected]>
>Subject: 2.6.18-rc4 Intermittent failures to detect sata disks

Two hours of continuous reboots on an ICH5 chipset passed without any
problems. Couple of caveats though -

(1) The "fix" for this bug is to skip the pcs test for SATA ports on
ICH5 chipsets. This results in spurious warning messages for ICH5
SATA ports with no disks attached.

ATA: abnormal status 0x7F on port 0xCCA7

(2) I have seen the same intermittent bug on ICH7 SATA but
PIIX_FLAG_IGNORE_PCS is only set for ich5 and i6300esb_sata. It
probably needs to be set for ich7 as well.
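
For readers who have not looked at ata_piix.c, here is a rough user-space
sketch of the check being discussed: a port counts as present when its bit
is set in the PCS register, unless the controller is flagged to ignore PCS.
The names FLAG_IGNORE_PCS, PRESENT_SHIFT and present_mask(), the bit layout
and the two-port limit are made up for illustration; this is not the
driver's actual code.

```c
#include <stdint.h>

/* Hypothetical flag and bit layout, chosen for illustration only. */
#define FLAG_IGNORE_PCS  (1u << 0)  /* do not trust the PCS present bits */
#define PRESENT_SHIFT    4          /* assumed: present bits start at bit 4 */
#define NR_PORTS         2

/*
 * Return a bitmask of ports considered present.
 * With FLAG_IGNORE_PCS set, every port is reported present and the
 * later device probe has to sort out which ones are really there.
 */
static unsigned int present_mask(uint16_t pcs, unsigned int flags)
{
	unsigned int mask = 0;
	int port;

	for (port = 0; port < NR_PORTS; port++) {
		if ((flags & FLAG_IGNORE_PCS) ||
		    (pcs & (1u << (PRESENT_SHIFT + port))))
			mask |= 1u << port;
	}
	return mask;
}
```

In this model a transient PCS value of 0 gives an empty mask unless the
ignore flag is set, which is the trade-off against probing ports that
have no disk attached.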

2006-08-28 08:43:16

by Tejun Heo

Subject: Re: Linux v2.6.18-rc5

Keith Owens wrote:
> Two hours of continuous reboots on an ICH5 chipset passed without any
> problems. Couple of caveats though -
>
> (1) The "fix" for this bug is to skip the pcs test for SATA ports on
> ICH5 chipsets. This results in spurious warning messages for ICH5
> SATA ports with no disks attached.
>
> ATA: abnormal status 0x7F on port 0xCCA7

This is a known annoyance and will be fixed in time.

> (2) I have seen the same intermittent bug on ICH7 SATA but
> PIIX_FLAG_IGNORE_PCS is only set for ich5 and i6300esb_sata. It
> probably needs to be set for ich7 as well.

No, ICH7 up to this point has been believed to have a well-behaved PCS.
If you report a PCS problem, you'll be the first. Also, note that ICH7
suffers from a ghost device probing problem if PCS is not honored exactly.
Are you sure it's the same problem?

Thanks.

--
tejun

2006-08-28 08:53:57

by Beschorner Daniel

Subject: Re: Linux v2.6.18-rc5

> Subject: fcntl(F_SETSIG) no longer works in 2.6.17, does in 2.6.16.
> (I think we fixed this?)

Fixed in rc5.

2006-08-28 09:57:17

by Chuck Ebbert

Subject: Re: Linux v2.6.18-rc5

In-Reply-To: <[email protected]>

On Sun, 27 Aug 2006 23:14:21 -0700, Andrew Morton wrote:

> > Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:
>
> <...>
>
> From: Chuck Ebbert <[email protected]>
> Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0

Also happens on 2.6.16.28 and 2.6.17.11, so not a regression.

> From: "Beschorner Daniel" <[email protected]>
> Subject: fcntl(F_SETSIG) no longer works in 2.6.17, does in 2.6.16.
>
> (I think we fixed this?)

Fixed by this -rc5 patch:

Trond Myklebust:
fcntl(F_SETSIG) fix

--
Chuck

2006-08-28 10:10:47

by Jesper Juhl

Subject: Re: Linux v2.6.18-rc5

On 28/08/06, Linus Torvalds <[email protected]> wrote:
>
> Ok,
> this was delayed three weeks due to a combination of vacations and a
> funeral in Finland, but Greg and Andrew kept on top of things, and we were
> fairly late in the release cycle anyway, so it hopefully caused no real
> problems apart from obviously delaying the final release a tiny bit.
>
> Linux 2.6.18-rc5 is out there now, both in git form and as patches and
> tar-balls (the latter which I forgot for -rc4, but Greg covered for me -
> blush).
>
> The shortlog (appended) tells the story: various fixes all around.
> Powerpc, V4L, networking, SCSI..
>
> Pls test it out, and please remind all the appropriate people about any
> regressions you find (including any found earlier if they haven't been
> addressed yet).
>
Not really a regression, more like a long standing bug, but XFS has
issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
With heavy rsync load to a machine with XFS filesystems, XFS falls
over and filesystems are in need of xfs_repair.
I'm doing all I can to gather info for Nathan so he can fix the bug,
but it's hard to trigger reliably.
My point is that perhaps it's worth delaying 2.6.18 a little longer in
the hope of getting that bug fixed before release. Nathan?
At least for me, XFS in its current state (and thus 2.6.18) is
unusable in production environments.

See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
kernel NULL pointer dereference at virtual address 00000078" for the
full story.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-08-28 10:28:06

by Kasper Sandberg

Subject: Re: Linux v2.6.18-rc5

On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> On 28/08/06, Linus Torvalds <[email protected]> wrote:
> >
> > Ok,
> > this was delayed three weeks due to a combination of vacations and a
> > funeral in Finland, but Greg and Andrew kept on top of things, and we were
> > fairly late in the release cycle anyway, so it hopefully caused no real
> > problems apart from obviously delaying the final release a tiny bit.
> >
> > Linux 2.6.18-rc5 is out there now, both in git form and as patches and
> > tar-balls (the latter which I forgot for -rc4, but Greg covered for me -
> > blush).
> >
> > The shortlog (appended) tells the story: various fixes all around.
> > Powerpc, V4L, networking, SCSI..
> >
> > Pls test it out, and please remind all the appropriate people about any
> > regressions you find (including any found earlier if they haven't been
> > addressed yet).
> >
> Not really a regression, more like a long standing bug, but XFS has
> issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).

And you are saying this issue exists in all post-2.6.11 kernels?

> With heavy rsync load to a machine with XFS filesystems, XFS falls
> over and filesystems are in need of xfs_repair.
> I'm doing all I can to gather info for Nathan so he can fix the bug,
> but it's hard to trigger reliably.

Could you please describe whatever you have found out? I'm eager to take
a look at it myself.

> My point is that perhaps it's worth delaying 2.6.18 a little longer in
> the hope of getting that bug fixed before release. Nathan?
> At least for me, XFS in its current state (and thus 2.6.18) is
> unusable in production environments.
>
> See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
> kernel NULL pointer dereference at virtual address 00000078" for the
> full story.
>

2006-08-28 10:35:04

by Jesper Juhl

Subject: Re: Linux v2.6.18-rc5

On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > On 28/08/06, Linus Torvalds <[email protected]> wrote:
> > >
> > > Ok,
> > > this was delayed three weeks due to a combination of vacations and a
> > > funeral in Finland, but Greg and Andrew kept on top of things, and we were
> > > fairly late in the release cycle anyway, so it hopefully caused no real
> > > problems apart from obviously delaying the final release a tiny bit.
> > >
> > > Linux 2.6.18-rc5 is out there now, both in git form and as patches and
> > > tar-balls (the latter which I forgot for -rc4, but Greg covered for me -
> > > blush).
> > >
> > > The shortlog (appended) tells the story: various fixes all around.
> > > Powerpc, V4L, networking, SCSI..
> > >
> > > Pls test it out, and please remind all the appropriate people about any
> > > regressions you find (including any found earlier if they haven't been
> > > addressed yet).
> > >
> > Not really a regression, more like a long standing bug, but XFS has
> > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> and you are saying this issue exists in all post .11 kernels?

No, I don't know that for sure. All I know is that 2.6.17.x (with x >=
7) falls over and 2.6.18-rc[34] falls over. Nothing in 2.6.18-rc5 looks
like a fix, but I've not tested that kernel yet (I have tested
2.6.18-rc4 + the xfs fix that went into -rc5, and that one doesn't
solve it).
2.6.11 is simply the kernel the server I can reproduce this on was
running previously, and that kernel is stable. It's a production
machine and it takes hours to hit the problem, so I can't very well do
a binary search of all kernels between 2.6.11 and 2.6.18-rc.
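
For reference, the search mentioned here is a plain bisection between a
known-good and a known-bad kernel. A minimal sketch, assuming the candidate
kernels can be put in order and numbered, and that a callback stands for
"boot this kernel and run the rsync load for hours". first_bad() and
example_is_bad() are invented names; the example callback simply pretends
the bug appeared at index 5.

```c
/* Callback: returns nonzero if the kernel at index idx shows the bug. */
typedef int (*is_bad_fn)(int idx);

/*
 * Find the first bad index, given that `good` is known good, `bad` is
 * known bad, and good < bad. Each loop iteration is one test run.
 * The number of runs is stored in *runs if runs is not NULL.
 */
static int first_bad(int good, int bad, is_bad_fn is_bad, int *runs)
{
	int n = 0;

	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (is_bad(mid))
			bad = mid;
		else
			good = mid;
		n++;
	}
	if (runs)
		*runs = n;
	return bad;
}

/* Hypothetical stand-in for a real test run: bug introduced at index 5. */
static int example_is_bad(int idx)
{
	return idx >= 5;
}
```

The number of runs grows only with the logarithm of the range, but each run
costs hours on a production machine, and with a bug that is hard to trigger
a run that stays healthy does not prove that kernel good.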


> > With heavy rsync load to a machine with XFS filesystems, XFS falls
> > over and filesystems are in need of xfs_repair.
> > I'm doing all I can to gather info for Nathan so he can fix the bug,
> > but it's hard to trigger reliably.
> could you please describe whatever you have found out, im eager to take
> a look at it myself

Take a look at the thread I mention, that should describe the problem
and what we have found out so far.
Here's a link to the start of the thread: http://lkml.org/lkml/2006/8/4/97


> > My point is that perhaps it's worth delaying 2.6.18 a little longer in
> > the hope of getting that bug fixed before release. Nathan?
> > At least for me, XFS in its current state (and thus 2.6.18) is
> > unusable in production environments.
> >
> > See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
> > kernel NULL pointer dereference at virtual address 00000078" for the
> > full story.
> >
>

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-08-28 12:05:42

by Andi Kleen

Subject: Re: Linux v2.6.18-rc5

> At least my dmesg says nothing about hpet and thus wan't to enable the quirk.
> It is a nforce430 (thus nf4) chipset, though. You can find my bootlog here:

Only NF5 is interesting in this case. On NF4 skipping the timer override
is correct.

-Andi

2006-08-28 12:15:57

by Prakash Punnoor

Subject: Re: Linux v2.6.18-rc5

Am Montag 28 August 2006 14:05 schrieb Andi Kleen:
> > At least my dmesg says nothing about hpet and thus wan't to enable the
> > quirk. It is a nforce430 (thus nf4) chipset, though. You can find my
> > bootlog here:
>
> Only NF5 is interesting in this case. On NF4 skipping the timer override
> is correct.

Well, then please explain to me why it hangs on my nf430 with skipping and
works normally w/o skipping?

--
(°=                 =°)
//\ Prakash Punnoor /\\
V_/                 \_V



2006-08-28 13:04:19

by Roger Luethi

Subject: Re: Linux v2.6.18-rc5

On Sun, 27 Aug 2006 21:30:50 -0700, Linus Torvalds wrote:
> Roger Luethi:
> via-rhine: NAPI support
> via-rhine: add option avoid_D3 (work around broken BIOSes)
> [...]
> Stephen Hemminger:
> via-rhine: NAPI poll enable

For the record: Stephen Hemminger wrote the NAPI support for via-rhine, all
I did was point out a minor bug which he fixed promptly.

Roger

2006-08-28 14:41:43

by Shailabh Nagar

Subject: Re: Linux v2.6.18-rc5

Andrew Morton wrote:
> On Mon, 28 Aug 2006 08:19:40 +0200
> Olaf Hering <[email protected]> wrote:
>
>> On Sun, Aug 27, Andrew Morton wrote:
>>
>>> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
>>> Linus Torvalds <[email protected]> wrote:
>>>
>>>> Linux 2.6.18-rc5 is out there now
>>> (Reporters Bcc'ed: please provide updates)
>>> Subject: oops in __delayacct_blkio_ticks with 2.6.18-rc4
>> This patch is supposed to fix it.
>>
>> http://lkml.org/lkml/2006/8/22/245
>> http://lkml.org/lkml/2006/8/24/299
>
> Yes, there are two delay-accounting fixes pending - this and a memory leak.
> Shailabh is off preparing the final versions (I hope).
>

Both problems are solved by the same patch.

I'll submit it by end of day. I wanted to get some more stress testing
of the fix using not just the /proc interface (which caused the reported oops)
but also the command interface provided by taskstats/delay accounting.

Thanks,
Shailabh

2006-08-28 15:24:05

by Keith Owens

Subject: Re: Linux v2.6.18-rc5

Tejun Heo (on Mon, 28 Aug 2006 17:42:22 +0900) wrote:
>Keith Owens wrote:
>> Two hours of continuous reboots on an ICH5 chipset passed without any
>> problems. Couple of caveats though -
>>
>> (1) The "fix" for this bug is to skip the pcs test for SATA ports on
>> ICH5 chipsets. This results in spurious warning messages for ICH5
>> SATA ports with no disks attached.
>>
>> ATA: abnormal status 0x7F on port 0xCCA7
>
>This is a known annoyance and will be fixed in time.
>
>> (2) I have seen the same intermittent bug on ICH7 SATA but
>> PIIX_FLAG_IGNORE_PCS is only set for ich5 and i6300esb_sata. It
>> probably needs to be set for ich7 as well.
>
>No, ICH7 up to this point has been believed to have well-behaving PCS.
>If you report PCS problem, you'll be the first. Also, note that ICH7
>suffers from ghost device probing problem if PCS is not honored exactly.
> Are you sure it's the same problem?

It definitely looks like it. This is stock 2.6.18-rc5 plus the patch below,
which activates ata_debug from boot until just after the drives are probed.

---
drivers/scsi/ata_piix.c | 5 ++++-
include/linux/libata.h | 4 ++++
2 files changed, 8 insertions(+), 1 deletion(-)

Index: linux/drivers/scsi/ata_piix.c
===================================================================
--- linux.orig/drivers/scsi/ata_piix.c
+++ linux/drivers/scsi/ata_piix.c
@@ -536,6 +536,8 @@ static void piix_pata_error_handler(stru
ata_std_postreset);
}

+int ata_debug = 1;
+
/**
* piix_sata_present_mask - determine present mask for SATA host controller
* @ap: Target port
@@ -615,6 +617,7 @@ static void piix_sata_error_handler(stru
{
ata_bmdma_drive_eh(ap, ata_std_prereset, piix_sata_softreset, NULL,
ata_std_postreset);
+ ata_debug = 0;
}

/**
Index: linux/include/linux/libata.h
===================================================================
--- linux.orig/include/linux/libata.h
+++ linux/include/linux/libata.h
@@ -61,6 +61,10 @@
#define VPRINTK(fmt, args...)
#endif /* ATA_DEBUG */

+extern int ata_debug;
+#undef DPRINTK
+#define DPRINTK(fmt, args...) if (ata_debug) printk(KERN_ERR "%s: " fmt, __FUNCTION__, ## args)
+
#define BPRINTK(fmt, args...) if (ap->flags & ATA_FLAG_DEBUGMSG) printk(KERN_ERR "%s: " fmt, __FUNCTION__, ## args)

/* NEW: debug levels */
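
The idea in the patch above, debug output that is on from boot and switched
off once probing is over, can be modelled in user space roughly as follows.
This is only a sketch: dbg_enabled, dbg_lines, DBG(), probe_port() and
probe_done() are invented names. The macro is wrapped in do/while(0) so that
it behaves as a single statement inside if/else.

```c
#include <stdio.h>

static int dbg_enabled = 1;	/* on from startup, like ata_debug = 1 */
static int dbg_lines;		/* number of messages actually emitted */

/* Print "function: message" only while dbg_enabled is nonzero. */
#define DBG(...)						\
	do {							\
		if (dbg_enabled) {				\
			fprintf(stderr, "%s: ", __func__);	\
			fprintf(stderr, __VA_ARGS__);		\
			dbg_lines++;				\
		}						\
	} while (0)

/* Hypothetical probe step that reports the register value it saw. */
static void probe_port(unsigned int pcs)
{
	DBG("ENTER, pcs=0x%x\n", pcs);
}

/* Called once probing is finished; silences further debug output. */
static void probe_done(void)
{
	dbg_enabled = 0;
}
```

Calls to probe_port() made after probe_done() print nothing, so the log
only covers the window that matters.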


Typical debug messages from a series of boots

<3>piix_sata_present_mask: ata1: ENTER, pcs=0x15 base=0
<3>piix_sata_present_mask: ata1: LEAVE, pcs=0x15 present_mask=0x3
<3>piix_sata_present_mask: ata1: ENTER, pcs=0x0 base=0
<3>piix_sata_present_mask: ata1: LEAVE, pcs=0x0 present_mask=0x3
<3>piix_sata_present_mask: ata1: ENTER, pcs=0x15 base=0
<3>piix_sata_present_mask: ata1: LEAVE, pcs=0x15 present_mask=0x3
<3>piix_sata_present_mask: ata1: ENTER, pcs=0x0 base=0
<3>piix_sata_present_mask: ata1: LEAVE, pcs=0x0 present_mask=0x3
<3>piix_sata_present_mask: ata1: ENTER, pcs=0x15 base=0

Note the pcs=0x0 values. Adding PIIX_FLAG_IGNORE_PCS to
ich6m_sata_ahci gets past the failure to detect pcs, with no sign of
any ghost devices. BTW, dropping down to 2.6.17 with the same config
has no problem detecting the disk, even without PIIX_FLAG_IGNORE_PCS on
ich6m_sata_ahci.

lspci extract, this is an ICH7M.

00:1f.0 Class 0601: 8086:27b9 (rev 02)
Subsystem: 1033:832c
Flags: bus master, medium devsel, latency 0
Capabilities: [e0] Vendor Specific Information

00:1f.2 Class 0101: 8086:27c4 (rev 02) (prog-if 80)
Subsystem: 1033:832c
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 18
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at 18b0 [size=16]
Capabilities: [70] Power Management version 2

00:1f.3 Class 0c05: 8086:27da (rev 02)
Subsystem: 1033:832c
Flags: medium devsel, IRQ 11
I/O ports at 18c0 [size=32]

2006-08-28 15:26:34

by Dmitry Torokhov

Subject: Re: Linux v2.6.18-rc5

On 8/28/06, Andrew Morton <[email protected]> wrote:
>
> From: "Uwe Bugla" <[email protected]>
> Subject: keyboard errors with module atkbd.c in Kernel 2.6.18-rc4
>

There were 2 issues in that report. One is that we do not emit keyup
events when unregistering an input device, which causes the Enter key
to appear "stuck" after removing the atkbd module. I have a patch for
this but it is hardly a critical issue.
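
To illustrate the first issue: if a device goes away while a key is down,
nothing ever reports the key going up, so consumers keep treating it as
held. One way to avoid that, sketched here in user space with invented
names (struct fake_input_dev, emit_key(), release_all_keys()) and not the
real input core API, is to walk the pressed-key state on unregister and
emit a release for each key still down.

```c
#define NKEYS 256

struct fake_input_dev {
	unsigned char down[NKEYS];	/* 1 if the key is currently pressed */
	int events;			/* count of events delivered */
	int last_code;			/* code of the most recent event */
	int last_value;			/* 1 = press, 0 = release */
};

/* Record a key event and update the pressed-key state. */
static void emit_key(struct fake_input_dev *dev, int code, int value)
{
	dev->down[code] = value ? 1 : 0;
	dev->events++;
	dev->last_code = code;
	dev->last_value = value;
}

/* On unregister: release every key that is still marked as down. */
static int release_all_keys(struct fake_input_dev *dev)
{
	int code, released = 0;

	for (code = 0; code < NKEYS; code++) {
		if (dev->down[code]) {
			emit_key(dev, code, 0);
			released++;
		}
	}
	return released;
}
```

Calling release_all_keys() a second time is harmless, since no key is left
marked as down after the first pass.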

The second issue is that he is losing keyboard input (along with the
network) but only in KDE while using a video-editing program for some
hours. Unfortunately userspace was updated along with the kernel and I
did not get a clear answer whether simply rolling back to 2.6.17
without changing anything else cures the issue or not.

>
> From: "Zephaniah E. Hull" <[email protected]>
> Subject: [patch] Crash on evdev disconnect.
>

While it is a problem it is not a regression ;P

--
Dmitry

2006-08-28 16:02:39

by Andi Kleen

Subject: Re: Linux v2.6.18-rc5

On Mon, Aug 28, 2006 at 02:15:40PM +0200, Prakash Punnoor wrote:
> Am Montag 28 August 2006 14:05 schrieb Andi Kleen:
> > > At least my dmesg says nothing about hpet and thus wan't to enable the
> > > quirk. It is a nforce430 (thus nf4) chipset, though. You can find my
> > > bootlog here:
> >
> > Only NF5 is interesting in this case. On NF4 skipping the timer override
> > is correct.
>
> Well, then please explain me why it hangs on my nf430 with skipping and works
> normally w/o skipping?

That's new then. Andy, any explanation?

-Andi

2006-08-28 17:21:20

by Andrew Morton

Subject: Re: Linux v2.6.18-rc5

On Mon, 28 Aug 2006 10:08:39 +0200
Thomas Gleixner <[email protected]> wrote:

> On Mon, 2006-08-28 at 09:25 +0200, Andi Kleen wrote:
> > On Monday 28 August 2006 08:14, Andrew Morton wrote:
> >
> > > From: Andi Kleen <[email protected]>
> > > Subject: Futex BUG in 2.6.18rc2-git7
> >
> > I don't think I saw a fix for that, but Thomas and Ingo should know.
>
> You should know too :)
>
> tglx
>
>
> -------- Forwarded Message --------
> From: Olaf Hering <[email protected]>
> To: Andi Kleen <[email protected]>
> Cc: Olaf Hering <[email protected]>, Thomas Gleixner <[email protected]>,
> [email protected]
> Subject: Re: Futex BUG in 2.6.18rc2-git7
> Date: Sat, 5 Aug 2006 10:07:14 +0200
>
> On Sat, Aug 05, 2006 at 01:09:54AM +0200, Andi Kleen wrote:
> > On Friday 04 August 2006 22:26, Olaf Hering wrote:
> > > On Fri, Aug 04, 2006 at 10:12:15PM +0200, Thomas Gleixner wrote:
> > >
> > > > Is the glibc the latest CVS version ?
> > >
> > > Its a snapshot from 2006073023.
> >
> > Olaf, wagner is running that kernel+Thomas' patch now (although I
> > didn't think any compat was involved) now. Can you please restart
> > the glibc test?
>
> This patch fixes it, also the ppc32 and ppc64 glibc make check.

Did this fix get merged?

2006-08-28 17:42:14

by Thomas Gleixner

Subject: Re: Linux v2.6.18-rc5

On Mon, 2006-08-28 at 10:20 -0700, Andrew Morton wrote:
> On Mon, 28 Aug 2006 10:08:39 +0200
> Thomas Gleixner <[email protected]> wrote:
>
> > On Mon, 2006-08-28 at 09:25 +0200, Andi Kleen wrote:
> > > On Monday 28 August 2006 08:14, Andrew Morton wrote:
> > >
> > > > From: Andi Kleen <[email protected]>
> > > > Subject: Futex BUG in 2.6.18rc2-git7
> > >
> > > I don't think I saw a fix for that, but Thomas and Ingo should know.
> >
> > You should know too :)
> >
> > tglx
> >
> >
> > -------- Forwarded Message --------
> > From: Olaf Hering <[email protected]>
> > To: Andi Kleen <[email protected]>
> > Cc: Olaf Hering <[email protected]>, Thomas Gleixner <[email protected]>,
> > [email protected]
> > Subject: Re: Futex BUG in 2.6.18rc2-git7
> > Date: Sat, 5 Aug 2006 10:07:14 +0200
> >
> > On Sat, Aug 05, 2006 at 01:09:54AM +0200, Andi Kleen wrote:
> > > On Friday 04 August 2006 22:26, Olaf Hering wrote:
> > > > On Fri, Aug 04, 2006 at 10:12:15PM +0200, Thomas Gleixner wrote:
> > > >
> > > > > Is the glibc the latest CVS version ?
> > > >
> > > > Its a snapshot from 2006073023.
> > >
> > > Olaf, wagner is running that kernel+Thomas' patch now (although I
> > > didn't think any compat was involved) now. Can you please restart
> > > the glibc test?
> >
> > This patch fixes it, also the ppc32 and ppc64 glibc make check.
>
> Did this fix get merged?

http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ce2c6b53847afc444c4d0a7a1075c61f499c57a5

AFAICT all the small fixups are merged.

tglx


2006-08-28 18:41:20

by Andrew Morton

Subject: Re: Linux v2.6.18-rc5

On Mon, 28 Aug 2006 05:50:07 -0400
Chuck Ebbert <[email protected]> wrote:

> > From: Chuck Ebbert <[email protected]>
> > Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0
>
> Also happens on 2.6.16.28 and 2.6.17.11, so not a regression.

Well it's not a post-2.6.17 regression. But it's something which quite a few
people have been reporting in recent months. I don't _think_ it's associated
with any consistent runtime failures, but otoh I don't think we know what
caused it.

2006-08-28 18:43:54

by Arjan van de Ven

Subject: Re: Linux v2.6.18-rc5

On Mon, 2006-08-28 at 11:40 -0700, Andrew Morton wrote:
> On Mon, 28 Aug 2006 05:50:07 -0400
> Chuck Ebbert <[email protected]> wrote:
>
> > > From: Chuck Ebbert <[email protected]>
> > > Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0
> >
> > Also happens on 2.6.16.28 and 2.6.17.11, so not a regression.
>
> Well it's not a post-2.6.17 regression. But it's something which quite a few
> people have been reporting in recent months. I don't _think_ it's associated
> with any consistent runtime failures, but otoh I don't think we know what
> caused it.

In itself this can just happen (a BIOS issue, but if the BARs are unused it
is no big deal)... the kernel has gotten more verbose afaik though.

It CAN cause real issues so I'm not saying there's no problem or no
regression, it's just that the printk alone isn't serious.


2006-08-28 22:01:49

by Shailabh Nagar

Subject: Re: Linux v2.6.18-rc5

Clean up allocation and freeing of tsk->delays used by delay accounting.
This solves two problems reported for delay accounting:

1. oops in __delayacct_blkio_ticks
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1844.html

Currently tsk->delays is freed too early in task exit, which can
cause a NULL tsk->delays to be accessed when /proc/<tgid>/stat is
read. The patch fixes this problem by freeing tsk->delays closer to
when task_struct itself is freed. As a result, it also eliminates
tsk->delays_lock, which was only being used (inadequately) to
safeguard access to tsk->delays while a task was exiting.

2. Possible memory leak in kernel/delayacct.c
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0608.2/1389.html

The patch frees the tsk->delays allocation after a bad fork, a cleanup
that was missing earlier.
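
The lifetime rule behind fix (1) can be shown with a small user-space
model. It is only a sketch and not the kernel code: the side allocation is
freed when the last reference to its owner is dropped, not when the owner
exits, so a reader that still holds a reference never sees it vanish.
struct task, task_new(), task_get(), task_put(), task_exit() and
read_blkio_ticks() are invented names, and the tasks_freed counter exists
only so the behaviour can be checked.

```c
#include <stdlib.h>

struct delay_info {
	unsigned long blkio_ticks;
};

struct task {
	int refs;			/* reference count, not thread-safe here */
	int exited;
	struct delay_info *delays;	/* may be NULL if allocation failed */
};

static int tasks_freed;		/* incremented once per freed task */

static struct task *task_new(void)
{
	struct task *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->refs = 1;
	t->delays = calloc(1, sizeof(*t->delays));
	return t;
}

static void task_get(struct task *t)
{
	t->refs++;
}

/* Exit only marks the task; delays stays valid for remaining readers. */
static void task_exit(struct task *t)
{
	t->exited = 1;
}

/* Dropping the last reference frees delays together with the task. */
static void task_put(struct task *t)
{
	if (--t->refs > 0)
		return;
	free(t->delays);	/* free(NULL) is a no-op */
	free(t);
	tasks_freed++;
}

/* Reader that tolerates a missing delays pointer. */
static unsigned long read_blkio_ticks(const struct task *t)
{
	return t->delays ? t->delays->blkio_ticks : 0;
}
```

Because nothing clears the pointer while references remain, the reader
needs no extra lock just to guard against the exit path.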


The patch has been tested to fix the problems listed above
and stress tested with rapid calls to delay accounting's taskstats
command interface (which is the other path that can access the same
data, besides the /proc interface causing the oops above).
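
The bad-fork part of the fix follows the usual goto-unwind pattern: each
failure jumps to the label that undoes everything acquired so far, in
reverse order, so adding a new allocation means adding a matching cleanup
step. A minimal user-space sketch with invented names (grab(), drop(),
make_thing(), free_thing(), the fail_at parameter that forces a failure at
a given step, and the live counter used to detect leaks):

```c
#include <stdlib.h>

static int live;	/* outstanding allocations; nonzero after a failure means a leak */

static void *grab(void)
{
	void *p = malloc(16);

	if (p)
		live++;
	return p;
}

static void drop(void *p)
{
	if (p) {
		free(p);
		live--;
	}
}

struct thing {
	void *a, *b, *c;
};

/* Build a thing in three steps; fail_at (1..3) forces that step to fail. */
static int make_thing(struct thing *t, int fail_at)
{
	t->a = (fail_at == 1) ? NULL : grab();
	if (!t->a)
		goto fail;
	t->b = (fail_at == 2) ? NULL : grab();
	if (!t->b)
		goto fail_free_a;
	t->c = (fail_at == 3) ? NULL : grab();
	if (!t->c)
		goto fail_free_b;
	return 0;

fail_free_b:
	drop(t->b);
fail_free_a:
	drop(t->a);
fail:
	return -1;
}

/* Normal teardown: release in reverse order of acquisition. */
static void free_thing(struct thing *t)
{
	drop(t->c);
	drop(t->b);
	drop(t->a);
}
```

A missing label in such a chain is exactly the kind of leak described in
(2): the failure path returns, but one earlier allocation is never released.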


Signed-Off-By: Shailabh Nagar <[email protected]>

include/linux/delayacct.h | 10 +++++++---
include/linux/sched.h | 1 -
kernel/delayacct.c | 16 ----------------
kernel/exit.c | 1 -
kernel/fork.c | 6 ++++--
5 files changed, 11 insertions(+), 23 deletions(-)

Index: linux-2.6.18-rc5/kernel/fork.c
===================================================================
--- linux-2.6.18-rc5.orig/kernel/fork.c 2006-08-28 11:34:27.000000000 -0400
+++ linux-2.6.18-rc5/kernel/fork.c 2006-08-28 11:34:48.000000000 -0400
@@ -117,6 +117,7 @@ void __put_task_struct(struct task_struc
security_task_free(tsk);
free_uid(tsk->user);
put_group_info(tsk->group_info);
+ delayacct_tsk_free(tsk);

if (!profile_handoff_task(tsk))
free_task(tsk);
@@ -1011,7 +1012,7 @@ static struct task_struct *copy_process(
retval = -EFAULT;
if (clone_flags & CLONE_PARENT_SETTID)
if (put_user(p->pid, parent_tidptr))
- goto bad_fork_cleanup;
+ goto bad_fork_cleanup_delays_binfmt;

INIT_LIST_HEAD(&p->children);
INIT_LIST_HEAD(&p->sibling);
@@ -1277,7 +1278,8 @@ bad_fork_cleanup_policy:
bad_fork_cleanup_cpuset:
#endif
cpuset_exit(p);
-bad_fork_cleanup:
+bad_fork_cleanup_delays_binfmt:
+ delayacct_tsk_free(p);
if (p->binfmt)
module_put(p->binfmt->module);
bad_fork_cleanup_put_domain:
Index: linux-2.6.18-rc5/include/linux/delayacct.h
===================================================================
--- linux-2.6.18-rc5.orig/include/linux/delayacct.h 2006-08-28 11:34:27.000000000 -0400
+++ linux-2.6.18-rc5/include/linux/delayacct.h 2006-08-28 11:34:48.000000000 -0400
@@ -59,10 +59,14 @@ static inline void delayacct_tsk_init(st
__delayacct_tsk_init(tsk);
}

-static inline void delayacct_tsk_exit(struct task_struct *tsk)
+/* Free tsk->delays. Called from bad fork and __put_task_struct
+ * where there's no risk of tsk->delays being accessed elsewhere
+ */
+static inline void delayacct_tsk_free(struct task_struct *tsk)
{
if (tsk->delays)
- __delayacct_tsk_exit(tsk);
+ kmem_cache_free(delayacct_cache, tsk->delays);
+ tsk->delays = NULL;
}

static inline void delayacct_blkio_start(void)
@@ -101,7 +105,7 @@ static inline void delayacct_init(void)
{}
static inline void delayacct_tsk_init(struct task_struct *tsk)
{}
-static inline void delayacct_tsk_exit(struct task_struct *tsk)
+static inline void delayacct_tsk_free(struct task_struct *tsk)
{}
static inline void delayacct_blkio_start(void)
{}
Index: linux-2.6.18-rc5/include/linux/sched.h
===================================================================
--- linux-2.6.18-rc5.orig/include/linux/sched.h 2006-08-28 11:34:27.000000000 -0400
+++ linux-2.6.18-rc5/include/linux/sched.h 2006-08-28 11:34:48.000000000 -0400
@@ -994,7 +994,6 @@ struct task_struct {
*/
struct pipe_inode_info *splice_pipe;
#ifdef CONFIG_TASK_DELAY_ACCT
- spinlock_t delays_lock;
struct task_delay_info *delays;
#endif
};
Index: linux-2.6.18-rc5/kernel/exit.c
===================================================================
--- linux-2.6.18-rc5.orig/kernel/exit.c 2006-08-28 11:34:27.000000000 -0400
+++ linux-2.6.18-rc5/kernel/exit.c 2006-08-28 11:34:48.000000000 -0400
@@ -908,7 +908,6 @@ fastcall NORET_TYPE void do_exit(long co
audit_free(tsk);
taskstats_exit_send(tsk, tidstats, group_dead, mycpu);
taskstats_exit_free(tidstats);
- delayacct_tsk_exit(tsk);

exit_mm(tsk);

Index: linux-2.6.18-rc5/kernel/delayacct.c
===================================================================
--- linux-2.6.18-rc5.orig/kernel/delayacct.c 2006-08-28 11:34:27.000000000 -0400
+++ linux-2.6.18-rc5/kernel/delayacct.c 2006-08-28 11:34:48.000000000 -0400
@@ -41,24 +41,11 @@ void delayacct_init(void)

void __delayacct_tsk_init(struct task_struct *tsk)
{
- spin_lock_init(&tsk->delays_lock);
- /* No need to acquire tsk->delays_lock for allocation here unless
- __delayacct_tsk_init called after tsk is attached to tasklist
- */
tsk->delays = kmem_cache_zalloc(delayacct_cache, SLAB_KERNEL);
if (tsk->delays)
spin_lock_init(&tsk->delays->lock);
}

-void __delayacct_tsk_exit(struct task_struct *tsk)
-{
- struct task_delay_info *delays = tsk->delays;
- spin_lock(&tsk->delays_lock);
- tsk->delays = NULL;
- spin_unlock(&tsk->delays_lock);
- kmem_cache_free(delayacct_cache, delays);
-}
-
/*
* Start accounting for a delay statistic using
* its starting timestamp (@start)
@@ -118,8 +105,6 @@ int __delayacct_add_tsk(struct taskstats
struct timespec ts;
unsigned long t1,t2,t3;

- spin_lock(&tsk->delays_lock);
-
/* Though tsk->delays accessed later, early exit avoids
* unnecessary returning of other data
*/
@@ -161,7 +146,6 @@ int __delayacct_add_tsk(struct taskstats
spin_unlock(&tsk->delays->lock);

done:
- spin_unlock(&tsk->delays_lock);
return 0;
}




2006-08-28 22:37:20

by Nathan Scott

Subject: Re: Linux v2.6.18-rc5

On Mon, Aug 28, 2006 at 12:35:00PM +0200, Jesper Juhl wrote:
> On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> > On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > > Not really a regression, more like a long standing bug, but XFS has
> > > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> > and you are saying this issue exists in all post .11 kernels?

I would be surprised if this is not a day one bug, it probably
even affects the IRIX version of XFS. Our problem is the lack
of a test case to find it - my efforts have come to naught so
far. I'm having to cross my fingers that Jesper can extract a
bit more information when he's next able to hit it.

> > > See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
> > > kernel NULL pointer dereference at virtual address 00000078" for the
> > > full story.

That, and another story - Jesper hijacked that thread ;) - the
initial bug there was found and fixed, and the fix has now been
merged. But (fyi, Kasper) much of that thread is discussing a
different bug to this one.

cheers.

--
Nathan

2006-08-28 23:30:31

by Jesper Juhl

Subject: Re: Linux v2.6.18-rc5

On 29/08/06, Nathan Scott <[email protected]> wrote:
> On Mon, Aug 28, 2006 at 12:35:00PM +0200, Jesper Juhl wrote:
> > On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> > > On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > > > Not really a regression, more like a long standing bug, but XFS has
> > > > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> > > and you are saying this issue exists in all post .11 kernels?
>
> I would be surprised if this is not a day one bug, it probably
> even affects the IRIX version of XFS. Our problem is the lack
> of a test case to find it - my efforts have come to naught so
> far. I'm having to cross my fingers that Jesper can extract a
> bit more information when he's next able to hit it.
>
I'm trying my best, but it's difficult. Often I can only run the -rc
kernel for a few hours on the box that currently shows the problem,
and that's not enough to hit the fault.
I've configured an XFS partition on my home workstation and I'm keeping
that one busy doing various rsyncs and running benchmarks etc -
putting as much different stress on the XFS filesystem as I can. I'm
also setting up a test box at work to try and duplicate the problem on
a non-production server. I won't be able to duplicate the setup
exactly, but it'll be close.

> > > > See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
> > > > kernel NULL pointer dereference at virtual address 00000078" for the
> > > > full story.
>
> That, and another story - Jesper hijacked that thread ;) - the

Sorry ;)

> initial bug there was found and fixed, and the fix has now been
> merged. But (fyi, Kasper) much of that thread is discussing a
> different bug to this one.
>

True. I should have emphasised that.


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-08-29 02:09:07

by Greg KH

Subject: Re: Linux v2.6.18-rc5

On Sun, Aug 27, 2006 at 11:14:21PM -0700, Andrew Morton wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:
>
>
> From: Hubert Tonneau <[email protected]>
> Subject: Re: Linux v2.6.18-rc3
>
> (USB Audio regression)

Hubert, is this still a problem with -rc5? I think it was an ALSA USB
driver issue, right?

thanks,

greg k-h

2006-08-29 02:09:09

by Greg KH

Subject: Re: Linux v2.6.18-rc5

On Sun, Aug 27, 2006 at 11:14:21PM -0700, Andrew Morton wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
> > Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:
>
>
> http://bugzilla.kernel.org/show_bug.cgi?id=7062 (HPET)
>
> From: Chuck Ebbert <[email protected]>
> Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0

I thought this was resolved.

Chuck, do you still have issues with this with the -rc5 release?

thanks,

greg k-h

2006-08-29 05:43:45

by Chuck Ebbert

Subject: Re: Linux v2.6.18-rc5

In-Reply-To: <[email protected]>

On Mon, 28 Aug 2006 19:01:07 -0700, Greg KH wrote:

> > From: Chuck Ebbert <[email protected]>
> > Subject: PCI: Cannot allocate resource region 7 of bridge 0000:00:04.0
>
> I thought this was resolved.
>
> Chuck, do you still have issues with this with the -rc5 release?

Yes. I think this is a separate problem from the one that was fixed in -rc5.

Digging deeper shows there are no devices behind the bridges anyway, so
maybe this message should be expected? (Resource start is zero for all
resources, that's what causes the message.)

--
Chuck

2006-08-29 11:55:57

by Olaf Hering

Subject: Re: Linux v2.6.18-rc5

On Sun, Aug 27, Linus Torvalds wrote:

> Pls test it out, and please remind all the appropriate people about any
> regressions you find (including any found earlier if they haven't been
> addressed yet).

> Nathan Lynch:
> [POWERPC] Fix gettimeofday inaccuracies

Tested on B&W G3, iBook1 and a G4/466.
This patch causes deadlocks on ppc32, but not on ppc64. Have to verify
it on a vanilla kernel, but I'm sure there are no funky patches in
openSuSE.

https://bugzilla.novell.com/show_bug.cgi?id=202146

2006-08-29 13:06:36

by Nathan Lynch

Subject: Re: Linux v2.6.18-rc5

Hi Olaf-

Olaf Hering wrote:
> On Sun, Aug 27, Linus Torvalds wrote:
>
> > Pls test it out, and please remind all the appropriate people about any
> > regressions you find (including any found earlier if they haven't been
> > addressed yet).
>
> > Nathan Lynch:
> > [POWERPC] Fix gettimeofday inaccuracies
>
> Tested on B&W G3, iBook1 and a G4/466.
> This patch causes deadlocks on ppc32, but not on ppc64. Have to verify
> it on a vanilla kernel, but I'm sure there are no funky patches in
> openSuSE.
>
> https://bugzilla.novell.com/show_bug.cgi?id=202146

Sorry about that, does this (a partial revert of the change) fix it
for you?


diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 18e59e4..fe9b1d9 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -655,7 +655,6 @@ void timer_interrupt(struct pt_regs * re
int next_dec;
int cpu = smp_processor_id();
unsigned long ticks;
- u64 tb_next_jiffy;

#ifdef CONFIG_PPC32
if (atomic_read(&ppc_n_lost_interrupts) != 0)
@@ -697,14 +696,11 @@ void timer_interrupt(struct pt_regs * re
continue;

write_seqlock(&xtime_lock);
- tb_next_jiffy = tb_last_jiffy + tb_ticks_per_jiffy;
- if (per_cpu(last_jiffy, cpu) >= tb_next_jiffy) {
- tb_last_jiffy = tb_next_jiffy;
- tb_last_stamp = per_cpu(last_jiffy, cpu);
- do_timer(regs);
- timer_recalc_offset(tb_last_jiffy);
- timer_check_rtc();
- }
+ tb_last_jiffy += tb_ticks_per_jiffy;
+ tb_last_stamp = per_cpu(last_jiffy, cpu);
+ do_timer(regs);
+ timer_recalc_offset(tb_last_jiffy);
+ timer_check_rtc();
write_sequnlock(&xtime_lock);
}

2006-08-29 14:50:59

by Andreas Barth

Subject: Re: Linux v2.6.18-rc5

* Andrew Morton ([email protected]) [060828 08:31]:
> From: Andreas Barth <[email protected]>
> Subject: Re: Fw: gdth SCSI driver(?) fails with more than 4GB of memory
>
> (Long saga - attempts were made to fix it but I think we're stumped?)

One patch was provided, which failed. Another patch was provided to
get more output; the new log is at
http://neualius.turmzimmer.net/~aba/6G/kernel-20060807.log

If there is anything more I should provide, please don't hesitate to
ping me. Unfortunately, I'm not subscribed to lkml right now.


Cheers,
Andi
--
http://home.arcor.de/andreas-barth/

2006-08-29 15:53:26

by Olaf Hering

Subject: Re: Linux v2.6.18-rc5

On Tue, Aug 29, Nathan Lynch wrote:

> Hi Olaf-
>
> Olaf Hering wrote:
> > On Sun, Aug 27, Linus Torvalds wrote:
> >
> > > Pls test it out, and please remind all the appropriate people about any
> > > regressions you find (including any found earlier if they haven't been
> > > addressed yet).
> >
> > > Nathan Lynch:
> > > [POWERPC] Fix gettimeofday inaccuracies
> >
> > Tested on B&W G3, iBook1 and a G4/466.
> > This patch causes deadlocks on ppc32, but not on ppc64. Have to verify
> > it on a vanilla kernel, but I'm sure there are no funky patches in
> > openSuSE.
> >
> > https://bugzilla.novell.com/show_bug.cgi?id=202146
>
> Sorry about that, does this (a partial revert of the change) fix it
> for you?

Yes, it works ok with this change.

2006-08-29 15:57:44

by Brice Goglin

Subject: Re: Linux v2.6.18-rc5

Andrew Morton wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
>
>> Linux 2.6.18-rc5 is out there now
>>
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:
>

I am having problems with ipw2200. It does not connect to my WPA access
point unless I help it (for instance by setting the ESSID with
iwconfig). It seems to be related to driver version 1.1.2. I had the
same problem when 1.1.2 was the latest external tarball.

I have been using 1.1.3 and 1.1.4 external tarballs without problems for
a while. Zhu Yi pushed 1.1.4 to netdev-2.6/wireless last week. Is there
any chance it could be pulled before 2.6.18?

Sorry for notifying so late in the release cycle...

Regards,
Brice

2006-08-30 06:13:22

by Paul Mackerras

Subject: Re: Linux v2.6.18-rc5

Olaf,

This patch should fix it. The problem was that I was comparing a
32-bit quantity with a 64-bit quantity, and consequently time wasn't
advancing. This makes us use a 64-bit quantity on all platforms,
which ends up simplifying the code since we can now get rid of the
tb_last_stamp variable (which actually fixes another bug that Ben H
and I noticed while going carefully through the code).

This works fine on my G4 tibook. Let me know how it goes on your
machines.

Paul.

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 18e59e4..a124499 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -125,15 +125,8 @@ static long timezone_offset;
unsigned long ppc_proc_freq;
unsigned long ppc_tb_freq;

-u64 tb_last_jiffy __cacheline_aligned_in_smp;
-unsigned long tb_last_stamp;
-
-/*
- * Note that on ppc32 this only stores the bottom 32 bits of
- * the timebase value, but that's enough to tell when a jiffy
- * has passed.
- */
-DEFINE_PER_CPU(unsigned long, last_jiffy);
+static u64 tb_last_jiffy __cacheline_aligned_in_smp;
+static DEFINE_PER_CPU(u64, last_jiffy);

#ifdef CONFIG_VIRT_CPU_ACCOUNTING
/*
@@ -458,7 +451,7 @@ void do_gettimeofday(struct timeval *tv)
do {
seq = read_seqbegin_irqsave(&xtime_lock, flags);
sec = xtime.tv_sec;
- nsec = xtime.tv_nsec + tb_ticks_since(tb_last_stamp);
+ nsec = xtime.tv_nsec + tb_ticks_since(tb_last_jiffy);
} while (read_seqretry_irqrestore(&xtime_lock, seq, flags));
usec = nsec / 1000;
while (usec >= 1000000) {
@@ -700,7 +693,6 @@ #endif
tb_next_jiffy = tb_last_jiffy + tb_ticks_per_jiffy;
if (per_cpu(last_jiffy, cpu) >= tb_next_jiffy) {
tb_last_jiffy = tb_next_jiffy;
- tb_last_stamp = per_cpu(last_jiffy, cpu);
do_timer(regs);
timer_recalc_offset(tb_last_jiffy);
timer_check_rtc();
@@ -749,7 +741,7 @@ void __init smp_space_timers(unsigned in
int i;
unsigned long half = tb_ticks_per_jiffy / 2;
unsigned long offset = tb_ticks_per_jiffy / max_cpus;
- unsigned long previous_tb = per_cpu(last_jiffy, boot_cpuid);
+ u64 previous_tb = per_cpu(last_jiffy, boot_cpuid);

/* make sure tb > per_cpu(last_jiffy, cpu) for all cpus always */
previous_tb -= tb_ticks_per_jiffy;
@@ -830,7 +822,7 @@ #endif
* and therefore the (jiffies - wall_jiffies) computation
* has been removed.
*/
- tb_delta = tb_ticks_since(tb_last_stamp);
+ tb_delta = tb_ticks_since(tb_last_jiffy);
tb_delta = mulhdu(tb_delta, do_gtod.varp->tb_to_xs); /* in xsec */
new_nsec -= SCALE_XSEC(tb_delta, 1000000000);

@@ -950,8 +942,7 @@ void __init time_init(void)
if (__USE_RTC()) {
/* 601 processor: dec counts down by 128 every 128ns */
ppc_tb_freq = 1000000000;
- tb_last_stamp = get_rtcl();
- tb_last_jiffy = tb_last_stamp;
+ tb_last_jiffy = get_rtcl();
} else {
/* Normal PowerPC with timebase register */
ppc_md.calibrate_decr();
@@ -959,7 +950,7 @@ void __init time_init(void)
ppc_tb_freq / 1000000, ppc_tb_freq % 1000000);
printk(KERN_DEBUG "time_init: processor frequency = %lu.%.6lu MHz\n",
ppc_proc_freq / 1000000, ppc_proc_freq % 1000000);
- tb_last_stamp = tb_last_jiffy = get_tb();
+ tb_last_jiffy = get_tb();
}

tb_ticks_per_jiffy = ppc_tb_freq / HZ;
@@ -1036,7 +1027,7 @@ void __init time_init(void)
do_gtod.varp = &do_gtod.vars[0];
do_gtod.var_idx = 0;
do_gtod.varp->tb_orig_stamp = tb_last_jiffy;
- __get_cpu_var(last_jiffy) = tb_last_stamp;
+ __get_cpu_var(last_jiffy) = tb_last_jiffy;
do_gtod.varp->stamp_xsec = (u64) xtime.tv_sec * XSEC_PER_SEC;
do_gtod.tb_ticks_per_sec = tb_ticks_per_sec;
do_gtod.varp->tb_to_xs = tb_to_xs;
diff --git a/include/asm-powerpc/time.h b/include/asm-powerpc/time.h
index dcde441..5785ac4 100644
--- a/include/asm-powerpc/time.h
+++ b/include/asm-powerpc/time.h
@@ -30,10 +30,6 @@ extern unsigned long tb_ticks_per_usec;
extern unsigned long tb_ticks_per_sec;
extern u64 tb_to_xs;
extern unsigned tb_to_us;
-extern unsigned long tb_last_stamp;
-extern u64 tb_last_jiffy;
-
-DECLARE_PER_CPU(unsigned long, last_jiffy);

struct rtc_time;
extern void to_tm(int tim, struct rtc_time * tm);
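
The 32-bit versus 64-bit comparison described at the top of this message
can be reproduced in user space. The following is a minimal sketch, not
kernel code: the helper names jiffy_due_truncated() and jiffy_due_full()
are hypothetical, and the sketch assumes the old ppc32 per-cpu stamp held
only the low 32 bits of the timebase, as the removed comment in the diff
states.

```c
#include <stdint.h>

/*
 * Illustration only; these helpers are hypothetical and do not exist in
 * the kernel.
 *   tb_now        - current 64-bit timebase value
 *   tb_last_jiffy - 64-bit timebase value at the last accounted jiffy
 *   ticks         - timebase ticks per jiffy
 */

/* Old ppc32 behaviour: the stamp keeps only the low 32 bits of the
 * timebase, yet it is compared against a full 64-bit deadline. */
static int jiffy_due_truncated(uint64_t tb_now, uint64_t tb_last_jiffy,
                               uint64_t ticks)
{
    uint32_t last_jiffy = (uint32_t)tb_now;      /* upper 32 bits lost */
    uint64_t tb_next_jiffy = tb_last_jiffy + ticks;

    /* last_jiffy is widened to 64 bits with zero upper bits here. */
    return last_jiffy >= tb_next_jiffy;
}

/* Behaviour with a 64-bit stamp on all platforms. */
static int jiffy_due_full(uint64_t tb_now, uint64_t tb_last_jiffy,
                          uint64_t ticks)
{
    uint64_t tb_next_jiffy = tb_last_jiffy + ticks;

    return tb_now >= tb_next_jiffy;
}
```

With tb_last_jiffy = 2^32 + 5, ticks = 100000 and tb_now two jiffies
later, the truncated check never reports a jiffy as due, while the full
check does. Below 2^32 the two checks agree, which would be consistent
with a machine running fine for a while before time stops advancing.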

2006-08-30 08:05:59

by Olaf Hering

Subject: Re: Linux v2.6.18-rc5

On Wed, Aug 30, Paul Mackerras wrote:

> This works fine on my G4 tibook. Let me know how it goes on your
> machines.

Works ok on an iBook1.

2006-08-30 08:14:47

by Kasper Sandberg

Subject: Re: Linux v2.6.18-rc5

On Tue, 2006-08-29 at 01:30 +0200, Jesper Juhl wrote:
> On 29/08/06, Nathan Scott <[email protected]> wrote:
> > On Mon, Aug 28, 2006 at 12:35:00PM +0200, Jesper Juhl wrote:
> > > On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> > > > On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > > > > Not really a regression, more like a long standing bug, but XFS has
> > > > > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> > > > and you are saying this issue exists in all post .11 kernels?
> >
> > I would be surprised if this is not a day one bug, it probably
> > even affects the IRIX version of XFS. Our problem is the lack
> > of a test case to find it - my efforts have come to naught so
> > far. I'm having to cross my fingers that Jesper can extract a
> > bit more information when he's next able to hit it.
> >
> I'm trying my best, but it's difficult. Often I can only run the -rc
> kernel for a few hours on the box that currently shows the problem,
> and that's not enough to hit the fault.
> I've configured a XFS partition on my home workstation and I'm keeping
> that one busy doing various rsync's and running benchmarks etc -
> putting as much different stress on the XFS filesystem as I can. I'm
> also setting up a test box at work to try and duplicate the problem on
> a non-production server. I won't be able to duplicate the setup
> exactly, but it'll be close.

I have myself tested xfs in -rc5 now, doing rsync over and over, and have
been unable to hit any problem; it indeed seems very hard to reproduce.

>
> > > > > See the thread titled "2.6.18-rc3-git3 - XFS - BUG: unable to handle
> > > > > kernel NULL pointer dereference at virtual address 00000078" for the
> > > > > full story.
> >
> > That, and another story - Jesper hijacked that thread ;) - the
>
> Sorry ;)
>
> > initial bug there was found and fixed, and the fix has now been
> > merged. But (fyi, Kasper) much of that thread is discussing a
> > different bug to this one.
> >
>
> True. I should have emphasised that.
>
>

2006-08-30 09:01:21

by Mikael Pettersson

Subject: Re: Linux v2.6.18-rc5

Paul Mackerras writes:
> Olaf,
>
> This patch should fix it. The problem was that I was comparing a
> 32-bit quantity with a 64-bit quantity, and consequently time wasn't
> advancing. This makes us use a 64-bit quantity on all platforms,
> which ends up simplifying the code since we can now get rid of the
> tb_last_stamp variable (which actually fixes another bug that Ben H
> and I noticed while going carefully through the code).
>
> This works fine on my G4 tibook. Let me know how it goes on your
> machines.

Thanks. This fixed a kernel hang bug on my G4 eMac with 2.6.18-rc5.

The vanilla kernel ran fine until I tar xvf'd a file from an NFS-mount,
then everything ground to a halt.

/Mikael

2006-08-30 09:19:51

by Jesper Juhl

Subject: Re: Linux v2.6.18-rc5

On 30/08/06, Kasper Sandberg <[email protected]> wrote:
> On Tue, 2006-08-29 at 01:30 +0200, Jesper Juhl wrote:
> > On 29/08/06, Nathan Scott <[email protected]> wrote:
> > > On Mon, Aug 28, 2006 at 12:35:00PM +0200, Jesper Juhl wrote:
> > > > On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> > > > > On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > > > > > Not really a regression, more like a long standing bug, but XFS has
> > > > > > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> > > > > and you are saying this issue exists in all post .11 kernels?
> > >
> > > I would be surprised if this is not a day one bug, it probably
> > > even affects the IRIX version of XFS. Our problem is the lack
> > > of a test case to find it - my efforts have come to naught so
> > > far. I'm having to cross my fingers that Jesper can extract a
> > > bit more information when he's next able to hit it.
> > >
> > I'm trying my best, but it's difficult. Often I can only run the -rc
> > kernel for a few hours on the box that currently shows the problem,
> > and that's not enough to hit the fault.
> > I've configured a XFS partition on my home workstation and I'm keeping
> > that one busy doing various rsync's and running benchmarks etc -
> > putting as much different stress on the XFS filesystem as I can. I'm
> > also setting up a test box at work to try and duplicate the problem on
> > a non-production server. I won't be able to duplicate the setup
> > exactly, but it'll be close.
>
> I have myself tested xfs in -rc5 now, doing rsync over and over, and have
> been unable to hit any problem; it indeed seems very hard to reproduce.
>
Just in case I have not mentioned it before; the box I'm seeing the
problem on is a dual 3.2GHz Xeon and the kernel is compiled for SMP.
might be relevant, might not be...

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-08-30 13:43:30

by Jesper Juhl

Subject: Re: Linux v2.6.18-rc5

On 30/08/06, Jesper Juhl <[email protected]> wrote:
> On 30/08/06, Kasper Sandberg <[email protected]> wrote:
> > On Tue, 2006-08-29 at 01:30 +0200, Jesper Juhl wrote:
> > > On 29/08/06, Nathan Scott <[email protected]> wrote:
> > > > On Mon, Aug 28, 2006 at 12:35:00PM +0200, Jesper Juhl wrote:
> > > > > On 28/08/06, Kasper Sandberg <[email protected]> wrote:
> > > > > > On Mon, 2006-08-28 at 12:10 +0200, Jesper Juhl wrote:
> > > > > > > Not really a regression, more like a long standing bug, but XFS has
> > > > > > > issues in 2.6.18-rc* (and earlier kernels, at least post 2.6.11).
> > > > > > and you are saying this issue exists in all post .11 kernels?
> > > >
> > > > I would be surprised if this is not a day one bug, it probably
> > > > even affects the IRIX version of XFS. Our problem is the lack
> > > > of a test case to find it - my efforts have come to naught so
> > > > far. I'm having to cross my fingers that Jesper can extract a
> > > > bit more information when he's next able to hit it.
> > > >
> > > I'm trying my best, but it's difficult. Often I can only run the -rc
> > > kernel for a few hours on the box that currently shows the problem,
> > > and that's not enough to hit the fault.
> > > I've configured a XFS partition on my home workstation and I'm keeping
> > > that one busy doing various rsync's and running benchmarks etc -
> > > putting as much different stress on the XFS filesystem as I can. I'm
> > > also setting up a test box at work to try and duplicate the problem on
> > > a non-production server. I won't be able to duplicate the setup
> > > exactly, but it'll be close.
> >
> > I have myself tested xfs in -rc5 now, doing rsync over and over, and have
> > been unable to hit any problem; it indeed seems very hard to reproduce.
> >
> Just in case I have not mentioned it before; the box I'm seeing the
> problem on is a dual 3.2GHz Xeon and the kernel is compiled for SMP.
> might be relevant, might not be...
>
Small correction: The CPU is a single 3.2GHz Xeon with HT. Not two
physical cores.
I should probably also mention that the 2.6.11.11 kernel that runs
stable on the server is compiled for UP, not SMP.

--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-08-30 15:54:41

by Andrew Benton

Subject: Re: Linux v2.6.18-rc5

Andrew Morton wrote:
> On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> Linus Torvalds <[email protected]> wrote:
>
>> Linux 2.6.18-rc5 is out there now
>
> (Reporters Bcc'ed: please provide updates)
>
> Serious-looking regressions include:
>
>
> From: Andrew Benton <[email protected]>
> Subject: ALSA problems with 2.6.18-rc3

The problem remains in 2.6.18-rc5.
The workaround people have suggested (using alsactl -F restore) works if
I have a working /etc/asound.state created with a 2.6.17 kernel. If I
was starting from scratch with 2.6.18-rc5 I would have no way to set the
sound level for the digital output. But maybe the bug is in alsamixer
and alsactl?

Andy

2006-08-30 16:15:13

by Takashi Iwai

Subject: Re: Linux v2.6.18-rc5

At Wed, 30 Aug 2006 16:54:38 +0100,
Andrew Benton wrote:
>
> Andrew Morton wrote:
> > On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> > Linus Torvalds <[email protected]> wrote:
> >
> >> Linux 2.6.18-rc5 is out there now
> >
> > (Reporters Bcc'ed: please provide updates)
> >
> > Serious-looking regressions include:
> >
> >
> > From: Andrew Benton <[email protected]>
> > Subject: ALSA problems with 2.6.18-rc3
>
> The problem remains in 2.6.18-rc5.
> The workaround people have suggested (using alsactl -F restore) works if
> I have a working /etc/asound.state created with a 2.6.17 kernel. If I
> was starting from scratch with 2.6.18-rc5 I would have no way to set the
> sound level for the digital output. But maybe the bug is in alsamixer
> and alsactl?

No, it doesn't sound like a bug in alsamixer or alsactl if "alsactl
-F" works.

What exactly did you do, and what doesn't work right now?
A detailed explanation for reproducing the bug is needed.


Takashi

2006-08-31 05:12:56

by Brown, Len

Subject: Re: Linux v2.6.18-rc5

On Monday 28 August 2006 02:14, Andrew Morton wrote:

> From: Johan Rutgeerts <[email protected]>
> Subject: Acpi oops 2.6.17.7 vanilla

It turns out that this one has been with us since at least 2.6.15.
So far, seen only on Johan's machine.

http://bugzilla.kernel.org/show_bug.cgi?id=6980

So this one probably is not worthy of a 2.6.18 stopper list.

cheers,
-Len

2006-08-31 08:22:41

by Takashi Iwai

Subject: Re: Linux v2.6.18-rc5

At Wed, 30 Aug 2006 18:15:09 +0200,
I wrote:
>
> At Wed, 30 Aug 2006 16:54:38 +0100,
> Andrew Benton wrote:
> >
> > Andrew Morton wrote:
> > > On Sun, 27 Aug 2006 21:30:50 -0700 (PDT)
> > > Linus Torvalds <[email protected]> wrote:
> > >
> > >> Linux 2.6.18-rc5 is out there now
> > >
> > > (Reporters Bcc'ed: please provide updates)
> > >
> > > Serious-looking regressions include:
> > >
> > >
> > > From: Andrew Benton <[email protected]>
> > > Subject: ALSA problems with 2.6.18-rc3
> >
> > The problem remains in 2.6.18-rc5.
> > The workaround people have suggested (using alsactl -F restore) works if
> > I have a working /etc/asound.state created with a 2.6.17 kernel. If I
> > was starting from scratch with 2.6.18-rc5 I would have no way to set the
> > sound level for the digital output. But maybe the bug is in alsamixer
> > and alsactl?
>
> No, it doesn't sound like a bug in alsamixer or alsactl if "alsactl
> -F" works.
>
> What exactly did you do, and what doesn't work right now?
> A detailed explanation for reproducing the bug is needed.

Another bug report suggests that a similar name mismatch appears in
ac97. IMO, it's no real breakage, but it is surely safer to avoid such
a thing.

Could you try the patch below (same found in bugzilla #7080)?


Thanks,

Takashi

====

[PATCH] ALSA: ac97 - Correct some Mic mixer elements

Revert the mixer element names of some Mic controls to the state of
2.6.17. This should fix the name mismatch in alsactl.

Signed-off-by: Takashi Iwai <[email protected]>

---
diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c
index 0abf280..51e83d7 100644
--- a/sound/pci/ac97/ac97_codec.c
+++ b/sound/pci/ac97/ac97_codec.c
@@ -573,7 +573,7 @@ AC97_SINGLE("PC Speaker Playback Volume"
};

static const struct snd_kcontrol_new snd_ac97_controls_mic_boost =
- AC97_SINGLE("Mic Boost (+20dB) Switch", AC97_MIC, 6, 1, 0);
+ AC97_SINGLE("Mic Boost (+20dB)", AC97_MIC, 6, 1, 0);


static const char* std_rec_sel[] = {"Mic", "CD", "Video", "Aux", "Line", "Mix", "Mix Mono", "Phone"};
@@ -615,7 +615,7 @@ AC97_SINGLE("Simulated Stereo Enhancemen
AC97_SINGLE("3D Control - Switch", AC97_GENERAL_PURPOSE, 13, 1, 0),
AC97_SINGLE("Loudness (bass boost)", AC97_GENERAL_PURPOSE, 12, 1, 0),
AC97_ENUM("Mono Output Select", std_enum[2]),
-AC97_ENUM("Mic Select Capture Switch", std_enum[3]),
+AC97_ENUM("Mic Select", std_enum[3]),
AC97_SINGLE("ADC/DAC Loopback", AC97_GENERAL_PURPOSE, 7, 1, 0)
};

2006-09-02 01:40:26

by Elias Holman

Subject: Re: Linux v2.6.18-rc5


> From: Elias Holman <[email protected]>
> Subject: PROBLEM: PCI/Intel 82945 trouble on Toshiba M400 notebook

I can successfully boot my M400 under 2.6.18-rc5. Even better, I can
now enable hotpluggable CPUs and successfully suspend and resume. I
needed the SATA patch (http://lkml.org/lkml/2006/7/20/56), referenced in
the "T60 not coming out of suspend to RAM" thread as well, so I would
like to second Michael Tsirkin's request that this be merged in.

--
Eli




2006-09-02 01:50:17

by Jeff Garzik

Subject: Re: Linux v2.6.18-rc5

Elias Holman wrote:
>> From: Elias Holman <[email protected]>
>> Subject: PROBLEM: PCI/Intel 82945 trouble on Toshiba M400 notebook
>
> I can successfully boot my M400 under 2.6.18-rc5. Even better, I can
> now enable hotpluggable CPUs and successfully suspend and resume. I
> needed the SATA patch (http://lkml.org/lkml/2006/7/20/56), referenced in
> the "T60 not coming out of suspend to RAM" thread as well, so I would
> like to second Michael Tsirkin's request that this be merged in.

As Andrew Morton noted, it's already queued for 2.6.19.

Jeff





2006-09-03 18:44:00

by Tejun Heo

Subject: Re: Linux v2.6.18-rc5

diff --git a/drivers/scsi/ata_piix.c b/drivers/scsi/ata_piix.c
index 2d20caf..46f7c9b 100644
--- a/drivers/scsi/ata_piix.c
+++ b/drivers/scsi/ata_piix.c
@@ -553,15 +553,42 @@ static unsigned int piix_sata_present_ma
{
struct pci_dev *pdev = to_pci_dev(ap->host_set->dev);
struct piix_host_priv *hpriv = ap->host_set->private_data;
+ const struct piix_map_db *map_db = hpriv->map_db;
const unsigned int *map = hpriv->map;
int base = 2 * ap->hard_port_no;
unsigned int present_mask = 0;
int port, i;
- u16 pcs;
+ u16 pcs, new_pcs;

pci_read_config_word(pdev, ICH5_PCS, &pcs);
DPRINTK("ata%u: ENTER, pcs=0x%x base=%d\n", ap->id, pcs, base);

+ new_pcs = pcs | map_db->port_enable;
+
+ if (pcs != new_pcs) {
+ u16 old_pcs = pcs;
+
+ for (i = 0; i < 10; i++) {
+ pci_write_config_word(pdev, ICH5_PCS, new_pcs);
+ msleep(150);
+ pci_read_config_word(pdev, ICH5_PCS, &pcs);
+
+ new_pcs = pcs | map_db->port_enable;
+ if (pcs == new_pcs)
+ break;
+ }
+
+ if (pcs == new_pcs)
+ ata_port_printk(ap, KERN_INFO, "updated PCS from "
+ "0x%x to 0x%x (%d tries)\n",
+ old_pcs, pcs, i);
+ else
+ ata_port_printk(ap, KERN_WARNING,
+ "failed to update PCS after %d tries, "
+ "old=0x%x cur=0x%x new=0x%x\n",
+ i, old_pcs, pcs, new_pcs);
+ }
+
for (i = 0; i < 2; i++) {
port = map[base + i];
if (port < 0)
@@ -816,35 +843,6 @@ static int __devinit piix_check_450nx_er
return no_piix_dma;
}

-static void __devinit piix_init_pcs(struct pci_dev *pdev,
- struct ata_port_info *pinfo,
- const struct piix_map_db *map_db)
-{
- u16 pcs, new_pcs;
-
- pci_read_config_word(pdev, ICH5_PCS, &pcs);
-
- new_pcs = pcs | map_db->port_enable;
-
- if (new_pcs != pcs) {
- DPRINTK("updating PCS from 0x%x to 0x%x\n", pcs, new_pcs);
- pci_write_config_word(pdev, ICH5_PCS, new_pcs);
- msleep(150);
- }
-
- if (force_pcs == 1) {
- dev_printk(KERN_INFO, &pdev->dev,
- "force ignoring PCS (0x%x)\n", new_pcs);
- pinfo[0].host_flags |= PIIX_FLAG_IGNORE_PCS;
- pinfo[1].host_flags |= PIIX_FLAG_IGNORE_PCS;
- } else if (force_pcs == 2) {
- dev_printk(KERN_INFO, &pdev->dev,
- "force honoring PCS (0x%x)\n", new_pcs);
- pinfo[0].host_flags &= ~PIIX_FLAG_IGNORE_PCS;
- pinfo[1].host_flags &= ~PIIX_FLAG_IGNORE_PCS;
- }
-}
-
static void __devinit piix_init_sata_map(struct pci_dev *pdev,
struct ata_port_info *pinfo,
const struct piix_map_db *map_db)
@@ -893,6 +891,17 @@ static void __devinit piix_init_sata_map

hpriv->map = map;
hpriv->map_db = map_db;
+
+ /* handle force_pcs module parameter */
+ if (force_pcs == 1) {
+ dev_printk(KERN_INFO, &pdev->dev, "force ignoring PCS\n");
+ pinfo[0].host_flags |= PIIX_FLAG_IGNORE_PCS;
+ pinfo[1].host_flags |= PIIX_FLAG_IGNORE_PCS;
+ } else if (force_pcs == 2) {
+ dev_printk(KERN_INFO, &pdev->dev, "force honoring PCS\n");
+ pinfo[0].host_flags &= ~PIIX_FLAG_IGNORE_PCS;
+ pinfo[1].host_flags &= ~PIIX_FLAG_IGNORE_PCS;
+ }
}

/**
@@ -948,12 +957,9 @@ static int piix_init_one (struct pci_dev
}

/* Initialize SATA map */
- if (host_flags & ATA_FLAG_SATA) {
+ if (host_flags & ATA_FLAG_SATA)
piix_init_sata_map(pdev, port_info,
piix_map_db_table[ent->driver_data]);
- piix_init_pcs(pdev, port_info,
- piix_map_db_table[ent->driver_data]);
- }

/* On ICH5, some BIOSen disable the interrupt using the
* PCI_COMMAND_INTX_DISABLE bit added in PCI 2.3.


Attachments:
patch (3.34 kB)

2006-09-06 07:34:51

by Keith Owens

Subject: Re: Linux v2.6.18-rc5

Tejun Heo (on Sun, 03 Sep 2006 15:10:44 +0900) wrote:
>Hmm... Can you try the attached patch and see what happens? ATM, I'm on
>the road and can't test the patch, so it's only compile-tested. This
>patch basically reverts some of the effects of the following commit and
>makes PCS update a little bit more aggressive iff necessary.
>
>ea35d29e2fa8b3d766a2ce8fbcce599dce8d2734
>[libata] ata_piix: Consolidate PCS register writing

I am also on the road, without access to the machines that had the ich5
and ich7 problems. I will not be able to test the patch until about
September 18.

2006-09-06 12:21:16

by Paul Slootman

Subject: Re: Linux v2.6.18-rc5

Kasper Sandberg <[email protected]> wrote:
>
>I have myself tested xfs in -rc5 now, doing rsync over and over, and have
>been unable to hit any problem; it indeed seems very hard to reproduce.

I have a box (dual opteron) that "reliably" has XFS failing every night
with kernels >= 2.6.16. It stays up without any problems with 2.6.15.7.

I've started doing a git bisect, and it failed with some 2.6.16-rc2
version, albeit in a different way to the usual XFS failure: all disk IO
related tasks were locked up in state 'D', no kernel messages on the
console. Probably not related to the previous XFS problem, I guess.

I now need to run at least one night with the "known good" 2.6.15.7,
but I'll report any further findings.


Paul Slootman

2006-09-13 14:41:46

by Tejun Heo

Subject: Re: Linux v2.6.18-rc5

Keith Owens wrote:
> Tejun Heo (on Sun, 03 Sep 2006 15:10:44 +0900) wrote:
>> Hmm... Can you try the attached patch and see what happens? ATM, I'm on
>> the road and can't test the patch, so it's only compile-tested. This
>> patch basically reverts some of the effects of the following commit and
>> makes PCS update a little bit more aggressive iff necessary.
>>
>> ea35d29e2fa8b3d766a2ce8fbcce599dce8d2734
>> [libata] ata_piix: Consolidate PCS register writing
>
> I am also on the road, without access to the machines that had the ich5
> and ich7 problems. I will not be able to test the patch until about
> September 18.

It seems my box can't reproduce your condition. I did ~50 soft reboots
but PCS always correctly detects devices. I'll wait for your test result.

Thanks.

--
tejun

2006-09-15 07:41:04

by Keith Owens

Subject: Re: Linux v2.6.18-rc5

Tejun Heo (on Sun, 03 Sep 2006 15:10:44 +0900) wrote:
>Keith Owens wrote:
>>>> (2) I have seen the same intermittent bug on ICH7 SATA but
>>>> PIIX_FLAG_IGNORE_PCS is only set for ich5 and i6300esb_sata. It
>>>> probably needs to be set for ich7 as well.
>>> No, ICH7 up to this point has been believed to have well-behaving PCS.
>>> If you report PCS problem, you'll be the first.
>
>Hmm... Can you try the attached patch and see what happens? ATM, I'm on
>the road and can't test the patch, so it's only compile-tested. This
>patch basically reverts some of the effects of the following commit and
>makes PCS update a little bit more aggressive iff necessary.
>
>ea35d29e2fa8b3d766a2ce8fbcce599dce8d2734
>[libata] ata_piix: Consolidate PCS register writing
>
>If this works for your ich7m, can you please test this on your formerly
>problematic ich5 with force_pcs=2 specified? I initially thought that
>the ich5 problem was caused by exact PCS map change and thus added
>IGNORE_PCS as workaround but if the same problem occurs on ich7 and is
>fixed by the attached patch, it's due to conservative PCS update change
>and thus the original IGNORE_PCS fix on ich5 might not be necessary.
>just doesn't work quite as ata_piix developers expect.

Tested the patch on my ich7 (actually ich6m) laptop, on top of
2.6.18-rc7. It failed after about 6 reboots. Hand copied portion of
the boot log (no serial port on this laptop).

ata_piix 0000:00:1f.2: MAP [ P0 P2 IDE IDE ]
ACPI: PCI Interrupt 0000:00:1f.2[B] -> GSI 19 (level, low) -> IRQ 18
ata: 0x170 IDE port busy
ata1: SATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0x18B0 irq 14
scsi0: ata_piix
ata1: failed to update PCS after 10 tries, old=0x0 cur=0x0 new=0x5

I also booted my problem ich5 system with this patch on 2.6.18-rc7 with
ata_piix.force_pcs=2. It gets the 'force honoring PCS' message and so
far it has not failed. 95 reboots later, and every single one detected
the disk in 0 tries. Go figure.
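
The two outcomes above (success in 0 tries on ich5, failure after 10
tries on the laptop) both fall out of the retry loop in the patch. Below
is a user-space sketch of that loop, under loud assumptions: struct
fake_pcs, pcs_read(), pcs_write() and pcs_enable_ports() are hypothetical
stand-ins for the PCI config accessors, the 150 ms sleep is omitted, and
a register whose bits never stick is only one possible way to model the
failing case.

```c
#include <stdint.h>

#define PCS_MAX_TRIES 10

/* Hypothetical stand-in for the PCS config register. Only the bits set
 * in 'writable' take effect on a write. */
struct fake_pcs {
    uint16_t value;
    uint16_t writable;
};

static uint16_t pcs_read(const struct fake_pcs *r)
{
    return r->value;
}

static void pcs_write(struct fake_pcs *r, uint16_t v)
{
    r->value = (uint16_t)((r->value & ~r->writable) | (v & r->writable));
}

/*
 * Mirrors the structure of the loop in the patch: write the wanted
 * port-enable bits, read back, and retry until they stick.
 * Returns the loop index at which the bits were seen set (0 if they were
 * already set or if the first write took effect), or -1 if they were
 * still missing after PCS_MAX_TRIES writes.
 */
static int pcs_enable_ports(struct fake_pcs *r, uint16_t port_enable)
{
    uint16_t pcs = pcs_read(r);
    uint16_t new_pcs = pcs | port_enable;
    int i;

    if (pcs == new_pcs)
        return 0;

    for (i = 0; i < PCS_MAX_TRIES; i++) {
        pcs_write(r, new_pcs);
        /* the driver sleeps 150 ms here before reading back */
        pcs = pcs_read(r);
        new_pcs = pcs | port_enable;
        if (pcs == new_pcs)
            break;
    }

    return pcs == new_pcs ? i : -1;
}
```

A register that accepts the write returns 0, matching the "0 tries"
result. A register modelled with writable = 0 and port_enable = 0x5
returns -1 with the value still 0x0, which corresponds to the
"old=0x0 cur=0x0 new=0x5" line in the log above.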

Remember that adding the kdb patch and turning on kdb debugging makes
this problem more likely to occur. I can reproduce the problem on both
machines using 2.6.18-rc5 without kdb, so kdb is not the cause.

Also remember that the ich5 box is actually a 64 bit system. I have
only ever seen this problem when running in i386 mode; in x86_64 mode
pcs always just works.

Maybe we are looking at a timing race or a memory mapping problem?