2006-09-28 08:46:34

by Andrew Morton

Subject: 2.6.18-mm2


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/


- Added the SuperH architecture git tree to the -mm lineup as git-sh.patch
(Paul Mundt)

- Added the SuperH64 architecture git tree to the -mm lineup as git-sh64.patch
(Paul Mundt)

- Added the PCI-Domain support tree to the -mm lineup as git-pciseg.patch
(Jeff Garzik)

- The git-input tree has been temporarily dropped due to various
USB-mouse-related failures.

- More updates to the MSI code. If your machine supports Message Signalled
Interrupts, please enable them and give them a try.

- The reboot command doesn't work if you're using netconsole-over-e100.




Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

git fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
mm-commits mailing list.

echo "subscribe mm-commits" | mail [email protected]

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
most valuable if you can perform a bisection search to identify which patch
introduced the bug. Instructions for this process are at

http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

But beware that this process takes some time (around ten rebuilds and
reboots), so consider reporting the bug first. If we cannot immediately
identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
email Subject: in some manner to reflect the nature of the bug. Some
developers filter by Subject: when looking for messages to read.

- Semi-daily snapshots of the -mm lineup are uploaded to
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
the mm-commits list.
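
For the bisection search mentioned above, here is a rough sketch of why the
cost is "around ten rebuilds and reboots". It is not the procedure from
bisecting-mm-trees.txt, just a plain binary search over an ordered patch
series. The series length of 1000, the function names and the simulated
culprit are all assumptions made up for illustration. In real use is_bad
would be a manual apply, rebuild, reboot and test cycle.

```shell
#!/bin/sh
# Sketch only: binary search for the first bad patch in an ordered series.
# ASSUMPTIONS (not from the announcement): the bug is absent with 0 patches
# applied, present with all of them applied, and stays present once the
# culprit has gone in.

# Hypothetical stand-in for "apply the first $1 patches, rebuild, reboot,
# test". Here the culprit is simulated as patch number $CULPRIT.
is_bad() {
    [ "$1" -ge "$CULPRIT" ]
}

# Usage: bisect_series TOTAL
# Prints the 1-based index of the first bad patch and the number of tests run.
bisect_series() {
    good=0      # highest patch count known to be good
    bad=$1      # lowest patch count known to be bad
    steps=0
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        steps=$((steps + 1))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad $steps"
}

# Simulated run: 1000 patches (assumed figure), culprit at position 613.
CULPRIT=613
bisect_series 1000
```

Each test halves the remaining range, so a series of roughly a thousand
patches needs about ten tests regardless of where the faulty patch sits.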




Changes since 2.6.18-mm1:


origin.patch
git-acpi.patch
git-agpgart.patch
git-arm.patch
git-block.patch
git-cifs.patch
git-cpufreq.patch
git-drm.patch
git-dvb.patch
git-geode.patch
git-gfs2.patch
git-ia64.patch
git-ieee1394.patch
git-intelfb.patch
git-jfs.patch
git-libata-all.patch
git-lxdialog.patch
git-mmc.patch
git-mtd.patch
git-netdev-all.patch
git-net.patch
git-ocfs2.patch
git-parisc.patch
git-pcmcia.patch
git-powerpc.patch
git-serial.patch
git-pciseg.patch
git-s390.patch
git-scsi-misc.patch
git-block-vs-git-sas.patch
git-scsi-target.patch
git-watchdog.patch

git trees

+__percpu_alloc_mask-has-to-be-__always_inline-in-up-case.patch
+sys_getcpu-prototype-annotated.patch
+remove-generic__raw_read_trylock.patch
+jbd-memory-leak-in-journal_init_dev.patch

Queued for 2.6.19-rc1.

-autofs4-zero-timeout-prevents-shutdown.patch
-rtc-lockdep-fix-workaround.patch
-i386-bootioremap--kexec-fix.patch
-do-not-free-non-slab-allocated-per_cpu_pageset.patch
-vidioc_enumstd-bug.patch
-backlight-fix-oops-in-__mutex_lock_slowpath-during-head-sys-class-graphics-fb0.patch
-cpu-to-node-relationship-fixup-take2.patch
-cpu-to-node-relationship-fixup-map-cpu-to-node.patch
-i386-fix-flat-mode-numa-on-a-real-numa-system.patch
-load_module-no-bug-if-module_subsys-uninitialized.patch
-fix-longstanding-load-balancing-bug-in-the-scheduler.patch
-trigger-a-syntax-error-if-percpu-macros-are-incorrectly-used.patch
-allow-file-systems-to-manually-d_move-inside-of-rename.patch
-jbd-fix-commit-of-ordered-data-buffers.patch
-update-to-the-kernel-kmap-kunmap-api.patch
-acpi-mwait-c-state-fixes.patch
-kthread-switch-arch-arm-kernel-apmc.patch
-gregkh-driver-documentation-abi-devfs-is-not-obsolete-but-removed.patch
-gregkh-driver-deprecate-physdev-keys.patch
-gregkh-driver-class_device_create-make-fmt-argument-const-char.patch
-gregkh-driver-device_create-make-fmt-argument-const-char.patch
-gregkh-driver-driver-core-add-const-to-class_create.patch
-gregkh-driver-sysfs-make-poll-behaviour-consistent.patch
-gregkh-driver-debugfs-kernel-doc-fixes-for-debugfs.patch
-gregkh-driver-sysfs_symlink_in_root.patch
-gregkh-driver-suspend-infrastructure-cleanup-and-extension.patch
-gregkh-driver-suspend-pci.patch
-gregkh-driver-make-suspend-quieter.patch
-gregkh-driver-fix-broken-dubious-driver-suspend-methods.patch
-gregkh-driver-pm-define-pm_event_prethaw.patch
-gregkh-driver-pm-pci-and-ide-handle-pm_event_prethaw.patch
-gregkh-driver-pm-video-drivers-and-pm_event_prethaw.patch
-gregkh-driver-pm-usb-hcds-use-pm_event_prethaw.patch
-gregkh-driver-pm-issue-pm_event_prethaw.patch
-gregkh-driver-pm-update-docs-for-writing-...-power-state.patch
-gregkh-driver-pm-add-kconfig-option-for-deprecated-...-power-state-files.patch
-gregkh-driver-pm-schedule-sys-devices-...-power-state-for-removal.patch
-gregkh-driver-pm-no-suspend_prepare-phase.patch
-gregkh-driver-pm-add-sys-power-documentation-to-documentation-abi.patch
-gregkh-driver-pm-device_suspend-resume-may-sleep.patch
-gregkh-driver-pm-platform_bus-and-late_suspend-early_resume.patch
-gregkh-driver-device-groups.patch
-gregkh-driver-device-class-parent.patch
-gregkh-driver-device-class-attr.patch
-gregkh-driver-device_rename.patch
-gregkh-driver-device-virtual.patch
-gregkh-driver-class_device_interface.patch
-gregkh-driver-device_bin_file.patch
-gregkh-driver-kobject-must_check-fixes.patch
-gregkh-driver-sysfs_remove_bin_file-no-return-value-dump_stack-on-error.patch
-gregkh-driver-driver-core-fix-comments-in-drivers-base-power-resume.c.patch
-gregkh-driver-driver-core-fixed-add_bind_files-definition.patch
-gregkh-driver-add-__must_check-to-device-management-code.patch
-gregkh-driver-add-config_enable_must_check.patch
-gregkh-driver-v4l-dev2-handle-__must_check.patch
-gregkh-driver-drivers-base-platform-notify-needs-to-occur-before-drivers-attach-to-the-device.patch
-gregkh-driver-drivers-base-check-errors.patch
-gregkh-driver-sysfs-add-proper-sysfs_init-prototype.patch
-gregkh-driver-driver-multithread.patch
-gregkh-driver-pci-multithreaded-probe.patch
-gregkh-driver-driver-core-fix-potential-deadlock-in-driver-core.patch
-gregkh-driver-driver-core-remove-unneeded-routines-from-driver-core.patch
-gregkh-driver-driver-core-don-t-call-put-methods-while-holding-a-spinlock.patch
-scsi-device_reprobe-can-fail.patch
-gregkh-i2c-i2c-dev-cleanups.patch
-gregkh-i2c-i2c-dev-convert-array-to-list.patch
-gregkh-i2c-i2c-dev-drop-template-client.patch
-gregkh-i2c-i2c-dev-device.patch
-gregkh-i2c-i2c-__must_check-fixes.patch
-gregkh-i2c-i2c-__must_check-fixes-i2c-dev.patch
-gregkh-i2c-i2c-algo-sibyte-cleanups.patch
-gregkh-i2c-i2c-algo-sibyte-merge-in-i2c-sibyte.patch
-gregkh-i2c-i2c-sibyte-drop-kip-walker-address.patch
-gregkh-i2c-i2c-au1550-fix-timeout-problem.patch
-gregkh-i2c-i2c-au1550-add-smbus-functionality-flag.patch
-gregkh-i2c-i2c-au1550-add-au1200-support.patch
-gregkh-i2c-i2c-fix-copy-n-paste-in-subsystem-Kconfig.patch
-gregkh-i2c-i2c-matroxfb-c99-struct-init.patch
-gregkh-i2c-i2c-algo-bit-kill-mdelay.patch
-gregkh-i2c-i2c-bus-driver-for-TI-OMAP-boards.patch
-gregkh-i2c-i2c-isa-plan-for-removal.patch
-gregkh-i2c-i2c-stub-add-chip_addr-param.patch
-gregkh-i2c-i2c-dev-attach-detach-adapter-cleanups.patch
-gregkh-i2c-i2c-chips-__must_check-fixes.patch
-gregkh-i2c-i2c-isa-return-attach_adapter.patch
-gregkh-i2c-i2c-algo-bit-cleanups.patch
-gregkh-i2c-i2c-algo-pcf-kill-mdelay.patch
-gregkh-i2c-i2c-drop-useless-masking.patch
-gregkh-i2c-i2c-warn-on-failed-client-attach.patch
-gregkh-i2c-i2c-viapro-add-VT8251-VT8237A.patch
-gregkh-i2c-i2c-isa-restore-driver-owner.patch
-gregkh-i2c-i2c-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-algos-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-busses-constify-i2c_algorithm.patch
-gregkh-i2c-i2c-drop-slave-functions.patch
-i2c-mpc-fix-up-error-handling.patch
-ia64-kprobes-fixup-the-pagefault-exception-caused-by-probehandlers.patch
-stowaway-keyboard-support.patch
-stowaway-keyboard-support-update.patch
-wistron-fix-detection-of-special-buttons.patch
-fail-kernel-compilation-in-case-of-unresolved-symbols-v2.patch
-kerneldoc-error-on-ata_piixc.patch
-1-of-2-jmicron-driver.patch
-1-of-2-jmicron-driver-fix.patch
-2-of-2-jmicron-driver-plumbing-and-quirk.patch
-2-of-2-jmicron-driver-plumbing-and-quirk-cleanup.patch
-via-sata-oops-on-init.patch
-e1000-memory-leak-in-e1000_set_ringparam.patch
-drivers-net-acenicc-removal-of-old-code.patch
-drivers-net-tokenring-lanstreamerc-removal-of-old-code.patch
-drivers-net-tokenring-lanstreamerh-removal-of-old-code.patch
-drivers-net-typhoonc-removal-of-old-code.patch
-signedness-issue-in-drivers-net-phy-phy_devicec.patch
-fix-possible-null-ptr-deref-in-forcedeth.patch
-e1000-account-for-net_ip_align-when-calculating-bufsiz.patch
-net-ipv6-bh_lock_sock_nested-on-tcp_v6_rcv.patch
-via-ircc-fix-memory-leak.patch
-atm-he-fix-section-mismatch.patch
-add-netpoll-netconsole-support-to-vlan-devices.patch
-neighbourc-pneigh_get_next-skips-published-entry.patch
-nfs-replace-null-dentries-that-appear-in-readdirs-list-2.patch
-add-newline-to-nfs-dprintk.patch
-fs-nfs-make-code-static.patch
-gregkh-pci-resources-insert-identical-resources-above-existing-resources.patch
-gregkh-pci-msi-cleanup-existing-msi-quirks.patch
-gregkh-pci-msi-factorize-common-code-in-pci_msi_supported.patch
-gregkh-pci-msi-export-the-pci_bus_flags_no_msi-flag-in-sysfs.patch
-gregkh-pci-msi-rename-pci_cap_id_ht_irqconf-into-pci_cap_id_ht.patch
-gregkh-pci-msi-blacklist-pci-e-chipsets-depending-on-hypertransport-msi-capability.patch
-gregkh-pci-pcie-check-and-return-bus_register-errors.patch
-gregkh-pci-pci-express-aer-implemetation-aer-howto-document.patch
-gregkh-pci-pci-express-aer-implemetation-export-pcie_port_bus_type.patch
-gregkh-pci-pci-express-aer-implemetation-aer-core-and-aerdriver.patch
-gregkh-pci-pci-express-aer-implemetation-pcie_portdrv-error-handler.patch
-gregkh-pci-shpchp-must_check-fixes.patch
-gregkh-pci-pci-hotplug-must_check-fixes.patch
-gregkh-pci-pci-must_check-fixes.patch
-gregkh-pci-pci-multiprobe-sanitizer.patch
-gregkh-pci-pci-drivers-pci-hotplug-acpiphp_glue.c-make-a-function-static.patch
-gregkh-pci-pci-restore-pci-express-capability-registers-after-pm-event.patch
-gregkh-pci-pci-hotplug-cleanup-pcihp-skeleton-code.patch
-gregkh-pci-acpiphp-set-hpp-values-before-starting-devices.patch
-gregkh-pci-acpiphp-initialize-ioapics-before-starting-devices.patch
-gregkh-pci-acpiphp-do-not-initialize-existing-ioapics.patch
-gregkh-pci-pci-add-pci_stop_bus_device.patch
-gregkh-pci-acpiphp-stop-bus-device-before-acpi_bus_trim.patch
-gregkh-pci-acpiphp-disable-bridges.patch
-gregkh-pci-pci-assign-ioapic-resource-at-hotplug.patch
-gregkh-pci-acpiphp-add-support-for-ioapic-hot-remove.patch
-gregkh-pci-ia64-pci-dont-disable-irq-which-is-not-enabled.patch
-gregkh-pci-pciehp-fix-wrong-return-value.patch
-revert-scsi-improve-inquiry-printing.patch
-dc395x-fix-printk-format-warning.patch
-pci_module_init-conversion-in-scsi-subsys-2nd-try.patch
-megaraid-use-the-proper-type-to-hold-the-irq-number.patch
-drivers-scsi-dpt-dpti_i2oh-removal-of-old.patch
-drivers-scsi-gdthh-removal-of-old-scsi-code.patch
-drivers-scsi-nsp32h-removal-of-old-scsi-code.patch
-drivers-message-fusion-linux_compath-removal-of-old-code.patch
-signedness-issue-in-drivers-scsi-iprc.patch
-signedness-issue-in-drivers-scsi-osstc.patch
-bodge-scsi-misc-module-reference-count-checks-with-no-module_unload.patch
-scsi-remove-seagateh.patch
-scsi-seagate-scsi_cmnd-conversion.patch
-3w-xxxx-fix-ata-udma-upgrade-message-number.patch
-scsi-included-header-cleanup.patch
-gregkh-usb-usb-unusual_devs-entry-for-lacie-dvd-rw.patch
-gregkh-usb-usb-unusual_dev-entry-for-sony-p990i.patch
-gregkh-usb-usb-doc-patch-1.patch
-gregkh-usb-usb-doc-patch-2.patch
-gregkh-usb-usb-ohci-avoids-root-hub-timer-polling.patch
-gregkh-usb-usb-ohci-s3c2410.c-clock-now-usb-bus-host.patch
-gregkh-usb-usb-ohci-controller-support-for-pnx4008.patch
-gregkh-usb-usb-kill-usb-kconfig-warning.patch
-gregkh-usb-usb-move-linux-usb_otg.h-to-linux-usb-otg.h.patch
-gregkh-usb-usb-pxa2xx_udc-understands-gpio-based-vbus-sensing.patch
-gregkh-usb-usb-allow-compile-in-g_ether-fix-typo.patch
-gregkh-usb-usb-ark3116-add-tiocgserial-and-tiocsserial-ioctl-calls.patch
-gregkh-usb-usb-ark3116-formatting-cleanups.patch
-gregkh-usb-usb-make-usb_buffer_free-null-safe.patch
-gregkh-usb-usbcore-add-configuration_string-to-attribute-group.patch
-gregkh-usb-usb-add-driver-for-phidgetmotorcontrol.patch
-gregkh-usb-usb-put-phidgets-driver-in-a-sysfs-class.patch
-gregkh-usb-usb-phidgets-should-check-create_device_file-return-value.patch
-gregkh-usb-usbfs-private-mutex-for-open-release-and-remove.patch
-gregkh-usb-usbfs-detect-device-unregistration.patch
-gregkh-usb-usb-skeleton-don-t-submit-urbs-after-disconnection.patch
-gregkh-usb-usbcore-rename-usb_suspend_device-to-usb_port_suspend.patch
-gregkh-usb-usbcore-move-code-among-source-files.patch
-gregkh-usb-usbcore-add-usb_device_driver-definition.patch
-gregkh-usb-usbcore-make-usb_generic-a-usb_device_driver.patch
-gregkh-usb-usbcore-split-suspend-resume-for-device-and-interfaces.patch
-gregkh-usb-usbcore-resume-device-resume-recursion.patch
-gregkh-usb-usbcore-track-whether-interfaces-are-suspended.patch
-gregkh-usb-usbcore-set-device-and-power-states-properly.patch
-gregkh-usb-usbcore-fix-up-device-and-power-state-tests.patch
-gregkh-usb-usbcore-suspending-devices-with-no-driver.patch
-gregkh-usb-hub-driver-improve-use-of-ifdef.patch
-gregkh-usb-usb-usbtouchscreen-version-0.4.patch
-gregkh-usb-usb-pl2303-removes-unneeded-goto.patch
-gregkh-usb-usb-pl2303-remove-80-columns-limit-violations-in-pl2303-driver.patch
-gregkh-usb-usb-pl2303-cosmetic-changes-to-pl2303_buf_-clear-data_avail.patch
-gregkh-usb-usb-pl2303-reduce-number-of-prototypes.patch
-gregkh-usb-usb-pl2303-cosmetic-changes-to-quirk.patch
-gregkh-usb-usb-usbnet-add-unlink_rx_urbs-call-to-allow-for-jumbo-frames.patch
-gregkh-usb-usb-asix-add-ax88178-support-and-many-other-changes.patch
-gregkh-usb-usbnet-printk-format-warning.patch
-gregkh-usb-usb-ipaq-minor-ipaq_open-cleanup.patch
-gregkh-usb-usb-usbcore-get-rid-of-the-timer-in-usb_start_wait_urb.patch
-gregkh-usb-usb-wacom-tablet-driver-reorganization.patch
-gregkh-usb-usb-garmin_gps-support-for-new-generation-of-gps-receivers.patch
-gregkh-usb-usb-build-fixes-ohci-omap.patch
-gregkh-usb-usb-onetouch-handle-errors-from-input_register_device.patch
-gregkh-usb-usb-correct-locking-in-gadgetfs_disconnect.patch
-gregkh-usb-usb-fix-ep_config-to-return-correct-value.patch
-gregkh-usb-usb-gadgetfs-protect-ep_release-with-lock.patch
-gregkh-usb-usb-gmidi-new-usb-midi-gadget-class-driver.patch
-gregkh-usb-usb-make-file-operations-structs-in-drivers-usb-const.patch
-gregkh-usb-usb-making-the-kernel-wshadow-clean-usb-completion.patch
-gregkh-usb-usb-new-functions-to-check-endpoints-info.patch
-gregkh-usb-usb-usblp-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-hub-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-appletouch-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-acecad-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-ati_remote-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-keyspan_remote-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-powermate-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-usb-serial-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-usblcd-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-ldusb-use-usb_endpoint_-functions.patch
-gregkh-usb-usb-net1080-inherent-pad-length.patch
-gregkh-usb-usb-add-poll-to-gadgetfs-s-endpoint-zero.patch
-gregkh-usb-usb-gadget-gadgetfs-dont-try-to-lock-before-free.patch
-gregkh-usb-usb-properly-unregister-reboot-notifier-in-case-of-failure-in-ehci-hcd.patch
-gregkh-usb-uhci-increase-resume-detect-off-delay.patch
-gregkh-usb-usbcore-make-hcd_endpoint_disable-wait-for-queue-to-drain.patch
-gregkh-usb-usbcore-khubd-and-busy-port-handling.patch
-gregkh-usb-usb-skeleton-small-update.patch
-gregkh-usb-usb-storage-add-rio-karma-eject-support.patch
-gregkh-usb-usb-deal-with-broken-config-descriptors.patch
-gregkh-usb-wusb-hub-recognizes-wusb-ports.patch
-gregkh-usb-wusb-handle-wusb-device-ep0-speed-settings.patch
-gregkh-usb-wusb-pretty-print-new-devices.patch
-gregkh-usb-usb-core-use-const-where-possible.patch
-gregkh-usb-usb-fix-signedness-issue-in-drivers-usb-gadget-ether.c.patch
-gregkh-usb-usb-fix-typo-in-drivers-usb-gadget-kconfig.patch
-gregkh-usb-usb-storage-fix-for-ufi-lun-detection.patch
-gregkh-usb-usbcore-help-drivers-to-change-device-configs.patch
-gregkh-usb-usb-turn-usb_resume_both-into-static-inline.patch
-gregkh-usb-usb-usb-hub-driver-improve-use-of-ifdef-fix.patch
-gregkh-usb-usb-remove-struct-usb_operations.patch
-gregkh-usb-usbcore-add-flag-for-whether-a-host-controller-uses-dma.patch
-gregkh-usb-usbcore-trim-down-usb_bus-structure.patch
-gregkh-usb-usbmon-don-t-call-mon_dmapeek-if-dma-isn-t-being-used.patch
-gregkh-usb-usb-ethernet-gadget-avoids-zlps-for-musb_hdrc.patch
-gregkh-usb-usb-ehci-whitespace-fixes.patch
-gregkh-usb-gadgetfs-patch-for-ep0out.patch
-gregkh-usb-usb-replace-kernel_thread-with-kthread_run-in-libusual.c.patch
-gregkh-usb-usb-usb-serial-gadget-smp-related-bug.patch
-gregkh-usb-usb-net2280-update-dma-buffer-allocation.patch
-gregkh-usb-usb-ohci-at91-two-one-liners.patch
-gregkh-usb-usb-usb-input-usbmouse.c-whitespace-cleanup.patch
-gregkh-usb-usb-ub-let-cdrecord-to-see-a-device-with-media-absent.patch
-gregkh-usb-usbcore-store-each-usb_device-s-level-in-the-tree.patch
-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure.patch
-gregkh-usb-usbcore-non-hub-specific-uses-of-autosuspend.patch
-gregkh-usb-usbcore-remove-usb_suspend_root_hub.patch
-gregkh-usb-usb-fix-root-hub-resume-when-config_usb_suspend-is-not-set.patch
-gregkh-usb-usb-core-must_check.patch
-gregkh-usb-usb-misc-must_check.patch
-gregkh-usb-usb-atm-must_check.patch
-gregkh-usb-usb-class-must_check.patch
-gregkh-usb-usb-input-must_check.patch
-gregkh-usb-usb-host-must_check.patch
-gregkh-usb-usb-serial-must_check-fixes.patch
-gregkh-usb-cypress_m8-use-appropriate-urb-polling-interval.patch
-gregkh-usb-cypress_m8-use-usb_fill_int_urb-where-appropriate.patch
-gregkh-usb-cypress_m8-improve-control-endpoint-error-handling.patch
-gregkh-usb-cypress_m8-implement-graceful-failure-handling.patch
-gregkh-usb-add-aircable-usb-bluetooth-dongle-driver.patch
-gregkh-usb-aircable-fix-printk-format-warnings.patch
-gregkh-usb-usb-adutux-driver.patch
-gregkh-usb-usb-add-playstation-2-trance-vibrator-driver.patch
-gregkh-usb-usb-moschip-7840-usb-serial-driver.patch
-gregkh-usb-usb-serial-support-alcor-micro-corp.-usb-2.0-to-rs-232-through-pl2303-driver.patch
-gregkh-usb-usb-ftdi-elan-client-driver-for-elan-uxxx-adapters.patch
-gregkh-usb-usb-u132-hcd-host-controller-driver-for-elan-u132-adapter.patch
-gregkh-usb-usb-remove-unneeded-void-casts-in-core-files.patch
-gregkh-usb-usb-dealias-110-code.patch
-gregkh-usb-usb-ohci_usb-can-oops-on-shutdown.patch
-gregkh-usb-usb-force-root-hub-resume-after-power-loss.patch
-gregkh-usb-usb-ehci-update-via-workaround.patch
-gregkh-usb-usb-remove-otg-build-warning.patch
-gregkh-usb-airprime_major_update.patch
-gregkh-usb-usb-storage-add-rio-karma-eject-support-fix.patch
-fix-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure.patch
-x86_64-mm-i386-up-generic-arch.patch
-x86_64-mm-temp-revert-arch-perfmon.patch
-x86_64-mm-add-performance-counter-reservation-framework-for-up-kernels.patch
-x86_64-mm-utilize-performance-counter-reservation-framework-in-oprofile.patch
-x86_64-mm-add-smp-support-on-x86_64-to-reservation-framework.patch
-x86_64-mm-add-smp-support-on-i386-to-reservation-framework.patch
-x86_64-mm-cleanup-nmi-interrupt-path.patch
-x86_64-mm-tif-restore-sigmask.patch
-x86_64-mm-add-ppoll-pselect.patch
-x86_64-mm-remove-un-set_nmi_callback-and-reserve-release_lapic_nmi-functions.patch
-x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-sysfs.patch
-x86_64-mm-add-abilty-to-enable-disable-nmi-watchdog-from-procfs-update.patch
-x86_64-mm-allow-users-to-force-a-panic-on-nmi.patch
-x86_64-mm-x86-clean-up-nmi-panic-messages.patch
-x86_64-mm-x86-nmi-watchdog-suspend.patch
-x86_64-mm-unknown-nmi-panic.patch
-x86_64-mm-make-functions-static.patch
-x86_64-mm-kdump-x86_64-nmi-event-notification-fix.patch
-x86_64-mm-kdump-i386-nmi-event-notification-fix.patch
-x86_64-mm-i386-enable-nmi-wdog.patch
-x86_64-mm-add-nmi-watchdog-support-for-new-intel-cpus.patch
-x86_64-mm-rdtscp-macros.patch
-x86_64-mm-init-rdtscp.patch
-x86_64-mm-getcpu-vsyscall.patch
-x86_64-mm-generic-getcpu-syscall.patch
-x86_64-mm-no-asm-smp.patch
-x86_64-mm-tif-flags-for-debug-regs-and-io-bitmap-in-ctxsw.patch
-x86_64-mm-hpet-cosmetics.patch
-x86_64-mm-a-few-trivial-spelling-and-grammar-fixes.patch
-x86_64-mm-randomize-check.patch
-x86_64-mm-i386-profile-pc.patch
-x86_64-mm-simplify-profile-pc.patch
-x86_64-mm-backtracer-docs.patch
-x86_64-mm-asm-alternative.patch
-x86_64-mm-rwlock-to-asm.patch
-x86_64-mm-i386-remove-const-rwlock.patch
-x86_64-mm-fix-align.patch
-x86_64-mm-i386-asm-alternative.patch
-x86_64-mm-i386-semaphore-to-asm.patch
-x86_64-mm-remove-thunk-cvs-id.patch
-x86_64-mm-tce-comment.patch
-x86_64-mm-remove-apic-ifdefs.patch
-x86_64-mm-remove-apic-mismatch.patch
-x86_64-mm-remove-focus-disabled-workaround.patch
-x86_64-mm-tlb-flush-cleanup.patch
-x86_64-mm-i386-tlbflush-fixes.patch
-x86_64-mm-entry-comments.patch
-x86_64-mm-remove-pirq.patch
-x86_64-mm-remove-mca-eisa.patch
-x86_64-mm-remove-pic-mode.patch
-x86_64-mm-remove-mpparse-checks.patch
-x86_64-mm-io-apic-access.patch
-x86_64-mm-i386-io-apic-access.patch
-x86_64-mm-aux_device_info-is-one-byte-long,-use-movb.patch
-x86_64-mm-remove-apic-renumbering.patch
-x86_64-mm-quirks-own-file.patch
-x86_64-mm-mp-bus-type-bitmap.patch
-x86_64-mm-remove-mpparse-wrapper.patch
-x86_64-mm-remove-acpi-externs-in-mpparse.patch
-x86_64-mm-mpparse-acpi-style.patch
-x86_64-mm-i386-mpparse-acpi-style.patch
-x86_64-mm-apic-build-bug-on.patch
-x86_64-mm-detect-cfi.patch
-x86_64-mm-kernel-asm-remove-cvs-id.patch
-x86_64-mm-initialize-end-of-memory-variables-as-early-as.patch
-x86_64-mm-remove-int_delivery_dest.patch
-x86_64-mm-i386-end-of-memory.patch
-x86_64-mm-kernel-stack-doc.patch
-x86_64-mm-calgary-rearrange-struct-iommu_table.patch
-x86_64-mm-calgary-consolidate-per-bus-data.patch
-x86_64-mm-calgary-break-out-of.patch
-x86_64-mm-calgary-fix-error-path-memleak-in.patch
-x86_64-mm-calgary-fix-reference-counting-of.patch
-x86_64-mm-calgary-init-one.patch
-x86_64-mm-calgary-save-a-bit-of-space-in-bus_info.patch
-x86_64-mm-i386-remove-lock-section.patch
-x86_64-mm-remove-lock-section.patch
-x86_64-mm-fix-is_at_popf-for-compat-tasks.patch
-x86_64-mm-spinlock-cleanup.patch
-x86_64-mm-i386-spinlock-cleanup.patch
-x86_64-mm-annotate-lib.patch
-x86_64-mm-fix-gdt-table-size-in-trampoline.s.patch
-x86_64-mm-remove-superflous-bug_ons-in-nommu-and-gart.patch
-x86_64-mm-remove-lock-prefix-from-is_at_popf-tests.patch
-x86_64-mm-early-cpu-identify.patch
-x86_64-mm-allow-early_param-and-identical-__setup-to-exist.patch
-x86_64-mm-i386-early-param.patch
-x86_64-mm-early-param.patch
-x86_64-mm-remove-early-lockdep.patch
-x86_64-mm-move-acpi-disabled.patch
-x86_64-mm-move-acpi-numa.patch
-x86_64-mm-move-e820map.patch
-x86_64-mm-vsyscall-sparse.patch
-x86_64-mm-fault-sparse.patch
-x86_64-mm-sys_ia32-sparse.patch
-x86_64-mm-aout-sparse.patch
-x86_64-mm-replace-local_save_flags+local_irq_disable-with.patch
-x86_64-mm-acpi-remove-extern.patch
-x86_64-mm-tf-iret.patch
-x86_64-mm-print-whether-config_iommu_debug-is.patch
-x86_64-mm-only-verify-the-allocation-bitmap-if.patch
-x86_64-mm-remove-tce_cache_blast_stress.patch
-x86_64-mm-eradicate-sole-remaining-80-chars.patch
-x86_64-mm-stacktrace-cleanup.patch
-x86_64-mm-lockdep-stacktrace-no-recursion.patch
-x86_64-mm-early-safe-smp-processor-id.patch
-x86_64-mm-early-unwind-init.patch
-x86_64-mm-stacktrace-unwinder.patch
-x86_64-mm-stacktrace-terminate.patch
-x86_64-mm-i386-stacktrace-unwinder.patch
-x86_64-mm-i386-stacktrace-terminate.patch
-x86_64-mm-i386-backtrace-ebp-fallback.patch
-x86_64-mm-lockdep-dont-force-framepointer.patch
-x86_64-mm-fix-dubious-segment-register-clear-in-cpu_init.patch
-x86_64-mm-dont-taint-up-k7s-running-smp-kernels..patch
-x86_64-mm-kprobes-error_code.patch
-x86_64-mm-monotonic-clock.patch
-x86_64-mm-improve-crash-dump-description.patch
-x86_64-mm-boot-param-bss.patch
-x86_64-mm-i386-fix-mpparse-warning.patch
-x86_64-mm-fault-notifier-export.patch
-x86_64-mm-i386-fault-notifier-export.patch
-x86_64-mm-i386-acpi_force-static.patch
-x86_64-mm-i386-enable_local_apic-static.patch
-x86_64-mm-i386-kernel-thread.patch
-x86_64-mm-i386-desc-cleanup.patch
-x86_64-mm-per-cpu-area-size.patch
-x86_64-mm-i386-topology-cleanup.patch
-x86_64-mm-i386-more-init.patch
-x86_64-mm-fix-bus-numbering-format-in-mmconfig-warning.patch
-x86_64-mm-support-physical-cpu-hotplug-for-x86_64.patch
-x86_64-mm-less-lazy-fpu.patch
-x86_64-mm-wire-up-oops_enter-oops_exit.patch
-x86_64-mm-add-mem-fix.patch
-x86_64-mm-remove-redundant-generic_identify-calls-when-identifying-cpus.patch
-x86_64-mm-mark-init_amd-as-__cpuinit.patch
-x86_64-mm-mark-cpu_dev-structures-as-__cpuinitdata.patch
-x86_64-mm-mark-cpu-init-functions-as-__cpuinit,-data-as-__cpuinitdata.patch
-x86_64-mm-mark-cpu-identify-functions-as-__cpuinit.patch
-x86_64-mm-mark-cpu-cache-functions-as-__cpuinit.patch
-x86_64-mm-i386-kprobes-mca.patch
-x86_64-mm-i386-kprobes-nmi.patch
-x86_64-mm-remove-config.h-includes-from-asm-i386-asm-x86_64.patch
-x86_64-mm-drop-640k-reservation.patch
-x86_64-mm-move-compiler-check-to-ia64.patch
-x86_64-mm-make-numa_emulation-__init.patch
-x86_64-mm-i386-cfi-nmi.patch
-x86_64-mm-detect-clock-skew-during-suspend.patch
-x86_64-mm-remove-safe_smp_processor_id.patch
-x86_64-mm-early_ioremap-warning.patch
-x86_64-mm-pte-exec.patch
-x86_64-mm-cpa-pse-cleanup.patch
-x86_64-mm-remove-apic-version-capability.patch
-x86_64-mm-cleanup-apic-id-checking.patch
-x86_64-mm-mpparse-style.patch
-x86_64-mm-nmi-irqtrace-check.patch
-x86_64-mm-fix-head.S-warning.patch
-x86_64-mm-remove-e820-fallback.patch
-x86_64-mm-optimize-hweight64-for-x86_64.patch
-x86_64-mm-reload-cs-in-head.patch
-x86_64-mm-note-section.patch
-x86_64-mm-e820-comment.patch
-x86_64-mm-proxy-pda.patch
-x86_64-mm-fix-the-edd-code-misparsing-the-command-line.patch
-x86_64-mm-remove-most-of-the-special-cases-for-the-debug-ist-stack.patch
-x86_64-mm-kexec-dont-overwrite-pgd.patch
-x86_64-mm-i386-kexec-dont-overwrite-pgd.patch
-x86_64-mm-trace-kernel-text-address.patch
-x86_64-mm-document-tree.patch
-x86_64-mm-stack-protector-annotate-the-pda-offsets.patch
-x86_64-mm-stack-protector-add-the-kconfig-option.patch
-x86_64-mm-stack-protector-add-canary.patch
-x86_64-mm-stack-protector-add_stack_chk_fail.patch
-x86_64-mm-stack-protector-cflags.patch
-x86_64-mm-fix-irqcount-comment.patch
-x86_64-mm-pda-use-c-output-modifier.patch
-x86_64-mm-type-checking-for-write_pda.patch
-x86_64-mm-fix-pda-warning.patch
-x86_64-mm-i386-replace-sensitive-instructions.patch
-x86_64-mm-i386-allow-a-kernel-to-not-be-in-ring0.patch
-x86_64-mm-i386-pda-asm-offsets.patch
-x86_64-mm-i386-pda-basics.patch
-x86_64-mm-i386-pda-init-pda.patch
-x86_64-mm-i386-pda-use-gs.patch
-x86_64-mm-i386-pda-user-abi.patch
-x86_64-mm-i386-pda-vm86.patch
-x86_64-mm-i386-pda-smp-processorid.patch
-x86_64-mm-i386-pda-current.patch
-x86_64-mm-i386-early-fault.patch
-x86_64-mm-insert-ioapics-and-local-apic-into-resource-map.patch
-x86_64-mm-acpi-add-hpet-into-resource-map.patch
-x86_64-mm-copy-user-zeroing.patch
-x86_64-mm-copy-user-mustcheck.patch
-x86_64-mm-compat-pselect-must-check.patch
-x86_64-mm-compat-uname-must-check.patch
-x86_64-mm-copy-user-style.patch
-x86_64-mm-pda-style.patch
-x86_64-mm-pda-noreturn.patch
-x86_64-mm-remove-mmx.patch
-x86_64-mm-init-per-cpu-data-again.patch
-x86_64-mm-i386-kexec-not-experimental.patch
-x86_64-mm-kexec-not-experimental.patch
-x86_64-mm-fix-idle-notifiers.patch
-x86_64-mm-pci-probe-type1-first.patch
-x86_64-mm-mcfg-type1-heuristic.patch
-x86_64-mm-insert-gart-region-into-resource-map.patch
-x86_64-mm-mcfg-resource.patch
-x86_64-mm-i386-mcfg-resource.patch
-x86_64-mm-i386-pack-descriptor.patch
-x86_64-mm-i386-multiline-oops.patch
-x86_64-mm-restore-i8259a-eoi.patch
-x86_64-mm-core2-rep-good.patch
-x86_64-mm-mmconfig-fix-comment.patch
-x86_64-mm-amd-single-cpu-sync-rdtsc.patch
-x86_64-mm-remove-signal-map.patch
-x86_64-mm-ia32-signal-regparm.patch
-x86_64-mm-ia32-signal-style.patch
-x86_64-mm-unwind-signal-frame-detect.patch
-x86_64-mm-dont-leak-nt.patch
-x86_64-mm-early-scan-depends-on-pci.patch
-x86_64-mm-move-pci-direct-out-of-line.patch
-x86_64-mm-allow-disabling-early-pci-scans.patch
-x86_64-mm-fix-unw-pc-warning.patch
-x86_64-mm-i386-fix-unwind-disabled.patch
-x86_64-mm-add-64bit-jiffies-compares-for-use-with-get_jiffies_64.patch
-x86_64-mm-refactor-thermal-throttle-processing.patch
-x86_64-mm-make-the-jiffies-compares-use-the-64bit-safe-macros..patch
-x86_64-mm-add-a-cumulative-thermal-throttle-event-counter..patch
-fix-x86_64-mm-i386-pda-smp-processorid.patch
-fix-x86_64-mm-spinlock-cleanup.patch
-mm-vm_bug_on.patch
-mm-tracking-shared-dirty-pages.patch
-mm-tracking-shared-dirty-pages-nommu-fix-2.patch
-mm-balance-dirty-pages.patch
-mm-optimize-the-new-mprotect-code-a-bit.patch
-mm-small-cleanup-of-install_page.patch
-mm-fixup-do_wp_page.patch
-mm-msync-cleanup.patch
-mm-tracking-shared-dirty-pages-checks.patch
-mm-tracking-shared-dirty-pages-wimp.patch
-mm-make-functions-static.patch
-convert-i386-numa-kva-space-to-bootmem.patch
-convert-i386-numa-kva-space-to-bootmem-tidy.patch
-bootmem-remove-useless-__init-in-header-file.patch
-bootmem-mark-link_bootmem-as-part-of-the-__init-section.patch
-bootmem-remove-useless-parentheses-in-bootmem-header.patch
-bootmem-limit-to-80-columns-width.patch
-bootmem-remove-useless-headers-inclusions.patch
-bootmem-use-pfn-page-conversion-macros.patch
-bootmem-miscellaneous-coding-style-fixes.patch
-reduce-max_nr_zones-remove-two-strange-uses-of-max_nr_zones.patch
-reduce-max_nr_zones-fix-max_nr_zones-array-initializations.patch
-reduce-max_nr_zones-make-display-of-highmem-counters-conditional-on-config_highmem.patch
-reduce-max_nr_zones-make-display-of-highmem-counters-conditional-on-config_highmem-tidy.patch
-reduce-max_nr_zones-move-highmem-counters-into-highmemc-h.patch
-reduce-max_nr_zones-move-highmem-counters-into-highmemc-h-fix.patch
-reduce-max_nr_zones-page-allocator-zone_highmem-cleanup.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment-cleanup.patch
-reduce-max_nr_zones-use-enum-to-define-zones-reformat-and-comment-fix.patch
-reduce-max_nr_zones-make-zone_dma32-optional.patch
-reduce-max_nr_zones-make-zone_highmem-optional.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix-fix.patch
-reduce-max_nr_zones-make-zone_highmem-optional-fix-fix-fix.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones-s390-fix.patch
-reduce-max_nr_zones-remove-display-of-counters-for-unconfigured-zones-s390-fix-fix.patch
-reduce-max_nr_zones-fix-i386-srat-check-for-max_nr_zones.patch
-mempolicies-fix-policy_zone-check.patch
-apply-type-enum-zone_type.patch
-apply-type-enum-zone_type-fix.patch
-linearly-index-zone-node_zonelists.patch
-out-of-memory-notifier.patch
-out-of-memory-notifier-tidy.patch
-cpu-hotplug-compatible-alloc_percpu.patch
-cpu-hotplug-compatible-alloc_percpu-fix.patch
-cpu-hotplug-compatible-alloc_percpu-fix-2.patch
-add-kerneldocs-for-some-functions-in-mm-memoryc.patch
-mm-remove_mapping-safeness.patch
-mm-remove_mapping-safeness-fix.patch
-mm-non-syncing-lock_page.patch
-slab-respect-architecture-and-caller-mandated-alignment.patch
-mm-swap-write-failure-fixup.patch
-mm-swap-write-failure-fixup-update.patch
-mm-swap-write-failure-fixup-fix.patch
-oom-use-unreclaimable-info.patch
-oom-reclaim_mapped-on-oom.patch
-oom-cpuset-hint.patch
-oom-handle-current-exiting.patch
-oom-handle-oom_disable-exiting.patch
-oom-swapoff-tasks-tweak.patch
-oom-kthread-infinite-loop-fix.patch
-oom-more-printk.patch
-bootmem-use-max_dma_address-instead-of-low32limit.patch
-add-some-comments-to-slabc.patch
-update-some-mm-comments.patch
-slab-optimize-kmalloc_node-the-same-way-as-kmalloc.patch
-slab-optimize-kmalloc_node-the-same-way-as-kmalloc-fix.patch
-slab-extract-__kmem_cache_destroy-from-kmem_cache_destroy.patch
-slab-do-not-panic-when-alloc_kmemlist-fails-and-slab-is-up.patch
-slab-fix-lockdep-warnings.patch
-slab-fix-lockdep-warnings-fix.patch
-slab-fix-lockdep-warnings-fix-2.patch
-add-__gfp_thisnode-to-avoid-fallback-to-other-nodes-and-ignore.patch
-add-__gfp_thisnode-to-avoid-fallback-to-other-nodes-and-ignore-fix.patch
-sys_move_pages-do-not-fall-back-to-other-nodes.patch
-guarantee-that-the-uncached-allocator-gets-pages-on-the-correct.patch
-cleanup-add-zone-pointer-to-get_page_from_freelist.patch
-profiling-require-buffer-allocation-on-the-correct-node.patch
-define-easier-to-handle-gfp_thisnode.patch
-standardize-pxx_page-macros.patch
-standardize-pxx_page-macros-fix.patch
-optimize-free_one_page.patch
-do-not-check-unpopulated-zones-for-draining-and-counter.patch
-extract-the-allocpercpu-functions-from-the-slab-allocator.patch
-introduce-mechanism-for-registering-active-regions-of-memory.patch
-have-power-use-add_active_range-and-free_area_init_nodes.patch
-have-power-use-add_active_range-and-free_area_init_nodes-ppc-fix.patch
-have-x86-use-add_active_range-and-free_area_init_nodes.patch
-have-x86-use-add_active_range-and-free_area_init_nodes-fix.patch
-have-x86_64-use-add_active_range-and-free_area_init_nodes.patch
-have-ia64-use-add_active_range-and-free_area_init_nodes.patch
-have-ia64-use-add_active_range-and-free_area_init_nodes-fix.patch
-account-for-memmap-and-optionally-the-kernel-image-as-holes.patch
-account-for-memmap-and-optionally-the-kernel-image-as-holes-fix.patch
-account-for-holes-that-are-outside-the-range-of-physical-memory.patch
-allow-an-arch-to-expand-node-boundaries.patch
-replace-min_unmapped_ratio-by-min_unmapped_pages-in-struct-zone.patch
-zvc-support-nr_slab_reclaimable--nr_slab_unreclaimable.patch
-zone_reclaim-dynamic-slab-reclaim.patch
-zone_reclaim-dynamic-slab-reclaim-tidy.patch
-zone-reclaim-with-slab-avoid-unecessary-off-node-allocations.patch
-oom-kill-update-comments-to-reflect-current-code.patch
-hugepages-use-page_to_nid-rather-than-traversing-zone-pointers.patch
-numa-add-zone_to_nid-function.patch
-numa-add-zone_to_nid-function-update.patch
-vm-add-per-zone-writeout-counter.patch
-own-header-file-for-struct-page.patch
-page-invalidation-cleanup.patch
-slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id.patch
-slab-fix-kmalloc_node-applying-memory-policies-if-nodeid-==-numa_node_id-fix.patch
-condense-output-of-show_free_areas.patch
-add-numa_build-definition-in-kernelh-to-avoid-ifdef.patch
-disable-gfp_thisnode-in-the-non-numa-case.patch
-gfp_thisnode-for-the-slab-allocator-v2.patch
-gfp_thisnode-for-the-slab-allocator-v2-fix.patch
-gfp_thisnode-for-the-slab-allocator-v2-fix-3.patch
-add-node-to-zone-for-the-numa-case.patch
-add-node-to-zone-for-the-numa-case-fix.patch
-do-not-allocate-pagesets-for-unpopulated-zones.patch
-zone_statistics-use-hot-node-instead-of-cold-zone_pgdat.patch
-do_no_pfn.patch
-do_no_pfn-tweaks.patch
-mspec-driver.patch
-shared-page-table-for-hugetlb-page-v2.patch
-shared-page-table-for-hugetlb-page-v2-tidy.patch
-shared-page-table-for-hugetlb-page-v2-comments.patch
-selinux-eliminate-selinux_task_ctxid.patch
-selinux-rename-selinux_ctxid_to_string.patch
-selinux-replace-ctxid-with-sid-in.patch
-selinux-enable-configuration-of-max-policy-version.patch
-selinux-enable-configuration-of-max-policy-version-improve-security_selinux_policydb_version_max_value-help-texts.patch
-selinux-add-support-for-range-transitions-on-object.patch
-selinux-1-3-eliminate-inode_security_set_security.patch
-selinux-2-3-change-isec-semaphore-to-a-mutex.patch
-selinux-3-3-convert-sbsec-semaphore-to-a-mutex.patch
-selinux-fix-tty-locking.patch
-binfmt_elf-consistently-use-loff_t.patch
-frv-use-the-generic-irq-stuff.patch
-frv-improve-frvs-use-of-generic-irq-handling.patch
-frv-permit-__do_irq-to-be-dispensed-with.patch
-frv-fix-fls-to-handle-bit-31-being-set-correctly.patch
-frv-implement-fls64.patch
-frv-optimise-ffs.patch
-alchemy-delete-unused-pt_regs-argument-from-au1xxx_dbdma_chan_alloc.patch
-avr32-arch.patch
-avr32-config_debug_bugverbose-and-config_frame_pointer.patch
-avr32-fix-invalid-constraints-for-stcond.patch
-avr32-add-support-for-irq-flags-state-tracing.patch
-avr32-turn-off-support-for-discontigmem-and-sparsemem.patch
-avr32-always-enable-config_embedded.patch
-avr32-export-the-find__bit-functions.patch
-avr32-add-defconfig-for-at32stk1002.patch
-avr32-use-autoconf-instead-of-marker.patch
-avr32-dont-assume-anything-about-max_nr_zones.patch
-avr32-add-i-o-port-access-primitives.patch
-avr32-use-linux-pfnh.patch
-avr32-kill-config_discontigmem-support-completely.patch
-avr32-fix-bug-in-__avr32_asr64.patch
-avr32-switch-to-generic-timekeeping-framework.patch
-avr32-set-kbuild_defconfig.patch
-avr32-kprobes-compile-fix.patch
-avr32-asm-ioh-should-include-asm-byteorderh.patch
-avr32-fix-output-constraints-in-asm-bitopsh.patch
-avr32-standardize-pxx_page-macros-fix.patch
-avr32-rename-at32stk100x-atstk100x.patch
-avr32-dont-leave-dbe-set-when-resetting-cpu.patch
-avr32-make-prot_write-prot_exec-imply-prot_read.patch
-avr32-remove-set_wmb.patch
-avr32-use-parse_early_param.patch
-avr32-fix-exported-headers.patch
-avr32-fix-__const_udelay-overflow-bug.patch
-remove-zone_dma-remains-from-avr32.patch
-avr32-mtd-static-memory-controller-driver-try-2.patch
-avr32-mtd-at49bv6416-platform-device-for-atstk1000.patch
-nommu-check-that-access_process_vm-has-a-valid-target.patch
-nommu-set-bdi-capabilities-for-dev-mem-and-dev-kmem.patch
-nommu-set-bdi-capabilities-for-dev-mem-and-dev-kmem-tidy.patch
-nommu-use-find_vma-rather-than-reimplementing-a-vma-search.patch
-check-if-start-address-is-in-vma-region-in-nommu-function-get_user_pages.patch
-nommu-check-vma-protections.patch
-nommu-permit-ptrace-to-ignore-non-prot_write-vmas-in-nommu-mode.patch
-nommu-implement-proc-pid-maps-for-nommu.patch
-nommu-order-the-per-mm_struct-vma-list.patch
-nommu-make-mremap-partially-work-for-nommu-kernels.patch
-nommu-add-docs-about-shared-memory.patch
-nommu-make-futexes-work-under-nommu-conditions.patch
-nommu-make-futexes-work-under-nommu-conditions-doc.patch
-nommu-move-the-fallback-arch_vma_name-to-a-sensible-place.patch
-nommu-move-the-fallback-arch_vma_name-to-a-sensible-place-fix.patch
-hpet-rtc-emulation-add-watchdog-timer-2.patch
-i386-show_registers-try-harder-to-print-failing.patch
-use-bug_onfoo-instead-of-if-foo-bug-in-include-asm-i386-dma-mappingh.patch
-apm-clean-up-module-initalization.patch
-x86-remove-locally-defined-ldt-structure-in-favour-of-standard-type.patch
-x86-implement-always-locked-bit-ops-for-memory-shared-with-an-smp-hypervisor.patch
-x86-roll-all-the-cpuid-asm-into-one-__cpuid-call.patch
-x86-make-__fixaddr_top-variable-to-allow-it-to-make-space-for-a-hypervisor.patch
-x86-add-a-bootparameter-to-reserve-high-linear-address-space.patch
-x86-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch
-x86-put-note-sections-into-a-pt_note-segment-in-vmlinux-fix.patch
-x86-enable-vmsplit-for-highmem-kernels.patch
-x86-trivial-pgtableh-__assembly__-move.patch
-x86-trivial-move-of-__have-macros-in-i386-pagetable-headers.patch
-x86-trivial-move-of-ptep_set_access_flags.patch
-x86-remove-unused-include-from-efi_stubs.patch
-i386-adds-smp_call_function_single.patch
-voyager-tty-locking.patch
-i386-kill-references-to-xtime.patch
-mtrr-add-lock-annotations-for-prepare_set-and.patch
-i386-adds-smp_call_function_single-fix.patch
-alpha-fix-alpha_ev56-dependencies-typo.patch
-swsusp-write-timer.patch
-swsusp-write-speedup.patch
-swsusp-read-timer.patch
-swsusp-read-speedup.patch
-swsusp-read-speedup-fix.patch
-swsusp-read-speedup-cleanup.patch
-swsusp-read-speedup-cleanup-2.patch
-swsusp-read-speedup-fix-fix-2.patch
-swsusp-clean-up-browsing-of-pfns.patch
-swsusp-struct-snapshot_handle-cleanup.patch
-make-swsusp-avoid-memory-holes-and-reserved-memory-regions-on-x86_64.patch
-disable-cpu-hotplug-during-suspend-2.patch
-swsusp-fix-mark_free_pages.patch
-swsusp-reorder-memory-allocating-functions.patch
-swsusp-fix-alloc_pagedir.patch
-clean-up-suspend-header.patch
-change-the-name-of-pagedir_nosave.patch
-swsusp-introduce-some-helpful-constants.patch
-swsusp-introduce-memory-bitmaps.patch
-swsusp-use-memory-bitmaps-during-resume.patch
-swsusp-use-memory-bitmaps-during-resume-fix.patch
-pm-make-it-possible-to-disable-console-suspending.patch
-pm-make-it-possible-to-disable-console-suspending-fix.patch
-pm-make-it-possible-to-disable-console-suspending-fix-2.patch
-make-it-possible-to-disable-serial-console-suspend.patch
-i386-detect-clock-skew-during-suspend.patch
-pm-add-pm_trace-switch.patch
-pm-add-pm_trace-switch-doc.patch
-m32r-fix-make-headers_check.patch
-uml-use-klibc-setjmp-longjmp.patch
-uml-use-array_size-more-assiduously.patch
-uml-fix-stack-alignment.patch
-uml-whitespace-fixes.patch
-uml-fix-handling-of-failed-execs-of-helpers.patch
-uml-improve-sigbus-diagnostics.patch
-uml-sigio-cleanups.patch
-uml-move-signal-handlers-to-arch-code.patch
-uml-move-signal-handlers-to-arch-code-fix.patch
-uml-timer-cleanups.patch
-uml-remove-unused-variable.patch
-uml-clean-our-set_ether_mac.patch
-uml-stack-usage-reduction.patch
-uml-tty-locking.patch
-split-i386-and-x86_64-ptraceh.patch
-split-i386-and-x86_64-ptraceh-fix.patch
-make-uml-use-ptrace-abih.patch
-uml-use-mcmodel=kernel-for-x86_64.patch
-uml-fix-proc-vs-interrupt-context-spinlock-deadlock.patch
-s390-fix-cmm-kernel-thread-handling.patch
-autofs4-needs-to-force-fail-return-revalidate.patch
-kdump-introduce-reset_devices-command-line-option.patch
-fat-cleanup-fat_get_blocks.patch
-inode_diet-replace-inodeugeneric_ip-with-inodei_private.patch
-inode_diet-replace-inodeugeneric_ip-with-inodei_private-gfs-fix.patch
-inode-diet-move-i_pipe-into-a-union.patch
-inode-diet-move-i_bdev-into-a-union.patch
-inode-diet-move-i_cdev-into-a-union.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-fix-fix.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-xfs-fix.patch
-reiserfs-warn-about-the-useless-nolargeio-option.patch
-x86-microcode-microcode-driver-cleanup.patch
-x86-microcode-microcode-driver-cleanup-tidy.patch
-x86-microcode-using-request_firmware-to-pull-microcode.patch
-x86-microcode-add-sysfs-and-hotplug-support.patch
-x86-microcode-add-sysfs-and-hotplug-support-fix.patch
-x86-microcode-add-sysfs-and-hotplug-support-fix-fix-2.patch
-x86-microcode-dont-check-the-size.patch
-consistently-use-max_errno-in-__syscall_return.patch
-consistently-use-max_errno-in-__syscall_return-fix.patch
-eisa-bus-modalias-attributes-support-1.patch
-eisa-bus-modalias-attributes-support-1-fix.patch
-eisa-bus-modalias-attributes-support-1-fix-git-kbuild-fix.patch
-alloc_fdtable-cleanup.patch
-include-__param-section-in-read-only-data-range.patch
-msi-use-kmem_cache_zalloc.patch
-sysctl-allow-proc-sys-without-sys_sysctl.patch
-sysctl-allow-proc-sys-without-sys_sysctl-fix.patch
-sysctl-document-that-sys_sysctl-will-be-removed.patch
-pid-implement-transfer_pid-and-use-it-to-simplify-de_thread.patch
-pid-remove-temporary-debug-code-in-attach_pid.patch
-de_thread-use-tsk-not-current.patch
-add-probe_kernel_address.patch
-x86-use-probe_kernel_address-in-handle_bug.patch
-fs-conversions-from-kmallocmemset-to-kzcalloc.patch
-fs-removing-useless-casts.patch
-jbd-add-lock-annotation-to-jbd_sync_bh.patch
-ext3-and-jbd-cleanup-remove-whitespace.patch
-ext3-turn-on-reservation-dump-on-block-allocation-errors.patch
-ext3-add-more-comments-in-block-allocation-reservation-code.patch
-jbd-use-build_bug_on-in-journal-init.patch
-fix-ext3-mounts-at-16t.patch
-fix-ext3-mounts-at-16t-fix.patch
-fix-ext2-mounts-at-16t.patch
-fix-ext2-mounts-at-16t-fix.patch
-more-ext3-16t-overflow-fixes.patch
-more-ext3-16t-overflow-fixes-fix.patch
-ext3-inode-numbers-are-unsigned-long.patch
-ext3-inode-numbers-are-unsigned-long-fix.patch
-really-ignore-kmem_cache_destroy-return-value.patch
-make-kmem_cache_destroy-return-void.patch
-ibm-acpi-documentation-delete-irrelevant-how-to-compile-external-module.patch
-ext3-wrong-error-behavior.patch
-ext3-more-whitespace-cleanups.patch
-ext3-fix-sparse-warnings.patch
-jbd-16t-fixes.patch
-dontdiff-add-utsreleaseh.patch

Merged into mainline or a subsystem tree.

+acpi-preserve-correct-battery-state-through-suspend-resume-cycles.patch
+acpi-preserve-correct-battery-state-through-suspend-resume-cycles-tidy.patch

ACPI fix.

+driver-core-fixes-check-for-return-value-of-sysfs_create_link.patch

More __must_check fixes

+fix-gregkh-driver-nozomi.patch

Fix nozomi driver a bit.

-git-dvb-fixup.patch

Dropped.

+drivers-media-use-null-instead-of-0-for-ptrs.patch

Cleanup

-git-gfs2-fixup.patch

Dropped.

+inode_diet-replace-inodeugeneric_ip-with-inodei_private-gfs2.patch
+inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-gfs2.patch

GFS2 fixes

+possible-dereference-in.patch

Input possible-oops fix

+drivers-input-misc-added-acer-travelmate-2424nwxci-support-to-the-wistron-button-interface.patch

Add new machine to Wistron driver.

-libata-add-40pin-short-cable-support-honour-drive-fix.patch

Folded into libata-add-40pin-short-cable-support-honour-drive.patch

-via-pata-controller-xfer-fixes-fix.patch

Folded into via-pata-controller-xfer-fixes.patch

-mmc-driver-for-ti-flashmedia-card-reader-source-tidy.patch
-mmc-driver-for-ti-flashmedia-card-reader-source-alpha-fix.patch
-mmc-driver-for-ti-flashmedia-card-reader-source-vs-git-mmc.patch

Folded into mmc-driver-for-ti-flashmedia-card-reader-source.patch

+git-mtd-fixup.patch

Fix rejects in git-mtd.patch

-git-netdev-all-fixup.patch

Dropped.

+forcedeth-power-management-support.patch
+forcedeth-power-management-support-tidy.patch
+remove-unnecessary-check-in-drivers-net-depcac.patch

netdev updates

+nfs-kill-obsolete-nfs_paranoia.patch

NFS cleanup

-revert-allow-file-systems-to-manually-d_move-inside-of-rename.patch

Dropped.

+off-by-one-in-arch-ppc-platforms-mpc8.patch
+ehea-firmware-interface-based-on-anton-blanchards-new-hvcall-interface.patch

ppc fixes

-tickle-nmi-watchdog-on-serial-output-fix.patch

Folded into tickle-nmi-watchdog-on-serial-output.patch

+remove-unnecessary-check-in.patch
+pci-turn-pci_fixup_video-into-generic-for-embedded.patch
+pcie_portdrv_restore_config-undefined-without-config_pm.patch

PCI fixes

+remove-unnecessary-check-in-drivers-scsi-sgc.patch
+remove-extra-newline-from-info-message.patch
+fix-scsi-scsi_transporth-compile-error.patch
+overrun-in-drivers-scsi-scsic.patch
+megaraid-check-for-firmware-version.patch

SCSI fixes

+scsi-target-needs-pci.patch

Fix git-scsi-target.patch

+fix-gregkh-usb-usbcore-add-autosuspend-autoresume-infrastructure-2.patch
+usb-hubc-build-fix.patch
+usb-serial-possible-irq-lock-inversion-ppp-vs.patch
+usb-allow-both-root-hub-interrupts-and-polling.patch
+ohci-remove-existing-autosuspend-code.patch
+ohci-add-auto-stop-support.patch
+ohci-add-auto-stop-support-hack-hack.patch
+pegasus-driver-failing-for-admtek-8515-network-device.patch

USB fixes

+x86_64-mm-copy-user-inatomic.patch
+x86_64-mm-allow-disabling-dac.patch
+x86_64-mm-iommu-setup-style.patch
+x86_64-mm-document-iommu-panic.patch
+x86_64-mm-unify-ioapic-checking.patch
+x86_64-mm-nmi-sysctl-cleanup.patch
+x86_64-mm-i386-setup-array-size.patch
+x86_64-mm-setup-array-size.patch
+x86_64-mm-i386-mmconfig-flush.patch
+x86_64-mm-re-positioning-the-bss-segment.patch
+x86_64-mm-vsyscall-blob-header.patch
+x86_64-mm-sem-early-clobber.patch

x86 tree updates

-revert-x86_64-mm-i386-remove-lock-section.patch

Dropped

-revert-x86_64-mm-i386-pda-current.patch
-revert-x86_64-mm-i386-pda-smp-processorid.patch
-revert-x86_64-mm-i386-pda-vm86.patch
-revert-x86_64-mm-i386-pda-user-abi.patch
-revert-x86_64-mm-i386-pda-use-gs.patch
-revert-x86_64-mm-i386-pda-init-pda.patch

Dropped.

-hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup-fix.patch
-hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup-fix-2.patch

Folded into hot-add-mem-x86_64-memory_add_physaddr_to_nid-node-fixup.patch

-hot-add-mem-x86_64-use-config_memory_hotplug_reserve-fix.patch

Folded into hot-add-mem-x86_64-use-config_memory_hotplug_reserve.patch

+arch-i386-pci-mmconfigc-tlb-flush-fix-tweaks.patch

x86 fix

+deal-with-cases-of-zone_dma-meaning-the-first-zone-fix.patch

Fix deal-with-cases-of-zone_dma-meaning-the-first-zone.patch

-redo-radix-tree-fixes.patch
-adix-tree-rcu-lockless-readside-update.patch
-radix-tree-rcu-lockless-readside-semicolon.patch
-adix-tree-rcu-lockless-readside-update-tidy.patch
-adix-tree-rcu-lockless-readside-fix-2.patch
-adix-tree-rcu-lockless-readside-fix-3.patch
-radix-tree-cleanup-radix_tree_deref_slot-and.patch
-cleanup-radix_tree_derefreplace_slot-calling-conventions.patch
-cleanup-radix_tree_derefreplace_slot-calling-conventions-warning-fixes.patch

Folded into radix-tree-rcu-lockless-readside.patch

+mm-fix-a-race-condition-under-smc-cow.patch

MM fix

+uswsusp-add-pmops-prepareenterfinish-support-aka-platform-mode.patch
+swsusp-use-partition-device-and-offset-to-identify-swap-areas.patch
+swsusp-rearrange-swap-handling-code.patch
+swsusp-use-block-device-offsets-to-identify-swap-locations-rev-2.patch
+swsusp-add-resume_offset-command-line-parameter-rev-2.patch
+swsusp-add-resume_offset-command-line-parameter-rev-2-fix.patch
+swsusp-document-support-for-swap-files-rev-2.patch
+swsusp-debugging.patch

swsusp updates

+uml-assign-random-macs-to-interfaces-if-necessary.patch
+uml-mechanical-tidying-after-random-macs-change.patch
+uml-locking-documentation.patch
+uml-close-file-descriptor-leaks.patch
+uml-stack-consumption-reduction.patch

UML updates

-apple-motion-sensor-driver-2.patch
-apple-motion-sensor-driver-2-fixes-update.patch
-apple-motion-sensor-driver-kconfig-fix.patch
-ams-check-return-values-from-device_create_file.patch

Dropped - I couldn't keep up with all the changes.

-make-reiserfs-default-to-barrier=flush.patch
-make-ext3-mount-default-to-barrier=1.patch

Dropped - these slow things down too much.

+remove-sysrq_key-and-related-defines-from-ppc-sh-h8300.patch
+mmc-mainly-add-or-later-clause-to-licence-statement.patch
+prevent-multiple-inclusion-of-linux-sysrqh.patch
+move-ncpfs-32bit-compat-ioctl-to-ncpfs.patch
+ipmi-per-channel-command-registration.patch
+update-legacy-io-handling-for-pmac.patch
+ip2-use-newer-pci_get-functions.patch
+i2o-switch-to-pci_get-api.patch
+cardbus-switch-to-ref-counting-hotplug-safe-api.patch
+epoll_pwait.patch
+sysrq-disable-lockdep-on-reboot.patch
+trident-fix-pci_dev-reference-counting-and-buglet.patch
+off-by-one-in-drivers-char-mwave-mwaveddc.patch
+hdaps-support-lenovo-thinkpad-t60.patch
+typo-fixes-for-rt-mutex-designtxt.patch
+remove-bug_onunlikely-in-include-linux-aioh.patch

Misc fixes and updates

+csa-accounting-taskstats-update-update-comments-in-linux-taskstatsh.patch

Fix CSA accounting patches in -mm.

+char-mxser_new-correct-include-file.patch
+char-mxser_new-upgrade-to-191.patch
+char-mxser_new-rework-to-allow-dynamic-structs.patch

Update the new mxser driver

+kprobe-whitespace-cleanup.patch
+disallow-kprobes-on-notifier_call_chain.patch
+kretprobe-spinlock-deadlock-patch.patch

kprobes updates

+cpumask-add-highest_possible_node_id-fix.patch

Fix cpumask-add-highest_possible_node_id.patch

+ecryptfs-file-operations-readdir-fix-for-seeking-in-directory-streams.patch
+ecryptfs-grab-lock-on-lower_page-in-ecryptfs_sync_page.patch

ecryptfs updates

+reiser4-reiser4_drop_page-dont-call-remove_from_page_cache.patch
+reiser4-get-rid-of-semaphores-wherever-it-is-possible.patch

reiser4 fixes

+fbdev-correct-buffer-size-limit-in-fbmem_read_proc.patch

fbdev fix

-genirq-msi-restore-__do_irq-compat-logic-temporarily.patch

Dropped, unneeded.

+msi-simplify-msi-sanity-checks-by-adding-with-generic-irq-code.patch
+msi-only-use-a-single-irq_chip-for-msi-interrupts.patch
+msi-refactor-and-move-the-msi-irq_chip-into-the-arch-code.patch
+msi-move-the-ia64-code-into-arch-ia64.patch

MSI updates

+htirq-tidy-up-the-htirq-code.patch

Update hypertransport driver.



All 1259 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/patch-list


2006-09-28 11:54:11

by Michal Piotrowski

Subject: Re: 2.6.18-mm2

Hi,

On 28/09/06, Andrew Morton <[email protected]> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
>

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.18-mm2 #1
-------------------------------------------------------
nash/1264 is trying to acquire lock:
(&bdev_part_lock_key){--..}, at: [<c0310d4a>] mutex_lock+0x1c/0x1f

but task is already holding lock:
(&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&new->reconfig_mutex){--..}:
[<c01390b8>] add_lock_to_list+0x5c/0x7a
[<c013b1dd>] __lock_acquire+0x9f3/0xaef
[<c013b643>] lock_acquire+0x71/0x91
[<c031068f>] __mutex_lock_interruptible_slowpath+0xd2/0x326
[<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
[<c02ba4e3>] md_open+0x28/0x5d
[<c0197853>] do_open+0x8b/0x377
[<c0197cd5>] blkdev_open+0x1d/0x46
[<c0172f36>] __dentry_open+0x133/0x260
[<c01730d1>] nameidata_to_filp+0x1c/0x2e
[<c0173111>] do_filp_open+0x2e/0x35
[<c0173170>] do_sys_open+0x58/0xde
[<c0173222>] sys_open+0x16/0x18
[<c0103297>] syscall_call+0x7/0xb
[<ffffffff>] 0xffffffff

-> #1 (&bdev->bd_mutex){--..}:
[<c01390b8>] add_lock_to_list+0x5c/0x7a
[<c013b1dd>] __lock_acquire+0x9f3/0xaef
[<c013b643>] lock_acquire+0x71/0x91
[<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0310d4a>] mutex_lock+0x1c/0x1f
[<c0197824>] do_open+0x5c/0x377
[<c0197bab>] blkdev_get+0x6c/0x77
[<c01978d0>] do_open+0x108/0x377
[<c0197bab>] blkdev_get+0x6c/0x77
[<c0197eb1>] open_by_devnum+0x30/0x3c
[<c0147419>] swsusp_check+0x14/0xc5
[<c0145865>] software_resume+0x7e/0x100
[<c010049e>] init+0x121/0x29f
[<c0103f23>] kernel_thread_helper+0x7/0x10
[<c0109523>] save_stack_trace+0x17/0x30
[<c0138fb0>] save_trace+0x4f/0xfb
[<c01390b8>] add_lock_to_list+0x5c/0x7a
[<c013b1dd>] __lock_acquire+0x9f3/0xaef
[<c013b643>] lock_acquire+0x71/0x91
[<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0310d4a>] mutex_lock+0x1c/0x1f
[<c0197824>] do_open+0x5c/0x377
[<c0197bab>] blkdev_get+0x6c/0x77
[<c01978d0>] do_open+0x108/0x377
[<c0197bab>] blkdev_get+0x6c/0x77
[<c0197eb1>] open_by_devnum+0x30/0x3c
[<c0147419>] swsusp_check+0x14/0xc5
[<c0145865>] software_resume+0x7e/0x100
[<c010049e>] init+0x121/0x29f
[<c0103f23>] kernel_thread_helper+0x7/0x10
[<ffffffff>] 0xffffffff

-> #0 (&bdev_part_lock_key){--..}:
[<c013a7b6>] print_circular_bug_tail+0x30/0x64
[<c013b114>] __lock_acquire+0x92a/0xaef
[<c013b643>] lock_acquire+0x71/0x91
[<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0310d4a>] mutex_lock+0x1c/0x1f
[<c0197323>] bd_claim_by_disk+0x5f/0x18e
[<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
[<c02b6453>] autostart_arrays+0x24b/0x322
[<c02b9158>] md_ioctl+0x91/0x13f4
[<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
[<c01ead23>] blkdev_ioctl+0x755/0x7a2
[<c0196f9d>] block_ioctl+0x16/0x1b
[<c01801d2>] do_ioctl+0x22/0x67
[<c0180460>] vfs_ioctl+0x249/0x25c
[<c01804ba>] sys_ioctl+0x47/0x75
[<c0103297>] syscall_call+0x7/0xb
[<ffffffff>] 0xffffffff

other info that might help us debug this:

1 lock held by nash/1264:
#0: (&new->reconfig_mutex){--..}, at: [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
stack backtrace:
[<c0104215>] dump_trace+0x64/0x1cd
[<c0104390>] show_trace_log_lvl+0x12/0x25
[<c01049e5>] show_trace+0xd/0x10
[<c0104aad>] dump_stack+0x19/0x1b
[<c013a7df>] print_circular_bug_tail+0x59/0x64
[<c013b114>] __lock_acquire+0x92a/0xaef
[<c013b643>] lock_acquire+0x71/0x91
[<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0310d4a>] mutex_lock+0x1c/0x1f
[<c0197323>] bd_claim_by_disk+0x5f/0x18e
[<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
[<c02b6453>] autostart_arrays+0x24b/0x322
[<c02b9158>] md_ioctl+0x91/0x13f4
[<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
[<c01ead23>] blkdev_ioctl+0x755/0x7a2
[<c0196f9d>] block_ioctl+0x16/0x1b
[<c01801d2>] do_ioctl+0x22/0x67
[<c0180460>] vfs_ioctl+0x249/0x25c
[<c01804ba>] sys_ioctl+0x47/0x75
[<c0103297>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

Leftover inexact backtrace:

=======================
md: bind<hdb2>

config & dmesg http://www.stardust.webpages.pl/files/mm/2.6.18-mm2/

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
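The lockdep report above can be read as a small graph problem. A minimal sketch, assuming the existing chain means bdev_part_lock_key -> bd_mutex -> reconfig_mutex (my reading of the "in reverse order" listing, not something the report states in these words): taking bdev_part_lock_key while holding reconfig_mutex would add the edge that closes the loop. The enum names and helper functions below are hypothetical and only model the check in userspace; they are not the kernel's lockdep code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three lock classes named in the report. */
enum lock_class { BDEV_PART, BD_MUTEX, RECONFIG_MUTEX, NR_LOCK_CLASSES };

/* depends[a][b] is true when some task took b while already holding a. */
static bool depends[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Record the ordering "held, then taken". */
void record_dependency(enum lock_class held, enum lock_class taken)
{
    assert(held < NR_LOCK_CLASSES && taken < NR_LOCK_CLASSES);
    depends[held][taken] = true;
}

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_LOCK_CLASSES; next++) {
        if (depends[from][next] && !seen[next] &&
            reachable((enum lock_class)next, to, seen))
            return true;
    }
    return false;
}

/*
 * Taking `wanted` while holding `held` adds the edge held -> wanted.
 * That closes a cycle if `held` is already reachable from `wanted`.
 */
bool would_close_cycle(enum lock_class held, enum lock_class wanted)
{
    bool seen[NR_LOCK_CLASSES] = { false };
    return reachable(wanted, held, seen);
}
```

After recording BDEV_PART -> BD_MUTEX and BD_MUTEX -> RECONFIG_MUTEX, asking would_close_cycle(RECONFIG_MUTEX, BDEV_PART) reports a cycle, which matches the shape of the warning. The opposite order, would_close_cycle(BDEV_PART, RECONFIG_MUTEX), does not.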

2006-09-28 17:51:02

by Steve Fox

Subject: Re: 2.6.18-mm2

On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:

> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.

TCP bic registered
TCP westwood registered
TCP htcp registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Unable to handle kernel paging request at ffffffffffffffff RIP:
[<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
PGD 203027 PUD 2b031067 PMD 0
Oops: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
Call Trace:
[<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
[<ffffffff8061c68d>] packet_init+0x2d/0x53
[<ffffffff80207182>] init+0x162/0x330
[<ffffffff8020a9d8>] child_rip+0xa/0x12
[<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
[<ffffffff80207020>] init+0x0/0x330
[<ffffffff8020a9ce>] child_rip+0x0/0x12


Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
RSP <ffff810bffcbde90>
CR2: ffffffffffffffff
<0>Kernel panic - not syncing: Attempted to kill init!

--

Steve Fox
IBM Linux Technology Center

2006-09-28 19:01:32

by jurriaan

Subject: Re: 2.6.18-mm2

From: Steve Fox <[email protected]>
Date: Thu, Sep 28, 2006 at 05:50:31PM +0000
> On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
>
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
> Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
>
> TCP bic registered
> TCP westwood registered
> TCP htcp registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Unable to handle kernel paging request at ffffffffffffffff RIP:

I think you need to post additional details, such as .config files.
2.6.18-mm2 boots fine here (x86-64, X2 4600 cpu, smp)

Linux version 2.6.18-mm2 (jurriaan@middle) (gcc version 4.1.2 20060920 (prerelease) (Debian 4.1.1-14)) #5 SMP Thu Sep 28 19:56:29 CEST 2006
Command line: root=/dev/md2 video=nvidiafb:1600x1200-32@85 atkbd.softrepeat=1
protocol family 1
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
NET: Registered protocol family 17
NET: Registered protocol family 15
NET: Registered protocol family 8
NET: Registered protocol family 20
powernow-k8: Found 2 AMD Athlon(tm) 64 X2 Dual Core Processor 4600+ processors (version 2.00.00)

Kind regards,
Jurriaan
--
"I resent it as well," said Scharde. "I am working to keep my rage under
control."
Jack Vance - Ecce and Old Earth
Debian (Unstable) GNU/Linux 2.6.18-mm2 2x4826 bogomips load 1.35

2006-09-28 21:01:30

by Andrew Morton

Subject: Re: 2.6.18-mm2


(please always do reply-to-all)

On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
"Steve Fox" <[email protected]> wrote:

> On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
>
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
> Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
>
> TCP bic registered
> TCP westwood registered
> TCP htcp registered
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Unable to handle kernel paging request at ffffffffffffffff RIP:
> [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> PGD 203027 PUD 2b031067 PMD 0
> Oops: 0000 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> Call Trace:
> [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> [<ffffffff8061c68d>] packet_init+0x2d/0x53
> [<ffffffff80207182>] init+0x162/0x330
> [<ffffffff8020a9d8>] child_rip+0xa/0x12
> [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> [<ffffffff80207020>] init+0x0/0x330
> [<ffffffff8020a9ce>] child_rip+0x0/0x12
>
>
> Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> RSP <ffff810bffcbde90>
> CR2: ffffffffffffffff
> <0>Kernel panic - not syncing: Attempted to kill init!
>

I'm really struggling to work out what went wrong there. Comparing your
miserable 20 bytes of code to my object code makes me think that this:

struct packet_sock *po = pkt_sk(sk);

returned -1, perhaps in %ebp. But it's all very crude.

Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
addresses might change) then have a poke around with `gdb vmlinux' (or
maybe just addr2line) to work out where it's really oopsing?

I don't see much which has changed in that area recently.
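Before reaching for gdb or addr2line, the location string in the oops already gives a little arithmetic to work with. A minimal sketch, assuming the usual "symbol+0xoffset/0xsize" layout seen in the trace above: it splits the string and subtracts the offset from RIP to get the start address of the function. The struct and function names are hypothetical helpers, not kernel or binutils interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One "symbol+0xoffset/0xsize" location as printed in an oops. */
struct oops_loc {
    char symbol[64];
    unsigned long offset;   /* bytes from the start of the function */
    unsigned long size;     /* total size of the function in bytes */
};

/*
 * Parse a string such as "packet_notifier+0x163/0x1a0".
 * Returns 0 on success, -1 if the text does not match or the
 * offset does not fall inside the function.
 */
int parse_oops_loc(const char *text, struct oops_loc *out)
{
    assert(text != NULL && out != NULL);
    if (sscanf(text, "%63[^+]+0x%lx/0x%lx",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return out->offset < out->size ? 0 : -1;
}

/* Address of the first instruction of the function containing `rip`. */
uint64_t function_start(uint64_t rip, unsigned long offset)
{
    return rip - offset;
}
```

For the trace above, RIP ffffffff8047ef93 at packet_notifier+0x163 puts the start of packet_notifier at ffffffff8047ee30 in that particular build. With a CONFIG_DEBUG_INFO build, something like `addr2line -e vmlinux <address>` or, inside `gdb vmlinux`, `list *(packet_notifier+0x163)` should then map the faulting instruction to a source line, keeping in mind that the addresses may change after the rebuild.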

2006-09-28 22:38:34

by Jim Cromie

Subject: Re: 2.6.18-mm2


[jimc@harpo linux-2.6.18-mm2-sk]$ make
CHK include/linux/version.h
CHK include/linux/utsrelease.h
CHK include/linux/compile.h
GEN .version
CHK include/linux/compile.h
UPD include/linux/compile.h
CC init/version.o
LD init/built-in.o
LD .tmp_vmlinux1
arch/i386/kernel/built-in.o(.text+0x34f1): In function `do_nmi':
arch/i386/kernel/traps.c:752: undefined reference to
`panic_on_unrecovered_nmi'
arch/i386/kernel/built-in.o(.text+0x3564):arch/i386/kernel/traps.c:712:
undefined reference to `panic_on_unrecovered_nmi'


$ grep nmi arch/i386/kernel/Makefile
obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o

which I don't have enabled.

It looks to be due to changes in x86_64-mm-nmi-sysctl-cleanup.patch
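
The link failure above follows from a symbol that traps.c references while its only definition lives in nmi.c, which the Makefile builds only when CONFIG_X86_LOCAL_APIC is set. A minimal single-file sketch of one way to guard such a reference is shown here. It is not the fix that was applied to the tree, and `nmi_should_panic()` is a hypothetical helper name used only for illustration.

```c
/*
 * Illustrative sketch only, not the actual kernel fix.
 * CONFIG_X86_LOCAL_APIC is left undefined to mimic the failing config.
 */
#ifdef CONFIG_X86_LOCAL_APIC
/* Stands in for the definition that nmi.c would normally provide. */
int panic_on_unrecovered_nmi;
#else
/* No nmi.o in the build: fall back to a constant so callers still link. */
#define panic_on_unrecovered_nmi 0
#endif

/* Hypothetical caller, playing the role of do_nmi() in traps.c. */
int nmi_should_panic(void)
{
	return panic_on_unrecovered_nmi != 0;
}
```

The alternative is to move the definition into a file that is always built, which avoids the extra #ifdef entirely.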

2006-09-28 22:44:17

by Matthias Hentges

Subject: Re: 2.6.18-mm2

Hello all,

I've just tested -mm2 on my C2D system and I'm getting a lot of these
messages:

"[ 139.143807] printk: 131 messages suppressed.
[ 139.148235] sky2 0000:03:00.0: pci express error (0x500547)"

Please note that the "sky2" driver has always been the black sheep on
that system due to regular full lock-ups of the driver, requiring a
rmmod sky2 + modprobe sky2 cycle.

This happens often enough to warrant writing a cronjob checking the
network and auto-rmmod'ing the module.....

While the above is bloody annoying at times (heh), the driver never
caused any messages like the ones I now get with -mm2.

Dmesg of a fresh boot is attached. -mm1 works perfectly fine on that
machine.

--
Matthias Hentges

My OS: Debian SID. Geek by Nature, Linux by Choice


Attachments:
dmesg_2.6.18-mm2.txt.gz (8.82 kB)
signature.asc (189.00 B)
This is a digitally signed message part

2006-09-28 22:45:15

by Stephen Hemminger

Subject: Re: 2.6.18-mm2

On Thu, 28 Sep 2006 14:01:24 -0700
Andrew Morton <[email protected]> wrote:

>
> (please always do reply-to-all)
>
> On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> "Steve Fox" <[email protected]> wrote:
>
> > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> >
> > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> >
> > TCP bic registered
> > TCP westwood registered
> > TCP htcp registered
> > NET: Registered protocol family 1
> > NET: Registered protocol family 17
> > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > PGD 203027 PUD 2b031067 PMD 0
> > Oops: 0000 [1] SMP
> > last sysfs file:
> > CPU 0
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > Call Trace:
> > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > [<ffffffff80207182>] init+0x162/0x330
> > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > [<ffffffff80207020>] init+0x0/0x330
> > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> >
> >
> > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP <ffff810bffcbde90>
> > CR2: ffffffffffffffff
> > <0>Kernel panic - not syncing: Attempted to kill init!
> >
>
> I'm really struggling to work out what went wrong there. Comparing your
> miserable 20 bytes of code to my object code makes me think that this:
>
> struct packet_sock *po = pkt_sk(sk);
>
> returned -1, perhaps in %ebp. But it's all very crude.

That doesn't seem possible given:

static inline struct packet_sock *pkt_sk(struct sock *sk)
{
return (struct packet_sock *)sk;
}

That means the packet socket list is corrupted??
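
To make the point concrete, here is a small userspace sketch with stand-in types (the struct layouts are placeholders, not the kernel's). Because `pkt_sk()` is only a cast, it can return -1 only if the `sk` pointer it was handed was already -1, which would be consistent with RBP holding ffffffffffffffff in the register dump. `pkt_sk_is_identity()` is a hypothetical helper added for the demonstration.

```c
#include <stdint.h>

/* Minimal stand-ins for the kernel types; only the layout idea matters. */
struct sock {
	int placeholder;
};

struct packet_sock {
	struct sock sk;	/* must be the first member for the cast to be valid */
	int ifindex;
};

/* Same shape as the helper quoted above: a cast, no computation. */
struct packet_sock *pkt_sk(struct sock *sk)
{
	return (struct packet_sock *)sk;
}

/* Returns 1 if pkt_sk() hands back exactly the address it was given. */
int pkt_sk_is_identity(struct sock *sk)
{
	return (uintptr_t)pkt_sk(sk) == (uintptr_t)sk;
}
```

Under that reading, the bad value would have come from whatever produced `sk`, such as the list being walked, rather than from `pkt_sk()` itself.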

Stephen Hemminger <[email protected]>

2006-09-28 23:08:24

by Andi Kleen

Subject: Re: 2.6.18-mm2

On Friday 29 September 2006 00:39, Jim Cromie wrote:
>
> [jimc@harpo linux-2.6.18-mm2-sk]$ make
> CHK include/linux/version.h
> CHK include/linux/utsrelease.h
> CHK include/linux/compile.h
> GEN .version
> CHK include/linux/compile.h
> UPD include/linux/compile.h
> CC init/version.o
> LD init/built-in.o
> LD .tmp_vmlinux1
> arch/i386/kernel/built-in.o(.text+0x34f1): In function `do_nmi':
> arch/i386/kernel/traps.c:752: undefined reference to
> `panic_on_unrecovered_nmi'
> arch/i386/kernel/built-in.o(.text+0x3564):arch/i386/kernel/traps.c:712:
> undefined reference to `panic_on_unrecovered_nmi'
>
>
> $ grep nmi arch/i386/kernel/Makefile
> obj-$(CONFIG_X86_LOCAL_APIC) += apic.o nmi.o
>
> which I don't have enabled.

Will fix.

BTW, I was planning to make LOCAL_APIC unconditional on i386 too, as it is on x86-64.
There is basically never a reason to disable it, and the workaround for buggy
BIOSes can be done at runtime. Overall, the ratio of #ifdefs and compile breakage
to code saved by disabling the APIC is definitely unbalanced.

-Andi

2006-09-29 03:20:30

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

Yowza. This has been one of the most unstable -mm I've personally tried since
2.6.0 came out (and I've tried to give each and every single one a shot).

Something is giving cache_alloc_refill() massive indigestion, I'm taking
lots of oopsen in it. Usually within 5-10 minutes I'm dead in the water.

From an untainted kernel:

Sep 28 21:51:59 turing-police kernel: [ 526.046000] BUG: unable to handle kernel paging request at virtual address 00100104
Sep 28 21:51:59 turing-police kernel: [ 526.046000] printing eip:
Sep 28 21:51:59 turing-police kernel: [ 526.046000] c0150c43
Sep 28 21:51:59 turing-police kernel: [ 526.046000] *pde = 00000000

as far as it got logging it to disk - at that point the machine locked up
hard, even alt-sysrq was dead, had to power-cycle. Long time since that
happened. Admittedly, that's not much to go on, but it shows that I'm having
issues in cache_alloc_refill() even when untainted. I'll probably get more
complete untainted traces while playing bisect-the-mm tomorrow....
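
One possible reading of the fault address 00100104, offered as a sketch and not a diagnosis: list_del() in include/linux/list.h poisons a removed entry with LIST_POISON1 (0x00100100) and LIST_POISON2 (0x00200200), and on a 32-bit kernel the `prev` field sits 4 bytes into struct list_head. A second list_del() on the same entry would then store to LIST_POISON1 + 4. The 32-bit struct and the function name below are modelling assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Poison values written by list_del() in include/linux/list.h. */
#define LIST_POISON1 0x00100100u
#define LIST_POISON2 0x00200200u

/* 32-bit model of struct list_head: two 4-byte pointers. */
struct list_head32 {
	uint32_t next;
	uint32_t prev;
};

/*
 * __list_del() starts with "next->prev = prev". Given the value found in
 * entry->next, return the address that store would touch.
 */
uint32_t list_del_first_store(uint32_t next_value)
{
	return next_value + (uint32_t)offsetof(struct list_head32, prev);
}
```

Calling `list_del_first_store(LIST_POISON1)` gives 0x00100104, matching the reported address.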

Another few traces, more complete, almost same EIP (inside cache_alloc_refill
both times), but admittedly nvidia-tainted:

Sep 28 21:40:07 turing-police kernel: [ 825.672000] BUG: unable to handle kernel paging request at virtual address 646c617a
Sep 28 21:40:07 turing-police kernel: [ 825.672000] printing eip:
Sep 28 21:40:07 turing-police kernel: [ 825.672000] c0150f9b
Sep 28 21:40:07 turing-police kernel: [ 825.672000] *pde = 00000000
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Oops: 0002 [#1]
Sep 28 21:40:07 turing-police kernel: [ 825.672000] PREEMPT
Sep 28 21:40:07 turing-police kernel: [ 825.672000] last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Modules linked in: aes cryptomgr xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal sony_acpi processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class nvidia yenta_socket ohci1394 ieee1394 rsrc_nonstatic intel_agp pcmcia_core agpgart iTCO_wdt rtc
Sep 28 21:40:07 turing-police kernel: [ 825.672000] CPU: 0
Sep 28 21:40:07 turing-police kernel: [ 825.672000] EIP: 0060:[<c0150f9b>] Tainted: P VLI
Sep 28 21:40:07 turing-police kernel: [ 825.672000] EFLAGS: 00210002 (2.6.18-mm2 #1)
Sep 28 21:40:07 turing-police kernel: [ 825.672000] EIP is at cache_alloc_refill+0x12a/0x453
Sep 28 21:40:07 turing-police kernel: [ 825.672000] eax: effdf4d0 ebx: effdfa40 ecx: 00000001 edx: 646c6176
Sep 28 21:40:07 turing-police kernel: [ 825.672000] esi: dffedd00 edi: effdf4c0 ebp: def37f0c esp: def37ec8
Sep 28 21:40:07 turing-police kernel: [ 825.672000] ds: 007b es: 007b ss: 0068
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Process badpost (pid: 3474, ti=def36000 task=dfe9aaa0 task.ti=def36000)
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Stack: effe03e0 66666174 000000d0 effe18c0 00000003 effdfa40 00000000 ffffffff
Sep 28 21:40:07 turing-police kernel: [ 825.672000] 00000000 ffffffff 00000001 def37fbc 01200011 00000000 00200286 fffffff4
Sep 28 21:40:07 turing-police kernel: [ 825.672000] dfe9aaa0 def37f18 c0150e68 def37fbc def37f5c c0111b6a def37fbc bfda5158
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Call Trace:
Sep 28 21:40:07 turing-police kernel: [ 825.672000] [<c0150e68>] kmem_cache_alloc+0x25/0x2e
Sep 28 21:40:07 turing-police kernel: [ 825.672000] [<c0111b6a>] copy_process+0xa2/0x1183
Sep 28 21:40:07 turing-police kernel: [ 825.672000] [<c0112dbf>] do_fork+0x8d/0x172
Sep 28 21:40:07 turing-police kernel: [ 825.672000] [<c0101216>] sys_clone+0x25/0x2a
Sep 28 21:40:07 turing-police kernel: [ 825.672000] [<c0102d23>] syscall_call+0x7/0xb
Sep 28 21:40:07 turing-police kernel: [ 825.672000] DWARF2 unwinder stuck at syscall_call+0x7/0xb
Sep 28 21:40:07 turing-police kernel: [ 825.672000]
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Leftover inexact backtrace:
Sep 28 21:40:07 turing-police kernel: [ 825.672000]
Sep 28 21:40:07 turing-police kernel: [ 825.672000] =======================
Sep 28 21:40:07 turing-police kernel: [ 825.672000] Code: 9e 1c 89 46 14 8b 5d d0 89 54 8b 10 41 89 0b 8b 46 10 89 45 c0 8b 55 c8 3b 42 1c 73 09 ff 4d cc 83 7d cc ff 75 bd 8b 16 8b 46 04 <89> 42 04 89 10 c7 06 00 01 10 00 c7 46 04 00 02 20 00 83 7e 14
Sep 28 21:40:07 turing-police kernel: [ 825.672000] EIP: [<c0150f9b>] cache_alloc_refill+0x12a/0x453 SS:ESP 0068:def37ec8
Sep 28 21:40:07 turing-police kernel: [ 825.672000] <6>note: badpost[3474] exited with preempt_count 1
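
A side observation on the trace above, offered tentatively: the bad pointer in edx (646c6176) and the second stack word (66666174) both consist entirely of printable ASCII bytes, which can be a hint that string data was written over a pointer. The helper below is a hypothetical convenience for checking this. It renders a 32-bit word as the four characters it would occupy in little-endian memory.

```c
#include <ctype.h>
#include <stdint.h>

/*
 * Write the four bytes of a 32-bit value, in little-endian memory order,
 * into out[0..3] as characters; non-printable bytes become '.'.
 * out must have room for 5 bytes (4 characters plus the terminator).
 */
void word_as_ascii(uint32_t value, char out[5])
{
	for (int i = 0; i < 4; i++) {
		unsigned char byte = (unsigned char)(value >> (8 * i));
		out[i] = isprint(byte) ? (char)byte : '.';
	}
	out[4] = '\0';
}
```

For 0x646c6176 this yields "vald", and for 0x66666174 it yields "taff". That could be coincidence, but it fits an overwrite by text more easily than a plain stale pointer.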

And then a second oops at exactly the same EIP as the previous one:

Sep 28 21:40:11 turing-police kernel: [ 829.630000] BUG: unable to handle kernel paging request at virtual address 646c617a
Sep 28 21:40:11 turing-police kernel: [ 829.630000] printing eip:
Sep 28 21:40:11 turing-police kernel: [ 829.630000] c0150f9b
Sep 28 21:40:11 turing-police kernel: [ 829.630000] *pde = 00000000
Sep 28 21:40:11 turing-police kernel: [ 829.630000] Oops: 0002 [#2]
Sep 28 21:40:11 turing-police kernel: [ 829.630000] PREEMPT
Sep 28 21:40:11 turing-police kernel: [ 829.630000] last sysfs file: /devices/system/cpu/cpu0/cpufreq/scaling_setspeed
Sep 28 21:40:11 turing-police kernel: [ 829.630000] Modules linked in: aes cryptomgr xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal sony_acpi processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class nvidia yenta_socket ohci1394 ieee1394 rsrc_nonstatic intel_agp pcmcia_core agpgart iTCO_wdt rtc
Sep 28 21:40:11 turing-police kernel: [ 829.630000] CPU: 0
Sep 28 21:40:11 turing-police kernel: [ 829.630000] EIP: 0060:[<c0150f9b>] Tainted: P VLI
Sep 28 21:40:11 turing-police kernel: [ 829.630000] EFLAGS: 00210002 (2.6.18-mm2 #1)
Sep 28 21:40:11 turing-police kernel: [ 829.630000] EIP is at cache_alloc_refill+0x12a/0x453
Sep 28 21:40:11 turing-police kernel: [ 829.630000] eax: effdf4d0 ebx: effdfa40 ecx: 00000000 edx: 646c6176
Sep 28 21:40:11 turing-police kernel: [ 829.630000] esi: dffedd00 edi: effdf4c0 ebp: e11d3f0c esp: e11d3ec8
Sep 28 21:40:11 turing-police kernel: [ 829.630000] ds: 007b es: 007b ss: 0068

I've seen mostly 3 different stack traces for this:

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167 ebx: effdfa40 ecx: 00000001 edx: d9eede00
esi: daf19700 edi: effdf4c0 ebp: db237f0c esp: db237ec8
ds: 007b es: 007b ss: 0068
Process procmail (pid: 3206, ti=db236000 task=db299550 task.ti=db236000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000003 effdfa40 00000000 ffffffff
00000000 ffffffff 00000001 db237fbc 01200011 00000000 00200286 fffffff4
db299550 db237f18 c0150e68 db237fbc db237f5c c0111b6a db237fbc bfbc3678
Call Trace:
[<c0150e68>] kmem_cache_alloc+0x25/0x2e
[<c0111b6a>] copy_process+0xa2/0x1183
[<c0112dbf>] do_fork+0x8d/0x172
[<c0101216>] sys_clone+0x25/0x2a
[<c0102d23>] syscall_call+0x7/0xb

and

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167 ebx: effdfa40 ecx: 00000000 edx: d9eede00
esi: daf19700 edi: effdf4c0 ebp: dceedda8 esp: dceedd64
ds: 007b es: 007b ss: 0068
Process fetchmail (pid: 2752, ti=dceec000 task=dbfb9aa0 task.ti=dceec000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000004 effdfa40 00000000 e2774500
dceeddd4 c02fe47b 0000014f 0000014f 0000000f 00000473 00200286 00000f80
db1c1680 dceeddb4 c015130c db1c1680 dceeddd8 c02d2fb9 00000001 000000d0
Call Trace:
[<c015130c>] __kmalloc+0x48/0x55
[<c02d2fb9>] __alloc_skb+0x4f/0xf7
[<c02f4b2a>] tcp_sendmsg+0x14c/0x965
[<c030bdf4>] inet_sendmsg+0x3b/0x48
[<c02cdb8b>] sock_aio_write+0xf5/0x102
[<c0153691>] do_sync_write+0xae/0xec
[<c0153e6b>] vfs_write+0xbc/0x157
[<c01543be>] sys_write+0x3b/0x60
[<c0102d23>] syscall_call+0x7/0xb

and

EIP is at cache_alloc_refill+0x12d/0x453
eax: 00000167 ebx: effdfa40 ecx: 00000000 edx: d9eede00
esi: daf19700 edi: effdf4c0 ebp: ddb2fdc4 esp: ddb2fd80
ds: 007b es: 007b ss: 0068
Process Eterm (pid: 2700, ti=ddb2e000 task=e39ab000 task.ti=ddb2e000)
Stack: effe03e0 00000001 000000d0 effe18c0 00000004 effdfa40 00000000 00000017
00170001 00200082 ddb2fdd0 00200082 00000000 00000000 00200286 00000f80
ee80cd80 ddb2fdd0 c015130c ee80cd80 ddb2fdf4 c02d2fb9 00000000 000000d0
Call Trace:
[<c015130c>] __kmalloc+0x48/0x55
[<c02d2fb9>] __alloc_skb+0x4f/0xf7
[<c02cfef1>] sock_alloc_send_skb+0x5a/0x17b
[<c03258b9>] unix_stream_sendmsg+0x13b/0x2e6
[<c02cdb8b>] sock_aio_write+0xf5/0x102
[<c0153691>] do_sync_write+0xae/0xec
[<c0153e6b>] vfs_write+0xbc/0x157
[<c01543be>] sys_write+0x3b/0x60
[<c0102d23>] syscall_call+0x7/0xb



2006-09-29 03:29:34

by Andrew Morton

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Thu, 28 Sep 2006 23:19:11 -0400
[email protected] wrote:

> On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
> Yowza. This has been one of the most unstable -mm I've personally tried since
> 2.6.0 came out (and I've tried to give each and every single one a shot).
>
> Something is giving cache_alloc_refill() massive indigestion, I'm taking
> lots of oopsen in it. Usually within 5-10 minutes I'm dead in the water.

Could be anything I'm afraid. But you're the first to report it, so there's
something distinct in your .config or hardware.

Whose idea was it to make it a monolithic kernel??

> From an untainted kernel:
>
> Sep 28 21:51:59 turing-police kernel: [ 526.046000] BUG: unable to handle kernel paging request at virtual address 00100104
> Sep 28 21:51:59 turing-police kernel: [ 526.046000] printing eip:
> Sep 28 21:51:59 turing-police kernel: [ 526.046000] c0150c43
> Sep 28 21:51:59 turing-police kernel: [ 526.046000] *pde = 00000000
>
> as far as it got logging it to disk - at that point the machine locked up
> hard, even alt-sysrq was dead, had to power-cycle. Long time since that
> happened. Admittedly, that's not much to go on, but it shows that I'm having
> issues in cache_alloc_refill() even when untainted. I'll probably get more
> complete untainted traces while playing bisect-the-mm tomorrow....

bisecting would be good, thanks. It might be quicker to strip down the .config
though.

2006-09-29 03:58:12

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:
> On Thu, 28 Sep 2006 23:19:11 -0400
> [email protected] wrote:
>
> > On Thu, 28 Sep 2006 01:46:23 PDT, Andrew Morton said:
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/

> > Something is giving cache_alloc_refill() massive indigestion, I'm taking
> > lots of oopsen in it. Usually within 5-10 minutes I'm dead in the water.
>
> Could be anything I'm afraid. But you're the first to report it, so there's
> something distinct in your .config or hardware.

Like *that* hasn't happened before. :)

> bisecting would be good, thanks. It might be quicker to strip down the .config
> though.

On the other hand, this really smells like the kind of storage overlay where
changing the config changes what gets overlaid, scaring the bug into hiding.
The fact the system lives 5-10 minutes means that there's *something* that
happens that makes it manifest - and that could be almost anything.



2006-09-29 12:12:31

by Peter Zijlstra

Subject: md deadlock (was Re: 2.6.18-mm2)

On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> Hi,
>
> On 28/09/06, Andrew Morton <[email protected]> wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> >
> >
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.18-mm2 #1
> -------------------------------------------------------
> nash/1264 is trying to acquire lock:
> (&bdev_part_lock_key){--..}, at: [<c0310d4a>] mutex_lock+0x1c/0x1f
>
> but task is already holding lock:
> (&new->reconfig_mutex){--..}, at: [<c03108ff>]
> mutex_lock_interruptible+0x1c/0x1f
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (&new->reconfig_mutex){--..}:
> [<c01390b8>] add_lock_to_list+0x5c/0x7a
> [<c013b1dd>] __lock_acquire+0x9f3/0xaef
> [<c013b643>] lock_acquire+0x71/0x91
> [<c031068f>] __mutex_lock_interruptible_slowpath+0xd2/0x326
> [<c03108ff>] mutex_lock_interruptible+0x1c/0x1f
> [<c02ba4e3>] md_open+0x28/0x5d -> mddev->reconfig_mutex
> [<c0197853>] do_open+0x8b/0x377 -> bdev->bd_mutex (whole)
> [<c0197cd5>] blkdev_open+0x1d/0x46
> [<c0172f36>] __dentry_open+0x133/0x260
> [<c01730d1>] nameidata_to_filp+0x1c/0x2e
> [<c0173111>] do_filp_open+0x2e/0x35
> [<c0173170>] do_sys_open+0x58/0xde
> [<c0173222>] sys_open+0x16/0x18
> [<c0103297>] syscall_call+0x7/0xb
> [<ffffffff>] 0xffffffff
>
> -> #1 (&bdev->bd_mutex){--..}:
> [<c01390b8>] add_lock_to_list+0x5c/0x7a
> [<c013b1dd>] __lock_acquire+0x9f3/0xaef
> [<c013b643>] lock_acquire+0x71/0x91
> [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
> [<c0310d4a>] mutex_lock+0x1c/0x1f
> [<c0197824>] do_open+0x5c/0x377
> [<c0197bab>] blkdev_get+0x6c/0x77
> [<c01978d0>] do_open+0x108/0x377
> [<c0197bab>] blkdev_get+0x6c/0x77
> [<c0197eb1>] open_by_devnum+0x30/0x3c
> [<c0147419>] swsusp_check+0x14/0xc5
> [<c0145865>] software_resume+0x7e/0x100
> [<c010049e>] init+0x121/0x29f
> [<c0103f23>] kernel_thread_helper+0x7/0x10
> [<c0109523>] save_stack_trace+0x17/0x30
> [<c0138fb0>] save_trace+0x4f/0xfb
> [<c01390b8>] add_lock_to_list+0x5c/0x7a
> [<c013b1dd>] __lock_acquire+0x9f3/0xaef
> [<c013b643>] lock_acquire+0x71/0x91
> [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
> [<c0310d4a>] mutex_lock+0x1c/0x1f
> [<c0197824>] do_open+0x5c/0x377 -> bdev->bd_mutex (whole)
> [<c0197bab>] blkdev_get+0x6c/0x77
> [<c01978d0>] do_open+0x108/0x377 -> bdev->bd_mutex (partition)
> [<c0197bab>] blkdev_get+0x6c/0x77
> [<c0197eb1>] open_by_devnum+0x30/0x3c
> [<c0147419>] swsusp_check+0x14/0xc5
> [<c0145865>] software_resume+0x7e/0x100
> [<c010049e>] init+0x121/0x29f
> [<c0103f23>] kernel_thread_helper+0x7/0x10
> [<ffffffff>] 0xffffffff
>
> -> #0 (&bdev_part_lock_key){--..}:
> [<c013a7b6>] print_circular_bug_tail+0x30/0x64
> [<c013b114>] __lock_acquire+0x92a/0xaef
> [<c013b643>] lock_acquire+0x71/0x91
> [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
> [<c0310d4a>] mutex_lock+0x1c/0x1f
> [<c0197323>] bd_claim_by_disk+0x5f/0x18e -> bdev->bd_mutex (partition)
> [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
> [<c02b6453>] autostart_arrays+0x24b/0x322
> [<c02b9158>] md_ioctl+0x91/0x13f4
> [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
> [<c01ead23>] blkdev_ioctl+0x755/0x7a2
> [<c0196f9d>] block_ioctl+0x16/0x1b
> [<c01801d2>] do_ioctl+0x22/0x67
> [<c0180460>] vfs_ioctl+0x249/0x25c
> [<c01804ba>] sys_ioctl+0x47/0x75
> [<c0103297>] syscall_call+0x7/0xb
> [<ffffffff>] 0xffffffff
>
> other info that might help us debug this:
>
> 1 lock held by nash/1264:
> #0: (&new->reconfig_mutex){--..}, at: [<c03108ff>]
> mutex_lock_interruptible+0x1c/0x1f
> stack backtrace:
> [<c0104215>] dump_trace+0x64/0x1cd
> [<c0104390>] show_trace_log_lvl+0x12/0x25
> [<c01049e5>] show_trace+0xd/0x10
> [<c0104aad>] dump_stack+0x19/0x1b
> [<c013a7df>] print_circular_bug_tail+0x59/0x64
> [<c013b114>] __lock_acquire+0x92a/0xaef
> [<c013b643>] lock_acquire+0x71/0x91
> [<c0310b0f>] __mutex_lock_slowpath+0xd2/0x2f1
> [<c0310d4a>] mutex_lock+0x1c/0x1f
> [<c0197323>] bd_claim_by_disk+0x5f/0x18e -> bdev->bd_mutex (part)
> [<c02b44ec>] bind_rdev_to_array+0x1f0/0x20e
autorun_devices -> mddev->reconfig_mutex
> [<c02b6453>] autostart_arrays+0x24b/0x322
> [<c02b9158>] md_ioctl+0x91/0x13f4
> [<c01ea5bc>] blkdev_driver_ioctl+0x49/0x5b
> [<c01ead23>] blkdev_ioctl+0x755/0x7a2
> [<c0196f9d>] block_ioctl+0x16/0x1b
> [<c01801d2>] do_ioctl+0x22/0x67
> [<c0180460>] vfs_ioctl+0x249/0x25c
> [<c01804ba>] sys_ioctl+0x47/0x75
> [<c0103297>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
>
> Leftover inexact backtrace:

Looks like a real deadlock here. It seems to me #2 is the easiest to
break.

static int md_open(struct inode *inode, struct file *file)
{
/*
* Succeed if we can lock the mddev, which confirms that
* it isn't being stopped right now.
*/
mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
int err;

if ((err = mddev_lock(mddev)))
goto out;

err = 0;
mddev_get(mddev);
mddev_unlock(mddev);

check_disk_change(inode->i_bdev);
out:
return err;
}

mddev_get() is a simple atomic_inc(), and I fail to see how waiting for
the lock makes any difference.



2006-09-29 12:52:57

by NeilBrown

Subject: Re: md deadlock (was Re: 2.6.18-mm2)

On Friday September 29, [email protected] wrote:
> On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
>
> Looks like a real deadlock here. It seems to me #2 is the easiest to
> break.

I guess it could deadlock if you tried to add /dev/md0 as a component
of /dev/md0. I should probably check for that somewhere.
In other cases the array->member ordering ensures there is no
deadlock.
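
A rough sketch of the kind of check mentioned above, using a deliberately simplified model: every type and function name here is hypothetical and none of it reflects md's real data structures. The idea is to refuse a new member if the array is already reachable from that member, which covers the md0-inside-md0 case as well as longer chains.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified model: each device has a number and, if it is
 * an array, a list of member devices. This is not md's real layout.
 */
struct blockdev_model {
	unsigned int devnum;
	const struct blockdev_model *const *members;	/* NULL for plain disks */
	size_t nr_members;
};

/* Return 1 if 'needle' is 'dev' itself or is reachable through its members. */
int contains_device(const struct blockdev_model *dev,
		    const struct blockdev_model *needle)
{
	if (dev->devnum == needle->devnum)
		return 1;
	for (size_t i = 0; i < dev->nr_members; i++)
		if (contains_device(dev->members[i], needle))
			return 1;
	return 0;
}

/*
 * Adding 'member' to 'array' would create a loop if 'member' already
 * contains 'array' (including the case member == array).
 */
int would_create_loop(const struct blockdev_model *array,
		      const struct blockdev_model *member)
{
	return contains_device(member, array);
}
```

The recursion assumes the existing array->member graph is already loop-free, which is exactly the invariant such a check is meant to preserve.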

>
> static int md_open(struct inode *inode, struct file *file)
> {
> /*
> * Succeed if we can lock the mddev, which confirms that
> * it isn't being stopped right now.
> */
> mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
> int err;
>
> if ((err = mddev_lock(mddev)))
> goto out;
>
> err = 0;
> mddev_get(mddev);
> mddev_unlock(mddev);
>
> check_disk_change(inode->i_bdev);
> out:
> return err;
> }
>
> mddev_get() is a simple atomic_inc(), and I fail to see how waiting for
> the lock makes any difference.

Hmm... I'm pretty sure I do want some sort of locking there - to make
sure that the
if (atomic_read(&mddev->active)>2) {
test in do_md_stop actually means something. However it does seem
that the locking I have doesn't really guarantee anything much.

But I really think that this locking order should be allowed. md
should ensure that there are never any loops in the array->member
ordering, and somehow that needs to be communicated to lockdep.

One of the items on my todo list is to sort out the lifetime rules of
md devices (once accessed, they currently never disappear). Getting
this locking right should be part of that.

NeilBrown

2006-09-29 13:57:59

by J.A. Magallón

Subject: Re: 2.6.18-mm2

On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton <[email protected]> wrote:

>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
>
>

aic7xxx oopses on boot:

PCI: Setting latency timer of device 0000:00:0e.0 to 64
IRQ handler type mismatch for IRQ 0
[<c013c697>] setup_irq+0xb7/0x1b0
[<c0274770>] ahc_linux_isr+0x0/0x50
[<c013c833>] request_irq+0xa3/0xc0
[<c027605c>] ahc_pci_map_int+0x2c/0x50
[<c027167a>] ahc_pci_config+0x5ea/0xcf0
[<c0208c00>] pci_bus_write_config_byte+0x30/0x70
[<c02761dc>] ahc_linux_pci_dev_probe+0xec/0x1e0
[<c01983b5>] sysfs_dirent_exist+0x45/0x70
[<c019927b>] sysfs_create_link+0x7b/0x180
[<c020d643>] pci_match_device+0x13/0xd0
[<c0202b2f>] kobject_get+0xf/0x20
[<c020d776>] pci_device_probe+0x56/0x80
[<c024ea7b>] really_probe+0x3b/0xe0
[<c024eb5f>] driver_probe_device+0x3f/0xa0
[<c030c7a3>] klist_next+0x53/0xa0
[<c024ecba>] __driver_attach+0x7a/0x80
[<c024e01a>] bus_for_each_dev+0x3a/0x60
[<c024e986>] driver_attach+0x16/0x20
[<c024ec40>] __driver_attach+0x0/0x80
[<c024e39c>] bus_add_driver+0x7c/0x1a0
[<c020d935>] __pci_register_driver+0x65/0x90
[<c0405749>] ahc_linux_init+0x79/0x90
[<c01004b0>] init+0x120/0x330
[<c0102eca>] ret_from_fork+0x6/0x1c
[<c0100390>] init+0x0/0x330
[<c0100390>] init+0x0/0x330
[<c0103b13>] kernel_thread_helper+0x7/0x14
=======================
aic7xxx: probe of 0000:00:0e.0 failed with error -16
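
For context, error -16 is -EBUSY. The following is a simplified userspace model of the compatibility check that setup_irq() performs when a second handler asks for an IRQ line that already has one. The `MODEL_` constants and the function name are assumptions made for this sketch, not copied from a kernel header. It assumes the driver asked for a shared line while whatever already owns IRQ 0 (typically the timer) did not.

```c
#include <errno.h>

/*
 * Model constants for this sketch; they play the role of the kernel's
 * IRQF_SHARED and IRQF_TRIGGER_MASK but are not taken from a header.
 */
#define MODEL_IRQF_SHARED       0x80u
#define MODEL_IRQF_TRIGGER_MASK 0x0fu

/*
 * Decide whether a new handler may join an IRQ line that already has one.
 * Both handlers must ask for sharing and agree on the trigger type;
 * otherwise the request is refused with -EBUSY.
 */
int model_share_irq(unsigned int old_flags, unsigned int new_flags)
{
	if (!((old_flags & new_flags) & MODEL_IRQF_SHARED))
		return -EBUSY;
	if ((old_flags ^ new_flags) & MODEL_IRQF_TRIGGER_MASK)
		return -EBUSY;
	return 0;
}
```

In this model, `model_share_irq(0, MODEL_IRQF_SHARED)` fails with -EBUSY. The mismatch may be only a symptom here: the full dmesg reports "No IRQ known for interrupt pin A" for 0000:00:0e.0 just before the request.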

lspci:

leda:~# lspci
00:00.0 Host bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX Host bridge (rev 03)
00:01.0 PCI bridge: Intel Corporation 440BX/ZX/DX - 82443BX/ZX/DX AGP bridge (rev 03)
00:07.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 02)
00:07.1 IDE interface: Intel Corporation 82371AB/EB/MB PIIX4 IDE (rev 01)
00:07.2 USB Controller: Intel Corporation 82371AB/EB/MB PIIX4 USB (rev 01)
00:07.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 02)
00:0d.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
00:0e.0 SCSI storage controller: Adaptec AHA-2940U2/U2W / 7890/7891 (rev 01)
00:0f.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 64)
00:12.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 07)
00:12.1 Input device controller: Creative Labs SB Live! Game Port (rev 07)
01:00.0 VGA compatible controller: nVidia Corporation NV34 [GeForce FX 5200] (rev a1)

(the 2940 is onboard and the U160 is a PCI card).

Full dmesg follows:

Linux version 2.6.18-jam02 (root@rescue) (gcc version 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP Fri Sep 29 12:31:45 CEST 2006
BIOS-provided physical RAM map:
sanitize start
sanitize end
copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2
copy_e820_map() start: 00000000000e0000 size: 0000000000020000 end: 0000000000100000 type: 2
copy_e820_map() start: 0000000000100000 size: 000000001ff00000 end: 0000000020000000 type: 1
copy_e820_map() type is E820_RAM
copy_e820_map() start: 00000000fec00000 size: 0000000000001000 end: 00000000fec01000 type: 2
copy_e820_map() start: 00000000fee00000 size: 0000000000001000 end: 00000000fee01000 type: 2
copy_e820_map() start: 00000000fffc0000 size: 0000000000040000 end: 0000000100000000 type: 2
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000020000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
512MB LOWMEM available.
found SMP MP-table at 000fb4c0
Entering add_active_range(0, 0, 131072) 0 entries of 256 used
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 131072
HighMem 131072 -> 131072
early_node_map[1] active PFN ranges
0: 0 -> 131072
On node 0 totalpages: 131072
DMA zone: 32 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 4064 pages, LIFO batch:0
Normal zone: 992 pages used for memmap
Normal zone: 125984 pages, LIFO batch:31
HighMem zone: 0 pages used for memmap
DMI 2.1 present.
ACPI: Unable to locate RSDP
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: INTEL Product ID: 440BX APIC at: 0xFEE00000
Processor #0 6:7 APIC version 17
Processor #1 6:7 APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Enabling APIC mode: Flat. Using 1 I/O APICs
Processors: 2
Allocating PCI resources starting at 30000000 (gap: 20000000:dec00000)
Detected 501.164 MHz processor.
Built 1 zonelists. Total pages: 130048
Kernel command line: vga=6 root=/dev/sda1
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 8192 bytes)
Console: colour VGA+ 80x60
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 515792k/524288k available (2111k kernel code, 8076k reserved, 845k data, 204k init, 0k highmem)
virtual kernel memory layout:
fixmap : 0xfff9d000 - 0xfffff000 ( 392 kB)
pkmap : 0xff800000 - 0xffc00000 (4096 kB)
vmalloc : 0xe0800000 - 0xff7fe000 ( 495 MB)
lowmem : 0xc0000000 - 0xe0000000 ( 512 MB)
.init : 0xc03ea000 - 0xc041d000 ( 204 kB)
.data : 0xc030fe3a - 0xc03e33a4 ( 845 kB)
.text : 0xc0100000 - 0xc030fe3a (2111 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 1003.12 BogoMIPS (lpj=5015604)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 16k freed
CPU0: Intel Pentium III (Katmai) stepping 03
Booting processor 1/1 eip 2000
Initializing CPU#1
Calibrating delay using timer specific routine.. 1002.31 BogoMIPS (lpj=5011557)
CPU: After generic identify, caps: 0383fbff 00000000 00000000 00000000 00000000 00000000 00000000
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
CPU: After all inits, caps: 0383fbff 00000000 00000000 00000040 00000000 00000000 00000000
CPU1: Intel Pentium III (Katmai) stepping 03
Total of 2 processors activated (2005.43 BogoMIPS).
ExtINT not setup in hardware but reported by MP table
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=0 pin2=0
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=2850
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb81, last bus=1
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
* Found PM-Timer Bug on the chipset. Due to workarounds for a bug,
* this clock source is slow. Consider trying other clock sources
PCI quirk: region 0400-043f claimed by PIIX4 ACPI
PCI quirk: region 0440-044f claimed by PIIX4 SMB
PIIX4 devres B PIO at 0290-0297
Boot video device is 0000:01:00.0
PCI: Cannot allocate resource region 0 of device 0000:00:0e.0
PCI: Bridge: 0000:00:01.0
IO window: d000-dfff
MEM window: fca00000-feafffff
PREFETCH window: e4800000-f48fffff
NET: Registered protocol family 2
IP route cache hash table entries: 16384 (order: 4, 65536 bytes)
TCP established hash table entries: 65536 (order: 7, 524288 bytes)
TCP bind hash table entries: 32768 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 65536 bind 32768)
TCP reno registered
Installing knfsd (copyright (C) 1996 [email protected]).
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
Limiting direct PCI/PCI transfers.
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec 29160 Ultra160 SCSI adapter>
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi 0:0:0:0: Direct-Access IBM IC35L018UWD210-0 S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled. Depth 32
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 63)
target0:0:0: Ending Domain Validation
scsi 0:0:5:0: CD-ROM TOSHIBA CD-ROM XM-6401TA 1015 PQ: 0 ANSI: 2
target0:0:5: Beginning Domain Validation
target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
target0:0:5: Domain Validation skipping write tests
target0:0:5: Ending Domain Validation
PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.
PCI: Setting latency timer of device 0000:00:0e.0 to 64
IRQ handler type mismatch for IRQ 0
[<c013c697>] setup_irq+0xb7/0x1b0
[<c0274770>] ahc_linux_isr+0x0/0x50
[<c013c833>] request_irq+0xa3/0xc0
[<c027605c>] ahc_pci_map_int+0x2c/0x50
[<c027167a>] ahc_pci_config+0x5ea/0xcf0
[<c0208c00>] pci_bus_write_config_byte+0x30/0x70
[<c02761dc>] ahc_linux_pci_dev_probe+0xec/0x1e0
[<c01983b5>] sysfs_dirent_exist+0x45/0x70
[<c019927b>] sysfs_create_link+0x7b/0x180
[<c020d643>] pci_match_device+0x13/0xd0
[<c0202b2f>] kobject_get+0xf/0x20
[<c020d776>] pci_device_probe+0x56/0x80
[<c024ea7b>] really_probe+0x3b/0xe0
[<c024eb5f>] driver_probe_device+0x3f/0xa0
[<c030c7a3>] klist_next+0x53/0xa0
[<c024ecba>] __driver_attach+0x7a/0x80
[<c024e01a>] bus_for_each_dev+0x3a/0x60
[<c024e986>] driver_attach+0x16/0x20
[<c024ec40>] __driver_attach+0x0/0x80
[<c024e39c>] bus_add_driver+0x7c/0x1a0
[<c020d935>] __pci_register_driver+0x65/0x90
[<c0405749>] ahc_linux_init+0x79/0x90
[<c01004b0>] init+0x120/0x330
[<c0102eca>] ret_from_fork+0x6/0x1c
[<c0100390>] init+0x0/0x330
[<c0100390>] init+0x0/0x330
[<c0103b13>] kernel_thread_helper+0x7/0x14
=======================
aic7xxx: probe of 0000:00:0e.0 failed with error -16
SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write through
SCSI device sda: 35843670 512-byte hdwr sectors (18352 MB)
sda: Write Protect is off
sda: Mode Sense: cb 00 00 08
SCSI device sda: drive cache: write through
sda: sda1 sda2 < sda5 >
sd 0:0:0:0: Attached scsi disk sda
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
mice: PS/2 mouse device common for all mice
md: linear personality registered for level -1
md: raid0 personality registered for level 0
md: raid1 personality registered for level 1
md: raid10 personality registered for level 10
input: AT Translated Set 2 keyboard as /class/input/input0
raid6: int32x1 95 MB/s
logips2pp: Detected unknown logitech mouse model 1
raid6: int32x2 98 MB/s
raid6: int32x4 114 MB/s
raid6: int32x8 117 MB/s
input: PS/2 Logitech Mouse as /class/input/input1
raid6: mmxx1 217 MB/s
raid6: mmxx2 323 MB/s
raid6: sse1x1 245 MB/s
raid6: sse1x2 329 MB/s
raid6: using algorithm sse1x2 (329 MB/s)
md: raid6 personality registered for level 6
md: raid5 personality registered for level 5
md: raid4 personality registered for level 4
raid5: automatically using best checksumming function: pIII_sse
pIII_sse : 1014.400 MB/sec
raid5: using function: pIII_sse (1014.400 MB/sec)
md: md driver 0.90.3 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: bitmap version 4.39
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Time: tsc clocksource has been installed.
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 204k freed
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
USB Universal Host Controller Interface driver v3.0
uhci_hcd 0000:00:07.2: UHCI Host Controller
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
uhci_hcd 0000:00:07.2: irq 9, io base 0x0000ef80
usb usb1: new device found, idVendor=0000, idProduct=0000
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: Product: UHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.18-jam02 uhci_hcd
usb usb1: SerialNumber: 0000:00:07.2
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
sr0: scsi-1 drive
Uniform CD-ROM driver Revision: 3.20
sr 0:0:5:0: Attached scsi CD-ROM sr0
EXT3 FS on sda1, internal journal
libata version 2.00 loaded.
ata_piix 0000:00:07.1: version 2.00ac7
ata1: PATA max UDMA/33 cmd 0x1F0 ctl 0x3F6 bmdma 0xFFA0 irq 14
ata2: PATA max UDMA/33 cmd 0x170 ctl 0x376 bmdma 0xFFA8 irq 15
scsi1 : ata_piix
scsi2 : ata_piix
ATA: abnormal status 0x7F on port 0x177
ATA: abnormal status 0x7F on port 0x177
ata2.00: ATAPI, max MWDMA0, CDB intr
ata2.00: configured for PIO3
scsi 2:0:0:0: Direct-Access IOMEGA ZIP 250 51.G PQ: 0 ANSI: 5
sd 2:0:0:0: Attached scsi removable disk sdb
Adding 1148608k swap on /dev/sda5. Priority:-1 extents:1 across:1148608k
loop: loaded (max 8 devices)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
3c59x: Donald Becker and others. http://www.scyld.com/network/vortex.html
0000:00:0f.0: 3Com PCI 3c905B Cyclone 100baseTx at e0804f80.
eth0: setting full-duplex.
nfsd: last server has exited
nfsd: unexporting all filesystems
Linux agpgart interface v0.101 (c) Dave Jones
nvidia: module license 'NVIDIA' taints kernel.
NVRM: loading NVIDIA Linux x86 Kernel Module 1.0-9625 Thu Sep 14 15:33:21 PDT 2006

--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.18-jam02 (gcc 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP PREEMPT

2006-09-29 14:04:36

by Peter Zijlstra

Subject: Re: md deadlock (was Re: 2.6.18-mm2)

On Fri, 2006-09-29 at 22:52 +1000, Neil Brown wrote:
> On Friday September 29, [email protected] wrote:
> > On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> >
> > Looks like a real deadlock here. It seems to me #2 is the easiest to
> > break.
>
> I guess it could deadlock if you tried to add /dev/md0 as a component
> of /dev/md0. I should probably check for that somewhere.
> In other cases the array->member ordering ensures there is no
> deadlock.
>


        1                               2

open(/dev/md0)

                                        open(/dev/md0)
                                        - do_open() -> bdev->bd_mutex
ioctl(/dev/md0, hotadd)
- md_ioctl() -> mddev->reconfig_mutex
-- hot_add_disk()
--- bind_rdev_to_array()
---- bd_claim_by_disk()
----- bd_claim_by_kobject()
                                        -- md_open()
                                        --- mddev_lock()
                                        ---- mutex_lock(mddev->reconfig_mutex)
------ mutex_lock(bdev->bd_mutex)


looks like an AB-BA deadlock to me
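
The inversion can be sketched in plain userspace C: record, for each code path, the order in which it takes its locks, and flag any path that takes a pair in the reverse of an order already seen. This is only a minimal illustration, not kernel code. The names `lock_id`, `record_path`, `BD_MUTEX` and `RECONFIG_MUTEX` are made up here to stand in for `bdev->bd_mutex` and `mddev->reconfig_mutex`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers standing in for the two mutexes in the trace. */
enum lock_id { BD_MUTEX, RECONFIG_MUTEX, NUM_LOCKS };

/* order[a][b] is true once some path has taken lock b while holding lock a. */
static bool order[NUM_LOCKS][NUM_LOCKS];

/*
 * Record one code path as the sequence of locks it takes, outermost first.
 * Returns true if the path takes some pair of locks in the reverse of an
 * order recorded by an earlier path, i.e. an AB-BA inversion.
 */
bool record_path(const enum lock_id *seq, size_t n)
{
    bool inversion = false;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (seq[i] == seq[j])
                continue;               /* recursive use is not modelled */
            if (order[seq[j]][seq[i]])
                inversion = true;       /* someone else nests them the other way */
            order[seq[i]][seq[j]] = true;
        }
    }
    return inversion;
}
```

Feeding it the open path (`BD_MUTEX`, then `RECONFIG_MUTEX`) followed by the hot-add path (`RECONFIG_MUTEX`, then `BD_MUTEX`) makes the second call return true, which is the situation in the two columns above.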


2006-09-29 14:39:52

by Matthew Wilcox

Subject: Re: 2.6.18-mm2

On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> aic7xxx oopses on boot:
>
> PCI: Setting latency timer of device 0000:00:0e.0 to 64
> IRQ handler type mismatch for IRQ 0

Of course, this isn't a scsi problem, it's a peecee hardware problem.
Or maybe a PCI subsystem problem. But it's clearly not aic7xxx's fault.

> PCI: Cannot allocate resource region 0 of device 0000:00:0e.0

That's not good. Might be part of the problem.

> PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
> PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.

This is the direct problem. You've got no irq.
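
For readers wondering where "error -16" in the log comes from: 16 is EBUSY on Linux, and the trace shows `request_irq()` failing inside `setup_irq()`. A simplified sketch of the sharing check follows, under the assumption that the line already in use on IRQ 0 (presumably the timer) was registered as non-shareable while the SCSI driver asked for a shareable one. The `sketch_` names and the flag value are illustrative, not the kernel's actual definitions.

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_IRQF_SHARED 0x80u   /* illustrative "may share this line" flag */

/* Hypothetical, cut-down stand-in for the kernel's per-handler record. */
struct sketch_irqaction {
    unsigned int flags;
};

/*
 * Returns 0 if new_action may be installed on a line whose current handler
 * is old (NULL if the line is free), or -EBUSY if the two cannot coexist.
 * Sharing is only allowed when BOTH handlers asked for it.
 */
int sketch_setup_irq(const struct sketch_irqaction *old,
                     const struct sketch_irqaction *new_action)
{
    if (old == NULL)
        return 0;                       /* line is free */
    if (!(old->flags & new_action->flags & SKETCH_IRQF_SHARED))
        return -EBUSY;                  /* "IRQ handler type mismatch" */
    return 0;
}
```

With no IRQ assigned, the device's irq field is 0, so the request lands on a line that already has an exclusive owner and is refused.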

2006-09-29 15:19:49

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:

> bisecting would be good, thanks. It might be quicker to strip down the .config
> though.

Well, I started with a clean 2.6.18 tree, and did a 'quilt push origin.patch'
to put just the stuff already in Linus's tree on. Unfortunately, *that*
dies a *different* horrid death after 2 to 5 minutes or so of uptime (and
this one is also a locked-up-hard power-cycle hang, no alt-sysrq). Of the
3 or 4 times I triggered it, it managed to scribble the oops down into
syslog before totally wedging:

BUG: unable to handle kernel paging request at virtual address 00100104
printing eip:
c014c8b3
*pde = 00000000
Oops: 0002 [#1]
PREEMPT
Modules linked in: xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class ohci1394 intel_agp ieee1394 agpgart yenta_socket rsrc_nonstatic pcmcia_core rtc
CPU: 0
EIP: 0060:[<c014c8b3>] Not tainted VLI
EFLAGS: 00010083 (2.6.18-test #1)
EIP is at drain_freelist+0x45/0x9b
eax: 00200200 ebx: e5ce0540 ecx: effe10c0 edx: 00100100
esi: effdf4c0 edi: 00000001 ebp: effd2f54 esp: effd2f40
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 3, ti=effd2000 task=c56cf000 task.ti=effd2000)
Stack: 00000002 effe18c0 effdf4c0 effe18c0 efe006c0 effd2f64 c014d8ea 00000296
c053df60 effd2f80 c0120f91 c014d864 00000000 efe006d0 efe006c0 efe006c8
effd2fc4 c01214d6 00000001 00000000 00000001 00010000 00000000 00000000
Call Trace:
[<c014d8ea>] cache_reap+0x86/0xc4
[<c0120f91>] run_workqueue+0x8f/0xe0
[<c01214d6>] worker_thread+0xe1/0x113
[<c0123861>] kthread+0xb0/0xdf
[<c0103813>] kernel_thread_helper+0x7/0x10
DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10

Leftover inexact backtrace:

[<c0103c4d>] show_trace_log_lvl+0x12/0x25
[<c0103cec>] show_stack_log_lvl+0x8c/0x97
[<c0103e18>] show_registers+0x121/0x1b2
[<c0104041>] die+0x198/0x273
[<c034fce1>] do_page_fault+0x3f5/0x4c2
[<c034e819>] error_code+0x39/0x40
[<c014d8ea>] cache_reap+0x86/0xc4
[<c0120f91>] run_workqueue+0x8f/0xe0
[<c01214d6>] worker_thread+0xe1/0x113
[<c0123861>] kthread+0xb0/0xdf
[<c0103813>] kernel_thread_helper+0x7/0x10
=======================
Code: f0 ff ff ff 40 14 8b 5e 14 39 d3 75 19 fb 89 e0 25 00 f0 ff ff ff 48 14 8b 40 08 a8 08 74 59 e8 99 04 20 00 eb 52 8b 13 8b 43 04 <89> 42 04 89 10 c7 03 00 01 10 00 c7 43 04 00 02 20 00 8b 46 18
EIP: [<c014c8b3>] drain_freelist+0x45/0x9b SS:ESP 0068:effd2f40
<6>note: events/0[3] exited with preempt_count 1

Now the question arises - is this the same bug I was seeing under the full -mm2,
and all the other patches just move the manifestation around, or is this fixed
by another -mm2 patch, and my original bug report is something else?

I may have to learn how to use 'git bisect' to shoot this one, it appears.



2006-09-29 16:50:53

by Alan

Subject: Re: 2.6.18-mm2

Ar Gwe, 2006-09-29 am 08:39 -0600, ysgrifennodd Matthew Wilcox:
> On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > aic7xxx oopses on boot:
> >
> > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > IRQ handler type mismatch for IRQ 0
>
> Of course, this isn't a scsi problem, it's a peecee hardware problem.
> Or maybe a PCI subsystem problem. But it's clearly not aic7xxx's fault.

AIC7xxx finding it has no IRQ configured is valid (annoying, stupid and
valid) so the driver should check before requesting "no IRQ"

2006-09-29 19:47:18

by Christoph Lameter

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006, Valdis Klētnieks wrote:

> I may have to learn how to use 'git bisect' to shoot this one, it appears.

Or enable SLAB_DEBUG?

2006-09-29 19:49:46

by Andrew Morton

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 11:19:41 -0400
Valdis Klētnieks wrote:

> On Thu, 28 Sep 2006 20:29:31 PDT, Andrew Morton said:
>
> > bisecting would be good, thanks. It might be quicker to strip down the .config
> > though.
>
> Well, I started with a clean 2.6.18 tree, and did a 'quilt push origin.patch'
> to put just the stuff already in Linus's tree on. Unfortunately, *that*
> dies a *different* horrid death after 2 to 5 minutes or so of uptime (and
> this one is also a locked-up-hard power-cycle hang, no alt-sysrq). Of the
> 3 or 4 times I triggered it, it managed to scribble the oops down into
> syslog before totally wedging:
>
> BUG: unable to handle kernel paging request at virtual address 00100104
> printing eip:
> c014c8b3
> *pde = 00000000
> Oops: 0002 [#1]
> PREEMPT
> Modules linked in: xt_SECMARK xt_CONNSECMARK ip6table_mangle iptable_mangle nf_conntrack_ftp xt_pkttype ipt_REJECT nf_conntrack_ipv4 ipt_LOG iptable_filter ip_tables xt_tcpudp nf_conntrack_ipv6 xt_state nf_conntrack ip6t_LOG xt_limit ip6table_filter ip6_tables x_tables thermal processor fan button battery ac nfnetlink i8k floppy nvram orinoco_cs orinoco hermes pcmcia firmware_class ohci1394 intel_agp ieee1394 agpgart yenta_socket rsrc_nonstatic pcmcia_core rtc
> CPU: 0
> EIP: 0060:[<c014c8b3>] Not tainted VLI
> EFLAGS: 00010083 (2.6.18-test #1)
> EIP is at drain_freelist+0x45/0x9b
> eax: 00200200 ebx: e5ce0540 ecx: effe10c0 edx: 00100100
> esi: effdf4c0 edi: 00000001 ebp: effd2f54 esp: effd2f40
> ds: 007b es: 007b ss: 0068
> Process events/0 (pid: 3, ti=effd2000 task=c56cf000 task.ti=effd2000)
> Stack: 00000002 effe18c0 effdf4c0 effe18c0 efe006c0 effd2f64 c014d8ea 00000296
> c053df60 effd2f80 c0120f91 c014d864 00000000 efe006d0 efe006c0 efe006c8
> effd2fc4 c01214d6 00000001 00000000 00000001 00010000 00000000 00000000
> Call Trace:
> [<c014d8ea>] cache_reap+0x86/0xc4
> [<c0120f91>] run_workqueue+0x8f/0xe0
> [<c01214d6>] worker_thread+0xe1/0x113
> [<c0123861>] kthread+0xb0/0xdf
> [<c0103813>] kernel_thread_helper+0x7/0x10
> DWARF2 unwinder stuck at kernel_thread_helper+0x7/0x10
>
> Leftover inexact backtrace:
>
> [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> [<c0103cec>] show_stack_log_lvl+0x8c/0x97
> [<c0103e18>] show_registers+0x121/0x1b2
> [<c0104041>] die+0x198/0x273
> [<c034fce1>] do_page_fault+0x3f5/0x4c2
> [<c034e819>] error_code+0x39/0x40
> [<c014d8ea>] cache_reap+0x86/0xc4
> [<c0120f91>] run_workqueue+0x8f/0xe0
> [<c01214d6>] worker_thread+0xe1/0x113
> [<c0123861>] kthread+0xb0/0xdf
> [<c0103813>] kernel_thread_helper+0x7/0x10
> =======================
> Code: f0 ff ff ff 40 14 8b 5e 14 39 d3 75 19 fb 89 e0 25 00 f0 ff ff ff 48 14 8b 40 08 a8 08 74 59 e8 99 04 20 00 eb 52 8b 13 8b 43 04 <89> 42 04 89 10 c7 03 00 01 10 00 c7 43 04 00 02 20 00 8b 46 18
> EIP: [<c014c8b3>] drain_freelist+0x45/0x9b SS:ESP 0068:effd2f40
> <6>note: events/0[3] exited with preempt_count 1
>
> Now the question arises - is this the same bug I was seeing under the full -mm2,
> and all the other patches just move the manifestation around, or is this fixed
> by another -mm2 patch, and my original bug report is something else?

I'd expect it's the same bug - slab data structures have gone bad.
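
One detail in the register dump fits that reading: edx (00100100) and eax (00200200) match the kernel's LIST_POISON1 and LIST_POISON2 values, and the faulting address 00100104 is edx + 4. That is what a `list_del()` on an entry that was already deleted would produce on 32-bit x86. This is an inference from the dump, not something established in the thread. The sketch below is a userspace re-implementation of the list primitives for illustration only; `list_init`, `list_entry_is_poisoned` and `fault_address_i386` are names invented here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Same numeric values the kernel uses to poison deleted list entries. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0x00100100)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0x00200200)

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink the entry, then poison its pointers so reuse faults loudly. */
void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;    /* first store: to next + offsetof(prev) */
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

bool list_entry_is_poisoned(const struct list_head *entry)
{
    return entry->next == LIST_POISON1 && entry->prev == LIST_POISON2;
}

/*
 * Address the first store of a second list_del() would hit on i386,
 * where pointers are 4 bytes and prev therefore sits at offset 4.
 */
uint32_t fault_address_i386(uint32_t next)
{
    return next + 4u;
}
```

Calling `fault_address_i386(0x00100100)` gives 0x00100104, the address reported in the "unable to handle kernel paging request" line.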

> I may have to learn how to use 'git bisect' to shoot this one, it appears.

That's one way.

Again: how come nobody else is hitting this? Something's different.

What device drivers are being used?

2006-09-29 20:23:06

by Ingo Molnar

Subject: Re: 2.6.18-mm2


* Andi Kleen <[email protected]> wrote:

> BTW I was planning to make LOCAL_APIC unconditional on i386 too like
> on x86-64.

please dont - embedded doesnt need it most of the time. At most make it
default y and dependent on EMBEDDED.

Ingo

2006-09-29 20:36:22

by Andi Kleen

Subject: Re: 2.6.18-mm2

On Friday 29 September 2006 22:14, Ingo Molnar wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > BTW I was planning to make LOCAL_APIC unconditional on i386 too like
> > on x86-64.
>
> please dont - embedded doesnt need it most of the time.

What do you mean by "not need"? Local APIC is an infinitely better
interface than PIC and faster. On embedded too this makes a lot of sense.
And a lot of modern systems don't even work anymore without
APIC enabled because Windows uses it and the BIOS haven't been
tested without it (e.g. you often find totally broken code paths
in the AML for PIC mode)

The code size also isn't a good argument because the delta
isn't that big:

text data bss dec hex filename
3303894 694980 436420 4435294 43ad5e obj32-up/vmlinux
3266532 665732 402372 4334636 42242c obj32-up-noapic/vmlinux

~63K. I don't think such a small difference is worth the maintenance
overhead of the many ifdefs and hairy code paths. If someone really
cared about that memory they could save much more by just optimizing
some dynamic memory allocations instead, which waste much more.

The only reasons not to use it are old broken BIOSes or old CPUs
without local APIC, but those can all be handled at runtime like
the 64bit kernel does.

The SUSE kernel has an imho good default heuristic based on
DMI date, DMI number of processors and of course trusting the ACPI tables
(don't use if disabled there)

> At most make it
> default y and dependent on EMBEDDED.

The whole point is to get rid of the many ifdefs and frequent
compile breakage of it. This would defeat it.

-Andi

2006-09-29 20:40:39

by Ingo Molnar

Subject: Re: 2.6.18-mm2


* Andi Kleen <[email protected]> wrote:

> On Friday 29 September 2006 22:14, Ingo Molnar wrote:
> >
> > * Andi Kleen <[email protected]> wrote:
> >
> > > BTW I was planning to make LOCAL_APIC unconditional on i386 too like
> > > on x86-64.
> >
> > please dont - embedded doesnt need it most of the time.
>
> What do you mean by "not need"? Local APIC is an infinitely better
> interface than PIC and faster. On embedded too this makes a lot of
> sense.

it's just not present or hardware-disabled.

> And a lot of modern systems don't even work anymore without APIC
> enabled because Windows uses it and the BIOS haven't been tested
> without it (e.g. you often find totally broken code paths in the AML
> for PIC mode)
>
> The code size also isn't a good argument because the delta
> isn't that big:
>
> text data bss dec hex filename
> 3303894 694980 436420 4435294 43ad5e obj32-up/vmlinux
> 3266532 665732 402372 4334636 42242c obj32-up-noapic/vmlinux
>
> ~63K.

63K???? You've got to be kidding. That's huge. That's ~10% of the
minconfig kernel. Even 1K would be bad. We did config hacks for half a K
win. Please ... dont cripple the i686 kernel.

Ingo

2006-09-29 20:58:38

by Andi Kleen

Subject: Re: 2.6.18-mm2

On Friday 29 September 2006 22:32, Ingo Molnar wrote:
>
> * Andi Kleen <[email protected]> wrote:
>
> > On Friday 29 September 2006 22:14, Ingo Molnar wrote:
> > >
> > > * Andi Kleen <[email protected]> wrote:
> > >
> > > > BTW I was planning to make LOCAL_APIC unconditional on i386 too like
> > > > on x86-64.
> > >
> > > please dont - embedded doesnt need it most of the time.
> >
> > What do you mean by "not need"? Local APIC is an infinitely better
> > interface than PIC and faster. On embedded too this makes a lot of
> > sense.
>
> it's just not present or hardware-disabled.

The kernel won't use it then. Also on next year's embedded systems
this will likely change.

>
> > And a lot of modern systems don't even work anymore without APIC
> > enabled because Windows uses it and the BIOS haven't been tested
> > without it (e.g. you often find totally broken code paths in the AML
> > for PIC mode)
> >
> > The code size also isn't a good argument because the delta
> > isn't that big:
> >
> > text data bss dec hex filename
> > 3303894 694980 436420 4435294 43ad5e obj32-up/vmlinux
> > 3266532 665732 402372 4334636 42242c obj32-up-noapic/vmlinux
> >
> > ~63K.
>
> 63K???? You've got to be kidding. That's huge. That's ~10% of the
> minconfig kernel.

A large part of it is the ACPI support. Without that it's smaller:

text data bss dec hex filename
2978333 640752 416100 4035185 3d9271 obj32-up-noacpi/vmlinux
2947808 612088 400292 3960188 3c6d7c obj32-up-noacpi-noapic/vmlinux

~30k
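
For anyone checking the arithmetic, the per-column deltas of the two `size` tables can be worked out as below. The struct and function names are made up for this sketch; the numbers are the ones quoted in the thread.

```c
/* One row of `size` output: section sizes in bytes. */
struct size_row {
    long text;
    long data;
    long bss;
};

/* Per-section difference between a build with a feature and one without. */
struct size_row size_delta(struct size_row with, struct size_row without)
{
    struct size_row d = {
        with.text - without.text,
        with.data - without.data,
        with.bss - without.bss,
    };
    return d;
}

/* Sum of all three sections, i.e. the "dec" column. */
long size_total(struct size_row r)
{
    return r.text + r.data + r.bss;
}
```

With ACPI built in, the APIC delta comes to 37362 bytes of text, 29248 of data and 34048 of bss (100658 in total). Without ACPI it is 30525 text, 28664 data and 15808 bss (74997 in total), so the "~30k" figure corresponds to the text column alone.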

You might be able to do without ACPI on your embedded system.

> Even 1K would be bad. We did config hacks for half a K
> win.

<rant>

Sorry, but that's silly. I did some measurements and just tweaking a
few dynamic allocation pigs saves you much more memory without
uglifying the code. In fact in most configurations you can find dynamic
users who need more than the complete kernel text - this means
even if you got the kernel text down to 0 bytes you wouldn't save as
much as simple tweaks in the dynamic pig.

I know it's easy to do size vmlinux and complain about bloat there,
but that is really not where the real bloat is. Finding the
real ones takes more effort of course.

And maintainability is much more important. Too many CONFIGs
just waste developer time and this one is particularly nasty
because it tends to break all the time.

And if you really want to make vmlinux smaller anyways you usually
get much better payoff by concentrating on inline functions than
uglifying the code with more CONFIGs. A few people did excellent
work on that recently and the kernel actually shrunk for most people, not
just some extreme config. But CONFIGs just
cause everybody more work for usually very little payoff.

</rant>

-andi

2006-09-29 21:19:19

by Alan

Subject: Re: 2.6.18-mm2

Ar Gwe, 2006-09-29 am 22:58 +0200, ysgrifennodd Andi Kleen:
> 2978333 640752 416100 4035185 3d9271 obj32-up-noacpi/vmlinux
> 2947808 612088 400292 3960188 3c6d7c obj32-up-noacpi-noapic/vmlinux
>
> ~30k

30K is a lot on an embedded x86 box.

> You might be able to do without ACPI on your embedded system.

Most embedded people don't use ACPI for some strange reason related to
the fact it's bloated, hard to get right in the firmware and sucks. That
is one that makes sense to keep.

Alan

2006-09-29 21:22:28

by Ingo Molnar

Subject: [patch] fix !apic build breakage


* Andi Kleen <[email protected]> wrote:

> > 63K???? You've got to be kidding. That's huge. That's ~10% of the
> > minconfig kernel.
>
> A large part of it is the ACPI support. Without that it's smaller:
>
> text data bss dec hex filename
> 2978333 640752 416100 4035185 3d9271 obj32-up-noacpi/vmlinux
> 2947808 612088 400292 3960188 3c6d7c obj32-up-noacpi-noapic/vmlinux
>
> ~30k

that's still huge! The patch below fixes the panic_on_unrecovered_nmi
thing ...

> You might be able to do without ACPI on your embedded system.

of course many people do.

> > Even 1K would be bad. We did config hacks for half a K
> > win.
>
> <rant>
>
> Sorry, but that's silly. I did some measurements and just tweaking a
> few dynamic allocation pigs saves you much more memory without
> uglifying the code. In fact in most configurations you can find
> dynamic users who need more than the complete kernel text - this means
> even if you got the kernel text down to 0 bytes you wouldn't save as
> much as simple tweaks in the dynamic pig.

so please do it. The fact that there are /other/ reductions possible
doesnt mean we can be lax. It's like: "oh, the buddy allocator scales
better now, so we can slow down the SLAB allocator". No, kernel size is
like scalability: we need a million small steps.

the panic_on_unrecovered_nmi thing is gross anyway: it has no place in
kernel.h, it should go into include/[asm-i386|x86_64]/nmi.h and not the
generic headers. There the prototype can be made #ifdef APIC, hence
eliminating the #ifdefs from traps.c. (that's all we care about anyway)

please dont throw away a perfectly fine config option.

Ingo

---------------->
From: Ingo Molnar <[email protected]>
Subject: fix !apic build breakage

fix !apic build breakage.

Signed-off-by: Ingo Molnar <[email protected]>

Index: linux-hrt-mm.q/arch/i386/kernel/traps.c
===================================================================
--- linux-hrt-mm.q.orig/arch/i386/kernel/traps.c
+++ linux-hrt-mm.q/arch/i386/kernel/traps.c
@@ -709,8 +709,10 @@ mem_parity_error(unsigned char reason, s
"CPU %d.\n", reason, smp_processor_id());
printk(KERN_EMERG "You probably have a hardware problem with your RAM "
"chips\n");
+#ifdef CONFIG_X86_LOCAL_APIC
if (panic_on_unrecovered_nmi)
panic("NMI: Not continuing");
+#endif

printk(KERN_EMERG "Dazed and confused, but trying to continue\n");

@@ -749,8 +751,10 @@ unknown_nmi_error(unsigned char reason,
printk(KERN_EMERG "Uhhuh. NMI received for unknown reason %02x on "
"CPU %d.\n", reason, smp_processor_id());
printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
+#ifdef CONFIG_X86_LOCAL_APIC
if (panic_on_unrecovered_nmi)
panic("NMI: Not continuing");
+#endif

printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
}
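
The header-side alternative described in the message (keep the `#ifdef` in one header so that call sites need none) is commonly done with a constant stub. The following is a generic userspace sketch of that pattern, not the contents of either patch in this thread. `SKETCH_CONFIG_LOCAL_APIC`, `sketch_panic_on_unrecovered_nmi` and `sketch_nmi_should_panic` are hypothetical names.

```c
/*
 * "Header" part: the only place that knows about the config option.
 * When the option is off, the flag becomes the constant 0 and the
 * compiler can drop any code guarded by it.
 */
#ifdef SKETCH_CONFIG_LOCAL_APIC
int sketch_panic_on_unrecovered_nmi = 1;   /* would live in the APIC/NMI code */
#else
#define sketch_panic_on_unrecovered_nmi 0
#endif

/*
 * "Call site" part: no #ifdef needed here. Returns 1 where the real
 * handler would call panic(), 0 where it would try to continue.
 */
int sketch_nmi_should_panic(void)
{
    if (sketch_panic_on_unrecovered_nmi)
        return 1;
    return 0;
}
```

Built without `-DSKETCH_CONFIG_LOCAL_APIC`, the function always returns 0 and the branch is dead code. Built with it, the flag is a real variable that a sysctl could toggle.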

2006-09-29 21:37:00

by Dave Jones

Subject: Re: 2.6.18-mm2

On Fri, Sep 29, 2006 at 10:36:15PM +0200, Andi Kleen wrote:

> The only reasons not to use it are old broken BIOSes or old CPUs
> without local APIC, but those can all be handled at runtime like
> the 64bit kernel does.
>
> The SUSE kernel has an imho good default heuristic based on
> DMI date, DMI number of processors and of course trusting the ACPI tables
> (don't use if disabled there)

Any plans to push those heuristics to mainline too ?

Dave

2006-09-29 21:45:04

by Andi Kleen

Subject: Re: [patch] fix !apic build breakage


> so please do it. The fact that there are /other/ reductions possible
> doesnt mean we can be lax.

Well with that argument we would put ifdefs nearly everywhere
because most subsystems have some code that you don't need in some
obscure configuration.

Do we do that? No. Clean and maintainable code is more important.

This particular case of APIC CONFIG is just a historical wart.
I eliminated it on 64bit a long time ago (and it undoubtedly
saved me hours of fixing compilation issues and it made the code
cleaner too) and i386 is definitely ripe for that soon too.

> It's like: "oh, the buddy allocator scales
> better now, so we can slow down the SLAB allocator". No, kernel size is
> like scalability: we need a million small steps.

Sure you could do a million steps. Just for each step you need
to look at the ratio of maintainability impact to usefulness.
IMHO microconfig usually loses badly there.

[As terminology i call microconfig anything that requires ifdefs
inside .c or .h files. CONFIGs that only appear in Makefiles are usually
not a problem]

There are lots of other steps to less bloat that make sense, but please don't
advocate that microCONFIG disease.

> the panic_on_unrecovered_nmi thing is gross anyway: it has no place in
> kernel.h, it should go into include/[asm-i386|x86_64]/nmi.h and not the
> generic headers. There the prototype can be made #ifdef APIC, hence
> eliminating the #ifdefs from traps.c. (that's all we care about anyway)

Yes I fixed it already in a cleaner way (without ugly ifdefs)

ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/patches/nmi-sysctl-cleanup

> please dont throw away a perfectly fine config option.

I can't count how many times that silly option already got broken by
changes in the APIC code. I definitely wouldn't describe it as "perfectly fine",
more as "fragile and tends to fall over when you even look at it".

-Andi

2006-09-29 21:47:01

by Andi Kleen

Subject: Re: 2.6.18-mm2

On Friday 29 September 2006 23:36, Dave Jones wrote:
> On Fri, Sep 29, 2006 at 10:36:15PM +0200, Andi Kleen wrote:
>
> > The only reasons not to use it are old broken BIOSes or old CPUs
> > without local APIC, but those can all be handled at runtime like
> > the 64bit kernel does.
> >
> > The SUSE kernel has an imho good default heuristic based on
> > DMI date, DMI number of processors and of course trusting the ACPI tables
> > (don't use if disabled there)
>
> Any plans to push those heuristics to mainline too ?

Yes, probably not for .19 though. I wanted to do it together
with the removal of the APIC CONFIGs and a lot of cleanup in this
area that will come from that.

-Andi

2006-09-29 21:49:18

by Ingo Molnar

Subject: Re: [patch] fix !apic build breakage


* Andi Kleen <[email protected]> wrote:

> > please dont throw away a perfectly fine config option.
>
> I can't count how many times that silly option already got broken by changes
> in the APIC code. I definitely wouldn't describe it as "perfectly
> fine", more as "fragile and tends to fall over when you even look at
> it".

i disagree. I frequently (daily) boot with apic-on and apic-off configs.
Very rarely does it break. Today it did, took me 30 seconds and 531
milliseconds to fix. Spent much more time writing these silly emails ...

Ingo

2006-09-29 21:52:17

by Frederik Deweerdt

Subject: Re: 2.6.18-mm2

On Fri, Sep 29, 2006 at 06:15:42PM +0100, Alan Cox wrote:
> Ar Gwe, 2006-09-29 am 08:39 -0600, ysgrifennodd Matthew Wilcox:
> > On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > > aic7xxx oopses on boot:
> > >
> > > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > > IRQ handler type mismatch for IRQ 0
> >
> > Of course, this isn't a scsi problem, it's a peecee hardware problem.
> > Or maybe a PCI subsystem problem. But it's clearly not aic7xxx's fault.
>
> AIC7xxx finding it has no IRQ configured is valid (annoying, stupid and
> valid) so the driver should check before requesting "no IRQ"
>
Alan,

Does this patch make sense in that case? If yes, I'll put up a patch
for the remaining cases in the drivers/scsi/aic7xxx/ directory.
Also, aic7xxx's coding style would put parentheses around the returned
value, should I follow it?

Regards,
Frederik

diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..38f5ca7 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -185,6 +185,9 @@ ahc_linux_pci_dev_probe(struct pci_dev *
int error;
struct device *dev = &pdev->dev;

+ if (!pdev->irq)
+ return -ENODEV;
+
pci = pdev;
entry = ahc_find_pci_device(pci);
if (entry == NULL)

2006-09-29 23:15:38

by J.A. Magallón

Subject: Re: 2.6.18-mm2

On Fri, 29 Sep 2006 08:39:49 -0600, Matthew Wilcox <[email protected]> wrote:

> On Fri, Sep 29, 2006 at 03:57:38PM +0200, J.A. Magallón wrote:
> > aic7xxx oopses on boot:
> >
> > PCI: Setting latency timer of device 0000:00:0e.0 to 64
> > IRQ handler type mismatch for IRQ 0
>
> Of course, this isn't a scsi problem, it's a peecee hardware problem.
> Or maybe a PCI subsystem problem. But it's clearly not aic7xxx's fault.
>
> > PCI: Cannot allocate resource region 0 of device 0000:00:0e.0
>
> That's not good. Might be part of the problem.
>
> > PCI: Enabling device 0000:00:0e.0 (0000 -> 0003)
> > PCI: No IRQ known for interrupt pin A of device 0000:00:0e.0. Probably buggy MP table.
>
> This is the direct problem. You've got no irq.
>

Thanks...

Now I have just realized this:

00:0d.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
00:0e.0 SCSI storage controller: Adaptec AHA-2940U2/U2W / 7890/7891 (rev 01)

leda:~# lsscsi -Hv
[0] aic7xxx
dir: /sys/class/scsi_host/host0
device dir: /sys/devices/pci0000:00/0000:00:0d.0/host0
[1] ata_piix
dir: /sys/class/scsi_host/host1
device dir: /sys/devices/pci0000:00/0000:00:07.1/host1
[2] ata_piix
dir: /sys/class/scsi_host/host2
device dir: /sys/devices/pci0000:00/0000:00:07.1/host2

leda:~# lsscsi
[0:0:0:0] disk IBM IC35L018UWD210-0 S5BS /dev/sda
[0:0:5:0] cd/dvd TOSHIBA CD-ROM XM-6401TA 1015 /dev/sr0
[2:0:0:0] disk IOMEGA ZIP 250 51.G /dev/sdb

Device 00:0e.0 is the 2940, which has nothing attached.
Who's to blame? The BIOS, because it assigns no interrupt when no devices are
connected to the bus? Or the kernel, which should understand something like
'this device is disabled'?

I can try to move the cdrom to the 2940 and see what happens...

Thanks, I will try the patch posted, it looks something like what I said
above, disable the device.

--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2007.0 (Cooker) for i586
Linux 2.6.18-jam02 (gcc 4.1.1 20060724 (prerelease) (4.1.1-3mdk)) #1 SMP PREEMPT

2006-09-29 23:18:31

by Alan

Subject: Re: 2.6.18-mm2

On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> Does this patch make sense in that case? If yes, I'll put up a patch
> for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> Also, aic7xxx's coding style would put parentheses around the returned
> value; should I follow it?

Yes - but perhaps with a warning message so users know why ?

As to coding style - kernel style is unbracketed so I wouldn't worry
about either.


2006-09-30 00:06:10

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:

(Adding a bunch of people to the cc: list now that I have a clue what is
going on....)

> I'd expect it's the same bug - slab data structures have gone bad.

*bing*! We have a winner. A quick check showed the kernel wasn't built with
slab debugging enabled, so I turned on the more obvious options, and got
rewarded with a traceback..

> Again: how come nobody else is hitting this? Something's different.

gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
Core extras-development). Kernel is still a 2.6.18 with *only* the
origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
a change in the code since 01/03/2004 - hopefully there's been no unintentional
API change on the kernel side since then...

Here's the traceback I got:

slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
[<c0103ad2>] dump_trace+0x64/0x1cd
[<c0103c4d>] show_trace_log_lvl+0x12/0x25
[<c010415f>] show_trace+0xd/0x10
[<c01041fc>] dump_stack+0x19/0x1b
[<c014c796>] __slab_error+0x17/0x1c
[<c014cdac>] cache_free_debugcheck+0xaf/0x230
[<c014d43e>] kfree+0x59/0x8c
[<c02dc04a>] ioctl_standard_call+0x1da/0x218
[<c02dc275>] wireless_process_ioctl+0x55/0x312
[<c02d3750>] dev_ioctl+0x45f/0x49a
[<c02c92aa>] sock_ioctl+0x1b3/0x1c6
[<c0160322>] do_ioctl+0x22/0x67
[<c01605a5>] vfs_ioctl+0x23e/0x251
[<c01605ff>] sys_ioctl+0x47/0x64
[<c0102cd3>] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb

Leftover inexact backtrace:

=======================
de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.

Repeated, over and over, just about once a second.
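
To make the report above easier to read: a red zone is a guard word placed on each side of an object, filled with a known pattern and checked when the object is freed. The following is a minimal user-space sketch of that idea, not the kernel's slab code. The names rz_alloc, rz_check, rz_free and RZ_MAGIC are made up for this illustration; the pattern merely reuses the 0x170fc2a5 value printed in the log.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Guard pattern; borrowed from the log above purely for illustration. */
#define RZ_MAGIC 0x170fc2a5u

/* Allocate `size` bytes with one guard word before and one after the object. */
static unsigned char *rz_alloc(size_t size)
{
	uint32_t magic = RZ_MAGIC;
	unsigned char *raw = malloc(size + 2 * sizeof(magic));

	if (!raw)
		return NULL;
	memcpy(raw, &magic, sizeof(magic));
	memcpy(raw + sizeof(magic) + size, &magic, sizeof(magic));
	return raw + sizeof(magic);
}

/*
 * Return a bitmask: bit 0 set if the leading guard was overwritten,
 * bit 1 set if the trailing guard was overwritten, 0 if both are intact.
 */
static int rz_check(const unsigned char *obj, size_t size)
{
	uint32_t before, after;
	int damaged = 0;

	assert(obj != NULL);
	memcpy(&before, obj - sizeof(before), sizeof(before));
	memcpy(&after, obj + size, sizeof(after));
	if (before != RZ_MAGIC)
		damaged |= 1;
	if (after != RZ_MAGIC)
		damaged |= 2;
	return damaged;
}

/* Release an object obtained from rz_alloc(). */
static void rz_free(unsigned char *obj)
{
	if (obj)
		free(obj - sizeof(uint32_t));
}
```

With a 32-byte object, rz_check() returns 0 until something writes to obj[32]; after a single '\0' lands there it returns 2 (trailing guard damaged). On a little-endian machine such a write clears only the low byte of the guard, which would be consistent with the 0x170fc200 reading above, although that is a guess and not something this thread has established.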

A quick strace of gkrellm finds these likely ioctl's causing the problem:

% grep ioctl /tmp/foo2 | sort -u | more
ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0

Since I'm using an orinoco-based card, these 2 look like the most likely
candidates. WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
stable for me. I'll let somebody else argue over what path these took that
I never tripped over them in an earlier -mm before they hit Linus's tree...

commit baef186519c69b11cf7e48c26e75feb1e6173baa
Author: John W. Linville <[email protected]>
Date: Fri Sep 8 16:04:05 2006 -0400

[PATCH] WE-21 support (core API)

This is version 21 of the Wireless Extensions. Changelog :
o finishes migrating the ESSID API (remove the +1)
o netdev->get_wireless_stats is no more
o long/short retry

This is a redacted version of a patch originally submitted by Jean
Tourrilhes. I removed most of the additions, in order to minimize
future support requirements for nl80211 (or other WE successor).

CC: Jean Tourrilhes <[email protected]>
Signed-off-by: John W. Linville <[email protected]>

commit eeec9f1a931262d69811135092c8447d6dccc3e6
Author: Jean Tourrilhes <[email protected]>
Date: Tue Aug 29 18:02:31 2006 -0700

[PATCH] WE-21 for orinoco

Signed-off-by: Jean Tourrilhes <[email protected]>
Signed-off-by: John W. Linville <[email protected]>





2006-09-30 01:30:44

by Andrew Morton

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 20:01:54 -0400
[email protected] wrote:

> On Fri, 29 Sep 2006 12:45:58 PDT, Andrew Morton said:
>
> (Adding a bunch of people to the cc: list now that I have a clue what is
> going on....)
>
> > I'd expect it's the same bug - slab data structures have gone bad.
>
> *bing*! We have a winner. A quick check showed the kernel wasn't built with
> slab debugging enabled, so I turned on the more obvious options, and got
> rewarded with a traceback..

doh. I'd assumed that CONFIG_DEBUG_SLAB was enabled :(

> > Again: how come nobody else is hitting this? Something's different.
>
> gkrellm and wireless (specifically, gkrellm-wifi-0.9.12-3.fc6 from Fedora
> Core extras-development). Kernel is still a 2.6.18 with *only* the
> origin.patch from -mm2 applied. Note that the gkrellm plugin hasn't had
> a change in the code since 01/03/2004 - hopefully there's been no unintentional
> API change on the kernel side since then...
>
> Here's the traceback I got:
>
> slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
> [<c0103ad2>] dump_trace+0x64/0x1cd
> [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> [<c010415f>] show_trace+0xd/0x10
> [<c01041fc>] dump_stack+0x19/0x1b
> [<c014c796>] __slab_error+0x17/0x1c
> [<c014cdac>] cache_free_debugcheck+0xaf/0x230
> [<c014d43e>] kfree+0x59/0x8c
> [<c02dc04a>] ioctl_standard_call+0x1da/0x218
> [<c02dc275>] wireless_process_ioctl+0x55/0x312
> [<c02d3750>] dev_ioctl+0x45f/0x49a
> [<c02c92aa>] sock_ioctl+0x1b3/0x1c6
> [<c0160322>] do_ioctl+0x22/0x67
> [<c01605a5>] vfs_ioctl+0x23e/0x251
> [<c01605ff>] sys_ioctl+0x47/0x64
> [<c0102cd3>] syscall_call+0x7/0xb
> DWARF2 unwinder stuck at syscall_call+0x7/0xb
>
> Leftover inexact backtrace:
>
> =======================
> de57e16c: redzone 1:0x170fc2a5, redzone 2:0x170fc200.
>
> Repeated, over and over, just about once a second.
>
> A quick strace of gkrellm finds these likely ioctl's causing the problem:
>
> % grep ioctl /tmp/foo2 | sort -u | more
> ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
> ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
> ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0

Yes. The main thing which those WE-21 patches do is to shorten the size of
various buffers which are used in wireless ioctls.

> Since I'm using an orinoco-based card, these 2 look like the most likely
> candidates. WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
> stable for me.

The WE-21 patches weren't in Jeff's tree for -mm1 or for -mm2. They
appeared there transiently then quickly went mainline. They _might_ have
been in the wireless git tree, although I often drop that due to git woes.
But that hasn't happened recently....

> I'll let somebody else argue over what path these took that
> I never tripped over them in an earlier -mm before they hit Linus's tree...
>
> commit baef186519c69b11cf7e48c26e75feb1e6173baa
> Author: John W. Linville <[email protected]>
> Date: Fri Sep 8 16:04:05 2006 -0400
>
> [PATCH] WE-21 support (core API)
>
> This is version 21 of the Wireless Extensions. Changelog :
> o finishes migrating the ESSID API (remove the +1)
> o netdev->get_wireless_stats is no more
> o long/short retry
>
> This is a redacted version of a patch originally submitted by Jean
> Tourrilhes. I removed most of the additions, in order to minimize
> future support requirements for nl80211 (or other WE successor).
>
> CC: Jean Tourrilhes <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
>
> commit eeec9f1a931262d69811135092c8447d6dccc3e6
> Author: Jean Tourrilhes <[email protected]>
> Date: Tue Aug 29 18:02:31 2006 -0700
>
> [PATCH] WE-21 for orinoco
>
> Signed-off-by: Jean Tourrilhes <[email protected]>
> Signed-off-by: John W. Linville <[email protected]>
>

Try reverting those?

2006-09-30 01:34:13

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> >
> > Here's the traceback I got:
> >
> > slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten
> > [<c0103ad2>] dump_trace+0x64/0x1cd
> > [<c0103c4d>] show_trace_log_lvl+0x12/0x25
> > [<c010415f>] show_trace+0xd/0x10
> > [<c01041fc>] dump_stack+0x19/0x1b
> > [<c014c796>] __slab_error+0x17/0x1c
> > [<c014cdac>] cache_free_debugcheck+0xaf/0x230
> > [<c014d43e>] kfree+0x59/0x8c
> > [<c02dc04a>] ioctl_standard_call+0x1da/0x218
> > [<c02dc275>] wireless_process_ioctl+0x55/0x312
> > [<c02d3750>] dev_ioctl+0x45f/0x49a
> > [<c02c92aa>] sock_ioctl+0x1b3/0x1c6
> > [<c0160322>] do_ioctl+0x22/0x67
> > [<c01605a5>] vfs_ioctl+0x23e/0x251
> > [<c01605ff>] sys_ioctl+0x47/0x64
> > [<c0102cd3>] syscall_call+0x7/0xb
> > DWARF2 unwinder stuck at syscall_call+0x7/0xb

Hum... Not clear what's happening. I'll look more into it on
Monday.

> > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> >
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0

That's most likely the one. I need to check the source code.

> Yes. The main thing which those WE-21 patches do is to shorten the size of
> various buffers which are used in wireless ioctls.

Only for ESSID: it reduces the length by one char and removes the final
'\0'. But kernel-wise, it should not matter.
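
The two length conventions described here can be put side by side in a small user-space sketch. This is an illustration under stated assumptions, not code from the Wireless Extensions or from any driver: ESSID_MAX, essid_len_with_nul, essid_len_without_nul and essid_copy_terminated are hypothetical names, and the 32-byte limit is the usual maximum SSID size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ESSID_MAX 32	/* sketch constant: maximum ESSID size in bytes */

/* Older convention: the reported length counts the trailing '\0'. */
static size_t essid_len_with_nul(const char *essid)
{
	return strlen(essid) + 1;
}

/* Convention described for WE-21: the '\0' is not counted. */
static size_t essid_len_without_nul(const char *essid)
{
	return strlen(essid);
}

/*
 * Copy `len` bytes and always append a terminator.
 * This needs len + 1 bytes of room, so it returns -1 rather than
 * writing past the end of `dst` when the buffer is too small.
 */
static int essid_copy_terminated(char *dst, size_t dst_size,
				 const char *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len >= dst_size)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}
```

For an empty ESSID the two conventions report 1 and 0 respectively. For a full 32-character ESSID, code that still appends a terminator needs a 33-byte buffer, so a buffer sized to exactly ESSID_MAX is one byte short; the bounds check above turns that case into an error instead of an overrun.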

> > Since I'm using an orinoco-based card, these 2 look like the most likely
> > candidates. WE-21 was merged between -mm1 and -mm2, which is why -mm1 was
> > stable for me.

I'm using Orinoco, I've not seen that with iwconfig.
I'll look into that...

Jean


2006-09-30 01:45:08

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> >
> > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> >
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
> > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
> > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0

Excuse me, can you point out which version of gkrellm you use
and where to find it? The only version that is listed on my page does
not use the ESSID ioctl. I want to be sure I'm looking at the same
thing as you are...

Jean

2006-09-30 01:57:05

by x z

Subject: Makefile for linux modules

Hi
I have a makefile to make several driver modules:
obj-$(CONFIG_FUSION_SPI) += mptbase.o mptscsih.o mptspi.o
obj-$(CONFIG_FUSION_FC) += mptbase.o mptscsih.o mptfc.o
obj-m += mptbase.o mptscsih.o mptsas.o
obj-$(CONFIG_FUSION_LAN) += mptlan.o
obj-m += mptctl.o
obj-m += mptcfg.o
obj-m += mptstm.o


This compiles, and the modules can be installed
successfully.

I need to have a comfunc.c file, which contains all
common functions that could be used by these module
files.
I added the line below to the content just below
mptstm.o (I also tried adding it just above mptlan):
mptbase-objs := comfunc.o

All modules compile successfully, and I can install
mptbase.ko. However, when I try to install mptctl.ko
(or other modules), I get errors like "mptctl: Unknown
symbol mpt_register" (and likewise for mpt_deregister).
These functions are implemented in mptbase.c.

How do I fix this problem?

thanks
Robert


2006-09-30 01:59:50

by x z

Subject: Makefile for linux modules

Hi
I have a makefile to make several driver modules:
obj-$(CONFIG_FUSION_SPI) += mptbase.o mptscsih.o mptspi.o
obj-$(CONFIG_FUSION_FC) += mptbase.o mptscsih.o mptfc.o
obj-m += mptbase.o mptscsih.o mptsas.o
obj-$(CONFIG_FUSION_LAN) += mptlan.o
obj-m += mptctl.o
obj-m += mptcfg.o
obj-m += mptstm.o


This will compile all modules, and the modules can be
installed successfully.

I need to have a comfunc.c file, which contains all
common functions that could be used by these module
files.
I added the line below to the content just below
mptstm.o (I also tried adding it just above mptlan):
mptbase-objs := comfunc.o

All modules compile successfully, and I can install
mptbase.ko. However, when I try to install mptctl.ko
(or other modules), I get errors like "mptctl: Unknown
symbol mpt_register" (and likewise for mpt_deregister).
These functions are implemented in mptbase.c.

How do I fix this problem?

thanks
Robert



2006-09-30 03:34:18

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 18:40:43 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > >
> > > A quick strace of gkrellm finds these likely ioctl's causing the problem:
> > >
> > > % grep ioctl /tmp/foo2 | sort -u | more
> > > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
> > > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
> > > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0
>
> Excuse me, can you point out which version of gkrellm you use
> and where to find it? The only version that is listed on my page does
> not use the ESSID ioctl. I want to be sure I'm looking at the same
> thing as you are...

All the pieces:
http://download.fedora.redhat.com/pub/fedora/linux/extras/development/SRPMS/

The particular plugin causing the trouble:
http://download.fedora.redhat.com/pub/fedora/linux/extras/development/SRPMS/gkrellm-wifi-0.9.12-3.fc6.src.rpm

If you're not on a box that has rpm2cpio or similar, yell and I'll
break that .src.rpm up for you - there's basically just an 18K .tar.gz and
a 14K patch in there.



2006-09-30 03:37:33

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 18:33:48 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > >
> > > Here's the traceback I got:
> > >
> > > slab error in verify_redzone_free(): cache `size-32': memory outside object was overwritten

> Hum... Not clear what's happening. I'll look more into it on
> Monday.

Fair enough, I'm going to try reverting the 2 commits and see if things
behave better.

> I'm using Orinoco, I've not seen that with iwconfig.
> I'll look into that...

I'll bet it's the difference between a modern iwconfig and a 3-year-old
stone-age gkrellm plugin :)



2006-09-30 07:04:13

by Borislav Petkov

Subject: Re: 2.6.18-mm2 - possible recursive locking detected

On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
Hi,

.config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.

Sep 30 08:38:17 zmei kernel: [ 285.197902]
Sep 30 08:38:19 zmei kernel: [ 285.197905] =============================================
Sep 30 08:38:19 zmei kernel: [ 285.204776] [ INFO: possible recursive locking detected ]
Sep 30 08:38:19 zmei kernel: [ 285.210163] 2.6.18-mm2 #1
Sep 30 08:38:19 zmei kernel: [ 285.212782] ---------------------------------------------
Sep 30 08:38:19 zmei kernel: [ 285.218168] swapper/0 is trying to acquire lock:
Sep 30 08:38:19 zmei kernel: [ 285.222777] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [ 285.229114]
Sep 30 08:38:19 zmei kernel: [ 285.229115] but task is already holding lock:
Sep 30 08:38:19 zmei kernel: [ 285.234952] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [ 285.241290]
Sep 30 08:38:19 zmei kernel: [ 285.241291] other info that might help us debug this:
Sep 30 08:38:19 zmei kernel: [ 285.247817] 4 locks held by swapper/0:
Sep 30 08:38:19 zmei kernel: [ 285.251561] #0: (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
Sep 30 08:38:19 zmei kernel: [ 285.259041] #1: (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
Sep 30 08:38:19 zmei kernel: [ 285.266700] #2: (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
Sep 30 08:38:19 zmei kernel: [ 285.276454] #3: (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [ 285.283241]
Sep 30 08:38:19 zmei kernel: [ 285.283242] stack backtrace:
Sep 30 08:38:19 zmei kernel: [ 285.287688] [<c0103b65>] dump_trace+0x64/0x1cd
Sep 30 08:38:19 zmei kernel: [ 285.292243] [<c0103ce0>] show_trace_log_lvl+0x12/0x25
Sep 30 08:38:19 zmei kernel: [ 285.297405] [<c010431c>] show_trace+0xd/0x10
Sep 30 08:38:19 zmei kernel: [ 285.301780] [<c01043e4>] dump_stack+0x19/0x1b
Sep 30 08:38:19 zmei kernel: [ 285.306250] [<c013022d>] __lock_acquire+0x750/0x96c
Sep 30 08:38:19 zmei kernel: [ 285.311304] [<c013098c>] lock_acquire+0x4b/0x6b
Sep 30 08:38:19 zmei kernel: [ 285.316005] [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
Sep 30 08:38:19 zmei kernel: [ 285.321233] [<c0112f70>] __wake_up+0x15/0x3b
Sep 30 08:38:19 zmei kernel: [ 285.325638] [<c0178dd4>] ep_poll_safewake+0x91/0xc3
Sep 30 08:38:19 zmei kernel: [ 285.330760] [<c0179c69>] ep_poll_callback+0x83/0x8e
Sep 30 08:38:19 zmei kernel: [ 285.335888] [<c01122e5>] __wake_up_common+0x2f/0x53
Sep 30 08:38:19 zmei kernel: [ 285.340898] [<c0112f83>] __wake_up+0x28/0x3b
Sep 30 08:38:19 zmei kernel: [ 285.345312] [<c0278da8>] sock_def_readable+0x3a/0x69
Sep 30 08:38:20 zmei kernel: [ 285.350778] [<c02a1892>] tcp_data_queue+0x50f/0xa53
Sep 30 08:38:20 zmei kernel: [ 285.356232] [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
Sep 30 08:38:20 zmei kernel: [ 285.362077] [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
Sep 30 08:38:20 zmei kernel: [ 285.367322] [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb
Sep 30 08:38:20 zmei kernel: [ 285.372432] [<c02928e3>] ip_local_deliver+0x19c/0x265
Sep 30 08:38:20 zmei kernel: [ 285.378033] [<c029270b>] ip_rcv+0x453/0x48f
Sep 30 08:38:20 zmei kernel: [ 285.382769] [<c027e51a>] netif_receive_skb+0x1a6/0x239
Sep 30 08:38:20 zmei kernel: [ 285.388440] [<c020f5a5>] rtl8139_poll+0x297/0x405
Sep 30 08:38:20 zmei kernel: [ 285.393553] [<c027ff20>] net_rx_action+0x76/0x109
Sep 30 08:38:20 zmei kernel: [ 285.398782] [<c011dad0>] __do_softirq+0x70/0xf0
Sep 30 08:38:20 zmei kernel: [ 285.403459] [<c011db89>] do_softirq+0x39/0x55
Sep 30 08:38:20 zmei kernel: [ 285.407963] [<c011dcd5>] irq_exit+0x49/0x56
Sep 30 08:38:20 zmei kernel: [ 285.412295] [<c010537f>] do_IRQ+0x8f/0x9c
Sep 30 08:38:20 zmei kernel: [ 285.416408] [<c01035e1>] common_interrupt+0x25/0x2c
Sep 30 08:38:20 zmei kernel: [ 285.421393] DWARF2 unwinder stuck at common_interrupt+0x25/0x2c
Sep 30 08:38:20 zmei kernel: [ 285.427298]
Sep 30 08:38:20 zmei kernel: [ 285.428786] Leftover inexact backtrace:
Sep 30 08:38:20 zmei kernel: [ 285.428787]
Sep 30 08:38:20 zmei kernel: [ 285.434103] [<c010168b>] cpu_idle+0x72/0x9b
Sep 30 08:38:20 zmei kernel: [ 285.438383] [<c010064e>] rest_init+0x37/0x39
Sep 30 08:38:20 zmei kernel: [ 285.442742] [<c043d73b>] start_kernel+0x356/0x35e
Sep 30 08:38:20 zmei kernel: [ 285.447549] [<00000000>] 0x0
Sep 30 08:38:20 zmei kernel: [ 285.450541] =======================

--
Regards/Gruß,
Boris.






2006-09-30 07:53:13

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, 29 Sep 2006 23:31:07 EDT, [email protected] said:
> Fair enough, I'm going to try reverting the 2 commits and see if things
> behave better.

OK, it's definitely something in those 2 commits - I reverted them and the
resulting 2.6.18-mm2 kernel has been up and stable for 4 hours, even with
the problem gkrellm updating once a second the whole time.

I'm not *seeing* how those changes can cause trouble - unless it's this:

diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
index 1840b69..9e19a96 100644
--- a/drivers/net/wireless/orinoco.c
+++ b/drivers/net/wireless/orinoco.c
@@ -3037,7 +3037,7 @@ static int orinoco_ioctl_getessid(struct
 	}

 	erq->flags = 1;
-	erq->length = strlen(essidbuf) + 1;
+	erq->length = strlen(essidbuf);

Does some other code go batshit if length == 0? My current config doesn't
try to actually ifup the wireless if I also have connectivity via copper (in
order to avoid chewing up a DHCP lease in crowded address space if not needed).

% iwconfig eth5
eth5 IEEE 802.11b ESSID:"" Nickname:"HERMES I"
Mode:Managed Frequency:2.457 GHz Access Point: Not-Associated
Bit Rate:11 Mb/s Sensitivity:1/3
Retry limit:4 RTS thr:off Fragment thr:off
Power Management:off
Link Quality=0/92 Signal level=134/153 Noise level=134/153
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

Is that empty ESSID the source of the trouble?



2006-09-30 08:28:28

by Andrew Morton

Subject: Re: 2.6.18-mm2 - possible recursive locking detected

On Sat, 30 Sep 2006 09:04:06 +0200
Borislav Petkov <[email protected]> wrote:

> On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
> Hi,
>
> .config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.
>
> Sep 30 08:38:17 zmei kernel: [ 285.197902]
> Sep 30 08:38:19 zmei kernel: [ 285.197905] =============================================
> Sep 30 08:38:19 zmei kernel: [ 285.204776] [ INFO: possible recursive locking detected ]
> Sep 30 08:38:19 zmei kernel: [ 285.210163] 2.6.18-mm2 #1
> Sep 30 08:38:19 zmei kernel: [ 285.212782] ---------------------------------------------
> Sep 30 08:38:19 zmei kernel: [ 285.218168] swapper/0 is trying to acquire lock:
> Sep 30 08:38:19 zmei kernel: [ 285.222777] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [ 285.229114]
> Sep 30 08:38:19 zmei kernel: [ 285.229115] but task is already holding lock:
> Sep 30 08:38:19 zmei kernel: [ 285.234952] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [ 285.241290]
> Sep 30 08:38:19 zmei kernel: [ 285.241291] other info that might help us debug this:
> Sep 30 08:38:19 zmei kernel: [ 285.247817] 4 locks held by swapper/0:
> Sep 30 08:38:19 zmei kernel: [ 285.251561] #0: (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
> Sep 30 08:38:19 zmei kernel: [ 285.259041] #1: (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
> Sep 30 08:38:19 zmei kernel: [ 285.266700] #2: (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
> Sep 30 08:38:19 zmei kernel: [ 285.276454] #3: (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [ 285.283241]
> Sep 30 08:38:19 zmei kernel: [ 285.283242] stack backtrace:
> Sep 30 08:38:19 zmei kernel: [ 285.287688] [<c0103b65>] dump_trace+0x64/0x1cd
> Sep 30 08:38:19 zmei kernel: [ 285.292243] [<c0103ce0>] show_trace_log_lvl+0x12/0x25
> Sep 30 08:38:19 zmei kernel: [ 285.297405] [<c010431c>] show_trace+0xd/0x10
> Sep 30 08:38:19 zmei kernel: [ 285.301780] [<c01043e4>] dump_stack+0x19/0x1b
> Sep 30 08:38:19 zmei kernel: [ 285.306250] [<c013022d>] __lock_acquire+0x750/0x96c
> Sep 30 08:38:19 zmei kernel: [ 285.311304] [<c013098c>] lock_acquire+0x4b/0x6b
> Sep 30 08:38:19 zmei kernel: [ 285.316005] [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
> Sep 30 08:38:19 zmei kernel: [ 285.321233] [<c0112f70>] __wake_up+0x15/0x3b
> Sep 30 08:38:19 zmei kernel: [ 285.325638] [<c0178dd4>] ep_poll_safewake+0x91/0xc3
> Sep 30 08:38:19 zmei kernel: [ 285.330760] [<c0179c69>] ep_poll_callback+0x83/0x8e
> Sep 30 08:38:19 zmei kernel: [ 285.335888] [<c01122e5>] __wake_up_common+0x2f/0x53
> Sep 30 08:38:19 zmei kernel: [ 285.340898] [<c0112f83>] __wake_up+0x28/0x3b
> Sep 30 08:38:19 zmei kernel: [ 285.345312] [<c0278da8>] sock_def_readable+0x3a/0x69
> Sep 30 08:38:20 zmei kernel: [ 285.350778] [<c02a1892>] tcp_data_queue+0x50f/0xa53
> Sep 30 08:38:20 zmei kernel: [ 285.356232] [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
> Sep 30 08:38:20 zmei kernel: [ 285.362077] [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
> Sep 30 08:38:20 zmei kernel: [ 285.367322] [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb

<looks at ep_poll_safewake>

<falls out of chair>

We'll need to teach lockdep about that one, but I don't have a clue how.

Is it not vulnerable to ab/ba deadlocking?
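
The report is easier to follow with lockdep's notion of a lock class in mind: both locks in the trace are printed as the same class (&q->lock), so taking one while holding the other looks like recursion even though they may be different wait queues. Below is a deliberately tiny user-space model of that check. It is a sketch under heavy simplification, not lockdep's algorithm; struct lock, struct task, model_acquire and model_release are hypothetical names. The subclass argument mimics the idea behind the kernel's spin_lock_nested() annotation; whether such an annotation is the right answer for ep_poll_safewake is not settled in this thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8

struct lock {
	int class_id;	/* shared by every lock of the same class */
};

struct held_entry {
	const struct lock *lock;
	int subclass;
};

struct task {
	struct held_entry held[MAX_HELD];
	size_t depth;	/* number of locks currently held */
};

/*
 * Record `l` as held by `t`.
 * Returns 0 on success, -1 if a lock of the same class and subclass is
 * already held ("possible recursive locking"), -2 if the table is full.
 */
static int model_acquire(struct task *t, const struct lock *l, int subclass)
{
	size_t i;

	assert(t != NULL && l != NULL);
	for (i = 0; i < t->depth; i++) {
		if (t->held[i].lock->class_id == l->class_id &&
		    t->held[i].subclass == subclass)
			return -1;
	}
	if (t->depth == MAX_HELD)
		return -2;
	t->held[t->depth].lock = l;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return 0;
}

/* Drop the most recently acquired lock. */
static void model_release(struct task *t)
{
	assert(t != NULL && t->depth > 0);
	t->depth--;
}
```

In this model, two distinct locks with the same class_id trip the check when both are taken with subclass 0, while taking the inner one with subclass 1 is accepted. The model says nothing about ordering between the two queues, which is the AB/BA question raised above.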


2006-09-30 08:40:29

by Andrew Morton

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Sat, 30 Sep 2006 03:50:43 -0400
[email protected] wrote:

> On Fri, 29 Sep 2006 23:31:07 EDT, [email protected] said:
> > Fair enough, I'm going to try reverting the 2 commits and see if things
> > behave better.
>
> OK, it's definitely something in those 2 commits - I reverted them and the
> resulting 2.6.18-mm2 kernel has been up and stable for 4 hours, even with
> the problem gkrellm updating once a second the whole time.
>
> I'm not *seeing* how those changes can cause trouble - unless it's this:
>
> diff --git a/drivers/net/wireless/orinoco.c b/drivers/net/wireless/orinoco.c
> index 1840b69..9e19a96 100644
> --- a/drivers/net/wireless/orinoco.c
> +++ b/drivers/net/wireless/orinoco.c
> @@ -3037,7 +3037,7 @@ static int orinoco_ioctl_getessid(struct
> }
>
> erq->flags = 1;
> - erq->length = strlen(essidbuf) + 1;
> + erq->length = strlen(essidbuf);

You know what the next question is ;)

Did reverting just that line fix it?

> Does some other code go batshit if length ==0? My current config doesn't
> try to actually ifup the wireless if I also have connectivity via copper (in
> order to avoid chewing up a DHCP lease in crowded address space if not needed).
>
> % iwconfig eth5
> eth5 IEEE 802.11b ESSID:"" Nickname:"HERMES I"
> Mode:Managed Frequency:2.457 GHz Access Point: Not-Associated
> Bit Rate:11 Mb/s Sensitivity:1/3
> Retry limit:4 RTS thr:off Fragment thr:off
> Power Management:off
> Link Quality=0/92 Signal level=134/153 Noise level=134/153
> Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
> Tx excessive retries:0 Invalid misc:0 Missed beacon:0
>
> That ESSID the source of the trouble?
>

Might be. I can't immediately spot a problem with it, but perhaps
length==0 causes the driver to not allocate a buffer and to then write to
the not-allocated buffer. Not sure..

2006-09-30 08:55:35

by Sam Ravnborg

Subject: Re: Makefile for linux modules

Hi Robert.

> I have a makefile to make several driver modules:
> obj-$(CONFIG_FUSION_SPI) += mptbase.o mptscsih.o mptspi.o
> obj-$(CONFIG_FUSION_FC) += mptbase.o mptscsih.o mptfc.o
> obj-m += mptbase.o mptscsih.o mptsas.o
> obj-$(CONFIG_FUSION_LAN) += mptlan.o
> obj-m += mptctl.o
> obj-m += mptcfg.o
> obj-m += mptstm.o

The above kbuild file snippet tells us that you are creating
a number of modules:
mptbase.ko mptscsih.ko mptsas.ko mptlan.ko mptctl.ko mptcfg.ko and mptstm.ko
They are each built from a single .c file.

> mptbase-objs := comfunc.o

Now you try to include comfunc.o in every module.
To do so you need to tell kbuild that you are dealing with a module
based on composite .o files.
That would look like:
obj-$(CONFIG_FUSION_PCI) += mptbase-foo.o
mptbase-foo-y := comfunc.o mptbase.o

This will result in a module named mptbase-foo.ko, which is hardly what
you are trying to achieve. Likewise you will have duplicate symbols in the
modules due to comfunc.o being included more than once.

The only sane approach here is to compile comfunc.o as an independent
module and let the modutils pull in the comfunc (deserves a more
specific name) module as needed.

So what you need to do is simply:
obj-m += comfunc.o

And accept that this is a module, so all symbols that you need must be properly
exported using EXPORT_SYMBOL*

Sam

2006-09-30 12:11:11

by Frederik Deweerdt

Subject: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sat, Sep 30, 2006 at 12:43:24AM +0100, Alan Cox wrote:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > Does this patch make sense in that case? If yes, I'll put up a patch
> > for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> > Also, aic7xxx's coding style would put parentheses around the returned
> > value; should I follow it?
>
> Yes - but perhaps with a warning message so users know why ?
>
> As to coding style - kernel style is unbracketed so I wouldn't worry
> about either.
>
Thanks for the advice.

The following patch checks whether the irq is valid before issuing a
request_irq() for AIC7XXX and AIC79XX. An error message is displayed to
let the user know what went wrong.

Regards,
Frederik

Signed-off-by: Frederik Deweerdt <[email protected]>

diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
index 2001fe8..8279122 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
@@ -132,6 +132,11 @@ ahd_linux_pci_dev_probe(struct pci_dev *
 	char *name;
 	int error;

+	if (!pdev->irq) {
+		printk(KERN_WARNING "aic79xx: No irq line set\n");
+		return -ENODEV;
+	}
+
 	pci = pdev;
 	entry = ahd_find_pci_device(pci);
 	if (entry == NULL)
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..ca61cdb 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -185,6 +185,11 @@ ahc_linux_pci_dev_probe(struct pci_dev *
 	int error;
 	struct device *dev = &pdev->dev;

+	if (!pdev->irq) {
+		printk(KERN_WARNING "aic7xxx: No irq line set\n");
+		return -ENODEV;
+	}
+
 	pci = pdev;
 	entry = ahc_find_pci_device(pci);
 	if (entry == NULL)

2006-09-30 13:54:23

by Alan

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sat, 2006-09-30 at 14:09 +0000, Frederik Deweerdt wrote:
> Signed-off-by: Frederik Deweerdt <[email protected]>

Acked-by: Alan Cox <[email protected]>

2006-09-30 14:21:35

by Willy Tarreau

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sat, Sep 30, 2006 at 03:19:14PM +0100, Alan Cox wrote:
> On Sat, 2006-09-30 at 14:09 +0000, Frederik Deweerdt wrote:
> > Signed-off-by: Frederik Deweerdt <[email protected]>
>
> Acked-by: Alan Cox <[email protected]>

It seems to me that it's also valid for 2.4. Does anyone have any objection?

Willy

2006-09-30 15:26:45

by James Bottomley

Subject: Re: 2.6.18-mm2

On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> + if (!pdev->irq)
> + return -ENODEV;
> +

Don't I remember that 0 is a valid IRQ on some platforms?

i.e. shouldn't this be

if (pdev->irq == NO_IRQ)
return -ENODEV;

?

I think this won't quite work because only the platforms that actually
have a valid zero irq define it, but there must be something else that
works.

James


2006-09-30 16:21:43

by Matthew Wilcox

Subject: Re: 2.6.18-mm2

On Sat, Sep 30, 2006 at 10:26:22AM -0500, James Bottomley wrote:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > + if (!pdev->irq)
> > + return -ENODEV;
> > +
>
> Don't I remember that 0 is a valid IRQ on some platforms?
>
> i.e. shouldn't this be
>
> if (pdev->irq == NO_IRQ)
> return -ENODEV;
>
> ?
>
> I think this won't quite work because only the platforms that actually
> have a valid zero irq define it, but there must be something else that
> works.

Linus threw a hissy fit and declared that platforms which use 0 as a
valid IRQ are broken and wrong. Despite PCI using 255 to mean no IRQ
and 0 as a valid IRQ ;-)

2006-09-30 17:20:52

by Mark Rustad

Subject: Re: 2.6.18-mm2

On Sep 30, 2006, at 11:21 AM, Matthew Wilcox wrote:

> On Sat, Sep 30, 2006 at 10:26:22AM -0500, James Bottomley wrote:
>> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
>>> + if (!pdev->irq)
>>> + return -ENODEV;
>>> +
>>
>> Don't I remember that 0 is a valid IRQ on some platforms?
>>
>> i.e. shouldn't this be
>>
>> if (pdev->irq == NO_IRQ)
>> return -ENODEV;
>>
>> ?
>>
>> I think this won't quite work because only the platforms that
>> actually
>> have a valid zero irq define it, but there must be something else
>> that
>> works.
>
> Linus threw a hissy fit and declared that platforms which use 0 as a
> valid IRQ are broken and wrong. Despite PCI using 255 to mean no IRQ
> and 0 as a valid IRQ ;-)

Having gone down the path of creating a platform that had IRQ 0 as a
valid interrupt some time ago with the 2.4 kernel, all I can say is
that while it can be made to work, things go much more smoothly if
you don't use IRQ 0. Every driver added to the environment pretty
much had to be tweaked. Of course that mainly meant adding to the
#ifdef's that were already there for other architectures that had
also made that mistake.

The biggest pain is admitting the mistake (of using IRQ 0) and
changing it. Making a clear statement on the issue will help prevent
others from making the same mistake again. I know that I wish that I
had known not to do that from the beginning. Having been there and
done that, I don't need any convincing.

--
Mark Rustad, [email protected]

2006-09-30 18:19:49

by Davide Libenzi

Subject: Re: 2.6.18-mm2 - possible recursive locking detected

On Sat, 30 Sep 2006, Andrew Morton wrote:

> On Sat, 30 Sep 2006 09:04:06 +0200
> Borislav Petkov <[email protected]> wrote:
>
> > On Thu, Sep 28, 2006 at 01:46:23AM -0700, Andrew Morton wrote:
> > Hi,
> >
> > .config is at http://tim.dnsalias.org/2.6.18-mm2.cfg.
> >
> > Sep 30 08:38:17 zmei kernel: [ 285.197902]
> > Sep 30 08:38:19 zmei kernel: [ 285.197905] =============================================
> > Sep 30 08:38:19 zmei kernel: [ 285.204776] [ INFO: possible recursive locking detected ]
> > Sep 30 08:38:19 zmei kernel: [ 285.210163] 2.6.18-mm2 #1
> > Sep 30 08:38:19 zmei kernel: [ 285.212782] ---------------------------------------------
> > Sep 30 08:38:19 zmei kernel: [ 285.218168] swapper/0 is trying to acquire lock:
> > Sep 30 08:38:19 zmei kernel: [ 285.222777] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [ 285.229114]
> > Sep 30 08:38:19 zmei kernel: [ 285.229115] but task is already holding lock:
> > Sep 30 08:38:19 zmei kernel: [ 285.234952] (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [ 285.241290]
> > Sep 30 08:38:19 zmei kernel: [ 285.241291] other info that might help us debug this:
> > Sep 30 08:38:19 zmei kernel: [ 285.247817] 4 locks held by swapper/0:
> > Sep 30 08:38:19 zmei kernel: [ 285.251561] #0: (&tp->rx_lock){-+..}, at: [<c020f350>] rtl8139_poll+0x42/0x405
> > Sep 30 08:38:19 zmei kernel: [ 285.259041] #1: (slock-AF_INET/1){-+..}, at: [<c02aa753>] tcp_v4_rcv+0x3fa/0x8eb
> > Sep 30 08:38:19 zmei kernel: [ 285.266700] #2: (af_callback_keys + sk->sk_family#3){-.-?}, at: [<c0278d83>] sock_def_readable+0x15/0x69
> > Sep 30 08:38:19 zmei kernel: [ 285.276454] #3: (&q->lock){++..}, at: [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [ 285.283241]
> > Sep 30 08:38:19 zmei kernel: [ 285.283242] stack backtrace:
> > Sep 30 08:38:19 zmei kernel: [ 285.287688] [<c0103b65>] dump_trace+0x64/0x1cd
> > Sep 30 08:38:19 zmei kernel: [ 285.292243] [<c0103ce0>] show_trace_log_lvl+0x12/0x25
> > Sep 30 08:38:19 zmei kernel: [ 285.297405] [<c010431c>] show_trace+0xd/0x10
> > Sep 30 08:38:19 zmei kernel: [ 285.301780] [<c01043e4>] dump_stack+0x19/0x1b
> > Sep 30 08:38:19 zmei kernel: [ 285.306250] [<c013022d>] __lock_acquire+0x750/0x96c
> > Sep 30 08:38:19 zmei kernel: [ 285.311304] [<c013098c>] lock_acquire+0x4b/0x6b
> > Sep 30 08:38:19 zmei kernel: [ 285.316005] [<c02ca474>] _spin_lock_irqsave+0x2c/0x3c
> > Sep 30 08:38:19 zmei kernel: [ 285.321233] [<c0112f70>] __wake_up+0x15/0x3b
> > Sep 30 08:38:19 zmei kernel: [ 285.325638] [<c0178dd4>] ep_poll_safewake+0x91/0xc3
> > Sep 30 08:38:19 zmei kernel: [ 285.330760] [<c0179c69>] ep_poll_callback+0x83/0x8e
> > Sep 30 08:38:19 zmei kernel: [ 285.335888] [<c01122e5>] __wake_up_common+0x2f/0x53
> > Sep 30 08:38:19 zmei kernel: [ 285.340898] [<c0112f83>] __wake_up+0x28/0x3b
> > Sep 30 08:38:19 zmei kernel: [ 285.345312] [<c0278da8>] sock_def_readable+0x3a/0x69
> > Sep 30 08:38:20 zmei kernel: [ 285.350778] [<c02a1892>] tcp_data_queue+0x50f/0xa53
> > Sep 30 08:38:20 zmei kernel: [ 285.356232] [<c02a34c3>] tcp_rcv_established+0x5aa/0x64f
> > Sep 30 08:38:20 zmei kernel: [ 285.362077] [<c02a86f6>] tcp_v4_do_rcv+0x26/0x2f2
> > Sep 30 08:38:20 zmei kernel: [ 285.367322] [<c02aabd4>] tcp_v4_rcv+0x87b/0x8eb
>
> <looks at ep_poll_safewake>
>
> <falls out of chair>

Haha :)
I hope the comment describes the nastiness of the potential problems
that can happen when adding epoll descriptors inside epoll descriptors
(non-trivial loops, looong chains, etc).



> We'll need to teach lockdep about that one, but I don't have a clue how.
>
> Is it not vulnerable to ab/ba deadlocking?

The two locks are different. One looks like the netcard ->poll one, and one
is the epoll file ->poll one. I don't know lockdep, so I wouldn't know how to
make it quiet in this case (w/out losing the ability to detect other
legitimate wait_queue_head_t-based x-locks).
Ingo?




- Davide


2006-09-30 19:53:49

by Andrew Morton

Subject: Re: 2.6.18-mm2

On Sat, 30 Sep 2006 15:37:06 +0200
Tobias Diedrich <[email protected]> wrote:

> Andrew Morton wrote:
>
> > - More updates to the MSI code. If your machine has Message Signalled
> > Interrupts, please enable it and give it a try.
>
> I'm happy to report, that with 2.6.18-mm2 suspend to disk works for
> me without additional patches, tested both with MSI interrupts
> disabled and enabled (forcedeth driver).

Thanks.

Which kernel version(s) didn't work? -mm1? Mainline?

2006-09-30 20:29:24

by Alan

Subject: Re: 2.6.18-mm2

Ar Sad, 2006-09-30 am 10:26 -0500, ysgrifennodd James Bottomley:
> On Fri, 2006-09-29 at 23:50 +0000, Frederik Deweerdt wrote:
> > + if (!pdev->irq)
> > + return -ENODEV;
> > +
>
> Don't I remember that 0 is a valid IRQ on some platforms?
>
> i.e. shouldn't this be
>
> if (pdev->irq == NO_IRQ)
> return -ENODEV;

NO_IRQ is gone. Everyone uses zero and Linus has declared that is how it
shall be.


Alan
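
A note on the two conventions in this sub-thread: Matthew points out that PCI itself uses 255 for "no IRQ" and treats 0 as valid, while Alan says the kernel convention is that 0 means "no IRQ". The userspace sketch below shows one way a platform could reconcile the two. The helper name, the constants and the shift-by-one mapping are assumptions made for illustration; none of them comes from the kernel or from a patch in this thread.

```c
#include <assert.h>

/* Hypothetical constants for illustration; not kernel definitions. */
#define PCI_INTLINE_NONE 0xFFu	/* raw value the thread says PCI uses for "no IRQ" */
#define LINUX_NO_IRQ     0u	/* convention described by Alan: 0 means "no IRQ" */

/*
 * Hypothetical helper: turn the raw Interrupt Line byte into a number
 * where 0 is never a usable interrupt. Raw line 0 becomes 1, raw line 1
 * becomes 2, and so on, so every valid line keeps a distinct non-zero
 * value and drivers can test "if (!irq)" safely.
 */
static unsigned int intline_to_linux_irq(unsigned int raw)
{
	assert(raw <= 0xFFu);	/* the register is a single byte */

	if (raw == PCI_INTLINE_NONE)
		return LINUX_NO_IRQ;
	return raw + 1u;
}
```

With a mapping of this kind, the "!pdev->irq" test from the aic7xxx patch stays correct even on hardware whose first interrupt line is numbered 0.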

2006-09-30 23:58:36

by Jeff Garzik

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..9743471 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -809,6 +809,40 @@ err_out:
return -EBUSY;
}

+#ifndef ARCH_VALIDATE_PCI_IRQ
+int pci_valid_irq(struct pci_dev *pdev)
+{
+ if (pdev->irq == 0)
+ return -EINVAL;
+
+ return 0;
+}
+EXPORT_SYMBOL(pci_valid_irq);
+#endif /* ARCH_VALIDATE_PCI_IRQ */
+
+int pci_request_irq(struct pci_dev *pdev,
+ irqreturn_t (*handler)(int, void *, struct pt_regs *),
+ unsigned long flags, const char *name, void *userdata)
+{
+ int rc;
+
+ rc = pci_valid_irq(pdev);
+ if (rc) {
+ dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
+ return rc;
+ }
+
+ return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
+ name, userdata);
+}
+EXPORT_SYMBOL(pci_request_irq);
+
+void pci_release_irq(struct pci_dev *pdev, void *userdata)
+{
+ free_irq(pdev->irq, userdata);
+}
+EXPORT_SYMBOL(pci_release_irq);
+
/**
* pci_set_master - enables bus-mastering for device dev
* @dev: the PCI device to enable
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5c3a417..5e254fc 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@ #include <linux/list.h>
#include <linux/compiler.h>
#include <linux/errno.h>
#include <linux/device.h>
+#include <linux/interrupt.h>

/* File state for mmap()s on /proc/bus/pci/X/Y */
enum pci_mmap_state {
@@ -537,6 +538,12 @@ void pci_release_regions(struct pci_dev
int __must_check pci_request_region(struct pci_dev *, int, const char *);
void pci_release_region(struct pci_dev *, int);

+int __must_check pci_valid_irq(struct pci_dev *pdev);
+int __must_check pci_request_irq(struct pci_dev *pdev,
+ irqreturn_t (*handler)(int, void *, struct pt_regs *),
+ unsigned long flags, const char *name, void *userdata);
+void pci_release_irq(struct pci_dev *pdev, void *userdata);
+
/* drivers/pci/bus.c */
int __must_check pci_bus_alloc_resource(struct pci_bus *bus,
struct resource *res, resource_size_t size,


Attachments:
patch (1.98 kB)

2006-10-01 14:28:11

by Matthew Wilcox

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sat, Sep 30, 2006 at 07:58:18PM -0400, Jeff Garzik wrote:
> Actually, rather than adding this check to every driver, I would rather
> do something like the attached patch: create a pci_request_irq(), and
> pass a struct pci_device to it. Then the driver author doesn't have to
> worry about such details.

I like pci_request_irq(), but pci_valid_irq is bad.

> +#ifndef ARCH_VALIDATE_PCI_IRQ
> +int pci_valid_irq(struct pci_dev *pdev)
> +{
> + if (pdev->irq == 0)
> + return -EINVAL;
> +
> + return 0;
> +}
> +EXPORT_SYMBOL(pci_valid_irq);
> +#endif /* ARCH_VALIDATE_PCI_IRQ */

Better would be:

#ifndef ARCH_VALIDATE_IRQ
static inline int valid_irq(unsigned int irq)
{
return irq ? 1 : 0;
}
#endif

in linux/interrupt.h (around request_irq).

And it doesn't need to be a __must_check. There's no point -- it has
no side-effects. The only reason to call it is if you want the answer
to the question. You had the sense of the return code wrong too; you
want to use it as:

int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
unsigned long flags, const char *name, void *data)
{
if (!valid_irq(pdev->irq)) {
dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
return -EINVAL;
}

return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
}
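
The point about the sense of the return value can be tried outside the kernel. The sketch below is a userspace mock-up, not kernel code: struct pci_dev, irq_handler_t, request_irq() and the IRQF_SHARED value are simplified stand-ins, and mock_pci_request_irq() is a hypothetical name. It only shows that a predicate named valid_irq() reads naturally when it returns non-zero for a usable irq, leaving the choice of errno to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumptions, not the real ones). */
struct pci_dev {
	unsigned int irq;
};
typedef int (*irq_handler_t)(int irq, void *data);

#define IRQF_SHARED 0x80UL	/* placeholder value for this mock only */

/* Boolean predicate as suggested above: non-zero means usable. */
static inline int valid_irq(unsigned int irq)
{
	return irq ? 1 : 0;
}

/* Mock request_irq(): accepts everything and remembers the flags it saw. */
static unsigned long last_flags;

static int request_irq(unsigned int irq, irq_handler_t handler,
		       unsigned long flags, const char *name, void *data)
{
	(void)irq;
	(void)handler;
	(void)name;
	(void)data;
	last_flags = flags;
	return 0;
}

/* Hypothetical wrapper mirroring the shape proposed above. */
static int mock_pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
				unsigned long flags, const char *name,
				void *data)
{
	assert(pdev != NULL);

	if (!valid_irq(pdev->irq))
		return -EINVAL;	/* the caller of the predicate picks the errno */

	return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
}
```

A device with irq 0 is rejected with -EINVAL before the mock request_irq() is reached; any other value goes through with IRQF_SHARED added to the caller's flags.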

2006-10-01 19:06:21

by Arjan van de Ven

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

> .
>
> And it doesn't need to be a __must_check. There's no point -- it has
> no side-effects. The only reason to call it is if you want the answer
> to the question. You had the sense of the return code wrong too; you
> want to use it as:
>
> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> unsigned long flags, const char *name, void *data)
> {
> if (!valid_irq(pdev->irq)) {
> dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> return -EINVAL;
> }
>
> return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
> }


well... why not go one step further and eliminate the flags argument
entirely? And use pci_name() for the name (so eliminate the argument ;)
and always pass pdev as data, so that that argument can go away too....

that'll cover 99% of the request_irq() users for pci devices.. and makes
it really nicely simple and consistent.

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com

2006-10-01 19:20:10

by Jeff Garzik

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

Arjan van de Ven wrote:
> well... why not go one step further and eliminate the flags argument
> entirely? And use pci_name() for the name (so eliminate the argument ;)
> and always pass pdev as data, so that that argument can go away too....
>
> that'll cover 99% of the request_irq() users for pci devices.. and makes
> it really nicely simple and consistent.

Disagree. That would involve rewriting a lot of drivers.

flags: may or may not need sample-random flag.

name: is always the ethernet interface, for net drivers, or did you
forget from your irqbalance days? ;-)

data: in practice, is _rarely_ struct pci_dev. It's usually a
driver-private structure which is the structure most frequently
accessed. struct pci_dev* is rarely accessed inside the interrupt
handler, except maybe somewhere deep in an error handling path.

Jeff

2006-10-01 19:33:13

by Frederik Deweerdt

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sat, Sep 30, 2006 at 07:58:18PM -0400, Jeff Garzik wrote:
> Frederik Deweerdt wrote:
> >On Sat, Sep 30, 2006 at 12:43:24AM +0100, Alan Cox wrote:
> >>Ar Gwe, 2006-09-29 am 23:50 +0000, ysgrifennodd Frederik Deweerdt:
> >>>Does this patch make sense in that case? If yes, I'll put up a patch
> >>>for the remaining cases in the drivers/scsi/aic7xxx/ directory.
> >>>Also, aic7xxx's coding style would put parentheses around the returned
> >>>value, should I follow it?
> >>Yes - but perhaps with a warning message so users know why ?
> >>
> >>As to coding style - kernel style is unbracketed so I wouldn't worry
> >>about either.
> >>
> >Thanks for the advice. The following patch checks whether the irq is valid before issuing a
> >request_irq() for AIC7XXX and AIC79XX. An error message is displayed to
> >let the user know what went wrong.
> >Regards,
> >Frederik
> >Signed-off-by: Frederik Deweerdt <[email protected]>
>
> Actually, rather than adding this check to every driver, I would rather do something like the attached patch: create a
> pci_request_irq(), and pass a struct pci_device to it. Then the driver author doesn't have to worry about such details.
>
That's better, indeed.
[...]
> +#ifndef ARCH_VALIDATE_PCI_IRQ
> +int pci_valid_irq(struct pci_dev *pdev)
> +{
> + if (pdev->irq == 0)
> + return -EINVAL;
^^^^^^
Wouldn't this rather be ENODEV? Admittedly, from pci_valid_irq() (or
is_irq_valid()) point of view, it _has_ been passed an invalid value. But
from userspace's point of view, it's like the device was not present.

Regards,
Frederik

2006-10-01 19:35:18

by Arjan van de Ven

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sun, 2006-10-01 at 15:19 -0400, Jeff Garzik wrote:
> Arjan van de Ven wrote:
> > well... why not go one step further and eliminate the flags argument
> > entirely? And use pci_name() for the name (so eliminate the argument ;)
> > and always pass pdev as data, so that that argument can go away too....
> >
> > that'll cover 99% of the request_irq() users for pci devices.. and makes
> > it really nicely simple and consistent.
>
> Disagree. That would involve rewriting a lot of drivers.
>
> flags: may or may not need sample-random flag.

ok fair.. but I'd then almost call it "samplerandom" not "flags"...


>
> name: is always the ethernet interface, for net drivers, or did you
> forget from your irqbalance days? ;-)

I'd say the "always" isn't quite true .. I remember that well.
If it's always the pci device at least irqbalance can look up the device
type in sysfs ;)


> data: in practice, is _rarely_ struct pci_dev. It's usually a
> driver-private structure which is the structure most frequently
> accessed. struct pci_dev* is rarely accessed inside the interrupt
> handler, except maybe somewhere deep in an error handling path.

hmmm could put a pointer to the private data in the pci_dev at least...
that'd be generally useful, and then this can either just pass that,
or have the isr get to it that way (whichever makes more sense)

2006-10-01 19:36:19

by Matthew Wilcox

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

On Sun, Oct 01, 2006 at 09:05:23PM +0200, Arjan van de Ven wrote:
> > int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> > unsigned long flags, const char *name, void *data)
> > {
> > if (!valid_irq(pdev->irq)) {
> > dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> > return -EINVAL;
> > }
> >
> > return request_irq(pdev->irq, handler, flags | IRQF_SHARED, name, data);
> > }
>
> well... why not go one step further and eliminate the flags argument
> entirely? And use pci_name() for the name (so eliminate the argument ;)
> and always pass pdev as data, so that that argument can go away too....
>
> that'll cover 99% of the request_irq() users for pci devices.. and makes
> it really nicely simple and consistent.

hmm. $ echo `cut -c34- /proc/interrupts`
timer i8042 cascade acpi yenta, ehci_hcd:usb1, Intel 82801DB-ICH4 yenta,
uhci_hcd:usb2 uhci_hcd:usb4, eth0 ide0 uhci_hcd:usb3, eth1

Network drivers use their eth%d name. USB drivers use [eu]hci_hcd:usb%d.
Others tend to use the driver name. Changing them all to be 0000:00:1d.2
isn't really an improvement in the readability of /proc/interrupts, IMO.

Passing pdev as the data is a good idea for practically no device driver.
It's rare to actually want the pci_device down in the interrupt handler;
normally you want the device private data. Using pci_get_drvdata(pdev)
as the data would make sense for both sym2 and tg3. I don't feel like
auditing other drivers to see if it'd make sense for them too.

So, current proposal:

int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
const char *name)
{
if (!valid_irq(pdev->irq)) {
dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
return -EINVAL;
}

return request_irq(pdev->irq, handler, IRQF_SHARED, name,
pci_get_drvdata(pdev));
}

But what about IRQF_SAMPLE_RANDOM?

2006-10-01 19:43:09

by Jeff Garzik

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)

Matthew Wilcox wrote:
> Others tend to use the driver name. Changing them all to be 0000:00:1d.2
> isn't really an improvement in the readability of /proc/interrupts, IMO.

agreed


> Passing pdev as the data is a good idea for practically no device driver.

agreed


> It's rare to actually want the pci_device down in the interrupt handler;
> normally you want the device private data. Using pci_get_drvdata(pdev)
> as the data would make sense for both sym2 and tg3. I don't feel like

Using pci_get_drvdata() is a pretty good idea


> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> const char *name)
> {
> if (!valid_irq(pdev->irq)) {
> dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> return -EINVAL;
> }
>
> return request_irq(pdev->irq, handler, IRQF_SHARED, name,
> pci_get_drvdata(pdev));
> }
>
> But what about IRQF_SAMPLE_RANDOM?

I still like having a flags argument though. It's enough of an open
question, and I bet there will be a new flag or two in the future that
PCI drivers will want to use.

Jeff


2006-10-02 02:13:09

by Arjan van de Ven

Subject: Re: [-mm patch] aic7xxx: check irq validity (was Re: 2.6.18-mm2)


> Network drivers use their eth%d name. USB drivers use [eu]hci_hcd:usb%d.
> Others tend to use the driver name. Changing them all to be 0000:00:1d.2
> isn't really an improvement in the readability of /proc/interrupts, IMO.

hmm ok; how about allowing name to be NULL, and if it's NULL, use the
pci name?

>
> So, current proposal:
>
> int pci_request_irq(struct pci_dev *pdev, irq_handler_t handler,
> const char *name)
> {
> if (!valid_irq(pdev->irq)) {
> dev_printk(KERN_ERR, &pdev->dev, "invalid irq\n");
> return -EINVAL;
> }
>
> return request_irq(pdev->irq, handler, IRQF_SHARED, name,
> pci_get_drvdata(pdev));
> }
>
> But what about IRQF_SAMPLE_RANDOM?

that's a tough question. I'd almost suggest making such things
properties of the pdev, but sample-random is so far away from PCI
related that it makes no sense I suppose ;(

(others do I think)

One other interesting question is if this function can/should be used to
use MSI transparently (after pci_enable_msi() obviously)

--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com

2006-10-02 13:48:03

by Peter Zijlstra

Subject: Re: md deadlock (was Re: 2.6.18-mm2)

On Fri, 2006-09-29 at 16:03 +0200, Peter Zijlstra wrote:
> On Fri, 2006-09-29 at 22:52 +1000, Neil Brown wrote:
> > On Friday September 29, [email protected] wrote:
> > > On Thu, 2006-09-28 at 13:54 +0200, Michal Piotrowski wrote:
> > >
> > > Looks like a real deadlock here. It seems to me #2 is the easiest to
> > > break.
> >
> > I guess it could deadlock if you tried to add /dev/md0 as a component
> > of /dev/md0. I should probably check for that somewhere.
> > In other cases the array->member ordering ensures there is no
> > deadlock.
> >
>
>
> 1 2
>
> open(/dev/md0)
>
> open(/dev/md0)
> - do_open() -> bdev->bd_mutex
> ioctl(/dev/md0, hotadd)
> - md_ioctl() -> mddev->reconfig_mutex
> -- hot_add_disk()
> --- bind_rdev_to_array()
> ---- bd_claim_by_disk()
> ----- bd_claim_by_kobject()
> -- md_open()
> --- mddev_lock()
> ---- mutex_lock(mddev->reconfig_mutex)
> ------ mutex_lock(bdev->bd_mutex)
>

D'oh, 1:bdev->bd_mutex is of course rdev->bd_mutex; the slave device's
mutex.

So mddev->bd_mutex wants to be another class altogether.

2006-10-02 17:56:39

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> On Fri, 29 Sep 2006 20:01:54 -0400
> >
> > % grep ioctl /tmp/foo2 | sort -u | more
> > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
> > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
> > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0
>
> Yes. The main thing which those WE-21 patches do is to shorten the size of
> various buffers which are used in wireless ioctls.

Ok, I've found it. Actually, I feel ashamed, as it is a fairly
classical buffer overflow: we put one extra char in a buffer. Now, I
don't understand why it did not blow up on my box ;-)
New patch. I think it is right, but I would not mind Pavel having
a look at it. On my box it does not make things worse.
Valdis: would you mind checking whether this patch fixes the problem
you are seeing with WE-21? If it fixes it, I'll send it to John...
Have fun...

Jean

P.S. : I'll audit the other wireless drivers for the same thing.

-------------------------------------------------

diff -u -p linux/drivers/net/wireless/orinoco.j1.c linux/drivers/net/wireless/orinoco.c
--- linux/drivers/net/wireless/orinoco.j1.c 2006-10-02 10:15:41.000000000 -0700
+++ linux/drivers/net/wireless/orinoco.c 2006-10-02 10:39:20.000000000 -0700
@@ -2456,6 +2456,7 @@ void free_orinocodev(struct net_device *
/* Wireless extensions */
/********************************************************************/

+/* Return : < 0 -> error code ; >= 0 -> length */
static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
char buf[IW_ESSID_MAX_SIZE+1])
{
@@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
len = le16_to_cpu(essidbuf.len);
BUG_ON(len > IW_ESSID_MAX_SIZE);

- memset(buf, 0, IW_ESSID_MAX_SIZE+1);
+ memset(buf, 0, IW_ESSID_MAX_SIZE);
memcpy(buf, p, len);
- buf[len] = '\0';
+ err = len;

fail_unlock:
orinoco_unlock(priv, &flags);
@@ -3026,17 +3027,18 @@ static int orinoco_ioctl_getessid(struct

if (netif_running(dev)) {
err = orinoco_hw_get_essid(priv, &active, essidbuf);
- if (err)
+ if (err < 0)
return err;
+ erq->length = err;
} else {
if (orinoco_lock(priv, &flags) != 0)
return -EBUSY;
- memcpy(essidbuf, priv->desired_essid, IW_ESSID_MAX_SIZE + 1);
+ memcpy(essidbuf, priv->desired_essid, IW_ESSID_MAX_SIZE);
+ erq->length = strlen(priv->desired_essid);
orinoco_unlock(priv, &flags);
}

erq->flags = 1;
- erq->length = strlen(essidbuf);

return 0;
}
@@ -3074,10 +3076,10 @@ static int orinoco_ioctl_getnick(struct
if (orinoco_lock(priv, &flags) != 0)
return -EBUSY;

- memcpy(nickbuf, priv->nick, IW_ESSID_MAX_SIZE+1);
+ memcpy(nickbuf, priv->nick, IW_ESSID_MAX_SIZE);
orinoco_unlock(priv, &flags);

- nrq->length = strlen(nickbuf);
+ nrq->length = strlen(priv->nick);

return 0;
}
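
The off-by-one fixed above can be reduced to a small userspace sketch. It assumes the destination buffer holds exactly the maximum ESSID length with no spare byte for a terminator, which is how Jean describes the situation after the WE-21 changes. ESSID_MAX and copy_essid() are hypothetical names, and the value 32 is an assumption standing in for IW_ESSID_MAX_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ESSID_MAX 32	/* stands in for IW_ESSID_MAX_SIZE; value assumed */

/*
 * Copy an ESSID of len bytes into buf, which holds ESSID_MAX bytes and
 * has no room for a trailing NUL. Returns the length on success or -1
 * if the ESSID does not fit, so the caller never needs strlen().
 */
static int copy_essid(char buf[ESSID_MAX], const char *src, size_t len)
{
	assert(buf != NULL && src != NULL);

	if (len > ESSID_MAX)
		return -1;

	memset(buf, 0, ESSID_MAX);
	memcpy(buf, src, len);
	/*
	 * No "buf[len] = '\0'" here: with len == ESSID_MAX that store
	 * would land one byte past the end of the buffer.
	 */
	return (int)len;
}
```

Returning the length, as the patch does through err and erq->length, removes the need for a terminator and with it the extra byte.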

2006-10-02 18:02:16

by Frederik Deweerdt

Subject: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)

Hi all,

I've tried to summarize the different proposals made by Jeff Garzik,
Matthew Wilcox and Arjan van de Ven in the "[-mm patch] aic7xxx: check
irq validity" thread. I've also:
- added some kerneldoc
- renamed valid_irq() to is_irq_valid()
- added pci_free_irq() (called pci_release_irq() in Jeff's patch).

I'll send a follow-up patch showing the implied modifications for the
following - semi-randomly chosen :) - drivers: aic7xxx, aic79xx, tg3
and drm.

Regards,
Frederik

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a544997..ae20a3a 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -15,6 +15,7 @@ #include <linux/init.h>
#include <linux/pci.h>
#include <linux/module.h>
#include <linux/spinlock.h>
+#include <linux/interrupt.h>
#include <linux/string.h>
#include <asm/dma.h> /* isa_dma_bridge_buggy */
#include "pci.h"
@@ -810,6 +811,49 @@ err_out:
}

/**
+ * pci_request_irq - Reserve an IRQ for a PCI device
+ * @pdev: The PCI device whose irq is to be reserved
+ * @handler: The interrupt handler function,
+ * pci_get_drvdata(pdev) shall be passed as an argument to that function
+ * @flags: The flags to be passed to request_irq()
+ * @name: The name of the device to be associated with the irq
+ *
+ * Returns 0 on success, or a negative value on error. A warning
+ * message is also printed on failure.
+ */
+int pci_request_irq(struct pci_dev *pdev,
+ irqreturn_t (*handler)(int, void *, struct pt_regs *),
+ unsigned long flags, const char *name)
+{
+ int rc;
+ const char *actual_name = name;
+
+ rc = is_irq_valid(pdev->irq);
+ if (!rc) {
+ dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
+ return -EINVAL;
+ }
+
+ if (!actual_name)
+ actual_name = pci_name(pdev);
+
+ return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
+ actual_name, pci_get_drvdata(pdev));
+}
+EXPORT_SYMBOL(pci_request_irq);
+
+/**
+ * pci_free_irq - releases the interrupt line reserved to the PCI
+ * device pointed by @pdev
+ * @pdev: the PCI device whose interrupt is to be freed
+ */
+void pci_free_irq(struct pci_dev *pdev)
+{
+ free_irq(pdev->irq, pci_get_drvdata(pdev));
+}
+EXPORT_SYMBOL(pci_free_irq);
+
+/**
* pci_set_master - enables bus-mastering for device dev
* @dev: the PCI device to enable
*
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 1f97e3d..c320b50 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -75,6 +75,13 @@ struct irqaction {
struct proc_dir_entry *dir;
};

+#ifndef ARCH_VALIDATE_PCI_IRQ
+static inline int is_irq_valid(unsigned int irq)
+{
+ return irq ? 1 : 0;
+}
+#endif /* ARCH_VALIDATE_PCI_IRQ */
+
extern irqreturn_t no_action(int cpl, void *dev_id, struct pt_regs *regs);
extern int request_irq(unsigned int,
irqreturn_t (*handler)(int, void *, struct pt_regs *),
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5bc4659..5e0f07a 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -52,6 +52,7 @@ #include <linux/list.h>
#include <linux/compiler.h>
#include <linux/errno.h>
#include <linux/device.h>
+#include <linux/interrupt.h>

/* File state for mmap()s on /proc/bus/pci/X/Y */
enum pci_mmap_state {
@@ -531,6 +532,11 @@ void pci_release_regions(struct pci_dev
int __must_check pci_request_region(struct pci_dev *, int, const char *);
void pci_release_region(struct pci_dev *, int);

+int __must_check pci_request_irq(struct pci_dev *pdev,
+ irqreturn_t (*handler)(int, void *, struct pt_regs *),
+ unsigned long flags, const char *name);
+void pci_free_irq(struct pci_dev *pdev);
+
/* drivers/pci/bus.c */
int __must_check pci_bus_alloc_resource(struct pci_bus *bus,
struct resource *res, resource_size_t size,
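
One consequence of taking the cookie from pci_get_drvdata() is an ordering rule for drivers: the driver data has to be set before the irq is requested, and it must still be the same pointer when the irq is freed. The userspace sketch below mocks just enough to show that, along with the NULL-name fallback. Everything in it is a stand-in: the struct layout, the mock_ helpers and the _sketch suffixes are hypothetical, the handler argument is omitted for brevity, and the free helper returns a status only so the mismatch is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel structure (assumption). */
struct pci_dev {
	unsigned int irq;
	const char *bus_name;	/* what pci_name() would report */
	void *drvdata;
};

/* Record of the single registration this mock can hold. */
struct irq_reg {
	unsigned int irq;
	const char *name;
	void *cookie;
	int active;
};
static struct irq_reg last_reg;

static void pci_set_drvdata(struct pci_dev *pdev, void *data)
{
	pdev->drvdata = data;
}

static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->drvdata;
}

static const char *pci_name(struct pci_dev *pdev)
{
	return pdev->bus_name;
}

static int mock_request_irq(unsigned int irq, const char *name, void *cookie)
{
	last_reg.irq = irq;
	last_reg.name = name;
	last_reg.cookie = cookie;
	last_reg.active = 1;
	return 0;
}

/* Succeeds only if (irq, cookie) matches the recorded registration. */
static int mock_free_irq(unsigned int irq, void *cookie)
{
	if (!last_reg.active || last_reg.irq != irq || last_reg.cookie != cookie)
		return -ENOENT;
	last_reg.active = 0;
	return 0;
}

static int pci_request_irq_sketch(struct pci_dev *pdev, const char *name)
{
	assert(pdev != NULL);

	if (!pdev->irq)
		return -EINVAL;
	if (!name)
		name = pci_name(pdev);	/* fallback when no name is given */

	/* The cookie is sampled here, at request time. */
	return mock_request_irq(pdev->irq, name, pci_get_drvdata(pdev));
}

static int pci_free_irq_sketch(struct pci_dev *pdev)
{
	assert(pdev != NULL);

	/* The cookie is sampled again here and must not have changed. */
	return mock_free_irq(pdev->irq, pci_get_drvdata(pdev));
}
```

In this mock, clearing the driver data between the request and the free makes the free fail to find its registration. A converted driver would have to keep pci_set_drvdata(pdev, NULL) after its pci_free_irq() call for the same reason.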

2006-10-02 18:08:34

by Frederik Deweerdt

Subject: [RFC PATCH] move aic7xxx to pci_request_irq

Hi,

This proof-of-concept patch converts the aic7xxx drivers to use the
pci_request_irq() function.

Regards,
Frederik


diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
index 2001fe8..c934f30 100644
--- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
@@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
{
int error;

- error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
- IRQF_SHARED, "aic79xx", ahd);
+ error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
+ IRQF_SHARED, "aic79xx");
if (!error)
ahd->platform_data->irq = ahd->dev_softc->irq;

- return (-error);
+ return error;
}

void
diff --git a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
index ea5687d..d5c402e 100644
--- a/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
+++ b/drivers/scsi/aic7xxx/aic7xxx_osm_pci.c
@@ -368,16 +368,14 @@ ahc_pci_map_registers(struct ahc_softc *
return (error);
}

-int
-ahc_pci_map_int(struct ahc_softc *ahc)
+int ahc_pci_map_int(struct ahc_softc *ahc)
{
int error;

- error = request_irq(ahc->dev_softc->irq, ahc_linux_isr,
- IRQF_SHARED, "aic7xxx", ahc);
+ error = pci_request_irq(ahc->dev_softc, ahc_linux_isr, IRQF_SHARED,
+ "aic7xxx");
if (error == 0)
ahc->platform_data->irq = ahc->dev_softc->irq;
-
- return (-error);
-}

+ return error;
+}

2006-10-02 18:13:07

by Frederik Deweerdt

Subject: [RFC PATCH] move tg3 to pci_request_irq

Hi,

This proof-of-concept patch converts the tg3 driver to use the
pci_request_irq() function.

Regards,
Frederik


diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index c25ba27..23660c6 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6838,9 +6838,9 @@ restart_timer:

static int tg3_request_irq(struct tg3 *tp)
{
+ struct net_device *dev = tp->dev;
irqreturn_t (*fn)(int, void *, struct pt_regs *);
unsigned long flags;
- struct net_device *dev = tp->dev;

if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
fn = tg3_msi;
@@ -6853,7 +6853,7 @@ static int tg3_request_irq(struct tg3 *t
fn = tg3_interrupt_tagged;
flags = IRQF_SHARED | IRQF_SAMPLE_RANDOM;
}
- return (request_irq(tp->pdev->irq, fn, flags, dev->name, dev));
+ return pci_request_irq(tp->pdev, fn, flags, dev->name);
}

static int tg3_test_interrupt(struct tg3 *tp)
@@ -6866,10 +6866,10 @@ static int tg3_test_interrupt(struct tg3

tg3_disable_ints(tp);

- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);

- err = request_irq(tp->pdev->irq, tg3_test_isr,
- IRQF_SHARED | IRQF_SAMPLE_RANDOM, dev->name, dev);
+ err = pci_request_irq(tp->pdev, tg3_test_isr,
+ IRQF_SHARED | IRQF_SAMPLE_RANDOM, dev->name);
if (err)
return err;

@@ -6897,7 +6897,7 @@ static int tg3_test_interrupt(struct tg3

tg3_disable_ints(tp);

- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);

err = tg3_request_irq(tp);

@@ -6915,7 +6915,6 @@ static int tg3_test_interrupt(struct tg3
*/
static int tg3_test_msi(struct tg3 *tp)
{
- struct net_device *dev = tp->dev;
int err;
u16 pci_cmd;

@@ -6946,7 +6945,7 @@ static int tg3_test_msi(struct tg3 *tp)
"the PCI maintainer and include system chipset information.\n",
tp->dev->name);

- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);
pci_disable_msi(tp->pdev);

tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;
@@ -6966,7 +6965,7 @@ static int tg3_test_msi(struct tg3 *tp)
tg3_full_unlock(tp);

if (err)
- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);

return err;
}
@@ -7051,7 +7050,7 @@ static int tg3_open(struct net_device *d
tg3_full_unlock(tp);

if (err) {
- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);
if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
pci_disable_msi(tp->pdev);
tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;
@@ -7363,7 +7362,7 @@ #endif

tg3_full_unlock(tp);

- free_irq(tp->pdev->irq, dev);
+ pci_free_irq(tp->pdev);
if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
pci_disable_msi(tp->pdev);
tp->tg3_flags2 &= ~TG3_FLG2_USING_MSI;

2006-10-02 18:13:58

by Frederik Deweerdt

Subject: [RFC PATCH] move drm to pci_request_irq

Hi,

This proof-of-concept patch converts the drm driver to use the
pci_request_irq() function.

Regards,
Frederik



diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
index b366c5b..5b000cd 100644
--- a/drivers/char/drm/drm_drv.c
+++ b/drivers/char/drm/drm_drv.c
@@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
}
mutex_unlock(&dev->struct_mutex);

+ pci_set_drvdata(dev, NULL);
+
DRM_DEBUG("lastclose completed\n");
return 0;
}
diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
index 4553a3a..5dd12cb 100644
--- a/drivers/char/drm/drm_irq.c
+++ b/drivers/char/drm/drm_irq.c
@@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
sh_flags = IRQF_SHARED;

- ret = request_irq(dev->irq, dev->driver->irq_handler,
- sh_flags, dev->devname, dev);
+ pci_set_drvdata(dev->pdev, dev);
+
+ ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
+ sh_flags, dev->devname);
if (ret < 0) {
mutex_lock(&dev->struct_mutex);
dev->irq_enabled = 0;
@@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev

dev->driver->irq_uninstall(dev);

- free_irq(dev->irq, dev);
+ pci_free_irq(dev->pdev);

return 0;
}

2006-10-02 18:15:24

by Matthew Wilcox

Subject: Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)

On Mon, Oct 02, 2006 at 08:00:48PM +0000, Frederik Deweerdt wrote:
> /**
> + * pci_request_irq - Reserve an IRQ for a PCI device
> + * @pdev: The PCI device whose irq is to be reserved
> + * handler: The interrupt handler function,

> + * pci_get_drvdata(pdev) shall be passed as an argument to that function

I don't think you can (or should) do this. Move it to the body of the
comment below.

> + * @flags: The flags to be passed to request_irq()
> + * @name: The name of the device to be associated with the irq
> + *
> + * Returns 0 on success, or a negative value on error. A warning
> + * message is also printed on failure.
> + */
> +int pci_request_irq(struct pci_dev *pdev,
> + irqreturn_t (*handler)(int, void *, struct pt_regs *),
> + unsigned long flags, const char *name)
> +{
> + int rc;
> + const char *actual_name = name;
> +
> + rc = is_irq_valid(pdev->irq);
> + if (!rc) {
> + dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> + return -EINVAL;
> + }

Why is that more readable than

	if (!is_irq_valid(pdev->irq)) {
		dev_err(&pdev->dev, "invalid irq #%d\n", pdev->irq);
		return -EINVAL;
	}

> + if (!actual_name)
> + actual_name = pci_name(pdev);
> +
> + return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> + actual_name, pci_get_drvdata(pdev));

The driver name is a far more common usage than the pci_name.

	return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
			   name ? name : pdev->driver->name,
			   pci_get_drvdata(pdev));

2006-10-02 18:27:47

by Matthew Wilcox

Subject: Re: [RFC PATCH] move aic7xxx to pci_request_irq

On Mon, Oct 02, 2006 at 08:07:03PM +0000, Frederik Deweerdt wrote:
> +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
> {
> int error;
>
> - error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> - IRQF_SHARED, "aic79xx", ahd);
> + error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> + IRQF_SHARED, "aic79xx");
> if (!error)
> ahd->platform_data->irq = ahd->dev_softc->irq;
>
> - return (-error);
> + return error;

Seems unsafe to me. Unless you want to trace through the whole driver
changing its internal conventions to use negative errnos like the rest
of the kernel.

> -
> - return (-error);
> -}
>
> + return error;
> +}

Ditto.

2006-10-02 18:28:51

by Matthew Wilcox

Subject: Re: [RFC PATCH] move tg3 to pci_request_irq

On Mon, Oct 02, 2006 at 08:11:34PM +0000, Frederik Deweerdt wrote:
> @@ -6838,9 +6838,9 @@ restart_timer:
>
> static int tg3_request_irq(struct tg3 *tp)
> {
> + struct net_device *dev = tp->dev;
> irqreturn_t (*fn)(int, void *, struct pt_regs *);
> unsigned long flags;
> - struct net_device *dev = tp->dev;
>
> if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
> fn = tg3_msi;

Is there any reason for this noise?

2006-10-02 18:37:51

by Matthew Wilcox

Subject: Re: [RFC PATCH] move drm to pci_request_irq

On Mon, Oct 02, 2006 at 08:12:29PM +0000, Frederik Deweerdt wrote:
>
> + pci_set_drvdata(dev, NULL);
> +
> DRM_DEBUG("lastclose completed\n");

Not necessary. pci_devs are allocated initialised to 0.

> @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
> if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> sh_flags = IRQF_SHARED;
>
> - ret = request_irq(dev->irq, dev->driver->irq_handler,
> - sh_flags, dev->devname, dev);
> + pci_set_drvdata(dev->pdev, dev);
> +
> + ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> + sh_flags, dev->devname);

This seems like the wrong place to be setting the pci_drvdata. It
should probably be done in each driver. But then, requesting the IRQ
should also be done by each driver. You've dragged us into the "wow,
what a mess DRI is" black hole here, I'm afraid.

2006-10-02 19:04:15

by Frederik Deweerdt

Subject: Re: [RFC PATCH] move aic7xxx to pci_request_irq

On Mon, Oct 02, 2006 at 12:27:44PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:07:03PM +0000, Frederik Deweerdt wrote:
> > +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> > @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
> > {
> > int error;
> >
> > - error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> > - IRQF_SHARED, "aic79xx", ahd);
> > + error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> > + IRQF_SHARED, "aic79xx");
> > if (!error)
> > ahd->platform_data->irq = ahd->dev_softc->irq;
> >
> > - return (-error);
> > + return error;
>
> Seems unsafe to me.
It is, it slipped through the patches, I didn't mean to send it to the
list :(. Please ignore that.
>
> > -
> > - return (-error);
> > -}
> >
> > + return error;
> > +}
>
> Ditto.

2006-10-02 19:05:58

by Frederik Deweerdt

Subject: Re: [RFC PATCH] move tg3 to pci_request_irq

On Mon, Oct 02, 2006 at 12:28:47PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:11:34PM +0000, Frederik Deweerdt wrote:
> > @@ -6838,9 +6838,9 @@ restart_timer:
> >
> > static int tg3_request_irq(struct tg3 *tp)
> > {
> > + struct net_device *dev = tp->dev;
> > irqreturn_t (*fn)(int, void *, struct pt_regs *);
> > unsigned long flags;
> > - struct net_device *dev = tp->dev;
> >
> > if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
> > fn = tg3_msi;
>
> Is there any reason for this noise?
You mean, besides my awkwardness ? ;)

2006-10-02 19:08:45

by Frederik Deweerdt

Subject: Re: [RFC PATCH] move drm to pci_request_irq

On Mon, Oct 02, 2006 at 12:37:49PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:12:29PM +0000, Frederik Deweerdt wrote:
> >
> > + pci_set_drvdata(dev, NULL);
> > +
> > DRM_DEBUG("lastclose completed\n");
>
> Not necessary. pci_devs are allocated initialised to 0.
Actually, this is the exit path, I felt like it could be safer if it was
set to NULL before freeing it.
>
> > @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
> > if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> > sh_flags = IRQF_SHARED;
> >
> > - ret = request_irq(dev->irq, dev->driver->irq_handler,
> > - sh_flags, dev->devname, dev);
> > + pci_set_drvdata(dev->pdev, dev);
> > +
> > + ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> > + sh_flags, dev->devname);
>
> This seems like the wrong place to be setting the pci_drvdata. It
> should probably be done in each driver. But then, requesting the IRQ
> should also be done by each driver. You've dragged us into the "wow,
> what a mess DRI is" black hole here, I'm afraid.
I must admit that I had no idea where to initialize it. Do you have a
better place in mind?

2006-10-02 19:11:18

by Frederik Deweerdt

Subject: Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)

On Mon, Oct 02, 2006 at 12:15:22PM -0600, Matthew Wilcox wrote:
> On Mon, Oct 02, 2006 at 08:00:48PM +0000, Frederik Deweerdt wrote:
> > /**
> > + * pci_request_irq - Reserve an IRQ for a PCI device
> > + * @pdev: The PCI device whose irq is to be reserved
> > + * handler: The interrupt handler function,
>
> > + * pci_get_drvdata(pdev) shall be passed as an argument to that function
>
> I don't think you can (or should) do this. Move it to the body of the
> comment below.
OK, thanks for pointing this out, will do.
>
> > + * @flags: The flags to be passed to request_irq()
> > + * @name: The name of the device to be associated with the irq
> > + *
> > + * Returns 0 on success, or a negative value on error. A warning
> > + * message is also printed on failure.
> > + */
> > +int pci_request_irq(struct pci_dev *pdev,
> > + irqreturn_t (*handler)(int, void *, struct pt_regs *),
> > + unsigned long flags, const char *name)
> > +{
> > + int rc;
> > + const char *actual_name = name;
> > +
> > + rc = is_irq_valid(pdev->irq);
> > + if (!rc) {
> > + dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> > + return -EINVAL;
> > + }
>
> Why is that more readable than
>
> if (!is_irq_valid(pdev->irq)) {
> dev_err(&pdev->dev, "invalid irq #%d\n", pdev->irq);
> return -EINVAL;
> }
>
Better too.
> > + if (!actual_name)
> > + actual_name = pci_name(pdev);
> > +
> > + return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> > + actual_name, pci_get_drvdata(pdev));
>
> The driver name is a far more common usage than the pci_name.
>
> return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> name ? name : pdev->driver->name,
> pci_get_drvdata(pdev));
OK, thanks for the feedback,
Frederik

2006-10-02 20:00:15

by Valdis Klētnieks

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Mon, 02 Oct 2006 10:52:45 PDT, Jean Tourrilhes said:
> On Fri, Sep 29, 2006 at 06:20:08PM -0700, Andrew Morton wrote:
> > On Fri, 29 Sep 2006 20:01:54 -0400
> > >
> > > % grep ioctl /tmp/foo2 | sort -u | more
> > > ioctl(13, SIOCGIWESSID, 0xbfbcdb9c) = 0
> > > ioctl(13, SIOCGIWRANGE, 0xbfbcdbdc) = 0
> > > ioctl(13, SIOCGIWRATE, 0xbfbcdbbc) = 0
> >
> > Yes. The main thing which those WE-21 patches do is to shorten the size of
> > various buffers which are used in wireless ioctls.
>
> Ok, I've found it. Actually, I feel ashamed, as it is a fairly
> classical buffer overflow, we put one extra char in a buffer. Now, I
> don't understand why it did not blow up on my box ;-)
> New patch. I think it is right, but I would not mind Pavel having
> a look at it. On my box it does not make things worse.
> Valdis: would you mind checking whether this patch fixes the problem
> you are seeing with WE-21? If it fixes it, I'll send it to John...

Been up and running with we-21 configured in, and gkrellm doing the monitoring
that gave it indigestion. It was dying in 1-2 minutes, now been up for 30 mins
with no issues....



2006-10-02 20:12:12

by Alan

Subject: Re: [RFC PATCH] move drm to pci_request_irq

Ar Llu, 2006-10-02 am 20:12 +0000, ysgrifennodd Frederik Deweerdt:
> Hi,
>
> This proof-of-concept patch converts the drm driver to use the
> pci_request_irq() function.

0 isn't invalid - it means no IRQ was assigned so wants a different
message.

2006-10-02 20:27:53

by Frederik Deweerdt

Subject: Re: [RFC PATCH] move drm to pci_request_irq

On Mon, Oct 02, 2006 at 09:36:38PM +0100, Alan Cox wrote:
> Ar Llu, 2006-10-02 am 20:12 +0000, ysgrifennodd Frederik Deweerdt:
> > Hi,
> >
> > This proof-of-concept patch converts the drm driver to use the
> > pci_request_irq() function.
>
> 0 isn't invalid - it means no IRQ was assigned so wants a different
> message.
>
I understand, what about:

("No usable irq line was found (got #%d)\n", irqno)

This is generic enough, so that if on some arches a given irq (other
than 0) is invalid, the message still makes sense.

2006-10-02 23:54:10

by Dave Airlie

Subject: Re: [RFC PATCH] move drm to pci_request_irq

On 10/3/06, Frederik Deweerdt <[email protected]> wrote:
> Hi,
>
> This proof-of-concept patch converts the drm driver to use the
> pci_request_irq() function.

NAK.
Wow nice CC'list and no DRM maintainer in sight :-)

This will break framebuffer drivers, the DRM is not a proper PCI
device driver as we don't have PCI device sharing, take a look at the
gpu-2.6.git tree on kernel.org for the "correct" solution, which needs
more attention before merging..

Dave.
>
> Regards,
> Frederik
>
>
>
> diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
> index b366c5b..5b000cd 100644
> --- a/drivers/char/drm/drm_drv.c
> +++ b/drivers/char/drm/drm_drv.c
> @@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
> }
> mutex_unlock(&dev->struct_mutex);
>
> + pci_set_drvdata(dev, NULL);
> +
> DRM_DEBUG("lastclose completed\n");
> return 0;
> }
> diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
> index 4553a3a..5dd12cb 100644
> --- a/drivers/char/drm/drm_irq.c
> +++ b/drivers/char/drm/drm_irq.c
> @@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
> if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> sh_flags = IRQF_SHARED;
>
> - ret = request_irq(dev->irq, dev->driver->irq_handler,
> - sh_flags, dev->devname, dev);
> + pci_set_drvdata(dev->pdev, dev);
> +
> + ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> + sh_flags, dev->devname);
> if (ret < 0) {
> mutex_lock(&dev->struct_mutex);
> dev->irq_enabled = 0;
> @@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev
>
> dev->driver->irq_uninstall(dev);
>
> - free_irq(dev->irq, dev);
> + pci_free_irq(dev->pdev);
>
> return 0;
> }

2006-10-03 03:46:00

by Arjan van de Ven

Subject: Re: [RFC PATCH] move aic7xxx to pci_request_irq

On Mon, 2006-10-02 at 20:07 +0000, Frederik Deweerdt wrote:
> Hi,
>
> This proof-of-concept patch converts the aic7xxx drivers to use the
> pci_request_irq() function.
>
> Regards,
> Frederik
>
>
> diff --git a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> index 2001fe8..c934f30 100644
> --- a/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> +++ b/drivers/scsi/aic7xxx/aic79xx_osm_pci.c
> @@ -341,12 +341,12 @@ ahd_pci_map_int(struct ahd_softc *ahd)
> {
> int error;
>
> - error = request_irq(ahd->dev_softc->irq, ahd_linux_isr,
> - IRQF_SHARED, "aic79xx", ahd);
> + error = pci_request_irq(ahd->dev_softc, ahd_linux_isr,
> + IRQF_SHARED, "aic79xx");
> if (!error)
> ahd->platform_data->irq = ahd->dev_softc->irq;
>
> - return (-error);
> + return error;
> }

might as well kill this entire wrapper...


2006-10-03 03:56:53

by Randy Dunlap

Subject: Re: [RFC PATCH] pci_request_irq (was [-mm patch] aic7xxx: check irq validity)

On Mon, 2 Oct 2006 20:00:48 +0000 Frederik Deweerdt wrote:

> Hi all,
>
> I've tried to summarize the different proposals made by Jeff Garzik,
> Matthew Wilcox and Arjan van de Ven in the "[-mm patch] aic7xxx: check
> irq validity" thread. I've also added:
> - some kerneldoc

The kernel-doc needs some repair -- see below.

> - renamed valid_irq to is_irq_valid()
> - added pci_release_irq().
>
> I'll send a follow-up patch showing the implied modifications for the
> following - semi-randomly chosen :) - drivers: aic7xxx, aic79xx, tg3
> and drm.
>
> Regards,
> Frederik
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a544997..ae20a3a 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -15,6 +15,7 @@ #include <linux/init.h>
> #include <linux/pci.h>
> #include <linux/module.h>
> #include <linux/spinlock.h>
> +#include <linux/interrupt.h>
> #include <linux/string.h>
> #include <asm/dma.h> /* isa_dma_bridge_buggy */
> #include "pci.h"
> @@ -810,6 +811,49 @@ err_out:
> }
>
> /**
> + * pci_request_irq - Reserve an IRQ for a PCI device
> + * @pdev: The PCI device whose irq is to be reserved
> + * handler: The interrupt handler function,

* @handler: ...

> + * pci_get_drvdata(pdev) shall be passed as an argument to that function
> + * @flags: The flags to be passed to request_irq()
> + * @name: The name of the device to be associated with the irq
> + *
> + * Returns 0 on success, or a negative value on error. A warning
> + * message is also printed on failure.
> + */
> +int pci_request_irq(struct pci_dev *pdev,
> + irqreturn_t (*handler)(int, void *, struct pt_regs *),
> + unsigned long flags, const char *name)
> +{
> + int rc;
> + const char *actual_name = name;
> +
> + rc = is_irq_valid(pdev->irq);
> + if (!rc) {
> + dev_printk(KERN_ERR, &pdev->dev, "invalid irq #%d\n", pdev->irq);
> + return -EINVAL;
> + }
> +
> + if (!actual_name)
> + actual_name = pci_name(pdev);
> +
> + return request_irq(pdev->irq, handler, flags | IRQF_SHARED,
> + actual_name, pci_get_drvdata(pdev));
> +}
> +EXPORT_SYMBOL(pci_request_irq);
> +
> +/**
> + * pci_free_irq - releases the interrupt line reserved to the PCI
> + * device pointed by @pdev

The first line is function name and <<short>> function description.
It cannot extend more than one line (combined).
If you want to use more text for function description,
you can do so after the list of parameters. See example below.

> + * @pdev: the PCI device whose interrupt is to be freed
*
* This froofroo_irq function only does this on odd phases of
* the moon.

> + */
> +void pci_free_irq(struct pci_dev *pdev)
> +{
> + free_irq(pdev->irq, pci_get_drvdata(pdev));
> +}
> +EXPORT_SYMBOL(pci_free_irq);
> +
> +/**
> * pci_set_master - enables bus-mastering for device dev
> * @dev: the PCI device to enable
> *

---
~Randy

2006-10-03 07:19:45

by Arjan van de Ven

Subject: Re: [RFC PATCH] move tg3 to pci_request_irq

On Mon, 2006-10-02 at 20:11 +0000, Frederik Deweerdt wrote:
> Hi,
>
> This proof-of-concept patch converts the tg3 driver to use the
> pci_request_irq() function.
>
> Regards,
> Frederik
>
>
> diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
> index c25ba27..23660c6 100644
> --- a/drivers/net/tg3.c
> +++ b/drivers/net/tg3.c
> @@ -6838,9 +6838,9 @@ restart_timer:
>
> static int tg3_request_irq(struct tg3 *tp)
> {
> + struct net_device *dev = tp->dev;
> irqreturn_t (*fn)(int, void *, struct pt_regs *);
> unsigned long flags;
> - struct net_device *dev = tp->dev;
>
> if (tp->tg3_flags2 & TG3_FLG2_USING_MSI) {
> fn = tg3_msi;
> @@ -6853,7 +6853,7 @@ static int tg3_request_irq(struct tg3 *t
> fn = tg3_interrupt_tagged;
> flags = IRQF_SHARED | IRQF_SAMPLE_RANDOM;
> }
> - return (request_irq(tp->pdev->irq, fn, flags, dev->name, dev));
> + return pci_request_irq(tp->pdev, fn, flags, dev->name);

since pci_request_irq sets IRQF_SHARED... might as well drop that above.


2006-10-03 07:20:21

by Frederik Deweerdt

Subject: Re: [RFC PATCH] move drm to pci_request_irq

On Tue, Oct 03, 2006 at 09:54:07AM +1000, Dave Airlie wrote:
> On 10/3/06, Frederik Deweerdt <[email protected]> wrote:
> >Hi,
> >
> >This proof-of-concept patch converts the drm driver to use the
> >pci_request_irq() function.
>
> NAK.
> Wow nice CC'list and no DRM maintainer in sight :-)
:), this was just meant as an illustration of the needed modifications
to use pci_request_irq.
>
> This will break framebuffer drivers, the DRM is not a proper PCI
> device driver as we don't have PCI device sharing, take a look at the
> gpu-2.6.git tree on kernel.org for the "correct" solution, which needs
> more attention before merging..
I'll look, thanks,
Frederik
>
> Dave.
> >
> >Regards,
> >Frederik
> >
> >
> >
> >diff --git a/drivers/char/drm/drm_drv.c b/drivers/char/drm/drm_drv.c
> >index b366c5b..5b000cd 100644
> >--- a/drivers/char/drm/drm_drv.c
> >+++ b/drivers/char/drm/drm_drv.c
> >@@ -234,6 +234,8 @@ int drm_lastclose(drm_device_t * dev)
> > }
> > mutex_unlock(&dev->struct_mutex);
> >
> >+ pci_set_drvdata(dev, NULL);
> >+
> > DRM_DEBUG("lastclose completed\n");
> > return 0;
> > }
> >diff --git a/drivers/char/drm/drm_irq.c b/drivers/char/drm/drm_irq.c
> >index 4553a3a..5dd12cb 100644
> >--- a/drivers/char/drm/drm_irq.c
> >+++ b/drivers/char/drm/drm_irq.c
> >@@ -132,8 +132,10 @@ static int drm_irq_install(drm_device_t
> > if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
> > sh_flags = IRQF_SHARED;
> >
> >- ret = request_irq(dev->irq, dev->driver->irq_handler,
> >- sh_flags, dev->devname, dev);
> >+ pci_set_drvdata(dev->pdev, dev);
> >+
> >+ ret = pci_request_irq(dev->pdev, dev->driver->irq_handler,
> >+ sh_flags, dev->devname);
> > if (ret < 0) {
> > mutex_lock(&dev->struct_mutex);
> > dev->irq_enabled = 0;
> >@@ -173,7 +175,7 @@ int drm_irq_uninstall(drm_device_t * dev
> >
> > dev->driver->irq_uninstall(dev);
> >
> >- free_irq(dev->irq, dev);
> >+ pci_free_irq(dev->pdev);
> >
> > return 0;
> > }

2006-10-03 15:58:39

by Samuel Tardieu

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

>>>>> "Jean" == Jean Tourrilhes <[email protected]> writes:

Jean> @@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
Jean> len = le16_to_cpu(essidbuf.len);
Jean> BUG_ON(len > IW_ESSID_MAX_SIZE);
Jean>
Jean> - memset(buf, 0, IW_ESSID_MAX_SIZE+1);
Jean> + memset(buf, 0, IW_ESSID_MAX_SIZE);
Jean> memcpy(buf, p, len);
Jean> - buf[len] = '\0';
Jean> + err = len;

Jean,

something bugs me here:

- either buf is supposed to be a nul-terminated string, in which
case if p is IW_ESSID_MAX_SIZE long there may be a bug (no '\0' at
the end of buf)

- or buf is not supposed to be nul-terminated and the length
value will always be used, in which case the memset() looks
useless

I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
the last byte is cleared as well. Or am I missing something?

Sam
--
Samuel Tardieu -- [email protected] -- http://www.rfc1149.net/

2006-10-03 16:35:05

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Tue, Oct 03, 2006 at 05:58:31PM +0200, Samuel Tardieu wrote:
> >>>>> "Jean" == Jean Tourrilhes <[email protected]> writes:
>
> Jean> @@ -2500,9 +2501,9 @@ static int orinoco_hw_get_essid(struct o
> Jean> len = le16_to_cpu(essidbuf.len);
> Jean> BUG_ON(len > IW_ESSID_MAX_SIZE);
> Jean>
> Jean> - memset(buf, 0, IW_ESSID_MAX_SIZE+1);
> Jean> + memset(buf, 0, IW_ESSID_MAX_SIZE);
> Jean> memcpy(buf, p, len);
> Jean> - buf[len] = '\0';
> Jean> + err = len;
>
> Jean,
>
> something bugs me here:
>
> - either buf is supposed to be a nul-terminated string, in which
> case if p is IW_ESSID_MAX_SIZE long there may be a bug (no '\0' at
> the end of buf)

An ESSID can be up to 32 chars, so we need the full
buffer size.

> - or buf is not supposed to be nul-terminated and the length
> value will always be used, in which case the memset() looks
> useless

Yes, it is entirely useless, but not incorrect. Note that the
code was not very efficient to start with, the last char of the string
was set to NUL twice.
I don't really want to overstep my authority there, my goal
was to minimise the changes. Pavel will have to clean up my mess, so I
don't want to change things too much.

> I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
> the last byte is cleared as well. Or am I missing something?

No, that would bring back the slab/memory overflow we are
trying to get rid of.

> Sam
> --
> Samuel Tardieu -- [email protected] -- http://www.rfc1149.net/

Strange, this name reminds me of someone. Must be a previous life ;-)

A+

Jean

2006-10-03 16:45:37

by Samuel Tardieu

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On 3/10, Jean Tourrilhes wrote:

| > I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
| > the last byte is cleared as well. Or am I missing something?
|
| No, that would bring back the slab/memory overflow we are
| trying to get rid of.

Then I am puzzled by the function declaration:

static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
char buf[IW_ESSID_MAX_SIZE+1])

Do you mean that this function is called with a buf parameter which
doesn't have the expected size? (as far as the function declaration is
concerned) Shouldn't the declaration be changed to

static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
char buf[IW_ESSID_MAX_SIZE])

then to reflect the reality? (it won't change the code but would be
clearer from a documentation point of view)

Sam

2006-10-03 17:08:26

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Tue, Oct 03, 2006 at 06:45:35PM +0200, Samuel Tardieu wrote:
> On 3/10, Jean Tourrilhes wrote:
>
> | > I suggest that you revert the memset() to IW_ESSID_MAX_SIZE+1 so that
> | > the last byte is cleared as well. Or am I missing something?
> |
> | No, that would bring back the slab/memory overflow we are
> | trying to get rid of.
>
> Then I am puzzled by the function declaration:
>
> static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
> char buf[IW_ESSID_MAX_SIZE+1])
>
> Do you mean that this function is called with a buf parameter which
> doesn't have the expected size? (as far as the function declaration is
> concerned) Shouldn't the declaration be changed to
>
> static int orinoco_hw_get_essid(struct orinoco_private *priv, int *active,
> char buf[IW_ESSID_MAX_SIZE])
>
> then to reflect the reality? (it won't change the code but would be
> clearer from a documentation point of view)

Yep, that one is a bug.
Thanks !

> Sam

Jean

2006-10-04 13:42:33

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> "Steve Fox" <[email protected]> wrote:
>
> > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> >
> > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> >
> > TCP bic registered
> > TCP westwood registered
> > TCP htcp registered
> > NET: Registered protocol family 1
> > NET: Registered protocol family 17
> > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > PGD 203027 PUD 2b031067 PMD 0
> > Oops: 0000 [1] SMP
> > last sysfs file:
> > CPU 0
> > Modules linked in:
> > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > Call Trace:
> > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > [<ffffffff80207182>] init+0x162/0x330
> > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > [<ffffffff80207020>] init+0x0/0x330
> > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> >
> >
> > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > RSP <ffff810bffcbde90>
> > CR2: ffffffffffffffff
> > <0>Kernel panic - not syncing: Attempted to kill init!
> >
>
> I'm really struggling to work out what went wrong there. Comparing your
> miserable 20 bytes of code to my object code makes me think that this:
>
> struct packet_sock *po = pkt_sk(sk);
>
> returned -1, perhaps in %ebp. But it's all very crude.
>
> Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> addresses might change) then have a poke around with `gdb vmlinux' (or
> maybe just addr2line) to work out where it's really oopsing?
>
> I don't see much which has changed in that area recently.

Sorry for the delay. I was finally able to perform a bisect on this. It
turns out the patch that causes this is
x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
strange candidate, but sure enough I can boot to login: right up until
that patch is applied.

P.S. I had to comment usb-hubc-build-fix.patch out of the series file
because it would not apply cleanly and caused quilt (0.45) to simply
abort its 'push' operation.

--

Steve Fox
IBM Linux Technology Center

2006-10-04 15:45:52

by Andrew Morton

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, 04 Oct 2006 08:42:28 -0500
Steve Fox <[email protected]> wrote:

> On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> > On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> > "Steve Fox" <[email protected]> wrote:
> >
> > > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > >
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > >
> > > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > >
> > > TCP bic registered
> > > TCP westwood registered
> > > TCP htcp registered
> > > NET: Registered protocol family 1
> > > NET: Registered protocol family 17
> > > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > PGD 203027 PUD 2b031067 PMD 0
> > > Oops: 0000 [1] SMP
> > > last sysfs file:
> > > CPU 0
> > > Modules linked in:
> > > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > > Call Trace:
> > > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > > [<ffffffff80207182>] init+0x162/0x330
> > > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > > [<ffffffff80207020>] init+0x0/0x330
> > > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > >
> > >
> > > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > RSP <ffff810bffcbde90>
> > > CR2: ffffffffffffffff
> > > <0>Kernel panic - not syncing: Attempted to kill init!
> > >
> >
> > I'm really struggling to work out what went wrong there. Comparing your
> > miserable 20 bytes of code to my object code makes me think that this:
> >
> > struct packet_sock *po = pkt_sk(sk);
> >
> > returned -1, perhaps in %ebp. But it's all very crude.
> >
> > Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> > addresses might change) then have a poke around with `gdb vmlinux' (or
> > maybe just addr2line) to work out where it's really oopsing?
> >
> > I don't see much which has changed in that area recently.
>
> Sorry for the delay. I was finally able to perform a bisect on this. It
> turns out the patch that causes this is
> x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> strange candidate, but sure enough I can boot to login: right up until
> that patch is applied.

hm, that patch was merged into mainline September 29. Does mainline work?

> P.S. I had to comment usb-hubc-build-fix.patch out of the series file
> because it would not apply cleanly and caused quilt (0.45) to simply
> abort its 'push' operation.

Sorry about that.

If mainline _does_ work then perhaps it's an interaction between that patch
and something else in the -mm2 lineup (and at that point in the bisection,
it'll be one of the git trees or something else in the x86_64 tree). Could
be that the problem remains in -mm3.

2006-10-04 15:56:12

by Vivek Goyal

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, Oct 04, 2006 at 08:45:40AM -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <[email protected]> wrote:
>
> > On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> > > On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> > > "Steve Fox" <[email protected]> wrote:
> > >
> > > > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > > >
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > > >
> > > > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > > >
> > > > TCP bic registered
> > > > TCP westwood registered
> > > > TCP htcp registered
> > > > NET: Registered protocol family 1
> > > > NET: Registered protocol family 17
> > > > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > > > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > PGD 203027 PUD 2b031067 PMD 0
> > > > Oops: 0000 [1] SMP
> > > > last sysfs file:
> > > > CPU 0
> > > > Modules linked in:
> > > > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > > > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > > > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > > > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > > > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > > > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > > > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > > > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > > > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > > > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > > > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > > > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > > > Call Trace:
> > > > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > > > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > > > [<ffffffff80207182>] init+0x162/0x330
> > > > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > > > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > > > [<ffffffff80207020>] init+0x0/0x330
> > > > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > > >
> > > >
> > > > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > > > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > RSP <ffff810bffcbde90>
> > > > CR2: ffffffffffffffff
> > > > <0>Kernel panic - not syncing: Attempted to kill init!
> > > >
> > >
> > > I'm really struggling to work out what went wrong there. Comparing your
> > > miserable 20 bytes of code to my object code makes me think that this:
> > >
> > > struct packet_sock *po = pkt_sk(sk);
> > >
> > > returned -1, perhaps in %ebp. But it's all very crude.
> > >
> > > Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> > > addresses might change) then have a poke around with `gdb vmlinux' (or
> > > maybe just addr2line) to work out where it's really oopsing?
> > >
> > > I don't see much which has changed in that area recently.
> >
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
>
> hm, that patch was merged into mainline September 29. Does mainline work?
>

I thought the above patch had been dropped because Keith ran into some boot
issues on one of the machines. There seems to be nothing wrong with the
patch as such, but it might have triggered some existing bug. I looked into
the issue at the time, but nothing was conclusive.

So it looks like this patch has come back. I am not sure how.

Thanks
Vivek

2006-10-04 15:56:59

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wednesday 04 October 2006 17:45, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <[email protected]> wrote:
>
> > On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> > > On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> > > "Steve Fox" <[email protected]> wrote:
> > >
> > > > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > > >
> > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > > >
> > > > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > > >
> > > > TCP bic registered
> > > > TCP westwood registered
> > > > TCP htcp registered
> > > > NET: Registered protocol family 1
> > > > NET: Registered protocol family 17
> > > > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > > > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > PGD 203027 PUD 2b031067 PMD 0
> > > > Oops: 0000 [1] SMP
> > > > last sysfs file:
> > > > CPU 0
> > > > Modules linked in:
> > > > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > > > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > > > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > > > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > > > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > > > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > > > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > > > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > > > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > > > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > > > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > > > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > > > Call Trace:
> > > > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > > > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > > > [<ffffffff80207182>] init+0x162/0x330
> > > > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > > > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > > > [<ffffffff80207020>] init+0x0/0x330
> > > > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > > >
> > > >
> > > > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > > > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > RSP <ffff810bffcbde90>
> > > > CR2: ffffffffffffffff
> > > > <0>Kernel panic - not syncing: Attempted to kill init!
> > > >
> > >
> > > I'm really struggling to work out what went wrong there. Comparing your
> > > miserable 20 bytes of code to my object code makes me think that this:
> > >
> > > struct packet_sock *po = pkt_sk(sk);
> > >
> > > returned -1, perhaps in %ebp. But it's all very crude.
> > >
> > > Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> > > addresses might change) then have a poke around with `gdb vmlinux' (or
> > > maybe just addr2line) to work out where it's really oopsing?
> > >
> > > I don't see much which has changed in that area recently.
> >
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
>
> hm, that patch was merged into mainline September 29. Does mainline work?

Yes, we had this earlier already. But without this patch it doesn't
compile for some people, so it was re-added.

And nobody knows why the reposition-bss patch actually breaks things :/

In theory the repositioning is ok, so it must be some marginal code
somewhere else that just ends up falling over.

-Andi

2006-10-04 16:42:06

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 08:42:28 -0500
> Steve Fox <[email protected]> wrote:
> > Sorry for the delay. I was finally able to perform a bisect on this. It
> > turns out the patch that causes this is
> > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > strange candidate, but sure enough I can boot to login: right up until
> > that patch is applied.
>
> hm, that patch was merged into mainline September 29. Does mainline work?

-git21 also fails with this same error.

--

Steve Fox
IBM Linux Technology Center

2006-10-05 00:07:16

by Andrew Morton

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, 04 Oct 2006 11:41:59 -0500
Steve Fox <[email protected]> wrote:

> On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> > On Wed, 04 Oct 2006 08:42:28 -0500
> > Steve Fox <[email protected]> wrote:
> > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > turns out the patch that causes this is
> > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > strange candidate, but sure enough I can boot to login: right up until
> > > that patch is applied.
> >
> > hm, that patch was merged into mainline September 29. Does mainline work?
>
> -git21 also fails with this same error.
>

OK, thanks. And we know that
x86_64-mm-re-positioning-the-bss-segment.patch triggered this failure. And
that patch is non-buggy, and the xfrm code is probably non-buggy. So we don't
know squat, and we're going to need to debug this crash.

Well. There is one trick we could use: apply
x86_64-mm-re-positioning-the-bss-segment.patch to 2.6.18 base and see if it
crashes. If it doesn't, then we can theorise that the bug is some buggy
post 2.6.18 patch which is being exposed by
x86_64-mm-re-positioning-the-bss-segment.patch. A technique I've used
before for identifying the buggy patch is to do a git-bisect, but apply
x86_64-mm-re-positioning-the-bss-segment.patch by hand at each bisection
step. It's pretty straightforward as long as the patch roughly applies at
each step.
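
The bisect-with-the-patch-reapplied technique described above boils down to
an ordinary binary search in which the test at every step is "does this
snapshot crash once the trigger patch is applied on top?". A minimal sketch
in C, assuming the snapshots can be numbered in order and that the failure is
monotonic (once a snapshot crashes, every later one does too).
bisect_first_bad, boot_fails_fn and fails_from_threshold are hypothetical
names for illustration, and the callback only stands in for the real
apply-patch, build and boot cycle:

```c
#include <stddef.h>

/*
 * Hypothetical callback: apply the trigger patch on top of snapshot `rev`,
 * build and boot it, and return nonzero if the boot fails.
 */
typedef int (*boot_fails_fn)(int rev, void *ctx);

/*
 * Binary search over snapshots numbered good..bad. `good` is known to boot
 * and `bad` is known to crash, both with the trigger patch applied.
 * Returns the first snapshot that crashes.
 */
int bisect_first_bad(int good, int bad, boot_fails_fn boot_fails, void *ctx)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (boot_fails(mid, ctx))
			bad = mid;	/* culprit is at or before mid */
		else
			good = mid;	/* culprit is after mid */
	}
	return bad;
}

/*
 * Stand-in for a real test boot: pretends every snapshot at or after the
 * threshold stored in *ctx crashes. Illustration only.
 */
int fails_from_threshold(int rev, void *ctx)
{
	return rev >= *(int *)ctx;
}
```

Calling bisect_first_bad(0, 1000, fails_from_threshold, &threshold) with a
thousand candidate snapshots converges in at most ten test boots.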

Or we could debug it. Can you send the .config? Let's see if it happens
with my toolchain+machine first.

Thanks.

2006-10-05 00:52:03

by Vivek Goyal

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, Oct 04, 2006 at 05:06:59PM -0700, Andrew Morton wrote:
> On Wed, 04 Oct 2006 11:41:59 -0500
> Steve Fox <[email protected]> wrote:
>
> > On Wed, 2006-10-04 at 08:45 -0700, Andrew Morton wrote:
> > > On Wed, 04 Oct 2006 08:42:28 -0500
> > > Steve Fox <[email protected]> wrote:
> > > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > > turns out the patch that causes this is
> > > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > > strange candidate, but sure enough I can boot to login: right up until
> > > > that patch is applied.
> > >
> > > hm, that patch was merged into mainline September 29. Does mainline work?
> >
> > -git21 also fails with this same error.
> >
>
> OK, thanks. And we know that
> x86_64-mm-re-positioning-the-bss-segment.patch triggered this failure. And
> that patch is non-buggy, and the xfrm code is probably non-buggy. So we don't
> know squat, and we're going to need to debug this crash.
>
> Well. There is one trick we could use: apply
> x86_64-mm-re-positioning-the-bss-segment.patch to 2.6.18 base and see if it
> crashes. If it doesn't, then we can theorise that the bug is some buggy
> post 2.6.18 patch which is being exposed by

I think it would most likely crash on 2.6.18 as well. Keith Mannthey had
reported a different crash on 2.6.18-rc4-mm2 when this patch was first
introduced. Here is the link to the thread.

http://marc.theaimsgroup.com/?l=linux-kernel&m=115629369729911&w=2

Here is the backtrace he reported.

Unable to handle kernel NULL pointer dereference at 0000000000000007 RIP:
[<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
PGD 115c934067 PUD 115c935067 PMD 0
Oops: 0002 [1] SMP
last sysfs file:
CPU 14
Modules linked in:
Pid: 1, comm: init Not tainted 2.6.18-rc4-mm2-smp #3
RIP: 0010:[<ffffffff803d45b0>] [<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
RSP: 0018:ffff810460605eb8 EFLAGS: 00010286
RAX: ffffffffffffffff RBX: ffff81115c171c80 RCX: 0000000000000000
RDX: ffff81115c171c88 RSI: ffff81115c171c80 RDI: ffffffff806656e0
RBP: ffffffff806656e0 R08: ffff81115c069200 R09: ffff8110700b4000
R10: 0000000000000000 R11: 0000000000000002 R12: ffff81115c170d00
R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000000
FS: 00002b793a4fd6d0(0000) GS:ffff81115c910e40(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000007 CR3: 000000115c92d000 CR4: 00000000000006e0
Process init (pid: 1, threadinfo ffff810460604000, task ffff81115cb10040)
Stack: 0000000100000001 00000000ffffffff ffff81115c171c80 ffffffff803d58e9
ffffffff8045bb30 0000000180298f61 ffffffff80498080 0000000000000001
ffff81115c170d00 ffffffff803d595d 0000000000000004 ffffffff80376061
Call Trace:
[<ffffffff803d58e9>] unix_create1+0xf3/0x107
[<ffffffff803d595d>] unix_create+0x60/0x6b
[<ffffffff80376061>] __sock_create+0x12f/0x227
[<ffffffff80376429>] sys_socket+0xf/0x37
[<ffffffff8020968e>] system_call+0x7e/0x83


Code: 48 89 50 08 48 89 55 00 48 89 6a 08 41 58 5b 5d c3 c7 47 08
RIP [<ffffffff803d45b0>] __unix_insert_socket+0x49/0x5a
RSP <ffff810460605eb8>
CR2: 0000000000000007
<0>Kernel panic - not syncing: Attempted to kill init!

Thanks
Vivek

2006-10-05 00:58:05

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64


> I think most likely it would crash on 2.6.18. Keith mannthey had reported
> a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> time. Following is the link to the thread.

Then maybe try 2.6.17 + the patch and then bisect between that and -rc4?

-Andi

2006-10-05 01:08:21

by Martin Bligh

Subject: Re: 2.6.18-mm2 boot failure on x86-64

Andi Kleen wrote:
>>I think most likely it would crash on 2.6.18. Keith mannthey had reported
>>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
>>time. Following is the link to the thread.
>
>
> Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?

I think it's fixed already in -git22, or at least it is for the IBM box
reporting to test.kernel.org. You might want to try that one ...

M.

2006-10-05 01:57:11

by Keith Mannthey

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On 10/4/06, Andi Kleen <[email protected]> wrote:
> On Wednesday 04 October 2006 17:45, Andrew Morton wrote:
> > On Wed, 04 Oct 2006 08:42:28 -0500
> > Steve Fox <[email protected]> wrote:
> >
> > > On Thu, 2006-09-28 at 14:01 -0700, Andrew Morton wrote:
> > > > On Thu, 28 Sep 2006 17:50:31 +0000 (UTC)
> > > > "Steve Fox" <[email protected]> wrote:
> > > >
> > > > > On Thu, 28 Sep 2006 01:46:23 -0700, Andrew Morton wrote:
> > > > >
> > > > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm2/
> > > > >
> > > > > Panic on boot. This machine booted 2.6.18-mm1 fine. em64t machine.
> > > > >
> > > > > TCP bic registered
> > > > > TCP westwood registered
> > > > > TCP htcp registered
> > > > > NET: Registered protocol family 1
> > > > > NET: Registered protocol family 17
> > > > > Unable to handle kernel paging request at ffffffffffffffff RIP:
> > > > > [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > > PGD 203027 PUD 2b031067 PMD 0
> > > > > Oops: 0000 [1] SMP
> > > > > last sysfs file:
> > > > > CPU 0
> > > > > Modules linked in:
> > > > > Pid: 1, comm: swapper Not tainted 2.6.18-mm2-autokern1 #1
> > > > > RIP: 0010:[<ffffffff8047ef93>] [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > > RSP: 0000:ffff810bffcbde90 EFLAGS: 00010286
> > > > > RAX: 0000000000000000 RBX: ffff810bff4a1000 RCX: 2222222222222222
> > > > > RDX: ffff810bff4a1000 RSI: 0000000000000005 RDI: ffffffff8055f5e0
> > > > > RBP: ffffffffffffffff R08: 0000000000007616 R09: 000000000000000e
> > > > > R10: 0000000000000006 R11: ffffffff803373f0 R12: 0000000000000000
> > > > > R13: 0000000000000005 R14: ffff810bff4a1000 R15: 0000000000000000
> > > > > FS: 0000000000000000(0000) GS:ffffffff805d8000(0000) knlGS:0000000000000000
> > > > > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> > > > > CR2: ffffffffffffffff CR3: 0000000000201000 CR4: 00000000000006e0
> > > > > Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb510)
> > > > > Stack: ffff810bff4a1000 ffffffff8055f4c0 0000000000000000 ffff810bffcbdef0
> > > > > 0000000000000000 ffffffff8042736e 0000000000000000 0000000000000000
> > > > > 0000000000000000 ffffffff8061c68d ffffffff806260f0 ffffffff80207182
> > > > > Call Trace:
> > > > > [<ffffffff8042736e>] register_netdevice_notifier+0x3e/0x70
> > > > > [<ffffffff8061c68d>] packet_init+0x2d/0x53
> > > > > [<ffffffff80207182>] init+0x162/0x330
> > > > > [<ffffffff8020a9d8>] child_rip+0xa/0x12
> > > > > [<ffffffff8033c2a2>] acpi_ds_init_one_object+0x0/0x82
> > > > > [<ffffffff80207020>] init+0x0/0x330
> > > > > [<ffffffff8020a9ce>] child_rip+0x0/0x12
> > > > >
> > > > >
> > > > > Code: 48 8b 45 00 0f 18 08 49 83 fd 02 4c 8d 65 f8 0f 84 f8 fe ff
> > > > > RIP [<ffffffff8047ef93>] packet_notifier+0x163/0x1a0
> > > > > RSP <ffff810bffcbde90>
> > > > > CR2: ffffffffffffffff
> > > > > <0>Kernel panic - not syncing: Attempted to kill init!
> > > > >
> > > >
> > > > I'm really struggling to work out what went wrong there. Comparing your
> > > > miserable 20 bytes of code to my object code makes me think that this:
> > > >
> > > > struct packet_sock *po = pkt_sk(sk);
> > > >
> > > > returned -1, perhaps in %ebp. But it's all very crude.
> > > >
> > > > Perhaps you could compile that kernel with CONFIG_DEBUG_INFO, rerun it (the
> > > > addresses might change) then have a poke around with `gdb vmlinux' (or
> > > > maybe just addr2line) to work out where it's really oopsing?
> > > >
> > > > I don't see much which has changed in that area recently.
> > >
> > > Sorry for the delay. I was finally able to perform a bisect on this. It
> > > turns out the patch that causes this is
> > > x86_64-mm-re-positioning-the-bss-segment.patch, which seems like a
> > > strange candidate, but sure enough I can boot to login: right up until
> > > that patch is applied.
> >
> > hm, that patch was merged into mainline September 29. Does mainline work?
>
> Yes we had this earlier already. But without this patch it doesn't
> compile for some people. So it was readded.
>
> And nobody knows why the reposition-bss patch actually breaks things :/

I just wanted to add that I changed up my config file and the problem went
away. It was not at all clear what was causing it.


Thanks,
Keith

2006-10-05 02:05:56

by Keith Mannthey

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On 10/4/06, Martin Bligh <[email protected]> wrote:
> Andi Kleen wrote:
> >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> >>time. Following is the link to the thread.
> >
> >
> > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
>
> I think it's fixed already in -git22, or at least it is for the IBM box
> reporting to test.kernel.org. You might want to try that one ...

Fixed or hidden... hard to say at this point. I think it could be a
weird interaction between patches and/or config options. I will see
tomorrow if I can recreate it again.

Thanks,
Keith

2006-10-05 14:53:45

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> Andi Kleen wrote:
> >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> >>time. Following is the link to the thread.
> >
> >
> > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
>
> I think it's fixed already in -git22, or at least it is for the IBM box
> reporting to test.kernel.org. You might want to try that one ...

-git22 also panics for me.

--

Steve Fox
IBM Linux Technology Center

2006-10-05 15:13:23

by Badari Pulavarty

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 09:53 -0500, Steve Fox wrote:
> On Wed, 2006-10-04 at 18:08 -0700, Martin Bligh wrote:
> > Andi Kleen wrote:
> > >>I think most likely it would crash on 2.6.18. Keith mannthey had reported
> > >>a different crash on 2.6.18-rc4-mm2 when this patch was introduced first
> > >>time. Following is the link to the thread.
> > >
> > >
> > > Then maybe trying 2.6.17 + the patch and then bisect between that and -rc4?
> >
> > I think it's fixed already in -git22, or at least it is for the IBM box
> > reporting to test.kernel.org. You might want to try that one ...
>
> -git22 also panics for me.
>

Steve,

Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?
Last time I couldn't match your instruction dump to any code segment
in the routine. And also, can you post your .config file. I have
an amd64 and em64t machine and both work fine...

Thanks,
Badari

2006-10-05 15:32:16

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:

> Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?

CONFIG_DEBUG_KERNEL should be on

> Last time I couldn't match your instruction dump to any code segment
> in the routine. And also, can you post your .config file. I have
> an amd64 and em64t machine and both work fine...

Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
[<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
RIP: 0010:[<ffffffff804705e6>] [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 000000003f924371 R09: 0000000000000000
R10: ffff810bffcbdcb0 R11: 0000000000000154 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack: 0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000090000

The base config file I'm using is at
http://flooterbu.net/kernel/elm3b239-2.6.17.config

--

Steve Fox
IBM Linux Technology Center

2006-10-05 15:41:14

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 17:32, Steve Fox wrote:
> On Thu, 2006-10-05 at 08:12 -0700, Badari Pulavarty wrote:
>
> > Can you post the latest panic stack again (with CONFIG_DEBUG_KERNEL) ?
>
> CONFIG_DEBUG_KERNEL should be on
>
> > Last time I couldn't match your instruction dump to any code segment
> > in the routine. And also, can you post your .config file. I have
> > an amd64 and em64t machine and both work fine...
>
> Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
> [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #1
> RIP: 0010:[<ffffffff804705e6>] [<ffffffff804705e6>] xfrm_register_mode+0x36/0x60
> RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
> RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
> RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
> RBP: 00000000ffffffef R08: 000000003f924371 R09: 0000000000000000
> R10: ffff810bffcbdcb0 R11: 0000000000000154 R12: 0000000000000000
> R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
> Stack: 0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000090000

Please don't snip the Code: line. It is fairly important.

>
> The base config file I'm using is at
> http://flooterbu.net/kernel/elm3b239-2.6.17.config

My guess is that something is wrong with the global variable it is accessing.
Can you post the output of grep -5 xfrm_policy_afinfo ?

I wonder if that variable overlaps something else.

And please add a
printk("global %p\n", xfrm_policy_afinfo[family]);
at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo
and post the output.

If it doesn't overlap anything, then it's possible that some nearby
variable is overflowing or similar. Adding some padding around
xfrm_policy_afinfo would show that.

If that global is proven to be corrupted, another way would be to add
checks all over the boot process to track down where it gets corrupted.
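
The idea of sprinkling checks over the boot process can be sketched in plain
C as below. This is only a userspace illustration under stated assumptions:
watched stands in for the suspect global, NSLOTS is an arbitrary size, and
the sketch assumes every slot is still NULL at the points being checked.
bugcheck and CHECK here are hypothetical stand-ins and not necessarily how a
real kernel debugging hack would define them:

```c
#include <stdio.h>
#include <stddef.h>

#define NSLOTS 32	/* assumed table size, for illustration only */

/* Stand-in for a global pointer table that should still be all NULL. */
static void *watched[NSLOTS];

/* Checkpoint at which corruption was first seen; NULL while clean. */
static const char *first_bad_file;
static int first_bad_line;

/*
 * Scan the table and report the first checkpoint at which any slot is
 * non-NULL. Returns 1 if the table is corrupted, 0 otherwise.
 */
int bugcheck(const char *file, int line)
{
	size_t i;

	for (i = 0; i < NSLOTS; i++) {
		if (watched[i] == NULL)
			continue;
		if (!first_bad_file) {
			/* Only the first sighting is interesting. */
			first_bad_file = file;
			first_bad_line = line;
			printf("slot %zu = %p first seen bad at %s:%d\n",
			       i, watched[i], file, line);
		}
		return 1;
	}
	return 0;
}

/* Drop this between init calls; it records where it was placed. */
#define CHECK bugcheck(__FILE__, __LINE__)
```

Placing CHECK before and after each suspect init call narrows the corruption
down to the first call after which it reports a bad slot.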

-Andi

2006-10-05 17:57:17

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:

> Please don't snip the Code: line. It is fairly important.

Sorry about that. The remote console I was using appears to overwrite
some text after I force the reboot. Here's a clean one.

global ffffffffffffffff
Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
[<ffffffff80470766>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #3
RIP: 0010:[<ffffffff80470766>] [<ffffffff80470766>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000000000
RDX: ffffffffffffffff RSI: 0000000000000046 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000007a02 R09: 000000000000000e
R10: 0000000000000006 R11: ffffffff80334660 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack: 0000000000000000 ffffffff8061fb48 0000000000000000 ffffffff80207182
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
[<ffffffff80207182>] init+0x162/0x330
[<ffffffff8020a9a8>] child_rip+0xa/0x12
[<ffffffff803394c2>] acpi_ds_init_one_object+0x0/0x82
[<ffffffff80207020>] init+0x0/0x330
[<ffffffff8020a99e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 65 fd ff
RIP [<ffffffff80470766>] xfrm_register_mode+0x36/0x60
RSP <ffff810bffcbded0>
CR2: 0000000000000827
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!

> My guess is that something is wrong with the global variable it is accessing.
> Can you post the output of grep -5 xfrm_policy_afinfo ?

elm3b239:/boot # grep -5 xfrm_policy_afinfo System.map-2.6.18-git22
ffffffff805594c0 d xfrm4_state_afinfo
ffffffff80559500 D xfrm_cfg_mutex
ffffffff80559530 d xfrm_dev_notifier
ffffffff80559548 d xfrm_policy_lock
ffffffff8055954c d xfrm_policy_gc_lock
ffffffff80559550 d xfrm_policy_afinfo_lock
ffffffff80559560 d xfrm_hash_work
ffffffff805595c0 d hash_resize_mutex
ffffffff80559600 D sysctl_xfrm_aevent_etime
ffffffff80559604 D sysctl_xfrm_aevent_rseqth
ffffffff80559610 D km_waitq
--
ffffffff8075bfd8 b idiagnl
ffffffff8075bfe0 B xfrm_policy_count
ffffffff8075bff8 b xfrm_policy_gc_list
ffffffff8075c000 b dummy.28400
ffffffff8075c038 b idx_generator.27450
ffffffff8075c040 b xfrm_policy_afinfo
ffffffff8075c140 b xfrm_policy_gc_work
ffffffff8075c1a0 b xfrm_policy_inexact
ffffffff8075c1e0 B xfrm_nl
ffffffff8075c1e8 b xfrm_state_gc_list
ffffffff8075c1f0 b acqseq.27386

> And please add a
> printk("global %p\n", xfrm_policy_afinfo[family]);
> at the beginning of net/xfrm/xfrm_policy.c:xfrm_policy_lock_afinfo
> and post the output.

Included above.

--

Steve Fox
IBM Linux Technology Center

2006-10-05 18:27:14

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 19:57, Steve Fox wrote:
> On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
>
> > Please don't snip the Code: line. It is fairly important.
>
> Sorry about that. The remote console I was using appears to overwrite
> some text after I force the reboot. Here's a clean one.
>
> global ffffffffffffffff

Ok that definitely shouldn't be in there.

I guess we need to track when it gets corrupted. Can you send the full
boot log with this patch applied?


-Andi

Index: linux-2.6.19-rc1-hack/init/main.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/init/main.c
+++ linux-2.6.19-rc1-hack/init/main.c
@@ -75,6 +75,9 @@

static int init(void *);

+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
extern void init_IRQ(void);
extern void fork_init(unsigned long);
extern void mca_init(void);
@@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void
char * command_line;
extern struct kernel_param __start___param[], __stop___param[];

+ CHECK;
+
smp_setup_processor_id();

/*
@@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void
page_address_init();
printk(KERN_NOTICE);
printk(linux_banner);
+ CHECK;
setup_arch(&command_line);
+ CHECK;
setup_per_cpu_areas();
smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */

@@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void
* fragile until we cpu_idle() for the first time.
*/
preempt_disable();
+ CHECK;
build_all_zonelists();
page_alloc_init();
printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line);
@@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void
__stop___param - __start___param,
&unknown_bootoption);
sort_main_extable();
+ CHECK;
trap_init();
rcu_init();
init_IRQ();
@@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void
hrtimers_init();
softirq_init();
timekeeping_init();
+ CHECK;
time_init();
profile_init();
+ CHECK;
if (!irqs_disabled())
printk("start_kernel(): bug: interrupts were enabled early\n");
early_boot_irqs_on();
@@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void
#endif
vfs_caches_init_early();
cpuset_init_early();
+ CHECK;
mem_init();
+ CHECK;
kmem_cache_init();
setup_per_cpu_pageset();
numa_policy_init();
@@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void
calibrate_delay();
pidmap_init();
pgtable_cache_init();
+ CHECK;
prio_tree_init();
anon_vma_init();
#ifdef CONFIG_X86
@@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void
fork_init(num_physpages);
proc_caches_init();
buffer_init();
+ CHECK;
unnamed_dev_init();
key_init();
security_init();
vfs_caches_init(num_physpages);
radix_tree_init();
signals_init();
+ CHECK;
/* rootfs populating might need page-writeback */
page_writeback_init();
#ifdef CONFIG_PROC_FS
@@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void
#endif
cpuset_init();
taskstats_init_early();
+ CHECK;
delayacct_init();

check_bugs();
@@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void
rest_init();
}

-static int __initdata initcall_debug;
+static int __initdata initcall_debug = 1;

static int __init initcall_debug_setup(char *str)
{
@@ -639,7 +656,11 @@ static void __init do_initcalls(void)
printk("\n");
}

+ CHECK;
+
result = (*call)();
+
+ CHECK;

if (result && result != -ENODEV && initcall_debug) {
sprintf(msgbuf, "error code %d", result);
@@ -725,21 +746,32 @@ static int init(void * unused)

smp_prepare_cpus(max_cpus);

+ CHECK;
+
do_pre_smp_initcalls();

smp_init();
+
+ CHECK;
+
sched_init_smp();

cpuset_init_smp();

+ CHECK;
+
/*
* Do this before initcalls, because some drivers want to access
* firmware files.
*/
populate_rootfs();

+ CHECK;
+
do_basic_setup();

+ CHECK;
+
/*
* check if there is an early userspace init. If yes, let it do all
* the work
Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
@@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count);
static DEFINE_RWLOCK(xfrm_policy_afinfo_lock);
static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO];

+void bugcheck(char *where, int line)
+{
+ int i;
+ for (i = 0; i < NPROTO; i++)
+ if (xfrm_policy_afinfo[i] == (void *)-1UL) {
+ printk("afinfo corrupted at %s:%d\n",where,line);
+ return;
+ }
+}
+
static kmem_cache_t *xfrm_dst_cache __read_mostly;

static struct work_struct xfrm_policy_gc_work;
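
The checkpoint idea in the patch above can be tried outside the kernel. The following is only a user-space sketch of the same technique, not kernel code: `slots`, `NSLOTS`, `POISON`, `checkpoint()` and `CHECK()` are hypothetical names standing in for `xfrm_policy_afinfo`, `NPROTO`, the `(void *)-1UL` value and `bugcheck()`. Unlike the patch, it also prints which slot is bad and remembers the first checkpoint that saw the bad value.

```c
#include <stddef.h>
#include <stdio.h>

#define NSLOTS 8                  /* hypothetical size, stands in for NPROTO */
#define POISON ((void *)-1UL)     /* the all-ones pointer seen in the report */

/* Stands in for the global pointer array being watched. */
static void *slots[NSLOTS];

/* Location of the first checkpoint that observed POISON, if any. */
static const char *first_bad_file;
static int first_bad_line;

/*
 * Scan the array and report the first slot holding POISON.
 * Returns 1 if a poisoned slot was found, 0 otherwise.
 */
static int checkpoint(const char *file, int line)
{
	size_t i;

	for (i = 0; i < NSLOTS; i++) {
		if (slots[i] == POISON) {
			if (!first_bad_file) {
				first_bad_file = file;
				first_bad_line = line;
			}
			fprintf(stderr, "slot %zu corrupted at %s:%d\n",
				i, file, line);
			return 1;
		}
	}
	return 0;
}

/* Drop CHECK() between initialization steps, as the patch does. */
#define CHECK() checkpoint(__FILE__, __LINE__)
```

Placing `CHECK()` before and after each suspect step means the first report brackets the step that wrote the bad value; every later report only repeats that the value is still there.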

2006-10-05 18:51:15

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote:

> I guess we need to track when it gets corrupted. Can you send the full
> boot log with this patch applied?

Here she blows!

root (hd0,0)
Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791
ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1
showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1
ABAT:1160073474
[Linux-bzImage, setup=0x1400, size=0x1dd755]
initrd /boot/initrd-autobench.img
[Linux-initrd @ 0x37ceb000, 0x304c57 bytes]

Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE
Linux)) #4 SMP Thu Oct 5 11:36:21 PDT 2006
Command line: root=/dev/sda1 vga=791
ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1
showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1
ABAT:1160073474
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
DMI 2.3 present.
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 12582912
early_node_map[3] active PFN ranges
0: 0 -> 154
0: 256 -> 786294
0: 1048576 -> 12582912
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists. Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791
ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1
showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1
ABAT:1160073474
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
afinfo corrupted at init/main.c:582
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x310c2000 - 0x350c2000
Memory: 48422908k/50331648k available (2566k kernel code, 858868k
reserved, 1345k data, 184k init)
afinfo corrupted at init/main.c:584
Calibrating delay using timer specific routine.. 5677.94 BogoMIPS
(lpj=11355895)
afinfo corrupted at init/main.c:593
afinfo corrupted at init/main.c:603
Mount-cache hash table entries: 256
afinfo corrupted at init/main.c:610
afinfo corrupted at init/main.c:618
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 10425802
Detected 10.425 MHz APIC timer.
afinfo corrupted at init/main.c:749
SMP alternatives: switching to SMP code
Booting processor 1/16 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 5671.84 BogoMIPS
(lpj=11343680)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -4 cycles, maxerr 799
cycles)
SMP alternatives: switching to SMP code
Booting processor 2/16 APIC 0x6
Initializing CPU#2
Calibrating delay using timer specific routine.. 5671.99 BogoMIPS
(lpj=11343984)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU2: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 2: Syncing TSC to CPU 0.
CPU 2: synchronized TSC with CPU 0 (last diff -13 cycles, maxerr 3341
cycles)
SMP alternatives: switching to SMP code
Booting processor 3/16 APIC 0x7
Initializing CPU#3
Calibrating delay using timer specific routine.. 5672.06 BogoMIPS
(lpj=11344129)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU3: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 3: Syncing TSC to CPU 0.
CPU 3: synchronized TSC with CPU 0 (last diff 178 cycles, maxerr 3171
cycles)
SMP alternatives: switching to SMP code
Booting processor 4/16 APIC 0x10
Initializing CPU#4
Calibrating delay using timer specific routine.. 5672.04 BogoMIPS
(lpj=11344087)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU4: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 4: Syncing TSC to CPU 0.
CPU 4: synchronized TSC with CPU 0 (last diff -420 cycles, maxerr 3510
cycles)
SMP alternatives: switching to SMP code
Booting processor 5/16 APIC 0x11
Initializing CPU#5
Calibrating delay using timer specific routine.. 5672.04 BogoMIPS
(lpj=11344081)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU5: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 5: Syncing TSC to CPU 0.
CPU 5: synchronized TSC with CPU 0 (last diff -801 cycles, maxerr 3315
cycles)
SMP alternatives: switching to SMP code
Booting processor 6/16 APIC 0x16
Initializing CPU#6
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS
(lpj=11344046)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU6: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 6: Syncing TSC to CPU 0.
CPU 6: synchronized TSC with CPU 0 (last diff -287 cycles, maxerr 3281
cycles)
SMP alternatives: switching to SMP code
Booting processor 7/16 APIC 0x17
Initializing CPU#7
Calibrating delay using timer specific routine.. 5672.01 BogoMIPS
(lpj=11344028)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU7: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 7: Syncing TSC to CPU 0.
CPU 7: synchronized TSC with CPU 0 (last diff 238 cycles, maxerr 3391
cycles)
SMP alternatives: switching to SMP code
Booting processor 8/16 APIC 0x20
Initializing CPU#8
Calibrating delay using timer specific routine.. 5672.42 BogoMIPS
(lpj=11344847)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU8: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 8: Syncing TSC to CPU 0.
CPU 8: synchronized TSC with CPU 0 (last diff 101 cycles, maxerr 8577
cycles)
SMP alternatives: switching to SMP code
Booting processor 9/16 APIC 0x21
Initializing CPU#9
Calibrating delay using timer specific routine.. 5672.28 BogoMIPS
(lpj=11344576)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU9: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 9: Syncing TSC to CPU 0.
CPU 9: synchronized TSC with CPU 0 (last diff 200 cycles, maxerr 8109
cycles)
SMP alternatives: switching to SMP code
Booting processor 10/16 APIC 0x26
Initializing CPU#10
Calibrating delay using timer specific routine.. 5672.50 BogoMIPS
(lpj=11345012)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU10: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 10: Syncing TSC to CPU 0.
CPU 10: synchronized TSC with CPU 0 (last diff 72 cycles, maxerr 8551
cycles)
SMP alternatives: switching to SMP code
Booting processor 11/16 APIC 0x27
Initializing CPU#11
Calibrating delay using timer specific routine.. 5672.90 BogoMIPS
(lpj=11345804)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU11: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 11: Syncing TSC to CPU 0.
CPU 11: synchronized TSC with CPU 0 (last diff -548 cycles, maxerr 8526
cycles)
SMP alternatives: switching to SMP code
Booting processor 12/16 APIC 0x30
Initializing CPU#12
Calibrating delay using timer specific routine.. 5672.75 BogoMIPS
(lpj=11345516)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU12: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 12: Syncing TSC to CPU 0.
CPU 12: synchronized TSC with CPU 0 (last diff 35 cycles, maxerr 8636
cycles)
SMP alternatives: switching to SMP code
Booting processor 13/16 APIC 0x31
Initializing CPU#13
Calibrating delay using timer specific routine.. 5672.55 BogoMIPS
(lpj=11345119)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU13: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 13: Syncing TSC to CPU 0.
CPU 13: synchronized TSC with CPU 0 (last diff -1125 cycles, maxerr 7829
cycles)
SMP alternatives: switching to SMP code
Booting processor 14/16 APIC 0x36
Initializing CPU#14
Calibrating delay using timer specific routine.. 5672.25 BogoMIPS
(lpj=11344507)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU14: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 14: Syncing TSC to CPU 0.
CPU 14: synchronized TSC with CPU 0 (last diff -796 cycles, maxerr 8568
cycles)
SMP alternatives: switching to SMP code
Booting processor 15/16 APIC 0x37
Initializing CPU#15
Calibrating delay using timer specific routine.. 5672.24 BogoMIPS
(lpj=11344495)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU15: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 15: Syncing TSC to CPU 0.
CPU 15: synchronized TSC with CPU 0 (last diff -3 cycles, maxerr 7531
cycles)
Brought up 16 CPUs
testing NMI watchdog ... OK.
time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
time.c: Detected 2835.836 MHz processor.
afinfo corrupted at init/main.c:755
migration_cost=29,1007
afinfo corrupted at init/main.c:761
afinfo corrupted at init/main.c:769
Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806077b0: helper_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b40: pm_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607bc0: ksysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a490: filelock_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afa0: init_script_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afb0: init_elf_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614400: sock_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614ba0: netlink_proto_init+0x0/0x1a0()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 16
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c080: kobject_uevent_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c210: pcibus_class_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c7e0: pci_driver_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060eca0: tty_class_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f790: vtconsole_class_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c920: acpi_pci_init+0x0/0x40()
afinfo corrupted at init/main.c:659
ACPI: bus type pci registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d65f: init_acpi_device_notify+0x0/0x4b()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613810: pci_access_init+0x0/0x30()
afinfo corrupted at init/main.c:659
PCI: Using configuration type 1
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806054d0: topology_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806074e0: param_sysfs_init+0x0/0x200()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ac50: init_bio+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bf40: genhd_device_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d472: acpi_init+0x0/0x1ed()
afinfo corrupted at init/main.c:659
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d945: acpi_ec_init+0x0/0x62()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dd5e: acpi_pci_root_init+0x0/0x28()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dda6: acpi_pci_link_init+0x0/0x48()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060df2c: acpi_power_init+0x0/0x77()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dfa3: acpi_system_init+0x0/0xc6()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e069: acpi_event_init+0x0/0x3f()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e0a8: acpi_scan_init+0x0/0x1ac()
afinfo corrupted at init/main.c:659
ACPI: PCI Root Bridge [VP00] (0000:00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: PCI Root Bridge [VP01] (0000:01)
ACPI: PCI Root Bridge [VP02] (0000:02)
ACPI: PCI Root Bridge [VP03] (0000:04)
ACPI: PCI Root Bridge [VP04] (0000:06)
ACPI: PCI Root Bridge [VP05] (0000:08)
ACPI: PCI Root Bridge [VP06] (0000:0a)
ACPI: PCI Root Bridge [VP07] (0000:0c)
ACPI: PCI Root Bridge [VP10] (0000:0e)
ACPI: PCI Root Bridge [VP11] (0000:0f)
ACPI: PCI Root Bridge [VP12] (0000:10)
ACPI: PCI Root Bridge [VP13] (0000:12)
ACPI: PCI Root Bridge [VP14] (0000:14)
ACPI: PCI Root Bridge [VP15] (0000:16)
ACPI: PCI Root Bridge [VP16] (0000:18)
ACPI: PCI Root Bridge [VP17] (0000:1a)
ACPI: PCI Root Bridge [VP20] (0000:1c)
ACPI: PCI Root Bridge [VP21] (0000:1d)
ACPI: PCI Root Bridge [VP22] (0000:1e)
ACPI: PCI Root Bridge [VP23] (0000:20)
ACPI: PCI Root Bridge [VP24] (0000:22)
ACPI: PCI Root Bridge [VP25] (0000:24)
ACPI: PCI Root Bridge [VP26] (0000:26)
ACPI: PCI Root Bridge [VP27] (0000:28)
ACPI: PCI Root Bridge [VP30] (0000:2a)
ACPI: PCI Root Bridge [VP31] (0000:2b)
ACPI: PCI Root Bridge [VP32] (0000:2c)
ACPI: PCI Root Bridge [VP33] (0000:2e)
ACPI: PCI Root Bridge [VP34] (0000:30)
ACPI: PCI Root Bridge [VP35] (0000:32)
ACPI: PCI Root Bridge [VP36] (0000:34)
ACPI: PCI Root Bridge [VP37] (0000:36)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e3c4: acpi_cm_sbs_init+0x0/0xc()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e3d0: pnp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux Plug and Play Support v0.97 (c) Adam Belay
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e660: pnpacpi_init+0x0/0x70()
afinfo corrupted at init/main.c:659
pnp: PnP ACPI init
pnp: PnP ACPI: found 47 devices
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f200: misc_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611560: init_scsi+0x0/0x90()
afinfo corrupted at init/main.c:659
SCSI subsystem initialized
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612240: serio_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612660: input_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612a70: rtc_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ac0: rtc_sysfs_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ad0: rtc_proc_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612ae0: rtc_dev_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613840: pci_acpi_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806138f0: pci_legacy_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613ea0: pcibios_irq_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614390: pcibios_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806144c0: proto_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614660: net_dev_init+0x0/0x210()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614d40: genl_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 3707069 Hz
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ffe20: pci_iommu_init+0x0/0x20()
afinfo corrupted at init/main.c:659
PCI-GART: No AMD northbridge found.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a410: init_pipe_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e294: acpi_motherboard_init+0x0/0x130()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e500: pnp_system_init+0x0/0x10()
afinfo corrupted at init/main.c:659
pnp: 00:0a: ioport range 0x400-0x47f has been reserved
pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e9e0: chr_dev_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806107b0: firmware_class_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613220: pcibios_assign_resources+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615750: inet_init+0x0/0x400()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 2
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe760: i8259A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe730: init_timer_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fed80: vsyscall_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff010: sbf_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ffdf0: i8237A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600270: periodic_mcheck_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806002a0: mce_init_device+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806003e0: thermal_throttle_init_device
+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600450: threshold_init_device+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80601c50: init_lapic_sysfs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806027f0: ioapic_init_sysfs+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806055e0: x8664_sysctl_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606aa0: create_proc_profile+0x0/0x280()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606ee0: ioresources_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607050: timekeeping_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607170: uid_cache_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806076e0: init_posix_timers+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806077f0: init_posix_cpu_timers+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607910: latency_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a00: init_clocksource_sysfs+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a60: init_jiffies_clocksource+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a70: init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607ae0: proc_dma_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b10: kallsyms_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607b80: ikconfig_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80608cd0: init_per_zone_pages_min+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609c40: pdflush_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609c90: kswapd_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609cc0: setup_vmstat+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609d30: procswaps_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609da0: hugetlb_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Total HugeTLB memory allocated, 0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609e10: init_tmpfs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609ef0: cpucache_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a460: fasync_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ab70: aio_setup+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060adf0: inotify_setup+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ae00: inotify_user_setup+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aec0: eventpoll_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060afc0: init_mbcache+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aff0: dnotify_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b4b0: init_devpts_fs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b4f0: init_reiserfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b570: init_ext3_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b6a0: journal_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b780: init_ext2_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b840: init_ramfs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b850: init_hugetlbfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b910: init_fat_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b960: init_vfat_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b970: init_nls_cp437+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b980: init_nls_iso8859_1+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b990: init_autofs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b9a0: init_autofs4_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
initcall at 0xffffffff8060b9a0: init_autofs4_fs+0x0/0x10(): returned
with error code -16
Calling initcall 0xffffffff8060b9b0: ipc_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc80: init_mqueue_fs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bd60: crypto_algapi_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bda0: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bdb0: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfa0: noop_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler noop registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfb0: as_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler anticipatory registered (default)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfc0: deadline_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler deadline registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bfd0: cfq_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
io scheduler cfq registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c7f0: pci_sysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c830: pci_proc_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d6aa: acpi_ac_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d6ef: acpi_battery_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dd00: acpi_video_init+0x0/0x5e()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ddee: irqrouter_init_sysfs+0x0/0x38()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ea80: rand_initialize+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060eab0: tty_init+0x0/0x1f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed10: pty_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f850: hpet_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f8c0: agp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux agpgart interface v0.101 (c) Dave Jones
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fa20: cn_proc_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fe60: serial8250_init+0x0/0x150()
afinfo corrupted at init/main.c:659
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing
disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610090: serial8250_pnp_init+0x0/0x10()
afinfo corrupted at init/main.c:659
00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806100a0: serial8250_pci_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610830: e1000_init_module+0x0/0x50()
afinfo corrupted at init/main.c:659
Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
Copyright (c) 1999-2006 Intel Corporation.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610880: tg3_init+0x0/0x10()
afinfo corrupted at init/main.c:659
tg3.c:v3.66 (September 23, 2006)
ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:0d:60:98:63:54
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1]
TSOcap[0]
eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:0d:60:98:63:55
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]
TSOcap[1]
eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1]
TSOcap[0]
eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]
TSOcap[1]
eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1]
TSOcap[0]
eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]
TSOcap[1]
eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1]
TSOcap[0]
eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit)
10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1]
TSOcap[1]
eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610910: net_olddevs_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
afinfo corrupted at init/main.c:659
netconsole: not configured, aborting
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806109e0: piix_ide_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610b20: ide_init+0x0/0x90()
afinfo corrupted at init/main.c:659
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB6: chipset revision 160
SvrWks CSB6: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
SvrWks CSB6: simplex device: DMA disabled
ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806114f0: ide_generic_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611510: idedisk_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611520: ide_cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.20
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611530: idefloppy_init+0x0/0x30()
afinfo corrupted at init/main.c:659
ide-floppy driver 0.99.newide
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611800: raid_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611810: spi_transport_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611850: fc_transport_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806118a0: iscsi_transport_init+0x0/0x120()
afinfo corrupted at init/main.c:659
Loading iSCSI transport class v2.0-685.afinfo corrupted at
init/main.c:663
Calling initcall 0xffffffff806119c0: sas_transport_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611a80: iscsi_tcp_init+0x0/0x50()
afinfo corrupted at init/main.c:659
iscsi: registered transport (tcp)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611ad0: aac_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Adaptec aacraid driver (1.1-5[2409]-mh2)
ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.0-2[8264]
AAC0: monitor 5.0-2[8264]
AAC0: bios 5.0-2[8264]
AAC0: serial 162348
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access IBM Drive 1 V1.0 PQ: 0
ANSI: 2
scsi 0:0:1:0: Direct-Access IBM Drive 2 V1.0 PQ: 0
ANSI: 2
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611b40: qla1280_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d10: sym2_init+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611e20: init_sd+0x0/0x60()
afinfo corrupted at init/main.c:659
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi removable disk sda
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
sdb: sdb1 sdb2 sdb3
sd 0:0:1:0: Attached scsi removable disk sdb
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611e80: fusion_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT base driver 3.04.01
Copyright (c) 1999-2005 LSI Logic Corporation
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611f80: mptspi_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
Fusion MPT SPI Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612040: mptfc_init+0x0/0xf0()
afinfo corrupted at init/main.c:659
Fusion MPT FC Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612130: mptctl_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT misc device (ioctl) driver 3.04.01
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612230: cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612310: i8042_init+0x0/0x350()
afinfo corrupted at init/main.c:659
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612780: mousedev_init+0x0/0x100()
afinfo corrupted at init/main.c:659
mice: PS/2 mouse device common for all mice
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612880: atkbd_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612b90: hwmon_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806149d0: flow_cache_init+0x0/0x1d0()
afinfo corrupted at init/main.c:659
input: AT Translated Set 2 keyboard as /class/input/input0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615e60: init_syncookies+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80615e80: xfrm4_beet_init+0x0/0x20()
afinfo corrupted at init/main.c:659
Unable to handle kernel NULL pointer dereference at 0000000000000827
RIP:
[<ffffffff80470666>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #4
RIP: 0010:[<ffffffff80470666>] [<ffffffff80470666>] xfrm_register_mode
+0x36/0x60
RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff805d2000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task
ffff810bffcbb4e0)
Stack: 0000000000000000 0000000000000000 ffffffff8061fc48
ffffffff802071d6
6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
[<ffffffff802071d6>] init+0x1b6/0x3b0
[<ffffffff8020aa28>] child_rip+0xa/0x12
[<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
[<ffffffff80207020>] init+0x0/0x3b0
[<ffffffff8020aa1e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
RIP [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
RSP <ffff810bffcbded0>
CR2: 0000000000000827
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
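
A quick cross-check of the register dump above, offered as a sketch only. It assumes the Code: bytes as posted begin at the faulting instruction (the usual marker around the byte at RIP is not visible in this copy). Under that assumption, 48 83 78 08 00 decodes as a 64-bit compare of the quadword at 0x8(%rax) against zero, and RAX + 8 = 0x81f + 0x8 = 0x827, which is the CR2 value reported. The helper names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte. */
struct modrm { unsigned mod, reg, rm; };

/* Split a ModRM byte into its mod (2 bits), reg (3 bits), rm (3 bits) fields. */
static struct modrm decode_modrm(uint8_t byte)
{
    struct modrm m;
    m.mod = byte >> 6;
    m.reg = (byte >> 3) & 7;
    m.rm  = byte & 7;
    return m;
}

/* Effective address of a "disp8(base)" operand: base plus sign-extended disp8. */
static uint64_t effective_address(uint64_t base, int8_t disp8)
{
    return base + (uint64_t)(int64_t)disp8;
}

/*
 * Check that a "disp8(%rax)" memory operand would touch the address in CR2.
 * Only the mod == 1 (8-bit displacement), rm == 0 (base is RAX) form is
 * handled, which is the form the posted bytes use.
 */
static int fault_matches(uint64_t rax, uint8_t modrm_byte, int8_t disp8,
                         uint64_t cr2)
{
    struct modrm m = decode_modrm(modrm_byte);
    assert(m.mod == 1 && m.rm == 0);
    return effective_address(rax, disp8) == cr2;
}
```

Calling fault_matches(0x81f, 0x78, 0x08, 0x827) with the values from the dump returns nonzero, so the fault is consistent with RAX holding a small bogus value rather than a valid kernel pointer.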

--

Steve Fox
IBM Linux Technology Center

2006-10-05 18:52:53

by Vivek Goyal

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 19:57, Steve Fox wrote:
> > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> >
> > > Please don't snip the Code: line. It is fairly important.
> >
> > Sorry about that. The remote console I was using appears to overwrite
> > some text after I force the reboot. Here's a clean one.
> >
> > global ffffffffffffffff
>
> Ok that definitely shouldn't be in there.
>
> I guess we need to track when it gets corrupted. Can you send the full
> boot log with this patch applied?
>

Just recalled one more observation about the problem from when Keith last
reported it. When I moved .bss before .data_nosave, instead of leaving it
at the end, Keith's problem disappeared.

Thanks
Vivek

2006-10-05 19:05:15

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 20:51, Steve Fox wrote:
> On Thu, 2006-10-05 at 20:27 +0200, Andi Kleen wrote:
>
> > I guess we need to track when it gets corrupted. Can you send the full
> > boot log with this patch applied?
>
> Here she blows!

Can you please try it again with this patch to narrow it down further?

-Andi

Index: linux-2.6.19-rc1-hack/init/main.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/init/main.c
+++ linux-2.6.19-rc1-hack/init/main.c
@@ -75,6 +75,9 @@

static int init(void *);

+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
extern void init_IRQ(void);
extern void fork_init(unsigned long);
extern void mca_init(void);
@@ -480,6 +483,8 @@ asmlinkage void __init start_kernel(void
char * command_line;
extern struct kernel_param __start___param[], __stop___param[];

+ CHECK;
+
smp_setup_processor_id();

/*
@@ -502,7 +507,9 @@ asmlinkage void __init start_kernel(void
page_address_init();
printk(KERN_NOTICE);
printk(linux_banner);
+ CHECK;
setup_arch(&command_line);
+ CHECK;
setup_per_cpu_areas();
smp_prepare_boot_cpu(); /* arch-specific boot-cpu hooks */

@@ -517,6 +524,7 @@ asmlinkage void __init start_kernel(void
* fragile until we cpu_idle() for the first time.
*/
preempt_disable();
+ CHECK;
build_all_zonelists();
page_alloc_init();
printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line);
@@ -525,6 +533,7 @@ asmlinkage void __init start_kernel(void
__stop___param - __start___param,
&unknown_bootoption);
sort_main_extable();
+ CHECK;
trap_init();
rcu_init();
init_IRQ();
@@ -533,8 +542,10 @@ asmlinkage void __init start_kernel(void
hrtimers_init();
softirq_init();
timekeeping_init();
+ CHECK;
time_init();
profile_init();
+ CHECK;
if (!irqs_disabled())
printk("start_kernel(): bug: interrupts were enabled early\n");
early_boot_irqs_on();
@@ -568,7 +579,9 @@ asmlinkage void __init start_kernel(void
#endif
vfs_caches_init_early();
cpuset_init_early();
+ CHECK;
mem_init();
+ CHECK;
kmem_cache_init();
setup_per_cpu_pageset();
numa_policy_init();
@@ -577,6 +590,7 @@ asmlinkage void __init start_kernel(void
calibrate_delay();
pidmap_init();
pgtable_cache_init();
+ CHECK;
prio_tree_init();
anon_vma_init();
#ifdef CONFIG_X86
@@ -586,12 +600,14 @@ asmlinkage void __init start_kernel(void
fork_init(num_physpages);
proc_caches_init();
buffer_init();
+ CHECK;
unnamed_dev_init();
key_init();
security_init();
vfs_caches_init(num_physpages);
radix_tree_init();
signals_init();
+ CHECK;
/* rootfs populating might need page-writeback */
page_writeback_init();
#ifdef CONFIG_PROC_FS
@@ -599,6 +615,7 @@ asmlinkage void __init start_kernel(void
#endif
cpuset_init();
taskstats_init_early();
+ CHECK;
delayacct_init();

check_bugs();
@@ -609,7 +626,7 @@ asmlinkage void __init start_kernel(void
rest_init();
}

-static int __initdata initcall_debug;
+static int __initdata initcall_debug = 1;

static int __init initcall_debug_setup(char *str)
{
@@ -639,7 +656,11 @@ static void __init do_initcalls(void)
printk("\n");
}

+ CHECK;
+
result = (*call)();
+
+ CHECK;

if (result && result != -ENODEV && initcall_debug) {
sprintf(msgbuf, "error code %d", result);
@@ -725,21 +746,32 @@ static int init(void * unused)

smp_prepare_cpus(max_cpus);

+ CHECK;
+
do_pre_smp_initcalls();

smp_init();
+
+ CHECK;
+
sched_init_smp();

cpuset_init_smp();

+ CHECK;
+
/*
* Do this before initcalls, because some drivers want to access
* firmware files.
*/
populate_rootfs();

+ CHECK;
+
do_basic_setup();

+ CHECK;
+
/*
* check if there is an early userspace init. If yes, let it do all
* the work
Index: linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/net/xfrm/xfrm_policy.c
+++ linux-2.6.19-rc1-hack/net/xfrm/xfrm_policy.c
@@ -39,6 +39,16 @@ EXPORT_SYMBOL(xfrm_policy_count);
static DEFINE_RWLOCK(xfrm_policy_afinfo_lock);
static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO];

+void bugcheck(char *where, int line)
+{
+ int i;
+ for (i = 0; i < NPROTO; i++)
+ if (xfrm_policy_afinfo[i] == (void *)-1UL) {
+ panic("afinfo corrupted at %s:%d\n",where,line);
+ return;
+ }
+}
+
static kmem_cache_t *xfrm_dst_cache __read_mostly;

static struct work_struct xfrm_policy_gc_work;
Index: linux-2.6.19-rc1-hack/arch/x86_64/kernel/setup.c
===================================================================
--- linux-2.6.19-rc1-hack.orig/arch/x86_64/kernel/setup.c
+++ linux-2.6.19-rc1-hack/arch/x86_64/kernel/setup.c
@@ -65,6 +65,12 @@
#include <asm/sections.h>
#include <asm/dmi.h>

+
+
+extern void bugcheck(char *, int);
+#define CHECK bugcheck(__FILE__, __LINE__)
+
+
/*
* Machine setup..
*/
@@ -351,14 +357,22 @@ void __init setup_arch(char **cmdline_p)
saved_video_mode = SAVED_VIDEO_MODE;
bootloader_type = LOADER_TYPE;

+ CHECK;
+
#ifdef CONFIG_BLK_DEV_RAM
rd_image_start = RAMDISK_FLAGS & RAMDISK_IMAGE_START_MASK;
rd_prompt = ((RAMDISK_FLAGS & RAMDISK_PROMPT_FLAG) != 0);
rd_doload = ((RAMDISK_FLAGS & RAMDISK_LOAD_FLAG) != 0);
#endif
+
+ CHECK;
+
setup_memory_region();
+ CHECK;
copy_edd();

+ CHECK;
+
if (!MOUNT_ROOT_RDONLY)
root_mountflags &= ~MS_RDONLY;
init_mm.start_code = (unsigned long) &_text;
@@ -373,14 +387,25 @@ void __init setup_arch(char **cmdline_p)

early_identify_cpu(&boot_cpu_data);

+ CHECK;
+
+
strlcpy(command_line, saved_command_line, COMMAND_LINE_SIZE);
*cmdline_p = command_line;

+ CHECK;
+
+
parse_early_param();

+ CHECK;
+
finish_e820_parsing();
+ CHECK;

e820_register_active_regions(0, 0, -1UL);
+ CHECK;
+
/*
* partially used pages are not usable - thus
* we are rounding upwards:
@@ -389,14 +414,19 @@ void __init setup_arch(char **cmdline_p)
num_physpages = end_pfn;

check_efer();
+ CHECK;

discover_ebda();
+ CHECK;

init_memory_mapping(0, (end_pfn_map << PAGE_SHIFT));
+ CHECK;

dmi_scan_machine();
+ CHECK;

zap_low_mappings(0);
+ CHECK;

#ifdef CONFIG_ACPI
/*
@@ -405,6 +435,7 @@ void __init setup_arch(char **cmdline_p)
*/
acpi_boot_table_init();
#endif
+ CHECK;

/* How many end-of-memory variables you have, grandma! */
max_low_pfn = end_pfn;
@@ -413,6 +444,7 @@ void __init setup_arch(char **cmdline_p)

/* Remove active ranges so rediscovery with NUMA-awareness happens */
remove_all_active_ranges();
+ CHECK;

#ifdef CONFIG_ACPI_NUMA
/*
@@ -420,20 +452,24 @@ void __init setup_arch(char **cmdline_p)
*/
acpi_numa_init();
#endif
+ CHECK;

#ifdef CONFIG_NUMA
numa_initmem_init(0, end_pfn);
#else
contig_initmem_init(0, end_pfn);
#endif
+ CHECK;

/* Reserve direct mapping */
reserve_bootmem_generic(table_start << PAGE_SHIFT,
(table_end - table_start) << PAGE_SHIFT);
+ CHECK;

/* reserve kernel */
reserve_bootmem_generic(__pa_symbol(&_text),
__pa_symbol(&_end) - __pa_symbol(&_text));
+ CHECK;

/*
* reserve physical page 0 - it's a special BIOS page on many boxes,
@@ -444,6 +480,7 @@ void __init setup_arch(char **cmdline_p)
/* reserve ebda region */
if (ebda_addr)
reserve_bootmem_generic(ebda_addr, ebda_size);
+ CHECK;

#ifdef CONFIG_SMP
/*
@@ -456,6 +493,7 @@ void __init setup_arch(char **cmdline_p)
/* Reserve SMP trampoline */
reserve_bootmem_generic(SMP_TRAMPOLINE_BASE, PAGE_SIZE);
#endif
+ CHECK;

#ifdef CONFIG_ACPI_SLEEP
/*
@@ -463,10 +501,14 @@ void __init setup_arch(char **cmdline_p)
*/
acpi_reserve_bootmem();
#endif
+ CHECK;
+
/*
* Find and reserve possible boot-time SMP configuration:
*/
find_smp_config();
+ CHECK;
+
#ifdef CONFIG_BLK_DEV_INITRD
if (LOADER_TYPE && INITRD_START) {
if (INITRD_START + INITRD_SIZE <= (end_pfn << PAGE_SHIFT)) {
@@ -484,18 +526,23 @@ void __init setup_arch(char **cmdline_p)
}
}
#endif
+ CHECK;
+
#ifdef CONFIG_KEXEC
if (crashk_res.start != crashk_res.end) {
reserve_bootmem_generic(crashk_res.start,
crashk_res.end - crashk_res.start + 1);
}
#endif
+ CHECK;

paging_init();
+ CHECK;

#ifdef CONFIG_PCI
early_quirks();
#endif
+ CHECK;

/*
* set this early, so we dont allocate cpu0
@@ -509,25 +556,36 @@ void __init setup_arch(char **cmdline_p)
*/
acpi_boot_init();
#endif
+ CHECK;

init_cpu_to_node();
+ CHECK;

/*
* get boot-time SMP configuration:
*/
if (smp_found_config)
get_smp_config();
+ CHECK;
+
init_apic_mappings();
+ CHECK;

/*
* Request address space for all standard RAM and ROM resources
* and also for regions reported as reserved by the e820.
*/
probe_roms();
+ CHECK;
+
e820_reserve_resources();
+ CHECK;
+
e820_mark_nosave_regions();
+ CHECK;

request_resource(&iomem_resource, &video_ram_resource);
+ CHECK;

{
unsigned i;
@@ -535,8 +593,10 @@ void __init setup_arch(char **cmdline_p)
for (i = 0; i < ARRAY_SIZE(standard_io_resources); i++)
request_resource(&ioport_resource, &standard_io_resources[i]);
}
+ CHECK;

e820_setup_gap();
+ CHECK;

#ifdef CONFIG_VT
#if defined(CONFIG_VGA_CONSOLE)

2006-10-05 19:09:06

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 20:52, Vivek Goyal wrote:
> On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
> > On Thursday 05 October 2006 19:57, Steve Fox wrote:
> > > On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
> > >
> > > > Please don't snip the Code: line. It is fairly important.
> > >
> > > Sorry about that. The remote console I was using appears to overwrite
> > > some text after I force the reboot. Here's a clean one.
> > >
> > > global ffffffffffffffff
> >
> > Ok that definitely shouldn't be in there.
> >
> > I guess we need to track when it gets corrupted. Can you send the full
> > boot log with this patch applied?
> >
>
> Just recalled one more observation about the problem when keith had
> reported it last. If I just move .bss before .data_nosave instead
> of it being at the end, keith's problem had disappeared.

Yes, it could well be something in the new bootmap
management. Steve's box failed at

Using ACPI (MADT) for SMP configuration information
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at init/main.c:512

which is directly after that code does lots of stuff.

Mel might want to take a look (and perhaps
also cut down a little on the ugly printks ...)

BTW, I have now also found one of my test systems that prints a lot of
the messages below. I'm about to leave for vacation, so I won't have time
to track it down any time soon. But here it is for reference.

-Andi

Please enable the IOMMU option in the BIOS setup
This costs you 64 MB of RAM
Mapping aperture over 65536 KB of RAM @ 8000000
Bad page state in process 'swapper'
page:ffff810003ee5480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
[<ffffffff8020ac84>] show_trace+0x34/0x47
[<ffffffff8020aca9>] dump_stack+0x12/0x17
[<ffffffff802586a7>] bad_page+0x57/0x81
[<ffffffff80258791>] __free_pages_ok+0x64/0x247
[<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
[<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
[<ffffffff807c915e>] mem_init+0x44/0x186
[<ffffffff807bc5f0>] start_kernel+0x17b/0x207
[<ffffffff807bc168>] _sinittext+0x168/0x16c

Bad page state in process 'swapper'
page:ffff810003ee54b8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
[<ffffffff8020ac84>] show_trace+0x34/0x47
[<ffffffff8020aca9>] dump_stack+0x12/0x17
[<ffffffff802586a7>] bad_page+0x57/0x81
[<ffffffff80258791>] __free_pages_ok+0x64/0x247
[<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
[<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
[<ffffffff807c915e>] mem_init+0x44/0x186
[<ffffffff807bc5f0>] start_kernel+0x17b/0x207
[<ffffffff807bc168>] _sinittext+0x168/0x16c


... lots more of those ...

2006-10-05 20:25:16

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 21:08 +0200, Andi Kleen wrote:

> Mel might want to take a look (and perhaps
> also cut down a little on the ugly printks ...)

I tested a patch from Mel which backs out the arch independent zone
sizing and got the same results (to my inexperienced eye). I've sent him
the boot log to verify they really are the same as without this
back-out.

--

Steve Fox
IBM Linux Technology Center

2006-10-05 20:39:19

by Mel Gorman

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 5 Oct 2006, Andi Kleen wrote:

> On Thursday 05 October 2006 20:52, Vivek Goyal wrote:
>> On Thu, Oct 05, 2006 at 08:27:02PM +0200, Andi Kleen wrote:
>>> On Thursday 05 October 2006 19:57, Steve Fox wrote:
>>>> On Thu, 2006-10-05 at 17:40 +0200, Andi Kleen wrote:
>>>>
>>>>> Please don't snip the Code: line. It is fairly important.
>>>>
>>>> Sorry about that. The remote console I was using appears to overwrite
>>>> some text after I force the reboot. Here's a clean one.
>>>>
>>>> global ffffffffffffffff
>>>
>>> Ok that definitely shouldn't be in there.
>>>
>>> I guess we need to track when it gets corrupted. Can you send the full
>>> boot log with this patch applied?
>>>
>>
>> Just recalled one more observation about the problem when keith had
>> reported it last. If I just move .bss before .data_nosave instead
>> of it being at the end, keith's problem had disappeared.
>
> Yes, that could well be that it's something in the new bootmap
> management. Steve's box failed at
>
> Using ACPI (MADT) for SMP configuration information
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at init/main.c:512
>
> which is directly after that code does lots of stuff.
>
> Mel might want to take a look (and perhaps
> also cut down a little on the ugly printks ...)
>

Steve tested a patch with arch-independent zone-sizing backed out for
x86_64 and things looked ok, but that is no guarantee it is not a
contributory factor. The "Nosave address range:" printks are related to a
suspend problem that was reported around the end of June, I believe.

I'll pick this up in the morning because I should have access to the same
machine Steve does and see what I can come up with.

> BTW I found one of my test systems too now which does a lot of:
> I'm about to leave for vacation so i won't have time to track it down
> any time soon. But here is it for reference.
>

hmm, rather than bugging you with patches now, I'll see what I can find
with the x86_64 machines I have access to and see can I reproduce it.

> -Andi
>
> Please enable the IOMMU option in the BIOS setup
> This costs you 64 MB of RAM
> Mapping aperture over 65536 KB of RAM @ 8000000
> Bad page state in process 'swapper'
> page:ffff810003ee5480 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
>
> Call Trace:
> [<ffffffff8020ac84>] show_trace+0x34/0x47
> [<ffffffff8020aca9>] dump_stack+0x12/0x17
> [<ffffffff802586a7>] bad_page+0x57/0x81
> [<ffffffff80258791>] __free_pages_ok+0x64/0x247
> [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
> [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
> [<ffffffff807c915e>] mem_init+0x44/0x186
> [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
> [<ffffffff807bc168>] _sinittext+0x168/0x16c
>
> Bad page state in process 'swapper'
> page:ffff810003ee54b8 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
> Trying to fix it up, but a reboot is needed
> Backtrace:
>
> Call Trace:
> [<ffffffff8020ac84>] show_trace+0x34/0x47
> [<ffffffff8020aca9>] dump_stack+0x12/0x17
> [<ffffffff802586a7>] bad_page+0x57/0x81
> [<ffffffff80258791>] __free_pages_ok+0x64/0x247
> [<ffffffff807cca72>] free_all_bootmem_core+0xcc/0x1a9
> [<ffffffff807ca08b>] numa_free_all_bootmem+0x3b/0x77
> [<ffffffff807c915e>] mem_init+0x44/0x186
> [<ffffffff807bc5f0>] start_kernel+0x17b/0x207
> [<ffffffff807bc168>] _sinittext+0x168/0x16c
>
>
> ... lots more of those ...
>

--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab

2006-10-05 20:42:40

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:

> Can you please try it again with this patch to narrow it down further?

Unfortunately this is as far as it got before it hung.

root (hd0,0)
Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160080320
[Linux-bzImage, setup=0x1400, size=0x1dd871]
initrd /boot/initrd-autobench.img
[Linux-initrd @ 0x37ceb000, 0x304c57 bytes]


--

Steve Fox
IBM Linux Technology Center

2006-10-05 20:51:08

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thursday 05 October 2006 22:42, Steve Fox wrote:
> On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
>
> > Can you please try it again with this patch to narrow it down further?
>
> Unfortunately this is as far as it got before it hung.

Boot with earlyprintk=serial,ttyS0,57600
(or change the panic in the check function back to a printk)

-Andi
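
For readers following the thread, the checkpoint technique being discussed (a CHECK after every boot step, logging rather than panicking) can be modelled in user space. This is only a sketch under assumptions: the real macro definition is not shown in this part of the thread, and `watched_model`, `check_at`, `buggy_step` and `run_demo` are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the structure under suspicion. */
struct watched {
	unsigned long global;	/* expected to stay 0 in this model */
};

static struct watched watched_model;
static int corrupt_reports;	/* how many checkpoints saw bad state */
static int first_bad_line;	/* source line of the first failing checkpoint */

/* Log and continue instead of stopping, so later checkpoints still run. */
static void check_at(const char *file, int line)
{
	if (watched_model.global == 0)
		return;
	if (corrupt_reports++ == 0)
		first_bad_line = line;
	printf("afinfo corrupted at %s:%d\n", file, line);
}

/* Expands at the call site, so __FILE__/__LINE__ identify the checkpoint. */
#define CHECK check_at(__FILE__, __LINE__)

static void harmless_step(void) { }

/* Simulated bug: scribbles all-ones over the watched structure. */
static void buggy_step(void)
{
	memset(&watched_model, 0xff, sizeof(watched_model));
}

/* Runs three steps with a checkpoint after each; returns the report count. */
int run_demo(void)
{
	assert(watched_model.global == 0);	/* the model must start clean */
	harmless_step();
	CHECK;		/* silent: nothing has gone wrong yet */
	buggy_step();
	CHECK;		/* first report: the culprit ran just before this */
	harmless_step();
	CHECK;		/* keeps reporting, the damage persists */
	return corrupt_reports;
}
```

Calling `run_demo()` from a `main()` prints one line per failing checkpoint. Only the first reported line is informative: the step between it and the last silent checkpoint is the suspect, and every later report merely repeats the same damage.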

2006-10-05 20:51:46

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64


> hmm, rather than bugging you with patches now, I'll see what I can find
> with the x86_64 machines I have access to and see can I reproduce it.

I started the bisect, should finish soon.

-Andi

2006-10-05 22:37:58

by Pavel Roskin

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

Hello!

On Tue, 2006-10-03 at 09:34 -0700, Jean Tourrilhes wrote:
> I don't really want to overstep my authority there, my goal
> was to minimise the changes. Pavel will have to clean up my mess, so I
> don't want change things too much.

Sorry for a long delay.

I'm actually not very interested in the Wireless Extension interface of
the driver. The less I touch that code, the better I feel. I won't add
to the criticism for the latest changes; enough has been said.

It's fine with me that you are changing the orinoco driver to update
Wireless Extensions compatibility.

I'm trying to maintain a Subversion repository with the driver modified
to be compatible with a few latest kernels. But it looks like it's an
uphill battle that I'm not going to win.

--
Regards,
Pavel Roskin

2006-10-05 22:46:38

by Jean Tourrilhes

Subject: Re: 2.6.18-mm2 - oops in cache_alloc_refill()

On Thu, Oct 05, 2006 at 06:37:53PM -0400, Pavel Roskin wrote:
> Hello!
>
> On Tue, 2006-10-03 at 09:34 -0700, Jean Tourrilhes wrote:
> > I don't really want to overstep my authority there, my goal
> > was to minimise the changes. Pavel will have to clean up my mess, so I
> > don't want change things too much.
>
> Sorry for a long delay.

That's ok, we all have a real life ;-)

> I'm actually not very interested in the Wireless Extension interface of
> the driver. The less I touch that code, the better I feel. I won't add
> to the criticism for the latest changes; enough has been said.
>
> It's fine with me that you are changing the orinoco driver to update
> Wireless Extensions compatibility.
>
> I'm trying to maintain a Subversion repository with the driver modified
> to be compatible with a few latest kernels. But it looks like it's an
> uphill battle that I'm not going to win.

I'll try to come up with a patch for you. It's not as bad as
it looks. It will look like the patch for the external ipw
drivers I sent to the list.

> Pavel Roskin

Have fun...

Jean

2006-10-05 23:14:20

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II

On Thursday 05 October 2006 22:51, Andi Kleen wrote:
>
> > hmm, rather than bugging you with patches now, I'll see what I can find
> > with the x86_64 machines I have access to and see can I reproduce it.
>
> I started the bisect, should finish soon.

It ended at

diff-tree d5cdb67236dba94496de052c9f9f431e1fc658f4 (from 0dad3510ee82bcf8a380b81a2184a664a911ef9c)
Author: Satoru Takeuchi <[email protected]>
Date: Tue Sep 12 10:19:00 2006 -0700

acpiphp: disable bridges

Currently acpiphp calls pci_enable_device() against all
hot-added bridges, but acpiphp does not call pci_disable_device()
against them in hot-remove. So ioapic hot-remove would fail.
This patch fixes this issue.

I'm not sure that is really it; it is possible I made a mistake during the
bisect (the symptoms changed from bad page to just networking not working
somewhere around 4cfee88ad30acc47f02b8b7ba3db8556262dce1e).

Unfortunately I won't have time to rerun it for some time. It would be
useful if anyone else could take a look.

-Andi
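
The bisection described above can be sketched as a plain binary search over an ordered patch series. This is an illustrative model only, not the actual git or -mm tooling: `bisect_first_bad`, `is_bad_fn` and `demo_is_bad` are hypothetical names, and the series length and culprit position in the demo predicate are made up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical test hook: returns nonzero if a tree with the first
 * `applied` patches of the series shows the bug (one build + boot).
 */
typedef int (*is_bad_fn)(size_t applied);

/*
 * Binary search for the first bad patch. Assumes that 0 patches is good,
 * that all n patches is bad, and that once the bug appears it stays.
 * Returns the 1-based index of the patch that introduced the bug and,
 * if `builds` is non-NULL, adds the number of trees tested to it.
 */
size_t bisect_first_bad(size_t n, is_bad_fn is_bad, unsigned *builds)
{
	size_t good = 0, bad = n;

	assert(n > 0);
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (builds)
			(*builds)++;
		if (is_bad(mid))
			bad = mid;	/* culprit is at or before mid */
		else
			good = mid;	/* culprit is after mid */
	}
	return bad;
}

/* Made-up example: a 1000-patch series in which patch 700 is the culprit. */
int demo_is_bad(size_t applied)
{
	return applied >= 700;
}
```

With the made-up predicate, `bisect_first_bad(1000, demo_is_bad, &builds)` converges on patch 700 after ten test builds. The search is only as reliable as the predicate: if the symptom changes partway through, as it did during the bisect reported above, a "good" or "bad" answer can be wrong and the result may point at an innocent patch.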

2006-10-05 23:33:05

by Keith Mannthey

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II

On Fri, 2006-10-06 at 01:14 +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 22:51, Andi Kleen wrote:
> >
> > > hmm, rather than bugging you with patches now, I'll see what I can find
> > > with the x86_64 machines I have access to and see can I reproduce it.
> >
> > I started the bisect, should finish soon.
>
> It ended at
>
> diff-tree d5cdb67236dba94496de052c9f9f431e1fc658f4 (from 0dad3510ee82bcf8a380b81a2184a664a911ef9c)
> Author: Satoru Takeuchi <[email protected]>
> Date: Tue Sep 12 10:19:00 2006 -0700
>
> acpiphp: disable bridges
>
> Currently acpiphp calls pci_enable_device() against all
> hot-added bridges, but acpiphp does not call pci_disable_device()
> against them in hot-remove. So ioapic hot-remove would fail.
> This patch fixes this issue.
>
> I'm not sure that is really it; it is possible I made a mistake during the
> bisect (the symptoms changed from bad page to just networking not working
> somewhere around 4cfee88ad30acc47f02b8b7ba3db8556262dce1e).
>
> Unfortunately I won't have time to rerun it for some time. It would be
> useful if anyone else could take a look.

As of yet I haven't been able to recreate the hang. I am running
similar HW to Steve.

Thanks,
Keith

2006-10-05 23:38:39

by Andi Kleen

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II


> As of yet I haven't been able to recreate the hang. I am running
> similar HW to Steve.

That was on a 4 core Opteron with Tyan board (S2881) and AMD-8111
chipset.

-Andi

2006-10-05 23:58:37

by Keith Mannthey

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II

On Fri, 2006-10-06 at 01:35 +0200, Andi Kleen wrote:
> > As of yet I haven't been able to recreate the hang. I am running
> > similar HW to Steve.

I ran into this with -mm3

Memory: 24150368k/26738688k available (1933k kernel code, 490260k reserved, 978k data, 308k init)
------------[ cut here ]------------
kernel BUG in init_list at mm/slab.c:1334!
invalid opcode: 0000 [1] SMP
last sysfs file:
CPU 0
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.18-mm3-smp #1
RIP: 0010:[<ffffffff8027f8fa>] [<ffffffff8027f8fa>] init_list+0x1d/0xfd
RSP: 0018:ffffffff80577f48 EFLAGS: 00010212
RAX: 0000000000000040 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffffff805ba848 RDI: ffff810460700040
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: ffffffff805bc268 R12: ffff810460700040
R13: ffffffff805ba848 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff804d8000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006a0
Process swapper (pid: 0, threadinfo ffffffff80576000, task ffffffff80455840)
Stack: 0000000000000000 0000000000000000 0000000100000000 0000000000000001
ffffffff805ba848 0000000000000000 0000000000000000 ffffffff80593aa8
00000000000002c0 0000000100000001 000000000008ef00 000000000008c000
Call Trace:
[<ffffffff80593aa8>] kmem_cache_init+0x344/0x406
[<ffffffff805805ef>] start_kernel+0x180/0x21b
[<ffffffff8058016a>] _sinittext+0x16a/0x16e


Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
RIP [<ffffffff8027f8fa>] init_list+0x1d/0xfd
RSP <ffffffff80577f48>
<0>Kernel panic - not syncing: Attempted to kill the idle task!


I am going to revert the patch and see if it works. I ran -git22 just
fine.

Thanks,
Keith

2006-10-06 00:02:58

by Badari Pulavarty

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II

keith mannthey wrote:
> On Fri, 2006-10-06 at 01:35 +0200, Andi Kleen wrote:
>
>>> As of yet I haven't been able to recreate the hang. I am running
>>> similar HW to Steve.
>>>
>
> I ran into this with -mm3
>
> Memory: 24150368k/26738688k available (1933k kernel code, 490260k reserved, 978k data, 308k init)
> ------------[ cut here ]------------
> kernel BUG in init_list at mm/slab.c:1334!
> invalid opcode: 0000 [1] SMP
> last sysfs file:
> CPU 0
> Modules linked in:
> Pid: 0, comm: swapper Not tainted 2.6.18-mm3-smp #1
> RIP: 0010:[<ffffffff8027f8fa>] [<ffffffff8027f8fa>] init_list+0x1d/0xfd
> RSP: 0018:ffffffff80577f48 EFLAGS: 00010212
> RAX: 0000000000000040 RBX: 0000000000000001 RCX: 0000000000000000
> RDX: 0000000000000001 RSI: ffffffff805ba848 RDI: ffff810460700040
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000003
> R10: 0000000000000000 R11: ffffffff805bc268 R12: ffff810460700040
> R13: ffffffff805ba848 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffffff804d8000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000201000 CR4: 00000000000006a0
> Process swapper (pid: 0, threadinfo ffffffff80576000, task ffffffff80455840)
> Stack: 0000000000000000 0000000000000000 0000000100000000 0000000000000001
> ffffffff805ba848 0000000000000000 0000000000000000 ffffffff80593aa8
> 00000000000002c0 0000000100000001 000000000008ef00 000000000008c000
> Call Trace:
> [<ffffffff80593aa8>] kmem_cache_init+0x344/0x406
> [<ffffffff805805ef>] start_kernel+0x180/0x21b
> [<ffffffff8058016a>] _sinittext+0x16a/0x16e
>
>
> Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
> RIP [<ffffffff8027f8fa>] init_list+0x1d/0xfd
> RSP <ffffffff80577f48>
> <0>Kernel panic - not syncing: Attempted to kill the idle task!
>
>
> I am going to revert the patch and see if it works. I ran -git22 just
> fine.
>
> Thanks,
> Keith
>
>
Keith,

I fixed this already. Can you look for it on lkml (look for 2.6.18-mm3
in the subject line)? It is a single typo in mm/slab.c.

Thanks,
Badari

2006-10-06 00:13:01

by Andrew Morton

Subject: Re: 2.6.18-mm2 boot failure on x86-64 II

On Thu, 05 Oct 2006 17:02:54 -0700
Badari Pulavarty <[email protected]> wrote:

> > Code: 0f 0b 48 8b 3d 15 ab 1e 00 be d0 00 00 00 e8 c0 f5 ff ff 48
> > RIP [<ffffffff8027f8fa>] init_list+0x1d/0xfd
> > RSP <ffffffff80577f48>
> > <0>Kernel panic - not syncing: Attempted to kill the idle task!
> >
> >
> > I am going to revert the patch and see if it works. I ran -git22 just
> > fine.
> >
> > Thanks,
> > Keith
> >
> >
> Keith,
>
> I fixed this already. Can you look for it on lkml (look for 2.6.18-mm3
> in the subject line)? It is a single typo in mm/slab.c.

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm3/hot-fixes

2006-10-06 02:23:42

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Thu, 2006-10-05 at 22:50 +0200, Andi Kleen wrote:
> On Thursday 05 October 2006 22:42, Steve Fox wrote:
> > On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
> >
> > > Can you please try it again with this patch to narrow it down further?
> >
> > Unfortunately this is as far as it got before it hung.
>
> Boot with earlyprintk=serial,ttyS0,57600
> (or change the panic in the checkfunction back to a printk)

root (hd0,0)
Filesystem type is reiserfs, partition type 0x83
kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
[Linux-bzImage, setup=0x1400, size=0x1dd855]
initrd /boot/initrd-autobench.img
[Linux-initrd @ 0x37cec000, 0x303f80 bytes]

Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
kernel direct mapping tables up to c00000000 @ 8000-39000
DMI 2.3 present.
afinfo corrupted at arch/x86_64/kernel/setup.c:462
afinfo corrupted at arch/x86_64/kernel/setup.c:467
afinfo corrupted at arch/x86_64/kernel/setup.c:472
afinfo corrupted at arch/x86_64/kernel/setup.c:483
afinfo corrupted at arch/x86_64/kernel/setup.c:496
afinfo corrupted at arch/x86_64/kernel/setup.c:504
afinfo corrupted at arch/x86_64/kernel/setup.c:510
afinfo corrupted at arch/x86_64/kernel/setup.c:529
afinfo corrupted at arch/x86_64/kernel/setup.c:537
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 12582912
early_node_map[3] active PFN ranges
0: 0 -> 154
0: 256 -> 786294
0: 1048576 -> 12582912
afinfo corrupted at arch/x86_64/kernel/setup.c:540
afinfo corrupted at arch/x86_64/kernel/setup.c:545
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
afinfo corrupted at arch/x86_64/kernel/setup.c:559
afinfo corrupted at arch/x86_64/kernel/setup.c:562
Using ACPI (MADT) for SMP configuration information
afinfo corrupted at arch/x86_64/kernel/setup.c:569
afinfo corrupted at arch/x86_64/kernel/setup.c:572
afinfo corrupted at arch/x86_64/kernel/setup.c:579
afinfo corrupted at arch/x86_64/kernel/setup.c:582
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
afinfo corrupted at arch/x86_64/kernel/setup.c:585
afinfo corrupted at arch/x86_64/kernel/setup.c:588
afinfo corrupted at arch/x86_64/kernel/setup.c:596
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at arch/x86_64/kernel/setup.c:599
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists. Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
disabling early console
Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
end_pfn_map = 12582912
DMI 2.3 present.
afinfo corrupted at arch/x86_64/kernel/setup.c:462
afinfo corrupted at arch/x86_64/kernel/setup.c:467
afinfo corrupted at arch/x86_64/kernel/setup.c:472
afinfo corrupted at arch/x86_64/kernel/setup.c:483
afinfo corrupted at arch/x86_64/kernel/setup.c:496
afinfo corrupted at arch/x86_64/kernel/setup.c:504
afinfo corrupted at arch/x86_64/kernel/setup.c:510
afinfo corrupted at arch/x86_64/kernel/setup.c:529
afinfo corrupted at arch/x86_64/kernel/setup.c:537
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 12582912
early_node_map[3] active PFN ranges
0: 0 -> 154
0: 256 -> 786294
0: 1048576 -> 12582912
afinfo corrupted at arch/x86_64/kernel/setup.c:540
afinfo corrupted at arch/x86_64/kernel/setup.c:545
ACPI: PM-Timer IO Port: 0x9c
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
Processor #6
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Processor #7
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
Processor #16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
Processor #17
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
Processor #22
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
Processor #23
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
Processor #32
ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
Processor #33
ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
Processor #38
ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
Processor #39
ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
Processor #48
ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
Processor #49
ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
Processor #54
ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
Processor #55
ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
Processor #64
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
Processor #65
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
Processor #70
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
Processor #71
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
Processor #80
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
Processor #81
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
Processor #86
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
Processor #87
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
Processor #96
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
Processor #97
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
Processor #102
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
Processor #103
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
Processor #112
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
Processor #113
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
Processor #118
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
Processor #119
WARNING: NR_CPUS limit of 16 reached. Processor ignored.
ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
Setting APIC routing to clustered
ACPI: HPET id: 0x10142201 base: 0xfde84000
afinfo corrupted at arch/x86_64/kernel/setup.c:559
afinfo corrupted at arch/x86_64/kernel/setup.c:562
Using ACPI (MADT) for SMP configuration information
afinfo corrupted at arch/x86_64/kernel/setup.c:569
afinfo corrupted at arch/x86_64/kernel/setup.c:572
afinfo corrupted at arch/x86_64/kernel/setup.c:579
afinfo corrupted at arch/x86_64/kernel/setup.c:582
Nosave address range: 000000000009a000 - 000000000009b000
Nosave address range: 000000000009b000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000bff76000 - 00000000bff77000
Nosave address range: 00000000bff77000 - 00000000bff98000
Nosave address range: 00000000bff98000 - 00000000bff99000
Nosave address range: 00000000bff99000 - 00000000c0000000
Nosave address range: 00000000c0000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
afinfo corrupted at arch/x86_64/kernel/setup.c:585
afinfo corrupted at arch/x86_64/kernel/setup.c:588
afinfo corrupted at arch/x86_64/kernel/setup.c:596
Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
afinfo corrupted at arch/x86_64/kernel/setup.c:599
afinfo corrupted at init/main.c:512
SMP: Allowing 16 CPUs, 0 hotplug CPUs
PERCPU: Allocating 33920 bytes of per cpu data
afinfo corrupted at init/main.c:527
Built 1 zonelists. Total pages: 12147064
Kernel command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
afinfo corrupted at init/main.c:536
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
afinfo corrupted at init/main.c:545
afinfo corrupted at init/main.c:548
disabling early console
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
afinfo corrupted at init/main.c:582
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x310c2000 - 0x350c2000
Memory: 48422908k/50331648k available (2566k kernel code, 858868k reserved, 1345k data, 184k init)
afinfo corrupted at init/main.c:584
Calibrating delay using timer specific routine.. 5678.09 BogoMIPS (lpj=11356196)
afinfo corrupted at init/main.c:593
afinfo corrupted at init/main.c:603
Mount-cache hash table entries: 256
afinfo corrupted at init/main.c:610
afinfo corrupted at init/main.c:618
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
..MP-BIOS bug: 8254 timer not connected to IO-APIC
Using local APIC timer interrupts.
result 10425595
Detected 10.425 MHz APIC timer.
afinfo corrupted at init/main.c:749
SMP alternatives: switching to SMP code
Booting processor 1/16 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 5671.84 BogoMIPS (lpj=11343696)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -2 cycles, maxerr 799 cycles)
SMP alternatives: switching to SMP code
Booting processor 2/16 APIC 0x6
Initializing CPU#2
Calibrating delay using timer specific routine.. 5671.98 BogoMIPS (lpj=11343971)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU2: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 2: Syncing TSC to CPU 0.
CPU 2: synchronized TSC with CPU 0 (last diff -184 cycles, maxerr 3349 cycles)
SMP alternatives: switching to SMP code
Booting processor 3/16 APIC 0x7
Initializing CPU#3
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344041)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU3: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 3: Syncing TSC to CPU 0.
CPU 3: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 1989 cycles)
SMP alternatives: switching to SMP code
Booting processor 4/16 APIC 0x10
Initializing CPU#4
Calibrating delay using timer specific routine.. 5672.07 BogoMIPS (lpj=11344144)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU4: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 4: Syncing TSC to CPU 0.
CPU 4: synchronized TSC with CPU 0 (last diff 43 cycles, maxerr 3247 cycles)
SMP alternatives: switching to SMP code
Booting processor 5/16 APIC 0x11
Initializing CPU#5
Calibrating delay using timer specific routine.. 5672.01 BogoMIPS (lpj=11344024)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 8
CPU: Processor Core ID: 0
CPU5: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 5: Syncing TSC to CPU 0.
CPU 5: synchronized TSC with CPU 0 (last diff 21 cycles, maxerr 3349 cycles)
SMP alternatives: switching to SMP code
Booting processor 6/16 APIC 0x16
Initializing CPU#6
Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344042)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU6: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 6: Syncing TSC to CPU 0.
CPU 6: synchronized TSC with CPU 0 (last diff 257 cycles, maxerr 3383 cycles)
SMP alternatives: switching to SMP code
Booting processor 7/16 APIC 0x17
Initializing CPU#7
Calibrating delay using timer specific routine.. 5672.10 BogoMIPS (lpj=11344218)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 11
CPU: Processor Core ID: 0
CPU7: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 7: Syncing TSC to CPU 0.
CPU 7: synchronized TSC with CPU 0 (last diff 233 cycles, maxerr 3357 cycles)
SMP alternatives: switching to SMP code
Booting processor 8/16 APIC 0x20
Initializing CPU#8
Calibrating delay using timer specific routine.. 5672.35 BogoMIPS (lpj=11344712)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU8: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 8: Syncing TSC to CPU 0.
CPU 8: synchronized TSC with CPU 0 (last diff 140 cycles, maxerr 8509 cycles)
SMP alternatives: switching to SMP code
Booting processor 9/16 APIC 0x21
Initializing CPU#9
Calibrating delay using timer specific routine.. 5672.25 BogoMIPS (lpj=11344515)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 16
CPU: Processor Core ID: 0
CPU9: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 9: Syncing TSC to CPU 0.
CPU 9: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 7556 cycles)
SMP alternatives: switching to SMP code
Booting processor 10/16 APIC 0x26
Initializing CPU#10
Calibrating delay using timer specific routine.. 5672.33 BogoMIPS (lpj=11344676)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU10: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 10: Syncing TSC to CPU 0.
CPU 10: synchronized TSC with CPU 0 (last diff 405 cycles, maxerr 8126 cycles)
SMP alternatives: switching to SMP code
Booting processor 11/16 APIC 0x27
Initializing CPU#11
Calibrating delay using timer specific routine.. 5672.46 BogoMIPS (lpj=11344939)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 19
CPU: Processor Core ID: 0
CPU11: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 11: Syncing TSC to CPU 0.
CPU 11: synchronized TSC with CPU 0 (last diff -145 cycles, maxerr 8568 cycles)
SMP alternatives: switching to SMP code
Booting processor 12/16 APIC 0x30
Initializing CPU#12
Calibrating delay using timer specific routine.. 5672.23 BogoMIPS (lpj=11344472)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU12: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 12: Syncing TSC to CPU 0.
CPU 12: synchronized TSC with CPU 0 (last diff 419 cycles, maxerr 8602 cycles)
SMP alternatives: switching to SMP code
Booting processor 13/16 APIC 0x31
Initializing CPU#13
Calibrating delay using timer specific routine.. 5672.34 BogoMIPS (lpj=11344689)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 24
CPU: Processor Core ID: 0
CPU13: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 13: Syncing TSC to CPU 0.
CPU 13: synchronized TSC with CPU 0 (last diff 242 cycles, maxerr 8636 cycles)
SMP alternatives: switching to SMP code
Booting processor 14/16 APIC 0x36
Initializing CPU#14
Calibrating delay using timer specific routine.. 5672.32 BogoMIPS (lpj=11344644)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU14: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 14: Syncing TSC to CPU 0.
CPU 14: synchronized TSC with CPU 0 (last diff -272 cycles, maxerr 8109 cycles)
SMP alternatives: switching to SMP code
Booting processor 15/16 APIC 0x37
Initializing CPU#15
Calibrating delay using timer specific routine.. 5672.21 BogoMIPS (lpj=11344423)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: L3 cache: 4096K
CPU: Physical Processor ID: 27
CPU: Processor Core ID: 0
CPU15: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
CPU 15: Syncing TSC to CPU 0.
CPU 15: synchronized TSC with CPU 0 (last diff -21 cycles, maxerr 8560 cycles)
Brought up 16 CPUs
testing NMI watchdog ... OK.
time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
time.c: Detected 2835.773 MHz processor.
afinfo corrupted at init/main.c:755
migration_cost=19,988
afinfo corrupted at init/main.c:761
afinfo corrupted at init/main.c:769
Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a40: helper_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607dd0: pm_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607e50: ksysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a720: filelock_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b230: init_script_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b240: init_elf_binfmt+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614690: sock_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614e30: netlink_proto_init+0x0/0x1a0()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 16
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c310: kobject_uevent_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c4a0: pcibus_class_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ca70: pci_driver_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ef30: tty_class_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fa20: vtconsole_class_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060cbb0: acpi_pci_init+0x0/0x40()
afinfo corrupted at init/main.c:659
ACPI: bus type pci registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d8ef: init_acpi_device_notify+0x0/0x4b()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613aa0: pci_access_init+0x0/0x30()
afinfo corrupted at init/main.c:659
PCI: Using configuration type 1
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80605760: topology_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607770: param_sysfs_init+0x0/0x200()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060aee0: init_bio+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c1d0: genhd_device_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d702: acpi_init+0x0/0x1ed()
afinfo corrupted at init/main.c:659
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dbd5: acpi_ec_init+0x0/0x62()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060dfee: acpi_pci_root_init+0x0/0x28()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e036: acpi_pci_link_init+0x0/0x48()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e1bc: acpi_power_init+0x0/0x77()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e233: acpi_system_init+0x0/0xc6()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e2f9: acpi_event_init+0x0/0x3f()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e338: acpi_scan_init+0x0/0x1ac()
afinfo corrupted at init/main.c:659
ACPI: PCI Root Bridge [VP00] (0000:00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
ACPI: PCI Root Bridge [VP01] (0000:01)
ACPI: PCI Root Bridge [VP02] (0000:02)
ACPI: PCI Root Bridge [VP03] (0000:04)
ACPI: PCI Root Bridge [VP04] (0000:06)
ACPI: PCI Root Bridge [VP05] (0000:08)
ACPI: PCI Root Bridge [VP06] (0000:0a)
ACPI: PCI Root Bridge [VP07] (0000:0c)
ACPI: PCI Root Bridge [VP10] (0000:0e)
ACPI: PCI Root Bridge [VP11] (0000:0f)
ACPI: PCI Root Bridge [VP12] (0000:10)
ACPI: PCI Root Bridge [VP13] (0000:12)
ACPI: PCI Root Bridge [VP14] (0000:14)
ACPI: PCI Root Bridge [VP15] (0000:16)
ACPI: PCI Root Bridge [VP16] (0000:18)
ACPI: PCI Root Bridge [VP17] (0000:1a)
ACPI: PCI Root Bridge [VP20] (0000:1c)
ACPI: PCI Root Bridge [VP21] (0000:1d)
ACPI: PCI Root Bridge [VP22] (0000:1e)
ACPI: PCI Root Bridge [VP23] (0000:20)
ACPI: PCI Root Bridge [VP24] (0000:22)
ACPI: PCI Root Bridge [VP25] (0000:24)
ACPI: PCI Root Bridge [VP26] (0000:26)
ACPI: PCI Root Bridge [VP27] (0000:28)
ACPI: PCI Root Bridge [VP30] (0000:2a)
ACPI: PCI Root Bridge [VP31] (0000:2b)
ACPI: PCI Root Bridge [VP32] (0000:2c)
ACPI: PCI Root Bridge [VP33] (0000:2e)
ACPI: PCI Root Bridge [VP34] (0000:30)
ACPI: PCI Root Bridge [VP35] (0000:32)
ACPI: PCI Root Bridge [VP36] (0000:34)
ACPI: PCI Root Bridge [VP37] (0000:36)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e654: acpi_cm_sbs_init+0x0/0xc()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e660: pnp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux Plug and Play Support v0.97 (c) Adam Belay
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e8f0: pnpacpi_init+0x0/0x70()
afinfo corrupted at init/main.c:659
pnp: PnP ACPI init
pnp: PnP ACPI: found 47 devices
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060f490: misc_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117f0: init_scsi+0x0/0x90()
afinfo corrupted at init/main.c:659
SCSI subsystem initialized
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806124d0: serio_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806128f0: input_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d00: rtc_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d50: rtc_sysfs_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d60: rtc_proc_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612d70: rtc_dev_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613ad0: pci_acpi_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80613b80: pci_legacy_init+0x0/0x120()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614130: pcibios_irq_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614620: pcibios_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614750: proto_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806148f0: net_dev_init+0x0/0x210()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614fd0: genl_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
hpet0: 3 64-bit timers, 3707069 Hz
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806000b0: pci_iommu_init+0x0/0x20()
afinfo corrupted at init/main.c:659
PCI-GART: No AMD northbridge found.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a6a0: init_pipe_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e524: acpi_motherboard_init+0x0/0x130()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e790: pnp_system_init+0x0/0x10()
afinfo corrupted at init/main.c:659
pnp: 00:0a: ioport range 0x400-0x47f has been reserved
pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ec70: chr_dev_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610a40: firmware_class_init+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806134b0: pcibios_assign_resources+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806159e0: inet_init+0x0/0x400()
afinfo corrupted at init/main.c:659
NET: Registered protocol family 2
IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe9f0: i8259A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805fe9c0: init_timer_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff010: vsyscall_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff805ff2a0: sbf_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600080: i8237A_init_sysfs+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600500: periodic_mcheck_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600530: mce_init_device+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80600670: thermal_throttle_init_device+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806006e0: threshold_init_device+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80601ee0: init_lapic_sysfs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80602a80: ioapic_init_sysfs+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80605870: x8664_sysctl_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80606d30: create_proc_profile+0x0/0x280()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607170: ioresources_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806072e0: timekeeping_init_device+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607400: uid_cache_init+0x0/0x90()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607970: init_posix_timers+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607a80: init_posix_cpu_timers+0x0/0xf0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607ba0: latency_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607c90: init_clocksource_sysfs+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607cf0: init_jiffies_clocksource+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607d00: init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607d70: proc_dma_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607da0: kallsyms_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80607e10: ikconfig_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80608f60: init_per_zone_pages_min+0x0/0x60()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609ed0: pdflush_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609f20: kswapd_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609f50: setup_vmstat+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80609fc0: procswaps_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a030: hugetlb_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Total HugeTLB memory allocated, 0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a0a0: init_tmpfs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a180: cpucache_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060a6f0: fasync_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ae00: aio_setup+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b080: inotify_setup+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b090: inotify_user_setup+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b150: eventpoll_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b250: init_mbcache+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b280: dnotify_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b740: init_devpts_fs+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b780: init_reiserfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b800: init_ext3_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060b930: journal_init+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ba10: init_ext2_fs+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bad0: init_ramfs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bae0: init_hugetlbfs_fs+0x0/0x80()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bba0: init_fat_fs+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bbf0: init_vfat_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc00: init_nls_cp437+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc10: init_nls_iso8859_1+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc20: init_autofs_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
initcall at 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10(): returned with error code -16
Calling initcall 0xffffffff8060bc40: ipc_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bf10: init_mqueue_fs+0x0/0xe0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060bff0: crypto_algapi_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c030: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c040: init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c230: noop_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler noop registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c240: as_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler anticipatory registered (default)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c250: deadline_init+0x0/0x10()
afinfo corrupted at init/main.c:659
io scheduler deadline registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060c260: cfq_init+0x0/0xb0()
afinfo corrupted at init/main.c:659
io scheduler cfq registered
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ca80: pci_sysfs_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060cac0: pci_proc_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d93a: acpi_ac_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060d97f: acpi_battery_init+0x0/0x45()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060df90: acpi_video_init+0x0/0x5e()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060e07e: irqrouter_init_sysfs+0x0/0x38()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed10: rand_initialize+0x0/0x30()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060ed40: tty_init+0x0/0x1f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060efa0: pty_init+0x0/0x4f0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fae0: hpet_init+0x0/0x70()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fb50: agp_init+0x0/0x30()
afinfo corrupted at init/main.c:659
Linux agpgart interface v0.101 (c) Dave Jones
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff8060fcb0: cn_proc_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806100f0: serial8250_init+0x0/0x150()
afinfo corrupted at init/main.c:659
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610320: serial8250_pnp_init+0x0/0x10()
afinfo corrupted at init/main.c:659
00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610330: serial8250_pci_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610ac0: e1000_init_module+0x0/0x50()
afinfo corrupted at init/main.c:659
Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
Copyright (c) 1999-2006 Intel Corporation.
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610b10: tg3_init+0x0/0x10()
afinfo corrupted at init/main.c:659
tg3.c:v3.66 (September 23, 2006)
ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:54
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:55
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610ba0: net_olddevs_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
afinfo corrupted at init/main.c:659
netconsole: not configured, aborting
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610c70: piix_ide_init+0x0/0xd0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80610db0: ide_init+0x0/0x90()
afinfo corrupted at init/main.c:659
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
SvrWks CSB6: chipset revision 160
SvrWks CSB6: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
SvrWks CSB6: simplex device: DMA disabled
ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611780: ide_generic_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117a0: idedisk_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117b0: ide_cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
Uniform CD-ROM driver Revision: 3.20
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806117c0: idefloppy_init+0x0/0x30()
afinfo corrupted at init/main.c:659
ide-floppy driver 0.99.newide
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611a90: raid_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611aa0: spi_transport_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611ae0: fc_transport_init+0x0/0x50()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611b30: iscsi_transport_init+0x0/0x120()
afinfo corrupted at init/main.c:659
Loading iSCSI transport class v2.0-685.afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611c50: sas_transport_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d10: iscsi_tcp_init+0x0/0x50()
afinfo corrupted at init/main.c:659
iscsi: registered transport (tcp)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611d60: aac_init+0x0/0x70()
afinfo corrupted at init/main.c:659
Adaptec aacraid driver (1.1-5[2409]-mh2)
ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
AAC0: kernel 5.0-2[8264]
AAC0: monitor 5.0-2[8264]
AAC0: bios 5.0-2[8264]
AAC0: serial 162348
AAC0: 64bit support enabled.
AAC0: 64 Bit DAC enabled
scsi0 : ServeRAID
scsi 0:0:0:0: Direct-Access IBM Drive 1 V1.0 PQ: 0 ANSI: 2
scsi 0:0:1:0: Direct-Access IBM Drive 2 V1.0 PQ: 0 ANSI: 2
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611dd0: qla1280_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80611fa0: sym2_init+0x0/0x110()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806120b0: init_sd+0x0/0x60()
afinfo corrupted at init/main.c:659
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
sda: assuming Write Enabled
sda: assuming drive cache: write through
sda: sda1 sda2 sda3
sd 0:0:0:0: Attached scsi removable disk sda
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
sdb: assuming Write Enabled
sdb: assuming drive cache: write through
sdb: sdb1 sdb2 sdb3
sd 0:0:1:0: Attached scsi removable disk sdb
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612110: fusion_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT base driver 3.04.01
Copyright (c) 1999-2005 LSI Logic Corporation
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612210: mptspi_init+0x0/0xc0()
afinfo corrupted at init/main.c:659
Fusion MPT SPI Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806122d0: mptfc_init+0x0/0xf0()
afinfo corrupted at init/main.c:659
Fusion MPT FC Host driver 3.04.01
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806123c0: mptctl_init+0x0/0x100()
afinfo corrupted at init/main.c:659
Fusion MPT misc device (ioctl) driver 3.04.01
mptctl: Registered with Fusion MPT base driver
mptctl: /dev/mptctl @ (major,minor=10,220)
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806124c0: cdrom_init+0x0/0x10()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806125a0: i8042_init+0x0/0x350()
afinfo corrupted at init/main.c:659
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612a10: mousedev_init+0x0/0x100()
afinfo corrupted at init/main.c:659
mice: PS/2 mouse device common for all mice
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612b10: atkbd_init+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80612e20: hwmon_init+0x0/0x40()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80614c60: flow_cache_init+0x0/0x1d0()
afinfo corrupted at init/main.c:659
input: AT Translated Set 2 keyboard as /class/input/input0
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff806160f0: init_syncookies+0x0/0x20()
afinfo corrupted at init/main.c:659
afinfo corrupted at init/main.c:663
Calling initcall 0xffffffff80616110: xfrm4_beet_init+0x0/0x20()
afinfo corrupted at init/main.c:659
Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
[<ffffffff80470666>] xfrm_register_mode+0x36/0x60
PGD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.18-git22 #2
RIP: 0010:[<ffffffff80470666>] [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
Stack: 0000000000000000 0000000000000000 ffffffff8061fee8 ffffffff802071d6
6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000090000
Call Trace:
[<ffffffff802071d6>] init+0x1b6/0x3b0
[<ffffffff8020aa28>] child_rip+0xa/0x12
[<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
[<ffffffff80207020>] init+0x0/0x3b0
[<ffffffff8020aa1e>] child_rip+0x0/0x12


Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
RIP [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
RSP <ffff810bffcbded0>
CR2: 0000000000000827
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!


--

Steve Fox
IBM Linux Technology Center

2006-10-06 14:33:19

by mel

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On (05/10/06 21:23), Steve Fox didst pronounce:
> On Thu, 2006-10-05 at 22:50 +0200, Andi Kleen wrote:
> > On Thursday 05 October 2006 22:42, Steve Fox wrote:
> > > On Thu, 2006-10-05 at 21:05 +0200, Andi Kleen wrote:
> > >
> > > > Can you please try it again with this patch to narrow it down further?
> > >
> > > Unfortunately this is as far as it got before it hung.
> >
> > Boot with earlyprintk=serial,ttyS0,57600
> > (or change the panic in the checkfunction back to a printk)
>
> root (hd0,0)
> Filesystem type is reiserfs, partition type 0x83
> kernel /boot/vmlinuz-autobench root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.5
> 0:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57
> 600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:116010
> 0417
> [Linux-bzImage, setup=0x1400, size=0x1dd855]
> initrd /boot/initrd-autobench.img
> [Linux-initrd @ 0x37cec000, 0x303f80 bytes]
>
> Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)

I continued what Steve was doing this morning to see whether this could be
pinned down. After placing 'CHECK;' in a few places, as suggested by
Andi's check, the problem code was identified as the following in
mm/bootmem.c#init_bootmem_core():

mapsize = get_mapsize(bdata);
memset(bdata->node_bootmem_map, 0xff, mapsize);

That explains the value in the array at least. A few more printfs around
this point printed the following in the boot log:

init_bootmem_core(0, 1909, 0, 12582912)
init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
AAGH: afinfo corrupted at mm/bootmem.c:121

where:

1909 == mapstart
0 == start
12582912 == end
1572864 == mapsize

mapstart, start and end are the parameters passed to
init_bootmem_core(). This means we are calling memset for the physical
range 0x775000 -> 0x8F5000 (mapstart 1909 * 4096 = 0x775000, plus a mapsize
of 1572864 = 0x180000 bytes), which appears to be in a usable range
according to the BIOS-e820 map.
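
The range arithmetic above can be double-checked with a small userspace C
sketch. It assumes 4 KiB pages and a bootmem bitmap of one bit per page
frame, rounded up to a multiple of sizeof(long). The helper names
(bitmap_bytes, bitmap_range, range_within) are made up for illustration and
are not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, as on x86-64 */

struct phys_range {
	uint64_t start; /* inclusive */
	uint64_t end;   /* exclusive */
};

/*
 * Size of a bitmap with one bit per page frame in [start_pfn, end_pfn),
 * rounded up to whole bytes and then to a multiple of sizeof(long).
 * This is an assumption that matches the mapsize seen in the log.
 */
static uint64_t bitmap_bytes(uint64_t start_pfn, uint64_t end_pfn)
{
	uint64_t bytes;

	assert(end_pfn >= start_pfn);
	bytes = (end_pfn - start_pfn + 7) / 8;
	return (bytes + sizeof(long) - 1) & ~(uint64_t)(sizeof(long) - 1);
}

/* Physical byte range covered by a bitmap that starts at mapstart_pfn. */
static struct phys_range bitmap_range(uint64_t mapstart_pfn, uint64_t bytes)
{
	struct phys_range r;

	r.start = mapstart_pfn << PAGE_SHIFT;
	r.end = r.start + bytes;
	return r;
}

/* True if 'inner' lies entirely inside 'outer'. */
static bool range_within(struct phys_range inner, struct phys_range outer)
{
	return inner.start >= outer.start && inner.end <= outer.end;
}
```

Feeding in the values from the log (start 0, end 12582912, mapstart 1909)
gives a mapsize of 1572864 and the range 0x775000 -> 0x8F5000, which lies
inside the BIOS-e820 usable entry 0x100000 - 0xbff764c0.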

However, with 2.6.18-git22, backing out the patch
x86_64-mm-re-positioning-the-bss-segment.patch from 2.6.18-mm2 allowed the
machine to boot. As this patch moves the BSS past the end of the init section,
it seems that an unintentional side-effect of the patch is that the BSS ends
up in a place where init_bootmem clobbers it.
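
For anyone wanting to repeat this kind of hunt, the 'CHECK;' style of
instrumentation can be sketched in userspace C roughly as follows. The
actual debugging patch is not shown in this message, so everything here is
an assumption: watched[] is a hypothetical stand-in for the data being
monitored, and check_snapshot() / check_watched() are invented names. The
idea is only that each CHECK reports its own file and line, so the first
report brackets the code that did the damage.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel data being watched. */
static void *watched[4];
/* Known-good copy taken before the suspect code runs. */
static void *snapshot[4];
/* Number of times a mismatch has been reported. */
static int corruption_reports;

/* Record the current contents of watched[] as the known-good state. */
static void check_snapshot(void)
{
	memcpy(snapshot, watched, sizeof(watched));
}

/* Report the call site if watched[] no longer matches the snapshot. */
static void check_watched(const char *file, int line)
{
	assert(file != NULL);
	if (memcmp(watched, snapshot, sizeof(watched)) != 0) {
		corruption_reports++;
		printf("afinfo corrupted at %s:%d\n", file, line);
	}
}

/* Drop 'CHECK;' between statements to narrow down where the data changes. */
#define CHECK check_watched(__FILE__, __LINE__)
```

Placing 'CHECK;' before and after a suspect call, and then moving the pair
closer together, narrows the window in the same way the messages in the
boot log above do.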

> end_pfn_map = 12582912
> kernel direct mapping tables up to c00000000 @ 8000-39000
> DMI 2.3 present.
> afinfo corrupted at arch/x86_64/kernel/setup.c:462
> afinfo corrupted at arch/x86_64/kernel/setup.c:467
> afinfo corrupted at arch/x86_64/kernel/setup.c:472
> afinfo corrupted at arch/x86_64/kernel/setup.c:483
> afinfo corrupted at arch/x86_64/kernel/setup.c:496
> afinfo corrupted at arch/x86_64/kernel/setup.c:504
> afinfo corrupted at arch/x86_64/kernel/setup.c:510
> afinfo corrupted at arch/x86_64/kernel/setup.c:529
> afinfo corrupted at arch/x86_64/kernel/setup.c:537
> Zone PFN ranges:
> DMA 0 -> 4096
> DMA32 4096 -> 1048576
> Normal 1048576 -> 12582912
> early_node_map[3] active PFN ranges
> 0: 0 -> 154
> 0: 256 -> 786294
> 0: 1048576 -> 12582912
> afinfo corrupted at arch/x86_64/kernel/setup.c:540
> afinfo corrupted at arch/x86_64/kernel/setup.c:545
> ACPI: PM-Timer IO Port: 0x9c
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 (Bootup-CPU)
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> Processor #1
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
> Processor #6
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
> Processor #7
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
> Processor #16
> ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
> Processor #17
> ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
> Processor #22
> ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
> Processor #23
> ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
> Processor #32
> ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
> Processor #33
> ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
> Processor #38
> ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
> Processor #39
> ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
> Processor #48
> ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
> Processor #49
> ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
> Processor #54
> ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
> Processor #55
> ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
> Processor #64
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
> Processor #65
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
> Processor #70
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
> Processor #71
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
> Processor #80
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
> Processor #81
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
> Processor #86
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
> Processor #87
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
> Processor #96
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
> Processor #97
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
> Processor #102
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
> Processor #103
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
> Processor #112
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
> Processor #113
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
> Processor #118
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
> Processor #119
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
> ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
> IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
> ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
> IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
> ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
> IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
> ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
> IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
> ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
> IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
> ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
> IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
> ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
> IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
> Setting APIC routing to clustered
> ACPI: HPET id: 0x10142201 base: 0xfde84000
> afinfo corrupted at arch/x86_64/kernel/setup.c:559
> afinfo corrupted at arch/x86_64/kernel/setup.c:562
> Using ACPI (MADT) for SMP configuration information
> afinfo corrupted at arch/x86_64/kernel/setup.c:569
> afinfo corrupted at arch/x86_64/kernel/setup.c:572
> afinfo corrupted at arch/x86_64/kernel/setup.c:579
> afinfo corrupted at arch/x86_64/kernel/setup.c:582
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> afinfo corrupted at arch/x86_64/kernel/setup.c:585
> afinfo corrupted at arch/x86_64/kernel/setup.c:588
> afinfo corrupted at arch/x86_64/kernel/setup.c:596
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at arch/x86_64/kernel/setup.c:599
> afinfo corrupted at init/main.c:512
> SMP: Allowing 16 CPUs, 0 hotplug CPUs
> PERCPU: Allocating 33920 bytes of per cpu data
> afinfo corrupted at init/main.c:527
> Built 1 zonelists. Total pages: 12147064
> Kernel command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> afinfo corrupted at init/main.c:536
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> afinfo corrupted at init/main.c:545
> afinfo corrupted at init/main.c:548
> disabling early console
> Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> BIOS-provided physical RAM map:
> BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> end_pfn_map = 12582912
> DMI 2.3 present.
> afinfo corrupted at arch/x86_64/kernel/setup.c:462
> afinfo corrupted at arch/x86_64/kernel/setup.c:467
> afinfo corrupted at arch/x86_64/kernel/setup.c:472
> afinfo corrupted at arch/x86_64/kernel/setup.c:483
> afinfo corrupted at arch/x86_64/kernel/setup.c:496
> afinfo corrupted at arch/x86_64/kernel/setup.c:504
> afinfo corrupted at arch/x86_64/kernel/setup.c:510
> afinfo corrupted at arch/x86_64/kernel/setup.c:529
> afinfo corrupted at arch/x86_64/kernel/setup.c:537
> Zone PFN ranges:
> DMA 0 -> 4096
> DMA32 4096 -> 1048576
> Normal 1048576 -> 12582912
> early_node_map[3] active PFN ranges
> 0: 0 -> 154
> 0: 256 -> 786294
> 0: 1048576 -> 12582912
> afinfo corrupted at arch/x86_64/kernel/setup.c:540
> afinfo corrupted at arch/x86_64/kernel/setup.c:545
> ACPI: PM-Timer IO Port: 0x9c
> ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
> Processor #0 (Bootup-CPU)
> ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
> Processor #1
> ACPI: LAPIC (acpi_id[0x02] lapic_id[0x06] enabled)
> Processor #6
> ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
> Processor #7
> ACPI: LAPIC (acpi_id[0x04] lapic_id[0x10] enabled)
> Processor #16
> ACPI: LAPIC (acpi_id[0x05] lapic_id[0x11] enabled)
> Processor #17
> ACPI: LAPIC (acpi_id[0x06] lapic_id[0x16] enabled)
> Processor #22
> ACPI: LAPIC (acpi_id[0x07] lapic_id[0x17] enabled)
> Processor #23
> ACPI: LAPIC (acpi_id[0x10] lapic_id[0x20] enabled)
> Processor #32
> ACPI: LAPIC (acpi_id[0x11] lapic_id[0x21] enabled)
> Processor #33
> ACPI: LAPIC (acpi_id[0x12] lapic_id[0x26] enabled)
> Processor #38
> ACPI: LAPIC (acpi_id[0x13] lapic_id[0x27] enabled)
> Processor #39
> ACPI: LAPIC (acpi_id[0x14] lapic_id[0x30] enabled)
> Processor #48
> ACPI: LAPIC (acpi_id[0x15] lapic_id[0x31] enabled)
> Processor #49
> ACPI: LAPIC (acpi_id[0x16] lapic_id[0x36] enabled)
> Processor #54
> ACPI: LAPIC (acpi_id[0x17] lapic_id[0x37] enabled)
> Processor #55
> ACPI: LAPIC (acpi_id[0x20] lapic_id[0x40] enabled)
> Processor #64
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x21] lapic_id[0x41] enabled)
> Processor #65
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x22] lapic_id[0x46] enabled)
> Processor #70
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x23] lapic_id[0x47] enabled)
> Processor #71
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x24] lapic_id[0x50] enabled)
> Processor #80
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x25] lapic_id[0x51] enabled)
> Processor #81
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x26] lapic_id[0x56] enabled)
> Processor #86
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x27] lapic_id[0x57] enabled)
> Processor #87
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x30] lapic_id[0x60] enabled)
> Processor #96
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x31] lapic_id[0x61] enabled)
> Processor #97
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x32] lapic_id[0x66] enabled)
> Processor #102
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x33] lapic_id[0x67] enabled)
> Processor #103
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x34] lapic_id[0x70] enabled)
> Processor #112
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x35] lapic_id[0x71] enabled)
> Processor #113
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x36] lapic_id[0x76] enabled)
> Processor #118
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC (acpi_id[0x37] lapic_id[0x77] enabled)
> Processor #119
> WARNING: NR_CPUS limit of 16 reached. Processor ignored.
> ACPI: LAPIC_NMI (acpi_id[0x00] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x01] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x02] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x03] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x04] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x05] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x06] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x07] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x10] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x11] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x12] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x13] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x14] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x15] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x16] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x17] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x20] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x21] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x22] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x23] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x24] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x25] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x26] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x27] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x30] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x31] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x32] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x33] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x34] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x35] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x36] dfl dfl lint[0x1])
> ACPI: LAPIC_NMI (acpi_id[0x37] dfl dfl lint[0x1])
> ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
> IOAPIC[0]: apic_id 15, address 0xfec00000, GSI 0-35
> ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
> IOAPIC[1]: apic_id 14, address 0xfec01000, GSI 36-71
> ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
> IOAPIC[2]: apic_id 13, address 0xfec02000, GSI 72-107
> ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
> IOAPIC[3]: apic_id 12, address 0xfec03000, GSI 108-143
> ACPI: IOAPIC (id[0x0b] address[0xfec04000] gsi_base[144])
> IOAPIC[4]: apic_id 11, address 0xfec04000, GSI 144-179
> ACPI: IOAPIC (id[0x0a] address[0xfec05000] gsi_base[180])
> IOAPIC[5]: apic_id 10, address 0xfec05000, GSI 180-215
> ACPI: IOAPIC (id[0x09] address[0xfec06000] gsi_base[216])
> IOAPIC[6]: apic_id 9, address 0xfec06000, GSI 216-251
> ACPI: IOAPIC (id[0x08] address[0xfec07000] gsi_base[252])
> IOAPIC[7]: apic_id 8, address 0xfec07000, GSI 252-287
> ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 8 global_irq 8 low edge)
> ACPI: INT_SRC_OVR (bus 0 bus_irq 14 global_irq 14 low edge)
> Setting APIC routing to clustered
> ACPI: HPET id: 0x10142201 base: 0xfde84000
> afinfo corrupted at arch/x86_64/kernel/setup.c:559
> afinfo corrupted at arch/x86_64/kernel/setup.c:562
> Using ACPI (MADT) for SMP configuration information
> afinfo corrupted at arch/x86_64/kernel/setup.c:569
> afinfo corrupted at arch/x86_64/kernel/setup.c:572
> afinfo corrupted at arch/x86_64/kernel/setup.c:579
> afinfo corrupted at arch/x86_64/kernel/setup.c:582
> Nosave address range: 000000000009a000 - 000000000009b000
> Nosave address range: 000000000009b000 - 00000000000a0000
> Nosave address range: 00000000000a0000 - 00000000000e0000
> Nosave address range: 00000000000e0000 - 0000000000100000
> Nosave address range: 00000000bff76000 - 00000000bff77000
> Nosave address range: 00000000bff77000 - 00000000bff98000
> Nosave address range: 00000000bff98000 - 00000000bff99000
> Nosave address range: 00000000bff99000 - 00000000c0000000
> Nosave address range: 00000000c0000000 - 00000000fec00000
> Nosave address range: 00000000fec00000 - 0000000100000000
> afinfo corrupted at arch/x86_64/kernel/setup.c:585
> afinfo corrupted at arch/x86_64/kernel/setup.c:588
> afinfo corrupted at arch/x86_64/kernel/setup.c:596
> Allocating PCI resources starting at c4000000 (gap: c0000000:3ec00000)
> afinfo corrupted at arch/x86_64/kernel/setup.c:599
> afinfo corrupted at init/main.c:512
> SMP: Allowing 16 CPUs, 0 hotplug CPUs
> PERCPU: Allocating 33920 bytes of per cpu data
> afinfo corrupted at init/main.c:527
> Built 1 zonelists. Total pages: 12147064
> Kernel command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> afinfo corrupted at init/main.c:536
> Initializing CPU#0
> PID hash table entries: 4096 (order: 12, 32768 bytes)
> afinfo corrupted at init/main.c:545
> afinfo corrupted at init/main.c:548
> disabling early console
> Console: colour VGA+ 80x25
> Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
> Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
> afinfo corrupted at init/main.c:582
> Checking aperture...
> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
> Placing software IO TLB between 0x310c2000 - 0x350c2000
> Memory: 48422908k/50331648k available (2566k kernel code, 858868k reserved, 1345k data, 184k init)
> afinfo corrupted at init/main.c:584
> Calibrating delay using timer specific routine.. 5678.09 BogoMIPS (lpj=11356196)
> afinfo corrupted at init/main.c:593
> afinfo corrupted at init/main.c:603
> Mount-cache hash table entries: 256
> afinfo corrupted at init/main.c:610
> afinfo corrupted at init/main.c:618
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> using mwait in idle threads.
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> CPU0: Thermal monitoring enabled (TM1)
> SMP alternatives: switching to UP code
> ACPI: Core revision 20060707
> ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> Using local APIC timer interrupts.
> result 10425595
> Detected 10.425 MHz APIC timer.
> afinfo corrupted at init/main.c:749
> SMP alternatives: switching to SMP code
> Booting processor 1/16 APIC 0x1
> Initializing CPU#1
> Calibrating delay using timer specific routine.. 5671.84 BogoMIPS (lpj=11343696)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 0
> CPU: Processor Core ID: 0
> CPU1: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 1: Syncing TSC to CPU 0.
> CPU 1: synchronized TSC with CPU 0 (last diff -2 cycles, maxerr 799 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 2/16 APIC 0x6
> Initializing CPU#2
> Calibrating delay using timer specific routine.. 5671.98 BogoMIPS (lpj=11343971)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 3
> CPU: Processor Core ID: 0
> CPU2: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 2: Syncing TSC to CPU 0.
> CPU 2: synchronized TSC with CPU 0 (last diff -184 cycles, maxerr 3349 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 3/16 APIC 0x7
> Initializing CPU#3
> Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344041)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 3
> CPU: Processor Core ID: 0
> CPU3: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 3: Syncing TSC to CPU 0.
> CPU 3: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 1989 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 4/16 APIC 0x10
> Initializing CPU#4
> Calibrating delay using timer specific routine.. 5672.07 BogoMIPS (lpj=11344144)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 8
> CPU: Processor Core ID: 0
> CPU4: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 4: Syncing TSC to CPU 0.
> CPU 4: synchronized TSC with CPU 0 (last diff 43 cycles, maxerr 3247 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 5/16 APIC 0x11
> Initializing CPU#5
> Calibrating delay using timer specific routine.. 5672.01 BogoMIPS (lpj=11344024)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 8
> CPU: Processor Core ID: 0
> CPU5: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 5: Syncing TSC to CPU 0.
> CPU 5: synchronized TSC with CPU 0 (last diff 21 cycles, maxerr 3349 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 6/16 APIC 0x16
> Initializing CPU#6
> Calibrating delay using timer specific routine.. 5672.02 BogoMIPS (lpj=11344042)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 11
> CPU: Processor Core ID: 0
> CPU6: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 6: Syncing TSC to CPU 0.
> CPU 6: synchronized TSC with CPU 0 (last diff 257 cycles, maxerr 3383 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 7/16 APIC 0x17
> Initializing CPU#7
> Calibrating delay using timer specific routine.. 5672.10 BogoMIPS (lpj=11344218)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 11
> CPU: Processor Core ID: 0
> CPU7: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 7: Syncing TSC to CPU 0.
> CPU 7: synchronized TSC with CPU 0 (last diff 233 cycles, maxerr 3357 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 8/16 APIC 0x20
> Initializing CPU#8
> Calibrating delay using timer specific routine.. 5672.35 BogoMIPS (lpj=11344712)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 16
> CPU: Processor Core ID: 0
> CPU8: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 8: Syncing TSC to CPU 0.
> CPU 8: synchronized TSC with CPU 0 (last diff 140 cycles, maxerr 8509 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 9/16 APIC 0x21
> Initializing CPU#9
> Calibrating delay using timer specific routine.. 5672.25 BogoMIPS (lpj=11344515)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 16
> CPU: Processor Core ID: 0
> CPU9: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 9: Syncing TSC to CPU 0.
> CPU 9: synchronized TSC with CPU 0 (last diff -100 cycles, maxerr 7556 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 10/16 APIC 0x26
> Initializing CPU#10
> Calibrating delay using timer specific routine.. 5672.33 BogoMIPS (lpj=11344676)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 19
> CPU: Processor Core ID: 0
> CPU10: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 10: Syncing TSC to CPU 0.
> CPU 10: synchronized TSC with CPU 0 (last diff 405 cycles, maxerr 8126 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 11/16 APIC 0x27
> Initializing CPU#11
> Calibrating delay using timer specific routine.. 5672.46 BogoMIPS (lpj=11344939)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 19
> CPU: Processor Core ID: 0
> CPU11: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 11: Syncing TSC to CPU 0.
> CPU 11: synchronized TSC with CPU 0 (last diff -145 cycles, maxerr 8568 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 12/16 APIC 0x30
> Initializing CPU#12
> Calibrating delay using timer specific routine.. 5672.23 BogoMIPS (lpj=11344472)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 24
> CPU: Processor Core ID: 0
> CPU12: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 12: Syncing TSC to CPU 0.
> CPU 12: synchronized TSC with CPU 0 (last diff 419 cycles, maxerr 8602 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 13/16 APIC 0x31
> Initializing CPU#13
> Calibrating delay using timer specific routine.. 5672.34 BogoMIPS (lpj=11344689)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 24
> CPU: Processor Core ID: 0
> CPU13: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 13: Syncing TSC to CPU 0.
> CPU 13: synchronized TSC with CPU 0 (last diff 242 cycles, maxerr 8636 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 14/16 APIC 0x36
> Initializing CPU#14
> Calibrating delay using timer specific routine.. 5672.32 BogoMIPS (lpj=11344644)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 27
> CPU: Processor Core ID: 0
> CPU14: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 14: Syncing TSC to CPU 0.
> CPU 14: synchronized TSC with CPU 0 (last diff -272 cycles, maxerr 8109 cycles)
> SMP alternatives: switching to SMP code
> Booting processor 15/16 APIC 0x37
> Initializing CPU#15
> Calibrating delay using timer specific routine.. 5672.21 BogoMIPS (lpj=11344423)
> CPU: Trace cache: 12K uops, L1 D cache: 16K
> CPU: L2 cache: 1024K
> CPU: L3 cache: 4096K
> CPU: Physical Processor ID: 27
> CPU: Processor Core ID: 0
> CPU15: Thermal monitoring enabled (TM1)
> Intel(R) Xeon(TM) MP CPU 2.83GHz stepping 01
> CPU 15: Syncing TSC to CPU 0.
> CPU 15: synchronized TSC with CPU 0 (last diff -21 cycles, maxerr 8560 cycles)
> Brought up 16 CPUs
> testing NMI watchdog ... OK.
> time.c: Using 333.333333 MHz WALL PIT GTOD PIT/HPET timer.
> time.c: Detected 2835.773 MHz processor.
> afinfo corrupted at init/main.c:755
> migration_cost=19,988
> afinfo corrupted at init/main.c:761
> afinfo corrupted at init/main.c:769
> Calling initcall 0xffffffff802166c0: init_smp_flush+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607a40: helper_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607dd0: pm_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607e50: ksysfs_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a720: filelock_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b230: init_script_binfmt+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b240: init_elf_binfmt+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614690: sock_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614e30: netlink_proto_init+0x0/0x1a0()
> afinfo corrupted at init/main.c:659
> NET: Registered protocol family 16
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c310: kobject_uevent_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c4a0: pcibus_class_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ca70: pci_driver_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ef30: tty_class_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fa20: vtconsole_class_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060cbb0: acpi_pci_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> ACPI: bus type pci registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d8ef: init_acpi_device_notify+0x0/0x4b()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613aa0: pci_access_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> PCI: Using configuration type 1
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80605760: topology_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607770: param_sysfs_init+0x0/0x200()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80249d00: pm_sysrq_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060aee0: init_bio+0x0/0x110()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c1d0: genhd_device_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d702: acpi_init+0x0/0x1ed()
> afinfo corrupted at init/main.c:659
> ACPI: Interpreter enabled
> ACPI: Using IOAPIC for interrupt routing
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060dbd5: acpi_ec_init+0x0/0x62()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060dfee: acpi_pci_root_init+0x0/0x28()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e036: acpi_pci_link_init+0x0/0x48()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e1bc: acpi_power_init+0x0/0x77()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e233: acpi_system_init+0x0/0xc6()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e2f9: acpi_event_init+0x0/0x3f()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e338: acpi_scan_init+0x0/0x1ac()
> afinfo corrupted at init/main.c:659
> ACPI: PCI Root Bridge [VP00] (0000:00)
> PCI: Ignoring BAR0-3 of IDE controller 0000:00:0f.1
> ACPI: PCI Root Bridge [VP01] (0000:01)
> ACPI: PCI Root Bridge [VP02] (0000:02)
> ACPI: PCI Root Bridge [VP03] (0000:04)
> ACPI: PCI Root Bridge [VP04] (0000:06)
> ACPI: PCI Root Bridge [VP05] (0000:08)
> ACPI: PCI Root Bridge [VP06] (0000:0a)
> ACPI: PCI Root Bridge [VP07] (0000:0c)
> ACPI: PCI Root Bridge [VP10] (0000:0e)
> ACPI: PCI Root Bridge [VP11] (0000:0f)
> ACPI: PCI Root Bridge [VP12] (0000:10)
> ACPI: PCI Root Bridge [VP13] (0000:12)
> ACPI: PCI Root Bridge [VP14] (0000:14)
> ACPI: PCI Root Bridge [VP15] (0000:16)
> ACPI: PCI Root Bridge [VP16] (0000:18)
> ACPI: PCI Root Bridge [VP17] (0000:1a)
> ACPI: PCI Root Bridge [VP20] (0000:1c)
> ACPI: PCI Root Bridge [VP21] (0000:1d)
> ACPI: PCI Root Bridge [VP22] (0000:1e)
> ACPI: PCI Root Bridge [VP23] (0000:20)
> ACPI: PCI Root Bridge [VP24] (0000:22)
> ACPI: PCI Root Bridge [VP25] (0000:24)
> ACPI: PCI Root Bridge [VP26] (0000:26)
> ACPI: PCI Root Bridge [VP27] (0000:28)
> ACPI: PCI Root Bridge [VP30] (0000:2a)
> ACPI: PCI Root Bridge [VP31] (0000:2b)
> ACPI: PCI Root Bridge [VP32] (0000:2c)
> ACPI: PCI Root Bridge [VP33] (0000:2e)
> ACPI: PCI Root Bridge [VP34] (0000:30)
> ACPI: PCI Root Bridge [VP35] (0000:32)
> ACPI: PCI Root Bridge [VP36] (0000:34)
> ACPI: PCI Root Bridge [VP37] (0000:36)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e654: acpi_cm_sbs_init+0x0/0xc()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e660: pnp_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> Linux Plug and Play Support v0.97 (c) Adam Belay
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e8f0: pnpacpi_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> pnp: PnP ACPI init
> pnp: PnP ACPI: found 47 devices
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060f490: misc_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80375670: cn_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117f0: init_scsi+0x0/0x90()
> afinfo corrupted at init/main.c:659
> SCSI subsystem initialized
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806124d0: serio_init+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806128f0: input_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d00: rtc_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d50: rtc_sysfs_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d60: rtc_proc_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612d70: rtc_dev_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613ad0: pci_acpi_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> PCI: Using ACPI for IRQ routing
> PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80613b80: pci_legacy_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614130: pcibios_irq_init+0x0/0x4f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614620: pcibios_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614750: proto_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806148f0: net_dev_init+0x0/0x210()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614fd0: genl_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fdfc0: late_hpet_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> hpet0: at MMIO 0xfde84000, IRQs 2, 8, 0
> hpet0: 3 64-bit timers, 3707069 Hz
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806000b0: pci_iommu_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> PCI-GART: No AMD northbridge found.
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a6a0: init_pipe_fs+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e524: acpi_motherboard_init+0x0/0x130()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e790: pnp_system_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> pnp: 00:0a: ioport range 0x400-0x47f has been reserved
> pnp: 00:0a: ioport range 0x480-0x4ff could not be reserved
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ec70: chr_dev_init+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610a40: firmware_class_init+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806134b0: pcibios_assign_resources+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806159e0: inet_init+0x0/0x400()
> afinfo corrupted at init/main.c:659
> NET: Registered protocol family 2
> IP route cache hash table entries: 524288 (order: 10, 4194304 bytes)
> TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
> TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
> TCP: Hash tables configured (established 262144 bind 65536)
> TCP reno registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8020db10: time_init_device+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fe9f0: i8259A_init_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805fe9c0: init_timer_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805ff010: vsyscall_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff805ff2a0: sbf_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600080: i8237A_init_sysfs+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600500: periodic_mcheck_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600530: mce_init_device+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80600670: thermal_throttle_init_device+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806006e0: threshold_init_device+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80601ee0: init_lapic_sysfs+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80602a80: ioapic_init_sysfs+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8021d1f0: cache_sysfs_init+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80605870: x8664_sysctl_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80606d30: create_proc_profile+0x0/0x280()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607170: ioresources_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806072e0: timekeeping_init_device+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607400: uid_cache_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607970: init_posix_timers+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607a80: init_posix_cpu_timers+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607ba0: latency_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607c90: init_clocksource_sysfs+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607cf0: init_jiffies_clocksource+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607d00: init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607d70: proc_dma_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80245840: percpu_modinit+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607da0: kallsyms_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80607e10: ikconfig_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80608f60: init_per_zone_pages_min+0x0/0x60()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609ed0: pdflush_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609f20: kswapd_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609f50: setup_vmstat+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80609fc0: procswaps_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a030: hugetlb_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> Total HugeTLB memory allocated, 0
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a0a0: init_tmpfs+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a180: cpucache_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060a6f0: fasync_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ae00: aio_setup+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b080: inotify_setup+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b090: inotify_user_setup+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b150: eventpoll_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b250: init_mbcache+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b280: dnotify_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b740: init_devpts_fs+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b780: init_reiserfs_fs+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b800: init_ext3_fs+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060b930: journal_init+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ba10: init_ext2_fs+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bad0: init_ramfs_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bae0: init_hugetlbfs_fs+0x0/0x80()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bba0: init_fat_fs+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bbf0: init_vfat_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc00: init_nls_cp437+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc10: init_nls_iso8859_1+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc20: init_autofs_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> initcall at 0xffffffff8060bc30: init_autofs4_fs+0x0/0x10(): returned with error code -16
> Calling initcall 0xffffffff8060bc40: ipc_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bf10: init_mqueue_fs+0x0/0xe0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060bff0: crypto_algapi_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c030: init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c040: init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c230: noop_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler noop registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c240: as_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler anticipatory registered (default)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c250: deadline_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> io scheduler deadline registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060c260: cfq_init+0x0/0xb0()
> afinfo corrupted at init/main.c:659
> io scheduler cfq registered
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8032c1d0: pci_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ca80: pci_sysfs_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060cac0: pci_proc_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d93a: acpi_ac_init+0x0/0x45()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060d97f: acpi_battery_init+0x0/0x45()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060df90: acpi_video_init+0x0/0x5e()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060e07e: irqrouter_init_sysfs+0x0/0x38()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ed10: rand_initialize+0x0/0x30()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060ed40: tty_init+0x0/0x1f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060efa0: pty_init+0x0/0x4f0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fae0: hpet_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fb50: agp_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> Linux agpgart interface v0.101 (c) Dave Jones
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff8060fcb0: cn_proc_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806100f0: serial8250_init+0x0/0x150()
> afinfo corrupted at init/main.c:659
> Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610320: serial8250_pnp_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> 00:03: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> 00:04: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610330: serial8250_pci_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80384c90: topology_sysfs_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610ac0: e1000_init_module+0x0/0x50()
> afinfo corrupted at init/main.c:659
> Intel(R) PRO/1000 Network Driver - version 7.2.9-k2
> Copyright (c) 1999-2006 Intel Corporation.
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610b10: tg3_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> tg3.c:v3.66 (September 23, 2006)
> ACPI: PCI Interrupt 0000:01:01.0[A] -> GSI 24 (level, low) -> IRQ 24
> eth0: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:54
> eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth0: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:01:01.1[B] -> GSI 28 (level, low) -> IRQ 28
> eth1: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:0d:60:98:63:55
> eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth1: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:0f:01.0[A] -> GSI 96 (level, low) -> IRQ 96
> eth2: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0c
> eth2: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth2: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:0f:01.1[B] -> GSI 100 (level, low) -> IRQ 100
> eth3: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:0d
> eth3: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth3: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:1d:01.0[A] -> GSI 168 (level, low) -> IRQ 168
> eth4: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6c
> eth4: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth4: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:1d:01.1[B] -> GSI 172 (level, low) -> IRQ 172
> eth5: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:45:6d
> eth5: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth5: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:2b:01.0[A] -> GSI 240 (level, low) -> IRQ 240
> eth6: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:82
> eth6: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[0]
> eth6: dma_rwctrl[769f0000] dma_mask[64-bit]
> ACPI: PCI Interrupt 0000:2b:01.1[B] -> GSI 244 (level, low) -> IRQ 244
> eth7: Tigon3 [partno(BCM95704A6) rev 2100 PHY(5704)] (PCIX:66MHz:64-bit) 10/100/1000BaseT Ethernet 00:14:5e:1c:43:83
> eth7: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
> eth7: dma_rwctrl[769f0000] dma_mask[64-bit]
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610ba0: net_olddevs_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803a8630: init_netconsole+0x0/0x80()
> afinfo corrupted at init/main.c:659
> netconsole: not configured, aborting
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803a8710: cmd64x_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610c70: piix_ide_init+0x0/0xd0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803aa810: svwks_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff803ab480: generic_ide_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80610db0: ide_init+0x0/0x90()
> afinfo corrupted at init/main.c:659
> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
> SvrWks CSB6: IDE controller at PCI slot 0000:00:0f.1
> SvrWks CSB6: chipset revision 160
> SvrWks CSB6: not 100% native mode: will probe irqs later
> ide0: BM-DMA at 0x0700-0x0707, BIOS settings: hda:DMA, hdb:DMA
> SvrWks CSB6: simplex device: DMA disabled
> ide1: SvrWks CSB6 Bus-Master DMA disabled (BIOS)
> hda: MATSHITADVD-ROM SR-8178, ATAPI CD/DVD-ROM drive
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611780: ide_generic_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117a0: idedisk_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117b0: ide_cdrom_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> hda: ATAPI 24X DVD-ROM drive, 256kB Cache, UDMA(66)
> Uniform CD-ROM driver Revision: 3.20
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806117c0: idefloppy_init+0x0/0x30()
> afinfo corrupted at init/main.c:659
> ide-floppy driver 0.99.newide
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611a90: raid_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611aa0: spi_transport_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611ae0: fc_transport_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611b30: iscsi_transport_init+0x0/0x120()
> afinfo corrupted at init/main.c:659
> Loading iSCSI transport class v2.0-685.afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611c50: sas_transport_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611d10: iscsi_tcp_init+0x0/0x50()
> afinfo corrupted at init/main.c:659
> iscsi: registered transport (tcp)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611d60: aac_init+0x0/0x70()
> afinfo corrupted at init/main.c:659
> Adaptec aacraid driver (1.1-5[2409]-mh2)
> ACPI: PCI Interrupt 0000:01:02.0[A] -> GSI 25 (level, low) -> IRQ 25
> AAC0: kernel 5.0-2[8264]
> AAC0: monitor 5.0-2[8264]
> AAC0: bios 5.0-2[8264]
> AAC0: serial 162348
> AAC0: 64bit support enabled.
> AAC0: 64 Bit DAC enabled
> scsi0 : ServeRAID
> scsi 0:0:0:0: Direct-Access IBM Drive 1 V1.0 PQ: 0 ANSI: 2
> scsi 0:0:1:0: Direct-Access IBM Drive 2 V1.0 PQ: 0 ANSI: 2
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611dd0: qla1280_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80611fa0: sym2_init+0x0/0x110()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806120b0: init_sd+0x0/0x60()
> afinfo corrupted at init/main.c:659
> SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
> sda: assuming Write Enabled
> sda: assuming drive cache: write through
> SCSI device sda: 143132672 512-byte hdwr sectors (73284 MB)
> sda: assuming Write Enabled
> sda: assuming drive cache: write through
> sda: sda1 sda2 sda3
> sd 0:0:0:0: Attached scsi removable disk sda
> SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
> sdb: assuming Write Enabled
> sdb: assuming drive cache: write through
> SCSI device sdb: 143132672 512-byte hdwr sectors (73284 MB)
> sdb: assuming Write Enabled
> sdb: assuming drive cache: write through
> sdb: sdb1 sdb2 sdb3
> sd 0:0:1:0: Attached scsi removable disk sdb
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612110: fusion_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> Fusion MPT base driver 3.04.01
> Copyright (c) 1999-2005 LSI Logic Corporation
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612210: mptspi_init+0x0/0xc0()
> afinfo corrupted at init/main.c:659
> Fusion MPT SPI Host driver 3.04.01
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806122d0: mptfc_init+0x0/0xf0()
> afinfo corrupted at init/main.c:659
> Fusion MPT FC Host driver 3.04.01
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806123c0: mptctl_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> Fusion MPT misc device (ioctl) driver 3.04.01
> mptctl: Registered with Fusion MPT base driver
> mptctl: /dev/mptctl @ (major,minor=10,220)
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806124c0: cdrom_init+0x0/0x10()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806125a0: i8042_init+0x0/0x350()
> afinfo corrupted at init/main.c:659
> PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
> serio: i8042 KBD port at 0x60,0x64 irq 1
> serio: i8042 AUX port at 0x60,0x64 irq 12
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612a10: mousedev_init+0x0/0x100()
> afinfo corrupted at init/main.c:659
> mice: PS/2 mouse device common for all mice
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612b10: atkbd_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80612e20: hwmon_init+0x0/0x40()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80614c60: flow_cache_init+0x0/0x1d0()
> afinfo corrupted at init/main.c:659
> input: AT Translated Set 2 keyboard as /class/input/input0
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff806160f0: init_syncookies+0x0/0x20()
> afinfo corrupted at init/main.c:659
> afinfo corrupted at init/main.c:663
> Calling initcall 0xffffffff80616110: xfrm4_beet_init+0x0/0x20()
> afinfo corrupted at init/main.c:659
> Unable to handle kernel NULL pointer dereference at 0000000000000827 RIP:
> [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
> PGD 0
> Oops: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 1, comm: swapper Not tainted 2.6.18-git22 #2
> RIP: 0010:[<ffffffff80470666>] [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
> RSP: 0000:ffff810bffcbded0 EFLAGS: 00010286
> RAX: 000000000000081f RBX: ffffffff805588a0 RCX: 0000000000100000
> RDX: ffffffffffffffff RSI: 0000000000000002 RDI: ffffffff80559550
> RBP: 00000000ffffffef R08: 0000000000000002 R09: fffffffffffffffd
> R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
> R13: ffff810bffcbdef0 R14: 0000000000000000 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000827 CR3: 0000000000201000 CR4: 00000000000006e0
> Process swapper (pid: 1, threadinfo ffff810bffcbc000, task ffff810bffcbb4e0)
> Stack: 0000000000000000 0000000000000000 ffffffff8061fee8 ffffffff802071d6
> 6f6320726f727265 000036312d206564 0000000000000000 0000000000000000
> 0000000000000000 0000000000000000 0000000000000000 0000000000090000
> Call Trace:
> [<ffffffff802071d6>] init+0x1b6/0x3b0
> [<ffffffff8020aa28>] child_rip+0xa/0x12
> [<ffffffff80339542>] acpi_ds_init_one_object+0x0/0x82
> [<ffffffff80207020>] init+0x0/0x3b0
> [<ffffffff8020aa1e>] child_rip+0x0/0x12
>
>
> Code: 48 83 78 08 00 75 06 48 89 58 08 31 ed 48 89 d7 e8 e5 fe ff
> RIP [<ffffffff80470666>] xfrm_register_mode+0x36/0x60
> RSP <ffff810bffcbded0>
> CR2: 0000000000000827
> <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
>
>
> --
>
> Steve Fox
> IBM Linux Technology Center
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-10-06 15:37:59

by Vivek Goyal

[permalink] [raw]
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > BIOS-provided physical RAM map:
> > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
>
> I continued what Steve was doing this morning to see could this be
> pinned down. After placing 'CHECK;' in a few places as suggested by
> Andi's check, the problem code was identified as that following in
> mm/bootmem.c#init_bootmem_core()
>
> mapsize = get_mapsize(bdata);
> memset(bdata->node_bootmem_map, 0xff, mapsize);
>
> That explains the value in the array at least. A few more printfs around
> this point printed out the following in the boot log
>
> init_bootmem_core(0, 1909, 0, 12582912)
> init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> AAGH: afinfo corrupted at mm/bootmem.c:121
>
> where;
>
> 1909 == mapstart
> 0 == start
> 12582912 == end
> 1572864 == mapsize
>
> mapstart, start and end being the parameters being passed to
> init_bootmem_core(). This means we are calling memset for the physical
> range 0x775000 -> 0x8F5000 which is in a usable range according to the
> BIOS-e820 map it appears.
>

Hi Mel,

Where is bss placed in physical memory? I guess bss_start and bss_stop
from System.map will tell us. That will confirm whether the above memset
is stomping over bss. Then we just have to find where we allocated the
wrong physical memory area for the bootmem allocator map.

Thanks
Vivek

2006-10-06 17:11:11

by mel

[permalink] [raw]
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On (06/10/06 11:36), Vivek Goyal didst pronounce:
> On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > BIOS-provided physical RAM map:
> > > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> >
> > I continued what Steve was doing this morning to see could this be
> > pinned down. After placing 'CHECK;' in a few places as suggested by
> > Andi's check, the problem code was identified as that following in
> > mm/bootmem.c#init_bootmem_core()
> >
> > mapsize = get_mapsize(bdata);
> > memset(bdata->node_bootmem_map, 0xff, mapsize);
> >
> > That explains the value in the array at least. A few more printfs around
> > this point printed out the following in the boot log
> >
> > init_bootmem_core(0, 1909, 0, 12582912)
> > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > AAGH: afinfo corrupted at mm/bootmem.c:121
> >
> > where;
> >
> > 1909 == mapstart
> > 0 == start
> > 12582912 == end
> > 1572864 == mapsize
> >
> > mapstart, start and end being the parameters being passed to
> > init_bootmem_core(). This means we are calling memset for the physical
> > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > BIOS-e820 map it appears.
> >
>
> Hi Mel,
>

Hi.

> Where is bss placed in physical memory? I guess bss_start and bss_stop
> from System.map will tell us. That will confirm that above memset step is
> stomping over bss. Then we have to just find that somewhere probably
> we allocated wrong physical memory area for bootmem allocator map.
>

BSS is at 0x643000 -> 0x777BC4
init_bootmem wipes from 0x777000 -> 0x8F7000

So the BSS bytes from 0x777000 -> 0x777BC4 (which looks very suspiciously
like a page alignment of addr & PAGE_MASK) get set to 0xFF. One possible
fix is below. It adds a check in bad_addr() to see if the BSS section is
about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
the source of the problem even if it's not the 100% correct fix.

diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c
--- linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c 2006-10-05 20:42:07.000000000 +0100
+++ linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c 2006-10-06 17:39:51.000000000 +0100
@@ -51,6 +51,7 @@ extern struct resource code_resource, da
 static inline int bad_addr(unsigned long *addrp, unsigned long size)
 {
 	unsigned long addr = *addrp, last = addr + size;
+	unsigned long bss_start, bss_end;
 
 	/* various gunk below that needed for SMP startup */
 	if (addr < 0x8000) {
@@ -77,6 +78,14 @@ static inline int bad_addr(unsigned long
 		*addrp = __pa_symbol(&_end);
 		return 1;
 	}
+
+	/* bss section */
+	bss_start = __pa_symbol(&__bss_start);
+	bss_end = PAGE_ALIGN(__pa_symbol(&__bss_stop));
+	if (addr >= bss_start && addr < bss_end) {
+		*addrp = bss_end;
+		return 1;
+	}
 
 	if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
 		*addrp = ebda_addr + ebda_size;
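
As a sanity check on the addresses reported in this mail, the overlap can be
worked out with a few lines of C. This is only a sketch: it assumes one
bitmap bit per page frame, and the helper names overlap_bytes() and
bitmap_bytes() are hypothetical, not kernel functions.

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes shared by the half-open ranges
 * [a_start, a_end) and [b_start, b_end); 0 if they do not overlap. */
static uint64_t overlap_bytes(uint64_t a_start, uint64_t a_end,
                              uint64_t b_start, uint64_t b_end)
{
	uint64_t lo = a_start > b_start ? a_start : b_start;
	uint64_t hi = a_end < b_end ? a_end : b_end;

	return hi > lo ? hi - lo : 0;
}

/* Hypothetical helper: bytes needed for a bitmap holding one bit per
 * page frame, rounded up to a whole byte. */
static uint64_t bitmap_bytes(uint64_t nr_pages)
{
	return (nr_pages + 7) / 8;
}
```

With BSS at 0x643000 -> 0x777BC4 and the wiped range 0x777000 -> 0x8F7000,
overlap_bytes() gives 0xBC4 (3012) bytes, which is the tail of BSS that ends
up as 0xFF. bitmap_bytes(12582912) gives 1572864, matching the mapsize in the
earlier log, and 0x777000 + 1572864 is 0x8F7000.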

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-10-06 17:35:22

by Vivek Goyal

[permalink] [raw]
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, Oct 06, 2006 at 06:11:05PM +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > > BIOS-provided physical RAM map:
> > > > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > >
> > > I continued what Steve was doing this morning to see could this be
> > > pinned down. After placing 'CHECK;' in a few places as suggested by
> > > Andi's check, the problem code was identified as that following in
> > > mm/bootmem.c#init_bootmem_core()
> > >
> > > mapsize = get_mapsize(bdata);
> > > memset(bdata->node_bootmem_map, 0xff, mapsize);
> > >
> > > That explains the value in the array at least. A few more printfs around
> > > this point printed out the following in the boot log
> > >
> > > init_bootmem_core(0, 1909, 0, 12582912)
> > > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > > AAGH: afinfo corrupted at mm/bootmem.c:121
> > >
> > > where;
> > >
> > > 1909 == mapstart
> > > 0 == start
> > > 12582912 == end
> > > 1572864 == mapsize
> > >
> > > mapstart, start and end being the parameters being passed to
> > > init_bootmem_core(). This means we are calling memset for the physical
> > > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > > BIOS-e820 map it appears.
> > >
> >
> > Hi Mel,
> >
>
> Hi.
>
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> >
>
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
>
> So the BSS bytes from 0x777000 -> 0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) get set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.
>
> diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c
> --- linux-2.6.18-git22-clean/arch/x86_64/kernel/e820.c 2006-10-05 20:42:07.000000000 +0100
> +++ linux-2.6.18-git22-bss_relocate_fix/arch/x86_64/kernel/e820.c 2006-10-06 17:39:51.000000000 +0100
> @@ -51,6 +51,7 @@ extern struct resource code_resource, da
> static inline int bad_addr(unsigned long *addrp, unsigned long size)
> {
> unsigned long addr = *addrp, last = addr + size;
> + unsigned long bss_start, bss_end;
>
> /* various gunk below that needed for SMP startup */
> if (addr < 0x8000) {
> @@ -77,6 +78,14 @@ static inline int bad_addr(unsigned long
> *addrp = __pa_symbol(&_end);
> return 1;
> }
> +
> + /* bss section */
> + bss_start = __pa_symbol(&__bss_start);
> + bss_end = PAGE_ALIGN(__pa_symbol(&__bss_stop));
> + if (addr >= bss_start && addr < bss_end) {
> + *addrp = bss_end;
> + return 1;
> + }
>

Surprising, the kernel code check just before this should have taken care
of it.

	/* kernel code */
	if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
		*addrp = __pa_symbol(&_end);
		return 1;
	}

Maybe it can be changed to

	if (last >= __pa_symbol(&_text) && last < PAGE_ALIGN(__pa_symbol(&_end))) {

But all this seems to be a stopgap fix. The real puzzle is still exactly
where it slipped through, and it should be fixed there.

Maybe some more printks will help us.
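
To see how the quoted condition behaves, here is a small sketch that compares
it with a general interval-overlap test. The function names are hypothetical,
and the sample values are assumptions: _end is taken to coincide with the
reported end of BSS (0x777BC4) and 0x200000 is only a placeholder for _text.
Whether bad_addr() is ever handed such a range is exactly the open question
here.

```c
#include <stdbool.h>
#include <stdint.h>

/* The condition as quoted above: it fires only when the END of the
 * candidate range [addr, last) falls inside [text, end). */
static bool quoted_check(uint64_t addr, uint64_t last,
                         uint64_t text, uint64_t end)
{
	(void)addr;	/* the quoted condition never looks at the start */
	return last >= text && last < end;
}

/* General overlap test for [addr, last) against [text, end). */
static bool ranges_overlap(uint64_t addr, uint64_t last,
                           uint64_t text, uint64_t end)
{
	return addr < end && last > text;
}
```

For a candidate of 0x777000 -> 0x8F7000, quoted_check() returns false because
last lies beyond end, and it still returns false when end is rounded up to
0x778000. ranges_overlap() returns true for the same input, since the range
starts inside the last page of the image.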

Thanks
Vivek

2006-10-06 18:00:54

by Vivek Goyal

[permalink] [raw]
Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, Oct 06, 2006 at 06:11:05PM +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > On Fri, Oct 06, 2006 at 03:33:12PM +0100, Mel Gorman wrote:
> > > > Linux version 2.6.18-git22 (root@elm3b239) (gcc version 4.1.0 (SUSE Linux)) #2 SMP Thu Oct 5 19:05:36 PDT 2006
> > > > Command line: root=/dev/sda1 vga=791 ip=9.47.67.239:9.47.67.50:9.47.67.1:255.255.255.0 resume=/dev/sdb1 showopts earlyprintk=serial,ttyS0,57600 console=tty0 console=ttyS0,57600 autobench_args: root=/dev/sda1 ABAT:1160100417
> > > > BIOS-provided physical RAM map:
> > > > BIOS-e820: 0000000000000000 - 000000000009ac00 (usable)
> > > > BIOS-e820: 000000000009ac00 - 00000000000a0000 (reserved)
> > > > BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
> > > > BIOS-e820: 0000000000100000 - 00000000bff764c0 (usable)
> > > > BIOS-e820: 00000000bff764c0 - 00000000bff98880 (ACPI data)
> > > > BIOS-e820: 00000000bff98880 - 00000000c0000000 (reserved)
> > > > BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
> > > > BIOS-e820: 0000000100000000 - 0000000c00000000 (usable)
> > >
> > > I continued what Steve was doing this morning to see could this be
> > > pinned down. After placing 'CHECK;' in a few places as suggested by
> > > Andi's check, the problem code was identified as that following in
> > > mm/bootmem.c#init_bootmem_core()
> > >
> > > mapsize = get_mapsize(bdata);
> > > memset(bdata->node_bootmem_map, 0xff, mapsize);
> > >
> > > That explains the value in the array at least. A few more printfs around
> > > this point printed out the following in the boot log
> > >
> > > init_bootmem_core(0, 1909, 0, 12582912)
> > > init_bootmem_core: Calling memset(0xFFFF810000775000, 1572864)
> > > AAGH: afinfo corrupted at mm/bootmem.c:121
> > >
> > > where;
> > >
> > > 1909 == mapstart
> > > 0 == start
> > > 12582912 == end
> > > 1572864 == mapsize
> > >
> > > mapstart, start and end being the parameters being passed to
> > > init_bootmem_core(). This means we are calling memset for the physical
> > > range 0x775000 -> 0x8F5000 which is in a usable range according to the
> > > BIOS-e820 map it appears.
> > >
> >
> > Hi Mel,
> >
>
> Hi.
>
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> >
>
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
>
> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.
>

Ok, it looks like the code assumes that the memory area returned by
find_e820_area() is page aligned. I found two such instances, and that is
what is leading to the problem.

bootmap_size = init_bootmem_node(NODE_DATA(nodeid),
bootmap_start >> PAGE_SHIFT,
start_pfn, end_pfn);

Here bootmap_start is not page aligned and, I guess, currently contains
the value 0x777BC4 (just beyond _end). But the moment I do
bootmap_start>>PAGE_SHIFT, the address is rounded down to 0x777000 and I
start stomping on bss.

The same happens here.

bootmap = find_e820_area(0, end_pfn<<PAGE_SHIFT, bootmap_size);
if (bootmap == -1L)
panic("Cannot find bootmem map of size %ld\n",bootmap_size);
bootmap_size = init_bootmem(bootmap >> PAGE_SHIFT, end_pfn);

So maybe we should return a page aligned address from find_e820_area().
Maybe we can change bad_addr() to set *addrp to the next page aligned
boundary for every check?

*addrp = PAGE_ALIGN(__pa_symbol(&_end));

Thanks
Vivek

2006-10-06 18:04:00

by Steve Fox

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > from System.map will tell us. That will confirm that above memset step is
> > stomping over bss. Then we have to just find that somewhere probably
> > we allocated wrong physical memory area for bootmem allocator map.
> >
>
> BSS is at 0x643000 -> 0x777BC4
> init_bootmem wipes from 0x777000 -> 0x8F7000
>
> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> fix is below. It adds a check in bad_addr() to see if the BSS section is
> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> the source of the problem even if it's not the 100% correct fix.

I was able to boot the machine with Mel's patch applied on top of
-git22.

--

Steve Fox
IBM Linux Technology Center

2006-10-06 20:05:19

by Vivek Goyal

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
> On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> > On (06/10/06 11:36), Vivek Goyal didst pronounce:
> > > Where is bss placed in physical memory? I guess bss_start and bss_stop
> > > from System.map will tell us. That will confirm that above memset step is
> > > stomping over bss. Then we have to just find that somewhere probably
> > > we allocated wrong physical memory area for bootmem allocator map.
> > >
> >
> > BSS is at 0x643000 -> 0x777BC4
> > init_bootmem wipes from 0x777000 -> 0x8F7000
> >
> > So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> > like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> > fix is below. It adds a check in bad_addr() to see if the BSS section is
> > about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
> > the source of the problem even if it's not the 100% correct fix.
>
> I was able to boot the machine with Mel's patch applied on top of
> -git22.


Please have a look at the attached patch. Does it make some sense?

Steve, can you please give this patch a try and see whether it fixes the
problem?

Thanks
Vivek




o Currently some code assumes that the address returned by find_e820_area()
is page aligned. But it looks like find_e820_area() had no such intention,
and hence one might end up stomping over some data. One such
case is the bootmem allocator initialization code stomping over bss.

o This patch modifies find_e820_area() to return a page aligned address. This
might be a little wasteful of memory, but at the same time it is probably
easier to handle page aligned memory.

Signed-off-by: Vivek Goyal <[email protected]>
---

arch/x86_64/kernel/e820.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
--- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area 2006-10-06 15:28:13.000000000 -0400
+++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c 2006-10-06 15:44:45.000000000 -0400
@@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long

/* various gunk below that needed for SMP startup */
if (addr < 0x8000) {
- *addrp = 0x8000;
+ *addrp = PAGE_ALIGN(0x8000);
return 1;
}

/* direct mapping tables of the kernel */
if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) {
- *addrp = table_end << PAGE_SHIFT;
+ *addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
return 1;
}

@@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
#ifdef CONFIG_BLK_DEV_INITRD
if (LOADER_TYPE && INITRD_START && last >= INITRD_START &&
addr < INITRD_START+INITRD_SIZE) {
- *addrp = INITRD_START + INITRD_SIZE;
+ *addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
return 1;
}
#endif
/* kernel code */
- if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
- *addrp = __pa_symbol(&_end);
+ if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
+ *addrp = PAGE_ALIGN(__pa_symbol(&_end));
return 1;
}

if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
- *addrp = ebda_addr + ebda_size;
+ *addrp = PAGE_ALIGN(ebda_addr + ebda_size);
return 1;
}

@@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
continue;
while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
;
- last = addr + size;
+ last = PAGE_ALIGN(addr) + size;
if (last > ei->addr + ei->size)
continue;
if (last > end)
_

2006-10-09 09:54:04

by Mel Gorman

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Fri, 6 Oct 2006, Vivek Goyal wrote:

> On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
>> On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
>>> On (06/10/06 11:36), Vivek Goyal didst pronounce:
>>>> Where is bss placed in physical memory? I guess bss_start and bss_stop
>>>> from System.map will tell us. That will confirm that above memset step is
>>>> stomping over bss. Then we have to just find that somewhere probably
>>>> we allocated wrong physical memory area for bootmem allocator map.
>>>>
>>>
>>> BSS is at 0x643000 -> 0x777BC4
>>> init_bootmem wipes from 0x777000 -> 0x8F7000
>>>
>>> So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
>>> like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
>>> fix is below. It adds a check in bad_addr() to see if the BSS section is
>>> about to be used for bootmap. It Seems To Work For Me (tm) and illustrates
>>> the source of the problem even if it's not the 100% correct fix.
>>
>> I was able to boot the machine with Mel's patch applied on top of
>> -git22.
>
>
> Please have a look at the attached patch. Does it make some sense.
>

It makes some sense. As you state, it wastes memory but that is better
than breaking.

> Steve, can you please give this patch a try if it fixes the problem?
>

I boot-tested the patch on the same machine Steve was using and it
completed successfully.

> Thanks
> Vivek
>
>
>
>
> o Currently some code assumes that the address returned by find_e820_area()
> is page aligned. But it looks like find_e820_area() had no such intention,
> and hence one might end up stomping over some data. One such
> case is the bootmem allocator initialization code stomping over bss.
>
> o This patch modifies find_e820_area() to return a page aligned address. This
> might be a little wasteful of memory, but at the same time it is probably
> easier to handle page aligned memory.
>
> Signed-off-by: Vivek Goyal <[email protected]>
> ---
>
> arch/x86_64/kernel/e820.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
> --- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area 2006-10-06 15:28:13.000000000 -0400
> +++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c 2006-10-06 15:44:45.000000000 -0400
> @@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long
>
> /* various gunk below that needed for SMP startup */
> if (addr < 0x8000) {
> - *addrp = 0x8000;
> + *addrp = PAGE_ALIGN(0x8000);
> return 1;
> }
>
> /* direct mapping tables of the kernel */
> if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) {
> - *addrp = table_end << PAGE_SHIFT;
> + *addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
> return 1;
> }
>
> @@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
> #ifdef CONFIG_BLK_DEV_INITRD
> if (LOADER_TYPE && INITRD_START && last >= INITRD_START &&
> addr < INITRD_START+INITRD_SIZE) {
> - *addrp = INITRD_START + INITRD_SIZE;
> + *addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
> return 1;
> }
> #endif
> /* kernel code */
> - if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
> - *addrp = __pa_symbol(&_end);
> + if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
> + *addrp = PAGE_ALIGN(__pa_symbol(&_end));
> return 1;
> }
>
> if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
> - *addrp = ebda_addr + ebda_size;
> + *addrp = PAGE_ALIGN(ebda_addr + ebda_size);
> return 1;
> }
>
> @@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
> continue;
> while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
> ;
> - last = addr + size;
> + last = PAGE_ALIGN(addr) + size;
> if (last > ei->addr + ei->size)
> continue;
> if (last > end)
> _
>

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2006-10-10 03:53:28

by NeilBrown

Subject: Re: md deadlock (was Re: 2.6.18-mm2)


Hi,
Would this be an appropriate fix for the warning lockdep gives about
possible deadlocks in md?

The warning is currently easily triggered with
mdadm -C /dev/md1 -l1 -n1 /dev/sdc missing

(assuming /dev/sdc is a device that you are happy to be scribbled on).

This will take ->reconfig_mutex on md1 while holding bd_mutex,
then will take bd_mutex on sdc while holding reconfig_mutex on md1.

This superficial deadlock isn't a real problem because the bd_mutexes
are on different devices and there is a hierarchical relationship
which avoids the loop necessary for a deadlock.

-----------------------
Avoid lockdep warning in md.

md_open takes ->reconfig_mutex which causes lockdep to complain.
This (normally) doesn't have deadlock potential as the possible
conflict is with a reconfig_mutex in a different device.

I say "normally" because if a loop were created in the array->member
hierarchy a deadlock could happen. However that causes bigger
problems than a deadlock and should be fixed independently.

So we flag the lock in md_open as a nested lock. This requires
defining mutex_lock_interruptible_nested.

Signed-off-by: Neil Brown <[email protected]>

### Diffstat output
./drivers/md/md.c | 2 +-
./include/linux/mutex.h | 2 ++
./kernel/mutex.c | 8 ++++++++
3 files changed, 11 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c 2006-10-09 14:25:11.000000000 +1000
+++ ./drivers/md/md.c 2006-10-10 12:28:35.000000000 +1000
@@ -4422,7 +4422,7 @@ static int md_open(struct inode *inode,
mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
int err;

- if ((err = mddev_lock(mddev)))
+ if ((err = mutex_lock_interruptible_nested(&mddev->reconfig_mutex, 1)))
goto out;

err = 0;

diff .prev/include/linux/mutex.h ./include/linux/mutex.h
--- .prev/include/linux/mutex.h 2006-10-10 12:37:04.000000000 +1000
+++ ./include/linux/mutex.h 2006-10-10 12:40:20.000000000 +1000
@@ -125,8 +125,10 @@ extern int fastcall mutex_lock_interrupt

#ifdef CONFIG_DEBUG_LOCK_ALLOC
extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
+extern int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass);
#else
# define mutex_lock_nested(lock, subclass) mutex_lock(lock)
+# define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
#endif

/*

diff .prev/kernel/mutex.c ./kernel/mutex.c
--- .prev/kernel/mutex.c 2006-10-10 12:35:54.000000000 +1000
+++ ./kernel/mutex.c 2006-10-10 13:20:04.000000000 +1000
@@ -206,6 +206,14 @@ mutex_lock_nested(struct mutex *lock, un
}

EXPORT_SYMBOL_GPL(mutex_lock_nested);
+int __sched
+mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
+{
+ might_sleep();
+ return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass);
+}
+
+EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
#endif

/*

2006-10-10 06:50:24

by Ingo Molnar

Subject: Re: md deadlock (was Re: 2.6.18-mm2)


* Neil Brown <[email protected]> wrote:

> --- .prev/include/linux/mutex.h 2006-10-10 12:37:04.000000000 +1000
> +++ ./include/linux/mutex.h 2006-10-10 12:40:20.000000000 +1000
> @@ -125,8 +125,10 @@ extern int fastcall mutex_lock_interrupt
>
> #ifdef CONFIG_DEBUG_LOCK_ALLOC
> extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
> +extern int mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass);
> #else
> # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
> +# define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
> #endif

> EXPORT_SYMBOL_GPL(mutex_lock_nested);
> +int __sched
> +mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
> +{
> + might_sleep();
> + return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, subclass);
> +}
> +
> +EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);

looks good to me. (small style nit: maybe insert a newline after the
first EXPORT_SYMBOL_GPL line)

Acked-by: Ingo Molnar <[email protected]>

Ingo

2006-10-16 18:16:47

by Vivek Goyal

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Mon, Oct 09, 2006 at 10:53:58AM +0100, Mel Gorman wrote:
> On Fri, 6 Oct 2006, Vivek Goyal wrote:
>
> >On Fri, Oct 06, 2006 at 01:03:50PM -0500, Steve Fox wrote:
> >>On Fri, 2006-10-06 at 18:11 +0100, Mel Gorman wrote:
> >>>On (06/10/06 11:36), Vivek Goyal didst pronounce:
> >>>>Where is bss placed in physical memory? I guess bss_start and bss_stop
> >>>>from System.map will tell us. That will confirm that above memset step
> >>>>is
> >>>>stomping over bss. Then we have to just find that somewhere probably
> >>>>we allocated wrong physical memory area for bootmem allocator map.
> >>>>
> >>>
> >>>BSS is at 0x643000 -> 0x777BC4
> >>>init_bootmem wipes from 0x777000 -> 0x8F7000
> >>>
> >>>So the BSS bytes from 0x777000 ->0x777BC4 (which looks very suspiciously
> >>>like a page alignment of addr & PAGE_MASK) gets set to 0xFF. One possible
> >>>fix is below. It adds a check in bad_addr() to see if the BSS section is
> >>>about to be used for bootmap. It Seems To Work For Me (tm) and
> >>>illustrates
> >>>the source of the problem even if it's not the 100% correct fix.
> >>
> >>I was able to boot the machine with Mel's patch applied on top of
> >>-git22.
> >
> >
> >Please have a look at the attached patch. Does it make some sense.
> >
>
> It makes some sense. As you state, it wastes memory but that is better
> than breaking.
>
> >Steve, can you please give this patch a try if it fixes the problem?
> >
>
> I boottested the patch on the same machine as Steve was using and it
> completed successfully.
>

Hi Andrew,

Can you please have a look at the attached patch and include it in -mm?
It fixes the issue for Steve. It also figures in Adrian Bunk's list of
known regressions.

Subject : oops in xfrm_register_mode
References : http://lkml.org/lkml/2006/10/4/170
Submitter : Steve Fox <[email protected]>
Handled-By : Vivek Goyal <[email protected]>
Status : patch available



o Currently some code assumes that the address returned by find_e820_area()
is page aligned. But it looks like find_e820_area() had no such intention,
and hence one might end up stomping over some data. One such
case is the bootmem allocator initialization code stomping over bss.

o This patch modifies find_e820_area() to return a page aligned address. This
might be a little wasteful of memory, but at the same time it is probably
easier to handle page aligned memory.

Signed-off-by: Vivek Goyal <[email protected]>
---

arch/x86_64/kernel/e820.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff -puN arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area arch/x86_64/kernel/e820.c
--- linux-2.6.19-rc1-1M/arch/x86_64/kernel/e820.c~x86_64-return-page-aligned-phy-addr-from-find-e820-area 2006-10-06 15:28:13.000000000 -0400
+++ linux-2.6.19-rc1-1M-root/arch/x86_64/kernel/e820.c 2006-10-06 15:44:45.000000000 -0400
@@ -54,13 +54,13 @@ static inline int bad_addr(unsigned long

/* various gunk below that needed for SMP startup */
if (addr < 0x8000) {
- *addrp = 0x8000;
+ *addrp = PAGE_ALIGN(0x8000);
return 1;
}

/* direct mapping tables of the kernel */
if (last >= table_start<<PAGE_SHIFT && addr < table_end<<PAGE_SHIFT) {
- *addrp = table_end << PAGE_SHIFT;
+ *addrp = PAGE_ALIGN(table_end << PAGE_SHIFT);
return 1;
}

@@ -68,18 +68,18 @@ static inline int bad_addr(unsigned long
#ifdef CONFIG_BLK_DEV_INITRD
if (LOADER_TYPE && INITRD_START && last >= INITRD_START &&
addr < INITRD_START+INITRD_SIZE) {
- *addrp = INITRD_START + INITRD_SIZE;
+ *addrp = PAGE_ALIGN(INITRD_START + INITRD_SIZE);
return 1;
}
#endif
/* kernel code */
- if (last >= __pa_symbol(&_text) && last < __pa_symbol(&_end)) {
- *addrp = __pa_symbol(&_end);
+ if (last >= __pa_symbol(&_text) && addr < __pa_symbol(&_end)) {
+ *addrp = PAGE_ALIGN(__pa_symbol(&_end));
return 1;
}

if (last >= ebda_addr && addr < ebda_addr + ebda_size) {
- *addrp = ebda_addr + ebda_size;
+ *addrp = PAGE_ALIGN(ebda_addr + ebda_size);
return 1;
}

@@ -152,7 +152,7 @@ unsigned long __init find_e820_area(unsi
continue;
while (bad_addr(&addr, size) && addr+size <= ei->addr+ei->size)
;
- last = addr + size;
+ last = PAGE_ALIGN(addr) + size;
if (last > ei->addr + ei->size)
continue;
if (last > end)
_

2006-10-17 00:01:54

by Andrew Morton

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Mon, 16 Oct 2006 14:16:13 -0400
Vivek Goyal <[email protected]> wrote:

>
> Can you please have a look at the attached patch

Looks like a fine patch to me, although it could benefit from a comment
explaining why all those PAGE_ALIGN()s are in there.

> and include it in -mm.

Does it fix a patch in -mm or is it needed in mainline?


2006-10-17 12:18:41

by Adrian Bunk

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Mon, Oct 16, 2006 at 04:58:14PM -0700, Andrew Morton wrote:
> On Mon, 16 Oct 2006 14:16:13 -0400
> Vivek Goyal <[email protected]> wrote:
>
> >
> > Can you please have a look at the attached patch
>
> Looks like a fine patch to me, although it could benefit from a comment
> explaining why all those PAGE_ALIGN()s are in there.
>
> > and include it in -mm.
>
> Does it fix a patch in -mm or is it needed in mainline?

The bug in my list was reported to be present in mainline [1].

cu
Adrian

[1] http://lkml.org/lkml/2006/10/4/394

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-10-17 17:32:50

by Mel Gorman

Subject: Re: 2.6.18-mm2 boot failure on x86-64

On Tue, 17 Oct 2006, Adrian Bunk wrote:

> On Mon, Oct 16, 2006 at 04:58:14PM -0700, Andrew Morton wrote:
>> On Mon, 16 Oct 2006 14:16:13 -0400
>> Vivek Goyal <[email protected]> wrote:
>>
>>>
>>> Can you please have a look at the attached patch
>>
>> Looks like a fine patch to me, although it could benefit from a comment
>> explaining why all those PAGE_ALIGN()s are in there.
>>
>>> and include it in -mm.
>>
>> Does it fix a patch in -mm or is it needed in mainline?
>
> The bug in my list was reported to be present in mainline [1].
>

Confirmed. This bug is present in 2.6.19-rc2.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab