2009-07-16 21:34:08

by Andrew Morton

Subject: mmotm 2009-07-16-14-32 uploaded

The mm-of-the-moment snapshot 2009-07-16-14-32 has been uploaded to

http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

git://git.zen-sources.org/zen/mmotm.git

It contains the following patches against 2.6.31-rc3:

origin.patch
markup_oops-fix-it-with-32-bit-userspace-on-a-64-bit-kernel.patch
sys_pipe-fix-fd-leak-if-pipe-is-called-with-an-invalid-address.patch
page-allocator-preserve-pfn-ordering-when-__gfp_cold-is-set.patch
repeatable-slab-corruption-with-ltp-msgctl08.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
x86-fix-x86-fix-pageattr-handling-for-lpage-percpu-allocator-and-re-enable-it.patch
acpi-battery-work-around-negative-s16-battery-current-on-acer.patch
kernel-core-add-smp_call_function_any.patch
kernel-core-add-smp_call_function_any-update.patch
arch-x86-kernel-cpu-cpufreq-acpi-cpufreqc-avoid-cross-cpu-interrupts-by-using-smp_call_function_any.patch
toshiba_acpi-return-on-a-fail-path.patch
acerhdf-fix-fan-control-for-aoa150-model.patch
acpi-dont-free-non-existent-backlight-in-acpi-video-module.patch
drivers-acpi-videoc-remove-unneeded-memsets.patch
acpi-reintroduce-acpi_device_ops-shutdown-method.patch
arch-x86-kernel-tscc-smi-workaround-for-pit_expect_msb.patch
arch-x86-kernel-tscc-smi-workaround-for-pit_expect_msb-checkpatch-fixes.patch
s3c-fix-check-of-index-into-s3c_gpios.patch
pcmcia-yenta-add-missing-__devexit-marking.patch
pcmcia-pccard-deadlock-fix.patch
powerpc-sky-cpu-redundant-or-incorrect-tests-on-unsigned.patch
platform_device_add_data-use-kmemdup.patch
video-initial-support-for-adv7180.patch
ecryptfs-fix-lockdep-reported-ab-ba-mutex-issue.patch
ecryptfs-another-lockdep-issue.patch
ecryptfs-yet-another-lockdep-issue.patch
posix_cpu_timers_exit_group-do-not-use-thread_group_cputimer.patch
timer-stats-fix-del_timer_sync-and-try_to_del_timer_sync.patch
input-drivers-input-xpadc-improve-xbox-360-wireless-support-and-add-sysfs-interface.patch
input-documentation-input-xpadtxt-update-for-new-driver-functionality.patch
input-more-i8042-reset-quirks-for-msi-wind-clone-netbooks.patch
input-tsc2007-remove-hr-timer.patch
input-tsc2007-make-platform-callbacks-optional.patch
gitignore-usr-initramfs_datacpiobz2-and-usr-initramfs_datacpiolzma.patch
kernel-hacking-move-strip_asm_syms-from-general.patch
leds-gpio-leds-fix-typographics-fault.patch
leds-gpio-leds-fix-typographics-fault-checkpatch-fixes.patch
mmc-in-mmc_power_up-use-previously-selected-ocr-if-available.patch
omap-hsmmc-do-not-enable-buffer-ready-interrupt-if-using-dma.patch
mmc-msm_sdccc-driver-for-htc-dream.patch
msm_sdccc-convert-printkkern_level-to-pr_level.patch
msm_sdccc-stylistic-cleaning.patch
msm_sdccc-move-overly-indented-code-to-separate-function.patch
jffs2-move-jffs2_gcd_mtd-threads-to-the-new-kthread-api.patch
mtd-sst25l-non-jedec-spi-flash-driver.patch
mtd-sst25l-non-jedec-spi-flash-driver-update.patch
mtd-sst25l-non-jedec-spi-flash-driver-fix.patch
drivers-mtd-mtdcorec-make-symbols-static.patch
mtd-sst25l-fix-lock-imbalance.patch
isdn-hisax-fix-lock-imbalance.patch
3x59x-fix-pci-resource-management.patch
3x59x-fix-pci-resource-management-checkpatch-fixes.patch
ext4-remove-redundant-test-on-unsigned.patch
sunrpc-use-formatting-of-module-name-in-sunrpc.patch
serial_txx9-use-container_of-instead-of-direct-cast.patch
icom-converting-space-to-tabs.patch
serial-add-parameter-to-force-skipping-the-test-for-the-txen-bug.patch
hypfs-remove-useless-variable-qname.patch
scsi-use-the-common-hex_asc-array-rather-than-a-private-one.patch
scsi-gdthc-use-unaligned-access-helpers.patch
scsi-annotate-gdth_rdcap_data-gdth_rdcap16_data-endianness.patch
scsi-add-__init-__exit-macros-to-ibmvstgtc.patch
scsi-make-scsi-sg-v4-driver-enabled-by-default-and-remove-experimental-dependency-since-udev-depends-on-bsg.patch
scsi-make-scsi-sg-v4-driver-enabled-by-default-and-remove-experimental-dependency-since-udev-depends-on-bsg-checkpatch-fixes.patch
vt6655-s-void-void.patch
staging-rt2860-remove-dependency-on-wireless_ext-version.patch
drivers-usb-gadget-s3c2410_udcc-fix.patch
drivers-usb-gadget-s3c-hsotgc-missing-parentheses.patch
vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
raw-fix-rawctl-compat-ioctls-breakage-on-amd64-and-itanic.patch
vfs-improve-comment-describing-fget_light.patch
libfs-make-simple_read_from_buffer-conventional.patch
fs-inodec-add-dev-id-and-inode-number-for-debugging-in-init_special_inode.patch
vfs-split-generic_forget_inode-so-that-hugetlbfs-does-not-have-to-copy-it.patch
seq_file-return-a-negative-error-code-when-seq_path_root-fails.patch
fs-fix-overflow-in-sys_mount-for-in-kernel-calls.patch
fs-fix-overflow-in-sys_mount-for-in-kernel-calls-fix.patch
sendfile-several-fixes.patch
xtensa-variant-specific-code.patch
mm.patch
jbd-fix-race-bwtween-write_metadata_buffer-and-get_write_access.patch
dynamic-debug-fix-typo.patch
mm-copy-over-oom_adj-value-at-fork-time.patch
kexec-fix-omitting-offset-in-extended-crashkernel-syntax.patch
edac-x83-fix-mchbar-high-register-addr.patch
cgroups-fix-pid-namespace-bug.patch
cgroups-fix-pid-namespace-bug-fix.patch
flat-fix-uninitialized-ptr-with-shared-libs.patch
revert-mm-prevent-balance_dirty_pages-from-doing-too-much-work.patch
genirq-do-not-disable-irq_wakeup-marked-irqs-on-suspend.patch
cciss-remove-logical-drive-sysfs-entries-during-driver-cleanup.patch
cciss-use-only-one-scan-thread.patch
cciss-kick-off-logical-drive-topology-rescan-through-sysfs.patch
drivers-gpu-drm-ttm-ttm_bo_vmc-fix-misplaced-parentheses.patch
maintainers-update-atlx-contact-info.patch
arch-x86-oprofile-op_model_amdc-fix-op_amd_handle_ibs-return-type.patch
drivers-rtc-rtc-cmosc-cmos_init-dont-ignore-pnp_register_driver-return-value.patch
qla2xxx-fix-__little_endian-definition-warnings.patch
serial-bfin_5xx-fix-building-as-module-when-early-printk-is-enabled.patch
clocksource-save-mult_orig-in-clocksource_disable.patch
include-linux-clocksourceh-coding-style-tweaks.patch
9p-fix-incorrect-parameters-to-v9fs_file_readn.patch
mm-make-swap-token-dummies-static-inlines.patch
mm-make-swap-token-dummies-static-inlines-fix.patch
mm-make-swap-token-dummies-static-inlines-fix-2.patch
mm-remove-obsoleted-alloc_pages-cpuset-comment.patch
readahead-add-blk_run_backing_dev.patch
readahead-add-blk_run_backing_dev-fix.patch
readahead-add-blk_run_backing_dev-fix-fix-2.patch
memory-hotplug-update-zone-pcp-at-memory-online.patch
memory-hotplug-update-zone-pcp-at-memory-online-fix.patch
memory-hotplug-exclude-isolated-page-from-pco-page-alloc.patch
memory-hotplug-make-pages-from-movable-zone-always-isolatable.patch
memory-hotplug-alloc-page-from-other-node-in-memory-online.patch
memory-hotplug-migrate-swap-cache-page.patch
page_alloc-fix-kernel-doc-warning.patch
hugetlb-balance-freeing-of-huge-pages-across-nodes.patch
hugetlb-use-free_pool_huge_page-to-return-unused-surplus-pages.patch
hugetlb-use-free_pool_huge_page-to-return-unused-surplus-pages-fix.patch
hugetlb-clean-up-and-update-huge-pages-documentation.patch
mm-clean-up-page_remove_rmap.patch
mm-show_free_areas-display-slab-pages-in-two-separate-fields.patch
documentation-memorytxt-remove-some-very-outdated-recommendations.patch
mm-oom-analysis-add-per-zone-statistics-to-show_free_areas.patch
mm-oom-analysis-add-buffer-cache-information-to-show_free_areas.patch
mm-oom-analysis-show-kernel-stack-usage-in-proc-meminfo-and-oom-log-output.patch
mm-oom-analysis-add-shmem-vmstat.patch
page-allocator-allow-too-high-order-warning-messages-to-be-suppressed-with-__gfp_nowarn.patch
profile-suppress-warning-about-large-allocations-when-profile=1-is-specified.patch
net-dccp-suppress-warning-about-large-allocations-from-dccp.patch
mm-update-alloc_flags-after-oom-killer-has-been-called.patch
mm-rename-pgmoved-variable-in-shrink_active_list.patch
mm-shrink_inactive_list-nr_scan-accounting-fix-fix.patch
mm-vmstat-add-isolate-pages.patch
mm-vmstat-add-isolate-pages-fix.patch
vmscan-throttle-direct-reclaim-when-too-many-pages-are-isolated-already.patch
mm-remove-__addsub_zone_page_state.patch
frv-duplicate-output_buffer-of-e03.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
m32r-remove-redundant-tests-on-unsigned.patch
m68k-count-can-reach-51-not-50.patch
m68k-cnt-reaches-1-not-0.patch
arch-m68k-include-asm-motorola_pgalloch-fix-kunmap-arg.patch
rework-fix-is_single_threaded.patch
printk-boot_delay-rename-printk_delay_msec-to-loops_per_msec.patch
printk-boot_delay-rename-printk_delay_msec-to-loops_per_msec-fix.patch
printk-boot_delay-rename-printk_delay_msec-to-loops_per_msec-fix-2.patch
printk-add-printk_delay-to-make-messages-readable-for-some-scenarios.patch
printk-add-printk_delay-to-make-messages-readable-for-some-scenarios-fix.patch
printk-add-printk_delay-to-make-messages-readable-for-some-scenarios-cleanup.patch
move-magic-numbers-into-magich.patch
move-magic-numbers-into-magich-update.patch
kmod-fix-race-in-usermodehelper-code.patch
add-a-driver-for-the-winbond-wpcd376i-ir-functionality.patch
add-a-driver-for-the-winbond-wpcd376i-ir-functionality-update.patch
maintainers-ia64-pair-p-m-entries-properly.patch
maintainers-remove-ivtv-user-lists-add-cx18-url.patch
maintainers-qlge-10gb-ethernet-pair-p-m-entries-properly.patch
maintainers-use-tabs-in-acer-aspire-one.patch
maintainers-remove-l-linux-kernel-vgerkernelorg.patch
maintainers-move-arpd-to-credits.patch
maintainers-update-kernel-janitors.patch
maintainers-add-pps-patterns.patch
maintainers-usb-serial-digi-acceleport-use-separate-p-for-al-borchers.patch
maintainers-input-add-dmitrys-name-to-his-email-address.patch
maintainers-remove-cs461x-sound-card-section.patch
maintainers-qlogic-qla2xxx-add-andrew-vasquez-email-address.patch
maintainers-qlogic-qla3xxx-add-ron-mercer-email-address.patch
maintainers-scott-murray-is-no-longer-with-somanetworks.patch
scripts-get_maintainerpl-add-f-directory-use.patch
get_maintainerpl-add-git-min-percent-option.patch
get_maintainerpl-add-git-min-percent-option-fix.patch
maintainers-coalesce-name-and-email-address-lines.patch
maintainers-finish-off-the-email-address-coalescing.patch
updated-f-and-t-in-maintainers-kristoffer-ericson.patch
getrusage-fill-ru_maxrss-value.patch
getrusage-fill-ru_maxrss-value-update.patch
asm-sections-add-text-data-checking-functions-for-arches-to-override.patch
kallsyms-use-new-arch_is_kernel_text.patch
lockdep-use-new-arch_is_kernel_data.patch
blackfin-override-text-data-checking-functions.patch
drivers-hwmon-coretempc-enable-the-intel-atom.patch
lis3-fix-typo.patch
lis3-add-free-fall-wakeup-function-via-platform_data.patch
lis3-add-power-management-functions.patch
lis3-add-power-management-functions-fix.patch
lis3_spi-code-cleanups.patch
kcore-fix-proc-kcores-statst_size.patch
proc-connector-add-event-for-process-becoming-session-leader.patch
proc-connector-add-event-for-process-becoming-session-leader-checkpatch-fixes.patch
procfs-provide-stack-information-for-threads-v08.patch
procfs-provide-stack-information-for-threads-v011.patch
spi-remove-imx-spi-driver.patch
spi-add-spi-driver-for-most-known-imx-socs.patch
rtc-add-driver-for-mxcs-internal-rtc-module.patch
rtc-add-driver-for-mxcs-internal-rtc-module-fix.patch
rtc-add-driver-for-mxcs-internal-rtc-module-fix-fix.patch
rtc-u300-coh-901-331-rtc-driver-v3.patch
rtc-update-documentation-wrt-rtc_pie-irq_set_state.patch
rtc-bfin-do-not-share-rtc-irq.patch
rtc-add-freescale-stmp37xx-378x-driver.patch
rtc-philips-nxp-pcf2123-driver.patch
rtc-philips-nxp-pcf2123-driver-v03.patch
rtc-philips-nxp-pcf2123-driver-v03-fix.patch
rtc-philips-nxp-pcf2123-driver-v03-update.patch
rtc-reorder-makefile.patch
rtc-driver-for-pcap2-pmic.patch
rtc-driver-for-pcap2-pmic-update.patch
gpiolib-allow-exported-gpio-nodes-to-be-named-using-sysfs-links.patch
gpiolib-allow-exported-gpio-nodes-to-be-named-using-sysfs-links-update.patch
gpio-add-mc33880-driver.patch
mfd-gpio-add-a-gpio-interface-to-the-ucb1400-mfd-chip-driver-via-gpiolib.patch
omapfb-add-support-for-the-apollon-lcd.patch
omapfb-add-support-for-mipi-dcs-compatible-lcds.patch
omapfb-add-support-for-the-amstrad-delta-lcd.patch
omapfb-add-support-for-the-2430sdp-lcd.patch
omapfb-add-support-for-the-omap2evm-lcd.patch
omapfb-add-support-for-the-3430sdp-lcd.patch
omapfb-add-support-for-the-omap3-evm-lcd.patch
omapfb-add-support-for-the-omap3-beagle-dvi-output.patch
omapfb-add-support-for-the-gumstix-overo-lcd.patch
omapfb-add-support-for-the-zoom-mdk-lcd.patch
omapfb-add-support-for-rotation-on-the-blizzard-lcd-ctrl.patch
n770-enable-lcd-mipi-dcs-in-kconfig.patch
omapfb-dispc-various-typo-fixes.patch
omapfb-dispc-disable-iface-clocks-along-with-func-clocks.patch
omapfb-dispc-enable-wake-up-capability.patch
omapfb-dispc-allow-multiple-external-irq-handlers.patch
omapfb-suspend-resume-only-if-fb-device-is-already-initialized.patch
omapfb-fix-coding-style-remove-dead-line.patch
omapfb-add-fb-manual-update-option-to-kconfig.patch
omapfb-hwa742-fix-pointer-to-be-const.patch
atyfb-coding-style-cleanup.patch
framebuffer-support-for-htc-dream.patch
framebuffer-support-for-htc-dream-checkpatch-fixes.patch
platinumfb-misplaced-parenthesis.patch
davinci-fb-frame-buffer-driver-for-ti-da8xx-omap-l1xx.patch
davinci-fb-frame-buffer-driver-for-ti-da8xx-omap-l1xx-fix.patch
intelfb-fix-setting-of-active-pipe-with-lvds-displays.patch
hfsplus-identify-journal-info-block-in-volume-header.patch
hfsplus-fix-journal-detection.patch
memcg-remove-the-overhead-associated-with-the-root-cgroup.patch
memcg-remove-the-overhead-associated-with-the-root-cgroup-fix.patch
memcg-remove-the-overhead-associated-with-the-root-cgroup-fix-2.patch
memcg-add-comments-explaining-memory-barriers.patch
memcg-add-comments-explaining-memory-barriers-checkpatch-fixes.patch
ptrace-__ptrace_detach-do-__wake_up_parent-if-we-reap-the-tracee.patch
do_wait-wakeup-optimization-shift-security_task_wait-from-eligible_child-to-wait_consider_task.patch
do_wait-wakeup-optimization-change-__wake_up_parent-to-use-filtered-wakeup.patch
do_wait-wakeup-optimization-child_wait_callback-check-__wnothread-case.patch
do_wait-optimization-do-not-place-sub-threads-on-task_struct-children-list.patch
wait_consider_task-kill-parent-argument.patch
signals-tracehook_notify_jctl-change.patch
utrace-core.patch
elf-clean-up-fill_note_info.patch
elf-clean-up-fill_note_info-fix.patch
flat-use-is_err_value-helper-macro.patch
n_hdlc-add-buffer-flushing-checkpatch-fixes.patch
edac-mpc85xx-add-p2020ds-support.patch
edac-mpc85xx-add-mpc83xx-support.patch
edac-fix-resource-size-calculation.patch
asm-generic-remove-calling-flush_write_buffers-in-dma_sync__for_cpu.patch
adfs-remove-redundant-test-on-unsigned.patch
aio-ifdef-fields-in-mm_struct.patch
bzip2-lzma-gzip-fix-comments-describing-decompressor-api.patch
bzip2-lzma-remove-nasty-uncompressed-size-hack-in-pre-boot-environment.patch
lzma-gzip-fix-potential-oops-when-input-data-is-truncated.patch
kernel-time-add-function-to-convert-between-calendar-time-and-broken-down-time-for-universal-use.patch
fatfs-use-common-localtime-gmtime-in-fat_time_unix2fat.patch
sound-core-pcm_timerc-use-lib-gcdc.patch
net-netfilter-ipvs-ip_vs_wrrc-use-lib-gcdc.patch
net-netfilter-ipvs-ip_vs_wrrc-use-lib-gcdc-fix.patch
vfs-take-2add-set_page_dirty_notag.patch
reiser4-vfs-add-super_operationssync_inodes-2.patch
reiser4-export-remove_from_page_cache.patch
reiser4-export-remove_from_page_cache-fix.patch
reiser4-export-find_get_pages.patch
reiser4.patch
reiser4-adjust-to-the-new-aops.patch
reiser4-adjust-to-the-new-aops-fixup.patch
reiser4-remove-simple_prepare_write-usage.patch
reiser4-remove-simple_prepare_write-usage-checkpatch-fixes.patch
fs-symlink-write_begin-allocation-context-fix-reiser4-fix.patch
reiser4-handling-error-returned-by-d_obtain_alias-fixup.patch
reiser4-update-names-of-quota-methods.patch
reiser4-use-set_page_dirty_notag.patch
fs-reiser4-contextc-current_is_pdflush-got-removed.patch
make-sure-nobodys-leaking-resources.patch
journal_add_journal_head-debug.patch
releasing-resources-with-children.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
add-debugging-aid-for-memory-initialisation-problems.patch
keep-track-of-network-interface-renaming.patch
workaround-for-a-pci-restoring-bug.patch
prio_tree-debugging-patch.patch
single_open-seq_release-leak-diagnostics.patch
add-a-refcount-check-in-dput.patch
getblk-handle-2tb-devices.patch
getblk-handle-2tb-devices-fix.patch
undeprecate-pci_find_device.patch
notify_change-callers-must-hold-i_mutex.patch


2009-07-21 02:53:51

by Valdis Klētnieks

Subject: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code

On Thu, 16 Jul 2009 14:34:02 PDT, [email protected] said:
> The mm-of-the-moment snapshot 2009-07-16-14-32 has been uploaded to

Dies a horrid death during early boot. Dell Latitude D820, and this graphics:

01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS 110M/GeForce Go 7300] (rev a1)

Traceback (hand-copied from a very crappy cell-phone picture)

strcmp+0x4/0x1f
acpi_device_probe+0xac/0x13e
driver_probe_device+0xc9/0x14e
__driver_attach+0x58/0x7c
? __driver_attach+0x58/0x7c
? __driver_attach+0x58/0x7c
bus_for_each_dev+0x54/0x89
driver_attach+0x19/0x1b
bus_add_driver+0xv4/0x1fe
driver_register+0xb7/0x128
? acpi_video_init+0x0/0x17
acpi_bus_register_driver+0x3e/0x42
acpi_video_register+0x42/0x6e
acpi_video_init+0x15/0x17
do_one_initcall+0x56/0x130

Analysis shows it's the following code from (inlined) acpi_device_install_notify_handler

static int acpi_device_install_notify_handler(struct acpi_device *device)
{
	acpi_status status;
	char *hid;

	hid = acpi_device_hid(device);
	if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))

but we never check if hid is non-trash before feeding it to strcmp. Looks
like something in this linux-next commit is involved:

commit ed444824932d2a563858d82ec1ea29b0aa775e91
Author: Bob Moore <[email protected]>
Date: Mon Jun 29 13:39:29 2009 +0800

I suspect something in acpi_get_object_info() is going astray, causing
acpi_device_set_id() to set the ->pnp.hardware_id to NULL in this code:

	if (hid) {
		device->pnp.hardware_id = ACPI_ALLOCATE_ZEROED(strlen (hid) + 1);
		if (device->pnp.hardware_id) {
			strcpy(device->pnp.hardware_id, hid);
			device->flags.hardware_id = 1;
		}
	} else
		device->pnp.hardware_id = NULL;

The else clause is new in this commit.

Looking at the old code, it *may* be that the ACPI code on my laptop is just
busticated and/or there's no _HID method for the graphics card, and the old
code Just Happened To Work in previous kernels because ->pnp.hardware_id
wouldn't actually get set *at all* in acpi_device_set_id, so we'd get random
stale data that was bogus, but didn't give strcmp() indigestion...

Any wisdom on debugging this further (including how to tell if the ACPI
tables have a sane _HID method for the graphics card) would be appreciated...

Or is the correct fix in fact to just add a 'if (!hid) return -EINVAL;' to
acpi_device_install_notify_handler()?
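
As a rough userspace illustration of that guard (a sketch only, not kernel code): the struct, the function name and the return convention below are hypothetical stand-ins, and the value used for the power-button HID string is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ACPI_BUTTON_HID_POWERF; the value is assumed. */
#define MOCK_BUTTON_HID_POWERF "LNXPWRBN"

/* Hypothetical mock of a device; only the field needed here. */
struct mock_acpi_device {
	const char *hardware_id;	/* may be NULL when no HID was set */
};

/*
 * Returns 1 if the device carries the power-button HID, 0 if it carries
 * some other HID, and -EINVAL if there is no HID at all.
 */
static int mock_install_notify_handler(const struct mock_acpi_device *dev)
{
	assert(dev != NULL);
	const char *hid = dev->hardware_id;

	if (!hid)
		return -EINVAL;	/* the proposed guard: never hand NULL to strcmp() */

	return strcmp(hid, MOCK_BUTTON_HID_POWERF) == 0;
}
```

Whether failing the probe with -EINVAL is the right behaviour for a device without a HID is a separate question; the sketch only shows that the NULL pointer no longer reaches strcmp().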




2009-07-21 03:27:00

by Lin Ming

Subject: Re: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code


> From: <[email protected]>
> Date: Tue, Jul 21, 2009 at 10:52 AM
> Subject: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code
> To: Andrew Morton <[email protected]>, Bob Moore
> <[email protected]>, Len Brown <[email protected]>
> Cc: [email protected], [email protected]
>
>
> On Thu, 16 Jul 2009 14:34:02 PDT, [email protected] said:
> > The mm-of-the-moment snapshot 2009-07-16-14-32 has been uploaded to
>
> Dies a horrid death during early boot. Dell Latitude D820, and this graphics:
>
> 01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS
> 110M/GeForce Go 7300] (rev a1)
>
> Traceback (hand-copied from a very crappy cell-phone picture)
>
> strcmp+0x4/0x1f
> acpi_device_probe+0xac/0x13e
> driver_probe_device+0xc9/0x14e
> __driver_attach+0x58/0x7c
> ? __driver_attach+0x58/0x7c
> ? __driver_attach+0x58/0x7c
> bus_for_each_dev+0x54/0x89
> driver_attach+0x19/0x1b
> bus_add_driver+0xv4/0x1fe
> driver_register+0xb7/0x128
> ? acpi_video_init+0x0/0x17
> acpi_bus_register_driver+0x3e/0x42
> acpi_video_register+0x42/0x6e
> acpi_video_init+0x15/0x17
> do_one_initcall+0x56/0x130
>
> Analysis shows it's the following code from (inlined)
> acpi_device_install_notify_handler
>
> static int acpi_device_install_notify_handler(struct acpi_device *device)
> {
> acpi_status status;
> char *hid;
>
> hid = acpi_device_hid(device);
> if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))
>
> but we never check if hid is non-trash before feeding it to strcmp. Looks
> like something in this linux-next commit is involved:
>
> commit ed444824932d2a563858d82ec1ea29b0aa775e91
> Author: Bob Moore <[email protected]>
> Date: Mon Jun 29 13:39:29 2009 +0800
>
> I suspect something in acpi_get_object_info() is going astray, causing
> acpi_device_set_id() to set the ->pnp.hardware_id to NULL in this code:
>
> if (hid) {
> device->pnp.hardware_id = ACPI_ALLOCATE_ZEROED(strlen (hid) + 1);
> if (device->pnp.hardware_id) {
> strcpy(device->pnp.hardware_id, hid);
> device->flags.hardware_id = 1;
> }
> } else
> device->pnp.hardware_id = NULL;
>
> The else clause is new in this commit.

Hi, would you please try the patch below?

diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 6e83a68..6c64366 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -188,8 +188,8 @@ struct acpi_device_pnp {

 #define acpi_device_bid(d) ((d)->pnp.bus_id)
 #define acpi_device_adr(d) ((d)->pnp.bus_address)
-#define acpi_device_hid(d) ((d)->pnp.hardware_id)
-#define acpi_device_uid(d) ((d)->pnp.unique_id)
+#define acpi_device_hid(d) ((d)->pnp.hardware_id ? (d)->pnp.hardware_id : "\0")
+#define acpi_device_uid(d) ((d)->pnp.unique_id ? (d)->pnp.unique_id : "\0")
 #define acpi_device_name(d) ((d)->pnp.device_name)
 #define acpi_device_class(d) ((d)->pnp.device_class)


Thanks,
Lin Ming
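
The accessor-level fallback in the patch above can be sketched in userspace roughly as follows. The `mock_` structs, macros and helper are hypothetical names for illustration, not the kernel's definitions; only the shape of the conditional mirrors the patched macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mock of the relevant fields; not the kernel's structs. */
struct mock_pnp {
	char *hardware_id;	/* NULL when no HID was set */
	char *unique_id;	/* NULL when no UID was set */
};

struct mock_device {
	struct mock_pnp pnp;
};

/* Same shape as the patched accessors: yield an empty string for NULL. */
#define mock_device_hid(d) ((d)->pnp.hardware_id ? (d)->pnp.hardware_id : "")
#define mock_device_uid(d) ((d)->pnp.unique_id ? (d)->pnp.unique_id : "")

/* A caller that compares IDs without any NULL check of its own. */
static int mock_hid_equals(struct mock_device *d, const char *hid)
{
	assert(d != NULL && hid != NULL);
	return strcmp(mock_device_hid(d), hid) == 0;
}
```

With this approach every strcmp() caller is covered at once, but the accessor never yields NULL, so a caller that tests the result against NULL to detect a missing ID would have to test for an empty string instead.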

2009-07-21 03:33:49

by Hugh Dickins

Subject: Re: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code

On Mon, 20 Jul 2009, [email protected] wrote:
> On Thu, 16 Jul 2009 14:34:02 PDT, [email protected] said:
> > The mm-of-the-moment snapshot 2009-07-16-14-32 has been uploaded to
>
> Dies a horrid death during early boot. Dell Latitude D820, and this graphics:
>
> 01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS 110M/GeForce Go 7300] (rev a1)

Oh yes, I was getting just the same with Intel graphics (i915);
but promptly forgot about it once I'd a workaround in place,
and moved on to other things, sorry.

>
> Traceback (hand-copied from a very crappy cell-phone picture)
>
> strcmp+0x4/0x1f
> acpi_device_probe+0xac/0x13e
> driver_probe_device+0xc9/0x14e
> __driver_attach+0x58/0x7c
> ? __driver_attach+0x58/0x7c
> ? __driver_attach+0x58/0x7c
> bus_for_each_dev+0x54/0x89
> driver_attach+0x19/0x1b
> bus_add_driver+0xv4/0x1fe
> driver_register+0xb7/0x128
> ? acpi_video_init+0x0/0x17
> acpi_bus_register_driver+0x3e/0x42
> acpi_video_register+0x42/0x6e
> acpi_video_init+0x15/0x17
> do_one_initcall+0x56/0x130
>
> Analysis shows it's the following code from (inlined) acpi_device_install_notify_handler
>
> static int acpi_device_install_notify_handler(struct acpi_device *device)
> {
> acpi_status status;
> char *hid;
>
> hid = acpi_device_hid(device);
> if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))
>
> but we never check if hid is non-trash before feeding it to strcmp. Looks
> like something in this linux-next commit is involved:
>
> commit ed444824932d2a563858d82ec1ea29b0aa775e91
> Author: Bob Moore <[email protected]>
> Date: Mon Jun 29 13:39:29 2009 +0800
>
> I suspect something in acpi_get_object_info() is going astray, causing
> acpi_device_set_id() to set the ->pnp.hardware_id to NULL in this code:
>
> if (hid) {
> device->pnp.hardware_id = ACPI_ALLOCATE_ZEROED(strlen (hid) + 1);
> if (device->pnp.hardware_id) {
> strcpy(device->pnp.hardware_id, hid);
> device->flags.hardware_id = 1;
> }
> } else
> device->pnp.hardware_id = NULL;
>
> The else clause is new in this commit.

I think pnp.hardware_id has changed from being a builtin array to
an allocated pointer: so before there was always a zeroed array to
strcmp against, whereas now there's a NULL pointer if you come to
use acpi_device_install_notify_handler() "too early".
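
A minimal userspace sketch of that difference, assuming hypothetical struct layouts (the field size and the `pnp_old`/`pnp_new` names are made up for the example and are not the kernel's):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical "before" layout: the ID is an array embedded in the struct. */
struct pnp_old {
	char hardware_id[9];
};

/* Hypothetical "after" layout: the ID is a pointer, NULL until allocated. */
struct pnp_new {
	char *hardware_id;
};

/* Old layout: a zeroed struct holds an empty string, so strcmp() is safe. */
static int old_matches(const struct pnp_old *p, const char *hid)
{
	assert(p != NULL && hid != NULL);
	return strcmp(p->hardware_id, hid) == 0;
}

/* New layout: the pointer must be checked before it reaches strcmp(). */
static int new_matches(const struct pnp_new *p, const char *hid)
{
	assert(p != NULL && hid != NULL);
	return p->hardware_id != NULL && strcmp(p->hardware_id, hid) == 0;
}
```

In the old layout a never-filled ID simply compares unequal to everything; in the new one the same comparison without the check dereferences NULL.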

Patch that works for me at the bottom.

>
> Looking at the old code, it *may* be that the ACPI code on my laptop is just
> busticated and/or there's no _HID method for the graphics card, and the old
> code Just Happened To Work in previous kernels because ->pnp.hardware_id
> wouldn't actually get set *at all* in acpi_device_set_id, so we'd get random
> stale data that was bogus, but didn't give strcmp() indigestion...
>
> Any wisdom on debugging this further (including how to tell if the ACPI
> tables have a sane _HID method for the graphics card) would be appreciated...
>
> Or is the correct fix in fact to just add a 'if (!hid) return -EINVAL;' to
> acpi_device_install_notify_handler()?

[PATCH mmotm] acpi: work around NULL hardware_id

Work around NULL pnp.hardware_id in acpi_device_install_notify_handler()
when probing video device.

Signed-off-by: Hugh Dickins <[email protected]>
---
Signoff provided to handle the unlikely event that this hack
is actually the right fix!

drivers/acpi/scan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- mmotm/drivers/acpi/scan.c 2009-07-17 12:53:20.000000000 +0100
+++ linux/drivers/acpi/scan.c 2009-07-17 21:19:10.000000000 +0100
@@ -376,12 +376,12 @@ static int acpi_device_install_notify_ha
 	char *hid;
 
 	hid = acpi_device_hid(device);
-	if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))
+	if (hid && !strcmp(hid, ACPI_BUTTON_HID_POWERF))
 		status =
 		    acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON,
 						     acpi_device_notify_fixed,
 						     device);
-	else if (!strcmp(hid, ACPI_BUTTON_HID_SLEEPF))
+	else if (hid && !strcmp(hid, ACPI_BUTTON_HID_SLEEPF))
 		status =
 		    acpi_install_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON,
 						     acpi_device_notify_fixed,

2009-07-21 05:33:59

by Lin Ming

Subject: Re: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code


> From: Hugh Dickins <[email protected]>
> Date: Tue, Jul 21, 2009 at 11:33 AM
> Subject: Re: mmotm 2009-07-16-14-32 - sudden OOPS at boot in ACPI code
> To: [email protected]
> Cc: Andrew Morton <[email protected]>, Bob Moore
> <[email protected]>, Len Brown <[email protected]>,
> [email protected], [email protected]
>
>
> On Mon, 20 Jul 2009, [email protected] wrote:
> > On Thu, 16 Jul 2009 14:34:02 PDT, [email protected] said:
> > > The mm-of-the-moment snapshot 2009-07-16-14-32 has been uploaded to
> >
> > Dies a horrid death during early boot. Dell Latitude D820, and this graphics:
> >
> > 01:00.0 VGA compatible controller: nVidia Corporation G72M [Quadro NVS 110M/GeForce Go 7300] (rev a1)
>
> Oh yes, I was getting just the same with Intel graphics (i915);
> but promptly forgot about it once I'd a workaround in place,
> and moved on to other things, sorry.
>
> >
> > Traceback (hand-copied from a very crappy cell-phone picture)
> >
> > strcmp+0x4/0x1f
> > acpi_device_probe+0xac/0x13e
> > driver_probe_device+0xc9/0x14e
> > __driver_attach+0x58/0x7c
> > ? __driver_attach+0x58/0x7c
> > ? __driver_attach+0x58/0x7c
> > bus_for_each_dev+0x54/0x89
> > driver_attach+0x19/0x1b
> > bus_add_driver+0xv4/0x1fe
> > driver_register+0xb7/0x128
> > ? acpi_video_init+0x0/0x17
> > acpi_bus_register_driver+0x3e/0x42
> > acpi_video_register+0x42/0x6e
> > acpi_video_init+0x15/0x17
> > do_one_initcall+0x56/0x130
> >
> > Analysis shows it's the following code from (inlined) acpi_device_install_notify_handler
> >
> > static int acpi_device_install_notify_handler(struct acpi_device *device)
> > {
> > acpi_status status;
> > char *hid;
> >
> > hid = acpi_device_hid(device);
> > if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))
> >
> > but we never check if hid is non-trash before feeding it to strcmp. Looks
> > like something in this linux-next commit is involved:
> >
> > commit ed444824932d2a563858d82ec1ea29b0aa775e91
> > Author: Bob Moore <[email protected]>
> > Date: Mon Jun 29 13:39:29 2009 +0800
> >
> > I suspect something in acpi_get_object_info() is going astray, causing
> > acpi_device_set_id() to set the ->pnp.hardware_id to NULL in this code:
> >
> > if (hid) {
> > device->pnp.hardware_id = ACPI_ALLOCATE_ZEROED(strlen (hid) + 1);
> > if (device->pnp.hardware_id) {
> > strcpy(device->pnp.hardware_id, hid);
> > device->flags.hardware_id = 1;
> > }
> > } else
> > device->pnp.hardware_id = NULL;
> >
> > The else clause is new in this commit.
>
> I think pnp.hardware_id has changed from being a builtin array to
> an allocated pointer: so before there was always a zeroed array to

Yes, pnp.hardware_id and pnp.unique_id are now allocated pointers.
We made the change for the acpi_get_object_info interface.

> strcmp against, whereas now there's a NULL pointer if you come to
> use acpi_device_install_notify_handler() "too early".
>
> Patch that works for me at the bottom.

Yes, your patch can work around the problem in
acpi_device_install_notify_handler.

But there are other places that call strcmp to compare HID/UID,
so we'd better fix acpi_device_hid/_uid as below:

diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 6e83a68..6c64366 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -188,8 +188,8 @@ struct acpi_device_pnp {

 #define acpi_device_bid(d) ((d)->pnp.bus_id)
 #define acpi_device_adr(d) ((d)->pnp.bus_address)
-#define acpi_device_hid(d) ((d)->pnp.hardware_id)
-#define acpi_device_uid(d) ((d)->pnp.unique_id)
+#define acpi_device_hid(d) ((d)->pnp.hardware_id ? (d)->pnp.hardware_id : "\0")
+#define acpi_device_uid(d) ((d)->pnp.unique_id ? (d)->pnp.unique_id : "\0")
 #define acpi_device_name(d) ((d)->pnp.device_name)
 #define acpi_device_class(d) ((d)->pnp.device_class)

---
Thanks,
Lin Ming

>
> >
> > Looking at the old code, it *may* be that the ACPI code on my laptop is just
> > busticated and/or there's no _HID method for the graphics card, and the old
> > code Just Happened To Work in previous kernels because ->pnp.hardware_id
> > wouldn't actually get set *at all* in acpi_device_set_id, so we'd get random
> > stale data that was bogus, but didn't give strcmp() indigestion...
> >
> > Any wisdom on debugging this further (including how to tell if the ACPI
> > tables have a sane _HID method for the graphics card) would be appreciated...
> >
> > Or is the correct fix in fact to just add a 'if (!hid) return -EINVAL;' to
> > acpi_device_install_notify_handler()?
>
> [PATCH mmotm] acpi: work around NULL hardware_id
>
> Work around NULL pnp.hardware_id in acpi_device_install_notify_handler()
> when probing video device.
>
> Signed-off-by: Hugh Dickins <[email protected]>
> ---
> Signoff provided to handle the unlikely event that this hack
> is actually the right fix!
>
> drivers/acpi/scan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> --- mmotm/drivers/acpi/scan.c 2009-07-17 12:53:20.000000000 +0100
> +++ linux/drivers/acpi/scan.c 2009-07-17 21:19:10.000000000 +0100
> @@ -376,12 +376,12 @@ static int acpi_device_install_notify_ha
> char *hid;
>
> hid = acpi_device_hid(device);
> - if (!strcmp(hid, ACPI_BUTTON_HID_POWERF))
> + if (hid && !strcmp(hid, ACPI_BUTTON_HID_POWERF))
> status =
> acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON,
> acpi_device_notify_fixed,
> device);
> - else if (!strcmp(hid, ACPI_BUTTON_HID_SLEEPF))
> + else if (hid && !strcmp(hid, ACPI_BUTTON_HID_SLEEPF))
> status =
> acpi_install_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON,
> acpi_device_notify_fixed,

2009-07-22 13:18:27

by Valdis Klētnieks

Subject: mmotm 2009-07-16-14-32 - lockdep whinge in ext3/quota code

Saw this while bisecting to find another issue.

quilt top:
memcg-add-comments-explaining-memory-barriers-checkpatch-fixes.patch

From some checking, it doesn't look like any of the 58 patches after that is
relevant (only the reiser4 patch references quotas, no grep hits for lockdep
or ext3).

About 30 seconds after boot:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.31-rc3 #2
-------------------------------------------------------
rm/1562 is trying to acquire lock:
(&sb->s_type->i_mutex_key#11/4){+.+...}, at: [<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274

but task is already holding lock:
(&s->s_dquot.dqio_mutex){+.+...}, at: [<ffffffff811153d5>] dquot_commit+0x26/0xee

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&s->s_dquot.dqio_mutex){+.+...}:
[<ffffffff81067010>] __lock_acquire+0xa1b/0xb97
[<ffffffff81067278>] lock_acquire+0xec/0x110
[<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
[<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
[<ffffffff81115b54>] vfs_load_quota_inode+0x264/0x496
[<ffffffff81116094>] vfs_quota_on_path+0x4c/0x55
[<ffffffff81138e9d>] ext3_quota_on+0x14c/0x167
[<ffffffff81119e4f>] do_quotactl+0xf4/0x44c
[<ffffffff8111a491>] sys_quotactl+0x2ea/0x30e
[<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

-> #0 (&sb->s_type->i_mutex_key#11/4){+.+...}:
[<ffffffff81066eed>] __lock_acquire+0x8f8/0xb97
[<ffffffff81067278>] lock_acquire+0xec/0x110
[<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
[<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
[<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274
[<ffffffff81119bd3>] qtree_write_dquot+0xce/0x127
[<ffffffff81118924>] v2_write_dquot+0x27/0x29
[<ffffffff8111544c>] dquot_commit+0x9d/0xee
[<ffffffff8113a8ae>] ext3_write_dquot+0x69/0x8a
[<ffffffff8111713f>] dqput+0x138/0x25c
[<ffffffff81117993>] dquot_drop+0x6a/0x74
[<ffffffff8111509f>] vfs_dq_drop+0x41/0x43
[<ffffffff81130cba>] ext3_free_inode+0x96/0x28c
[<ffffffff81134cce>] ext3_delete_inode+0xbf/0xdd
[<ffffffff810e357d>] generic_delete_inode+0x135/0x1db
[<ffffffff810e363a>] generic_drop_inode+0x17/0x56
[<ffffffff810e252d>] iput+0x7a/0x7f
[<ffffffff810db906>] do_unlinkat+0x123/0x176
[<ffffffff810dbab8>] sys_unlinkat+0x24/0x26
[<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

other info that might help us debug this:

2 locks held by rm/1562:
#0: (jbd_handle){+.+...}, at: [<ffffffff811433c9>] journal_start+0x10a/0x137
#1: (&s->s_dquot.dqio_mutex){+.+...}, at: [<ffffffff811153d5>] dquot_commit+0x26/0xee

stack backtrace:
Pid: 1562, comm: rm Not tainted 2.6.31-rc3 #2
Call Trace:
[<ffffffff81066290>] print_circular_bug_tail+0x71/0x7c
[<ffffffff81066eed>] __lock_acquire+0x8f8/0xb97
[<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
[<ffffffff81067278>] lock_acquire+0xec/0x110
[<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
[<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
[<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
[<ffffffff810665e4>] ? check_irq_usage+0xad/0xbe
[<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
[<ffffffff810670e2>] ? __lock_acquire+0xaed/0xb97
[<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
[<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274
[<ffffffff81119bd3>] qtree_write_dquot+0xce/0x127
[<ffffffff81034c90>] ? get_parent_ip+0x11/0x42
[<ffffffff81118924>] v2_write_dquot+0x27/0x29
[<ffffffff8111544c>] dquot_commit+0x9d/0xee
[<ffffffff8113a8ae>] ext3_write_dquot+0x69/0x8a
[<ffffffff8111713f>] dqput+0x138/0x25c
[<ffffffff81117993>] dquot_drop+0x6a/0x74
[<ffffffff8111509f>] vfs_dq_drop+0x41/0x43
[<ffffffff81130cba>] ext3_free_inode+0x96/0x28c
[<ffffffff81131d63>] ? ext3_mark_inode_dirty+0x48/0x53
[<ffffffff81134cce>] ext3_delete_inode+0xbf/0xdd
[<ffffffff81134c0f>] ? ext3_delete_inode+0x0/0xdd
[<ffffffff810e357d>] generic_delete_inode+0x135/0x1db
[<ffffffff810e363a>] generic_drop_inode+0x17/0x56
[<ffffffff810e252d>] iput+0x7a/0x7f
[<ffffffff810db906>] do_unlinkat+0x123/0x176
[<ffffffff8107d440>] ? audit_syscall_entry+0x170/0x19c
[<ffffffff810dbab8>] sys_unlinkat+0x24/0x26
[<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b



2009-07-22 16:19:29

by Jan Kara

Subject: Re: mmotm 2009-07-16-14-32 - lockdep whinge in ext3/quota code

On Wed 22-07-09 09:17:49, [email protected] wrote:
> Saw this while bisecting to find another issue.
>
> quilt top:
> memcg-add-comments-explaining-memory-barriers-checkpatch-fixes.patch
>
> From some checking, it doesn't look like any of the 58 patches after that is
> relevant (only the reiser4 patch references quotas, no grep hits for lockdep
> or ext3).
>
> About 30 seconds after boot:
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.31-rc3 #2
> -------------------------------------------------------
> rm/1562 is trying to acquire lock:
> (&sb->s_type->i_mutex_key#11/4){+.+...}, at: [<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274
>
> but task is already holding lock:
> (&s->s_dquot.dqio_mutex){+.+...}, at: [<ffffffff811153d5>] dquot_commit+0x26/0xee
>
> which lock already depends on the new lock.
>
Grumble... Commit d01730d74d2b0155da50d44555001706294014f7 didn't quite
fix the problem. At least lockdep now warns about it ;). Attached patch
should fix it (compile tested only so far).

> the existing dependency chain (in reverse order) is:
>
> -> #1 (&s->s_dquot.dqio_mutex){+.+...}:
> [<ffffffff81067010>] __lock_acquire+0xa1b/0xb97
> [<ffffffff81067278>] lock_acquire+0xec/0x110
> [<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
> [<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
> [<ffffffff81115b54>] vfs_load_quota_inode+0x264/0x496
> [<ffffffff81116094>] vfs_quota_on_path+0x4c/0x55
> [<ffffffff81138e9d>] ext3_quota_on+0x14c/0x167
> [<ffffffff81119e4f>] do_quotactl+0xf4/0x44c
> [<ffffffff8111a491>] sys_quotactl+0x2ea/0x30e
> [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> -> #0 (&sb->s_type->i_mutex_key#11/4){+.+...}:
> [<ffffffff81066eed>] __lock_acquire+0x8f8/0xb97
> [<ffffffff81067278>] lock_acquire+0xec/0x110
> [<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
> [<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
> [<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274
> [<ffffffff81119bd3>] qtree_write_dquot+0xce/0x127
> [<ffffffff81118924>] v2_write_dquot+0x27/0x29
> [<ffffffff8111544c>] dquot_commit+0x9d/0xee
> [<ffffffff8113a8ae>] ext3_write_dquot+0x69/0x8a
> [<ffffffff8111713f>] dqput+0x138/0x25c
> [<ffffffff81117993>] dquot_drop+0x6a/0x74
> [<ffffffff8111509f>] vfs_dq_drop+0x41/0x43
> [<ffffffff81130cba>] ext3_free_inode+0x96/0x28c
> [<ffffffff81134cce>] ext3_delete_inode+0xbf/0xdd
> [<ffffffff810e357d>] generic_delete_inode+0x135/0x1db
> [<ffffffff810e363a>] generic_drop_inode+0x17/0x56
> [<ffffffff810e252d>] iput+0x7a/0x7f
> [<ffffffff810db906>] do_unlinkat+0x123/0x176
> [<ffffffff810dbab8>] sys_unlinkat+0x24/0x26
> [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
>
> other info that might help us debug this:
>
> 2 locks held by rm/1562:
> #0: (jbd_handle){+.+...}, at: [<ffffffff811433c9>] journal_start+0x10a/0x137
> #1: (&s->s_dquot.dqio_mutex){+.+...}, at: [<ffffffff811153d5>] dquot_commit+0x26/0xee
>
> stack backtrace:
> Pid: 1562, comm: rm Not tainted 2.6.31-rc3 #2
> Call Trace:
> [<ffffffff81066290>] print_circular_bug_tail+0x71/0x7c
> [<ffffffff81066eed>] __lock_acquire+0x8f8/0xb97
> [<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
> [<ffffffff81067278>] lock_acquire+0xec/0x110
> [<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
> [<ffffffff814a04af>] __mutex_lock_common+0x5a/0x54e
> [<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
> [<ffffffff810665e4>] ? check_irq_usage+0xad/0xbe
> [<ffffffff81139fd9>] ? ext3_quota_write+0xb5/0x274
> [<ffffffff810670e2>] ? __lock_acquire+0xaed/0xb97
> [<ffffffff814a0a43>] mutex_lock_nested+0x32/0x37
> [<ffffffff81139fd9>] ext3_quota_write+0xb5/0x274
> [<ffffffff81119bd3>] qtree_write_dquot+0xce/0x127
> [<ffffffff81034c90>] ? get_parent_ip+0x11/0x42
> [<ffffffff81118924>] v2_write_dquot+0x27/0x29
> [<ffffffff8111544c>] dquot_commit+0x9d/0xee
> [<ffffffff8113a8ae>] ext3_write_dquot+0x69/0x8a
> [<ffffffff8111713f>] dqput+0x138/0x25c
> [<ffffffff81117993>] dquot_drop+0x6a/0x74
> [<ffffffff8111509f>] vfs_dq_drop+0x41/0x43
> [<ffffffff81130cba>] ext3_free_inode+0x96/0x28c
> [<ffffffff81131d63>] ? ext3_mark_inode_dirty+0x48/0x53
> [<ffffffff81134cce>] ext3_delete_inode+0xbf/0xdd
> [<ffffffff81134c0f>] ? ext3_delete_inode+0x0/0xdd
> [<ffffffff810e357d>] generic_delete_inode+0x135/0x1db
> [<ffffffff810e363a>] generic_drop_inode+0x17/0x56
> [<ffffffff810e252d>] iput+0x7a/0x7f
> [<ffffffff810db906>] do_unlinkat+0x123/0x176
> [<ffffffff8107d440>] ? audit_syscall_entry+0x170/0x19c
> [<ffffffff810dbab8>] sys_unlinkat+0x24/0x26
> [<ffffffff8100b2ab>] system_call_fastpath+0x16/0x1b

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR


Attachments:
0001-quota-Silence-lockdep-on-quota_on.patch (2.38 kB)

2009-07-22 18:26:23

by Valdis Klētnieks

Subject: Re: mmotm 2009-07-16-14-32 - lockdep whinge in ext3/quota code

On Wed, 22 Jul 2009 18:19:23 +0200, Jan Kara said:

> From df60fe9a9d554070e6135087e154bd5aad2cc1b5 Mon Sep 17 00:00:00 2001
> From: Jan Kara <[email protected]>
> Date: Wed, 22 Jul 2009 18:12:17 +0200
> Subject: [PATCH] quota: Silence lockdep on quota_on
>
> Commit d01730d74d2b0155da50d44555001706294014f7 didn't completely fix
> the problem since we still take dqio_mutex and i_mutex in the wrong
> order. Move taking of i_mutex further down (luckily it's needed only
> for updating inode flags) below where dqio_mutex is taken.
>
> Signed-off-by: Jan Kara <[email protected]>
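
The patch itself is only attached, not shown, in this thread, so the following is a user-space sketch of the general rule the commit message describes: every path that needs both locks takes them in the same order. It assumes, as one reading of the description, that the quota-on path ends up taking dqio_mutex first and i_mutex only around the flag update. The function names, variables and the use of pthread mutexes are illustrative stand-ins, not the kernel code:

```c
#include <assert.h>
#include <pthread.h>

/* User-space stand-ins for the two kernel mutexes (assumption). */
static pthread_mutex_t dqio_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t i_mutex = PTHREAD_MUTEX_INITIALIZER;

static int inode_flags;	/* protected by i_mutex */
static int quota_bytes;	/* protected by dqio_mutex, then i_mutex */

/* Write path: dqio_mutex first, then i_mutex, as in the rm backtrace. */
static void quota_write(int bytes)
{
	pthread_mutex_lock(&dqio_mutex);
	pthread_mutex_lock(&i_mutex);
	quota_bytes += bytes;
	pthread_mutex_unlock(&i_mutex);
	pthread_mutex_unlock(&dqio_mutex);
}

/*
 * Quota-on path after reordering: i_mutex is no longer taken up front.
 * It is acquired only for the flag update, below dqio_mutex, so both
 * paths agree on the order dqio_mutex -> i_mutex.
 */
static void quota_on(int flags)
{
	pthread_mutex_lock(&dqio_mutex);
	/* ... quota file setup would happen here under dqio_mutex ... */
	pthread_mutex_lock(&i_mutex);
	inode_flags |= flags;
	pthread_mutex_unlock(&i_mutex);
	pthread_mutex_unlock(&dqio_mutex);
}
```

Because neither function ever holds i_mutex while waiting for dqio_mutex, two threads running quota_on() and quota_write() concurrently cannot deadlock on this pair of locks.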

Applied that fix to the -mmotm I was testing, and the kernel booted quietly.
Feel free to attach a:

Tested-by: Valdis Kletnieks <[email protected]>

as it goes upstream. Thanks for the fast patch.

