2011-04-14 22:45:08

by Andrew Morton

Subject: mmotm 2011-04-14-15-08 uploaded

The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to

http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

git://zen-kernel.org/kernel/mmotm.git

It contains the following patches against 2.6.39-rc3:

origin.patch
memcg-fix-mem_cgroup_rotate_reclaimable_page.patch
mm-optimize-pfn-calculation-in-online_page.patch
rtc-rtc-mc13xxx-fix-unterminated-platform_device_id-table.patch
fs-partitions-ldmc-fix-oops-caused-by-corrupted-partition-table.patch
mm-page_allocc-silence-build_all_zonelists-section-mismatch.patch
vmstat-update-comment-regarding-stat_threshold.patch
leds-leds-regulatorc-fix-handling-of-already-enabled-regulators.patch
kstrtox-fix-compile-warnings-in-test.patch
kstrtox-simpler-code-in-_kstrtoull.patch
maintainers-add-arm-ts78xx-setup-platform-maintainer.patch
maintainers-update-m68knommu-patterns.patch
maintainers-update-various-tty-patterns.patch
mm-add-vm-counters-for-transparent-hugepages.patch
maintainers-update-stable-branch-info.patch
tmpfs-fix-off-by-one-in-max_blocks-checks.patch
drivers-misc-sgi-gru-grufilec-fix-the-wrong-members-of-gru_chip.patch
brk-compat_brk-fix-detection-of-randomized-brk.patch
mm-check-that-we-have-the-right-vma-in-__access_remote_vm.patch
vmscan-all_unreclaimable-use-zone-all_unreclaimable-as-a-name.patch
oom-kill-remove-boost_dying_task_prio.patch
rapidio-add-idt-cps-1432-switch-definitions.patch
rapidio-mpc85xx-fix-possible-mport-registration-problems.patch
maintainers-change-mail-adress-of-hans-j-koch.patch
fs-fhandlec-add-linux-personalityh-for-ia64.patch
um-fix-call-tracer-and-bug-handler.patch
um-disable-config_cmpxchg_local.patch
ramfs-fix-memleak-on-no-mmu-arch.patch
mm-thp-use-conventional-format-for-boolean-attributes.patch
backlight-new-driver-for-the-adp8870-backlight-devices.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
hid-examplec-is-borked.patch
arch-alpha-kernel-systblss-remove-debug-check.patch
include-asm-generic-vmlinuxldsh-fix-__modver-section-warnings.patch
drivers-i2c-busses-i2c-designware-corec-needs-delayh.patch
vfs-avoid-large-kmallocs-for-the-fdtable.patch
drivers-char-agp-genericc-fix-arbitrary-kernel-memory-writes.patch
drivers-char-agp-genericc-fix-oom-and-buffer-overflow.patch
drivers-scsi-pmcraid-reject-negative-request-size.patch
drivers-scsi-mpt2sas-mpt2sas_ctlc-fix-unbounded-copy_to_user.patch
acpi-remove-acpi_sleep=s4_nonvs.patch
acerhdf-add-support-for-aspire-1410-bios-v13314.patch
arch-x86-include-asm-delayh-fix-udelay-and-ndelay-for-8-bit-args.patch
x86-fix-mmap-random-address-range.patch
leds-new-pcengines-alix-system-driver-enables-leds-via-gpio-interface.patch
gpio-show-explicit-dependency-between-gpio_cs5535-and-mfd_cs5535.patch
sound-pci-hda-hda_codecc-fix-warning.patch
msm-timer-migrate-to-timer-based-__delay.patch
arch-arm-mach-ux500-mbox-db5500c-world-writable-sysfs-fifo-file.patch
audit-always-follow-va_copy-with-va_end.patch
fs-btrfs-inodec-eliminate-memory-leak.patch
btrfs-dont-dereference-extent_mapping-if-null.patch
drivers-gpu-drm-radeon-atomc-fix-warning.patch
fb-fix-potential-deadlock-between-lock_fb_info-and-console_lock.patch
cyber2000fb-avoid-palette-corruption-at-higher-clocks.patch
fscache-remove-dead-code-under-config_workqueue_debugfs.patch
bitmap-irq-add-smp_affinity_list-interface-to-proc-irq.patch
leds-support-automatic-start-of-blinking-with-ledtrig-timer.patch
drivers-leds-leds-pca9532c-add-gpio-capability.patch
leds-route-kbd-leds-through-the-generic-leds-layer.patch
net-irda-convert-bfin_sir-to-common-blackfin-uart-header.patch
net-convert-%p-usage-to-%pk.patch
backlight-add-backlight-type-fix.patch
backlight-add-backlight-type-fix-fix.patch
drivers-video-backlight-adp5520_blc-check-strict_strtoul-return-value.patch
drivers-video-backlight-adp5520_blc-check-strict_strtoul-return-value-fix.patch
i915-add-native-backlight-control.patch
btusb-patch-add_apple_macbookpro62.patch
drivers-message-fusion-mptsasc-fix-warning.patch
scsi-fix-a-header-to-include-linux-typesh.patch
aic94xx-world-writable-sysfs-update_bios-file.patch
osst-wrong-index-used-in-inner-loop.patch
osst-wrong-index-used-in-inner-loop-checkpatch-fixes.patch
drivers-scsi-osstc-fix-warning.patch
drbd-fix-warning.patch
usb-yurex-recognize-generalkeys-wireless-presenter-as-generic-hid.patch
drivers-usb-misc-usbtestc-fix-warning.patch
xtensa-s-irq_chip-irq_data-in-various-places.patch
mm.patch
arch-mm-filter-disallowed-nodes-from-arch-specific-show_mem-functions.patch
mmap-add-alignment-for-some-variables.patch
mmap-avoid-unnecessary-anon_vma-lock.patch
mmap-avoid-merging-cloned-vmas.patch
mm-remove-unused-zone_idx-variable-from-set_migratetype_isolate.patch
mm-nommu-sort-mm-mmap-list-properly.patch
mm-nommu-sort-mm-mmap-list-properly-fix.patch
mm-nommu-dont-scan-the-vma-list-when-deleting.patch
mm-nommu-find-vma-using-the-sorted-vma-list.patch
mm-nommu-check-the-vma-list-when-unmapping-file-mapped-vma.patch
mm-nommu-fix-a-potential-memory-leak-in-do_mmap_private.patch
mm-nommu-fix-a-compile-warning-in-do_mmap_pgoff.patch
mm-per-node-vmstat-show-proper-vmstats.patch
mm-per-node-vmstat-show-proper-vmstats-fix.patch
mm-increase-reclaim_distance-to-30.patch
mm-introduce-wait_on_page_locked_killable.patch
x86mm-make-pagefault-killable.patch
mm-mem-hotplug-fix-section-mismatch-setup_per_zone_inactive_ratio-should-be-__meminit.patch
mm-mem-hotplug-recalculate-lowmem_reserve-when-memory-hotplug-occur.patch
mm-mem-hotplug-update-pcp-stat_threshold-when-memory-hotplug-occur.patch
mm-mem-hotplug-update-pcp-stat_threshold-when-memory-hotplug-occur-fix.patch
mm-convert-vma-vm_flags-to-64-bit.patch
mm-add-__nocast-attribute-to-vm_flags.patch
fremap-convert-vm_flags-to-unsigned-long-long.patch
procfs-convert-vm_flags-to-unsigned-long-long.patch
mm-compaction-reverse-the-change-that-forbade-sync-migraton-with-__gfp_no_kswapd.patch
oom-replace-pf_oom_origin-with-toggling-oom_score_adj.patch
oom-replace-pf_oom_origin-with-toggling-oom_score_adj-update.patch
mm-remove-unused-token-argument-from-apply_to_page_range-callback.patch
mm-add-apply_to_page_range_batch.patch
ioremap-use-apply_to_page_range_batch-for-ioremap_page_range.patch
vmalloc-use-plain-pte_clear-for-unmaps.patch
vmalloc-use-apply_to_page_range_batch-for-vunmap_page_range.patch
vmalloc-use-apply_to_page_range_batch-for-vmap_page_range_noflush.patch
vmalloc-use-apply_to_page_range_batch-in-alloc_vm_area.patch
xen-mmu-use-apply_to_page_range_batch-in-xen_remap_domain_mfn_range.patch
xen-grant-table-use-apply_to_page_range_batch.patch
memsw-remove-noswapaccount-kernel-parameter.patch
mm-batch-activate_page-to-reduce-lock-contention.patch
xattrh-expose-string-defines-to-userspace.patch
frv-duplicate-output_buffer-of-e03.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
hpet-factor-timer-allocate-from-open.patch
arch-alpha-include-asm-ioh-s-extern-inline-static-inline.patch
bluetooth-fix-build-warnings-on-defconfigs.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-checkpatch-fixes.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-fix.patch
lib-vsprintfc-fix-interaction-of-kasprintf-and-vsnprintf-when-using-%pv.patch
fcntlf_setfl-allow-setting-of-o_sync.patch
lru_cache-use-correct-type-in-sizeof-for-allocation.patch
lru_cache-use-correct-type-in-sizeof-for-allocation-fix.patch
lib-add-kstrto_from_user.patch
lib-consolidate-debug_per_cpu_maps.patch
include-linux-genalloch-add-multiple-inclusion-guards.patch
lib-genallocc-add-support-for-specifying-the-physical-address.patch
lib-genpoolc-document-return-values-fix-gen_pool_add_virt-return-value.patch
percpu_counter-change-return-value-and-add-comments.patch
percpu_counter-change-return-value-and-add-comments-fix.patch
checkpatch-add-check-for-line-continuations-in-quoted-strings.patch
lib-hexdumpc-make-hex2bin-return-the-updated-src-address.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix-fix.patch
fs-ncpfs-inodec-suppress-used-uninitialised-warning.patch
vt-add-k_off-return-value-to-vt_ioctl-kdgkbmode.patch
drivers-tty-vt-vt_ioctlc-repair-insane-expression.patch
rtc-add-support-for-the-rtc-in-via-vt8500-and-compatibles.patch
rtc-add-em3027-rtc-driver.patch
rtc-add-rv3029c2-rtc-support.patch
rtc-add-basic-support-for-st-m41t93-spi-rtc.patch
drivers-rtc-rtc-mrstc-use-release_mem_region-after-request_mem_region.patch
drivers-rtc-rtc-mrstc-use-release_mem_region-after-request_mem_region-fix.patch
rtc-driver-for-pt7c4338-chip.patch
rtc-driver-for-pt7c4338-chip-checkpatch-fixes.patch
rtc-driver-for-pt7c4338-chip-fix.patch
gpio-add-new-altera-pio-driver.patch
gpio-add-new-altera-pio-driver-update.patch
gpio-make-gpio_requestfree_array-gpio-array-parameter-const.patch
jbd-remove-dependency-on-__gfp_nofail.patch
ufs-truncated-values-handling-64-bit-metadata.patch
documentation-atomic_opstxt-avoid-volatile-in-sample-code.patch
documentation-accounting-getdelaysc-fix-unused-var-warning.patch
documentation-accounting-getdelaysc-handle-sendto-failures.patch
cgroups-read-write-lock-clone_thread-forking-per-threadgroup.patch
cgroups-add-per-thread-subsystem-callbacks.patch
cgroups-make-procs-file-writable.patch
cgroups-use-flex_array-in-attach_proc.patch
cgroup-remove-the-ns_cgroup.patch
mm-move-enum-vm_event_item-into-a-standalone-header-file.patch
memcg-count-the-soft_limit-reclaim-in-global-background-reclaim.patch
memcg-add-stats-to-monitor-soft_limit-reclaim.patch
add-the-pagefault-count-into-memcg-stats.patch
add-the-pagefault-count-into-memcg-stats-fix.patch
memcg-remove-pointless-next_mz-nullification-in-mem_cgroup_soft_limit_reclaim.patch
memcg-mark-init_section_page_cgroup-properly.patch
memcg-fix-off-by-one-when-calculating-swap-cgroup-map-length.patch
memcg-move-page-freeing-code-out-of-lock.patch
maintainers-add-mm-page_cgroupc-into-memcg-subsystem.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node.patch
signal-introduce-retarget_shared_pending.patch
signal-retarget_shared_pending-consider-shared-unblocked-signals-only.patch
signal-sigprocmask-narrow-the-scope-of-sigloc.patch
signal-sigprocmask-should-do-retarget_shared_pending.patch
x86-signal-handle_signal-should-use-sigprocmask.patch
x86-signal-sys_rt_sigreturn-should-use-sigprocmask.patch
kstrtox-convert-fs-proc.patch
proc-constify-status-array.patch
proc-stat-use-defined-macro-kmalloc_max_size.patch
dev-kmsg-properly-support-writev-to-avoid-interleaved-printk-lines.patch
dev-kmsg-properly-support-writev-to-avoid-interleaved-printk-lines-fix.patch
fs-partitions-efic-corrupted-guid-partition-tables-can-cause-kernel-oops.patch
fs-partitions-efic-corrupted-guid-partition-tables-can-cause-kernel-oops-fix.patch
sysctl-add-proc_dointvec_bool-handler.patch
sysctl-use-proc_dointvec_bool-where-appropriate.patch
sysctl-add-proc_dointvec_unsigned-handler.patch
sysctl-use-proc_dointvec_unsigned-where-appropriate.patch
pid-fix-typo-in-function-description.patch
fs-execc-provide-the-correct-process-pid-to-the-pipe-helper.patch
scatterlist-new-helper-functions.patch
scatterlist-new-helper-functions-update.patch
scatterlist-new-helper-functions-update-fix.patch
memstick-add-support-for-legacy-memorysticks.patch
memstick-add-support-for-legacy-memorysticks-update-2.patch
w1-add-1-wire-w1-reset-and-resume-command-api-support.patch
w1-add-1-wire-w1-ds2408-8-channel-addressable-switch-support.patch
w1-complete-the-1-wire-w1-ds1wm-driver-search-algorithm.patch
kexec-remove-kmsg_dump_kexec.patch
kexec-remove-kmsg_dump_kexec-fix.patch
make-sure-nobodys-leaking-resources.patch
journal_add_journal_head-debug.patch
releasing-resources-with-children.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch
mutex-subsystem-synchro-test-module-fix.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
add-debugging-aid-for-memory-initialisation-problems.patch
workaround-for-a-pci-restoring-bug.patch
prio_tree-debugging-patch.patch
single_open-seq_release-leak-diagnostics.patch
add-a-refcount-check-in-dput.patch
memblock-add-input-size-checking-to-memblock_find_region.patch
memblock-add-input-size-checking-to-memblock_find_region-fix.patch


2011-04-15 14:58:02

by Valdis Klētnieks

Subject: mmotm 2011-04-14 - lockdep splats in sched.c during boot

On Thu, 14 Apr 2011 15:08:47 PDT, [email protected] said:
> The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/

This throws at least two complaints about lockdep on the way up. I've had
several complete hangs as well last night during boot following a WARN in
sched.c, but didn't have netconsole or a camera handy at the time. Will follow up if I
catch one. Both whinges point at a 'for_each_domain()'. Not sure why I
haven't seen mention on lkml before - what am I doing different?

Splat number 1:
[ 0.044382] smpboot cpu 1: start_ip = 99000
[ 0.002999] calibrate_delay_direct() timer_rate_max=2526877 timer_rate_min=2526840 pre_start=520283431585 pre_end=520308700132
[ 0.002999] calibrate_delay_direct() timer_rate_max=2526857 timer_rate_min=2526829 pre_start=520313753438 pre_end=520339021871
[ 0.002999] calibrate_delay_direct() timer_rate_max=2526851 timer_rate_min=2526824 pre_start=520344075709 pre_end=520369344094
[ 0.002999] calibrate_delay_direct() timer_rate_max=2526862 timer_rate_min=2526834 pre_start=520374397819 pre_end=520399666308
[ 0.002999] calibrate_delay_direct() timer_rate_max=2526864 timer_rate_min=2526836 pre_start=520404719957 pre_end=520429988465
[ 0.116010]
[ 0.116011] ===================================================
[ 0.116989] [ INFO: suspicious rcu_dereference_check() usage. ]
[ 0.116989] ---------------------------------------------------
[ 0.116989] kernel/sched.c:2426 invoked rcu_dereference_check() without protection!
[ 0.116989]
[ 0.116989] other info that might help us debug this:
[ 0.116989]
[ 0.116989]
[ 0.116989] rcu_scheduler_active = 1, debug_locks = 1
[ 0.116989] 2 locks held by swapper/1:
[ 0.116989] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810394d2>] cpu_maps_update_begin+0x12/0x14
[ 0.116989] #1: (&p->pi_lock){-.....}, at: [<ffffffff81032959>] try_to_wake_up+0x29/0x1aa
[ 0.116989]
[ 0.116989] stack backtrace:
[ 0.116989] Pid: 1, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
[ 0.116989] Call Trace:
[ 0.116989] [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
[ 0.116989] [<ffffffff8102acd0>] ttwu_stat+0xcc/0xf5
[ 0.116989] [<ffffffff81032ab5>] try_to_wake_up+0x185/0x1aa
[ 0.116989] [<ffffffff81b5540a>] ? migration_call+0x9e/0xd0
[ 0.116989] [<ffffffff81564643>] ? _raw_spin_unlock_irqrestore+0x46/0x80
[ 0.116989] [<ffffffff81032b06>] wake_up_process+0x10/0x12
[ 0.116989] [<ffffffff81b56207>] cpu_stop_cpu_callback+0xe5/0x11b
[ 0.116989] [<ffffffff81567abe>] notifier_call_chain+0x54/0x81
[ 0.116989] [<ffffffff810596bc>] __raw_notifier_call_chain+0x9/0xb
[ 0.116989] [<ffffffff815434d1>] __cpu_notify+0x1b/0x2d
[ 0.116989] [<ffffffff81b55709>] _cpu_up.constprop.0+0xd1/0xe5
[ 0.116989] [<ffffffff81b55757>] cpu_up+0x3a/0x47
[ 0.116989] [<ffffffff81b2f3d2>] smp_init+0x41/0x93
[ 0.116989] [<ffffffff81b1dbc5>] kernel_init+0x9d/0x15b
[ 0.116989] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
[ 0.116989] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
[ 0.116989] [<ffffffff81b1db28>] ? start_kernel+0x394/0x394
[ 0.116989] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
[ 0.117089] NMI watchdog enabled, takes one hw-pmu counter.
[ 0.119006] Brought up 2 CPUs

Splat number 2:
[ 1.179319] netconsole: remote ethernet address 00:b0:d0:c3:bd:a7
[ 1.179430] netconsole: device eth0 not up yet, forcing it
[ 1.247705] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
[ 1.298111] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
[ 1.298312]
[ 1.298313] ===================================================
[ 1.298516] [ INFO: suspicious rcu_dereference_check() usage. ]
[ 1.298623] ---------------------------------------------------
[ 1.298731] kernel/sched.c:1211 invoked rcu_dereference_check() without protection!
[ 1.298858]
[ 1.298858] other info that might help us debug this:
[ 1.298859]
[ 1.299152]
[ 1.299152] rcu_scheduler_active = 1, debug_locks = 1
[ 1.299294] 1 lock held by swapper/0:
[ 1.299294] #0: (&(&base->lock)->rlock){-.-.-.}, at: [<ffffffff810443fd>] lock_timer_base+0x49/0x92
[ 1.299294]
[ 1.299294] stack backtrace:
[ 1.299294] Pid: 0, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
[ 1.299294] Call Trace:
[ 1.299294] <IRQ> [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
[ 1.299294] [<ffffffff810337a7>] get_nohz_timer_target+0x79/0xbe
[ 1.299294] [<ffffffff810452ec>] __mod_timer+0xc7/0x16d
[ 1.299294] [<ffffffff810454bf>] mod_timer+0x87/0x8e
[ 1.299294] [<ffffffff8130814c>] e1000_intr_msi+0xa2/0xef
[ 1.299294] [<ffffffff8108acab>] handle_irq_event_percpu+0xba/0x29f
[ 1.299294] [<ffffffff8108aecc>] handle_irq_event+0x3c/0x5c
[ 1.299294] [<ffffffff810193c6>] ? ack_APIC_irq+0x10/0x12
[ 1.299294] [<ffffffff8108d197>] handle_edge_irq+0xf4/0x121
[ 1.299294] [<ffffffff810031aa>] handle_irq+0x122/0x133
[ 1.299294] [<ffffffff81002fdf>] do_IRQ+0x48/0xa0
[ 1.299294] [<ffffffff81564cd3>] common_interrupt+0x13/0x13
[ 1.299294] <EOI> [<ffffffff81008009>] ? default_idle+0x52/0x89
[ 1.299294] [<ffffffff81008007>] ? default_idle+0x50/0x89
[ 1.299294] [<ffffffff8100084c>] cpu_idle+0x87/0x102
[ 1.299294] [<ffffffff81535587>] rest_init+0xcb/0xd2
[ 1.299294] [<ffffffff815354bc>] ? csum_partial_copy_generic+0x16c/0x16c
[ 1.299294] [<ffffffff81b1db1d>] start_kernel+0x389/0x394
[ 1.299294] [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3
[ 1.299294] [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7
[ 1.309814] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)



2011-04-15 15:50:04

by Randy Dunlap

Subject: Re: mmotm 2011-04-14-15-08 uploaded (leds)

On Thu, 14 Apr 2011 15:08:47 -0700 [email protected] wrote:

> The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/
>
> and will soon be available at
>
> git://zen-kernel.org/kernel/mmotm.git


from leds-support-automatic-start-of-blinking-with-ledtrig-timer.patch:

This build error happened several times when
CONFIG_LEDS_TRIGGERS is not enabled:

drivers/leds/led-class.c:134: error: 'struct led_classdev' has no member named 'trigger_data'
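
The error is consistent with `trigger_data` being declared in `struct led_classdev` only under `#ifdef CONFIG_LEDS_TRIGGERS`, so any unguarded use in led-class.c breaks the build when triggers are off. A minimal userspace sketch of guarding such an access follows; the struct is a cut-down stand-in and `led_get_trigger_data()` is a hypothetical helper, not something taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Uncomment to mimic a kernel built with CONFIG_LEDS_TRIGGERS=y. */
/* #define CONFIG_LEDS_TRIGGERS 1 */

/* Cut-down stand-in for the kernel's struct led_classdev (illustration only). */
struct led_classdev {
	const char *name;
#ifdef CONFIG_LEDS_TRIGGERS
	void *trigger_data;	/* exists only when triggers are configured */
#endif
};

/*
 * Hypothetical helper: every access to the conditional member sits behind
 * the same #ifdef as its declaration, so both configurations compile.
 */
static void *led_get_trigger_data(struct led_classdev *led_cdev)
{
#ifdef CONFIG_LEDS_TRIGGERS
	return led_cdev->trigger_data;
#else
	(void)led_cdev;		/* nothing to read without triggers */
	return NULL;
#endif
}
```

With the option left undefined, as above, the helper simply returns NULL instead of touching a member that does not exist.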


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-04-15 15:52:44

by Peter Zijlstra

Subject: Re: mmotm 2011-04-14 - lockdep splats in sched.c during boot

On Fri, 2011-04-15 at 10:57 -0400, [email protected] wrote:
> On Thu, 14 Apr 2011 15:08:47 PDT, [email protected] said:
> > The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
>
> This throws at least two complaints about lockdep on the way up. I've had
> several complete hangs as well last night during boot following a WARN in
> sched.c, but didn't have netconsole or a camera handy at the time. Will follow up if I
> catch one.

That would be most appreciated, I merged two large series of scheduler
patches.

> Both whinges point at a 'for_each_domain()'. Not sure why I
> haven't seen mention on lkml before - what am I doing different?

Probably running a very fresh kernel..

> Splat number 1:
> [ 0.044382] smpboot cpu 1: start_ip = 99000
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526877 timer_rate_min=2526840 pre_start=520283431585 pre_end=520308700132
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526857 timer_rate_min=2526829 pre_start=520313753438 pre_end=520339021871
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526851 timer_rate_min=2526824 pre_start=520344075709 pre_end=520369344094
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526862 timer_rate_min=2526834 pre_start=520374397819 pre_end=520399666308
> [ 0.002999] calibrate_delay_direct() timer_rate_max=2526864 timer_rate_min=2526836 pre_start=520404719957 pre_end=520429988465
> [ 0.116010]
> [ 0.116011] ===================================================
> [ 0.116989] [ INFO: suspicious rcu_dereference_check() usage. ]
> [ 0.116989] ---------------------------------------------------
> [ 0.116989] kernel/sched.c:2426 invoked rcu_dereference_check() without protection!
> [ 0.116989]
> [ 0.116989] other info that might help us debug this:
> [ 0.116989]
> [ 0.116989]
> [ 0.116989] rcu_scheduler_active = 1, debug_locks = 1
> [ 0.116989] 2 locks held by swapper/1:
> [ 0.116989] #0: (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff810394d2>] cpu_maps_update_begin+0x12/0x14
> [ 0.116989] #1: (&p->pi_lock){-.....}, at: [<ffffffff81032959>] try_to_wake_up+0x29/0x1aa
> [ 0.116989]
> [ 0.116989] stack backtrace:
> [ 0.116989] Pid: 1, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 0.116989] Call Trace:
> [ 0.116989] [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
> [ 0.116989] [<ffffffff8102acd0>] ttwu_stat+0xcc/0xf5
> [ 0.116989] [<ffffffff81032ab5>] try_to_wake_up+0x185/0x1aa
> [ 0.116989] [<ffffffff81b5540a>] ? migration_call+0x9e/0xd0
> [ 0.116989] [<ffffffff81564643>] ? _raw_spin_unlock_irqrestore+0x46/0x80
> [ 0.116989] [<ffffffff81032b06>] wake_up_process+0x10/0x12
> [ 0.116989] [<ffffffff81b56207>] cpu_stop_cpu_callback+0xe5/0x11b
> [ 0.116989] [<ffffffff81567abe>] notifier_call_chain+0x54/0x81
> [ 0.116989] [<ffffffff810596bc>] __raw_notifier_call_chain+0x9/0xb
> [ 0.116989] [<ffffffff815434d1>] __cpu_notify+0x1b/0x2d
> [ 0.116989] [<ffffffff81b55709>] _cpu_up.constprop.0+0xd1/0xe5
> [ 0.116989] [<ffffffff81b55757>] cpu_up+0x3a/0x47
> [ 0.116989] [<ffffffff81b2f3d2>] smp_init+0x41/0x93
> [ 0.116989] [<ffffffff81b1dbc5>] kernel_init+0x9d/0x15b
> [ 0.116989] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
> [ 0.116989] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
> [ 0.116989] [<ffffffff81b1db28>] ? start_kernel+0x394/0x394
> [ 0.116989] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
> [ 0.117089] NMI watchdog enabled, takes one hw-pmu counter.
> [ 0.119006] Brought up 2 CPUs
>
> Splat number 2:
> [ 1.179319] netconsole: remote ethernet address 00:b0:d0:c3:bd:a7
> [ 1.179430] netconsole: device eth0 not up yet, forcing it
> [ 1.247705] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
> [ 1.298111] e1000e 0000:00:19.0: irq 46 for MSI/MSI-X
> [ 1.298312]
> [ 1.298313] ===================================================
> [ 1.298516] [ INFO: suspicious rcu_dereference_check() usage. ]
> [ 1.298623] ---------------------------------------------------
> [ 1.298731] kernel/sched.c:1211 invoked rcu_dereference_check() without protection!
> [ 1.298858]
> [ 1.298858] other info that might help us debug this:
> [ 1.298859]
> [ 1.299152]
> [ 1.299152] rcu_scheduler_active = 1, debug_locks = 1
> [ 1.299294] 1 lock held by swapper/0:
> [ 1.299294] #0: (&(&base->lock)->rlock){-.-.-.}, at: [<ffffffff810443fd>] lock_timer_base+0x49/0x92
> [ 1.299294]
> [ 1.299294] stack backtrace:
> [ 1.299294] Pid: 0, comm: swapper Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 1.299294] Call Trace:
> [ 1.299294] <IRQ> [<ffffffff81065bfc>] lockdep_rcu_dereference+0x9b/0xa4
> [ 1.299294] [<ffffffff810337a7>] get_nohz_timer_target+0x79/0xbe
> [ 1.299294] [<ffffffff810452ec>] __mod_timer+0xc7/0x16d
> [ 1.299294] [<ffffffff810454bf>] mod_timer+0x87/0x8e
> [ 1.299294] [<ffffffff8130814c>] e1000_intr_msi+0xa2/0xef
> [ 1.299294] [<ffffffff8108acab>] handle_irq_event_percpu+0xba/0x29f
> [ 1.299294] [<ffffffff8108aecc>] handle_irq_event+0x3c/0x5c
> [ 1.299294] [<ffffffff810193c6>] ? ack_APIC_irq+0x10/0x12
> [ 1.299294] [<ffffffff8108d197>] handle_edge_irq+0xf4/0x121
> [ 1.299294] [<ffffffff810031aa>] handle_irq+0x122/0x133
> [ 1.299294] [<ffffffff81002fdf>] do_IRQ+0x48/0xa0
> [ 1.299294] [<ffffffff81564cd3>] common_interrupt+0x13/0x13
> [ 1.299294] <EOI> [<ffffffff81008009>] ? default_idle+0x52/0x89
> [ 1.299294] [<ffffffff81008007>] ? default_idle+0x50/0x89
> [ 1.299294] [<ffffffff8100084c>] cpu_idle+0x87/0x102
> [ 1.299294] [<ffffffff81535587>] rest_init+0xcb/0xd2
> [ 1.299294] [<ffffffff815354bc>] ? csum_partial_copy_generic+0x16c/0x16c
> [ 1.299294] [<ffffffff81b1db1d>] start_kernel+0x389/0x394
> [ 1.299294] [<ffffffff81b1d29f>] x86_64_start_reservations+0xaf/0xb3
> [ 1.299294] [<ffffffff81b1d393>] x86_64_start_kernel+0xf0/0xf7
> [ 1.309814] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>


The below should cure those two I think.

---
kernel/sched.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 0cfe031..cd06b53 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1208,11 +1208,13 @@ int get_nohz_timer_target(void)
 	int i;
 	struct sched_domain *sd;
 
+	rcu_read_lock();
 	for_each_domain(cpu, sd) {
 		for_each_cpu(i, sched_domain_span(sd))
 			if (!idle_cpu(i))
 				return i;
 	}
+	rcu_read_unlock();
 	return cpu;
 }
 /*
@@ -2415,12 +2417,14 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 		struct sched_domain *sd;
 
 		schedstat_inc(p, se.statistics.nr_wakeups_remote);
+		rcu_read_lock();
 		for_each_domain(this_cpu, sd) {
 			if (cpumask_test_cpu(cpu, sched_domain_span(sd))) {
 				schedstat_inc(sd, ttwu_wake_remote);
 				break;
 			}
 		}
+		rcu_read_unlock();
 	}
 #endif /* CONFIG_SMP */
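
Note that in the first hunk the `return i;` inside the loop leaves the function with the RCU read lock still held. A minimal userspace sketch of an unlock-on-every-path variant follows. The `rcu_*` stubs, `cpu_is_idle[]` and `pick_timer_target()` are stand-ins invented for illustration; this is not kernel code and not necessarily the final form of the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the RCU primitives; they only track nesting. */
static int rcu_depth;

static void rcu_read_lock(void)
{
	rcu_depth++;
}

static void rcu_read_unlock(void)
{
	assert(rcu_depth > 0);	/* catch an unlock without a matching lock */
	rcu_depth--;
}

#define NR_CPUS 4

/* Hypothetical idle map standing in for the scheduler's idle_cpu(). */
static bool cpu_is_idle[NR_CPUS] = { true, true, false, true };

static bool idle_cpu(int cpu)
{
	return cpu_is_idle[cpu];
}

/*
 * Same idea as the search in get_nohz_timer_target(), flattened to one loop
 * over CPUs: pick the first non-idle CPU, else fall back to the caller's CPU.
 * Every exit goes through the unlock label, so the read-side section is
 * always closed.
 */
static int pick_timer_target(int cpu)
{
	int i;

	rcu_read_lock();
	for (i = 0; i < NR_CPUS; i++) {
		if (!idle_cpu(i)) {
			cpu = i;
			goto unlock;
		}
	}
unlock:
	rcu_read_unlock();
	return cpu;
}
```

After any call to `pick_timer_target()`, `rcu_depth` is back at zero, which is the property the early return in the hunk above would violate.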

2011-04-15 16:12:05

by Randy Dunlap

Subject: Re: mmotm 2011-04-14-15-08 uploaded (staging/gma500)

On Thu, 14 Apr 2011 15:08:47 -0700 [email protected] wrote:

> The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/
>
> and will soon be available at
>
> git://zen-kernel.org/kernel/mmotm.git
>
> It contains the following patches against 2.6.39-rc3:


ERROR: "__bad_udelay" [drivers/staging/gma500/psb_gfx.ko] undefined!

on x86_64

in psb_intel_wait_for_vblank().
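
`__bad_udelay` is, as far as I know, a symbol that the x86 delay header references only when `udelay()` is handed a compile-time constant it considers too large, and that is never defined anywhere, so the mistake surfaces as an undefined symbol at modpost/link time. A rough userspace sketch of that guard pattern, assuming a GCC-compatible compiler for `__builtin_constant_p`; the 20000 limit and all `*_sketch` names are assumptions made for illustration:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_CONST_UDELAY 20000UL	/* assumed limit for this sketch */

static unsigned long delayed_us;	/* total microseconds "delayed" so far */

/*
 * In the kernel the equivalent symbol is declared but left undefined, so a
 * surviving reference fails the link. It is defined here (and aborts) only
 * so that the sketch can be built and run in userspace.
 */
static void bad_udelay_sketch(void)
{
	abort();
}

/* Stand-in for the real delay loop: just account for the request. */
static void do_udelay_sketch(unsigned long usecs)
{
	delayed_us += usecs;
}

/*
 * Constant arguments above the limit are routed to the "bad" symbol;
 * everything else reaches the real delay function.
 */
#define udelay_sketch(n)						\
	((__builtin_constant_p(n) && (n) > MAX_CONST_UDELAY)		\
		? bad_udelay_sketch()					\
		: do_udelay_sketch(n))
```

Calling `udelay_sketch(100)` takes the normal path; a constant above the assumed limit would select `bad_udelay_sketch()`, which in the kernel's version of the pattern is what leaves the unresolved reference behind.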

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-04-15 18:56:36

by Valdis Klētnieks

Subject: Re: mmotm 2011-04-14 - hangs during boot.

On Fri, 15 Apr 2011 10:57:09 EDT, [email protected] said:
> On Thu, 14 Apr 2011 15:08:47 PDT, [email protected] said:
> > The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
>
> This throws at least two complaints about lockdep on the way up. I've had
> several complete hangs as well last night during boot

Caught them. Not sure how the WARN_ON_ONCE it's hitting just before it hangs
is related to the actual hang, but I'm betting it's the kernel's last plaintive
cry for help before everything grinds to a halt.

First boot:

[ 3.852927] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10
[ 3.859723] ------------[ cut here ]------------
[ 3.859859] WARNING: at kernel/workqueue.c:1217 worker_enter_idle+0x168/0x19f()
[ 3.859984] Hardware name: Latitude E6500
[ 3.860089] Modules linked in:
[ 3.860308] Pid: 11, comm: kworker/1:0 Not tainted 2.6.39-rc3-mmotm0414 #1
[ 3.860428] Call Trace:
[ 3.860531] [<ffffffff81037c62>] warn_slowpath_common+0x7e/0x96
[ 3.860640] [<ffffffff81037c8f>] warn_slowpath_null+0x15/0x17
[ 3.860677] [<ffffffff8104e12d>] worker_enter_idle+0x168/0x19f
[ 3.860677] [<ffffffff81050d3f>] worker_thread+0x1ed/0x206
[ 3.860677] [<ffffffff81050b52>] ? manage_workers+0xc0/0xc0
[ 3.860677] [<ffffffff8105472e>] kthread+0x7f/0x87
[ 3.860677] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
[ 3.860677] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
[ 3.860677] [<ffffffff810546af>] ? __init_kthread_worker+0x55/0x55
[ 3.860677] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
[ 3.860677] ---[ end trace 64d29d8be7ad450b ]---

and wham it was dead hard at that point, no further output. Next boot, it hit again,
and lived a while longer:

[ 3.983798] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10
[ 3.993411] ------------[ cut here ]------------
[ 3.993531] WARNING: at kernel/workqueue.c:1217 worker_enter_idle+0x168/0x19f()
[ 3.993658] Hardware name: Latitude E6500
[ 3.993764] Modules linked in:
[ 3.993951] Pid: 482, comm: kworker/1:1 Not tainted 2.6.39-rc3-mmotm0414 #1
[ 3.994065] Call Trace:
[ 3.994173] [<ffffffff81037c62>] warn_slowpath_common+0x7e/0x96
[ 3.994282] [<ffffffff81037c8f>] warn_slowpath_null+0x15/0x17
[ 3.994381] [<ffffffff8104e12d>] worker_enter_idle+0x168/0x19f
[ 3.994381] [<ffffffff81050d3f>] worker_thread+0x1ed/0x206
[ 3.994381] [<ffffffff81050b52>] ? manage_workers+0xc0/0xc0
[ 3.994381] [<ffffffff8105472e>] kthread+0x7f/0x87
[ 3.994381] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
[ 3.994381] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
[ 3.994381] [<ffffffff810546af>] ? __init_kthread_worker+0x55/0x55
[ 3.994381] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
[ 3.994381] ---[ end trace 604fcd3646d16bcd ]---
[ 4.141467] udevadm used greatest stack depth: 4352 bytes left
[ 4.225220] usb 1-4.1: new low speed USB device number 5 using ehci_hcd
[ 4.326710] usb 1-4.1: New USB device found, idVendor=045e, idProduct=0023
[ 4.326831] usb 1-4.1: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 4.326959] usb 1-4.1: Product: Microsoft Trackball Optical®
[ 4.327105] usb 1-4.1: Manufacturer: Microsoft
[ 4.338620] input: Microsoft Microsoft Trackball Optical® as /devices/pci0000:00/0000:00:1a.7/sb1/1-4/1-4.1/1-4.1:1.0/input/input11
[ 4.339775] generic-usb 0003:045E:0023.0001: input,hidraw0: USB HID v1.00 Mouse [Microsoft Microsoft Trackball Optical®] on usb-0000:00:1a.7-4.1/input0
[ 4.413229] usb 1-4.2: new full speed USB device number 6 using ehci_hcd
[ 4.467320] dracut: luksOpen /dev/sda2 luks-715ceabf-6f58-4251-9373-ed29e8629a7c
[ 4.498856] usb 1-4.2: New USB device found, idVendor=0451, idProduct=1446
[ 4.498998] usb 1-4.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[ 4.500490] hub 1-4.2:1.0: USB hub found
[ 4.500704] hub 1-4.2:1.0: 4 ports detected
[ 4.707088] usb 5-1: new full speed USB device number 2 using uhci_hcd
[ 4.875565] usb 5-1: New USB device found, idVendor=0a5c, idProduct=5800
[ 4.875684] usb 5-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 4.875805] usb 5-1: Product: 5880

but it then didn't accept keyboard input from the USB keyboard and acted pretty dead.

Any ideas, or am I looking at a weekend of bisecting? ;)




2011-04-15 19:04:00

by Peter Zijlstra

Subject: Re: mmotm 2011-04-14 - hangs during boot.

On Fri, 2011-04-15 at 14:53 -0400, [email protected] wrote:
> On Fri, 15 Apr 2011 10:57:09 EDT, [email protected] said:
> > On Thu, 14 Apr 2011 15:08:47 PDT, [email protected] said:
> > > The mm-of-the-moment snapshot 2011-04-14-15-08 has been uploaded to
> > >
> > > http://userweb.kernel.org/~akpm/mmotm/
> >
> > This throws at least two complaints about lockdep on the way up. I've had
> > several complete hangs as well last night during boot
>
> Caught them. Not sure how the WARN_ON_ONCE it's hitting just before it hangs
> is related to the actual hang, but I'm betting it's the kernel's last plaintive
> cry for help before everything grinds to a halt.
>
> First boot:

> [ 3.859723] ------------[ cut here ]------------
> [ 3.859859] WARNING: at kernel/workqueue.c:1217 worker_enter_idle+0x168/0x19f()
> [ 3.859984] Hardware name: Latitude E6500
> [ 3.860089] Modules linked in:
> [ 3.860308] Pid: 11, comm: kworker/1:0 Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 3.860428] Call Trace:
> [ 3.860531] [<ffffffff81037c62>] warn_slowpath_common+0x7e/0x96
> [ 3.860640] [<ffffffff81037c8f>] warn_slowpath_null+0x15/0x17
> [ 3.860677] [<ffffffff8104e12d>] worker_enter_idle+0x168/0x19f
> [ 3.860677] [<ffffffff81050d3f>] worker_thread+0x1ed/0x206
> [ 3.860677] [<ffffffff81050b52>] ? manage_workers+0xc0/0xc0
> [ 3.860677] [<ffffffff8105472e>] kthread+0x7f/0x87
> [ 3.860677] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
> [ 3.860677] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
> [ 3.860677] [<ffffffff810546af>] ? __init_kthread_worker+0x55/0x55
> [ 3.860677] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
> [ 3.860677] ---[ end trace 64d29d8be7ad450b ]---
>
> and wham, it was dead hard at that point, with no further output. Next boot, it hit again,
> and lived a while longer:

> [ 3.993411] ------------[ cut here ]------------
> [ 3.993531] WARNING: at kernel/workqueue.c:1217 worker_enter_idle+0x168/0x19f()
> [ 3.993658] Hardware name: Latitude E6500
> [ 3.993764] Modules linked in:
> [ 3.993951] Pid: 482, comm: kworker/1:1 Not tainted 2.6.39-rc3-mmotm0414 #1
> [ 3.994065] Call Trace:
> [ 3.994173] [<ffffffff81037c62>] warn_slowpath_common+0x7e/0x96
> [ 3.994282] [<ffffffff81037c8f>] warn_slowpath_null+0x15/0x17
> [ 3.994381] [<ffffffff8104e12d>] worker_enter_idle+0x168/0x19f
> [ 3.994381] [<ffffffff81050d3f>] worker_thread+0x1ed/0x206
> [ 3.994381] [<ffffffff81050b52>] ? manage_workers+0xc0/0xc0
> [ 3.994381] [<ffffffff8105472e>] kthread+0x7f/0x87
> [ 3.994381] [<ffffffff8156bb94>] kernel_thread_helper+0x4/0x10
> [ 3.994381] [<ffffffff81564d84>] ? retint_restore_args+0xe/0xe
> [ 3.994381] [<ffffffff810546af>] ? __init_kthread_worker+0x55/0x55
> [ 3.994381] [<ffffffff8156bb90>] ? gs_change+0xb/0xb
> [ 3.994381] ---[ end trace 604fcd3646d16bcd ]---

> but it then didn't accept keyboard input from the USB keyboard and acted pretty dead.
>
> Any ideas, or am I looking at a weekend of bisecting? ;)

Does your kernel contain c2f7115e2e52a6c187b8c1f54f0e4970bb677be0 ? If
not, mmotm is based on an old -next and should upgrade ;-)


2011-04-15 19:29:23

by Valdis Klētnieks

Subject: Re: mmotm 2011-04-14 - hangs during boot.

On Fri, 15 Apr 2011 21:06:22 +0200, Peter Zijlstra said:

> Does your kernel contain c2f7115e2e52a6c187b8c1f54f0e4970bb677be0 ? If
> not, mmotm is based on an old -next and should upgrade ;-)

Seems to include only so far:

GIT 9e06a6ea7f6cd992bcaa8c469d3831cb9d7d71d9 git+ssh://master.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git

commit 9e06a6ea7f6cd992bcaa8c469d3831cb9d7d71d9
Author: Stephen Rothwell <[email protected]>
Date: Thu Apr 14 14:34:58 2011 +1000

Add linux-next specific files for 20110414

Signed-off-by: Stephen Rothwell <[email protected]>

I'll see if I can cherrypick that c2f711 commit out and apply it, or have you been churning
it that much in the last 24 hours? :)




2011-04-15 19:36:20

by Peter Zijlstra

Subject: Re: mmotm 2011-04-14 - hangs during boot.

On Fri, 2011-04-15 at 15:28 -0400, [email protected] wrote:
> I'll see if I can cherrypick that c2f711 commit out and apply it, or
> have you been churning
> it that much in the last 24 hours? :)

I'm afraid it doesn't cleanly apply, I stuck it somewhere in the head of
the series and rebased the tail on top to make it all bisectable etc..

2011-04-15 19:39:18

by Valdis Klētnieks

Subject: Re: mmotm 2011-04-14 - hangs during boot.

On Fri, 15 Apr 2011 21:06:22 +0200, Peter Zijlstra said:
> Does your kernel contain c2f7115e2e52a6c187b8c1f54f0e4970bb677be0 ? If
> not, mmotm is based on an old -next and should upgrade ;-)

https://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Fsfr%2Flinux-next.git&a=search&h=HEAD&st=commit&s=c2f7115e2e52a6c187b8c1f54f0e4970bb677be0

says 'no match' against 20110415, so I think you and Steve Rothwell need to chat. ;)



2011-04-19 12:06:30

by Peter Zijlstra

Subject: [tip:sched/core] sched: Fix sched_domain iterations vs. RCU

Commit-ID: 057f3fadb347e9c51b07e1b277bbdda79f976768
Gitweb: http://git.kernel.org/tip/057f3fadb347e9c51b07e1b277bbdda79f976768
Author: Peter Zijlstra <[email protected]>
AuthorDate: Mon, 18 Apr 2011 11:24:34 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Tue, 19 Apr 2011 10:56:54 +0200

sched: Fix sched_domain iterations vs. RCU

Valdis Kletnieks reported a new RCU debug warning in the scheduler.

Since commit dce840a08702b ("sched: Dynamically allocate sched_domain/
sched_group data-structures") the sched_domain trees are protected by
RCU instead of RCU-sched.

This means that we need to include rcu_read_lock() protection when we
iterate them since disabling preemption doesn't suffice anymore.

Reported-by: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/1302882741.2388.241.camel@twins
Signed-off-by: Ingo Molnar <[email protected]>
---
 kernel/sched.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 0cfe031..27d3e73 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -1208,11 +1208,17 @@ int get_nohz_timer_target(void)
 	int i;
 	struct sched_domain *sd;
 
+	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd))
-			if (!idle_cpu(i))
-				return i;
+		for_each_cpu(i, sched_domain_span(sd)) {
+			if (!idle_cpu(i)) {
+				cpu = i;
+				goto unlock;
+			}
+		}
 	}
+unlock:
+	rcu_read_unlock();
 	return cpu;
 }
 /*
@@ -2415,12 +2421,14 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 		struct sched_domain *sd;
 
 		schedstat_inc(p, se.statistics.nr_wakeups_remote);
+		rcu_read_lock();
 		for_each_domain(this_cpu, sd) {
 			if (cpumask_test_cpu(cpu, sched_domain_span(sd))) {
 				schedstat_inc(sd, ttwu_wake_remote);
 				break;
 			}
 		}
+		rcu_read_unlock();
 	}
 #endif /* CONFIG_SMP */
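
For readers following the first hunk: the old loop returned from inside the iteration, which would skip the unlock once the walk is wrapped in a read-side critical section, so the patch records the result and jumps to a single exit label instead. Below is a minimal userspace sketch of that control flow only. It is not kernel code: rcu_read_lock() and rcu_read_unlock() here are hypothetical stand-ins that just count nesting depth, and struct domain, pick_busy_cpu() and the idle[] array are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the kernel primitives: they only track
 * nesting depth so that an unbalanced lock/unlock pair is detectable.
 */
static int rcu_depth;

static void rcu_read_lock(void)
{
	rcu_depth++;
}

static void rcu_read_unlock(void)
{
	assert(rcu_depth > 0);	/* catch an unlock without a matching lock */
	rcu_depth--;
}

/* Illustrative model of a domain: a list of the CPU ids it spans. */
struct domain {
	const int *span;
	size_t nr;
};

/*
 * Return the first non-idle CPU found while walking the domains in
 * order, or this_cpu if every CPU in every domain is idle.
 */
static int pick_busy_cpu(int this_cpu, const struct domain *doms,
			 size_t nr_doms, const bool *idle)
{
	int cpu = this_cpu;
	size_t d, i;

	rcu_read_lock();
	for (d = 0; d < nr_doms; d++) {
		for (i = 0; i < doms[d].nr; i++) {
			int candidate = doms[d].span[i];

			if (!idle[candidate]) {
				cpu = candidate;
				/* A plain "return" here would leave the lock held. */
				goto unlock;
			}
		}
	}
unlock:
	rcu_read_unlock();
	return cpu;
}
```

With two domains spanning CPUs {0, 1} and {0, 1, 2, 3}, and only CPU 2 busy, pick_busy_cpu(0, ...) yields 2 and rcu_depth is back to 0 afterwards. The second hunk of the patch needs no such restructuring because its loop already exits with "break", which falls through to the unlock.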