2010-04-15 22:10:13

by Andrew Morton

Subject: mmotm 2010-04-15-14-42 uploaded

The mm-of-the-moment snapshot 2010-04-15-14-42 has been uploaded to

http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

git://zen-kernel.org/kernel/mmotm.git

It contains the following patches against 2.6.34-rc4:

origin.patch
reiserfs-fix-permissions-on-reiserfs_priv.patch
memcg-fix-prepare-migration.patch
reiserfs-fix-corruption-during-shrinking-of-xattrs.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
arch-x86-crypto-aesni-intel_asms-still-busted.patch
include-linux-fsh-complete-hexification-of-fmode_-constants.patch
lis3-add-support-for-hp-probook-432x-442x-452x-522x.patch
drivers-video-efifbc-framebuffer-for-nvidia-9400m-in-macbook-pro-51.patch
drivers-video-efifbc-framebuffer-for-nvidia-9400m-in-macbook-pro-51-fix.patch
cgroups-fix-procs-documentation.patch
initramfs-prevent-buffer-overflow-when-unpacking-to-rootfs.patch
bsdacct-use-del_timer_sync-in-acct_exit_ns.patch
fbdev-fix-kconfig-breakage-in-drivers-video.patch
rmap-add-exclusively-owned-pages-to-the-newest-anon_vma.patch
it8761e_gpio-fix-bug-in-gpio-numbering.patch
power_meter-acpi_device_class-power_meter_resource-too-long.patch
drivers-base-cpuc-fix-the-output-from-sys-devices-system-cpu-offline.patch
intel-agpc-fix-crash-when-accessing-nonexistent-gtt-entries-in-i915.patch
intel-agpc-fix-crash-when-accessing-nonexistent-gtt-entries-in-i915-checkpatch-fixes.patch
inotify-dont-leak-user-struct-on-inotify-release.patch
oprofile-remove-double-ring-buffering.patch
sched-prevent-compiler-from-optimising-sched_avg_update-loop.patch
keys-dont-need-to-use-rcu-in-keyring_read-as-semaphore-is-held.patch
drivers-serial-pmac_zilogc-add-missing-unlock.patch
iio-iio_get_new_idr_val-return-negative-value-on-failure.patch
iio-iio_get_new_idr_val-return-negative-value-on-failure-fix.patch
drivers-usb-gadget-s3c-hsotgc-add-missing-unlock.patch
acerhdf-add-new-bios-versions.patch
drivers-acpi-use-kasprintf.patch
drivers-acpi-use-kasprintf-fix.patch
sbshc-acpi_device_class-smbus_host_controller-too-long.patch
acpi_pad-processor_aggregator-name-too-long.patch
acpi_pad-processor_aggregator-name-too-long-fix.patch
x86-apic-ack-all-pending-irqs-when-crashed-on-kexec-v5.patch
arch-x86-pci-use-kasprintf.patch
x86-nosmp-command-line-option-should-force-the-system-into-up-mode.patch
arch-x86-kernel-setupcl-phoenix-bios-fixup-is-needed-on-dell-inspiron-mini-1012.patch
x86-remove-last-traces-of-quicklist-usage.patch
arch-x86-kernel-hpetc-make-the-hpet-compare-register-read-back-failed-warning-conditional-on-the-hpet=verbose-boot-option.patch
x86-fix-handling-of-the-reservetop-boot-option.patch
agp-amd64-fix-pci-reference-leaks.patch
arm-convert-proc-cpu-aligment-to-seq_file.patch
arch-arm-plat-pxa-dmac-correct-null-test.patch
arch-arm-include-asm-elfh-forward-declare-the-task-struct.patch
cifs-provide-user-with-a-hint-when-name-resolution-fails.patch
cifs-provide-user-with-a-hint-when-name-resolution-fails-fix.patch
dmaengine-support-for-st-ericssons-dma40-block-v3.patch
dmaengine-dma40-u8500-platform-configuration-v3.patch
jfs-free-sbi-memory-in-error-path.patch
powerpc-add-rcu_read_lock-to-gup_fast-implementation.patch
gpu-vga_switcheroo-fix-lock-imbalance.patch
drivers-gpu-drm-via-via_videoc-fix-off-by-one-issue.patch
drivers-gpu-drm-radeon-radeon_atombiosc-range-check-issues.patch
drivers-gpu-drm-drm_sysfsc-sysfs-files-error-handling.patch
drivers-gpu-drm-drm_memoryc-fix-check-for-end-of-loop.patch
dib3000mc-reduce-large-stack-usage.patch
dib7000p-reduce-large-stack-usage.patch
dvb-usb-gp8psk-fix-potential-null-derefernce.patch
drivers-media-video-avoid-null-dereference.patch
drivers-media-video-au0828-au0828-videoc-off-by-one-bug.patch
drivers-media-video-zc0301-zc0301_corec-improve-error-handling.patch
drivers-media-video-et61x251-et61x251_corec-improve-error-handling.patch
drivers-media-video-sn9c102-sn9c102_corec-improve-error-handling.patch
cx88-improve-error-handling.patch
ir-keytable-avoid-double-lock.patch
fs-fscache-object-listc-fix-warning-on-32-bit.patch
gpiolib-introduce-chip-addition-removal-notifier.patch
of-gpio-add-support-for-two-stage-registration-for-the-of_gpio_chips.patch
of-gpio-implement-gpiolib-notifier-hooks.patch
of-gpio-implement-gpiolib-notifier-hooks-fix.patch
of-gpio-implement-gpiolib-notifier-hooks-fix-fix2.patch
powerpc-mcu_mpc8349emitx-remove-of-gpio-handling-stuff.patch
gpiolib-cosmetic-improvements-for-error-handling-in-gpiochip_add.patch
cpu-timers-optimize-run_posix_cpu_timers.patch
time-remove-xtime_cache-take-2.patch
ati_remote-add-some-missing-devices-from-lirc_atiusb.patch
usbtouchscreen-support-bigger-inexio-touchscreens.patch
input-handle-bad-parity-ps-2-packets-in-mouse-drivers-better.patch
markup_oops-fix-perlcritic-warnings.patch
led-driver-for-the-soekris-net5501-board.patch
led-driver-for-the-soekris-net5501-board-checkpatch-fixes.patch
led-driver-for-the-soekris-net5501-board-fix-2.patch
leds-route-kbd-leds-through-the-generic-leds-layer.patch
leds-route-kbd-leds-through-the-generic-leds-layer-leds-input-depends-on-input.patch
leds-route-kbd-leds-through-the-generic-leds-layer-fix.patch
gpio-add-support-for-janz-vmod-ttl-digital-io-module-fix.patch
mtd-nandsim-fix-typo-struct-nandsin_geometry.patch
mtd-nand-remove-stray-endchoice-from-kconfig-help-text.patch
ntfs-clean-up-ntfs_attr_extend_initialized.patch
ntfs-use-add_to_page_cache_lru.patch
score-fix-dereference-of-null-pointer-in-local_flush_tlb_page.patch
3x59x-fix-pci-resource-management.patch
mbp_nvidia_bl-add-support-for-older-macbookpro-and-macbook-61.patch
backlight-backlight_device_register-return-err_ptr.patch
backlight-add-s6e63m0-amoled-lcd-panel-driver.patch
backlight-add-s6e63m0-amoled-lcd-panel-driver-checkpatch-fixes.patch
sunrpc-use-formatting-of-module-name-in-sunrpc.patch
serial-two-branches-the-same-in-timbuart_set_mctrl.patch
serial-timbuart-make-sure-last-byte-is-sent-when-port-is-closed.patch
serial-timbuart-make-sure-last-byte-is-sent-when-port-is-closed-fix.patch
serial-8250_pnp-add-fujitsu-wacom-device.patch
kernel-irq-managec-add-raise_threaded_irq.patch
kernel-irq-managec-add-raise_threaded_irq-fix.patch
max3100-move-to-threaded-interrupt.patch
max3100-add-console-support-for-max3100.patch
max3100-to_max3100_port-small-style-fixes.patch
max3100-to_max3100_port-small-style-fixes-fix.patch
max3100-add-console-support-for-max3100-fixes-for-the-max31x0-console.patch
serial-add-driver-for-the-altera-jtag-uart.patch
serial-add-driver-for-the-altera-uart.patch
serial-add-driver-for-the-altera-uart-update.patch
rcu-remove-init_rcu_head-rcu_head_init-rcu_head.patch
scsi-add-__init-__exit-macros-to-ibmvstgtc.patch
drivers-scsi-fnic-fnic_scsic-clean-up.patch
drivers-scsi-lpfc-lpfc_vportc-fix-read-buffer-overflow.patch
osst-fix-read-buffer-overflow.patch
gdth-unmap-ccb_phys-when-scsi_add_host-fails-in-gdth_eisa_probe_one.patch
drivers-scsi-libsas-use-sam_good.patch
ncr5380-bit-mr_dma_mode-set-twice-in-ncr5380_transfer_dma.patch
drivers-scsi-remove-unnecessary-null-test.patch
drivers-message-move-dereference-after-null-test.patch
mpt-fusion-convert-to-seq_file.patch
g_ncr5380-remove-misleading-pnp-error-message.patch
g_ncr5380-fix-broken-mmio-compilation.patch
g_ncr5380-fix-missing-pnp_device_detach-and-scsi_unregister-on-rmmod.patch
dc395x-decrease-iteration-for-tag_number-of-max_command-in-start_scsi.patch
drivers-scsi-correct-the-size-argument-to-kmalloc.patch
scsi-remove-superfluous-null-pointer-check-from-scsi_kill_request.patch
scsi-sdc-quiet-all-sparse-noise.patch
lpfc-positive-error-return-into-negative.patch
drivers-scsi-qla2xxx-qla_osc-fix-continuation-line-formats.patch
scsi-bfa-correct-onstack-wait_queue_head-declaration.patch
mptscsih-fix-first-line-of-kernel-doc-for-a-few-functions.patch
drivers-scsi-chc-dont-use-vprintk-as-macro.patch
scsi-fix-convert-scsi_scanc-kernel-doc.patch
scsi-update-drivers-tools-url-references.patch
bfa-wrong-fcport-h2i-message-tested-in-bfa_fcport_isr.patch
scsi-use-__ux-types-for-headers-exported-to-user-space.patch
fs-splicec-fix-mapping_gfp_mask-usage.patch
vt6655-cgi-csi-confusion-in-device_ioctl.patch
drivers-staging-otus-hal-hpanic-using-the-wrong-variable.patch
drivers-staging-comedi-drivers-dt2801c-off-by-one-issue.patch
musb-potential-use-after-free.patch
kaweth-new-usb-id-07c9-b010-allied-telesyn-at-usb10.patch
usb-fix-serial-build-when-sysrq-is-disabled.patch
usb-oxu210hp-release-spinlock-on-error-path.patch
vfs-fix-vfs_rename_dir-for-fs_rename_does_d_move-filesystems.patch
vfs-improve-comment-describing-fget_light.patch
ecryptfs-another-lockdep-issue.patch
vfs-o_-bit-numbers-uniqueness-check.patch
vfs-o_-bit-numbers-uniqueness-check-update.patch
vfs-o_-bit-numbers-uniqueness-check-fix.patch
vfs-o_-bit-numbers-uniqueness-check-fix-2.patch
vfs-introduce-fmode_neg_offset-for-allowing-negative-f_pos.patch
vfs-clarify-that-nonseekable_open-will-never-fail.patch
xtensa-convert-to-asm-generic-hardirqh.patch
xtensa-includecheck-fix-vectorss.patch
modpost-support-objects-with-more-than-64k-sections.patch
mm.patch
page-allocator-reduce-fragmentation-in-buddy-allocator-by-adding-buddies-that-are-merging-to-the-tail-of-the-free-lists.patch
sparsemem-on-no-vmemmap-path-put-mem_map-on-node-high-too.patch
shmem-remove-redundant-code.patch
define-madv_hugepage.patch
mm-remove-return-value-of-putback_lru_pages.patch
mempolicy-remove-redundant-code.patch
oom-filter-tasks-not-sharing-the-same-cpuset.patch
oom-sacrifice-child-with-highest-badness-score-for-parent.patch
oom-select-task-from-tasklist-for-mempolicy-ooms.patch
oom-remove-special-handling-for-pagefault-ooms.patch
oom-badness-heuristic-rewrite.patch
oom-deprecate-oom_adj-tunable.patch
oom-replace-sysctls-with-quick-mode.patch
oom-avoid-oom-killer-for-lowmem-allocations.patch
oom-remove-unnecessary-code-and-cleanup.patch
oom-default-to-killing-current-for-pagefault-ooms.patch
oom-avoid-race-for-oom-killed-tasks-detaching-mm-prior-to-exit.patch
oom-select_bad_process-check-pf_kthread-instead-of-mm-to-skip-kthreads.patch
oom-select_bad_process-pf_exiting-check-should-take-mm-into-account.patch
oom-introduce-find_lock_task_mm-to-fix-mm-false-positives.patch
oom-oom_forkbomb_penalty-move-thread_group_cputime-out-of-task_lock.patch
oom-hold-tasklist_lock-when-dumping-tasks.patch
oom-give-current-access-to-memory-reserves-if-it-has-been-killed.patch
oom-avoid-sending-exiting-tasks-a-sigkill.patch
oom-clean-up-oom_kill_task.patch
oom-clean-up-oom_badness.patch
oom-select_bad_process-never-choose-tasks-with-badness-==-0.patch
mempolicy-remove-case-mpol_interleave-from-policy_zonelist.patch
mempolicy-remove-redundant-check.patch
mempolicy-dont-call-mpol_set_nodemask-when-no_context.patch
mempolicy-lose-unnecessary-loop-variable-in-mpol_parse_str.patch
mempolicy-rename-policy_types-and-cleanup-initialization.patch
mempolicy-factor-mpol_shared_policy_init-return-paths.patch
mempolicy-document-cpuset-interaction-with-tmpfs-mpol-mount-option.patch
mincore-cleanups.patch
mincore-break-do_mincore-into-logical-pieces.patch
mincore-pass-ranges-as-startend-address-pairs.patch
mincore-do-nested-page-table-walks.patch
pagemap-add-ifdefs-config_hugetlb_page-on-code-walking-hugetlb-vma.patch
mm-default-to-node-zonelist-ordering-when-nodes-have-only-lowmem.patch
oom-move-sysctl-declarations-to-oomh.patch
fs-writebackc-bitfields-should-be-unsigned.patch
mm-migration-take-a-reference-to-the-anon_vma-before-migrating.patch
mm-migration-do-not-try-to-migrate-unmapped-anonymous-pages.patch
mm-share-the-anon_vma-ref-counts-between-ksm-and-page-migration.patch
mm-allow-config_migration-to-be-set-without-config_numa-or-memory-hot-remove.patch
mm-allow-config_migration-to-be-set-without-config_numa-or-memory-hot-remove-fix.patch
mm-export-unusable-free-space-index-via-proc-unusable_index.patch
mm-export-unusable-free-space-index-via-proc-unusable_index-fix.patch
mm-export-unusable-free-space-index-via-proc-unusable_index-fix-fix-2.patch
mm-export-fragmentation-index-via-proc-extfrag_index.patch
mm-export-fragmentation-index-via-proc-extfrag_index-fix.patch
mm-move-definition-for-lru-isolation-modes-to-a-header.patch
mm-compaction-memory-compaction-core.patch
mm-compaction-memory-compaction-core-fix.patch
mm-compaction-memory-compaction-core-fix-page-buddy-can-go-away-before-reading-page_order-while-isolating-pages-for-migration.patch
mm-compaction-add-proc-trigger-for-memory-compaction.patch
mm-compaction-add-proc-trigger-for-memory-compaction-fix.patch
mm-compaction-add-proc-trigger-for-memory-compaction-fix-fix.patch
mm-compaction-add-sys-trigger-for-per-node-memory-compaction.patch
mm-compaction-direct-compact-when-a-high-order-allocation-fails.patch
mm-compaction-direct-compact-when-a-high-order-allocation-fails-reject-fix.patch
mm-compaction-add-a-tunable-that-decides-when-memory-should-be-compacted-and-when-it-should-be-reclaimed.patch
mm-migration-allow-the-migration-of-pageswapcache-pages.patch
mm-migration-allow-the-migration-of-pageswapcache-pages-fix.patch
mm-compaction-do-not-display-compaction-related-stats-when-config_compaction.patch
mm-compaction-do-not-display-compaction-related-stats-when-config_compaction-fix.patch
mm-compaction-do-not-display-compaction-related-stats-when-config_compaction-fix-fix-2.patch
mm-revalidate-anon_vma-in-page_lock_anon_vma.patch
vmscan-prevent-get_scan_ratio-rounding-errors.patch
readaheadc-fix-comment.patch
mm-remove-nodes-validity-check-in-alloc_pages.patch
mm-alloc-function-in-pcpu_alloc_pages.patch
mm-change-alloc-function-in-vmemmap_alloc_block.patch
mm-change-alloc-function-in-__vmalloc_area_node.patch
mm-add-comment-in-alloc_pages_exact_node.patch
frv-extend-gdbstub-to-support-more-features-of-gdb.patch
frv-extend-gdbstub-to-support-more-features-of-gdb-fix.patch
frv-duplicate-output_buffer-of-e03.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
nommu-allow-private-mappings-of-read-only-devices.patch
errh-add-__must_check-to-error-pointer-handlers.patch
endian-define-__byte_order.patch
bitops-optimize-hweight-by-making-use-of-compile-time-evaluation.patch
x86-add-optimized-popcnt-variants.patch
hangcheck-timer-fix-x86_32-bugs.patch
kernel-wide-replace-ushort_max-short_max-and-short_min-with-ushrt_max-shrt_max-and-shrt_min.patch
kernel-wide-replace-ushort_max-short_max-and-short_min-with-ushrt_max-shrt_max-and-shrt_min-fix.patch
improve-sys_personality-for-compat-architectures.patch
vsprintfc-use-noinline_for_stack.patch
dynamic_debug-small-cleanup-in-ddebug_proc_write.patch
lib-hexdumpc-reduce-stack-variable-size-and-cleanups.patch
firmware-loader-use-statically-initialized-data-attribute.patch
firmware-loader-use-statically-initialized-data-attribute-fix.patch
firmware-loader-use-statically-initialized-data-attribute-fix-fix.patch
davinci-mmc-pass-number-of-sg-segments-as-platform-data.patch
mmc-omap-add-support-for-16-bit-and-32-bit-registers.patch
sdhci-implement-cap_clock_base_broken-quirk.patch
sdhci-pltfm-implement-platform-data-passing.patch
sdhci-pltfm-do-not-print-errors-in-case-of-an-extended-iomem-size.patch
davinci-mmc-add-a-function-to-control-reset-state-of-the-controller.patch
davinci-mmc-updates-to-suspend-resume-implementation.patch
davinci-mmc-updates-to-suspend-resume-implementation-checkpatch-fixes.patch
mmc-sd-clean-up-redundant-memset.patch
checkpatch-add-check-for-too-short-kconfig-descriptions.patch
checkpatch-add-check-for-too-short-kconfig-descriptions-checkpatch-fixes.patch
hwmon-driver-for-ti-tmp102-temperature-sensor.patch
hwmon-driver-for-ti-tmp102-temperature-sensor-fix.patch
lis3-add-missing-constants-for-8bit-device.patch
lis3-separate-configuration-function-for-8-bit-device.patch
lis3-introduce-platform-data-for-second-ff-wu-unit.patch
lis3-add-skeletons-for-interrupt-handlers.patch
lis3-interrupt-handlers-for-8bit-wakeup-and-click-events.patch
lis3-setup-poll-interval-limits.patch
hwmon-add-ti-ads7871-a-d-converter-driver.patch
hwmon-add-ti-ads7871-a-d-converter-driver-checkpatch-fixes.patch
xen-fix-build-when-sysrq-is-disabled.patch
smbfs-remove-duplicated-include.patch
s3c-rtc-driver-add-support-for-s3c64xx.patch
rtc-mxc-remove-unnecessary-clock-source-for-rtc-subsystem.patch
gpio-add-interrupt-handling-capability-to-max732x.patch
gpiolib-make-names-array-and-its-values-const.patch
gpiolib-make-names-array-and-its-values-const-fix.patch
gpiolib-a-gpio-is-unsigned-so-use-%u-to-print-it.patch
gpiolib-document-that-names-can-contain-printk-format-specifiers.patch
fbdev-bfin-lq035q1-fb-respect-new-ppi-mode-platform-field.patch
sis-strcpy-=-strlcpy.patch
fbdev-section-cleanup-in-arcfb.patch
fbdev-section-cleanup-in-hgafb.patch
fbdev-section-cleanup-in-vfb.patch
fbdev-section-cleanup-in-vga16fb.patch
fbdev-section-cleanup-in-w100fb.patch
cobalt_lcdfb-fix-section-mismatch-cobalt_lcdfb_fix.patch
da8xx-omap-l1xx-fb-implement-double-buffering.patch
auxdisplay-section-cleanup-in-cfag12864bfb-driver.patch
ext3-fixup-rb_root-initializations-to-use-rb_root.patch
hfsplus-identify-journal-info-block-in-volume-header.patch
hfsplus-fix-journal-detection.patch
memcg-oom-wakeup-filter.patch
memcg-oom-wakeup-filter-update.patch
memcg-oom-notifier.patch
memcg-oom-notifier-update.patch
memcg-oom-kill-disable-and-oom-status.patch
memcg-oom-kill-disable-and-oom-status-update.patch
memcg-oom-kill-disable-and-oom-status-update-checkpatch-fixes.patch
memcg-clean-up-move-charge.patch
memcg-move-charge-of-file-pages.patch
memcg-move-charge-of-file-pages-fix.patch
kmod-add-init-function-to-usermodehelper.patch
exec-replace-call_usermodehelper_pipe-with-use-of-umh-init-function-and-resolve-limit.patch
umh-creds-convert-call_usermodehelper_keys-to-use-subprocess_info-init.patch
umh-creds-kill-subprocess_info-cred-logic.patch
call_usermodehelper-no-need-to-unblock-signals.patch
wait_for_helper-sigchld-from-user-space-can-lead-to-use-after-free.patch
call_usermodehelper-simplify-fix-umh_no_wait-case.patch
call_usermodehelper-umh_wait_exec-ignores-kernel_thread-failure.patch
coredump-factor-out-the-not-ispipe-file-checks.patch
coredump-cleanup-ispipe-code.patch
coredump-factor-out-put_cred-calls.patch
coredump-shift-down_writemmap_sem-into-coredump_wait.patch
exit-exit_notify-can-trust-signal-notify_count-0.patch
exit-change-zap_other_threads-to-count-sub-threads.patch
exit-avoid-sig-count-in-de_thread-__exit_signal-synchronization.patch
exit-avoid-sig-count-in-__exit_signal-to-detect-the-group-dead-case.patch
posix-cpu-timers-avoid-task-signal-=-null-checks.patch
ia64-ptrace_attach_sync_user_rbs-avoid-task-signal-=-null-checks.patch
fork-exit-move-tty_kref_put-outside-of-__cleanup_signal.patch
signals-make-task_struct-signal-immutable-refcountable.patch
signals-clear-signal-tty-when-the-last-thread-exits.patch
signals-clear-signal-tty-when-the-last-thread-exits-fix.patch
signals-kill-the-awful-task_rq_unlock_wait-hack.patch
exit-__exit_signal-use-thread_group_leader-consistently.patch
kill-the-obsolete-thread_group_cputime_free-and-taskstats_tgid_init-helpers.patch
exit-move-taskstats_tgid_free-from-__exit_signal-to-free_signal_struct.patch
check_unshare_flags-kill-the-bogus-clone_sighand-sig-count-check.patch
proc-get_nr_threads-doesnt-need-siglock-any-longer.patch
proc_sched_show_task-use-get_nr_threads.patch
keyctl_session_to_parent-use-thread_group_empty-to-check-singlethreadness.patch
proc-turn-signal_struct-count-into-int-nr_threads.patch
proc-turn-signal_struct-count-into-int-nr_threads-checkpatch-fixes.patch
proc-cleanup-remove-unused-assignments.patch
cpu-hotplug-introduce-cpu_notify-__cpu_notify-cpu_notify_nofail.patch
cpu-hotplug-return-better-errno-on-cpu-hotplug-failure.patch
notifier-change-notifier_from_errno0-to-return-notify_ok.patch
x86-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
topology-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
kernel-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
slab-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
iucv-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
ehca-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
s390-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
md-convert-cpu-notifier-to-return-encapsulate-errno-value.patch
fault-injection-add-cpu-notifier-error-injection-module.patch
fault-injection-add-cpu-notifier-error-injection-module-fix.patch
cpuhotplug-do-not-need-cpu_hotplug_begin-when-config_hotplug_cpu=n.patch
ipmi-raise-precedence-of-pnp-based-discovery-mechanisms-acpi-pci.patch
ipmi-convert-tracking-of-the-acpi-device-pointer-to-a-pnp-device.patch
ipmi-update-driver-to-use-dev_printk-and-its-constructs.patch
char-drivers-ram-oops-panic-logger.patch
char-drivers-ram-oops-panic-logger-update.patch
drivers-char-ppdevc-use-kasprintf.patch
rapidio-add-idt-cps-tsi-switches.patch
rapidio-add-switch-locking-during-discovery.patch
rapidio-add-port-write-handling-for-em.patch
rapidio-powerpc-85xx-add-port-write-message-handler-for-srio-port.patch
rapidio-powerpc-85xx-add-mchk-handler-for-srio-port.patch
rapidio-add-enabling-srio-port-rx-and-tx.patch
delay-accounting-re-implement-c-for-getdelaysc-to-report-information-on-a-target-command.patch
delay-accounting-re-implement-c-for-getdelaysc-to-report-information-on-a-target-command-checkpatch-fixes.patch
delayacct-align-to-8-byte-boundary-on-64-bit-systems.patch
lib-random32-export-pseudo-random-number-generator-for-modules.patch
drivers-edac-introduce-missing-kfree.patch
edac-add-__init-to-i7core_xeon_pci_fixup.patch
ssb-add-dma_dev-to-ssb_device-structure.patch
b43legacy-replace-the-ssb_dma-api-with-the-generic-dma-api.patch
b43-replace-the-ssb_dma-api-with-the-generic-dma-api.patch
b44-replace-the-ssb_dma-api-with-the-generic-dma-api.patch
ssb-remove-the-ssb-dma-api.patch
panic-allow-taint-flag-for-warnings-to-be-changed-from-taint_warn.patch
panic-allow-taint-flag-for-warnings-to-be-changed-from-taint_warn-checkpatch-fixes.patch
panic-add-taint-flag-taint_firmware_workaround-i.patch
pci-dmar-combine-the-bios-dmar-table-warning-messages.patch
pci-dmar-tone-down-warnings-about-invalid-bios-dmar-tables.patch
kfifo-kfifo_is_fullempty-should-return-bools-not-ints.patch
time-kill-off-config_generic_time.patch
asm-generic-remove-isa_dma_threshold-in-scatterlisth.patch
asm-generic-add-need_sg_dma_length-to-define-sg_dma_len.patch
x86_32-use-asm-generic-scatterlisth.patch
powerpc-use-asm-generic-scatterlisth.patch
arm-use-asm-generic-scatterlisth.patch
alpha-use-asm-generic-scatterlisth.patch
asm-generic-remove-arch_has_sg_chain-in-scatterlisth.patch
avr32-use-asm-generic-scatterlisth.patch
cris-use-asm-generic-scatterlisth.patch
h8300-use-asm-generic-scatterlisth.patch
m32r-use-use-asm-generic-scatterlisth.patch
m68k-use-asm-generic-scatterlisth.patch
mips-use-use-asm-generic-scatterlisth.patch
xtensa-use-use-asm-generic-scatterlisth.patch
blackfin-use-use-asm-generic-scatterlisth.patch
frv-use-asm-generic-scatterlisth.patch
mn10300-use-asm-generic-scatterlisth.patch
parisc-use-asm-generic-scatterlisth.patch
osst-update-ppos-instead-of-using-file-f_pos.patch
drivers-sbus-char-flashc-flash_read-should-update-ppos-instead-of-file-f_pos.patch
arch-cris-arch-v10-drivers-eepromc-eeprom_read-eeprom_write-should-update-ppos-instead-of-file-f_pos.patch
frv-remove-struct-file-argument-from-sysctl-proc_handler.patch
misdn-remove-unnecessary-test-on-f_pos.patch
rtc-m41t80-use-nonseekable_open.patch
vfs-introduce-noop_llseek.patch
osst-use-noop_llseek-instead-of-default_llseek.patch
st-use-noop_llseek-instead-of-default_llseek.patch
fs-do-not-fallback-to-default_llseek-when-readdir-uses-bkl.patch
documentation-filesystems-locking-update-documentation-on-llseek-wrt-bkl.patch
vfs-add-super-operation-writeback_inodes.patch
vfs-take-2add-set_page_dirty_notag.patch
reiser4-export-remove_from_page_cache.patch
reiser4-export-remove_from_page_cache-fix.patch
reiser4-export-find_get_pages.patch
reiser4.patch
reiser4-writeback_inodes-implementation.patch
reiser4-writeback_inodes-implementation-fix.patch
reiser4-fixup-checkin-checkout-jnodes-for-entd.patch
reiser4-fixups.patch
reiser4-broke.patch
make-sure-nobodys-leaking-resources.patch
journal_add_journal_head-debug.patch
releasing-resources-with-children.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch
mutex-subsystem-synchro-test-module-add-missing-header-file.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
add-debugging-aid-for-memory-initialisation-problems.patch
workaround-for-a-pci-restoring-bug.patch
prio_tree-debugging-patch.patch
single_open-seq_release-leak-diagnostics.patch
add-a-refcount-check-in-dput.patch
getblk-handle-2tb-devices.patch
getblk-handle-2tb-devices-fix.patch
notify_change-callers-must-hold-i_mutex.patch


2010-04-15 23:40:49

by Randy Dunlap

Subject: [PATCH -mmotm] vmstat: fix build errors when PROC_FS is disabled

From: Randy Dunlap <[email protected]>

Fix vmstat.c to build when CONFIG_PROC_FS is disabled
but CONFIG_DEBUG_FS is enabled.

Fixes around 25 errors.

Signed-off-by: Randy Dunlap <[email protected]>
Cc: Mel Gorman <[email protected]>
---
mm/vmstat.c | 119 ++++++++++++++++++++++++--------------------------
1 file changed, 59 insertions(+), 60 deletions(-)

--- mmotm-2010-0415-1442.orig/mm/vmstat.c
+++ mmotm-2010-0415-1442/mm/vmstat.c
@@ -16,6 +16,7 @@
#include <linux/cpu.h>
#include <linux/vmstat.h>
#include <linux/sched.h>
+#include <linux/seq_file.h>
#include <linux/math64.h>

#ifdef CONFIG_VM_EVENT_COUNTERS
@@ -380,18 +381,57 @@ void zone_statistics(struct zone *prefer
}
#endif

-#ifdef CONFIG_PROC_FS
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-
-static char * const migratetype_names[MIGRATE_TYPES] = {
- "Unmovable",
- "Reclaimable",
- "Movable",
- "Reserve",
- "Isolate",
+struct contig_page_info {
+ unsigned long free_pages;
+ unsigned long free_blocks_total;
+ unsigned long free_blocks_suitable;
};

+/* Walk all the zones in a node and print using a callback */
+static void walk_zones_in_node(struct seq_file *m, pg_data_t *pgdat,
+ void (*print)(struct seq_file *m, pg_data_t *, struct zone *))
+{
+ struct zone *zone;
+ struct zone *node_zones = pgdat->node_zones;
+ unsigned long flags;
+
+ for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
+ if (!populated_zone(zone))
+ continue;
+
+ spin_lock_irqsave(&zone->lock, flags);
+ print(m, pgdat, zone);
+ spin_unlock_irqrestore(&zone->lock, flags);
+ }
+}
+
+/*
+ * A fragmentation index only makes sense if an allocation of a requested
+ * size would fail. If that is true, the fragmentation index indicates
+ * whether external fragmentation or a lack of memory was the problem.
+ * The value can be used to determine if page reclaim or compaction
+ * should be used
+ */
+int __fragmentation_index(unsigned int order, struct contig_page_info *info)
+{
+ unsigned long requested = 1UL << order;
+
+ if (!info->free_blocks_total)
+ return 0;
+
+ /* Fragmentation index only makes sense when a request would fail */
+ if (info->free_blocks_suitable)
+ return -1000;
+
+ /*
+ * Index is between 0 and 1 so return within 3 decimal places
+ *
+ * 0 => allocation would fail due to lack of memory
+ * 1 => allocation would fail due to fragmentation
+ */
+ return 1000 - div_u64( (1000+(div_u64(info->free_pages * 1000ULL, requested))), info->free_blocks_total);
+}
+
static void *frag_start(struct seq_file *m, loff_t *pos)
{
pg_data_t *pgdat;
@@ -416,23 +456,16 @@ static void frag_stop(struct seq_file *m
{
}

-/* Walk all the zones in a node and print using a callback */
-static void walk_zones_in_node(struct seq_file *m, pg_data_t *pgdat,
- void (*print)(struct seq_file *m, pg_data_t *, struct zone *))
-{
- struct zone *zone;
- struct zone *node_zones = pgdat->node_zones;
- unsigned long flags;
-
- for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
- if (!populated_zone(zone))
- continue;
+#ifdef CONFIG_PROC_FS
+#include <linux/proc_fs.h>

- spin_lock_irqsave(&zone->lock, flags);
- print(m, pgdat, zone);
- spin_unlock_irqrestore(&zone->lock, flags);
- }
-}
+static char * const migratetype_names[MIGRATE_TYPES] = {
+ "Unmovable",
+ "Reclaimable",
+ "Movable",
+ "Reserve",
+ "Isolate",
+};

static void frag_show_print(struct seq_file *m, pg_data_t *pgdat,
struct zone *zone)
@@ -455,39 +488,6 @@ static int frag_show(struct seq_file *m,
return 0;
}

-struct contig_page_info {
- unsigned long free_pages;
- unsigned long free_blocks_total;
- unsigned long free_blocks_suitable;
-};
-
-/*
- * A fragmentation index only makes sense if an allocation of a requested
- * size would fail. If that is true, the fragmentation index indicates
- * whether external fragmentation or a lack of memory was the problem.
- * The value can be used to determine if page reclaim or compaction
- * should be used
- */
-int __fragmentation_index(unsigned int order, struct contig_page_info *info)
-{
- unsigned long requested = 1UL << order;
-
- if (!info->free_blocks_total)
- return 0;
-
- /* Fragmentation index only makes sense when a request would fail */
- if (info->free_blocks_suitable)
- return -1000;
-
- /*
- * Index is between 0 and 1 so return within 3 decimal places
- *
- * 0 => allocation would fail due to lack of memory
- * 1 => allocation would fail due to fragmentation
- */
- return 1000 - div_u64( (1000+(div_u64(info->free_pages * 1000ULL, requested))), info->free_blocks_total);
-}
-
static void pagetypeinfo_showfree_print(struct seq_file *m,
pg_data_t *pgdat, struct zone *zone)
{
@@ -1001,7 +1001,6 @@ module_init(setup_vmstat)

#ifdef CONFIG_DEBUG_FS
#include <linux/debugfs.h>
-#include <linux/seq_file.h>

static struct dentry *extfrag_debug_root;
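
Editorial note: the comment on __fragmentation_index() in the patch above describes the index in words. The arithmetic can be tried outside the kernel with a minimal user-space sketch like the one below. It is an illustration only, not the kernel code: the name fragmentation_index_sketch is hypothetical, plain 64-bit division stands in for div_u64(), and the final subtraction is done in signed arithmetic.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Same three counters as struct contig_page_info in the patch. */
struct contig_page_info {
	unsigned long free_pages;           /* total free pages in the zone */
	unsigned long free_blocks_total;    /* free blocks of any order */
	unsigned long free_blocks_suitable; /* free blocks large enough for the request */
};

/*
 * Hypothetical user-space rendering of the fragmentation index.
 * Returns the index scaled by 1000, or -1000 when the request
 * would succeed and the index is therefore meaningless.
 */
int fragmentation_index_sketch(unsigned int order,
			       const struct contig_page_info *info)
{
	uint64_t requested, scaled;

	/* Guard the shift below; the kernel relies on order being small. */
	assert(order < sizeof(unsigned long) * CHAR_BIT);
	requested = (uint64_t)1 << order;

	/* No free blocks at all: report 0, as the patch does. */
	if (!info->free_blocks_total)
		return 0;

	/* A suitable block exists, so the allocation would not fail. */
	if (info->free_blocks_suitable)
		return -1000;

	/* 1000 + (free_pages * 1000 / requested), then divide by block count. */
	scaled = 1000 + (uint64_t)info->free_pages * 1000 / requested;
	return (int)(1000 - (int64_t)(scaled / info->free_blocks_total));
}
```

For an order-4 request (16 pages) against a zone holding 64 free pages as 64 separate order-0 blocks, the sketch gives 1000 - (1000 + 4000) / 64 = 922, i.e. an index of about 0.922. Going by the 0/1 scale in the patch comment, that points toward fragmentation rather than a lack of memory as the reason the allocation would fail.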

2010-04-16 16:03:49

by Randy Dunlap

Subject: Re: mmotm 2010-04-15-14-42 uploaded (shmem, CGROUP_MEM_RES_CTLR)

On Thu, 15 Apr 2010 14:42:59 -0700 [email protected] wrote:

> The mm-of-the-moment snapshot 2010-04-15-14-42 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/
>
> and will soon be available at
>
> git://zen-kernel.org/kernel/mmotm.git
>
> It contains the following patches against 2.6.34-rc4:


memcg-move-charge-of-file-pages.patch:

when CONFIG_SHMFS is not enabled:

mm/shmem.c:2721: error: implicit declaration of function 'SHMEM_I'
mm/shmem.c:2721: warning: initialization makes pointer from integer without a cast
mm/shmem.c:2726: error: dereferencing pointer to incomplete type
mm/shmem.c:2727: error: implicit declaration of function 'shmem_swp_entry'
mm/shmem.c:2727: warning: assignment makes pointer from integer without a cast
mm/shmem.c:2734: error: implicit declaration of function 'shmem_swp_unmap'
mm/shmem.c:2735: error: dereferencing pointer to incomplete type


However, adding (needed)
#include <linux/spinlock.h>
to that source file does not fix the build error.

Should CGROUP_MEM_RES_CTLR depend on SHMFS or anything else?


kernel config attached.

thanks,
---
~Randy


Attachments:
config-r9857 (54.90 kB)

2010-04-19 01:55:10

by Daisuke Nishimura

Subject: Re: mmotm 2010-04-15-14-42 uploaded (shmem, CGROUP_MEM_RES_CTLR)

On Fri, 16 Apr 2010 09:03:15 -0700, Randy Dunlap <[email protected]> wrote:
> On Thu, 15 Apr 2010 14:42:59 -0700 [email protected] wrote:
>
> > The mm-of-the-moment snapshot 2010-04-15-14-42 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
> >
> > and will soon be available at
> >
> > git://zen-kernel.org/kernel/mmotm.git
> >
> > It contains the following patches against 2.6.34-rc4:
>
>
> memcg-move-charge-of-file-pages.patch:
>
> when CONFIG_SHMFS is not enabled:
>
> mm/shmem.c:2721: error: implicit declaration of function 'SHMEM_I'
> mm/shmem.c:2721: warning: initialization makes pointer from integer without a cast
> mm/shmem.c:2726: error: dereferencing pointer to incomplete type
> mm/shmem.c:2727: error: implicit declaration of function 'shmem_swp_entry'
> mm/shmem.c:2727: warning: assignment makes pointer from integer without a cast
> mm/shmem.c:2734: error: implicit declaration of function 'shmem_swp_unmap'
> mm/shmem.c:2735: error: dereferencing pointer to incomplete type
>
Thank you very much for your report.

I attach a fix patch.

===
From: Daisuke Nishimura <[email protected]>

build fix for !CONFIG_SHMEM case.

CC mm/shmem.o
mm/shmem.c: In function 'mem_cgroup_get_shmem_target':
mm/shmem.c:2721: error: implicit declaration of function 'SHMEM_I'
mm/shmem.c:2721: warning: initialization makes pointer from integer without a cast
mm/shmem.c:2726: error: dereferencing pointer to incomplete type
mm/shmem.c:2727: error: implicit declaration of function 'shmem_swp_entry'
mm/shmem.c:2727: warning: assignment makes pointer from integer without a cast
mm/shmem.c:2734: error: implicit declaration of function 'shmem_swp_unmap'
mm/shmem.c:2735: error: dereferencing pointer to incomplete type
make[1]: *** [mm/shmem.o] Error 1

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: Daisuke Nishimura <[email protected]>
---
mm/shmem.c | 99 +++++++++++++++++++++++++++++++++++++----------------------
1 files changed, 62 insertions(+), 37 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index cb87365..6f183ef 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2568,6 +2568,43 @@ out4:
return error;
}

+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+/**
+ * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
+ * @inode: the inode to be searched
+ * @pgoff: the offset to be searched
+ * @pagep: the pointer for the found page to be stored
+ * @ent: the pointer for the found swap entry to be stored
+ *
+ * If a page is found, refcount of it is incremented. Callers should handle
+ * these refcount.
+ */
+void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
+ struct page **pagep, swp_entry_t *ent)
+{
+ swp_entry_t entry = { .val = 0 }, *ptr;
+ struct page *page = NULL;
+ struct shmem_inode_info *info = SHMEM_I(inode);
+
+ if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
+ goto out;
+
+ spin_lock(&info->lock);
+ ptr = shmem_swp_entry(info, pgoff, NULL);
+ if (ptr && ptr->val) {
+ entry.val = ptr->val;
+ page = find_get_page(&swapper_space, entry.val);
+ } else
+ page = find_get_page(inode->i_mapping, pgoff);
+ if (ptr)
+ shmem_swp_unmap(ptr);
+ spin_unlock(&info->lock);
+out:
+ *pagep = page;
+ *ent = entry;
+}
+#endif
+
#else /* !CONFIG_SHMEM */

/*
@@ -2607,6 +2644,31 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
return 0;
}

+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+/**
+ * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
+ * @inode: the inode to be searched
+ * @pgoff: the offset to be searched
+ * @pagep: the pointer for the found page to be stored
+ * @ent: the pointer for the found swap entry to be stored
+ *
+ * If a page is found, refcount of it is incremented. Callers should handle
+ * these refcount.
+ */
+void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
+ struct page **pagep, swp_entry_t *ent)
+{
+ struct page *page = NULL;
+
+ if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
+ goto out;
+ page = find_get_page(inode->i_mapping, pgoff);
+out:
+ *pagep = page;
+ *ent = (swp_entry_t){ .val = 0 };
+}
+#endif
+
#define shmem_vm_ops generic_file_vm_ops
#define shmem_file_operations ramfs_file_operations
#define shmem_get_inode(sb, mode, dev, flags) ramfs_get_inode(sb, mode, dev)
@@ -2701,40 +2763,3 @@ int shmem_zero_setup(struct vm_area_struct *vma)
vma->vm_ops = &shmem_vm_ops;
return 0;
}
-
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
-/**
- * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
- * @inode: the inode to be searched
- * @pgoff: the offset to be searched
- * @pagep: the pointer for the found page to be stored
- * @ent: the pointer for the found swap entry to be stored
- *
- * If a page is found, refcount of it is incremented. Callers should handle
- * these refcount.
- */
-void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
- struct page **pagep, swp_entry_t *ent)
-{
- swp_entry_t entry = { .val = 0 }, *ptr;
- struct page *page = NULL;
- struct shmem_inode_info *info = SHMEM_I(inode);
-
- if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
- goto out;
-
- spin_lock(&info->lock);
- ptr = shmem_swp_entry(info, pgoff, NULL);
- if (ptr && ptr->val) {
- entry.val = ptr->val;
- page = find_get_page(&swapper_space, entry.val);
- } else
- page = find_get_page(inode->i_mapping, pgoff);
- if (ptr)
- shmem_swp_unmap(ptr);
- spin_unlock(&info->lock);
-out:
- *pagep = page;
- *ent = entry;
-}
-#endif
--
1.5.6.1
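
The shape of this fix, one function with a full body when the feature is built in and a reduced body when it is not, can be sketched in standalone C. This is only an illustration under stated assumptions: DEMO_HAVE_SWAP_TABLE, struct demo_file and demo_get_target() are made-up names, not kernel symbols, and the lookup logic is heavily simplified compared with mem_cgroup_get_shmem_target().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature switch, standing in for a Kconfig symbol. */
#ifndef DEMO_HAVE_SWAP_TABLE
#define DEMO_HAVE_SWAP_TABLE 0
#endif

/* Made-up file object: page_cache[i] != 0 means page i is resident. */
struct demo_file {
	size_t npages;
	const int *page_cache;
	const unsigned long *swap;	/* only consulted by the full variant */
};

/* Result of a lookup: a resident page, a swap entry, or neither. */
struct demo_target {
	int found_page;
	unsigned long swap_entry;
};

#if DEMO_HAVE_SWAP_TABLE
/* Full variant: prefer the swap entry, fall back to the page cache. */
struct demo_target demo_get_target(const struct demo_file *f, size_t pgoff)
{
	struct demo_target t = { 0, 0 };

	assert(f != NULL);
	if (pgoff >= f->npages)
		return t;
	if (f->swap && f->swap[pgoff])
		t.swap_entry = f->swap[pgoff];
	else
		t.found_page = f->page_cache[pgoff] != 0;
	return t;
}
#else
/* Reduced variant: no swap table exists, so only the page cache is searched. */
struct demo_target demo_get_target(const struct demo_file *f, size_t pgoff)
{
	struct demo_target t = { 0, 0 };

	assert(f != NULL);
	if (pgoff < f->npages)
		t.found_page = f->page_cache[pgoff] != 0;
	return t;
}
#endif
```

Both variants keep the same signature, so callers build unchanged whichever way the switch is set. That is the property the patch restores for the !CONFIG_SHMEM build.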

2010-04-19 02:18:37

by Randy Dunlap

Subject: Re: mmotm 2010-04-15-14-42 uploaded (shmem, CGROUP_MEM_RES_CTLR)

On 04/18/10 18:49, Daisuke Nishimura wrote:
> On Fri, 16 Apr 2010 09:03:15 -0700, Randy Dunlap <[email protected]> wrote:
>> On Thu, 15 Apr 2010 14:42:59 -0700 [email protected] wrote:
>>
>>> The mm-of-the-moment snapshot 2010-04-15-14-42 has been uploaded to
>>>
>>> http://userweb.kernel.org/~akpm/mmotm/
>>>
>>> and will soon be available at
>>>
>>> git://zen-kernel.org/kernel/mmotm.git
>>>
>>> It contains the following patches against 2.6.34-rc4:
>>
>>
>> memcg-move-charge-of-file-pages.patch:
>>
>> when CONFIG_SHMFS is not enabled:
>>
>> mm/shmem.c:2721: error: implicit declaration of function 'SHMEM_I'
>> mm/shmem.c:2721: warning: initialization makes pointer from integer without a cast
>> mm/shmem.c:2726: error: dereferencing pointer to incomplete type
>> mm/shmem.c:2727: error: implicit declaration of function 'shmem_swp_entry'
>> mm/shmem.c:2727: warning: assignment makes pointer from integer without a cast
>> mm/shmem.c:2734: error: implicit declaration of function 'shmem_swp_unmap'
>> mm/shmem.c:2735: error: dereferencing pointer to incomplete type
>>
> Thank you very much for your report.
>
> I attach a fix patch.
>
> ===
> From: Daisuke Nishimura <[email protected]>
>
> build fix for !CONFIG_SHMEM case.
>
> CC mm/shmem.o
> mm/shmem.c: In function 'mem_cgroup_get_shmem_target':
> mm/shmem.c:2721: error: implicit declaration of function 'SHMEM_I'
> mm/shmem.c:2721: warning: initialization makes pointer from integer without a cast
> mm/shmem.c:2726: error: dereferencing pointer to incomplete type
> mm/shmem.c:2727: error: implicit declaration of function 'shmem_swp_entry'
> mm/shmem.c:2727: warning: assignment makes pointer from integer without a cast
> mm/shmem.c:2734: error: implicit declaration of function 'shmem_swp_unmap'
> mm/shmem.c:2735: error: dereferencing pointer to incomplete type
> make[1]: *** [mm/shmem.o] Error 1
>
> Reported-by: Randy Dunlap <[email protected]>
> Signed-off-by: Daisuke Nishimura <[email protected]>

Acked-by: Randy Dunlap <[email protected]>

Thanks.

> ---
> mm/shmem.c | 99 +++++++++++++++++++++++++++++++++++++----------------------
> 1 files changed, 62 insertions(+), 37 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index cb87365..6f183ef 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2568,6 +2568,43 @@ out4:
> return error;
> }
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> +/**
> + * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
> + * @inode: the inode to be searched
> + * @pgoff: the offset to be searched
> + * @pagep: the pointer for the found page to be stored
> + * @ent: the pointer for the found swap entry to be stored
> + *
> + * If a page is found, refcount of it is incremented. Callers should handle
> + * these refcount.
> + */
> +void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
> + struct page **pagep, swp_entry_t *ent)
> +{
> + swp_entry_t entry = { .val = 0 }, *ptr;
> + struct page *page = NULL;
> + struct shmem_inode_info *info = SHMEM_I(inode);
> +
> + if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
> + goto out;
> +
> + spin_lock(&info->lock);
> + ptr = shmem_swp_entry(info, pgoff, NULL);
> + if (ptr && ptr->val) {
> + entry.val = ptr->val;
> + page = find_get_page(&swapper_space, entry.val);
> + } else
> + page = find_get_page(inode->i_mapping, pgoff);
> + if (ptr)
> + shmem_swp_unmap(ptr);
> + spin_unlock(&info->lock);
> +out:
> + *pagep = page;
> + *ent = entry;
> +}
> +#endif
> +
> #else /* !CONFIG_SHMEM */
>
> /*
> @@ -2607,6 +2644,31 @@ int shmem_lock(struct file *file, int lock, struct user_struct *user)
> return 0;
> }
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> +/**
> + * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
> + * @inode: the inode to be searched
> + * @pgoff: the offset to be searched
> + * @pagep: the pointer for the found page to be stored
> + * @ent: the pointer for the found swap entry to be stored
> + *
> + * If a page is found, refcount of it is incremented. Callers should handle
> + * these refcount.
> + */
> +void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
> + struct page **pagep, swp_entry_t *ent)
> +{
> + struct page *page = NULL;
> +
> + if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
> + goto out;
> + page = find_get_page(inode->i_mapping, pgoff);
> +out:
> + *pagep = page;
> + *ent = (swp_entry_t){ .val = 0 };
> +}
> +#endif
> +
> #define shmem_vm_ops generic_file_vm_ops
> #define shmem_file_operations ramfs_file_operations
> #define shmem_get_inode(sb, mode, dev, flags) ramfs_get_inode(sb, mode, dev)
> @@ -2701,40 +2763,3 @@ int shmem_zero_setup(struct vm_area_struct *vma)
> vma->vm_ops = &shmem_vm_ops;
> return 0;
> }
> -
> -#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> -/**
> - * mem_cgroup_get_shmem_target - find a page or entry assigned to the shmem file
> - * @inode: the inode to be searched
> - * @pgoff: the offset to be searched
> - * @pagep: the pointer for the found page to be stored
> - * @ent: the pointer for the found swap entry to be stored
> - *
> - * If a page is found, refcount of it is incremented. Callers should handle
> - * these refcount.
> - */
> -void mem_cgroup_get_shmem_target(struct inode *inode, pgoff_t pgoff,
> - struct page **pagep, swp_entry_t *ent)
> -{
> - swp_entry_t entry = { .val = 0 }, *ptr;
> - struct page *page = NULL;
> - struct shmem_inode_info *info = SHMEM_I(inode);
> -
> - if ((pgoff << PAGE_CACHE_SHIFT) >= i_size_read(inode))
> - goto out;
> -
> - spin_lock(&info->lock);
> - ptr = shmem_swp_entry(info, pgoff, NULL);
> - if (ptr && ptr->val) {
> - entry.val = ptr->val;
> - page = find_get_page(&swapper_space, entry.val);
> - } else
> - page = find_get_page(inode->i_mapping, pgoff);
> - if (ptr)
> - shmem_swp_unmap(ptr);
> - spin_unlock(&info->lock);
> -out:
> - *pagep = page;
> - *ent = entry;
> -}
> -#endif


--
~Randy

2010-04-19 10:05:40

by Kamezawa Hiroyuki

Subject: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded


mmotm 2010-04-15-14-42

When I tried
# echo 0 > /proc/sys/vm/compact_memory

I saw the following oops.

My environment was
2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8,
allocating tons of hugepages and reducing free memory.

What I did was:
# echo 0 > /proc/sys/vm/compact_memory

Hmm, this is the first time I have seen this kind of error during migration.
My .config is attached. Hmm... ?

(I'm sorry I'll be offline soon.)
-Kame
==

Apr 19 18:55:04 localhost kernel: BUG: unable to handle kernel paging request at ffff8806213ff000
Apr 19 18:55:04 localhost kernel: IP: [<ffffffff812ae3a5>] copy_page_c+0x5/0x10
Apr 19 18:55:04 localhost kernel: PGD 1a43063 PUD 50d5067 PMD 51df067 PTE 80000006213ff160
Apr 19 18:55:04 localhost kernel: Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
Apr 19 18:55:04 localhost kernel: last sysfs file: /sys/devices/pci0000:00/0000:00:1d.3/usb5/devnum
Apr 19 18:55:04 localhost kernel: CPU 1
Apr 19 18:55:04 localhost kernel: Modules linked in: sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 uinput e1000e bnx2 shpchp i5000_edac edac_core i2c_i801 i2c_core ppdev i5k_amb parport_pc parport ioatdma dca iTCO_wdt iTCO_vendor_support pcspkr kvm_intel kvm dm_multipath megaraid_sas [last unloaded: microcode]
Apr 19 18:55:04 localhost kernel:
Apr 19 18:55:04 localhost kernel: Pid: 2427, comm: bash Tainted: G W 2.6.34-rc4-mm1+ #1 D2519/PRIMERGY
Apr 19 18:55:04 localhost kernel: RIP: 0010:[<ffffffff812ae3a5>] [<ffffffff812ae3a5>] copy_page_c+0x5/0x10
Apr 19 18:55:04 localhost kernel: RSP: 0018:ffff88061c025b70 EFLAGS: 00010286
Apr 19 18:55:04 localhost kernel: RAX: ffff880000000000 RBX: ffffea0003801180 RCX: 0000000000000200
Apr 19 18:55:04 localhost kernel: RDX: 6db6db6db6db6db7 RSI: ffff880100050000 RDI: ffff8806213ff000
Apr 19 18:55:04 localhost kernel: RBP: ffff88061c025b98 R08: 0000000000000048 R09: 0000000000000001
Apr 19 18:55:04 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffffea0015745fc8
Apr 19 18:55:04 localhost kernel: R13: ffff88061c024000 R14: ffffea00038011a8 R15: 0000000000000000
Apr 19 18:55:04 localhost kernel: FS: 00007f634035f700(0000) GS:ffff880005600000(0000) knlGS:0000000000000000
Apr 19 18:55:04 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 19 18:55:04 localhost kernel: CR2: ffff8806213ff000 CR3: 0000000612a15000 CR4: 00000000000006e0
Apr 19 18:55:04 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 19 18:55:04 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 19 18:55:04 localhost kernel: Process bash (pid: 2427, threadinfo ffff88061c024000, task ffff880611422410)
Apr 19 18:55:04 localhost kernel: Stack:
Apr 19 18:55:04 localhost kernel: ffffffff8114ea2b ffffea0015745fc8 ffffea0003801180 ffffea0015745fc8
Apr 19 18:55:04 localhost kernel: <0> ffffea0003801180 ffff88061c025bc8 ffffffff8114ee75 0000000000000061
Apr 19 18:55:04 localhost kernel: <0> ffffc90000000000 ffffea0003801180 ffffea0015745fc8 ffff88061c025c08
Apr 19 18:55:04 localhost kernel: Call Trace:
Apr 19 18:55:04 localhost kernel: [<ffffffff8114ea2b>] ? migrate_page_copy+0x7b/0x200
Apr 19 18:55:04 localhost kernel: [<ffffffff8114ee75>] migrate_page+0x35/0x50
Apr 19 18:55:04 localhost kernel: [<ffffffff8114fe85>] buffer_migrate_page+0x125/0x150
Apr 19 18:55:04 localhost kernel: [<ffffffff8114f3f2>] migrate_pages+0x562/0x7d0
Apr 19 18:55:04 localhost kernel: [<ffffffff81111500>] ? lru_add_drain_per_cpu+0x0/0x10
Apr 19 18:55:04 localhost kernel: [<ffffffff81142630>] ? compaction_alloc+0x0/0x370
Apr 19 18:55:04 localhost kernel: [<ffffffff8107cc90>] ? schedule_on_each_cpu+0x150/0x180
Apr 19 18:55:04 localhost kernel: [<ffffffff81141f20>] ? compact_zone+0x270/0x530
Apr 19 18:55:04 localhost kernel: [<ffffffff811420ec>] compact_zone+0x43c/0x530
Apr 19 18:55:04 localhost kernel: [<ffffffff811422e9>] compact_node+0x109/0x140
Apr 19 18:55:04 localhost kernel: [<ffffffff811423cc>] sysctl_compaction_handler+0x5c/0x90
Apr 19 18:55:04 localhost kernel: [<ffffffff811c28f7>] proc_sys_call_handler+0x97/0xd0
Apr 19 18:55:04 localhost kernel: [<ffffffff811c2944>] proc_sys_write+0x14/0x20
Apr 19 18:55:04 localhost kernel: [<ffffffff8115c448>] vfs_write+0xc8/0x190
Apr 19 18:55:04 localhost kernel: [<ffffffff8115ce41>] sys_write+0x51/0x90
Apr 19 18:55:04 localhost kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Apr 19 18:55:04 localhost kernel: Code: 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 b9 00 02 00 00 <f3> 48 a5 c3 0f 1f 80 00 00 00 00 eb ee 0f 1f 84 00 00 00 00 00
Apr 19 18:55:04 localhost kernel: RIP [<ffffffff812ae3a5>] copy_page_c+0x5/0x10
Apr 19 18:55:04 localhost kernel: RSP <ffff88061c025b70>
Apr 19 18:55:04 localhost kernel: CR2: ffff8806213ff000
Apr 19 18:55:04 localhost kernel: ---[ end trace a512635642ce2994 ]---
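
For readers decoding the dump above: the "Oops: 0002" value is the x86 page-fault error code, and the PTE value printed on the PGD/PUD/PMD line carries the Present flag in bit 0. The following is a small standalone decoder, offered as a sketch rather than kernel code; the helper names (pf_is_write and so on) are made up for this example, while the bit positions are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code bits. */
#define PF_PROT  (1u << 0)	/* 0 = page not present, 1 = protection fault */
#define PF_WRITE (1u << 1)	/* 0 = read access, 1 = write access */
#define PF_USER  (1u << 2)	/* 0 = kernel mode, 1 = user mode */

bool pf_is_write(unsigned int code) { return (code & PF_WRITE) != 0; }
bool pf_is_user(unsigned int code) { return (code & PF_USER) != 0; }
bool pf_page_was_present(unsigned int code) { return (code & PF_PROT) != 0; }

/* Bit 0 of an x86 page table entry is the Present flag. */
bool pte_is_present(uint64_t pte) { return (pte & 1) != 0; }

/* Print a one-line summary of the three bits decoded above. */
void pf_describe(unsigned int code)
{
	printf("%s a %s page in %s mode\n",
	       pf_is_write(code) ? "write to" : "read from",
	       pf_page_was_present(code) ? "present" : "not-present",
	       pf_is_user(code) ? "user" : "kernel");
}
```

Calling pf_describe(0x0002) reports a kernel-mode write to a not-present page, and pte_is_present(0x80000006213ff160ULL) is false. Both are consistent with the destination page of the copy being unmapped from the kernel address space.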


Attachments:
myconfig (94.97 kB)

2010-04-19 18:15:06

by Mel Gorman

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
>
> mmotm 2010-04-15-14-42
>
> When I tried
> # echo 0 > /proc/sys/vm/compaction
>
> I see following.
>
> My enviroment was
> 2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> allocating tons of hugepages and reduce free memory.
>
> What I did was:
> # echo 0 > /proc/sys/vm/compact_memory
>
> Hmm, I see this kind of error at migation for the 1st time..
> my.config is attached. Hmm... ?
>
> (I'm sorry I'll be offline soon.)

That's ok, thank you for the report. I'm afraid I made little progress
as I spent most of the day on other bugs but I do have something for
you.

First, I reproduced the problem using your .config. However, the problem does
not manifest with the .config I normally use which is derived from the distro
kernel configuration (Debian Lenny). So, there is something in your .config
that triggers the problem. I very strongly suspect this is an interaction
between migration, compaction and page allocation debug. Compaction takes
pages directly off the buddy list and I bet you a shiny penny they are still
unmapped when the copy takes place resulting in your oops.

I'll verify the theory tomorrow but it's a plausible explanation. On a
different note, where did config options like the following come out of?

CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"

I don't think they are a factor but I'm curious.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2010-04-19 19:39:40

by Mel Gorman

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
> >
> > mmotm 2010-04-15-14-42
> >
> > When I tried
> > # echo 0 > /proc/sys/vm/compaction
> >
> > I see following.
> >
> > My enviroment was
> > 2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> > allocating tons of hugepages and reduce free memory.
> >
> > What I did was:
> > # echo 0 > /proc/sys/vm/compact_memory
> >
> > Hmm, I see this kind of error at migation for the 1st time..
> > my.config is attached. Hmm... ?
> >
> > (I'm sorry I'll be offline soon.)
>
> That's ok, thanks you for the report. I'm afraid I made little progress
> as I spent most of the day on other bugs but I do have something for
> you.
>
> First, I reproduced the problem using your .config. However, the problem does
> not manifest with the .config I normally use which is derived from the distro
> kernel configuration (Debian Lenny). So, there is something in your .config
> that triggers the problem. I very strongly suspect this is an interaction
> between migration, compaction and page allocation debug.

I unexpectedly had the time to dig into this. Does the following patch fix
your problem? It Worked For Me.

==== CUT HERE ====
mm,compaction: Map free pages in the address space after they get split for compaction

split_free_page() is a helper function which takes a free page from the
buddy lists and splits it into order-0 pages. It is used by memory
compaction to build a list of destination pages. If
CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
because split_free_page() did not call the arch-allocation hooks or map
the page into the kernel address space.

This patch does not update split_free_page() as it is called with
interrupts disabled. Instead it documents that callers of split_free_page()
are responsible for calling the arch hooks and mapping the pages, and it
fixes compaction accordingly.

This is a fix to the patch mm-compaction-memory-compaction-core.patch.

Signed-off-by: Mel Gorman <[email protected]>
---
mm/compaction.c | 6 ++++++
mm/page_alloc.c | 3 +++
2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 8f4c518..6218e03 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -184,6 +184,12 @@ static void isolate_freepages(struct zone *zone,
}
spin_unlock_irqrestore(&zone->lock, flags);

+ /* split_free_page does not map the pages */
+ list_for_each_entry(page, freelist, lru) {
+ arch_alloc_page(page, 0);
+ kernel_map_pages(page, 1, 1);
+ }
+
cc->free_pfn = high_pfn;
cc->nr_freepages = nr_freepages;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 53442fd..b2af4d9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1211,6 +1211,9 @@ void split_page(struct page *page, unsigned int order)
/*
* Similar to split_page except the page is already free. As this is only
* being used for migration, the migratetype of the block also changes.
+ * As this is called with interrupts disabled, the caller is responsible
+ * for calling arch_alloc_page() and kernel_map_pages() after interrupts
+ * are enabled.
*/
int split_free_page(struct page *page)
{

2010-04-20 02:34:29

by Kamezawa Hiroyuki

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Mon, 19 Apr 2010 20:39:19 +0100
Mel Gorman <[email protected]> wrote:

> On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> > On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
> > >
> > > mmotm 2010-04-15-14-42
> > >
> > > When I tried
> > > # echo 0 > /proc/sys/vm/compaction
> > >
> > > I see following.
> > >
> > > My enviroment was
> > > 2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> > > allocating tons of hugepages and reduce free memory.
> > >
> > > What I did was:
> > > # echo 0 > /proc/sys/vm/compact_memory
> > >
> > > Hmm, I see this kind of error at migation for the 1st time..
> > > my.config is attached. Hmm... ?
> > >
> > > (I'm sorry I'll be offline soon.)
> >
> > That's ok, thanks you for the report. I'm afraid I made little progress
> > as I spent most of the day on other bugs but I do have something for
> > you.
> >
> > First, I reproduced the problem using your .config. However, the problem does
> > not manifest with the .config I normally use which is derived from the distro
> > kernel configuration (Debian Lenny). So, there is something in your .config
> > that triggers the problem. I very strongly suspect this is an interaction
> > between migration, compaction and page allocation debug.
>
> I unexpecedly had the time to dig into this. Does the following patch fix
> your problem? It Worked For Me.
>
Ok, works for me, too.

Tested-by: KAMEZAWA Hiroyuki <[email protected]>

Thank you.
-Kame
> ==== CUT HERE ====
> mm,compaction: Map free pages in the address space after they get split for compaction
>
> split_free_page() is a helper function which takes a free page from the
> buddy lists and splits it into order-0 pages. It is used by memory
> compaction to build a list of destination pages. If
> CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> because split_free_page() did not call the arch-allocation hooks or map
> the page into the kernel address space.
>
> This patch does not update split_free_page() as it is called with
> interrupts held. Instead it documents that callers of split_free_page()
> are responsible for calling the arch hooks and to map the page and fixes
> compaction.
>
> This is a fix to the patch mm-compaction-memory-compaction-core.patch.
>
> Signed-off-by: Mel Gorman <[email protected]>
> ---
> mm/compaction.c | 6 ++++++
> mm/page_alloc.c | 3 +++
> 2 files changed, 9 insertions(+), 0 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 8f4c518..6218e03 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -184,6 +184,12 @@ static void isolate_freepages(struct zone *zone,
> }
> spin_unlock_irqrestore(&zone->lock, flags);
>
> + /* split_free_page does not map the pages */
> + list_for_each_entry(page, freelist, lru) {
> + arch_alloc_page(page, 0);
> + kernel_map_pages(page, 1, 1);
> + }
> +
> cc->free_pfn = high_pfn;
> cc->nr_freepages = nr_freepages;
> }
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 53442fd..b2af4d9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1211,6 +1211,9 @@ void split_page(struct page *page, unsigned int order)
> /*
> * Similar to split_page except the page is already free. As this is only
> * being used for migration, the migratetype of the block also changes.
> + * As this is called with interrupts disabled, the caller is responsible
> + * for calling arch_alloc_page() and kernel_map_page() after interrupts
> + * are enabled.
> */
> int split_free_page(struct page *page)
> {

2010-04-20 02:39:38

by Kamezawa Hiroyuki

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Mon, 19 Apr 2010 19:14:42 +0100
Mel Gorman <[email protected]> wrote:

> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:

> I'll verify the theory tomorrow but it's a plausible explanation. On a
> different note, where did config options like the following come out of?
>
> CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
>
> I don't think they are a factor but I'm curious.
>

Hmm? They come from arch/x86/Kconfig:

config ARCH_HWEIGHT_CFLAGS
string
default "-fcall-saved-ecx -fcall-saved-edx" if X86_32
default "-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11" if X86_64


Seems to be from
patches/x86-add-optimized-popcnt-variants.patch

Thanks,
-Kame

2010-04-20 02:39:49

by Minchan Kim

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
> On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
>> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
>> >
>> > mmotm 2010-04-15-14-42
>> >
>> > When I tried
>> >  # echo 0 > /proc/sys/vm/compaction
>> >
>> > I see following.
>> >
>> > My enviroment was
>> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
>> >   allocating tons of hugepages and reduce free memory.
>> >
>> > What I did was:
>> >   # echo 0 > /proc/sys/vm/compact_memory
>> >
>> > Hmm, I see this kind of error at migation for the 1st time..
>> > my.config is attached. Hmm... ?
>> >
>> > (I'm sorry I'll be offline soon.)
>>
>> That's ok, thanks you for the report. I'm afraid I made little progress
>> as I spent most of the day on other bugs but I do have something for
>> you.
>>
>> First, I reproduced the problem using your .config. However, the problem does
>> not manifest with the .config I normally use which is derived from the distro
>> kernel configuration (Debian Lenny). So, there is something in your .config
>> that triggers the problem. I very strongly suspect this is an interaction
>> between migration, compaction and page allocation debug.
>
> I unexpecedly had the time to dig into this. Does the following patch fix
> your problem? It Worked For Me.

Nice catch in such a short time. My comments are below.

>
> ==== CUT HERE ====
> mm,compaction: Map free pages in the address space after they get split for compaction
>
> split_free_page() is a helper function which takes a free page from the
> buddy lists and splits it into order-0 pages. It is used by memory
> compaction to build a list of destination pages. If
> CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> because split_free_page() did not call the arch-allocation hooks or map
> the page into the kernel address space.
>
> This patch does not update split_free_page() as it is called with
> interrupts held. Instead it documents that callers of split_free_page()
> are responsible for calling the arch hooks and to map the page and fixes
> compaction.

Dumb question: why can't we call arch_alloc_page() and kernel_map_pages()
with interrupts disabled? Is it a deadlock issue or a latency issue?
I couldn't find any comment about it.
A comment should be added around those functions. :)

Also, compaction is currently the only user of split_free_page(), yet it is
exposed by mm.h. I think it would be better to map the pages inside
split_free_page() so that it can be exported to other users (i.e., made a
generic function).
If we can't do that, how about making split_free_page() a static function
and using it only in compaction?

--
Kind regards,
Minchan Kim

2010-04-20 03:11:50

by Kamezawa Hiroyuki

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, 20 Apr 2010 11:39:46 +0900
Minchan Kim <[email protected]> wrote:

> On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> >> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
> >> >
> >> > mmotm 2010-04-15-14-42
> >> >
> >> > When I tried
> >> >  # echo 0 > /proc/sys/vm/compaction
> >> >
> >> > I see following.
> >> >
> >> > My enviroment was
> >> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> >> >   allocating tons of hugepages and reduce free memory.
> >> >
> >> > What I did was:
> >> >   # echo 0 > /proc/sys/vm/compact_memory
> >> >
> >> > Hmm, I see this kind of error at migation for the 1st time..
> >> > my.config is attached. Hmm... ?
> >> >
> >> > (I'm sorry I'll be offline soon.)
> >>
> >> That's ok, thanks you for the report. I'm afraid I made little progress
> >> as I spent most of the day on other bugs but I do have something for
> >> you.
> >>
> >> First, I reproduced the problem using your .config. However, the problem does
> >> not manifest with the .config I normally use which is derived from the distro
> >> kernel configuration (Debian Lenny). So, there is something in your .config
> >> that triggers the problem. I very strongly suspect this is an interaction
> >> between migration, compaction and page allocation debug.
> >
> > I unexpecedly had the time to dig into this. Does the following patch fix
> > your problem? It Worked For Me.
>
> Nice catch during shot time. Below is comment.
>
> >
> > ==== CUT HERE ====
> > mm,compaction: Map free pages in the address space after they get split for compaction
> >
> > split_free_page() is a helper function which takes a free page from the
> > buddy lists and splits it into order-0 pages. It is used by memory
> > compaction to build a list of destination pages. If
> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> > because split_free_page() did not call the arch-allocation hooks or map
> > the page into the kernel address space.
> >
> > This patch does not update split_free_page() as it is called with
> > interrupts held. Instead it documents that callers of split_free_page()
> > are responsible for calling the arch hooks and to map the page and fixes
> > compaction.
>
> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
> as interrupt disabled? It's deadlock issue or latency issue?
> I don't found any comment about it.
> It should have added the comment around that functions. :)
>

I guess it's for the same reason as vfree(), which can't be called with
irqs disabled.

Both of them have to flush the TLB of all cpus. To flush the TLB of other cpus, a cpu has
to send an IPI via smp_call_function. What I know from old stories is below.

When sending an IPI, the usual sequence is the following. (This may be outdated.)

spin_lock(&ipi_lock);
set up the cpu mask for getting notification from the other cpus declaring
"I received the IPI and finished my own work".
spin_unlock(&ipi_lock);

Then,

CPU0                          CPU1

irq_disable (somewhere)       spin_lock
                              send IPI and wait for notification.
spin_lock()

deadlock. The description of kernel/smp.c::smp_call_function_many() says
this function should not be called with irqs disabled.
(Maybe the same kind of spin-wait deadlock can happen.)
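
The two-cpu scenario above can be stepped through with a toy model. This is a deterministic simulation written for illustration only: struct demo_state and demo_run() are made-up names, the model has exactly one lock and one pending IPI, and it says nothing about how the real smp_call_function code is implemented. CPU1 starts out holding the lock and waiting for CPU0 to acknowledge an IPI; CPU0 is spinning on the same lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state of the two cpus in the scenario described above. */
struct demo_state {
	bool lock_held_by_cpu1;		/* CPU1 did spin_lock() */
	bool ipi_pending_on_cpu0;	/* CPU1 sent an IPI and waits for the ack */
	bool cpu0_acked;		/* CPU0 ran its IPI handler */
	bool cpu0_irqs_disabled;	/* CPU0 did irq_disable somewhere */
};

/*
 * Returns true if CPU0 eventually gets the lock within max_steps,
 * false if neither cpu can make progress (the deadlock).
 */
bool demo_run(bool cpu0_irqs_disabled, int max_steps)
{
	struct demo_state s = {
		.lock_held_by_cpu1 = true,
		.ipi_pending_on_cpu0 = true,
		.cpu0_acked = false,
		.cpu0_irqs_disabled = cpu0_irqs_disabled,
	};
	int i;

	assert(max_steps > 0);
	for (i = 0; i < max_steps; i++) {
		/* CPU0 can take the interrupt only while irqs are enabled. */
		if (s.ipi_pending_on_cpu0 && !s.cpu0_irqs_disabled) {
			s.ipi_pending_on_cpu0 = false;
			s.cpu0_acked = true;
		}
		/* CPU1 drops the lock once the acknowledgement has arrived. */
		if (s.lock_held_by_cpu1 && s.cpu0_acked)
			s.lock_held_by_cpu1 = false;
		/* CPU0 is spinning; it succeeds as soon as the lock is free. */
		if (!s.lock_held_by_cpu1)
			return true;
	}
	return false;
}
```

With irqs disabled on CPU0 the model never leaves its initial state: CPU0 cannot acknowledge the IPI while it spins, and CPU1 cannot release the lock until it is acknowledged. With irqs enabled, the same sequence completes.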


Thanks,
-Kame

2010-04-20 03:58:45

by Minchan Kim

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 12:07 PM, KAMEZAWA Hiroyuki
<[email protected]> wrote:
> On Tue, 20 Apr 2010 11:39:46 +0900
> Minchan Kim <[email protected]> wrote:
>
>> On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
>> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
>> >> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
>> >> >
>> >> > mmotm 2010-04-15-14-42
>> >> >
>> >> > When I tried
>> >> >  # echo 0 > /proc/sys/vm/compaction
>> >> >
>> >> > I see following.
>> >> >
>> >> > My enviroment was
>> >> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
>> >> >   allocating tons of hugepages and reduce free memory.
>> >> >
>> >> > What I did was:
>> >> >   # echo 0 > /proc/sys/vm/compact_memory
>> >> >
>> >> > Hmm, I see this kind of error at migation for the 1st time..
>> >> > my.config is attached. Hmm... ?
>> >> >
>> >> > (I'm sorry I'll be offline soon.)
>> >>
>> >> That's ok, thanks you for the report. I'm afraid I made little progress
>> >> as I spent most of the day on other bugs but I do have something for
>> >> you.
>> >>
>> >> First, I reproduced the problem using your .config. However, the problem does
>> >> not manifest with the .config I normally use which is derived from the distro
>> >> kernel configuration (Debian Lenny). So, there is something in your .config
>> >> that triggers the problem. I very strongly suspect this is an interaction
>> >> between migration, compaction and page allocation debug.
>> >
>> > I unexpecedly had the time to dig into this. Does the following patch fix
>> > your problem? It Worked For Me.
>>
>> Nice catch during shot time. Below is comment.
>>
>> >
>> > ==== CUT HERE ====
>> > mm,compaction: Map free pages in the address space after they get split for compaction
>> >
>> > split_free_page() is a helper function which takes a free page from the
>> > buddy lists and splits it into order-0 pages. It is used by memory
>> > compaction to build a list of destination pages. If
>> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
>> > because split_free_page() did not call the arch-allocation hooks or map
>> > the page into the kernel address space.
>> >
>> > This patch does not update split_free_page() as it is called with
>> > interrupts held. Instead it documents that callers of split_free_page()
>> > are responsible for calling the arch hooks and to map the page and fixes
>> > compaction.
>>
>> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
>> as interrupt disabled? It's deadlock issue or latency issue?
>> I don't found any comment about it.
>> It should have added the comment around that functions. :)
>>
>
> I guess it's from the same reason as vfree(), which can't be called under
> irq-disabled.
>
> Both of them has to flush TLB of all cpus. At flushing TLB (of other cpus), cpus has
> to send IPI via smp_call_function. What I know from old stories is below.
>
> At sendinf IPI, usual sequence is following. (This may be old.)
>
>        spin_lock(&ipi_lock);
>                set up cpu mask for getting notification from other cpu for declearing
>                "I received IPI and finished my own work".
>        spin_unlock(&ipi_lock);
>
> Then,
>          CPU0                             CPU1
>
>    irq_disable (somewhere)             spin_lock
>                                        send IPI and wait for notification.
>    spin_lock()
>
> deadlock.  Seeing decription of kernel/smp.c::smp_call_function_many(), it says
> this function should not be called under irq-disabled.
> (Maybe the same kind of spin-wait deadlock can happen.)
>

Thanks for the kind explanation.
Actually I guessed it was a TLB issue, but I can't find any glue point which
connects the TLB flush to smp_call_function_xxx. :(

Now look at __native_flush_tlb_global.
It just reads and writes cr4 with X86_CR4_PGE masked off.
So I don't know how this connects to smp_schedule_xxxx.
Hmm, maybe the APIC?

Sorry for the dumb question.



--
Kind regards,
Minchan Kim

2010-04-20 04:28:33

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, 20 Apr 2010 12:58:43 +0900
Minchan Kim <[email protected]> wrote:

> On Tue, Apr 20, 2010 at 12:07 PM, KAMEZAWA Hiroyuki
> <[email protected]> wrote:
> > On Tue, 20 Apr 2010 11:39:46 +0900
> >> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
> >> as interrupt disabled? It's deadlock issue or latency issue?
> >> I don't found any comment about it.
> >> It should have added the comment around that functions. :)
> >>
> >
> > I guess it's from the same reason as vfree(), which can't be called under
> > irq-disabled.
> >
> > Both of them has to flush TLB of all cpus. At flushing TLB (of other cpus), cpus has
> > to send IPI via smp_call_function. What I know from old stories is below.
> >
> > At sendinf IPI, usual sequence is following. (This may be old.)
> >
> >        spin_lock(&ipi_lock);
> >                set up cpu mask for getting notification from other cpu for declearing
> >                "I received IPI and finished my own work".
> >        spin_unlock(&ipi_lock);
> >
> > Then,
> >          CPU0                             CPU1
> >
> >    irq_disable (somewhere)             spin_lock
> >                                        send IPI and wait for notification.
> >    spin_lock()
> >
> > deadlock.  Seeing decription of kernel/smp.c::smp_call_function_many(), it says
> > this function should not be called under irq-disabled.
> > (Maybe the same kind of spin-wait deadlock can happen.)
> >
>
> Thanks for kind explanation.
> Actually I guessed TLB issue but I can't find any glue point which
> connect tlb flush to smp_call_function_xxx. :(
>
> Now look at the __native_flush_tlb_global.
> It just read and write cr4 with just mask off X86_CR4_PGE.
> So i don't know how connect this and smp_schedule_xxxx.
> Hmm,, maybe APIC?
>
> Sorry for dumb question.
>
Hmm...looking again,

arch/x86/mm/pageattr.c::kernel_map_pages() says:

	/*
	 * We should perform an IPI and flush all tlbs,
	 * but that can deadlock->flush only current cpu:
	 */

Wow. It flushes only the local CPU. Then, no IPI.

Hmm...do all other archs do the same thing? If so, kernel_map_pages()
can be called with IRQs disabled. The author of kernel_map_pages()
is aware that this can be called with IRQs disabled.

Hmm...

Thanks,
-Kame





2010-04-20 08:21:23

by Mel Gorman

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 11:39:46AM +0900, Minchan Kim wrote:
> On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> >> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
> >> >
> >> > mmotm 2010-04-15-14-42
> >> >
> >> > When I tried
> >> >  # echo 0 > /proc/sys/vm/compaction
> >> >
> >> > I see following.
> >> >
> >> > My enviroment was
> >> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> >> >   allocating tons of hugepages and reduce free memory.
> >> >
> >> > What I did was:
> >> >   # echo 0 > /proc/sys/vm/compact_memory
> >> >
> >> > Hmm, I see this kind of error at migation for the 1st time..
> >> > my.config is attached. Hmm... ?
> >> >
> >> > (I'm sorry I'll be offline soon.)
> >>
> >> That's ok, thanks you for the report. I'm afraid I made little progress
> >> as I spent most of the day on other bugs but I do have something for
> >> you.
> >>
> >> First, I reproduced the problem using your .config. However, the problem does
> >> not manifest with the .config I normally use which is derived from the distro
> >> kernel configuration (Debian Lenny). So, there is something in your .config
> >> that triggers the problem. I very strongly suspect this is an interaction
> >> between migration, compaction and page allocation debug.
> >
> > I unexpecedly had the time to dig into this. Does the following patch fix
> > your problem? It Worked For Me.
>
> Nice catch during shot time. Below is comment.
>
> >
> > ==== CUT HERE ====
> > mm,compaction: Map free pages in the address space after they get split for compaction
> >
> > split_free_page() is a helper function which takes a free page from the
> > buddy lists and splits it into order-0 pages. It is used by memory
> > compaction to build a list of destination pages. If
> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> > because split_free_page() did not call the arch-allocation hooks or map
> > the page into the kernel address space.
> >
> > This patch does not update split_free_page() as it is called with
> > interrupts held. Instead it documents that callers of split_free_page()
> > are responsible for calling the arch hooks and to map the page and fixes
> > compaction.
>
> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
> as interrupt disabled?

In theory, it isn't known what arch_alloc_page is going to do but more
practically kernel_map_pages() is updating mappings and should be
flushing all the TLBs. It can't do that with interrupts disabled.

I checked X86 and it should be fine but only because it flushes the
local CPU and appears to just hope for the best that this doesn't cause
problems.

> It's deadlock issue or latency issue?

deadlock

> I don't found any comment about it.

I'm not aware of any. arch_alloc_page() is only used by s390 so it's not
well known. kernel_map_pages() is only active for a rarely used
debugging option.

> It should have added the comment around that functions. :)
>
> And now compaction only uses split_free_page and it is exposed by mm.h.
> I think it would be better to map pages inside split_free_page to
> export others.(ie, making generic function).

I considered that and it would not be ideal. It would have to disable and
reenable interrupts as each page is taken from the list or alternatively
require that the caller not have the zone lock taken. The latter of these
options is more reasonable but would still result in more interrupt enabling
and disabling.

split_free_page() is extremely specialised and requires knowledge of the
page allocator internals to call properly. There is little pressure to
make this easier to use at the cost of increased locking.
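
The caller-side pattern described here can be sketched in userspace C. This is
only a model under stated assumptions: model_lock_irqsave() and
model_unlock_irqrestore() stand in for taking and releasing the zone lock with
interrupts disabled, model_map_page() stands in for the arch_alloc_page() and
kernel_map_pages() calls, and all model_* names are hypothetical. It is not the
patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: a mapped flag and a list link. */
struct model_page {
	bool mapped;
	struct model_page *next;
};

/* Models "zone lock held, interrupts disabled". */
static bool model_irqs_disabled;

static inline void model_lock_irqsave(void)      { model_irqs_disabled = true; }
static inline void model_unlock_irqrestore(void) { model_irqs_disabled = false; }

/* Stand-in for the mapping hooks; must never run with IRQs disabled. */
static inline void model_map_page(struct model_page *page)
{
	assert(!model_irqs_disabled);
	page->mapped = true;
}

/*
 * Phase 1: with the lock held, only move pages onto the caller's list.
 * Phase 2: after dropping the lock once, map every collected page.
 * Returns the number of pages collected.
 */
static inline size_t model_isolate_and_map(struct model_page *pages, size_t n,
					   struct model_page **list)
{
	struct model_page *p;
	size_t i, taken = 0;

	model_lock_irqsave();
	for (i = 0; i < n; i++) {
		pages[i].next = *list;
		*list = &pages[i];
		taken++;
	}
	model_unlock_irqrestore();

	for (p = *list; p != NULL; p = p->next)
		model_map_page(p);

	return taken;
}
```

The point of the split is that interrupts are disabled and enabled once per
batch rather than once per page, which is the cost the mapping-inside-helper
alternative would add.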

> If we can't do, how about making split_free_page static as static function?
> And only uses it in compaction.
>

It pretty much has to be in page_alloc.c because it uses internal
functions of the page allocator - e.g. rmv_page_order. I could move it
to mm/internal.h because whatever about split_page, I can't imagine why
anyone else would need to call split_free_page.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2010-04-20 08:32:16

by Minchan Kim

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 5:20 PM, Mel Gorman <[email protected]> wrote:
> On Tue, Apr 20, 2010 at 11:39:46AM +0900, Minchan Kim wrote:
>> On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
>> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
>> >> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
>> >> >
>> >> > mmotm 2010-04-15-14-42
>> >> >
>> >> > When I tried
>> >> >  # echo 0 > /proc/sys/vm/compaction
>> >> >
>> >> > I see following.
>> >> >
>> >> > My enviroment was
>> >> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
>> >> >   allocating tons of hugepages and reduce free memory.
>> >> >
>> >> > What I did was:
>> >> >   # echo 0 > /proc/sys/vm/compact_memory
>> >> >
>> >> > Hmm, I see this kind of error at migation for the 1st time..
>> >> > my.config is attached. Hmm... ?
>> >> >
>> >> > (I'm sorry I'll be offline soon.)
>> >>
>> >> That's ok, thanks you for the report. I'm afraid I made little progress
>> >> as I spent most of the day on other bugs but I do have something for
>> >> you.
>> >>
>> >> First, I reproduced the problem using your .config. However, the problem does
>> >> not manifest with the .config I normally use which is derived from the distro
>> >> kernel configuration (Debian Lenny). So, there is something in your .config
>> >> that triggers the problem. I very strongly suspect this is an interaction
>> >> between migration, compaction and page allocation debug.
>> >
>> > I unexpecedly had the time to dig into this. Does the following patch fix
>> > your problem? It Worked For Me.
>>
>> Nice catch during shot time. Below is comment.
>>
>> >
>> > ==== CUT HERE ====
>> > mm,compaction: Map free pages in the address space after they get split for compaction
>> >
>> > split_free_page() is a helper function which takes a free page from the
>> > buddy lists and splits it into order-0 pages. It is used by memory
>> > compaction to build a list of destination pages. If
>> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
>> > because split_free_page() did not call the arch-allocation hooks or map
>> > the page into the kernel address space.
>> >
>> > This patch does not update split_free_page() as it is called with
>> > interrupts held. Instead it documents that callers of split_free_page()
>> > are responsible for calling the arch hooks and to map the page and fixes
>> > compaction.
>>
>> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
>> as interrupt disabled?
>
> In theory, it isn't known what arch_alloc_page is going to do but more
> practically kernel_map_pages() is updating mappings and should be
> flushing all the TLBs. It can't do that with interrupts disabled.
>
> I checked X86 and it should be fine but only because it flushes the
> local CPU and appears to just hope for the best that this doesn't cause
> problems.

Okay.

>> And now compaction only uses split_free_page and it is exposed by mm.h.
>> I think it would be better to map pages inside split_free_page to
>> export others.(ie, making generic function).
>
> I considered that and it would not be ideal. It would have to disable and
> reenable interrupts as each page is taken from the list or alternatively
> require that the caller not have the zone lock taken. The latter of these
> options is more reasonable but would still result in more interrupt enabling
> and disabling.
>
> split_free_page() is extremely specialised and requires knowledge of the
> page allocator internals to call properly. There is little pressure to
> make this easier to use at the cost of increased locking.
>
>> If we can't do, how about making split_free_page static as static function?
>> And only uses it in compaction.
>>
>
> It pretty much has to be in page_alloc.c because it uses internal
> functions of the page allocator - e.g. rmv_page_order. I could move it
> to mm/internal.h because whatever about split_page, I can't imagine why
> anyone else would need to call split_free_page.

Yes. Then let's add a comment like the one on split_page. :)
/*
 * Note: this is probably too low level an operation for use in drivers.
 * Please consult with lkml before using this in your driver.
 */


>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
>



--
Kind regards,
Minchan Kim

2010-04-20 08:45:16

by Mel Gorman

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 05:32:13PM +0900, Minchan Kim wrote:
> On Tue, Apr 20, 2010 at 5:20 PM, Mel Gorman <[email protected]> wrote:
> > On Tue, Apr 20, 2010 at 11:39:46AM +0900, Minchan Kim wrote:
> >> On Tue, Apr 20, 2010 at 4:39 AM, Mel Gorman <[email protected]> wrote:
> >> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> >> >> On Mon, Apr 19, 2010 at 07:01:33PM +0900, KAMEZAWA Hiroyuki wrote:
> >> >> >
> >> >> > mmotm 2010-04-15-14-42
> >> >> >
> >> >> > When I tried
> >> >> >  # echo 0 > /proc/sys/vm/compaction
> >> >> >
> >> >> > I see following.
> >> >> >
> >> >> > My enviroment was
> >> >> >   2.6.34-rc4-mm1+ (2010-04-15-14-42) (x86-64) CPUx8
> >> >> >   allocating tons of hugepages and reduce free memory.
> >> >> >
> >> >> > What I did was:
> >> >> >   # echo 0 > /proc/sys/vm/compact_memory
> >> >> >
> >> >> > Hmm, I see this kind of error at migation for the 1st time..
> >> >> > my.config is attached. Hmm... ?
> >> >> >
> >> >> > (I'm sorry I'll be offline soon.)
> >> >>
> >> >> That's ok, thanks you for the report. I'm afraid I made little progress
> >> >> as I spent most of the day on other bugs but I do have something for
> >> >> you.
> >> >>
> >> >> First, I reproduced the problem using your .config. However, the problem does
> >> >> not manifest with the .config I normally use which is derived from the distro
> >> >> kernel configuration (Debian Lenny). So, there is something in your .config
> >> >> that triggers the problem. I very strongly suspect this is an interaction
> >> >> between migration, compaction and page allocation debug.
> >> >
> >> > I unexpecedly had the time to dig into this. Does the following patch fix
> >> > your problem? It Worked For Me.
> >>
> >> Nice catch during shot time. Below is comment.
> >>
> >> >
> >> > ==== CUT HERE ====
> >> > mm,compaction: Map free pages in the address space after they get split for compaction
> >> >
> >> > split_free_page() is a helper function which takes a free page from the
> >> > buddy lists and splits it into order-0 pages. It is used by memory
> >> > compaction to build a list of destination pages. If
> >> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> >> > because split_free_page() did not call the arch-allocation hooks or map
> >> > the page into the kernel address space.
> >> >
> >> > This patch does not update split_free_page() as it is called with
> >> > interrupts held. Instead it documents that callers of split_free_page()
> >> > are responsible for calling the arch hooks and to map the page and fixes
> >> > compaction.
> >>
> >> Dumb question. Why can't we call arch_alloc_page and kernel_map_pages
> >> as interrupt disabled?
> >
> > In theory, it isn't known what arch_alloc_page is going to do but more
> > practically kernel_map_pages() is updating mappings and should be
> > flushing all the TLBs. It can't do that with interrupts disabled.
> >
> > I checked X86 and it should be fine but only because it flushes the
> > local CPU and appears to just hope for the best that this doesn't cause
> > problems.
>
> Okay.
>
> >> And now compaction only uses split_free_page and it is exposed by mm.h.
> >> I think it would be better to map pages inside split_free_page to
> >> export others.(ie, making generic function).
> >
> > I considered that and it would not be ideal. It would have to disable and
> > reenable interrupts as each page is taken from the list or alternatively
> > require that the caller not have the zone lock taken. The latter of these
> > options is more reasonable but would still result in more interrupt enabling
> > and disabling.
> >
> > split_free_page() is extremely specialised and requires knowledge of the
> > page allocator internals to call properly. There is little pressure to
> > make this easier to use at the cost of increased locking.
> >
> >> If we can't do, how about making split_free_page static as static function?
> >> And only uses it in compaction.
> >>
> >
> > It pretty much has to be in page_alloc.c because it uses internal
> > functions of the page allocator - e.g. rmv_page_order. I could move it
> > to mm/internal.h because whatever about split_page, I can't imagine why
> > anyone else would need to call split_free_page.
>
> Yes. Then, Let's add comment like split_page. :)
> /*
> * Note: this is probably too low level an operation for use in drivers.
> * Please consult with lkml before using this in your driver.
> */
>

I can, but the comment that was there says it's like split_page except the
page is already free. This also covers not using it in a driver.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2010-04-20 09:50:26

by Minchan Kim

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 5:44 PM, Mel Gorman <[email protected]> wrote:
> On Tue, Apr 20, 2010 at 05:32:13PM +0900, Minchan Kim wrote:
>>
>> Yes. Then, Let's add comment like split_page. :)
>>  /*
>>  * Note: this is probably too low level an operation for use in drivers.
>>  * Please consult with lkml before using this in your driver.
>>  */
>>
>
> I can, but the comment that was there says it's like split_page except the
> page is already free. This also covers not using it in a driver.

I see. In addition, you already mentioned "As this is only being used
for migration".
I missed that.
I don't have anything against it.
Will you repost a v2 which moves split_free_pages out of compaction.c?
Anyway, feel free to add my Reviewed-by tag.
Thanks, Mel.

Reviewed-by: Minchan Kim <[email protected]>

>
> --
> Mel Gorman
> Part-time Phd Student                          Linux Technology Center
> University of Limerick                         IBM Dublin Software Lab
>



--
Kind regards,
Minchan Kim

2010-04-20 09:59:15

by Mel Gorman

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 06:50:23PM +0900, Minchan Kim wrote:
> On Tue, Apr 20, 2010 at 5:44 PM, Mel Gorman <[email protected]> wrote:
> > On Tue, Apr 20, 2010 at 05:32:13PM +0900, Minchan Kim wrote:
> >>
> >> Yes. Then, Let's add comment like split_page. :)
> >>  /*
> >>  * Note: this is probably too low level an operation for use in drivers.
> >>  * Please consult with lkml before using this in your driver.
> >>  */
> >>
> >
> > I can, but the comment that was there says it's like split_page except the
> > page is already free. This also covers not using it in a driver.
>
> I see. In addition, you already mentioned "As this is only being used
> for migration".
> I missed one.
> I don't have any against one.
> Will you repost v2 which move split_free_pages out of compaction.c?

I don't understand your suggestion. split_free_pages is already out of
compaction.c.

> Anyway, feel free to add my reviewed-by sign.
> Thanks, Mel.
>
> Reviewed-by: Minchan Kim <[email protected]>
>

Thanks

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2010-04-20 10:14:10

by Minchan Kim

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Tue, Apr 20, 2010 at 6:58 PM, Mel Gorman <[email protected]> wrote:
> On Tue, Apr 20, 2010 at 06:50:23PM +0900, Minchan Kim wrote:
>> On Tue, Apr 20, 2010 at 5:44 PM, Mel Gorman <[email protected]> wrote:
>> > On Tue, Apr 20, 2010 at 05:32:13PM +0900, Minchan Kim wrote:
>> >>
>> >> Yes. Then, Let's add comment like split_page. :)
>> >>  /*
>> >>  * Note: this is probably too low level an operation for use in drivers.
>> >>  * Please consult with lkml before using this in your driver.
>> >>  */
>> >>
>> >
>> > I can, but the comment that was there says it's like split_page except the
>> > page is already free. This also covers not using it in a driver.
>>
>> I see. In addition, you already mentioned "As this is only being used
>> for migration".
>> I missed one.
>> I don't have any against one.
>> Will you repost v2 which move split_free_pages out of compaction.c?
>
> I don't understand your suggestion. split_free_pages is already out of
> compaction.c.

Ahh. Sorry. It's my fault. I was confused. Please forget it.
--
Kind regards,
Minchan Kim

2010-04-21 08:32:38

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Mon, 19 Apr 2010 20:39:19 +0100
Mel Gorman <[email protected]> wrote:

> On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> ==== CUT HERE ====
> mm,compaction: Map free pages in the address space after they get split for compaction
>
> split_free_page() is a helper function which takes a free page from the
> buddy lists and splits it into order-0 pages. It is used by memory
> compaction to build a list of destination pages. If
> CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> because split_free_page() did not call the arch-allocation hooks or map
> the page into the kernel address space.
>
> This patch does not update split_free_page() as it is called with
> interrupts held. Instead it documents that callers of split_free_page()
> are responsible for calling the arch hooks and to map the page and fixes
> compaction.
>
> This is a fix to the patch mm-compaction-memory-compaction-core.patch.
>
> Signed-off-by: Mel Gorman <[email protected]>

Sorry, I think I hit another(?) error. (Sorry, no log.)
What I did was...
Running 2 shells:
	while true; do make -j 16; make clean; done
and
	while true; do echo 0 > /proc/sys/vm/compact_memory; done


Using the same config.

Apr 21 17:27:47 localhost kernel: ------------[ cut here ]------------
Apr 21 17:27:47 localhost kernel: kernel BUG at include/linux/swapops.h:105!
Apr 21 17:27:47 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
Apr 21 17:27:47 localhost kernel: last sysfs file: /sys/devices/virtual/net/br0/statistics/collisions
Apr 21 17:27:47 localhost kernel: CPU 3
Apr 21 17:27:47 localhost kernel: Modules linked in: fuse sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput ioatdma ppdev parport_pc i5000_edac bnx2 iTCO_wdt edac_core iTCO_vendor_support shpchp parport e1000e kvm_intel dca kvm i2c_i801 i2c_core i5k_amb pcspkr megaraid_sas [last unloaded: microcode]
Apr 21 17:27:47 localhost kernel:
Apr 21 17:27:47 localhost kernel: Pid: 27892, comm: cc1 Tainted: G W 2.6.34-rc4-mm1+ #4 D2519/PRIMERGY
Apr 21 17:27:47 localhost kernel: RIP: 0010:[<ffffffff8114e9cf>] [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
Apr 21 17:27:47 localhost kernel: RSP: 0000:ffff88008d9efe08 EFLAGS: 00010246
Apr 21 17:27:47 localhost kernel: RAX: ffffea0000000000 RBX: ffffea0000241100 RCX: 0000000000000001
Apr 21 17:27:47 localhost kernel: RDX: 000000000000a4e0 RSI: ffff880621a4ab00 RDI: 000000000149c03e
Apr 21 17:27:47 localhost kernel: RBP: ffff88008d9efe38 R08: 0000000000000000 R09: 0000000000000000
Apr 21 17:27:47 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880621a4aae8
Apr 21 17:27:47 localhost kernel: R13: 00000000bf811000 R14: 000000000149c03e R15: 0000000000000000
Apr 21 17:27:47 localhost kernel: FS: 00007fe6abc90700(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
Apr 21 17:27:47 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 21 17:27:47 localhost kernel: CR2: 00007fe6a37279a0 CR3: 000000008d942000 CR4: 00000000000006e0
Apr 21 17:27:47 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 21 17:27:47 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Apr 21 17:27:47 localhost kernel: Process cc1 (pid: 27892, threadinfo ffff88008d9ee000, task ffff8800b23ec820)
Apr 21 17:27:47 localhost kernel: Stack:
Apr 21 17:27:47 localhost kernel: ffffea000101aee8 ffff880621a4aae8 ffff88008d9efe38 00007fe6a37279a0
Apr 21 17:27:47 localhost kernel: <0> ffff8805d9706d90 ffff880621a4aa00 ffff88008d9efef8 ffffffff81126d05
Apr 21 17:27:47 localhost kernel: <0> ffff88008d9efec8 0000000000000246 0000000000000000 ffffffff81586533
Apr 21 17:27:47 localhost kernel: Call Trace:
Apr 21 17:27:47 localhost kernel: [<ffffffff81126d05>] handle_mm_fault+0x995/0x9b0
Apr 21 17:27:47 localhost kernel: [<ffffffff81586533>] ? do_page_fault+0x103/0x330
Apr 21 17:27:47 localhost kernel: [<ffffffff8104bf40>] ? finish_task_switch+0x0/0xf0
Apr 21 17:27:47 localhost kernel: [<ffffffff8158659e>] do_page_fault+0x16e/0x330
Apr 21 17:27:47 localhost kernel: [<ffffffff81582f35>] page_fault+0x25/0x30
Apr 21 17:27:47 localhost kernel: Code: 53 08 85 c9 0f 84 32 ff ff ff 8d 41 01 89 4d d8 89 45 d4 8b 75 d4 8b 45 d8 f0 0f b1 32 89 45 dc 8b 45 dc 39 c8 74 aa 89 c1 eb d7 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
Apr 21 17:27:47 localhost kernel: RIP [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
Apr 21 17:27:47 localhost kernel: RSP <ffff88008d9efe08>
Apr 21 17:27:47 localhost kernel: ---[ end trace 4860ab585c1fcddb ]---

Regards,
-Kame

2010-04-21 09:52:06

by Kamezawa Hiroyuki

[permalink] [raw]
Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Wed, 21 Apr 2010 17:28:38 +0900
KAMEZAWA Hiroyuki <[email protected]> wrote:

> On Mon, 19 Apr 2010 20:39:19 +0100
> Mel Gorman <[email protected]> wrote:
>
> > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> > ==== CUT HERE ====
> > mm,compaction: Map free pages in the address space after they get split for compaction
> >
> > split_free_page() is a helper function which takes a free page from the
> > buddy lists and splits it into order-0 pages. It is used by memory
> > compaction to build a list of destination pages. If
> > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> > because split_free_page() did not call the arch-allocation hooks or map
> > the page into the kernel address space.
> >
> > This patch does not update split_free_page() as it is called with
> > interrupts held. Instead it documents that callers of split_free_page()
> > are responsible for calling the arch hooks and to map the page and fixes
> > compaction.
> >
> > This is a fix to the patch mm-compaction-memory-compaction-core.patch.
> >
> > Signed-off-by: Mel Gorman <[email protected]>
>
> Sorry, I think I hit another? error again. (sorry, no log.)
> What I did was...
> Running 2 shells.
> while true; do make -j 16;make cleanl;done
> and
> while true; do echo 0 > /proc/sys/vm/compact_memory;done
>
>
> Using the same config.
>
> Apr 21 17:27:47 localhost kernel: ------------[ cut here ]------------
> Apr 21 17:27:47 localhost kernel: kernel BUG at include/linux/swapops.h:105!
> Apr 21 17:27:47 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> Apr 21 17:27:47 localhost kernel: last sysfs file: /sys/devices/virtual/net/br0/statistics/collisions
> Apr 21 17:27:47 localhost kernel: CPU 3
> Apr 21 17:27:47 localhost kernel: Modules linked in: fuse sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput ioatdma ppdev parport_pc i5000_edac bnx2 iTCO_wdt edac_core iTCO_vendor_support shpchp parport e1000e kvm_intel dca kvm i2c_i801 i2c_core i5k_amb pcspkr megaraid_sas [last unloaded: microcode]
> Apr 21 17:27:47 localhost kernel:
> Apr 21 17:27:47 localhost kernel: Pid: 27892, comm: cc1 Tainted: G W 2.6.34-rc4-mm1+ #4 D2519/PRIMERGY
> Apr 21 17:27:47 localhost kernel: RIP: 0010:[<ffffffff8114e9cf>] [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> Apr 21 17:27:47 localhost kernel: RSP: 0000:ffff88008d9efe08 EFLAGS: 00010246
> Apr 21 17:27:47 localhost kernel: RAX: ffffea0000000000 RBX: ffffea0000241100 RCX: 0000000000000001
> Apr 21 17:27:47 localhost kernel: RDX: 000000000000a4e0 RSI: ffff880621a4ab00 RDI: 000000000149c03e
> Apr 21 17:27:47 localhost kernel: RBP: ffff88008d9efe38 R08: 0000000000000000 R09: 0000000000000000
> Apr 21 17:27:47 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880621a4aae8
> Apr 21 17:27:47 localhost kernel: R13: 00000000bf811000 R14: 000000000149c03e R15: 0000000000000000
> Apr 21 17:27:47 localhost kernel: FS: 00007fe6abc90700(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
> Apr 21 17:27:47 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Apr 21 17:27:47 localhost kernel: CR2: 00007fe6a37279a0 CR3: 000000008d942000 CR4: 00000000000006e0
> Apr 21 17:27:47 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> Apr 21 17:27:47 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Apr 21 17:27:47 localhost kernel: Process cc1 (pid: 27892, threadinfo ffff88008d9ee000, task ffff8800b23ec820)
> Apr 21 17:27:47 localhost kernel: Stack:
> Apr 21 17:27:47 localhost kernel: ffffea000101aee8 ffff880621a4aae8 ffff88008d9efe38 00007fe6a37279a0
> Apr 21 17:27:47 localhost kernel: <0> ffff8805d9706d90 ffff880621a4aa00 ffff88008d9efef8 ffffffff81126d05
> Apr 21 17:27:47 localhost kernel: <0> ffff88008d9efec8 0000000000000246 0000000000000000 ffffffff81586533
> Apr 21 17:27:47 localhost kernel: Call Trace:
> Apr 21 17:27:47 localhost kernel: [<ffffffff81126d05>] handle_mm_fault+0x995/0x9b0
> Apr 21 17:27:47 localhost kernel: [<ffffffff81586533>] ? do_page_fault+0x103/0x330
> Apr 21 17:27:47 localhost kernel: [<ffffffff8104bf40>] ? finish_task_switch+0x0/0xf0
> Apr 21 17:27:47 localhost kernel: [<ffffffff8158659e>] do_page_fault+0x16e/0x330
> Apr 21 17:27:47 localhost kernel: [<ffffffff81582f35>] page_fault+0x25/0x30
> Apr 21 17:27:47 localhost kernel: Code: 53 08 85 c9 0f 84 32 ff ff ff 8d 41 01 89 4d d8 89 45 d4 8b 75 d4 8b 45 d8 f0 0f b1 32 89 45 dc 8b 45 dc 39 c8 74 aa 89 c1 eb d7 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
> Apr 21 17:27:47 localhost kernel: RIP [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> Apr 21 17:27:47 localhost kernel: RSP <ffff88008d9efe08>
> Apr 21 17:27:47 localhost kernel: ---[ end trace 4860ab585c1fcddb ]---
>

It seems that this is a new error.


static inline struct page *migration_entry_to_page(swp_entry_t entry)
{
        struct page *p = pfn_to_page(swp_offset(entry));
        /*
         * Any use of migration entries may only occur while the
         * corresponding page is locked
         */
        BUG_ON(!PageLocked(p));
        return p;
}
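
As a rough user-space illustration of what that check enforces, here is a toy model (not kernel code). The `toy_` names, the bit layout behind `TOY_TYPE_SHIFT` and the `TOY_MIGRATION` type value are all made up for this sketch. The idea is only that a migration entry carries a page frame number in the offset field of a swap-style entry, and that decoding it is treated as legal only while the target page is locked:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: high bits hold the entry type, low bits the offset. */
#define TOY_TYPE_SHIFT 58
#define TOY_MIGRATION  31u	/* made-up type value for migration entries */
#define TOY_NR_PAGES   4

struct toy_page {
	bool locked;
};

/* Stand-in for the page array that pfn_to_page() would index. */
static struct toy_page toy_mem_map[TOY_NR_PAGES];

static uint64_t toy_make_migration_entry(uint64_t pfn)
{
	/* The pfn must fit below the type bits. */
	assert(pfn < (UINT64_C(1) << TOY_TYPE_SHIFT));
	return ((uint64_t)TOY_MIGRATION << TOY_TYPE_SHIFT) | pfn;
}

static unsigned int toy_swp_type(uint64_t entry)
{
	return (unsigned int)(entry >> TOY_TYPE_SHIFT);
}

static uint64_t toy_swp_offset(uint64_t entry)
{
	return entry & ((UINT64_C(1) << TOY_TYPE_SHIFT) - 1);
}

/*
 * Stand-in for migration_entry_to_page(). It returns NULL in the case
 * where the real helper would hit BUG_ON(!PageLocked(p)).
 */
static struct toy_page *toy_migration_entry_to_page(uint64_t entry)
{
	struct toy_page *p;

	if (toy_swp_type(entry) != TOY_MIGRATION)
		return NULL;
	if (toy_swp_offset(entry) >= TOY_NR_PAGES)
		return NULL;

	p = &toy_mem_map[toy_swp_offset(entry)];
	return p->locked ? p : NULL;
}
```

In this model, a fault handler that finds a migration entry for an unlocked page gets NULL back, which corresponds to the oops in the log above.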


This hits the BUG_ON(), so the page that the migration entry points to is unlocked.

But we always do

    lock_page(old_page);
    unmap(old_page);
    remap(new_page);
    unlock_page(old_page);

So... some PTE wasn't updated at remap?

Hmm.
-Kame

2010-04-21 10:21:00

by Mel Gorman

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Wed, Apr 21, 2010 at 06:48:06PM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 21 Apr 2010 17:28:38 +0900
> KAMEZAWA Hiroyuki <[email protected]> wrote:
>
> > On Mon, 19 Apr 2010 20:39:19 +0100
> > Mel Gorman <[email protected]> wrote:
> >
> > > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> > > ==== CUT HERE ====
> > > mm,compaction: Map free pages in the address space after they get split for compaction
> > >
> > > split_free_page() is a helper function which takes a free page from the
> > > buddy lists and splits it into order-0 pages. It is used by memory
> > > compaction to build a list of destination pages. If
> > > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> > > because split_free_page() did not call the arch-allocation hooks or map
> > > the page into the kernel address space.
> > >
> > > This patch does not update split_free_page() as it is called with
> > > interrupts held. Instead it documents that callers of split_free_page()
> > > are responsible for calling the arch hooks and to map the page and fixes
> > > compaction.
> > >
> > > This is a fix to the patch mm-compaction-memory-compaction-core.patch.
> > >
> > > Signed-off-by: Mel Gorman <[email protected]>
> >
> > Sorry, I think I hit another? error again. (sorry, no log.)
> > What I did was...
> > Running 2 shells.
> > while true; do make -j 16;make clean;done
> > and
> > while true; do echo 0 > /proc/sys/vm/compact_memory;done
> >
> >
> > Using the same config.
> >
> > Apr 21 17:27:47 localhost kernel: ------------[ cut here ]------------
> > Apr 21 17:27:47 localhost kernel: kernel BUG at include/linux/swapops.h:105!
> > Apr 21 17:27:47 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> > Apr 21 17:27:47 localhost kernel: last sysfs file: /sys/devices/virtual/net/br0/statistics/collisions
> > Apr 21 17:27:47 localhost kernel: CPU 3
> > Apr 21 17:27:47 localhost kernel: Modules linked in: fuse sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput ioatdma ppdev parport_pc i5000_edac bnx2 iTCO_wdt edac_core iTCO_vendor_support shpchp parport e1000e kvm_intel dca kvm i2c_i801 i2c_core i5k_amb pcspkr megaraid_sas [last unloaded: microcode]
> > Apr 21 17:27:47 localhost kernel:
> > Apr 21 17:27:47 localhost kernel: Pid: 27892, comm: cc1 Tainted: G W 2.6.34-rc4-mm1+ #4 D2519/PRIMERGY
> > Apr 21 17:27:47 localhost kernel: RIP: 0010:[<ffffffff8114e9cf>] [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> > Apr 21 17:27:47 localhost kernel: RSP: 0000:ffff88008d9efe08 EFLAGS: 00010246
> > Apr 21 17:27:47 localhost kernel: RAX: ffffea0000000000 RBX: ffffea0000241100 RCX: 0000000000000001
> > Apr 21 17:27:47 localhost kernel: RDX: 000000000000a4e0 RSI: ffff880621a4ab00 RDI: 000000000149c03e
> > Apr 21 17:27:47 localhost kernel: RBP: ffff88008d9efe38 R08: 0000000000000000 R09: 0000000000000000
> > Apr 21 17:27:47 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880621a4aae8
> > Apr 21 17:27:47 localhost kernel: R13: 00000000bf811000 R14: 000000000149c03e R15: 0000000000000000
> > Apr 21 17:27:47 localhost kernel: FS: 00007fe6abc90700(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
> > Apr 21 17:27:47 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > Apr 21 17:27:47 localhost kernel: CR2: 00007fe6a37279a0 CR3: 000000008d942000 CR4: 00000000000006e0
> > Apr 21 17:27:47 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > Apr 21 17:27:47 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > Apr 21 17:27:47 localhost kernel: Process cc1 (pid: 27892, threadinfo ffff88008d9ee000, task ffff8800b23ec820)
> > Apr 21 17:27:47 localhost kernel: Stack:
> > Apr 21 17:27:47 localhost kernel: ffffea000101aee8 ffff880621a4aae8 ffff88008d9efe38 00007fe6a37279a0
> > Apr 21 17:27:47 localhost kernel: <0> ffff8805d9706d90 ffff880621a4aa00 ffff88008d9efef8 ffffffff81126d05
> > Apr 21 17:27:47 localhost kernel: <0> ffff88008d9efec8 0000000000000246 0000000000000000 ffffffff81586533
> > Apr 21 17:27:47 localhost kernel: Call Trace:
> > Apr 21 17:27:47 localhost kernel: [<ffffffff81126d05>] handle_mm_fault+0x995/0x9b0
> > Apr 21 17:27:47 localhost kernel: [<ffffffff81586533>] ? do_page_fault+0x103/0x330
> > Apr 21 17:27:47 localhost kernel: [<ffffffff8104bf40>] ? finish_task_switch+0x0/0xf0
> > Apr 21 17:27:47 localhost kernel: [<ffffffff8158659e>] do_page_fault+0x16e/0x330
> > Apr 21 17:27:47 localhost kernel: [<ffffffff81582f35>] page_fault+0x25/0x30
> > Apr 21 17:27:47 localhost kernel: Code: 53 08 85 c9 0f 84 32 ff ff ff 8d 41 01 89 4d d8 89 45 d4 8b 75 d4 8b 45 d8 f0 0f b1 32 89 45 dc 8b 45 dc 39 c8 74 aa 89 c1 eb d7 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
> > Apr 21 17:27:47 localhost kernel: RIP [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> > Apr 21 17:27:47 localhost kernel: RSP <ffff88008d9efe08>
> > Apr 21 17:27:47 localhost kernel: ---[ end trace 4860ab585c1fcddb ]---
> >
>
> It seems that this is a new error.
>
>
> static inline struct page *migration_entry_to_page(swp_entry_t entry)
> {
> struct page *p = pfn_to_page(swp_offset(entry));
> /*
> * Any use of migration entries may only occur while the
> * corresponding page is locked
> */
> BUG_ON(!PageLocked(p));
> return p;
> }
>
>
> Hits this BUG_ON()....then, the page migration_entry points to is unlocked.
>
> But we always do
>
> lock_page(old_page);
> unmap(old_page);
> remap(new_page);
> unlock_page(old_page);
>
> So....some pte wasn't updated at remap ?
>

I'm working on reproducing the problem. I've hit it only once. My stress
tests were using dd instead of make like yours did and my
compilation-orientated test would not have been hitting compaction as
hard.

The theory I'm working on is that it's a PageSwapCache page that was
unmapped and not remapped (remap_swapcache == 0) in move_to_new_page().
In this case, the page would be migrated, left in place and unlocked.
Later, when a swap fault occurs, the migration PTE is found and the
BUG_ON triggers, i.e. the bug check is no longer valid because it is
possible for an unlocked migration PTE to be left behind.
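
To make the theory concrete, here is a small user-space simulation. It is only a sketch under stated assumptions: the `sim_` names are hypothetical, a single structure stands in for both the old and the new page, and the `remap_swapcache` parameter simply mirrors the flag named above. It follows the theory as stated and is not verified kernel behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_NR_PTES 3

enum sim_pte_state { SIM_PTE_NONE, SIM_PTE_PRESENT, SIM_PTE_MIGRATION };

struct sim_page {
	bool locked;
	/* Every PTE slot that refers to this page. */
	enum sim_pte_state pte[SIM_NR_PTES];
};

/* Replace every present mapping with a migration entry. */
static void sim_unmap(struct sim_page *page)
{
	for (size_t i = 0; i < SIM_NR_PTES; i++)
		if (page->pte[i] == SIM_PTE_PRESENT)
			page->pte[i] = SIM_PTE_MIGRATION;
}

/* Turn migration entries back into present mappings. */
static void sim_remap(struct sim_page *page)
{
	for (size_t i = 0; i < SIM_NR_PTES; i++)
		if (page->pte[i] == SIM_PTE_MIGRATION)
			page->pte[i] = SIM_PTE_PRESENT;
}

/*
 * lock -> unmap -> (optional) remap -> unlock.
 * When remap_swapcache is false the remap step is skipped, as in the
 * suspected PageSwapCache path.
 */
static void sim_migrate(struct sim_page *page, bool remap_swapcache)
{
	assert(!page->locked);
	page->locked = true;
	sim_unmap(page);
	if (remap_swapcache)
		sim_remap(page);
	page->locked = false;
}

/*
 * Count migration entries that are visible while the page is unlocked.
 * Each one would trip the BUG_ON() if a fault walked into it.
 */
static int sim_stale_entries(const struct sim_page *page)
{
	int n = 0;

	if (page->locked)
		return 0;
	for (size_t i = 0; i < SIM_NR_PTES; i++)
		if (page->pte[i] == SIM_PTE_MIGRATION)
			n++;
	return n;
}
```

With `remap_swapcache` true, `sim_stale_entries()` returns 0 after `sim_migrate()`. With it false, every slot that was present before the call is left as a migration entry on an unlocked page.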

I'm trying to reproduce it with some instrumentation in place that records
pages left behind, but I haven't managed to trigger it a second time yet.

--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab

2010-04-21 16:52:14

by Minchan Kim

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

Hi, Mel.

On Wed, 2010-04-21 at 11:20 +0100, Mel Gorman wrote:
> On Wed, Apr 21, 2010 at 06:48:06PM +0900, KAMEZAWA Hiroyuki wrote:
> > On Wed, 21 Apr 2010 17:28:38 +0900
> > KAMEZAWA Hiroyuki <[email protected]> wrote:
> >
> > > On Mon, 19 Apr 2010 20:39:19 +0100
> > > Mel Gorman <[email protected]> wrote:
> > >
> > > > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
> > > > ==== CUT HERE ====
> > > > mm,compaction: Map free pages in the address space after they get split for compaction
> > > >
> > > > split_free_page() is a helper function which takes a free page from the
> > > > buddy lists and splits it into order-0 pages. It is used by memory
> > > > compaction to build a list of destination pages. If
> > > > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
> > > > because split_free_page() did not call the arch-allocation hooks or map
> > > > the page into the kernel address space.
> > > >
> > > > This patch does not update split_free_page() as it is called with
> > > > interrupts held. Instead it documents that callers of split_free_page()
> > > > are responsible for calling the arch hooks and to map the page and fixes
> > > > compaction.
> > > >
> > > > This is a fix to the patch mm-compaction-memory-compaction-core.patch.
> > > >
> > > > Signed-off-by: Mel Gorman <[email protected]>
> > >
> > > Sorry, I think I hit another? error again. (sorry, no log.)
> > > What I did was...
> > > Running 2 shells.
> > > while true; do make -j 16;make clean;done
> > > and
> > > while true; do echo 0 > /proc/sys/vm/compact_memory;done
> > >
> > >
> > > Using the same config.
> > >
> > > Apr 21 17:27:47 localhost kernel: ------------[ cut here ]------------
> > > Apr 21 17:27:47 localhost kernel: kernel BUG at include/linux/swapops.h:105!
> > > Apr 21 17:27:47 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
> > > Apr 21 17:27:47 localhost kernel: last sysfs file: /sys/devices/virtual/net/br0/statistics/collisions
> > > Apr 21 17:27:47 localhost kernel: CPU 3
> > > Apr 21 17:27:47 localhost kernel: Modules linked in: fuse sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput ioatdma ppdev parport_pc i5000_edac bnx2 iTCO_wdt edac_core iTCO_vendor_support shpchp parport e1000e kvm_intel dca kvm i2c_i801 i2c_core i5k_amb pcspkr megaraid_sas [last unloaded: microcode]
> > > Apr 21 17:27:47 localhost kernel:
> > > Apr 21 17:27:47 localhost kernel: Pid: 27892, comm: cc1 Tainted: G W 2.6.34-rc4-mm1+ #4 D2519/PRIMERGY
> > > Apr 21 17:27:47 localhost kernel: RIP: 0010:[<ffffffff8114e9cf>] [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> > > Apr 21 17:27:47 localhost kernel: RSP: 0000:ffff88008d9efe08 EFLAGS: 00010246
> > > Apr 21 17:27:47 localhost kernel: RAX: ffffea0000000000 RBX: ffffea0000241100 RCX: 0000000000000001
> > > Apr 21 17:27:47 localhost kernel: RDX: 000000000000a4e0 RSI: ffff880621a4ab00 RDI: 000000000149c03e
> > > Apr 21 17:27:47 localhost kernel: RBP: ffff88008d9efe38 R08: 0000000000000000 R09: 0000000000000000
> > > Apr 21 17:27:47 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880621a4aae8
> > > Apr 21 17:27:47 localhost kernel: R13: 00000000bf811000 R14: 000000000149c03e R15: 0000000000000000
> > > Apr 21 17:27:47 localhost kernel: FS: 00007fe6abc90700(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
> > > Apr 21 17:27:47 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > Apr 21 17:27:47 localhost kernel: CR2: 00007fe6a37279a0 CR3: 000000008d942000 CR4: 00000000000006e0
> > > Apr 21 17:27:47 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > Apr 21 17:27:47 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > Apr 21 17:27:47 localhost kernel: Process cc1 (pid: 27892, threadinfo ffff88008d9ee000, task ffff8800b23ec820)
> > > Apr 21 17:27:47 localhost kernel: Stack:
> > > Apr 21 17:27:47 localhost kernel: ffffea000101aee8 ffff880621a4aae8 ffff88008d9efe38 00007fe6a37279a0
> > > Apr 21 17:27:47 localhost kernel: <0> ffff8805d9706d90 ffff880621a4aa00 ffff88008d9efef8 ffffffff81126d05
> > > Apr 21 17:27:47 localhost kernel: <0> ffff88008d9efec8 0000000000000246 0000000000000000 ffffffff81586533
> > > Apr 21 17:27:47 localhost kernel: Call Trace:
> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81126d05>] handle_mm_fault+0x995/0x9b0
> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81586533>] ? do_page_fault+0x103/0x330
> > > Apr 21 17:27:47 localhost kernel: [<ffffffff8104bf40>] ? finish_task_switch+0x0/0xf0
> > > Apr 21 17:27:47 localhost kernel: [<ffffffff8158659e>] do_page_fault+0x16e/0x330
> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81582f35>] page_fault+0x25/0x30
> > > Apr 21 17:27:47 localhost kernel: Code: 53 08 85 c9 0f 84 32 ff ff ff 8d 41 01 89 4d d8 89 45 d4 8b 75 d4 8b 45 d8 f0 0f b1 32 89 45 dc 8b 45 dc 39 c8 74 aa 89 c1 eb d7 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
> > > Apr 21 17:27:47 localhost kernel: RIP [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
> > > Apr 21 17:27:47 localhost kernel: RSP <ffff88008d9efe08>
> > > Apr 21 17:27:47 localhost kernel: ---[ end trace 4860ab585c1fcddb ]---
> > >
> >
> > It seems that this is a new error.
> >
> >
> > static inline struct page *migration_entry_to_page(swp_entry_t entry)
> > {
> > struct page *p = pfn_to_page(swp_offset(entry));
> > /*
> > * Any use of migration entries may only occur while the
> > * corresponding page is locked
> > */
> > BUG_ON(!PageLocked(p));
> > return p;
> > }
> >
> >
> > Hits this BUG_ON()....then, the page migration_entry points to is unlocked.
> >
> > But we always do
> >
> > lock_page(old_page);
> > unmap(old_page);
> > remap(new_page);
> > unlock_page(old_page);
> >
> > So....some pte wasn't updated at remap ?
> >
>
> I'm working on reproducing the problem. I've hit it only once. My stress
> tests were using dd instead of make like yours did and my
> compilation-orientated test would not have been hitting compaction as
> hard.
>
> The theory I'm working on is that it's a PageSwapCache page that was
> unmapped and not remapped (remap_swapcache == 0) in move_to_new_page().
> In this case, the page would be migrated, left in place and unlocked.
> Later when a swap fault occurred, the migration PTE is found and the
> bug_on triggers i.e. the bug check is no longer valid because it is
> possible for an unlocked migration pte to be left behind.

Hmm. How about this situation?


CPU A                                           CPU B

1. unmap_and_move
2. lock_page
3. PageAnon && !page_mapped && PageSwapCache    3' do_fork
4. remap_swapcache = 0                          4' pte lock, page_dup_rmap <- race happens
5. try_to_unmap - make migration entry by 4'
6. move_to_newpage
7. don't call remove_migration due to 4
                                                8. do_swap_page
                                                9. migration_entry_wait
                                                10. goto out
                                                11. fault!

In this case, the process on CPU B will be killed although it passes the
PageLocked check. So I think we have to find another method.

I might be wrong since I'm nearly falling asleep. :(

--
Kind regards,
Minchan Kim

2010-04-21 23:01:30

by Minchan Kim

Subject: Re: error at compaction (Re: mmotm 2010-04-15-14-42 uploaded

On Thu, Apr 22, 2010 at 1:52 AM, Minchan Kim <[email protected]> wrote:
> Hi, Mel.
>
> On Wed, 2010-04-21 at 11:20 +0100, Mel Gorman wrote:
>> On Wed, Apr 21, 2010 at 06:48:06PM +0900, KAMEZAWA Hiroyuki wrote:
>> > On Wed, 21 Apr 2010 17:28:38 +0900
>> > KAMEZAWA Hiroyuki <[email protected]> wrote:
>> >
>> > > On Mon, 19 Apr 2010 20:39:19 +0100
>> > > Mel Gorman <[email protected]> wrote:
>> > >
>> > > > On Mon, Apr 19, 2010 at 07:14:42PM +0100, Mel Gorman wrote:
>> > > > ==== CUT HERE ====
>> > > > mm,compaction: Map free pages in the address space after they get split for compaction
>> > > >
>> > > > split_free_page() is a helper function which takes a free page from the
>> > > > buddy lists and splits it into order-0 pages. It is used by memory
>> > > > compaction to build a list of destination pages. If
>> > > > CONFIG_DEBUG_PAGEALLOC is set, a kernel paging request bug is triggered
>> > > > because split_free_page() did not call the arch-allocation hooks or map
>> > > > the page into the kernel address space.
>> > > >
>> > > > This patch does not update split_free_page() as it is called with
>> > > > interrupts held. Instead it documents that callers of split_free_page()
>> > > > are responsible for calling the arch hooks and to map the page and fixes
>> > > > compaction.
>> > > >
>> > > > This is a fix to the patch mm-compaction-memory-compaction-core.patch.
>> > > >
>> > > > Signed-off-by: Mel Gorman <[email protected]>
>> > >
>> > > Sorry, I think I hit another? error again. (sorry, no log.)
>> > > What I did was...
>> > >    Running 2 shells.
>> > >    while true; do make -j 16;make clean;done
>> > >    and
>> > >    while true; do echo 0 > /proc/sys/vm/compact_memory;done
>> > >
>> > >
>> > > Using the same config.
>> > >
>> > > Apr 21 17:27:47 localhost kernel: ------------[ cut here ]------------
>> > > Apr 21 17:27:47 localhost kernel: kernel BUG at include/linux/swapops.h:105!
>> > > Apr 21 17:27:47 localhost kernel: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
>> > > Apr 21 17:27:47 localhost kernel: last sysfs file: /sys/devices/virtual/net/br0/statistics/collisions
>> > > Apr 21 17:27:47 localhost kernel: CPU 3
>> > > Apr 21 17:27:47 localhost kernel: Modules linked in: fuse sit tunnel4 ipt_MASQUERADE iptable_nat nf_nat bridge stp llc sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf xt_physdev ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 dm_multipath uinput ioatdma ppdev parport_pc i5000_edac bnx2 iTCO_wdt edac_core iTCO_vendor_support shpchp parport e1000e kvm_intel dca kvm i2c_i801 i2c_core i5k_amb pcspkr megaraid_sas [last unloaded: microcode]
>> > > Apr 21 17:27:47 localhost kernel:
>> > > Apr 21 17:27:47 localhost kernel: Pid: 27892, comm: cc1 Tainted: G        W   2.6.34-rc4-mm1+ #4 D2519/PRIMERGY
>> > > Apr 21 17:27:47 localhost kernel: RIP: 0010:[<ffffffff8114e9cf>]  [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
>> > > Apr 21 17:27:47 localhost kernel: RSP: 0000:ffff88008d9efe08  EFLAGS: 00010246
>> > > Apr 21 17:27:47 localhost kernel: RAX: ffffea0000000000 RBX: ffffea0000241100 RCX: 0000000000000001
>> > > Apr 21 17:27:47 localhost kernel: RDX: 000000000000a4e0 RSI: ffff880621a4ab00 RDI: 000000000149c03e
>> > > Apr 21 17:27:47 localhost kernel: RBP: ffff88008d9efe38 R08: 0000000000000000 R09: 0000000000000000
>> > > Apr 21 17:27:47 localhost kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff880621a4aae8
>> > > Apr 21 17:27:47 localhost kernel: R13: 00000000bf811000 R14: 000000000149c03e R15: 0000000000000000
>> > > Apr 21 17:27:47 localhost kernel: FS:  00007fe6abc90700(0000) GS:ffff880005a00000(0000) knlGS:0000000000000000
>> > > Apr 21 17:27:47 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > > Apr 21 17:27:47 localhost kernel: CR2: 00007fe6a37279a0 CR3: 000000008d942000 CR4: 00000000000006e0
>> > > Apr 21 17:27:47 localhost kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> > > Apr 21 17:27:47 localhost kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> > > Apr 21 17:27:47 localhost kernel: Process cc1 (pid: 27892, threadinfo ffff88008d9ee000, task ffff8800b23ec820)
>> > > Apr 21 17:27:47 localhost kernel: Stack:
>> > > Apr 21 17:27:47 localhost kernel: ffffea000101aee8 ffff880621a4aae8 ffff88008d9efe38 00007fe6a37279a0
>> > > Apr 21 17:27:47 localhost kernel: <0> ffff8805d9706d90 ffff880621a4aa00 ffff88008d9efef8 ffffffff81126d05
>> > > Apr 21 17:27:47 localhost kernel: <0> ffff88008d9efec8 0000000000000246 0000000000000000 ffffffff81586533
>> > > Apr 21 17:27:47 localhost kernel: Call Trace:
>> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81126d05>] handle_mm_fault+0x995/0x9b0
>> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81586533>] ? do_page_fault+0x103/0x330
>> > > Apr 21 17:27:47 localhost kernel: [<ffffffff8104bf40>] ? finish_task_switch+0x0/0xf0
>> > > Apr 21 17:27:47 localhost kernel: [<ffffffff8158659e>] do_page_fault+0x16e/0x330
>> > > Apr 21 17:27:47 localhost kernel: [<ffffffff81582f35>] page_fault+0x25/0x30
>> > > Apr 21 17:27:47 localhost kernel: Code: 53 08 85 c9 0f 84 32 ff ff ff 8d 41 01 89 4d d8 89 45 d4 8b 75 d4 8b 45 d8 f0 0f b1 32 89 45 dc 8b 45 dc 39 c8 74 aa 89 c1 eb d7 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
>> > > Apr 21 17:27:47 localhost kernel: RIP  [<ffffffff8114e9cf>] migration_entry_wait+0x16f/0x180
>> > > Apr 21 17:27:47 localhost kernel: RSP <ffff88008d9efe08>
>> > > Apr 21 17:27:47 localhost kernel: ---[ end trace 4860ab585c1fcddb ]---
>> > >
>> >
>> > It seems that this is a new error.
>> >
>> >
>> > static inline struct page *migration_entry_to_page(swp_entry_t entry)
>> > {
>> >         struct page *p = pfn_to_page(swp_offset(entry));
>> >         /*
>> >          * Any use of migration entries may only occur while the
>> >          * corresponding page is locked
>> >          */
>> >         BUG_ON(!PageLocked(p));
>> >         return p;
>> > }
>> >
>> >
>> > Hits this BUG_ON()....then, the page migration_entry points to is unlocked.
>> >
>> > But we always do
>> >
>> >     lock_page(old_page);
>> >     unmap(old_page);
>> >     remap(new_page);
>> >     unlock_page(old_page);
>> >
>> > So....some pte wasn't updated at remap ?
>> >
>>
>> I'm working on reproducing the problem. I've hit it only once. My stress
>> tests were using dd instead of make like yours did and my
>> compilation-orientated test would not have been hitting compaction as
>> hard.
>>
>> The theory I'm working on is that it's a PageSwapCache page that was
>> unmapped and not remapped (remap_swapcache == 0) in move_to_new_page().
>> In this case, the page would be migrated, left in place and unlocked.
>> Later when a swap fault occurred, the migration PTE is found and the
>> bug_on triggers i.e. the bug check is no longer valid because it is
>> possible for an unlocked migration pte to be left behind.
>
> Hmm. How about the situation?
>
>
> CPU A                                           CPU B
>
> 1. unmap_and_move
> 2. lock_page
> 3. PageAnon && !page_mapped && PageSwapCache    3' do_fork
> 4. remap_swapcache = 0                          4' pte lock, page_dup_rmap <- race happens
> 5. try_to_unmap - make migration entry by 4'
> 6. move_to_newpage
> 7. don't call remove_migration due to 4
>                                                8. do_swap_page
>                                                9. migration_entry_wait
>                                                10. goto out
>                                                11. fault!
>
> In this case, process of CPU B will be killed although it passes PageLocked
> So I think we have to find another method.
>
> I might be wrong since nearly falling asleep. :(

Yes, I was wrong.
I seem to have missed detach_vma before unmap_region.
Sorry, please ignore this. :(

--
Kind regards,
Minchan Kim