2011-04-30 00:02:55

by Andrew Morton

Subject: mmotm 2011-04-29-16-25 uploaded

The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to

http://userweb.kernel.org/~akpm/mmotm/

and will soon be available at

git://zen-kernel.org/kernel/mmotm.git

It contains the following patches against 2.6.39-rc5:

origin.patch
rtc-s3c-rtc-fixup-wake-support-for-rtc.patch
backlight-new-driver-for-the-adp8870-backlight-devices.patch
linux-next.patch
next-remove-localversion.patch
i-need-old-gcc.patch
arch-alpha-kernel-systblss-remove-debug-check.patch
include-asm-generic-vmlinuxldsh-fix-__modver-section-warnings.patch
drivers-i2c-busses-i2c-designware-corec-needs-delayh.patch
tmpfs-fix-race-between-umount-and-writepage.patch
fs-namespacec-bound-mount-propagation-fix.patch
drivers-scsi-pmcraid-reject-negative-request-size.patch
acpi-remove-acpi_sleep=s4_nonvs.patch
acerhdf-add-support-for-aspire-1410-bios-v13314.patch
arch-x86-include-asm-delayh-fix-udelay-and-ndelay-for-8-bit-args.patch
x86-fix-mmap-random-address-range.patch
leds-new-pcengines-alix-system-driver-enables-leds-via-gpio-interface.patch
gpio-show-explicit-dependency-between-gpio_cs5535-and-mfd_cs5535.patch
msm-timer-migrate-to-timer-based-__delay.patch
arch-arm-mach-ux500-mbox-db5500c-world-writable-sysfs-fifo-file.patch
audit-always-follow-va_copy-with-va_end.patch
btrfs-dont-dereference-extent_mapping-if-null.patch
drivers-gpu-drm-radeon-atomc-fix-warning.patch
fb-fix-potential-deadlock-between-lock_fb_info-and-console_lock.patch
cyber2000fb-avoid-palette-corruption-at-higher-clocks.patch
fscache-remove-dead-code-under-config_workqueue_debugfs.patch
bitmap-irq-add-smp_affinity_list-interface-to-proc-irq.patch
posix-timers-rcu-conversion.patch
leds-support-automatic-start-of-blinking-with-ledtrig-timer.patch
drivers-leds-leds-pca9532c-add-gpio-capability.patch
leds-route-kbd-leds-through-the-generic-leds-layer.patch
net-irda-convert-bfin_sir-to-common-blackfin-uart-header.patch
net-convert-%p-usage-to-%pk.patch
backlight-add-backlight-type-fix.patch
backlight-add-backlight-type-fix-fix.patch
drivers-video-backlight-adp5520_blc-check-strict_strtoul-return-value.patch
drivers-video-backlight-adp5520_blc-check-strict_strtoul-return-value-fix.patch
i915-add-native-backlight-control.patch
btusb-patch-add_apple_macbookpro62.patch
drivers-message-fusion-mptsasc-fix-warning.patch
scsi-fix-a-header-to-include-linux-typesh.patch
aic94xx-world-writable-sysfs-update_bios-file.patch
osst-wrong-index-used-in-inner-loop.patch
osst-wrong-index-used-in-inner-loop-checkpatch-fixes.patch
drivers-scsi-osstc-fix-warning.patch
drivers-scsi-megaraidc-fix-sparse-warnings.patch
drbd-fix-warning.patch
usb-yurex-recognize-generalkeys-wireless-presenter-as-generic-hid.patch
drivers-usb-misc-usbtestc-fix-warning.patch
mm.patch
arch-mm-filter-disallowed-nodes-from-arch-specific-show_mem-functions.patch
mmap-add-alignment-for-some-variables.patch
mmap-avoid-unnecessary-anon_vma-lock.patch
mmap-avoid-merging-cloned-vmas.patch
mm-remove-unused-zone_idx-variable-from-set_migratetype_isolate.patch
mm-nommu-sort-mm-mmap-list-properly.patch
mm-nommu-sort-mm-mmap-list-properly-fix.patch
mm-nommu-dont-scan-the-vma-list-when-deleting.patch
mm-nommu-find-vma-using-the-sorted-vma-list.patch
mm-nommu-check-the-vma-list-when-unmapping-file-mapped-vma.patch
mm-nommu-fix-a-potential-memory-leak-in-do_mmap_private.patch
mm-nommu-fix-a-compile-warning-in-do_mmap_pgoff.patch
mm-per-node-vmstat-show-proper-vmstats.patch
mm-per-node-vmstat-show-proper-vmstats-fix.patch
mm-increase-reclaim_distance-to-30.patch
mm-introduce-wait_on_page_locked_killable.patch
x86mm-make-pagefault-killable.patch
mm-mem-hotplug-fix-section-mismatch-setup_per_zone_inactive_ratio-should-be-__meminit.patch
mm-mem-hotplug-recalculate-lowmem_reserve-when-memory-hotplug-occur.patch
mm-mem-hotplug-update-pcp-stat_threshold-when-memory-hotplug-occur.patch
mm-mem-hotplug-update-pcp-stat_threshold-when-memory-hotplug-occur-fix.patch
mm-convert-vma-vm_flags-to-64-bit.patch
mm-add-__nocast-attribute-to-vm_flags.patch
fremap-convert-vm_flags-to-unsigned-long-long.patch
procfs-convert-vm_flags-to-unsigned-long-long.patch
mm-compaction-reverse-the-change-that-forbade-sync-migraton-with-__gfp_no_kswapd.patch
oom-replace-pf_oom_origin-with-toggling-oom_score_adj.patch
oom-replace-pf_oom_origin-with-toggling-oom_score_adj-update.patch
include-linux-gfph-work-around-apparent-sparse-confusion.patch
include-linux-gfph-convert-bug_on-into-vm_bug_on.patch
mm-rename-alloc_pages_exact.patch
mm-make-new-alloc_pages_exact.patch
mm-reuse-__free_pages_exact-in-__alloc_pages_exact.patch
mm-vmalloc-remove-guard-page-from-between-vmap-blocks.patch
mm-make-expand_downwards-symmetrical-with-expand_upwards.patch
mm-make-expand_downwards-symmetrical-with-expand_upwards-v3.patch
mm-mmu_gather-rework.patch
mm-mmu_gather-rework-fix.patch
powerpc-mmu_gather-rework.patch
sparc-mmu_gather-rework.patch
s390-mmu_gather-rework.patch
arm-mmu_gather-rework.patch
sh-mmu_gather-rework.patch
ia64-mmu_gather-rework.patch
um-mmu_gather-rework.patch
mm-now-that-all-old-mmu_gather-code-is-gone-remove-the-storage.patch
mm-powerpc-move-the-rcu-page-table-freeing-into-generic-code.patch
mm-extended-batches-for-generic-mmu_gather.patch
lockdep-mutex-provide-mutex_lock_nest_lock.patch
mm-remove-i_mmap_lock-lockbreak.patch
mm-convert-i_mmap_lock-to-a-mutex.patch
mm-revert-page_lock_anon_vma-lock-annotation.patch
mm-improve-page_lock_anon_vma-comment.patch
mm-use-refcounts-for-page_lock_anon_vma.patch
mm-convert-anon_vma-lock-to-a-mutex.patch
mm-optimize-page_lock_anon_vma-fast-path.patch
mm-uninline-large-generic-tlbh-functions.patch
mm-thp-optimize-memcg-charge-in-khugepaged.patch
mm-thp-optimize-memcg-charge-in-khugepaged-fix.patch
mn10300-replace-mm-cpu_vm_mask-with-mm_cpumask.patch
tile-replace-mm-cpu_vm_mask-with-mm_cpumask.patch
mm-convert-mm-cpu_vm_cpumask-into-cpumask_var_t.patch
mm-convert-mm-cpu_vm_cpumask-into-cpumask_var_t-fix.patch
mm-convert-mm-cpu_vm_cpumask-into-cpumask_var_t-checkpatch-fixes.patch
mm-break-out-page-allocation-warning-code.patch
mm-print-vmalloc-state-after-allocation-failures.patch
writeback-pass-writeback_control-down-to-move_expired_inodes.patch
writeback-introduce-writeback_controlinodes_cleaned.patch
writeback-try-more-writeback-as-long-as-something-was-written.patch
writeback-the-kupdate-expire-timestamp-should-be-a-moving-target.patch
writeback-sync-expired-inodes-first-in-background-writeback.patch
writeback-sync-expired-inodes-first-in-background-writeback-fix.patch
writeback-refill-b_io-iff-empty.patch
mem-hotplug-call-isolate_lru_page-with-elevated-refcount.patch
mem-hwpoison-fix-page-refcount-around-isolate_lru_page.patch
mm-strictly-require-elevated-page-refcount-in-isolate_lru_page.patch
alpha-mm-set-all-online-nodes-in-n_normal_memory.patch
m32r-mm-set-all-online-nodes-in-n_normal_memory.patch
mm-check-if-any-page-in-a-pageblock-is-reserved-before-marking-it-migrate_reserve.patch
mm-check-if-any-page-in-a-pageblock-is-reserved-before-marking-it-migrate_reserve-fix.patch
readahead-readahead-page-allocations-are-ok-to-fail.patch
vmscan-change-shrink_slab-interfaces-by-passing-shrink_control.patch
vmscan-change-shrink_slab-interfaces-by-passing-shrink_control-fix.patch
vmscan-change-shrink_slab-interfaces-by-passing-shrink_control-fix-2.patch
vmscan-change-shrinker-api-by-passing-shrink_control-struct.patch
vmscan-change-shrinker-api-by-passing-shrink_control-struct-fix.patch
vmscan-change-shrinker-api-by-passing-shrink_control-struct-fix-2.patch
readahead-return-early-when-readahead-is-disabled.patch
readahead-reduce-unnecessary-mmap_miss-increases.patch
readahead-trigger-mmap-sequential-readahead-on-pg_readahead.patch
writeback-split-inode_wb_list_lock-into-bdi_writebacklist_lock.patch
writeback-elevate-queue_io-into-wb_writeback.patch
mm-remove-unused-token-argument-from-apply_to_page_range-callback.patch
mm-add-apply_to_page_range_batch.patch
ioremap-use-apply_to_page_range_batch-for-ioremap_page_range.patch
vmalloc-use-plain-pte_clear-for-unmaps.patch
vmalloc-use-apply_to_page_range_batch-for-vunmap_page_range.patch
vmalloc-use-apply_to_page_range_batch-for-vmap_page_range_noflush.patch
vmalloc-use-apply_to_page_range_batch-in-alloc_vm_area.patch
xen-mmu-use-apply_to_page_range_batch-in-xen_remap_domain_mfn_range.patch
xen-grant-table-use-apply_to_page_range_batch.patch
memsw-remove-noswapaccount-kernel-parameter.patch
mm-batch-activate_page-to-reduce-lock-contention.patch
xattrh-expose-string-defines-to-userspace.patch
frv-duplicate-output_buffer-of-e03.patch
frv-duplicate-output_buffer-of-e03-checkpatch-fixes.patch
hpet-factor-timer-allocate-from-open.patch
alpha-replace-with-new-cpumask-apis.patch
arch-alpha-include-asm-ioh-s-extern-inline-static-inline.patch
m32r-convert-cpumask-api.patch
m32r-fix-spin_lock_irqsave-misuse.patch
m32r-remove-redundant-declaration.patch
ulimit-raise-default-hard-ulimit-on-number-of-files-to-4096.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-checkpatch-fixes.patch
init-calibratec-fix-for-critical-bogomips-intermittent-calculation-failure-fix.patch
sparse-define-dummy-build_bug_on-definition-for-sparse.patch
sparse-define-__must_be_array-for-__checker__.patch
sparse-undef-__compiletime_warningerror-if-__checker__-is-defined.patch
lib-vsprintfc-fix-interaction-of-kasprintf-and-vsnprintf-when-using-%pv.patch
memblock-add-error-return-when-config_have_memblock-is-not-set.patch
printk-allocate-kernel-log-buffer-earlier-v2.patch
printk-allocate-kernel-log-buffer-earlier-v2-checkpatch-fixes.patch
printk-allocate-kernel-log-buffer-earlier-v2-fix.patch
fcntlf_setfl-allow-setting-of-o_sync.patch
maintainers-roman-zippel-has-been-mia-for-several-years.patch
lru_cache-use-correct-type-in-sizeof-for-allocation.patch
lru_cache-use-correct-type-in-sizeof-for-allocation-fix.patch
lib-add-kstrto_from_user.patch
lib-consolidate-debug_per_cpu_maps.patch
include-linux-genalloch-add-multiple-inclusion-guards.patch
lib-genallocc-add-support-for-specifying-the-physical-address.patch
lib-genallocc-add-support-for-specifying-the-physical-address-v2.patch
percpu_counter-change-return-value-and-add-comments.patch
percpu_counter-change-return-value-and-add-comments-fix.patch
checkpatch-add-check-for-line-continuations-in-quoted-strings.patch
checkpatch-add-foo_level-and-module_bar-to-80-column-exceptions.patch
checkpatch-fix-defect-in-printkkern_level-80-column-exceptions.patch
lib-hexdumpc-make-hex2bin-return-the-updated-src-address.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix.patch
fs-binfmt_miscc-use-kernels-hex_to_bin-method-fix-fix.patch
fs-ncpfs-inodec-suppress-used-uninitialised-warning.patch
rtc-add-support-for-the-rtc-in-via-vt8500-and-compatibles.patch
rtc-add-em3027-rtc-driver.patch
rtc-add-rv3029c2-rtc-support.patch
rtc-add-basic-support-for-st-m41t93-spi-rtc.patch
drivers-rtc-rtc-mrstc-use-release_mem_region-after-request_mem_region.patch
drivers-rtc-rtc-mrstc-use-release_mem_region-after-request_mem_region-fix.patch
rtc-add-support-for-spear-rtc.patch
rtc-driver-for-pt7c4338-chip.patch
rtc-driver-for-pt7c4338-chip-checkpatch-fixes.patch
rtc-driver-for-pt7c4338-chip-fix.patch
rtc-rtc-driver-for-da9052-pmic.patch
gpio-add-new-altera-pio-driver.patch
gpio-add-new-altera-pio-driver-update.patch
gpio-make-gpio_requestfree_array-gpio-array-parameter-const.patch
jbd-remove-dependency-on-__gfp_nofail.patch
ufs-truncated-values-handling-64-bit-metadata.patch
documentation-atomic_opstxt-avoid-volatile-in-sample-code.patch
documentation-accounting-getdelaysc-fix-unused-var-warning.patch
documentation-accounting-getdelaysc-handle-sendto-failures.patch
cgroups-read-write-lock-clone_thread-forking-per-threadgroup.patch
cgroups-add-per-thread-subsystem-callbacks.patch
cgroups-make-procs-file-writable.patch
cgroups-use-flex_array-in-attach_proc.patch
cgroup-remove-the-ns_cgroup.patch
mm-move-enum-vm_event_item-into-a-standalone-header-file.patch
memcg-count-the-soft_limit-reclaim-in-global-background-reclaim.patch
memcg-add-stats-to-monitor-soft_limit-reclaim.patch
add-the-pagefault-count-into-memcg-stats.patch
add-the-pagefault-count-into-memcg-stats-fix.patch
memcg-remove-pointless-next_mz-nullification-in-mem_cgroup_soft_limit_reclaim.patch
memcg-mark-init_section_page_cgroup-properly.patch
memcg-fix-off-by-one-when-calculating-swap-cgroup-map-length.patch
memcg-move-page-freeing-code-out-of-lock.patch
maintainers-add-mm-page_cgroupc-into-memcg-subsystem.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node-cpusets-initialize-spread-rotor-lazily.patch
cpusets-randomize-node-rotor-used-in-cpuset_mem_spread_node-cpusets-initialize-spread-rotor-lazily-fix.patch
asm-generic-ptraceh-start-a-common-low-level-ptrace-helper.patch
blackfin-convert-to-asm-generic-ptraceh.patch
x86-convert-to-asm-generic-ptraceh.patch
sh-convert-to-asm-generic-ptraceh.patch
kgdbts-unify-generalize-gdb-breakpoint-adjustment.patch
signal-introduce-retarget_shared_pending.patch
signal-retarget_shared_pending-consider-shared-unblocked-signals-only.patch
signal-retarget_shared_pending-optimize-while_each_thread-loop.patch
signal-sigprocmask-narrow-the-scope-of-siglock.patch
signal-sigprocmask-should-do-retarget_shared_pending.patch
x86-signal-handle_signal-should-use-set_current_blocked.patch
x86-signal-sys_rt_sigreturn-should-use-set_current_blocked.patch
signal-cleanup-sys_rt_sigprocmask.patch
mm-extract-exe_file-handling-from-procfs.patch
coredump-add-support-for-exe_file-in-core-name.patch
kstrtox-convert-fs-proc.patch
proc-constify-status-array.patch
proc-stat-use-defined-macro-kmalloc_max_size.patch
proc-put-check_mem_permission-after-__get_free_page-in-mem_write.patch
proc-fix-pagemap_read-error-case.patch
cpumask-convert-for_each_cpumask-with-for_each_cpu.patch
cpumask-convert-cpumask_of_cpu-to-cpumask_of.patch
drivers-char-mspecc-use-kvzalloc-to-allocate-memory.patch
fs-partitions-efic-corrupted-guid-partition-tables-can-cause-kernel-oops.patch
fs-partitions-efic-corrupted-guid-partition-tables-can-cause-kernel-oops-fix.patch
sysctl-add-proc_dointvec_bool-handler.patch
sysctl-use-proc_dointvec_bool-where-appropriate.patch
sysctl-add-proc_dointvec_unsigned-handler.patch
sysctl-use-proc_dointvec_unsigned-where-appropriate.patch
pid-fix-typo-in-function-description.patch
fs-execc-provide-the-correct-process-pid-to-the-pipe-helper.patch
drivers-char-ppdevc-put-gotten-port-value.patch
kernel-profilec-use-vzalloc-instead-of-vmallocmemset.patch
scatterlist-new-helper-functions.patch
scatterlist-new-helper-functions-update.patch
scatterlist-new-helper-functions-update-fix.patch
memstick-add-support-for-legacy-memorysticks.patch
memstick-add-support-for-legacy-memorysticks-update-2.patch
w1-add-1-wire-w1-reset-and-resume-command-api-support.patch
w1-add-1-wire-w1-ds2408-8-channel-addressable-switch-support.patch
w1-complete-the-1-wire-w1-ds1wm-driver-search-algorithm.patch
w1-have-netlink-search-update-kernel-list.patch
kexec-remove-kmsg_dump_kexec.patch
kexec-remove-kmsg_dump_kexec-fix.patch
m68knommu-fix-build-error-due-to-the-lack-of-find_next_bit_le.patch
arch-add-define-for-each-of-optimized-find-bitops.patch
bitops-add-ifndef-for-each-of-find-bitops.patch
arch-remove-config_generic_find_next_bitbit_lelast_bit.patch
arm-use-asm-generic-bitops-leh.patch
s390-use-asm-generic-bitops-leh.patch
m68knommu-use-generic-find_next_bit_le.patch
make-sure-nobodys-leaking-resources.patch
journal_add_journal_head-debug.patch
releasing-resources-with-children.patch
make-frame_pointer-default=y.patch
mutex-subsystem-synchro-test-module.patch
mutex-subsystem-synchro-test-module-fix.patch
slab-leaks3-default-y.patch
put_bh-debug.patch
add-debugging-aid-for-memory-initialisation-problems.patch
workaround-for-a-pci-restoring-bug.patch
prio_tree-debugging-patch.patch
single_open-seq_release-leak-diagnostics.patch
add-a-refcount-check-in-dput.patch
memblock-add-input-size-checking-to-memblock_find_region.patch
memblock-add-input-size-checking-to-memblock_find_region-fix.patch


2011-04-30 16:46:21

by Randy Dunlap

Subject: Re: mmotm 2011-04-29-16-25 uploaded

On Fri, 29 Apr 2011 16:26:16 -0700 [email protected] wrote:

> The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/
>
> and will soon be available at
>
> git://zen-kernel.org/kernel/mmotm.git
>
> It contains the following patches against 2.6.39-rc5:


mm-per-node-vmstat-show-proper-vmstats.patch

when CONFIG_PROC_FS is not enabled:

drivers/built-in.o: In function `node_read_vmstat':
node.c:(.text+0x1e995): undefined reference to `vmstat_text'

from drivers/base/node.c

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-05-01 02:35:42

by Hugh Dickins

Subject: [PATCH] mmotm: fix hang at startup

Yesterday's mmotm hangs at startup, and with lockdep it reports:
BUG: spinlock recursion on CPU#1, blkid/284 - with bdi_lock_two()
called from bdev_inode_switch_bdi() in the backtrace. It appears
that this function is sometimes called with new the same as old.

Signed-off-by: Hugh Dickins <[email protected]>
---
Fix to
writeback-split-inode_wb_list_lock-into-bdi_writebacklist_lock.patch

fs/block_dev.c | 2 ++
1 file changed, 2 insertions(+)

--- 2.6.39-rc5-mm1/fs/block_dev.c 2011-04-29 18:20:09.183314733 -0700
+++ linux/fs/block_dev.c 2011-04-30 17:55:45.718785263 -0700
@@ -57,6 +57,8 @@ static void bdev_inode_switch_bdi(struct
 {
 	struct backing_dev_info *old = inode->i_data.backing_dev_info;

+	if (dst == old)
+		return;
 	bdi_lock_two(&old->wb, &dst->wb);
 	spin_lock(&inode->i_lock);
 	inode->i_data.backing_dev_info = dst;
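
The failure mode can be shown in miniature outside the kernel. The sketch below is a user-space model, not the kernel code: `toy_lock`, `toy_lock_two()` and `toy_switch()` are hypothetical names, and the toy lock merely records whether it is held so that a second acquisition can be reported instead of spinning forever. It assumes the two-lock helper takes both locks in address order, which is a common way to avoid ABBA deadlocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a non-recursive spinlock: it only records whether it is held. */
struct toy_lock {
	bool held;
};

/* Returns false where a real spinlock would spin forever on itself. */
bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return false;	/* recursion: this lock is already held */
	l->held = true;
	return true;
}

void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Take two locks in address order; fails if both arguments are the same lock. */
bool toy_lock_two(struct toy_lock *a, struct toy_lock *b)
{
	struct toy_lock *first = (uintptr_t)a < (uintptr_t)b ? a : b;
	struct toy_lock *second = (first == a) ? b : a;

	if (!toy_lock_acquire(first))
		return false;
	if (!toy_lock_acquire(second)) {
		toy_lock_release(first);
		return false;
	}
	return true;
}

/* Guarded switch: bail out early when there is nothing to switch. */
bool toy_switch(struct toy_lock *old, struct toy_lock *dst)
{
	if (dst == old)
		return true;	/* required: locking the same lock twice would recurse */
	if (!toy_lock_two(old, dst))
		return false;
	/* ... the actual switch would happen here, under both locks ... */
	toy_lock_release(old);
	toy_lock_release(dst);
	return true;
}
```

In this model, `toy_lock_two(&x, &x)` fails because the second acquisition finds the lock already held, while `toy_switch(&x, &x)` succeeds because the early return never reaches the locking helper.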

2011-05-01 03:25:13

by Fengguang Wu

Subject: Re: [PATCH] mmotm: fix hang at startup

On Sun, May 01, 2011 at 10:35:38AM +0800, Hugh Dickins wrote:
> Yesterday's mmotm hangs at startup, and with lockdep it reports:
> BUG: spinlock recursion on CPU#1, blkid/284 - with bdi_lock_two()
> called from bdev_inode_switch_bdi() in the backtrace. It appears
> that this function is sometimes called with new the same as old.
>
> Signed-off-by: Hugh Dickins <[email protected]>

Thanks!

Reviewed-by: Wu Fengguang <[email protected]>

> Fix to
> writeback-split-inode_wb_list_lock-into-bdi_writebacklist_lock.patch
>
> fs/block_dev.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> --- 2.6.39-rc5-mm1/fs/block_dev.c 2011-04-29 18:20:09.183314733 -0700
> +++ linux/fs/block_dev.c 2011-04-30 17:55:45.718785263 -0700
> @@ -57,6 +57,8 @@ static void bdev_inode_switch_bdi(struct
> {
> struct backing_dev_info *old = inode->i_data.backing_dev_info;
>
> + if (dst == old)
> + return;

nitpick: it could help to add a comment

/* avoid spinlock recursion */

to indicate that's not merely an optional optimization, but indeed
required for correctness.

Thanks,
Fengguang

> bdi_lock_two(&old->wb, &dst->wb);
> spin_lock(&inode->i_lock);
> inode->i_data.backing_dev_info = dst;

2011-05-01 07:47:47

by KOSAKI Motohiro

Subject: Re: mmotm 2011-04-29-16-25 uploaded

> On Fri, 29 Apr 2011 16:26:16 -0700 [email protected] wrote:
>
> > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
> >
> > and will soon be available at
> >
> > git://zen-kernel.org/kernel/mmotm.git
> >
> > It contains the following patches against 2.6.39-rc5:
>
>
> mm-per-node-vmstat-show-proper-vmstats.patch
>
> when CONFIG_PROC_FS is not enabled:
>
> drivers/built-in.o: In function `node_read_vmstat':
> node.c:(.text+0x1e995): undefined reference to `vmstat_text'
>
> from drivers/base/node.c


Thank you for finding that!



From 63ad7c06f082f8423c033b9f54070e14d561db7e Mon Sep 17 00:00:00 2001
From: KOSAKI Motohiro <[email protected]>
Date: Sun, 1 May 2011 16:00:09 +0900
Subject: [PATCH] vmstat: fix build error when SYSFS=y and PROC_FS=n

Randy Dunlap pointed out that node.c fails to build when
PROC_FS=n, because node_read_vmstat() in node.c uses vmstat_text,
which is only built when PROC_FS is enabled.

This patch therefore builds vmstat_text when either SYSFS or
PROC_FS is enabled.

Reported-by: Randy Dunlap <[email protected]>
Signed-off-by: KOSAKI Motohiro <[email protected]>
---
mm/vmstat.c | 261 ++++++++++++++++++++++++++++++-----------------------------
1 files changed, 132 insertions(+), 129 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index a2b7344..20c18b7 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -659,6 +659,138 @@ static void walk_zones_in_node(struct seq_file *m, pg_data_t *pgdat,
}
#endif

+#if defined(CONFIG_PROC_FS) || defined(CONFIG_SYSFS)
+#ifdef CONFIG_ZONE_DMA
+#define TEXT_FOR_DMA(xx) xx "_dma",
+#else
+#define TEXT_FOR_DMA(xx)
+#endif
+
+#ifdef CONFIG_ZONE_DMA32
+#define TEXT_FOR_DMA32(xx) xx "_dma32",
+#else
+#define TEXT_FOR_DMA32(xx)
+#endif
+
+#ifdef CONFIG_HIGHMEM
+#define TEXT_FOR_HIGHMEM(xx) xx "_high",
+#else
+#define TEXT_FOR_HIGHMEM(xx)
+#endif
+
+#define TEXTS_FOR_ZONES(xx) TEXT_FOR_DMA(xx) TEXT_FOR_DMA32(xx) xx "_normal", \
+ TEXT_FOR_HIGHMEM(xx) xx "_movable",
+
+const char * const vmstat_text[] = {
+ /* Zoned VM counters */
+ "nr_free_pages",
+ "nr_inactive_anon",
+ "nr_active_anon",
+ "nr_inactive_file",
+ "nr_active_file",
+ "nr_unevictable",
+ "nr_mlock",
+ "nr_anon_pages",
+ "nr_mapped",
+ "nr_file_pages",
+ "nr_dirty",
+ "nr_writeback",
+ "nr_slab_reclaimable",
+ "nr_slab_unreclaimable",
+ "nr_page_table_pages",
+ "nr_kernel_stack",
+ "nr_unstable",
+ "nr_bounce",
+ "nr_vmscan_write",
+ "nr_writeback_temp",
+ "nr_isolated_anon",
+ "nr_isolated_file",
+ "nr_shmem",
+ "nr_dirtied",
+ "nr_written",
+
+#ifdef CONFIG_NUMA
+ "numa_hit",
+ "numa_miss",
+ "numa_foreign",
+ "numa_interleave",
+ "numa_local",
+ "numa_other",
+#endif
+ "nr_anon_transparent_hugepages",
+ "nr_dirty_threshold",
+ "nr_dirty_background_threshold",
+
+#ifdef CONFIG_VM_EVENT_COUNTERS
+ "pgpgin",
+ "pgpgout",
+ "pswpin",
+ "pswpout",
+
+ TEXTS_FOR_ZONES("pgalloc")
+
+ "pgfree",
+ "pgactivate",
+ "pgdeactivate",
+
+ "pgfault",
+ "pgmajfault",
+
+ TEXTS_FOR_ZONES("pgrefill")
+ TEXTS_FOR_ZONES("pgsteal")
+ TEXTS_FOR_ZONES("pgscan_kswapd")
+ TEXTS_FOR_ZONES("pgscan_direct")
+
+#ifdef CONFIG_NUMA
+ "zone_reclaim_failed",
+#endif
+ "pginodesteal",
+ "slabs_scanned",
+ "kswapd_steal",
+ "kswapd_inodesteal",
+ "kswapd_low_wmark_hit_quickly",
+ "kswapd_high_wmark_hit_quickly",
+ "kswapd_skip_congestion_wait",
+ "pageoutrun",
+ "allocstall",
+
+ "pgrotated",
+
+#ifdef CONFIG_COMPACTION
+ "compact_blocks_moved",
+ "compact_pages_moved",
+ "compact_pagemigrate_failed",
+ "compact_stall",
+ "compact_fail",
+ "compact_success",
+#endif
+
+#ifdef CONFIG_HUGETLB_PAGE
+ "htlb_buddy_alloc_success",
+ "htlb_buddy_alloc_fail",
+#endif
+ "unevictable_pgs_culled",
+ "unevictable_pgs_scanned",
+ "unevictable_pgs_rescued",
+ "unevictable_pgs_mlocked",
+ "unevictable_pgs_munlocked",
+ "unevictable_pgs_cleared",
+ "unevictable_pgs_stranded",
+ "unevictable_pgs_mlockfreed",
+
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+ "thp_fault_alloc",
+ "thp_fault_fallback",
+ "thp_collapse_alloc",
+ "thp_collapse_alloc_failed",
+ "thp_split",
+#endif
+
+#endif /* CONFIG_VM_EVENTS_COUNTERS */
+};
+#endif /* CONFIG_PROC_FS || CONFIG_SYSFS */
+
+
#ifdef CONFIG_PROC_FS
static void frag_show_print(struct seq_file *m, pg_data_t *pgdat,
struct zone *zone)
@@ -831,135 +963,6 @@ static const struct file_operations pagetypeinfo_file_ops = {
.release = seq_release,
};

-#ifdef CONFIG_ZONE_DMA
-#define TEXT_FOR_DMA(xx) xx "_dma",
-#else
-#define TEXT_FOR_DMA(xx)
-#endif
-
-#ifdef CONFIG_ZONE_DMA32
-#define TEXT_FOR_DMA32(xx) xx "_dma32",
-#else
-#define TEXT_FOR_DMA32(xx)
-#endif
-
-#ifdef CONFIG_HIGHMEM
-#define TEXT_FOR_HIGHMEM(xx) xx "_high",
-#else
-#define TEXT_FOR_HIGHMEM(xx)
-#endif
-
-#define TEXTS_FOR_ZONES(xx) TEXT_FOR_DMA(xx) TEXT_FOR_DMA32(xx) xx "_normal", \
- TEXT_FOR_HIGHMEM(xx) xx "_movable",
-
-const char * const vmstat_text[] = {
- /* Zoned VM counters */
- "nr_free_pages",
- "nr_inactive_anon",
- "nr_active_anon",
- "nr_inactive_file",
- "nr_active_file",
- "nr_unevictable",
- "nr_mlock",
- "nr_anon_pages",
- "nr_mapped",
- "nr_file_pages",
- "nr_dirty",
- "nr_writeback",
- "nr_slab_reclaimable",
- "nr_slab_unreclaimable",
- "nr_page_table_pages",
- "nr_kernel_stack",
- "nr_unstable",
- "nr_bounce",
- "nr_vmscan_write",
- "nr_writeback_temp",
- "nr_isolated_anon",
- "nr_isolated_file",
- "nr_shmem",
- "nr_dirtied",
- "nr_written",
-
-#ifdef CONFIG_NUMA
- "numa_hit",
- "numa_miss",
- "numa_foreign",
- "numa_interleave",
- "numa_local",
- "numa_other",
-#endif
- "nr_anon_transparent_hugepages",
- "nr_dirty_threshold",
- "nr_dirty_background_threshold",
-
-#ifdef CONFIG_VM_EVENT_COUNTERS
- "pgpgin",
- "pgpgout",
- "pswpin",
- "pswpout",
-
- TEXTS_FOR_ZONES("pgalloc")
-
- "pgfree",
- "pgactivate",
- "pgdeactivate",
-
- "pgfault",
- "pgmajfault",
-
- TEXTS_FOR_ZONES("pgrefill")
- TEXTS_FOR_ZONES("pgsteal")
- TEXTS_FOR_ZONES("pgscan_kswapd")
- TEXTS_FOR_ZONES("pgscan_direct")
-
-#ifdef CONFIG_NUMA
- "zone_reclaim_failed",
-#endif
- "pginodesteal",
- "slabs_scanned",
- "kswapd_steal",
- "kswapd_inodesteal",
- "kswapd_low_wmark_hit_quickly",
- "kswapd_high_wmark_hit_quickly",
- "kswapd_skip_congestion_wait",
- "pageoutrun",
- "allocstall",
-
- "pgrotated",
-
-#ifdef CONFIG_COMPACTION
- "compact_blocks_moved",
- "compact_pages_moved",
- "compact_pagemigrate_failed",
- "compact_stall",
- "compact_fail",
- "compact_success",
-#endif
-
-#ifdef CONFIG_HUGETLB_PAGE
- "htlb_buddy_alloc_success",
- "htlb_buddy_alloc_fail",
-#endif
- "unevictable_pgs_culled",
- "unevictable_pgs_scanned",
- "unevictable_pgs_rescued",
- "unevictable_pgs_mlocked",
- "unevictable_pgs_munlocked",
- "unevictable_pgs_cleared",
- "unevictable_pgs_stranded",
- "unevictable_pgs_mlockfreed",
-
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
- "thp_fault_alloc",
- "thp_fault_fallback",
- "thp_collapse_alloc",
- "thp_collapse_alloc_failed",
- "thp_split",
-#endif
-
-#endif /* CONFIG_VM_EVENTS_COUNTERS */
-};
-
static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
struct zone *zone)
{
--
1.7.3.1
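
The table moved by this patch relies on two preprocessor tricks: adjacent string literals are concatenated, and each per-zone macro expands either to one array entry with a trailing comma or to nothing. The sketch below is a cut-down illustration, not the kernel source: `DEMO_ZONE_DMA`, `DEMO_HIGHMEM`, `demo_text` and `DEMO_TEXT_COUNT` are hypothetical names standing in for the real config symbols and for `vmstat_text`, and only the DMA zone is assumed to be enabled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a kernel config symbol; DEMO_HIGHMEM is left undefined. */
#define DEMO_ZONE_DMA 1

#ifdef DEMO_ZONE_DMA
#define TEXT_FOR_DMA(xx) xx "_dma",
#else
#define TEXT_FOR_DMA(xx)
#endif

#ifdef DEMO_HIGHMEM
#define TEXT_FOR_HIGHMEM(xx) xx "_high",
#else
#define TEXT_FOR_HIGHMEM(xx)
#endif

/* Expands to one array entry per enabled zone, each followed by a comma. */
#define TEXTS_FOR_ZONES(xx) TEXT_FOR_DMA(xx) xx "_normal", \
				TEXT_FOR_HIGHMEM(xx) xx "_movable",

/* The array lives outside any feature-specific #ifdef so every user can link to it. */
const char * const demo_text[] = {
	"pgpgin",
	TEXTS_FOR_ZONES("pgalloc")
	"pgfree",
};

#define DEMO_TEXT_COUNT (sizeof(demo_text) / sizeof(demo_text[0]))
```

With this configuration the array holds five entries: `pgpgin`, `pgalloc_dma`, `pgalloc_normal`, `pgalloc_movable` and `pgfree`. If such an array is defined only inside one feature's `#ifdef` block while another feature references it, the second feature hits an undefined reference at link time, which is the shape of the error reported against node.c.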



2011-05-02 00:27:37

by Valdis Klētnieks

Subject: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Fri, 29 Apr 2011 16:26:16 PDT, [email protected] said:
> The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
>
> http://userweb.kernel.org/~akpm/mmotm/

Dell Latitude E6500 laptop, Core2 Duo P8700, 4G RAM, 2G swap. x86_64 kernel.

I was running a backup of the system to an external USB hard drive. Source and
target filesystems were ext4 on LVM on a LUKS encrypted partition. Same backup
script to same destination drive worked fine a few days ago on a -rc1-mmotm0331
kernel.

System ran out of RAM, and went about 50M into the 2G of swap. Not sure why *that*
happened, as previously the backup script didn't cause any swapping. After that, the
VmRSS and VmHWM values were corrupted for some 20 processes, including systemd,
the X server, pidgin, firefox, and rsyslogd.

Nothing notable in dmesg output, nothing noted by abrtd, no processes crashed
or misbehaving that I can tell. Just wonky numbers.

top says:

Tasks: 186 total, 3 running, 183 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.1%us, 9.1%sy, 0.0%ni, 74.8%id, 6.7%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 4028664k total, 3839128k used, 189536k free, 1728880k buffers
Swap: 2097148k total, 52492k used, 2044656k free, 1081528k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
47720 root 20 0 0 0 0 R 17.6 0.0 0:21.64 kworker/0:0
47453 root 20 0 0 0 0 D 13.7 0.0 0:36.10 kworker/1:3
26854 root 20 0 0 0 0 R 3.9 0.0 1:02.24 usb-storage
46917 root 20 0 18192 ? 208 D 3.9 457887369396224.0 4:18.50 dump
46918 root 20 0 18192 ? 208 S 3.9 457887369396224.0 4:18.38 dump
46919 root 20 0 18192 ? 208 D 3.9 457887369396224.0 4:18.48 dump
3 root 20 0 0 0 0 S 2.0 0.0 0:29.20 ksoftirqd/0
13 root 20 0 0 0 0 S 2.0 0.0 0:29.13 ksoftirqd/1
5467 root 20 0 12848 448 168 S 2.0 0.0 30:59.25 eTSrv
5655 root 20 0 178m ? ? S 2.0 457887369396224.0 89:18.03 Xorg
6079 valdis 20 0 347m 3936 5440 S 2.0 0.1 31:17.23 gkrellm
6479 valdis 20 0 1251m ? ? S 2.0 457887369396224.0 46:33.43 firef
46916 root 20 0 22296 2708 328 S 2.0 0.1 0:39.38 dump
48406 root 20 0 0 0 0 S 2.0 0.0 0:00.06 kworker/1:1
1 root 20 0 72228 ? 924 S 0.0 457887369396224.0 0:06.69 syste

grep ^Vm /proc/5655/status (the X server)

VmPeak: 215788 kB
VmSize: 182440 kB
VmLck: 0 kB
VmHWM: 18446744073709544032 kB
VmRSS: 18446744073408330104 kB
VmData: 67688 kB
VmStk: 288 kB
VmExe: 1824 kB
VmLib: 37800 kB
VmPTE: 308 kB
VmSwap: 0 kB

Probably noteworthy - the HWM in hex is FFFFFFFFFFFFE260, and
similarly for VmRSS. Looks like an underflow someplace?
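
The underflow reading can be checked with a little arithmetic: reinterpreting the unsigned 64-bit kB figures as signed values turns them into small negative numbers. The helper below is only an illustration of that arithmetic; `as_signed_kb()` is a hypothetical name and nothing here is taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reinterpret an unsigned 64-bit counter as a signed value without relying
 * on implementation-defined conversion: values above INT64_MAX are mapped
 * to the negative number they would represent in two's complement.
 */
int64_t as_signed_kb(uint64_t kb)
{
	if (kb > (uint64_t)INT64_MAX)
		return -(int64_t)(UINT64_MAX - kb) - 1;
	return (int64_t)kb;
}
```

For the X server above, `as_signed_kb(18446744073709544032ULL)` gives -7584 (the VmHWM value, 0xFFFFFFFFFFFFE260), and `as_signed_kb(18446744073408330104ULL)` gives -301221512 for VmRSS, which is consistent with a counter that was decremented below zero.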

It ended up hitting a bunch of processes:

grep 184467 /proc/*/status
/proc/1/status:VmHWM: 18446744073709551612 kB
/proc/1/status:VmRSS: 18446744073709550072 kB
/proc/26902/status:VmHWM: 18446744073709548820 kB
/proc/26902/status:VmRSS: 18446744073709547948 kB
/proc/27079/status:VmHWM: 18446744073709546764 kB
/proc/27079/status:VmRSS: 18446744073709382820 kB
/proc/28359/status:VmHWM: 18446744073709550700 kB
/proc/28359/status:VmRSS: 18446744073709510496 kB
/proc/42136/status:VmHWM: 18446744073709550528 kB
/proc/42136/status:VmRSS: 18446744073709549656 kB
/proc/46917/status:VmHWM: 18446744073709551568 kB
/proc/46917/status:VmRSS: 18446744073640042856 kB
/proc/46918/status:VmHWM: 18446744073709551568 kB
/proc/46918/status:VmRSS: 18446744073640042056 kB
/proc/46919/status:VmHWM: 18446744073709551568 kB
/proc/46919/status:VmRSS: 18446744073640037512 kB
/proc/4742/status:VmHWM: 18446744073709550144 kB
/proc/4742/status:VmRSS: 18446744073709549520 kB
/proc/4821/status:VmHWM: 18446744073709519576 kB
/proc/4821/status:VmRSS: 18446744073709519428 kB
/proc/5412/status:VmHWM: 18446744073709547064 kB
/proc/5412/status:VmRSS: 18446744073709546976 kB
/proc/5641/status:VmHWM: 18446744073709027168 kB
/proc/5641/status:VmRSS: 18446744073708532364 kB
/proc/5655/status:VmHWM: 18446744073709544032 kB
/proc/5655/status:VmRSS: 18446744073407790088 kB
/proc/5856/status:VmHWM: 18446744073709550760 kB
/proc/5856/status:VmRSS: 18446744073708844568 kB
/proc/5997/status:VmHWM: 18446744073709308884 kB
/proc/5997/status:VmRSS: 18446744073411781076 kB
/proc/6306/status:VmHWM: 18446744073709546960 kB
/proc/6306/status:VmRSS: 18446744073709425144 kB
/proc/6416/status:VmHWM: 18446744073709532884 kB
/proc/6416/status:VmRSS: 18446744073706032272 kB
/proc/6446/status:VmHWM: 18446744073709534900 kB
/proc/6446/status:VmRSS: 18446744073709527604 kB
/proc/6479/status:VmHWM: 18446744073709547196 kB
/proc/6479/status:VmRSS: 18446744073654889656 kB
/proc/6555/status:VmHWM: 18446744073709551612 kB
/proc/6555/status:VmRSS: 18446744073709526840 kB
/proc/6647/status:VmHWM: 18446744073709549680 kB
/proc/6647/status:VmRSS: 18446744073685279348 kB

Any ideas? The backup has finished, but the corrupted values are hanging around.
Not sure if it's repeatable.


2011-05-02 14:37:31

by Valdis Klētnieks

Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Sun, 01 May 2011 20:26:54 EDT, [email protected] said:
> On Fri, 29 Apr 2011 16:26:16 PDT, [email protected] said:
> > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> >
> > http://userweb.kernel.org/~akpm/mmotm/
>
> Dell Latitude E6500 laptop, Core2 Duo P8700, 4G RAM, 2G swap. x86_64 kernel.
>
> I was running a backup of the system to an external USB hard drive.

The backup is a red herring. I am seeing it again after only 20 minutes of
uptime, and so far I've only gotten 1.2G or so into the 4G of RAM (2.5G still
free) and have not touched swap yet.

Aha! I have a reproducer (found while composing this note). /bin/su will
reliably trigger it (4 tries out of 4, launching from a bash shell that itself
has sane VmRSS and VmHWM values). So it's a specific code sequence doing it
(probably one syscall doing something quirky).

Now if I could figure out how to make strace look at the VmRSS after each
syscall, or get gdb to do similar. Any suggestions? Am open to perf/other
solutions as well, if anybody has one handy...
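
A minimal sketch of one possible approach, not a tested recipe: sample /proc/<pid>/status from a small helper while single-stepping the target under gdb or strace, and flag the first sample where a field looks wrapped. The function names and the 2^62 kB threshold are assumptions made for illustration, and this does not by itself give per-syscall granularity.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Find "key:" at the start of a line in the text of a status file and
 * parse the number that follows. Returns 0 on success, -1 if not found.
 */
int status_field_kb(const char *text, const char *key, unsigned long long *out)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            /* strtoull() skips the tabs/spaces before the digits. */
            *out = strtoull(line + klen + 1, NULL, 10);
            return 0;
        }
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Assumed heuristic: no real process has more than 2^62 kB resident. */
int looks_wrapped(unsigned long long kb)
{
    return kb > (1ULL << 62);
}

/* Read one field (e.g. "VmRSS") for a live process. Returns 0 or -1. */
int read_status_field_kb(int pid, const char *key, unsigned long long *out)
{
    char path[64], buf[8192];
    size_t n;
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/status", pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return status_field_kb(buf, key, out);
}
```

Calling read_status_field_kb(pid, "VmRSS", &kb) after each stop of the traced process, and checking looks_wrapped(kb), would narrow down which step first produces a bogus value.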



2011-05-02 15:46:35

by Randy Dunlap

[permalink] [raw]
Subject: Re: mmotm 2011-04-29-16-25 uploaded

On Sun, 1 May 2011 16:47:43 +0900 (JST) KOSAKI Motohiro wrote:

> > On Fri, 29 Apr 2011 16:26:16 -0700 [email protected] wrote:
> >
> > > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> > >
> > > http://userweb.kernel.org/~akpm/mmotm/
> > >
> > > and will soon be available at
> > >
> > > git://zen-kernel.org/kernel/mmotm.git
> > >
> > > It contains the following patches against 2.6.39-rc5:
> >
> >
> > mm-per-node-vmstat-show-proper-vmstats.patch
> >
> > when CONFIG_PROC_FS is not enabled:
> >
> > drivers/built-in.o: In function `node_read_vmstat':
> > node.c:(.text+0x1e995): undefined reference to `vmstat_text'
> >
> > from drivers/base/node.c
>
>
> Thank you for finding that!
>
>
>
> From 63ad7c06f082f8423c033b9f54070e14d561db7e Mon Sep 17 00:00:00 2001
> From: KOSAKI Motohiro <[email protected]>
> Date: Sun, 1 May 2011 16:00:09 +0900
> Subject: [PATCH] vmstat: fix build error when SYSFS=y and PROC_FS=n
>
> Randy Dunlap pointed out that node.c causes a build error when
> PROC_FS=n, because node.c#node_read_vmstat() uses vmstat_text,
> which depends on PROC_FS.
>
> Thus, this patch changes it to depend on both SYSFS and PROC_FS.
>
> Reported-by: Randy Dunlap <[email protected]>

Acked-by: Randy Dunlap <[email protected]>

Thanks.

> Signed-off-by: KOSAKI Motohiro <[email protected]>
> ---
> mm/vmstat.c | 261 ++++++++++++++++++++++++++++++-----------------------------
> 1 files changed, 132 insertions(+), 129 deletions(-)


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-05-02 23:44:54

by Andrew Morton

[permalink] [raw]
Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Mon, 02 May 2011 10:37:22 -0400
[email protected] wrote:

> On Sun, 01 May 2011 20:26:54 EDT, [email protected] said:
> > On Fri, 29 Apr 2011 16:26:16 PDT, [email protected] said:
> > > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> > >
> > > http://userweb.kernel.org/~akpm/mmotm/
> >
> > Dell Latitude E6500 laptop, Core2 Duo P8700, 4G RAM, 2G swap. x86_64 kernel.
> >
> > I was running a backup of the system to an external USB hard drive.
>
> Is a red herring. Am seeing it again, after only 20 minutes of uptime, and so
> far I've only gotten 1.2G or so into the 4G ram (2.5G still free), and never
> touched swap yet.
>
> Aha! I have a reproducer (found while composing this note). /bin/su will
> reliably trigger it (4 tries out of 4, launching from a bash shell that itself
> has sane VmRSS and VmHWM values). So it's a specific code sequence doing it
> (probably one syscall doing something quirky).
>
> Now if I could figure out how to make strace look at the VmRSS after each
> syscall, or get gdb to do similar. Any suggestions? Am open to perf/other
> solutions as well, if anybody has one handy...
>

hm, me too. After boot, hald has a get_mm_counter(mm, MM_ANONPAGES) of
0xffffffffffff3c27. Bisected to Peter's
mm-extended-batches-for-generic-mmu_gather.patch, can't see how it did
that.

2011-05-02 23:57:42

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Mon, 02 May 2011 16:44:30 PDT, Andrew Morton said:

> hm, me too. After boot, hald has a get_mm_counter(mm, MM_ANONPAGES) of
> 0xffffffffffff3c27. Bisected to Peter's
> mm-extended-batches-for-generic-mmu_gather.patch, can't see how it did
> that.

Looking at it:
@@ -177,15 +205,24 @@ tlb_finish_mmu(struct mmu_gather *tlb, u
  */
 static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
 {
+        struct mmu_gather_batch *batch;
+
         tlb->need_flush = 1;
+
         if (tlb_fast_mode(tlb)) {
                 free_page_and_swap_cache(page);
                 return 1; /* avoid calling tlb_flush_mmu() */
         }
-        tlb->pages[tlb->nr++] = page;
-        VM_BUG_ON(tlb->nr > tlb->max);
 
-        return tlb->max - tlb->nr;
+        batch = tlb->active;
+        batch->pages[batch->nr++] = page;
+        VM_BUG_ON(batch->nr > batch->max);
+        if (batch->nr == batch->max) {
+                if (!tlb_next_batch(tlb))
+                        return 0;
+        }
+
+        return batch->max - batch->nr;
 }

Who's initializing/setting batch->max? Perhaps whoever set up tlb->active
failed to do so?




2011-05-10 16:01:49

by Peter Zijlstra

[permalink] [raw]
Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Mon, 2011-05-02 at 16:44 -0700, Andrew Morton wrote:
> On Mon, 02 May 2011 10:37:22 -0400
> [email protected] wrote:
>
> > On Sun, 01 May 2011 20:26:54 EDT, [email protected] said:
> > > On Fri, 29 Apr 2011 16:26:16 PDT, [email protected] said:
> > > > The mm-of-the-moment snapshot 2011-04-29-16-25 has been uploaded to
> > > >
> > > > http://userweb.kernel.org/~akpm/mmotm/
> > >
> > > Dell Latitude E6500 laptop, Core2 Duo P8700, 4G RAM, 2G swap. x86_64 kernel.
> > >
> > > I was running a backup of the system to an external USB hard drive.
> >
> > Is a red herring. Am seeing it again, after only 20 minutes of uptime, and so
> > far I've only gotten 1.2G or so into the 4G ram (2.5G still free), and never
> > touched swap yet.
> >
> > Aha! I have a reproducer (found while composing this note). /bin/su will
> > reliably trigger it (4 tries out of 4, launching from a bash shell that itself
> > has sane VmRSS and VmHWM values). So it's a specific code sequence doing it
> > (probably one syscall doing something quirky).
> >
> > Now if I could figure out how to make strace look at the VmRSS after each
> > syscall, or get gdb to do similar. Any suggestions? Am open to perf/other
> > solutions as well, if anybody has one handy...
> >
>
> hm, me too. After boot, hald has a get_mm_counter(mm, MM_ANONPAGES) of
> > 0xffffffffffff3c27. Bisected to Peter's
> mm-extended-batches-for-generic-mmu_gather.patch, can't see how it did
> that.
>

I haven't quite figured out how to reproduce, but does the below cure
things? If so, it should probably be folded into the first patch
(mm-mmu_gather-rework.patch?) since that is the one introducing this.

---
Subject: mm: Fix RSS zap_pte_range() accounting

Since we update the RSS counters when breaking out of the loop and
release the PTE lock, we should start with fresh deltas when we
restart the gather loop.

Reported-by: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
---
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -1120,8 +1120,8 @@ static unsigned long zap_pte_range(struc
         spinlock_t *ptl;
         pte_t *pte;
 
-        init_rss_vec(rss);
 again:
+        init_rss_vec(rss);
         pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
         arch_enter_lazy_mmu_mode();
         do {
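
To illustrate the reasoning in the changelog above, here is a simplified user-space model, not kernel code. It assumes the loop accumulates a local delta, adds that delta to the shared counter each time it breaks out, and then jumps back to the label if work remains. The function simulate_zap() and its parameters are hypothetical names for this sketch only.

```c
/*
 * Simplified model of a gather loop that is restarted via "goto again".
 * counter: stand-in for the mm's RSS counter (in pages)
 * npages:  number of pages to unmap in total
 * chunk:   how many pages fit in one batch before the loop breaks out
 * reset_on_restart: 1 models the patched code (delta reset after the
 *                   label), 0 models the old code (reset only once)
 * Returns the final counter value.
 */
long simulate_zap(long counter, int npages, int chunk, int reset_on_restart)
{
    long delta = 0;   /* stand-in for the local rss[] deltas */
    int done = 0;
    int batch;

again:
    if (reset_on_restart)
        delta = 0;    /* fresh deltas for every pass */

    batch = 0;
    while (done < npages) {
        delta--;      /* one page unmapped */
        done++;
        if (++batch == chunk)
            break;    /* batch full: leave the loop early */
    }

    counter += delta; /* deltas are applied on every exit from the loop */

    if (done < npages)
        goto again;

    return counter;
}
```

With counter = 10, npages = 10 and chunk = 4, the reset-on-restart variant ends at 0, while the variant that resets only once applies -4, then -8, then -10 and ends at -12. Printed as an unsigned long, a negative counter of that kind would show up as a value just below 2^64.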

2011-05-11 22:53:36

by Andrew Morton

[permalink] [raw]
Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Tue, 10 May 2011 18:04:45 +0200
Peter Zijlstra <[email protected]> wrote:

> > hm, me too. After boot, hald has a get_mm_counter(mm, MM_ANONPAGES) of
> > 0xffffffffffff3c27. Bisected to Peter's
> > mm-extended-batches-for-generic-mmu_gather.patch, can't see how it did
> > that.
> >
>
> I haven't quite figured out how to reproduce, but does the below cure
> things? If so, it should probably be folded into the first patch
> (mm-mmu_gather-rework.patch?) since that is the one introducing this.
>
> ---
> Subject: mm: Fix RSS zap_pte_range() accounting
>
> Since we update the RSS counters when breaking out of the loop and
> release the PTE lock, we should start with fresh deltas when we
> restart the gather loop.
>
> Reported-by: [email protected]
> Signed-off-by: Peter Zijlstra <[email protected]>
> ---
> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -1120,8 +1120,8 @@ static unsigned long zap_pte_range(struc
>          spinlock_t *ptl;
>          pte_t *pte;
>
> -        init_rss_vec(rss);
>  again:
> +        init_rss_vec(rss);
>          pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
>          arch_enter_lazy_mmu_mode();
>          do {

That fixed the negative hald VmHWM output on my test box.

2011-05-12 01:08:54

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: mmotm 2011-04-29 - wonky VmRSS and VmHWM values after swapping

On Tue, 10 May 2011 18:04:45 +0200, Peter Zijlstra said:

> I haven't quite figured out how to reproduce, but does the below cure
> things? If so, it should probably be folded into the first patch
> (mm-mmu_gather-rework.patch?) since that is the one introducing this.
>
> ---
> Subject: mm: Fix RSS zap_pte_range() accounting
>
> Since we update the RSS counters when breaking out of the loop and
> release the PTE lock, we should start with fresh deltas when we
> restart the gather loop.

Confirming this patch makes the numbers look sane and plausible again, so
feel free to stick a Tested-By: to match the Reported-By:.

