2016-12-25 21:36:39

by Klaus Ethgen

Subject: Bug 4.9 and memory management

Hello,

Over the last few days I compiled version 4.9 for my i386 laptop (a
Lenovo X61s).

At first everything seemed fine, but after some suspend-to-RAM and
resume cycles I saw some really weird behaviour, ending in OOM kills or
even a complete freeze of the laptop.

What I was able to see is that the system went to swap even though
there was plenty of memory left. The OOM kills also happened with
plenty of memory free.

Once I also caught kswapd0 running wild at 100% CPU utilization.

I first suspected the CONFIG_SLAB_FREELIST_RANDOM setting and disabled
it. That did not make the problem go away, but it seemed to help a
little; further OOMs and other strange behaviour still happened.

I have now gone back to 4.8.15 with the same config as 4.9, and
everything is back to normal.

So it looks to me like there is some really strange memory leak in 4.9.
The biggest problem is that I do not know how to reproduce it reliably.
All I know is that it happened after several suspends (not necessarily
the first one).

Am I the only one seeing this behaviour, or does anybody have an idea
what could have gone wrong?

For reference I have attached the .configs of the two builds to this
mail.

Please keep me in CC as I am not subscribed to LKML.

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.ch/
pub 4096R/4E20AF1C 2011-05-16 Klaus Ethgen <[email protected]>
Fingerprint: 85D4 CA42 952C 949B 1753 62B3 79D0 B06F 4E20 AF1C


Attachments:
signature.asc (688.00 B)

2016-12-26 11:00:59

by Michal Hocko

Subject: Re: Bug 4.9 and memory management

[CCing linux-mm]

On Sun 25-12-16 21:52:52, Klaus Ethgen wrote:
> Hello,
>
> Over the last few days I compiled version 4.9 for my i386 laptop (a
> Lenovo X61s).

Do you have memory cgroups enabled at runtime (i.e. does the same happen
with cgroup_disable=memory)?
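
If you are not sure, /proc/cgroups will tell you. Here is a minimal
userspace sketch of my own (not from any tree, and assuming the usual
four-column layout of that file) that prints whether the memory
controller is enabled:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char name[64];
	int hierarchy, num_cgroups, enabled;
	FILE *f = fopen("/proc/cgroups", "r");

	if (!f) {
		perror("/proc/cgroups");
		return 1;
	}
	fscanf(f, "%*[^\n]");	/* skip the #subsys_name header line */
	while (fscanf(f, "%63s %d %d %d", name, &hierarchy,
		      &num_cgroups, &enabled) == 4) {
		if (!strcmp(name, "memory"))
			printf("memory cgroup %s\n",
			       enabled ? "enabled" : "disabled");
	}
	fclose(f);
	return 0;
}

With cgroup_disable=memory on the kernel command line, the memory line
should report 0 in the enabled column.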

> At first everything seemed fine, but after some suspend-to-RAM and
> resume cycles I saw some really weird behaviour, ending in OOM kills or
> even a complete freeze of the laptop.
>
> What I was able to see is that the system went to swap even though
> there was plenty of memory left. The OOM kills also happened with
> plenty of memory free.

Could you paste those OOM reports from the kernel log?

> Once I also caught kswapd0 running wild at 100% CPU utilization.
>
> I first suspected the CONFIG_SLAB_FREELIST_RANDOM setting and disabled
> it. That did not make the problem go away, but it seemed to help a
> little; further OOMs and other strange behaviour still happened.
>
> I have now gone back to 4.8.15 with the same config as 4.9, and
> everything is back to normal.
>
> So it looks to me like there is some really strange memory leak in 4.9.
> The biggest problem is that I do not know how to reproduce it reliably.
> All I know is that it happened after several suspends (not necessarily
> the first one).
>
> Am I the only one seeing this behaviour, or does anybody have an idea
> what could have gone wrong?

No, there have been some reports recently; 32b kernels with memory
cgroups have been broken since 4.8, when the zone LRUs were moved to
nodes.
>
> For reference I have attached the .configs of the two builds to this
> mail.
>
> Please keep me in CC as I am not subscribed to LKML.
>
> Regards
> Klaus

--
Michal Hocko
SUSE Labs

2016-12-27 11:28:56

by Michal Hocko

Subject: Re: Bug 4.9 and memory management

On Mon 26-12-16 12:00:53, Michal Hocko wrote:
> [CCing linux-mm]
>
> On Sun 25-12-16 21:52:52, Klaus Ethgen wrote:
> > Hello,
> >
> > Over the last few days I compiled version 4.9 for my i386 laptop (a
> > Lenovo X61s).
>
> Do you have memory cgroups enabled at runtime (i.e. does the same happen
> with cgroup_disable=memory)?

If this turns out to be memory cgroup related then the patch from
http://lkml.kernel.org/r/[email protected] might
help.

--
Michal Hocko
SUSE Labs

2016-12-27 12:14:23

by Klaus Ethgen

Subject: Re: Bug 4.9 and memory management

I was too quick in saying that my new 4.9 build was doing well. Just
after I wrote that mail, I got the same issue again. I am back on 4.7
now.

OOM:
[34629.315415] Unable to lock GPU to purge memory.
[34629.315542] wicd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), nodemask=0, order=1, oom_score_adj=0
[34629.315547] wicd cpuset=/ mems_allowed=0
[34629.315557] CPU: 1 PID: 2525 Comm: wicd Tainted: G U O 4.9.0 #1
[34629.315560] Hardware name: LENOVO 7669A26/7669A26, BIOS 7NETB3WW (2.13 ) 04/30/2008
[34629.315563] f2cb5ea4 c12b32b0 f2cb5ea4 f0ac8840 c113c861 f2cb5e10 00000206 c12b845f
[34629.315571] f3baa100 f2cb5dac ea28cbb3 f0ac8840 f0ac8c6c c159b37f f2cb5ea4 c10e8843
[34629.315579] 00000000 d298b180 00000000 0013cfe4 c10e84eb 00000284 00000000 00000062
[34629.315586] Call Trace:
[34629.315600] [<c12b32b0>] ? dump_stack+0x44/0x64
[34629.315607] [<c113c861>] ? dump_header+0x5d/0x1b7
[34629.315611] [<c12b845f>] ? ___ratelimit+0x8f/0xf0
[34629.315616] [<c10e8843>] ? oom_kill_process+0x203/0x3d0
[34629.315620] [<c10e84eb>] ? oom_badness.part.12+0xeb/0x160
[34629.315624] [<c10e8cce>] ? out_of_memory+0xde/0x290
[34629.315628] [<c10ec9ec>] ? __alloc_pages_nodemask+0xc3c/0xc50
[34629.315633] [<c1044846>] ? copy_process.part.54+0xe6/0x1490
[34629.315638] [<c10c84fe>] ? __audit_syscall_entry+0xae/0x110
[34629.315642] [<c10011b3>] ? syscall_trace_enter+0x183/0x200
[34629.315645] [<c1045d96>] ? _do_fork+0xd6/0x310
[34629.315649] [<c10c8736>] ? __audit_syscall_exit+0x1d6/0x260
[34629.315653] [<c10014e9>] ? do_fast_syscall_32+0x79/0x130
[34629.315657] [<c14c21e2>] ? sysenter_past_esp+0x47/0x75
[34629.315660] Mem-Info:
[34629.315667] active_anon:41216 inactive_anon:122089 isolated_anon:0
[34629.315667] active_file:345542 inactive_file:145465 isolated_file:1
[34629.315667] unevictable:7274 dirty:41 writeback:0 unstable:0
[34629.315667] slab_reclaimable:53222 slab_unreclaimable:11663
[34629.315667] mapped:43891 shmem:14724 pagetables:797 bounce:0
[34629.315667] free:41833 free_pcp:184 free_cma:0
[34629.315678] Node 0 active_anon:164864kB inactive_anon:488356kB active_file:1382168kB inactive_file:581860kB unevictable:29096kB isolated(anon):0kB isolated(file):4kB mapped:175564kB dirty:164kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 58896kB writeback_tmp:0kB unstable:0kB pages_scanned:9691812 all_unreclaimable? yes
[34629.315685] DMA free:4084kB min:788kB low:984kB high:1180kB active_anon:204kB inactive_anon:400kB active_file:2632kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15908kB mlocked:0kB slab_reclaimable:8128kB slab_unreclaimable:428kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 833 3008 3008
[34629.315698] Normal free:42396kB min:42416kB low:53020kB high:63624kB active_anon:3820kB inactive_anon:39704kB active_file:505188kB inactive_file:540kB unevictable:0kB writepending:164kB present:892920kB managed:854344kB mlocked:0kB slab_reclaimable:204760kB slab_unreclaimable:46224kB kernel_stack:2776kB pagetables:32kB bounce:0kB free_pcp:728kB local_pcp:4kB free_cma:0kB
lowmem_reserve[]: 0 0 17397 17397
[34629.315710] HighMem free:120852kB min:512kB low:28164kB high:55816kB active_anon:160840kB inactive_anon:448252kB active_file:874348kB inactive_file:581224kB unevictable:29096kB writepending:0kB present:2226888kB managed:2226888kB mlocked:29096kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:3156kB bounce:0kB free_pcp:8kB local_pcp:8kB free_cma:0kB
lowmem_reserve[]: 0 0 0 0
[34629.315718] DMA: 61*4kB (UM) 60*8kB (ME) 26*16kB (UME) 28*32kB (ME) 18*64kB (UME) 3*128kB (ME) 2*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4084kB
Normal: 2261*4kB (UME) 1667*8kB (UME) 755*16kB (UME) 222*32kB (UME) 13*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 42396kB
HighMem: 81*4kB (U) 6*8kB (UM) 40*16kB (UM) 691*32kB (UM) 419*64kB (UM) 78*128kB (UM) 102*256kB (UM) 10*512kB (UM) 7*1024kB (UM) 5*2048kB (UM) 3*4096kB (UM) = 120852kB
509332 total pagecache pages
[34629.315778] 1424 pages in swap cache
[34629.315781] Swap cache stats: add 31179, delete 29755, find 254185/259247
[34629.315783] Free swap = 2074532kB
[34629.315785] Total swap = 2096476kB
[34629.315787] 783948 pages RAM
[34629.315789] 556722 pages HighMem/MovableOnly
[34629.315791] 9663 pages reserved
[34629.315793] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[34629.315799] [ 216] 0 216 2927 798 5 0 2 -1000 udevd
[34629.315806] [ 1640] 0 1640 1233 45 5 0 0 0 acpi_fakekeyd
[34629.315811] [ 1659] 0 1659 3233 404 5 0 0 -1000 auditd
[34629.315815] [ 1759] 0 1759 7688 328 8 0 0 0 lxcfs
[34629.315819] [ 1780] 0 1780 5413 76 6 0 0 0 lvmetad
[34629.315822] [ 1803] 0 1803 7910 165 7 0 0 0 rsyslogd
[34629.315826] [ 1865] 0 1865 595 448 3 0 0 0 acpid
[34629.315830] [ 1913] 0 1913 581 379 4 0 0 0 battery-stats-c
[34629.315834] [ 1935] 103 1935 1108 602 5 0 0 0 dbus-daemon
[34629.315838] [ 1983] 8 1983 1058 454 4 0 0 0 nullmailer-send
[34629.315842] [ 2001] 0 2001 1451 553 5 0 0 0 bluetoothd
[34629.315846] [ 2008] 0 2008 2003 989 6 0 0 0 haveged
[34629.315850] [ 2017] 0 2017 7111 423 7 0 0 0 pcscd
[34629.315853] [ 2076] 0 2076 558 18 4 0 0 0 thinkfan
[34629.315857] [ 2077] 0 2077 559 182 4 0 0 0 startpar
[34629.315862] [ 2102] 0 2102 2129 391 5 0 0 -1000 sshd
[34629.315865] [ 2109] 0 2109 1460 337 5 0 0 0 smartd
[34629.315869] [ 2147] 124 2147 824 416 4 0 0 0 ulogd
[34629.315873] [ 2282] 110 2282 5261 2286 9 0 0 0 unbound
[34629.315877] [ 2393] 121 2393 957 470 4 0 119 0 privoxy
[34629.315881] [ 2394] 0 2394 601 0 4 0 21 0 uuidd
[34629.315885] [ 2416] 0 2416 1406 518 5 0 25 0 cron
[34629.315889] [ 2459] 0 2459 1511 2 5 0 75 0 wdm
[34629.315892] [ 2462] 0 2462 1511 412 5 0 77 0 wdm
[34629.315897] [ 2469] 0 2469 32107 9124 28 0 2625 0 Xorg
[34629.315901] [ 2525] 0 2525 8382 3056 10 0 597 0 wicd
[34629.315905] [ 2556] 0 2556 4975 1579 8 0 869 0 wicd-monitor
[34629.315908] [ 2581] 0 2581 553 122 4 0 16 0 mingetty
[34629.315912] [ 2588] 0 2588 1698 588 5 0 145 0 wdm
[34629.315916] [ 2604] 10230 2604 13645 1993 11 0 0 0 fvwm2
[34629.315920] [ 2662] 10230 2662 1136 0 5 0 65 0 dbus-launch
[34629.315924] [ 2663] 10230 2663 1075 595 5 0 21 0 dbus-daemon
[34629.315928] [ 2680] 10230 2680 8460 584 8 0 40 0 gpg-agent
[34629.315932] [ 2684] 10230 2684 3957 478 6 0 61 0 tpb
[34629.315936] [ 2696] 10230 2696 2098 1097 6 0 32 0 xscreensaver
[34629.315940] [ 2698] 10230 2698 2937 404 6 0 72 0 redshift
[34629.315944] [ 2710] 10230 2710 1346 731 5 0 0 0 autocutsel
[34629.315948] [ 2717] 10230 2717 10943 3617 13 0 0 0 gkrellm
[34629.315952] [ 2718] 10230 2718 54515 18285 43 0 0 0 psi-plus
[34629.315955] [ 2719] 10230 2719 11929 7489 15 0 0 0 wicd-client
[34629.315959] [ 2741] 10230 2741 1047 255 5 0 0 0 FvwmCommandS
[34629.315964] [ 2742] 10230 2742 1528 347 5 0 0 0 FvwmEvent
[34629.315968] [ 2743] 10230 2743 12182 1079 9 0 0 0 FvwmAnimate
[34629.315971] [ 2744] 10230 2744 12808 996 11 0 0 0 FvwmButtons
[34629.315975] [ 2745] 10230 2745 13344 1330 11 0 0 0 FvwmProxy
[34629.315979] [ 2746] 10230 2746 1507 316 4 0 0 0 FvwmAuto
[34629.315983] [ 2747] 10230 2747 12803 974 11 0 0 0 FvwmPager
[34629.315987] [ 2748] 10230 2748 581 143 4 0 0 0 sh
[34629.315991] [ 2749] 10230 2749 1063 407 4 0 0 0 stalonetray
[34629.315994] [ 2908] 10230 2908 1073 479 5 0 0 0 xsnow
[34629.315999] [ 2935] 10230 2935 295381 127866 253 0 1 0 firefox.real
[34629.316030] [ 5794] 10230 5794 2816 1470 6 0 0 0 xterm
[34629.316034] [ 5797] 10230 5797 2308 1629 6 0 0 0 zsh
[34629.316038] [ 5962] 10230 5962 3406 2006 6 0 0 0 xterm
[34629.316041] [ 5963] 10230 5963 2495 645 6 0 0 0 ssh
[34629.316045] [ 5970] 0 5970 2899 492 6 0 1 0 sshd
[34629.316050] [ 5974] 10230 5974 8535 531 8 0 0 0 scdaemon
[34629.316053] [ 6008] 10230 6008 2529 291 6 0 0 0 ssh
[34629.316057] [ 6011] 0 6011 2074 1333 6 0 15 0 zsh
[34629.316063] [ 7225] 10230 7225 2813 1423 7 0 0 0 xterm
[34629.316067] [ 7228] 10230 7228 2089 1383 5 0 0 0 zsh
[34629.316070] [24346] 10230 24346 2027 1052 5 0 0 0 xfconfd
[34629.316075] [28935] 10230 28935 2825 1655 6 0 0 0 xterm
[34629.316079] [28938] 10230 28938 1903 1236 6 0 0 0 zsh
[34629.316083] [ 3534] 0 3534 7724 7231 11 0 0 -1000 ulatencyd
[34629.316088] [13919] 0 13919 2190 785 5 0 0 0 wpa_supplicant
[34629.316092] [13965] 0 13965 2026 181 4 0 0 0 dhclient
[34629.316095] [14016] 65534 14016 2289 1431 7 0 0 0 openvpn
[34629.316099] [14024] 120 14024 7407 6469 11 0 0 0 tor
[34629.316104] [18617] 10230 18617 14715 8499 18 0 0 0 vim
[34629.316108] [19852] 0 19852 557 134 4 0 0 0 sleep
[34629.316112] [20042] 10230 20042 40711 11341 31 0 0 0 zathura
[34629.316115] Out of memory: Kill process 2935 (firefox.real) score 98 or sacrifice child
[34629.316216] Killed process 2935 (firefox.real) total-vm:1181524kB, anon-rss:429748kB, file-rss:72444kB, shmem-rss:9272kB
[34633.828297] oom_reaper: reaped process 2935 (firefox.real), now anon-rss:0kB, file-rss:32kB, shmem-rss:9248kB

[34701.040055] Xorg invoked oom-killer: gfp_mask=0x24200d4(GFP_USER|GFP_DMA32|__GFP_RECLAIMABLE), nodemask=0, order=0, oom_score_adj=0
[34701.040064] Xorg cpuset=/ mems_allowed=0
[34701.040074] CPU: 1 PID: 2469 Comm: Xorg Tainted: G U O 4.9.0 #1
[34701.040077] Hardware name: LENOVO 7669A26/7669A26, BIOS 7NETB3WW (2.13 ) 04/30/2008
[34701.040081] f32efa6c c12b32b0 f32efa6c f16b3180 c113c861 00001f8f 00200206 c12b845f
[34701.040089] f32efa54 f59eab80 c10349d5 f16b3180 f16b35ac c159b37f f32efa6c c10e8843
[34701.040096] 00000000 ee7f98c0 00000000 0013cfe4 c10e84eb 0000043d 00000000 0000000e
[34701.040104] Call Trace:
[34701.040115] [<c12b32b0>] ? dump_stack+0x44/0x64
[34701.040122] [<c113c861>] ? dump_header+0x5d/0x1b7
[34701.040126] [<c12b845f>] ? ___ratelimit+0x8f/0xf0
[34701.040132] [<c10349d5>] ? smp_trace_apic_timer_interrupt+0x55/0x80
[34701.040137] [<c10e8843>] ? oom_kill_process+0x203/0x3d0
[34701.040140] [<c10e84eb>] ? oom_badness.part.12+0xeb/0x160
[34701.040143] [<c10e8cce>] ? out_of_memory+0xde/0x290
[34701.040148] [<c10ec9ec>] ? __alloc_pages_nodemask+0xc3c/0xc50
[34701.040153] [<c10fb557>] ? shmem_alloc_and_acct_page+0x137/0x210
[34701.040158] [<c10e4965>] ? find_get_entry+0xd5/0x110
[34701.040161] [<c10fbe85>] ? shmem_getpage_gfp+0x165/0xbb0
[34701.040167] [<c14be22d>] ? schedule+0x2d/0x80
[34701.040174] [<c14bf3df>] ? wait_for_completion+0xbf/0xe0
[34701.040178] [<c1067cf0>] ? wake_up_q+0x60/0x60
[34701.040181] [<c10fc912>] ? shmem_read_mapping_page_gfp+0x42/0x70
[34701.040228] [<f8ae79b1>] ? i915_gem_object_get_pages_gtt+0x1e1/0x3d0 [i915]
[34701.040251] [<f8ae2441>] ? ggtt_bind_vma+0x41/0x70 [i915]
[34701.040275] [<f8ae81f5>] ? i915_gem_object_get_pages+0x35/0xb0 [i915]
[34701.040300] [<f8aea197>] ? __i915_vma_do_pin+0x117/0x6b0 [i915]
[34701.040323] [<f8adb5ae>] ? i915_gem_execbuffer_reserve_vma.isra.36+0x15e/0x1f0 [i915]
[34701.040347] [<f8adba4b>] ? i915_gem_execbuffer_reserve.isra.37+0x40b/0x440 [i915]
[34701.040370] [<f8add3f1>] ? i915_gem_do_execbuffer.isra.40+0x5d1/0x11d0 [i915]
[34701.040390] [<f81b71e6>] ? drm_vma_node_allow+0x86/0xb0 [drm]
[34701.040395] [<c112b82f>] ? __kmalloc+0xdf/0x160
[34701.040418] [<f8ade40e>] ? i915_gem_execbuffer2+0x7e/0x220 [i915]
[34701.040443] [<f8af10ff>] ? i915_gem_set_tiling+0x12f/0x480 [i915]
[34701.040466] [<f8ade390>] ? i915_gem_execbuffer+0x3a0/0x3a0 [i915]
[34701.040476] [<f81a55c3>] ? drm_ioctl+0x1b3/0x3e0 [drm]
[34701.040500] [<f8ade390>] ? i915_gem_execbuffer+0x3a0/0x3a0 [i915]
[34701.040505] [<c1140562>] ? do_readv_writev+0x132/0x400
[34701.040509] [<c14123b0>] ? kernel_sendmsg+0x50/0x50
[34701.040519] [<f81a5410>] ? drm_getunique+0x40/0x40 [drm]
[34701.040523] [<c115259f>] ? do_vfs_ioctl+0x8f/0x770
[34701.040529] [<c10c84fe>] ? __audit_syscall_entry+0xae/0x110
[34701.040532] [<c10011b3>] ? syscall_trace_enter+0x183/0x200
[34701.040536] [<c10c80d1>] ? audit_filter_inodes+0xc1/0x100
[34701.040540] [<c10c7bd5>] ? audit_filter_syscall+0xa5/0xd0
[34701.040544] [<c115c601>] ? __fget+0x61/0xb0
[34701.040548] [<c1152cae>] ? SyS_ioctl+0x2e/0x50
[34701.040551] [<c10014e9>] ? do_fast_syscall_32+0x79/0x130
[34701.040555] [<c14c21e2>] ? sysenter_past_esp+0x47/0x75
[34701.040558] Mem-Info:
[34701.040565] active_anon:24871 inactive_anon:31713 isolated_anon:0
[34701.040565] active_file:345258 inactive_file:145071 isolated_file:4
[34701.040565] unevictable:7256 dirty:140 writeback:0 unstable:0
[34701.040565] slab_reclaimable:52226 slab_unreclaimable:11653
[34701.040565] mapped:28020 shmem:13879 pagetables:547 bounce:0
[34701.040565] free:150608 free_pcp:158 free_cma:0
[34701.040576] Node 0 active_anon:99484kB inactive_anon:126852kB active_file:1381032kB inactive_file:580284kB unevictable:29024kB isolated(anon):0kB isolated(file):16kB mapped:112080kB dirty:560kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 55516kB writeback_tmp:0kB unstable:0kB pages_scanned:21492992 all_unreclaimable? yes
[34701.040584] DMA free:4112kB min:788kB low:984kB high:1180kB active_anon:204kB inactive_anon:404kB active_file:2632kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15908kB mlocked:0kB slab_reclaimable:8128kB slab_unreclaimable:428kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 833 3008 3008
[34701.040597] Normal free:42352kB min:42416kB low:53020kB high:63624kB active_anon:3412kB inactive_anon:43628kB active_file:506844kB inactive_file:156kB unevictable:0kB writepending:236kB present:892920kB managed:854344kB mlocked:0kB slab_reclaimable:200776kB slab_unreclaimable:46184kB kernel_stack:2424kB pagetables:28kB bounce:0kB free_pcp:376kB local_pcp:232kB free_cma:0kB
lowmem_reserve[]: 0 0 17397 17397
[34701.040609] HighMem free:555968kB min:512kB low:28164kB high:55816kB active_anon:95868kB inactive_anon:82820kB active_file:871556kB inactive_file:580128kB unevictable:29024kB writepending:324kB present:2226888kB managed:2226888kB mlocked:29024kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:2160kB bounce:0kB free_pcp:256kB local_pcp:180kB free_cma:0kB
lowmem_reserve[]: 0 0 0 0
[34701.040617] DMA: 62*4kB (UME) 59*8kB (ME) 26*16kB (UME) 29*32kB (UME) 18*64kB (UME) 3*128kB (ME) 2*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4112kB
Normal: 2356*4kB (ME) 1742*8kB (UME) 707*16kB (UME) 212*32kB (UME) 14*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 42352kB
HighMem: 11250*4kB (UM) 8793*8kB (UM) 4821*16kB (UM) 2769*32kB (UM) 1309*64kB (UM) 421*128kB (UM) 236*256kB (UM) 58*512kB (UM) 16*1024kB (UM) 7*2048kB (UM) 4*4096kB (UM) = 555968kB
507833 total pagecache pages
[34701.040677] 1423 pages in swap cache
[34701.040679] Swap cache stats: add 31179, delete 29756, find 254737/259799
[34701.040682] Free swap = 2074540kB
[34701.040684] Total swap = 2096476kB
[34701.040686] 783948 pages RAM
[34701.040688] 556722 pages HighMem/MovableOnly
[34701.040690] 9663 pages reserved
[34701.040692] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[34701.040701] [ 216] 0 216 2927 798 5 0 2 -1000 udevd
[34701.040711] [ 1640] 0 1640 1233 45 5 0 0 0 acpi_fakekeyd
[34701.040716] [ 1659] 0 1659 3233 404 5 0 0 -1000 auditd
[34701.040721] [ 1759] 0 1759 7688 328 8 0 0 0 lxcfs
[34701.040725] [ 1780] 0 1780 5413 76 6 0 0 0 lvmetad
[34701.040729] [ 1803] 0 1803 7910 165 7 0 0 0 rsyslogd
[34701.040734] [ 1865] 0 1865 595 448 3 0 0 0 acpid
[34701.040738] [ 1913] 0 1913 581 379 4 0 0 0 battery-stats-c
[34701.040742] [ 1935] 103 1935 1108 602 5 0 0 0 dbus-daemon
[34701.040746] [ 1983] 8 1983 1058 454 4 0 0 0 nullmailer-send
[34701.040750] [ 2001] 0 2001 1451 553 5 0 0 0 bluetoothd
[34701.040754] [ 2008] 0 2008 2003 989 6 0 0 0 haveged
[34701.040759] [ 2017] 0 2017 7111 423 7 0 0 0 pcscd
[34701.040763] [ 2076] 0 2076 558 18 4 0 0 0 thinkfan
[34701.040767] [ 2077] 0 2077 559 182 4 0 0 0 startpar
[34701.040772] [ 2102] 0 2102 2129 391 5 0 0 -1000 sshd
[34701.040776] [ 2109] 0 2109 1460 337 5 0 0 0 smartd
[34701.040780] [ 2147] 124 2147 824 416 4 0 0 0 ulogd
[34701.040784] [ 2282] 110 2282 5261 2286 9 0 0 0 unbound
[34701.040788] [ 2393] 121 2393 957 470 4 0 119 0 privoxy
[34701.040792] [ 2394] 0 2394 601 0 4 0 21 0 uuidd
[34701.040797] [ 2416] 0 2416 1406 518 5 0 25 0 cron
[34701.040801] [ 2459] 0 2459 1511 2 5 0 75 0 wdm
[34701.040805] [ 2462] 0 2462 1511 412 5 0 77 0 wdm
[34701.040810] [ 2469] 0 2469 29966 7317 28 0 2625 0 Xorg
[34701.040814] [ 2525] 0 2525 8382 3056 10 0 597 0 wicd
[34701.040818] [ 2556] 0 2556 4975 1579 8 0 869 0 wicd-monitor
[34701.040823] [ 2581] 0 2581 553 122 4 0 16 0 mingetty
[34701.040827] [ 2588] 0 2588 1698 588 5 0 145 0 wdm
[34701.040831] [ 2604] 10230 2604 13645 1993 11 0 0 0 fvwm2
[34701.040835] [ 2662] 10230 2662 1136 0 5 0 65 0 dbus-launch
[34701.040839] [ 2663] 10230 2663 1075 595 5 0 21 0 dbus-daemon
[34701.040844] [ 2680] 10230 2680 8460 584 8 0 40 0 gpg-agent
[34701.040848] [ 2684] 10230 2684 3957 478 6 0 61 0 tpb
[34701.040852] [ 2696] 10230 2696 2098 1097 6 0 32 0 xscreensaver
[34701.040856] [ 2698] 10230 2698 2937 404 6 0 72 0 redshift
[34701.040860] [ 2710] 10230 2710 1346 731 5 0 0 0 autocutsel
[34701.040864] [ 2717] 10230 2717 10943 3617 13 0 0 0 gkrellm
[34701.040868] [ 2718] 10230 2718 54515 18285 43 0 0 0 psi-plus
[34701.040872] [ 2719] 10230 2719 11929 7489 15 0 0 0 wicd-client
[34701.040876] [ 2741] 10230 2741 1047 255 5 0 0 0 FvwmCommandS
[34701.040881] [ 2742] 10230 2742 1528 347 5 0 0 0 FvwmEvent
[34701.040885] [ 2743] 10230 2743 12182 1079 9 0 0 0 FvwmAnimate
[34701.040889] [ 2744] 10230 2744 12808 996 11 0 0 0 FvwmButtons
[34701.040894] [ 2745] 10230 2745 13344 1330 11 0 0 0 FvwmProxy
[34701.040898] [ 2746] 10230 2746 1507 316 4 0 0 0 FvwmAuto
[34701.040902] [ 2747] 10230 2747 12803 974 11 0 0 0 FvwmPager
[34701.040907] [ 2748] 10230 2748 581 143 4 0 0 0 sh
[34701.040911] [ 2749] 10230 2749 1063 407 4 0 0 0 stalonetray
[34701.040915] [ 2908] 10230 2908 1073 479 5 0 0 0 xsnow
[34701.040919] [ 5794] 10230 5794 2816 1470 6 0 0 0 xterm
[34701.040923] [ 5797] 10230 5797 2308 1629 6 0 0 0 zsh
[34701.040927] [ 5962] 10230 5962 3406 2006 6 0 0 0 xterm
[34701.040931] [ 5963] 10230 5963 2495 645 6 0 0 0 ssh
[34701.040935] [ 5970] 0 5970 2899 492 6 0 1 0 sshd
[34701.040940] [ 5974] 10230 5974 8535 531 8 0 0 0 scdaemon
[34701.040944] [ 6008] 10230 6008 2529 291 6 0 0 0 ssh
[34701.040948] [ 6011] 0 6011 2074 1333 6 0 15 0 zsh
[34701.040955] [ 7225] 10230 7225 2813 1423 7 0 0 0 xterm
[34701.040959] [ 7228] 10230 7228 2089 1383 5 0 0 0 zsh
[34701.040963] [24346] 10230 24346 2027 1052 5 0 0 0 xfconfd
[34701.040968] [28935] 10230 28935 2825 1655 6 0 0 0 xterm
[34701.040972] [28938] 10230 28938 1903 1236 6 0 0 0 zsh
[34701.040977] [ 3534] 0 3534 7706 7230 11 0 0 -1000 ulatencyd
[34701.040982] [13919] 0 13919 2190 785 5 0 0 0 wpa_supplicant
[34701.040987] [13965] 0 13965 2026 181 4 0 0 0 dhclient
[34701.040991] [14016] 65534 14016 2289 1431 7 0 0 0 openvpn
[34701.040995] [14024] 120 14024 7407 6469 11 0 0 0 tor
[34701.041000] [18617] 10230 18617 14715 8499 18 0 0 0 vim
[34701.096211] [20110] 0 20110 557 135 4 0 0 0 sleep
[34701.096222] [20305] 10230 20305 42766 13755 34 0 0 0 zathura
[34701.096226] Out of memory: Kill process 2718 (psi-plus) score 14 or sacrifice child
[34701.096253] Killed process 2718 (psi-plus) total-vm:218060kB, anon-rss:34692kB, file-rss:36680kB, shmem-rss:1768kB
[34701.605857] oom_reaper: reaped process 2718 (psi-plus), now anon-rss:0kB, file-rss:0kB, shmem-rss:1768kB
[34703.917075] Purging GPU memory, 4333 pages freed, 7594 pages still pinned.

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.ch/
pub 4096R/4E20AF1C 2011-05-16 Klaus Ethgen <[email protected]>
Fingerprint: 85D4 CA42 952C 949B 1753 62B3 79D0 B06F 4E20 AF1C

2016-12-27 12:24:06

by Michal Hocko

Subject: Re: Bug 4.9 and memory management

On Tue 27-12-16 12:48:24, Klaus Ethgen wrote:
[...]
> By the way, I see the following two messages often. Maybe they are
> unrelated, maybe not.
> [31633.189121] Purging GPU memory, 144 pages freed, 5692 pages still pinned.
> [31638.530025] Unable to lock GPU to purge memory.

I do not think this makes much of a difference for the oom reports. See
more below:

[...]
> [28756.498366] Xorg invoked oom-killer: gfp_mask=0x24200d4(GFP_USER|GFP_DMA32|__GFP_RECLAIMABLE), nodemask=0, order=0, oom_score_adj=0
[...]
> [28756.498761] Node 0 active_anon:409988kB inactive_anon:233592kB active_file:1272348kB inactive_file:720884kB unevictable:27744kB isolated(anon):0kB isolated(file):0kB mapped:190792kB dirty:4656kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 35332kB writeback_tmp:0kB unstable:0kB pages_scanned:4062806 all_unreclaimable? yes
> [28756.498769] DMA free:4116kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:1568kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15908kB mlocked:0kB slab_reclaimable:9848kB slab_unreclaimable:376kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> lowmem_reserve[]: 0 833 3008 3008
> [28756.498782] Normal free:42404kB min:42416kB low:53020kB high:63624kB active_anon:1316kB inactive_anon:20540kB active_file:548376kB inactive_file:60kB unevictable:0kB writepending:260kB present:892920kB managed:854328kB mlocked:0kB slab_reclaimable:184192kB slab_unreclaimable:47020kB kernel_stack:2728kB pagetables:0kB bounce:0kB free_pcp:936kB local_pcp:216kB free_cma:0kB
[...]
>
> [28757.732436] updatedb.mlocat invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
> [28757.732649] Node 0 active_anon:124120kB inactive_anon:61840kB active_file:1280688kB inactive_file:713860kB unevictable:27744kB isolated(anon):0kB isolated(file):0kB mapped:103164kB dirty:0kB writeback:4kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 27352kB writeback_tmp:0kB unstable:0kB pages_scanned:3674354 all_unreclaimable? yes
> [28757.732656] DMA free:4116kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:1568kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15908kB mlocked:0kB slab_reclaimable:9848kB slab_unreclaimable:376kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> lowmem_reserve[]: 0 833 3008 3008
> [28757.732669] Normal free:42324kB min:42416kB low:53020kB high:63624kB active_anon:1312kB inactive_anon:20408kB active_file:549628kB inactive_file:40kB unevictable:0kB writepending:0kB present:892920kB managed:854328kB mlocked:0kB slab_reclaimable:184036kB slab_unreclaimable:46960kB kernel_stack:2384kB pagetables:0kB bounce:0kB free_pcp:408kB local_pcp:124kB free_cma:0kB
> lowmem_reserve[]: 0 0 17397 17397
[...]
> [31617.991795] gkrellm invoked oom-killer: gfp_mask=0x25000c0(GFP_KERNEL_ACCOUNT), nodemask=0, order=0, oom_score_adj=0
> [31617.991950] Node 0 active_anon:410912kB inactive_anon:188952kB active_file:1274928kB inactive_file:579668kB unevictable:28468kB isolated(anon):0kB isolated(file):0kB mapped:184664kB dirty:68kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 34744kB writeback_tmp:0kB unstable:0kB pages_scanned:5362697 all_unreclaimable? yes
> [31617.991957] DMA free:4116kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:1568kB inactive_file:0kB unevictable:0kB writepending:0kB present:15984kB managed:15908kB mlocked:0kB slab_reclaimable:9848kB slab_unreclaimable:376kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> lowmem_reserve[]: 0 833 3008 3008
> [31617.991970] Normal free:42300kB min:42416kB low:53020kB high:63624kB active_anon:1380kB inactive_anon:20244kB active_file:558700kB inactive_file:16kB unevictable:0kB writepending:16kB present:892920kB managed:854328kB mlocked:0kB slab_reclaimable:173456kB slab_unreclaimable:45024kB kernel_stack:3144kB pagetables:0kB bounce:0kB free_pcp:1392kB local_pcp:612kB free_cma:0kB

All of those are lowmem requests triggering the OOM killer while there
is a lot of page cache sitting on the active list of the Normal zone.
So it smells like the same issue I was referring to, and the mentioned
patch should fix it.
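
To make the suspected mechanism concrete, here is a simplified
userspace model of my own (made-up numbers, not kernel code) of what
the pre-patch inactive_list_is_low() logic in mm/vmscan.c ends up
computing for a memcg lruvec: the base counts come from the memcg, but
the per-zone amounts subtracted for ineligible zones are global
counters, so both counts clamp to zero and the active list is never
aged:

#include <stdio.h>

#define MIN(a, b)	((a) < (b) ? (a) : (b))

int main(void)
{
	/* Base counts: the memcg's own (small) file LRU sizes. */
	unsigned long inactive = 1000;		/* memcg inactive file pages */
	unsigned long active = 50000;		/* memcg active file pages */

	/* Subtracted counts: *global* per-zone highmem LRU sizes,
	 * which easily dwarf a single memcg's numbers. */
	unsigned long inactive_zone = 145000;	/* global highmem inactive */
	unsigned long active_zone = 218000;	/* global highmem active */
	unsigned long inactive_ratio = 1;	/* simplified */

	inactive -= MIN(inactive, inactive_zone);	/* clamps to 0 */
	active -= MIN(active, active_zone);		/* clamps to 0 */

	/* 0 * ratio < 0 is false, so the list never looks low, the
	 * active list is never rotated and lowmem reclaim stalls. */
	printf("inactive=%lu active=%lu -> is_low=%d\n",
	       inactive, active, inactive * inactive_ratio < active);
	return 0;
}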
--
Michal Hocko
SUSE Labs

2016-12-30 11:11:40

by Michal Hocko

Subject: Re: Bug 4.9 and memory management

On Tue 27-12-16 12:28:44, Michal Hocko wrote:
> On Mon 26-12-16 12:00:53, Michal Hocko wrote:
> > [CCing linux-mm]
> >
> > On Sun 25-12-16 21:52:52, Klaus Ethgen wrote:
> > > Hello,
> > >
> > > Over the last few days I compiled version 4.9 for my i386 laptop (a
> > > Lenovo X61s).
> >
> > Do you have memory cgroups enabled at runtime (i.e. does the same happen
> > with cgroup_disable=memory)?
>
> If this turns out to be memory cgroup related then the patch from
> http://lkml.kernel.org/r/[email protected] might
> help.

Did you get a chance to test the above patch? I would like to send it
for merging, and having it tested on another system would be really
helpful and much appreciated.

Thanks!
--
Michal Hocko
SUSE Labs

2016-12-30 16:52:35

by Klaus Ethgen

Subject: Re: [KERNEL] Re: Bug 4.9 and memory management

Sorry, I replied only to you...

On Fri, 30 Dec 2016 at 12:11, Michal Hocko wrote:
> > If this turns out to be memory cgroup related then the patch from
> > http://lkml.kernel.org/r/[email protected] might
> > help.
>
> Did you get a chance to test the above patch? I would like to send it
> for merging, and having it tested on another system would be really
> helpful and much appreciated.

Sorry, no, I was a bit busy after coming back from X-mas. ;-)

Maybe I can do so today.

The only thing is, how can I find out whether the bug is fixed? Is 7
days enough? Or is there a way to force the bug to happen (or not)...?

On Fri, 30 Dec 2016 at 12:11, Michal Hocko wrote:
> > If this turns out to be memory cgroup related then the patch from
> > http://lkml.kernel.org/r/[email protected] might
> > help.

Which of the 3 patches is the relevant one? All 3, or just one?

Regards
Klaus
--
Klaus Ethgen http://www.ethgen.ch/
pub 4096R/4E20AF1C 2011-05-16 Klaus Ethgen <[email protected]>
Fingerprint: 85D4 CA42 952C 949B 1753 62B3 79D0 B06F 4E20 AF1C

2016-12-30 17:24:05

by Michal Hocko

Subject: Re: [KERNEL] Re: Bug 4.9 and memory management

On Fri 30-12-16 17:52:30, Klaus Ethgen wrote:
> Sorry, I replied only to you...
>
> On Fri, 30 Dec 2016 at 12:11, Michal Hocko wrote:
> > > If this turns out to be memory cgroup related then the patch from
> > > http://lkml.kernel.org/r/[email protected] might
> > > help.
> >
> > Did you get a chance to test the above patch? I would like to send it
> > for merging, and having it tested on another system would be really
> > helpful and much appreciated.
>
> Sorry, no, I was a bit busy after coming back from X-mas. ;-)
>
> Maybe I can do so today.
>
> The only thing is, how can I find out whether the bug is fixed? Is 7
> days enough? Or is there a way to force the bug to happen (or not)...?

Just run with the patch and do what you normally do. If you do not see
any OOMs in a few days, that should be sufficient evidence. From your
previous logs it seems you hit the problem quite early, after a few
hours, as far as I remember.
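
I do not have a reliable reproducer to offer, I am afraid. If you want
to actively push on it, here is a crude and untested sketch of my own
(the file path is just a placeholder) of the kind of load your reports
show: build up page cache and re-reference it so it lands on the active
list, then fork in a loop so that each fork needs an order-1 lowmem
allocation for the kernel stack:

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define BIG_FILE "/path/to/large/file"	/* placeholder, any big file */

/* Read the file twice so its pages are promoted to the active list. */
static void fill_page_cache(void)
{
	char buf[4096];
	int pass;
	FILE *f = fopen(BIG_FILE, "r");

	if (!f)
		return;
	for (pass = 0; pass < 2; pass++) {
		rewind(f);
		while (fread(buf, 1, sizeof(buf), f) == sizeof(buf))
			;
	}
	fclose(f);
}

int main(void)
{
	fill_page_cache();
	for (;;) {
		/* like your wicd report: order=1 GFP_KERNEL_ACCOUNT from fork */
		pid_t pid = fork();

		if (pid == 0)
			_exit(0);
		if (pid > 0)
			waitpid(pid, NULL, 0);
	}
	return 0;
}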

> On Fri, 30 Dec 2016 at 12:11, Michal Hocko wrote:
> > > If this turns out to be memory cgroup related then the patch from
> > > http://lkml.kernel.org/r/[email protected] might
> > > help.
>
> Which of the 3 patches is the relevant one? All 3, or just one?

Just this one should be sufficient:
---
From 209710f27de0016cad68cb4ff448f294bc0ff95a Mon Sep 17 00:00:00 2001
From: Michal Hocko <[email protected]>
Date: Fri, 23 Dec 2016 15:11:54 +0100
Subject: [PATCH] mm, memcg: fix the active list aging for lowmem requests when
memcg is enabled

Nils Holland has reported unexpected OOM killer invocations with a 32b
kernel starting with 4.8 kernels:

kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
kworker/u4:5 cpuset=/ mems_allowed=0
CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2
[...]
Mem-Info:
active_anon:58685 inactive_anon:90 isolated_anon:0
active_file:274324 inactive_file:281962 isolated_file:0
unevictable:0 dirty:649 writeback:0 unstable:0
slab_reclaimable:40662 slab_unreclaimable:17754
mapped:7382 shmem:202 pagetables:351 bounce:0
free:206736 free_pcp:332 free_cma:0
Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 813 3474 3474
Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB
lowmem_reserve[]: 0 0 21292 21292
HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB

the oom killer is clearly premature because there is still a lot of
page cache in the zone Normal which should satisfy this lowmem request.
Further debugging has shown that the reclaim cannot make any forward
progress because the page cache is hidden in the active list, which
doesn't get rotated because inactive_list_is_low is not memcg aware.
It simply subtracts per-zone highmem counters from the respective
memcg's lru sizes, which doesn't make any sense. We can simply end up
always seeing the resulting active and inactive counts as 0 and
returning false. This issue is not limited to 32b kernels, but in
practice the effect on systems without CONFIG_HIGHMEM would be much
harder to notice because we do not invoke the OOM killer for allocation
requests targeting < ZONE_NORMAL.

Fix the issue by tracking per-zone lru page counts in
mem_cgroup_per_node and subtracting per-memcg highmem counts when memcg
is enabled. Introduce the helper lruvec_zone_lru_size, which redirects
to either the zone counters or mem_cgroup_get_zone_lru_size as
appropriate.

We lose the "empty LRU but non-zero lru size" detection introduced by
ca707239e8a7 ("mm: update_lru_size warn and reset bad lru_size") because
of the inherent zone vs. node discrepancy.

Fixes: f8d1a31163fc ("mm: consider whether to decivate based on eligible zones inactive ratio")
Cc: stable # 4.8+
Acked-by: Minchan Kim <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Reported-and-tested-by: Nils Holland <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
include/linux/memcontrol.h | 26 +++++++++++++++++++++++---
include/linux/mm_inline.h | 2 +-
mm/memcontrol.c | 18 ++++++++----------
mm/vmscan.c | 27 +++++++++++++++++----------
4 files changed, 49 insertions(+), 24 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 61d20c17f3b7..254698856b8f 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -120,7 +120,7 @@ struct mem_cgroup_reclaim_iter {
*/
struct mem_cgroup_per_node {
struct lruvec lruvec;
- unsigned long lru_size[NR_LRU_LISTS];
+ unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];

struct mem_cgroup_reclaim_iter iter[DEF_PRIORITY + 1];

@@ -432,7 +432,7 @@ static inline bool mem_cgroup_online(struct mem_cgroup *memcg)
int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);

void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
- int nr_pages);
+ int zid, int nr_pages);

unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
int nid, unsigned int lru_mask);
@@ -441,9 +441,23 @@ static inline
unsigned long mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru)
{
struct mem_cgroup_per_node *mz;
+ unsigned long nr_pages = 0;
+ int zid;

mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
- return mz->lru_size[lru];
+ for (zid = 0; zid < MAX_NR_ZONES; zid++)
+ nr_pages += mz->lru_zone_size[zid][lru];
+ return nr_pages;
+}
+
+static inline
+unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
+ enum lru_list lru, int zone_idx)
+{
+ struct mem_cgroup_per_node *mz;
+
+ mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
+ return mz->lru_zone_size[zone_idx][lru];
}

void mem_cgroup_handle_over_high(void);
@@ -671,6 +685,12 @@ mem_cgroup_get_lru_size(struct lruvec *lruvec, enum lru_list lru)
{
return 0;
}
+static inline
+unsigned long mem_cgroup_get_zone_lru_size(struct lruvec *lruvec,
+ enum lru_list lru, int zone_idx)
+{
+ return 0;
+}

static inline unsigned long
mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 71613e8a720f..41d376e7116d 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -39,7 +39,7 @@ static __always_inline void update_lru_size(struct lruvec *lruvec,
{
__update_lru_size(lruvec, lru, zid, nr_pages);
#ifdef CONFIG_MEMCG
- mem_cgroup_update_lru_size(lruvec, lru, nr_pages);
+ mem_cgroup_update_lru_size(lruvec, lru, zid, nr_pages);
#endif
}

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 91dfc7c5ce8f..b59676026272 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -625,8 +625,8 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
int nid, unsigned int lru_mask)
{
+ struct lruvec *lruvec = mem_cgroup_lruvec(NODE_DATA(nid), memcg);
unsigned long nr = 0;
- struct mem_cgroup_per_node *mz;
enum lru_list lru;

VM_BUG_ON((unsigned)nid >= nr_node_ids);
@@ -634,8 +634,7 @@ unsigned long mem_cgroup_node_nr_lru_pages(struct mem_cgroup *memcg,
for_each_lru(lru) {
if (!(BIT(lru) & lru_mask))
continue;
- mz = mem_cgroup_nodeinfo(memcg, nid);
- nr += mz->lru_size[lru];
+ nr += mem_cgroup_get_lru_size(lruvec, lru);
}
return nr;
}
@@ -1002,6 +1001,7 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd
* mem_cgroup_update_lru_size - account for adding or removing an lru page
* @lruvec: mem_cgroup per zone lru vector
* @lru: index of lru list the page is sitting on
+ * @zid: zone id of the accounted pages
* @nr_pages: positive when adding or negative when removing
*
* This function must be called under lru_lock, just before a page is added
@@ -1009,27 +1009,25 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *page, struct pglist_data *pgd
* so as to allow it to check that lru_size 0 is consistent with list_empty).
*/
void mem_cgroup_update_lru_size(struct lruvec *lruvec, enum lru_list lru,
- int nr_pages)
+ int zid, int nr_pages)
{
struct mem_cgroup_per_node *mz;
unsigned long *lru_size;
long size;
- bool empty;

if (mem_cgroup_disabled())
return;

mz = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
- lru_size = mz->lru_size + lru;
- empty = list_empty(lruvec->lists + lru);
+ lru_size = &mz->lru_zone_size[zid][lru];

if (nr_pages < 0)
*lru_size += nr_pages;

size = *lru_size;
- if (WARN_ONCE(size < 0 || empty != !size,
- "%s(%p, %d, %d): lru_size %ld but %sempty\n",
- __func__, lruvec, lru, nr_pages, size, empty ? "" : "not ")) {
+ if (WARN_ONCE(size < 0,
+ "%s(%p, %d, %d): lru_size %ld\n",
+ __func__, lruvec, lru, nr_pages, size)) {
VM_BUG_ON(1);
*lru_size = 0;
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c4abf08861d2..fa30010a5277 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -242,6 +242,16 @@ unsigned long lruvec_lru_size(struct lruvec *lruvec, enum lru_list lru)
return node_page_state(lruvec_pgdat(lruvec), NR_LRU_BASE + lru);
}

+unsigned long lruvec_zone_lru_size(struct lruvec *lruvec, enum lru_list lru,
+ int zone_idx)
+{
+ if (!mem_cgroup_disabled())
+ return mem_cgroup_get_zone_lru_size(lruvec, lru, zone_idx);
+
+ return zone_page_state(&lruvec_pgdat(lruvec)->node_zones[zone_idx],
+ NR_ZONE_LRU_BASE + lru);
+}
+
/*
* Add a shrinker callback to be called from the vm.
*/
@@ -1382,8 +1392,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
* be complete before mem_cgroup_update_lru_size due to a santity check.
*/
static __always_inline void update_lru_sizes(struct lruvec *lruvec,
- enum lru_list lru, unsigned long *nr_zone_taken,
- unsigned long nr_taken)
+ enum lru_list lru, unsigned long *nr_zone_taken)
{
int zid;

@@ -1392,11 +1401,11 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
continue;

__update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
- }
-
#ifdef CONFIG_MEMCG
- mem_cgroup_update_lru_size(lruvec, lru, -nr_taken);
+ mem_cgroup_update_lru_size(lruvec, lru, zid, -nr_zone_taken[zid]);
#endif
+ }
+
}

/*
@@ -1501,7 +1510,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
*nr_scanned = scan;
trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, nr_to_scan, scan,
nr_taken, mode, is_file_lru(lru));
- update_lru_sizes(lruvec, lru, nr_zone_taken, nr_taken);
+ update_lru_sizes(lruvec, lru, nr_zone_taken);
return nr_taken;
}

@@ -2047,10 +2056,8 @@ static bool inactive_list_is_low(struct lruvec *lruvec, bool file,
if (!managed_zone(zone))
continue;

- inactive_zone = zone_page_state(zone,
- NR_ZONE_LRU_BASE + (file * LRU_FILE));
- active_zone = zone_page_state(zone,
- NR_ZONE_LRU_BASE + (file * LRU_FILE) + LRU_ACTIVE);
+ inactive_zone = lruvec_zone_lru_size(lruvec, file * LRU_FILE, zid);
+ active_zone = lruvec_zone_lru_size(lruvec, (file * LRU_FILE) + LRU_ACTIVE, zid);

inactive -= min(inactive, inactive_zone);
active -= min(active, active_zone);
--
2.10.2


--
Michal Hocko
SUSE Labs