2015-08-19 15:45:00

by Arthur Marsh

Subject: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7

Hi, I've found that Linus' git head kernel has had some unwelcome
behaviour where the chromium browser would exhaust all swap space in the
course of a few hours. The behaviour appeared before the release of
4.2.0-rc7.

This does not happen with kernel 4.2.0-rc6.

When I tried a git-bisect, the results were not conclusive because the
problem takes over an hour to appear after booting. The closest I came
was around this commit (the actual problem may be a few commits either
side):

git bisect good
4f258a46346c03fa0bbb6199ffaf4e1f9f599660 is the first bad commit
commit 4f258a46346c03fa0bbb6199ffaf4e1f9f599660
Author: Martin K. Petersen <[email protected]>
Date: Tue Jun 23 12:13:59 2015 -0400

sd: Fix maximum I/O size for BLOCK_PC requests

Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
LIMITS VPD. This had the unfortunate effect of also limiting the maximum
size of non-filesystem requests sent to the device through sg/bsg.

Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
limit directly.

Also update the comment in blk_limits_max_hw_sectors() to clarify that
max_hw_sectors defines the limit for the I/O controller only.

Signed-off-by: Martin K. Petersen <[email protected]>
Reported-by: Brian King <[email protected]>
Tested-by: Brian King <[email protected]>
Cc: [email protected] # 3.17+
Signed-off-by: James Bottomley <[email protected]>

:040000 040000 fbd0519d9ee0a8f92a7dab9a9c6d7b7868974fba b4cf554c568813704993538008aed5b704624679 M block
:040000 040000 f2630c903cd36ede2619d173f9d1ea0d725ea111 ff6b6f732afbf6f4b6b26a827c463de50f0e356c M drivers

Has anyone seen a similar problem?
I can supply .config and other information if requested.

Arthur.


2015-08-20 08:16:54

by Vlastimil Babka

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7

On 08/19/2015 05:44 PM, Arthur Marsh wrote:
> Hi, I've found that Linus' git head kernel has had some unwelcome
> behaviour where the chromium browser would exhaust all swap space in the
> course of a few hours. The behaviour appeared before the release of
> 4.2.0-rc7.

Do you have any more details about the memory/swap usage? Is it really
that chromium process(es) itself eats more memory and starts swapping,
or that something else (a graphics driver?) eats kernel memory, and
chromium as one of the biggest processes is driven to swap by that? Can
you provide e.g. top output with good/bad kernels?

Also what does /proc/meminfo and /proc/zoneinfo look like when it's
swapping?

To see which processes use swap, you can try [1] :
for file in /proc/*/status ; do
  awk '/VmSwap|Name/{printf "%s %s", $2, $3}END{print ""}' "$file"
done | sort -k 2 -n -r | less

Thanks

[1] http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
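Since a browser like chromium runs as many processes, it can also help to add up VmSwap per process name rather than reading one line per process. The following is only a minimal sketch of that idea in Python; the function names `swap_by_name` and `read_status_files` are made up for illustration and are not part of the one-liner above.

```python
import glob
import re
from collections import defaultdict


def swap_by_name(status_texts):
    """Sum VmSwap (kB) per process name from /proc/<pid>/status contents."""
    totals = defaultdict(int)
    for text in status_texts:
        name = re.search(r"^Name:\s*(.+)$", text, re.MULTILINE)
        swap = re.search(r"^VmSwap:\s*(\d+)\s*kB", text, re.MULTILINE)
        if name and swap:  # kernel threads have no VmSwap line and are skipped
            totals[name.group(1).strip()] += int(swap.group(1))
    # Largest swap users first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def read_status_files(pattern="/proc/[0-9]*/status"):
    """Yield the text of each readable status file; processes may exit mid-scan."""
    for path in glob.glob(pattern):
        try:
            with open(path, errors="replace") as handle:
                yield handle.read()
        except OSError:
            continue


if __name__ == "__main__":
    for name, kb in swap_by_name(read_status_files()):
        print(f"{name:<16} {kb:>10} kB")
```

Run on both the good and the bad kernel, the per-name totals make it easier to see whether one application accounts for most of the swap.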

> This does not happen with kernel 4.2.0-rc6.
>
> When I tried a git-bisect, the results were not conclusive because the
> problem takes over an hour to appear after booting. The closest I came
> was around this commit (the actual problem may be a few commits either
> side):
>
> git bisect good
> 4f258a46346c03fa0bbb6199ffaf4e1f9f599660 is the first bad commit
> commit 4f258a46346c03fa0bbb6199ffaf4e1f9f599660
> Author: Martin K. Petersen <[email protected]>
> Date: Tue Jun 23 12:13:59 2015 -0400
>
> sd: Fix maximum I/O size for BLOCK_PC requests
>
> Commit bcdb247c6b6a ("sd: Limit transfer length") clamped the maximum
> size of an I/O request to the MAXIMUM TRANSFER LENGTH field in the BLOCK
> LIMITS VPD. This had the unfortunate effect of also limiting the maximum
> size of non-filesystem requests sent to the device through sg/bsg.
>
> Avoid using blk_queue_max_hw_sectors() and set the max_sectors queue
> limit directly.
>
> Also update the comment in blk_limits_max_hw_sectors() to clarify that
> max_hw_sectors defines the limit for the I/O controller only.
>
> Signed-off-by: Martin K. Petersen <[email protected]>
> Reported-by: Brian King <[email protected]>
> Tested-by: Brian King <[email protected]>
> Cc: [email protected] # 3.17+
> Signed-off-by: James Bottomley <[email protected]>
>
> :040000 040000 fbd0519d9ee0a8f92a7dab9a9c6d7b7868974fba b4cf554c568813704993538008aed5b704624679 M block
> :040000 040000 f2630c903cd36ede2619d173f9d1ea0d725ea111 ff6b6f732afbf6f4b6b26a827c463de50f0e356c M drivers
>
> Has anyone seen a similar problem?
> I can supply .config and other information if requested.
>
> Arthur.

2015-08-21 09:18:10

by Arthur Marsh

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7



Vlastimil Babka wrote on 20/08/15 17:46:
> On 08/19/2015 05:44 PM, Arthur Marsh wrote:
>> Hi, I've found that Linus' git head kernel has had some unwelcome
>> behaviour where the chromium browser would exhaust all swap space in the
>> course of a few hours. The behaviour appeared before the release of
>> 4.2.0-rc7.
>
> Do you have any more details about the memory/swap usage? Is it really
> that chromium process(es) itself eats more memory and starts swapping,
> or that something else (a graphics driver?) eats kernel memory, and
> chromium as one of the biggest processes is driven to swap by that? Can
> you provide e.g. top output with good/bad kernels?
>
> Also what does /proc/meminfo and /proc/zoneinfo look like when it's
> swapping?
>
> To see which processes use swap, you can try [1] :
> for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{
> print ""}' $file; done | sort -k 2 -n -r | less
>
> Thanks
>
> [1] http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
>
>> This does not happen with kernel 4.2.0-rc6.

Sorry for the delay in replying. I had to give an extended run under
kernel 4.2.0-rc6 to obtain comparative results. Both kernels' config
files are attached.

The applications running are the same both times, mainly iceweasel
38.1.0esr-3 and chromium 44.0.2403.107-1.

With the rc7+ kernel but not the rc6 kernel, chromium eventually gets
into a state of consuming lots of swap.

I was able to capture the requested output when running a 4.2.0-rc7+
kernel (Linus' git head as of around 05:00 UTC 19 August 2015) just
before swap was exhausted, which forced me to do a control-alt-delete
shutdown and wait ages. The kernel config for the rc7+ kernel is attached.

The comparison good kernel is from Debian:
Linux am64 4.2.0-rc6-amd64 #1 SMP Debian 4.2~rc6-1~exp1 (2015-08-12)
x86_64 GNU/Linux

Output of the script quoted above for the rc7+ kernel is:

iceweasel 229872 kB
chromium 58680 kB
chromium 57956 kB
chromium 55596 kB
chromium 54600 kB
chromium 51880 kB
chromium 51552 kB
chromium 46436 kB
chromium 43292 kB
chromium 40904 kB
chromium 40832 kB
chromium 37776 kB
chromium 36816 kB
chromium 30512 kB
chromium 30164 kB
chromium 29824 kB
chromium 27760 kB
plasma-desktop 26236 kB
chromium 25340 kB
mysqld 23268 kB
chromium 21368 kB
chromium 21040 kB
named 20824 kB
kwin 19512 kB
chromium 19072 kB
chromium 16468 kB
blueman-applet 15732 kB
Xorg 11564 kB
krunner 10984 kB
knotify4 9508 kB
chromium 9196 kB
kded4 8572 kB
chromium 8472 kB
ksmserver 8416 kB
dhclient 6888 kB
chromium 6556 kB
kmix 6416 kB
mozc_server 5128 kB
akonadi_newmail 5036 kB
chromium 4748 kB
timidity 4696 kB
kactivitymanage 4472 kB
gpsd 4464 kB
chromium 4424 kB
knemo 4392 kB
chromium 4336 kB
akonadi_maildis 4284 kB
ibus-ui-gtk3 4256 kB
akonadi_migrati 4196 kB
chromium 4004 kB
akonadi_agent_l 3948 kB
akonadi_agent_l 3940 kB
akonadi_agent_l 3932 kB
akonadi_agent_l 3896 kB
kglobalaccel 3788 kB
kdeinit4 3736 kB
kuiserver 3464 kB
bash 3384 kB
klauncher 3232 kB
konsole 3192 kB
klipper 2588 kB
akonadiserver 2544 kB
apache2 2308 kB
apache2 2280 kB
pulseaudio 2260 kB
apache2 2260 kB
apache2 2252 kB
apache2 2252 kB
apache2 2252 kB
systemd-udevd 2012 kB
smbd 1800 kB
smbd 1636 kB
winbindd 1600 kB
winbindd 1588 kB
upowerd 1492 kB
winbindd 1488 kB
winbindd 1468 kB
nmbd 1248 kB
akonadi_control 1060 kB
telnet 1032 kB
console-kit-dae 988 kB
kscreen_backend 916 kB
ibus-x11 772 kB
gconfd-2 708 kB
cups-browsed 708 kB
kdm 656 kB
smartd 628 kB
ibus-engine-moz 624 kB
login 600 kB
cupsd 596 kB
exim4 592 kB
bash 564 kB
ibus-dconf 512 kB
bash 496 kB
bash 452 kB
bash 452 kB
at-spi-bus-laun 448 kB
at-spi2-registr 448 kB
ibus-daemon 400 kB
mount.ntfs 396 kB
dbus-daemon 384 kB
su 372 kB
obexd 340 kB
dbus-launch 332 kB
bash 320 kB
ssh-agent 316 kB
bluetoothd 300 kB
bash 272 kB
ck-launch-sessi 252 kB
ntpd 248 kB
x-session-manag 240 kB
ibus-engine-sim 240 kB
gam_server 236 kB
dbus-daemon 236 kB
neard 232 kB
acpid 208 kB
rtkit-daemon 188 kB
cron 188 kB
avahi-daemon 188 kB
dbus-daemon 180 kB
avahi-daemon 172 kB
kdm 168 kB
cpufreqd 160 kB
inetd 148 kB
atd 148 kB
getty 136 kB
getty 136 kB
getty 132 kB
getty 128 kB
getty 128 kB
init 124 kB
irqbalance 108 kB
minissdpd 84 kB
uuidd 80 kB
kwrapper4 76 kB
iscsid 56 kB
start_kdeinit 44 kB
writeback
watchdog/3
watchdog/2
watchdog/1
watchdog/0
vlc 0 kB
usb-storage
ttm_swap
top 0 kB
su 0 kB
sort 0 kB
scsi_tmf_6
scsi_tmf_5
scsi_tmf_4
scsi_tmf_3
scsi_tmf_2
scsi_tmf_1
scsi_tmf_0
scsi_eh_6
scsi_eh_5
scsi_eh_4
scsi_eh_3
scsi_eh_2
scsi_eh_1
scsi_eh_0
rsyslogd 0 kB
rdma_cm
rcu_sched
rcu_preempt
rcu_bh
rc0
radeon-crtc
radeon-crtc
perf
netns
migration/3
migration/2
migration/1
migration/0
login 0 kB
kworker/u12:3
kworker/u12:2
kworker/u12:1
kworker/u12:0
kworker/3:2
kworker/3:1H
kworker/3:1
kworker/3:0H
kworker/3:0
kworker/2:2
kworker/2:1H
kworker/2:1
kworker/2:0H
kworker/2:0
kworker/1:2
kworker/1:1H
kworker/1:1
kworker/1:0H
kworker/1:0
kworker/0:2
kworker/0:1H
kworker/0:1
kworker/0:0H
kworker/0:0
kvm-irqfd-clean
kthrotld
kthreadd
kswapd0
ksoftirqd/3
ksoftirqd/2
ksoftirqd/1
ksoftirqd/0
ksmd
kpsmoused
kintegrityd
khungtaskd
khugepaged
khelper
kdevtmpfs
kblockd
kauditd
jbd2/sdb2-8
jbd2/sda3-8
jbd2/sda2-8
iw_cm_wq
iscsi_eh
iscsid 0 kB
ipv6_addrconf
in.telnetd 0 kB
ib_mcast
ib_cm
ib_addr
fsnotify_mark
ext4-rsv-conver
ext4-rsv-conver
ext4-rsv-conver
edac-poller
devfreq_wq
deferwq
crypto
cifsiod
cifsd
bioset
bash 0 kB
bash 0 kB
bash 0 kB
awk 0 kB
awk 0 kB
ata_sff

Output from top:

# top
top - 20:22:12 up 17:56, 9 users, load average: 1.00, 1.30, 1.38
Tasks: 242 total, 1 running, 241 sleeping, 0 stopped, 0 zombie
%Cpu(s): 7.7 us, 13.3 sy, 0.0 ni, 79.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 7895788 total, 125240 free, 5735128 used, 2035420 buff/cache
KiB Swap: 4194288 total, 1696 free, 4192592 used. 375612 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 4735 amarsh04  20   0  995864 124876  92280 S  28.6  1.6  28:48.45 chromium
 2515 root      20   0  268600  54652  18856 S  27.6  0.7  91:12.59 Xorg
 4828 amarsh04  20   0 2026352 607252 210260 S  11.3  7.7 272:52.66 chromium
 4668 amarsh04  20   0 3529796 248432  74724 S   5.0  3.1  24:27.81 chromium
 4285 amarsh04  20   0 2785900 1.136g  29612 S   3.3 15.1 161:44.35 iceweasel
 4140 amarsh04  20   0 3445952  83712  20984 S   3.0  1.1  49:39.43 plasma-des+
 4912 amarsh04  20   0 1436652  60820  18964 S   1.0  0.8   5:20.89 chromium
22188 amarsh04  20   0 1534992 265120 216864 S   1.0  3.4   0:03.94 chromium
 4822 amarsh04  20   0 1398716  45300  24228 S   0.7  0.6   5:30.02 chromium
 5067 amarsh04  20   0 1423816  55520  18820 S   0.7  0.7   5:14.10 chromium
 5103 amarsh04  20   0 1446588  60724  20996 S   0.7  0.8   5:08.34 chromium
 5135 amarsh04  20   0 1472420  66624  21184 S   0.7  0.8   5:14.41 chromium
22854 root      20   0   25816   3016   2432 R   0.7  0.0   0:00.06 top
25408 amarsh04  20   0   25716   2188   1592 S   0.7  0.0   3:14.28 top
    7 root      20   0       0      0      0 S   0.3  0.0   3:28.85 rcu_preempt
 2372 root      20   0   41524   1612   1480 S   0.3  0.0   3:38.29 cpufreqd
 5115 amarsh04  20   0 1532560 154956  30740 S   0.3  2.0   5:37.47 chromium
11632 amarsh04  20   0 1517036  90436  10836 S   0.3  1.1  14:34.25 vlc

Output from /proc/meminfo:

MemTotal: 7895788 kB
MemFree: 119528 kB
MemAvailable: 368924 kB
Buffers: 8480 kB
Cached: 1812008 kB
SwapCached: 11748 kB
Active: 3074528 kB
Inactive: 1484656 kB
Active(anon): 2945236 kB
Inactive(anon): 1410924 kB
Active(file): 129292 kB
Inactive(file): 73732 kB
Unevictable: 20332 kB
Mlocked: 20332 kB
SwapTotal: 4194288 kB
SwapFree: 684 kB
Dirty: 168 kB
Writeback: 0 kB
AnonPages: 2747452 kB
Mapped: 640304 kB
Shmem: 1598360 kB
Slab: 214100 kB
SReclaimable: 88204 kB
SUnreclaim: 125896 kB
KernelStack: 11552 kB
PageTables: 82720 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8142180 kB
Committed_AS: 13532120 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 331044 kB
VmallocChunk: 34358947836 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 550592 kB
DirectMap2M: 7575552 kB
DirectMap1G: 0 kB

/proc/zoneinfo:

Node 0, zone DMA
pages free 3967
min 5
low 6
high 7
scanned 0
spanned 4095
present 3998
managed 3976
nr_free_pages 3967
nr_alloc_batch 1
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 1
nr_slab_unreclaimable 0
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_pages_scanned 0
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 2966, 7692, 7692)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 2
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 6
all_unreclaimable: 1
start_pfn: 1
inactive_ratio: 1
Node 0, zone DMA32
pages free 13735
min 1074
low 1342
high 1611
scanned 0
spanned 1044480
present 782256
managed 760117
nr_free_pages 13735
nr_alloc_batch 269
nr_inactive_anon 131943
nr_active_anon 273595
nr_inactive_file 6966
nr_active_file 13043
nr_unevictable 1808
nr_mlock 1808
nr_anon_pages 259911
nr_mapped 50449
nr_file_pages 167472
nr_dirty 23
nr_writeback 0
nr_slab_reclaimable 7993
nr_slab_unreclaimable 11849
nr_page_table_pages 8994
nr_kernel_stack 253
nr_unstable 0
nr_bounce 0
nr_vmscan_write 449534
nr_vmscan_immediate_reclaim 410
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 144889
nr_dirtied 757345
nr_written 1201233
nr_pages_scanned 0
workingset_refault 1020951
workingset_activate 209651
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 4725, 4725)
pagesets
cpu: 0
count: 178
high: 186
batch: 31
vm stats threshold: 36
cpu: 1
count: 59
high: 186
batch: 31
vm stats threshold: 36
cpu: 2
count: 85
high: 186
batch: 31
vm stats threshold: 36
cpu: 3
count: 162
high: 186
batch: 31
vm stats threshold: 36
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 4
Node 0, zone Normal
pages free 11938
min 1711
low 2138
high 2566
scanned 0
spanned 1245184
present 1245184
managed 1209854
nr_free_pages 11938
nr_alloc_batch 0
nr_inactive_anon 220863
nr_active_anon 463176
nr_inactive_file 11483
nr_active_file 19328
nr_unevictable 3275
nr_mlock 3275
nr_anon_pages 427490
nr_mapped 109636
nr_file_pages 290660
nr_dirty 20
nr_writeback 0
nr_slab_reclaimable 14057
nr_slab_unreclaimable 19616
nr_page_table_pages 11688
nr_kernel_stack 468
nr_unstable 0
nr_bounce 0
nr_vmscan_write 629788
nr_vmscan_immediate_reclaim 70
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 254701
nr_dirtied 1220388
nr_written 1839981
nr_pages_scanned 0
workingset_refault 1731349
workingset_activate 368342
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 152
high: 186
batch: 31
vm stats threshold: 42
cpu: 1
count: 31
high: 186
batch: 31
vm stats threshold: 42
cpu: 2
count: 139
high: 186
batch: 31
vm stats threshold: 42
cpu: 3
count: 96
high: 186
batch: 31
vm stats threshold: 42
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 6


By comparison, the same output under the rc6 kernel after running for a while was:

mysqld 19244 kB
named 16960 kB
plasma-desktop 15712 kB
kded4 8196 kB
Xorg 7060 kB
dhclient 6896 kB
ksmserver 6012 kB
mozc_server 5104 kB
timidity 4692 kB
kmix 4536 kB
gpsd 4460 kB
blueman-applet 4436 kB
kactivitymanage 4352 kB
ibus-ui-gtk3 3876 kB
kglobalaccel 3812 kB
kdeinit4 3632 kB
kwin 3480 kB
kuiserver 3444 kB
bash 3388 kB
klauncher 3144 kB
krunner 2780 kB
pulseaudio 2248 kB
systemd-udevd 1960 kB
smbd 1804 kB
konsole 1684 kB
smbd 1620 kB
winbindd 1604 kB
winbindd 1596 kB
winbindd 1468 kB
nmbd 1368 kB
winbindd 1312 kB
knemo 1132 kB
klipper 988 kB
console-kit-dae 908 kB
knotify4 904 kB
upowerd 872 kB
akonadiserver 780 kB
ibus-x11 676 kB
cups-browsed 676 kB
kdm 656 kB
smartd 628 kB
login 600 kB
exim4 592 kB
rsyslogd 468 kB
at-spi2-registr 420 kB
ibus-dconf 392 kB
akonadi_control 368 kB
apache2 360 kB
apache2 360 kB
apache2 360 kB
apache2 360 kB
apache2 360 kB
apache2 360 kB
at-spi-bus-laun 344 kB
kscreen_backend 328 kB
dbus-launch 328 kB
ssh-agent 316 kB
bluetoothd 300 kB
bash 288 kB
mount.ntfs 276 kB
bash 272 kB
ntpd 252 kB
ck-launch-sessi 252 kB
ibus-daemon 248 kB
x-session-manag 244 kB
neard 232 kB
dbus-daemon 204 kB
bash 200 kB
acpid 200 kB
bash 192 kB
avahi-daemon 192 kB
rtkit-daemon 184 kB
cron 184 kB
dbus-daemon 176 kB
ibus-engine-sim 160 kB
cpufreqd 152 kB
inetd 148 kB
dbus-daemon 148 kB
atd 148 kB
avahi-daemon 144 kB
getty 140 kB
getty 140 kB
getty 140 kB
getty 136 kB
kdm 128 kB
getty 128 kB
gam_server 116 kB
irqbalance 104 kB
init 100 kB
bash 100 kB
uuidd 84 kB
minissdpd 84 kB
kwrapper4 72 kB
iscsid 56 kB
start_kdeinit 44 kB
xfs_mru_cache
xfsalloc
writeback
watchdog/3
watchdog/2
watchdog/1
watchdog/0
usb-storage
ttm_swap
top 0 kB
telnet 0 kB
su 0 kB
su 0 kB
sort 0 kB
scsi_tmf_6
scsi_tmf_5
scsi_tmf_4
scsi_tmf_3
scsi_tmf_2
scsi_tmf_1
scsi_tmf_0
scsi_eh_6
scsi_eh_5
scsi_eh_4
scsi_eh_3
scsi_eh_2
scsi_eh_1
scsi_eh_0
rdma_cm
rcu_sched
rcu_bh
rc0
radeon-crtc
radeon-crtc
perf
obexd 0 kB
netns
migration/3
migration/2
migration/1
migration/0
login 0 kB
kworker/u12:2
kworker/u12:0
kworker/3:2
kworker/3:1H
kworker/3:1
kworker/3:0H
kworker/3:0
kworker/2:2
kworker/2:1H
kworker/2:0H
kworker/2:0
kworker/1:2
kworker/1:1H
kworker/1:0H
kworker/1:0
kworker/0:2
kworker/0:1H
kworker/0:0H
kworker/0:0
kvm-irqfd-clean
kthrotld
kthreadd
kswapd0
ksoftirqd/3
ksoftirqd/2
ksoftirqd/1
ksoftirqd/0
ksmd
kpsmoused
kintegrityd
khungtaskd
khugepaged
khelper
kdevtmpfs
kblockd
kauditd
jfsSync
jfsIO
jfsCommit
jfsCommit
jfsCommit
jfsCommit
jbd2/sdb2-8
jbd2/sda3-8
jbd2/sda2-8
iw_cm_wq
iscsi_eh
iscsid 0 kB
ipv6_addrconf
in.telnetd 0 kB
iceweasel 0 kB
ibus-engine-moz 0 kB
ib_mcast
ib_cm
ib_addr
gconfd-2 0 kB
fsnotify_mark
ext4-rsv-conver
ext4-rsv-conver
ext4-rsv-conver
edac-poller
devfreq_wq
deferwq
cupsd 0 kB
crypto
cifsiod
cifsd
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chromium 0 kB
chrome-sandbox 0 kB
bioset
bioset
bash 0 kB
bash 0 kB
bash 0 kB
bash 0 kB
awk 0 kB
awk 0 kB
ata_sff
akonadi_newmail 0 kB
akonadi_migrati 0 kB
akonadi_maildis 0 kB
akonadi_agent_l 0 kB
akonadi_agent_l 0 kB
akonadi_agent_l 0 kB
akonadi_agent_l 0 kB


# top
top - 18:11:03 up 21:09, 9 users, load average: 0.52, 0.70, 0.92
Tasks: 248 total, 1 running, 247 sleeping, 0 stopped, 0 zombie
%Cpu(s): 8.3 us, 1.9 sy, 2.6 ni, 84.2 id, 3.0 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem : 7922668 total, 96576 free, 5592924 used, 2233168 buff/cache
KiB Swap: 4194288 total, 4039680 free, 154608 used. 1727172 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
27128 amarsh04  20   0 1578688 284600 110012 S  20.0  3.6   4:27.31 chromium
29803 amarsh04  20   0 2748312 1.357g  51168 S  13.3 18.0 137:52.88 iceweasel
30894 amarsh04  20   0 1867844 413144 130396 S  13.3  5.2 150:37.02 chromium
 4417 amarsh04  20   0  490624   7484   5736 S   6.7  0.1   4:49.91 pulseaudio
30725 amarsh04  20   0 3757224 323872 111140 S   6.7  4.1  18:48.66 chromium
    1 root      20   0   15488   1560   1508 S   0.0  0.0   0:01.23 init
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.04 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:04.49 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:+
    7 root      20   0       0      0      0 S   0.0  0.0   2:19.19 rcu_sched
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
    9 root      rt   0       0      0      0 S   0.0  0.0   0:00.72 migration/0
   10 root      rt   0       0      0      0 S   0.0  0.0   0:00.30 watchdog/0
   11 root      rt   0       0      0      0 S   0.0  0.0   0:00.31 watchdog/1
   12 root      rt   0       0      0      0 S   0.0  0.0   0:00.71 migration/1
   13 root      20   0       0      0      0 S   0.0  0.0   0:04.39 ksoftirqd/1
   15 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/1:+
   16 root      rt   0       0      0      0 S   0.0  0.0   0:00.30 watchdog/2

/proc/meminfo:

MemTotal: 7922668 kB
MemFree: 121468 kB
MemAvailable: 1750644 kB
Buffers: 494724 kB
Cached: 1417772 kB
SwapCached: 5188 kB
Active: 4596992 kB
Inactive: 1398920 kB
Active(anon): 3801344 kB
Inactive(anon): 789752 kB
Active(file): 795648 kB
Inactive(file): 609168 kB
Unevictable: 20600 kB
Mlocked: 20600 kB
SwapTotal: 4194288 kB
SwapFree: 4039680 kB
Dirty: 280 kB
Writeback: 0 kB
AnonPages: 4099952 kB
Mapped: 725912 kB
Shmem: 488224 kB
Slab: 319332 kB
SReclaimable: 266264 kB
SUnreclaim: 53068 kB
KernelStack: 11792 kB
PageTables: 83748 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8155620 kB
Committed_AS: 10080740 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 312564 kB
VmallocChunk: 34358947836 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 663232 kB
DirectMap2M: 7462912 kB
DirectMap1G: 0 kB

/proc/zoneinfo

Node 0, zone DMA
pages free 3969
min 5
low 6
high 7
scanned 0
spanned 4095
present 3998
managed 3977
nr_free_pages 3969
nr_alloc_batch 1
nr_inactive_anon 0
nr_active_anon 0
nr_inactive_file 0
nr_active_file 0
nr_unevictable 0
nr_mlock 0
nr_anon_pages 0
nr_mapped 0
nr_file_pages 0
nr_dirty 0
nr_writeback 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 0
nr_page_table_pages 0
nr_kernel_stack 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 0
nr_dirtied 0
nr_written 0
nr_pages_scanned 0
numa_hit 30
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 30
numa_other 0
workingset_refault 0
workingset_activate 0
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 2980, 7719, 7719)
pagesets
cpu: 0
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 1
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 2
count: 0
high: 0
batch: 1
vm stats threshold: 6
cpu: 3
count: 0
high: 0
batch: 1
vm stats threshold: 6
all_unreclaimable: 1
start_pfn: 1
inactive_ratio: 1
Node 0, zone DMA32
pages free 12618
min 1077
low 1346
high 1615
scanned 0
spanned 1044480
present 782256
managed 763535
nr_free_pages 12618
nr_alloc_batch 269
nr_inactive_anon 89095
nr_active_anon 351268
nr_inactive_file 58044
nr_active_file 75150
nr_unevictable 2605
nr_mlock 2605
nr_anon_pages 393874
nr_mapped 70303
nr_file_pages 182436
nr_dirty 1
nr_writeback 0
nr_slab_reclaimable 25295
nr_slab_unreclaimable 4981
nr_page_table_pages 8333
nr_kernel_stack 235
nr_unstable 0
nr_bounce 0
nr_vmscan_write 19016
nr_vmscan_immediate_reclaim 0
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 46130
nr_dirtied 2219397
nr_written 1935612
nr_pages_scanned 0
numa_hit 134038382
numa_miss 0
numa_foreign 0
numa_interleave 0
numa_local 134038382
numa_other 0
workingset_refault 1184602
workingset_activate 333618
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 4738, 4738)
pagesets
cpu: 0
count: 132
high: 186
batch: 31
vm stats threshold: 36
cpu: 1
count: 68
high: 186
batch: 31
vm stats threshold: 36
cpu: 2
count: 191
high: 186
batch: 31
vm stats threshold: 36
cpu: 3
count: 74
high: 186
batch: 31
vm stats threshold: 36
all_unreclaimable: 0
start_pfn: 4096
inactive_ratio: 4
Node 0, zone Normal
pages free 11903
min 1712
low 2140
high 2568
scanned 0
spanned 1245184
present 1245184
managed 1213155
nr_free_pages 11903
nr_alloc_batch 126
nr_inactive_anon 108343
nr_active_anon 601201
nr_inactive_file 94143
nr_active_file 123873
nr_unevictable 2545
nr_mlock 2545
nr_anon_pages 633247
nr_mapped 111175
nr_file_pages 296970
nr_dirty 5
nr_writeback 0
nr_slab_reclaimable 41272
nr_slab_unreclaimable 8284
nr_page_table_pages 12607
nr_kernel_stack 501
nr_unstable 0
nr_bounce 0
nr_vmscan_write 19983
nr_vmscan_immediate_reclaim 141
nr_writeback_temp 0
nr_isolated_anon 0
nr_isolated_file 0
nr_shmem 75926
nr_dirtied 3417463
nr_written 2980155
nr_pages_scanned 0
numa_hit 211574469
numa_miss 0
numa_foreign 0
numa_interleave 11756
numa_local 211574469
numa_other 0
workingset_refault 1828395
workingset_activate 465293
workingset_nodereclaim 0
nr_anon_transparent_hugepages 0
nr_free_cma 0
protection: (0, 0, 0, 0)
pagesets
cpu: 0
count: 164
high: 186
batch: 31
vm stats threshold: 42
cpu: 1
count: 140
high: 186
batch: 31
vm stats threshold: 42
cpu: 2
count: 162
high: 186
batch: 31
vm stats threshold: 42
cpu: 3
count: 92
high: 186
batch: 31
vm stats threshold: 42
all_unreclaimable: 0
start_pfn: 1048576
inactive_ratio: 6


Attachments:
config-rc7+ (98.85 kB)
config-rc6 (107.56 kB)

2015-08-21 11:37:40

by Vlastimil Babka

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7

On 08/21/2015 11:17 AM, Arthur Marsh wrote:
>
>
> Vlastimil Babka wrote on 20/08/15 17:46:
>> On 08/19/2015 05:44 PM, Arthur Marsh wrote:
>>> Hi, I've found that Linus' git head kernel has had some unwelcome
>>> behaviour where the chromium browser would exhaust all swap space in the
>>> course of a few hours. The behaviour appeared before the release of
>>> 4.2.0-rc7.
>>
>> Do you have any more details about the memory/swap usage? Is it really
>> that chromium process(es) itself eats more memory and starts swapping,
>> or that something else (a graphics driver?) eats kernel memory, and
>> chromium as one of the biggest processes is driven to swap by that? Can
>> you provide e.g. top output with good/bad kernels?
>>
>> Also what does /proc/meminfo and /proc/zoneinfo look like when it's
>> swapping?
>>
>> To see which processes use swap, you can try [1] :
>> for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{
>> print ""}' $file; done | sort -k 2 -n -r | less
>>
>> Thanks
>>
>> [1] http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
>>
>>> This does not happen with kernel 4.2.0-rc6.
>
> Sorry for the delay in replying. I had to give an extended run under
> kernel 4.2.0-rc6 to obtain comparative results. Both kernels' config
> files are attached.
>
> The applications running are the same both times, mainly iceweasel
> 38.1.0esr-3 and chromium 44.0.2403.107-1.
>
> With the rc7+ kernel but not the rc6 kernel, chromium eventually gets
> into a state of consuming lots of swap.
>
> I was able to capture the output requested when running a 4.2.0-rc7+
> kernel (Linus' git head as of around 05:00 UTC 19 August 2015) just
> before swap was exhausted, forcing me to do a control-alt-delete
> shutdown and waiting ages. The kernel config for the rc7+ is attached
>
> The comparison good kernel is from Debian:
> Linux am64 4.2.0-rc6-amd64 #1 SMP Debian 4.2~rc6-1~exp1 (2015-08-12)
> x86_64 GNU/Linux

Hm, I didn't check how similar the configs are; was the Debian one used
as a base for the self-compiled one? Just to rule out config differences...
during the bisection you did use the same config for compiling a "good"
rc6 kernel and a "bad" rc7 kernel, right?

That said, looking at the memory values:

rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
rc7: ... = 4714MB

That's 2GB unaccounted for, which is bad, and yet not enough to explain
a full 4GB of swap. Another noticeable difference is rc7 using 1560MB of
Shmem vs 476MB under rc6. The rest must be due to more anonymous memory
used by the processes. Iceweasel looks unchanged, so I'm guessing it's the
chromium processes... the top output probably doesn't give us the whole
picture here. I still suspect a graphics driver; which one do you use?
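The sum above can be reproduced mechanically. What follows is only a sketch in Python, with helper names (`parse_meminfo`, `accounted_mb`) invented for illustration; the field values are copied from the two /proc/meminfo dumps posted earlier in the thread. With all seven fields included it gives about 6769 MB for rc6 and about 4786 MB for rc7. The rc7 result is a little above the 4714 MB quoted (the difference equals the rc7 Inactive(file) value), and the gap between the two kernels stays close to 2 GB either way.

```python
import re

# Fields summed in the estimate; names as they appear in /proc/meminfo.
ACCOUNTED_FIELDS = (
    "MemFree", "Buffers",
    "Active(anon)", "Inactive(anon)",
    "Active(file)", "Inactive(file)",
    "Slab",
)


def parse_meminfo(text):
    """Return {field: kB} for every 'Name: value' line in meminfo-style text."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+):\s+(\d+)", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def accounted_mb(meminfo):
    """Sum the accounted fields and convert kB to MB."""
    return sum(meminfo[field] for field in ACCOUNTED_FIELDS) / 1024


# Values copied from the rc6 dump in this thread.
RC6 = parse_meminfo("""\
MemFree: 121468 kB
Buffers: 494724 kB
Active(anon): 3801344 kB
Inactive(anon): 789752 kB
Active(file): 795648 kB
Inactive(file): 609168 kB
Slab: 319332 kB
""")

# Values copied from the rc7+ dump in this thread.
RC7 = parse_meminfo("""\
MemFree: 119528 kB
Buffers: 8480 kB
Active(anon): 2945236 kB
Inactive(anon): 1410924 kB
Active(file): 129292 kB
Inactive(file): 73732 kB
Slab: 214100 kB
""")

if __name__ == "__main__":
    print(f"rc6: {accounted_mb(RC6):.0f} MB")
    print(f"rc7: {accounted_mb(RC7):.0f} MB")
```

On a live system the same functions could be fed the contents of /proc/meminfo directly instead of the pasted values.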

The shmem could be inspected by listing ipcs -m and ipcs -mp, running
grep SYSV /proc/*/maps, and figuring out what processes are behind the
pids. Doing that for rc6 and rc7 could tell us which processes use the
extra 1GB of shmem in rc7.
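As a rough companion to the grep over /proc/*/maps, the mapped SysV segments can be totalled per process. This is a sketch only: `sysv_mapped_kb` and `read_text` are illustrative names, and it assumes SysV segments show up in maps with a path containing SYSV, which is what the grep above relies on. A segment attached by several processes is counted once for each of them, and Shmem in /proc/meminfo also covers tmpfs and other shared memory, so the totals will not add up to the Shmem figure.

```python
import glob
import re

# A maps line starts with "start-end" in hex; SysV segments have SYSV in the path.
SYSV_LINE = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s.*\bSYSV", re.MULTILINE)


def sysv_mapped_kb(maps_text):
    """Total size in kB of the SYSV mappings listed in one /proc/<pid>/maps."""
    total = 0
    for start, end in SYSV_LINE.findall(maps_text):
        total += int(end, 16) - int(start, 16)
    return total // 1024


def read_text(path):
    """Read a /proc file, returning '' if the process is gone or unreadable."""
    try:
        with open(path, errors="replace") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    usage = []
    for maps_path in glob.glob("/proc/[0-9]*/maps"):
        pid = maps_path.split("/")[2]
        kb = sysv_mapped_kb(read_text(maps_path))
        if kb:
            name = read_text(f"/proc/{pid}/comm").strip()
            usage.append((kb, pid, name))
    for kb, pid, name in sorted(usage, reverse=True):
        print(f"{kb:>10} kB  pid {pid:<7} {name}")
```

Reading other users' maps files generally needs root, so the script is best run the same way as the grep would be.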

2015-08-21 11:48:54

by Vlastimil Babka

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7

On 08/21/2015 01:37 PM, Vlastimil Babka wrote:
>
> That said, looking at the memory values:
>
> rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
> rc7: ... = 4714MB
>
> That's 2GB unaccounted for.

So one brute-force way to see who allocated those 2GB is to use the
page_owner debug feature. You need to enable CONFIG_PAGE_OWNER and then
follow the Usage part of Documentation/vm/page_owner.txt.
If you can do that, please send the sorted_page_owner.txt for rc7 when
it's getting close to exhausting swap. Then you could start doing a
comparison run with rc6, but maybe it will be easy to figure out from the
rc7 log already. Thanks.
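For a quick look at a page_owner dump before sending it, the records can be grouped by allocation stack and ranked by page count. The sketch below is not the kernel's own sorting tool and makes assumptions about the dump format: records are separated by blank lines, the per-page line starts with "PFN", and the first line of each record contains "order N". The function name `summarise_page_owner` is made up for illustration.

```python
import re
import sys
from collections import Counter


def summarise_page_owner(text, top=10):
    """Group records by identical allocation info and count pages per group."""
    pages = Counter()
    for record in re.split(r"\n\s*\n", text.strip()):
        # Drop the per-page PFN line so identical call chains compare equal.
        lines = [ln for ln in record.splitlines() if not ln.startswith("PFN")]
        if not lines:
            continue
        order = re.search(r"order (\d+)", lines[0])
        # An order-N allocation covers 2**N pages; assume 1 page if not stated.
        count = 1 << int(order.group(1)) if order else 1
        pages["\n".join(lines)] += count
    return pages.most_common(top)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for stack, count in summarise_page_owner(handle.read()):
            print(f"{count} pages:\n{stack}\n")
```

It takes the path of a saved dump as its only argument; the largest groups are the first candidates for whoever allocated the missing memory.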

2015-08-21 12:38:10

by Arthur Marsh

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7



Vlastimil Babka wrote on 21/08/15 21:07:
> On 08/21/2015 11:17 AM, Arthur Marsh wrote:
>>
>>
>> Vlastimil Babka wrote on 20/08/15 17:46:
>>> On 08/19/2015 05:44 PM, Arthur Marsh wrote:
>>>> Hi, I've found that Linus' git head kernel has had some unwelcome
>>>> behaviour where the chromium browser would exhaust all swap space in the
>>>> course of a few hours. The behaviour appeared before the release of
>>>> 4.2.0-rc7.
>>>
>>> Do you have any more details about the memory/swap usage? Is it really
>>> that chromium process(es) itself eats more memory and starts swapping,
>>> or that something else (a graphics driver?) eats kernel memory, and
>>> chromium as one of the biggest processes is driven to swap by that? Can
>>> you provide e.g. top output with good/bad kernels?
>>>
>>> Also what does /proc/meminfo and /proc/zoneinfo look like when it's
>>> swapping?
>>>
>>> To see which processes use swap, you can try [1] :
>>> for file in /proc/*/status ; do awk '/VmSwap|Name/{printf $2 " " $3}END{
>>> print ""}' $file; done | sort -k 2 -n -r | less
>>>
>>> Thanks
>>>
>>> [1] http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
>>>
>>>> This does not happen with kernel 4.2.0-rc6.
>>
>> Sorry for the delay in replying. I had to give an extended run under
>> kernel 4.2.0-rc6 to obtain comparative results. Both kernels' config
>> files are attached.
>>
>> The applications running are the same both times, mainly iceweasel
>> 38.1.0esr-3 and chromium 44.0.2403.107-1.
>>
>> With the rc7+ kernel but not the rc6 kernel, chromium eventually gets
>> into a state of consuming lots of swap.
>>
>> I was able to capture the output requested when running a 4.2.0-rc7+
>> kernel (Linus' git head as of around 05:00 UTC 19 August 2015) just
>> before swap was exhausted, forcing me to do a control-alt-delete
>> shutdown and waiting ages. The kernel config for the rc7+ is attached
>>
>> The comparison good kernel is from Debian:
>> Linux am64 4.2.0-rc6-amd64 #1 SMP Debian 4.2~rc6-1~exp1 (2015-08-12)
>> x86_64 GNU/Linux
>
> Hm I didn't how similar are the configs, was the debian one used as a
> base for the self-compiled one? Just to rule out config differences...
> during the bisection you did use the same for compiling a "good" rc6
> kernel and "bad" rc7 kernel, right?
>
> That said, looking at the memory values:
>
> rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
> rc7: ... = 4714MB
>
> That's 2GB unaccounted for. Which is bad, and yet not enough to explain
> a full 4GB swap. Another noticeable difference is rc7 using 1560MB ShMem
> vs 476MB. The rest must be due to more anonymous memory used by the
> processes. Iceweasel looks unchanged, so I'm guessing the chromiums...
> the top output probably doesn't give us the whole picture here. I'm
> still suspecting a graphics driver, which one do you use?
>
> The shmem could be inspected by listing ipcs -m and ipcs -mp and grep
> SYSV /proc/*/maps and figuring out what processes are behind the
> pids. Doing that for rc6 and rc7 could tell us which processes use the
> extra 1GB of shmem in rc7.
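
A minimal Python sketch of the `grep SYSV /proc/*/maps` step suggested above, for anyone who wants the mappings grouped per process. The helper names `sysv_segments` and `scan_all_processes` are hypothetical. It assumes the usual `address perms offset dev inode pathname` layout of /proc/<pid>/maps, and that the inode column of a SYSV mapping corresponds to the shmid shown by `ipcs -m`.

```python
import glob
import re
from collections import defaultdict

# Matches the pathname column of a System V shared memory mapping,
# e.g. "/SYSV00000000 (deleted)".
SYSV_RE = re.compile(r"/SYSV[0-9a-fA-F]+")


def sysv_segments(maps_text):
    """Return {inode: total mapped bytes} for SYSV mappings in one maps file."""
    sizes = defaultdict(int)
    for line in maps_text.splitlines():
        if not SYSV_RE.search(line):
            continue
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # Assumption: for SYSV segments the inode equals the ipcs shmid.
        inode = int(fields[4])
        sizes[inode] += end - start
    return dict(sizes)


def scan_all_processes():
    """Map pid -> {shmid: bytes}; unreadable or vanished processes are skipped."""
    result = {}
    for path in glob.glob("/proc/[0-9]*/maps"):
        pid = int(path.split("/")[2])
        try:
            with open(path) as fh:
                segs = sysv_segments(fh.read())
        except OSError:
            continue
        if segs:
            result[pid] = segs
    return result


if __name__ == "__main__":
    for pid, segs in sorted(scan_all_processes().items()):
        for shmid, size in segs.items():
            print(f"pid {pid}: shmid {shmid}, {size // 1024} kB mapped")
```

Running it as root on both the rc6 and rc7 kernels and comparing the per-pid totals would be one way to see which processes hold the extra shmem.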

I could do another test with the output you requested using an rc6
kernel built with the same config as rc7 but it would mean the best part
of 24 hours letting it run again.

I had observed the differences in behaviour with rc6 and rc7 kernels I
had built with the same config, but it was difficult to bisect when the
problems took some hours to appear.

The graphics driver is radeon, an onboard radeon 3200HD (RS780), taken
from the rc6 kernel dmesg (I have to do a power off restart with the
onboard video to get it initialised correctly):

dmesg|egrep -i '(video|vga|radeon|agp|drm|ttm)'

[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-4.2.0-rc6-amd64
root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro radeon.audio=1
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-4.2.0-rc6-amd64
root=UUID=39706f53-7c27-4310-b22a-36c7b042d1a1 ro radeon.audio=1
[ 0.000000] AGP: Checking aperture...
[ 0.000000] AGP: No AGP bridge found
[ 0.000000] AGP: Node 0: aperture [bus addr 0xe64000000-0xe65ffffff]
(32MB)
[ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
[ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
[ 0.000000] AGP: This costs you 64MB of RAM
[ 0.000000] AGP: Mapping aperture over RAM [mem
0xb4000000-0xb7ffffff] (65536KB)
[ 0.000000] Console: colour VGA+ 80x25
[ 0.250485] vgaarb: setting as boot device: PCI:0000:01:05.0
[ 0.250524] vgaarb: device added:
PCI:0000:01:05.0,decodes=io+mem,owns=io+mem,locks=none
[ 0.250562] vgaarb: loaded
[ 0.250591] vgaarb: bridge control possible 0000:01:05.0
[ 0.280443] pci 0000:01:05.0: Video device with shadowed ROM
[ 0.554082] PCI-DMA: Disabling AGP.
[ 0.554278] PCI-DMA: Reserving 64MB of IOMMU area in the AGP aperture
[ 0.581841] Linux agpgart interface v0.103
[ 8.192329] [drm] Initialized drm 1.1.0 20060810
[ 9.820433] [drm] radeon kernel modesetting enabled.
[ 10.061641] [drm] initializing kernel modesetting (RS780
0x1002:0x9610 0x1043:0x82F1).
[ 10.061723] [drm] register mmio base: 0xFEAF0000
[ 10.061761] [drm] register mmio size: 65536
[ 10.062752] radeon 0000:01:05.0: VRAM: 256M 0x00000000C0000000 -
0x00000000CFFFFFFF (256M used)
[ 10.062802] radeon 0000:01:05.0: GTT: 512M 0x00000000A0000000 -
0x00000000BFFFFFFF
[ 10.062848] [drm] Detected VRAM RAM=256M, BAR=256M
[ 10.062886] [drm] RAM width 32bits DDR
[ 10.063199] [TTM] Zone kernel: Available graphics memory: 3961334 kiB
[ 10.063242] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[ 10.063282] [TTM] Initializing pool allocator
[ 10.063330] [TTM] Initializing DMA pool allocator
[ 10.063415] [drm] radeon: 256M of VRAM memory ready
[ 10.063457] [drm] radeon: 512M of GTT memory ready.
[ 10.063521] [drm] Loading RS780 Microcode
[ 10.375216] radeon 0000:01:05.0: firmware: direct-loading firmware
radeon/RS780_pfp.bin
[ 10.382622] radeon 0000:01:05.0: firmware: direct-loading firmware
radeon/RS780_me.bin
[ 10.418020] radeon 0000:01:05.0: firmware: direct-loading firmware
radeon/R600_rlc.bin
[ 10.418123] [drm] radeon: power management initialized
[ 10.563323] radeon 0000:01:05.0: firmware: direct-loading firmware
radeon/RS780_uvd.bin
[ 10.563473] [drm] GART: num cpu pages 131072, num gpu pages 131072
[ 10.582735] [drm] PCIE GART of 512M enabled (table at
0x00000000C0258000).
[ 10.582866] radeon 0000:01:05.0: WB enabled
[ 10.582914] radeon 0000:01:05.0: fence driver on ring 0 use gpu addr
0x00000000a0000c00 and cpu addr 0xffff8800bac13c00
[ 10.587596] radeon 0000:01:05.0: fence driver on ring 5 use gpu addr
0x00000000c0056038 and cpu addr 0xffffc90001016038
[ 10.587667] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[ 10.587706] [drm] Driver supports precise vblank timestamp query.
[ 10.587746] radeon 0000:01:05.0: radeon: MSI limited to 32-bit
[ 10.587806] [drm] radeon: irq initialized.
[ 10.619604] [drm] ring test on 0 succeeded in 1 usecs
[ 10.794153] [drm] ring test on 5 succeeded in 1 usecs
[ 10.794217] [drm] UVD initialized successfully.
[ 10.794840] [drm] ib test on ring 0 succeeded in 0 usecs
[ 11.441315] [drm] ib test on ring 5 succeeded
[ 11.442573] [drm] Radeon Display Connectors
[ 11.442625] [drm] Connector 0:
[ 11.442662] [drm] VGA-1
[ 11.442701] [drm] DDC: 0x7e40 0x7e40 0x7e44 0x7e44 0x7e48 0x7e48
0x7e4c 0x7e4c
[ 11.442742] [drm] Encoders:
[ 11.442779] [drm] CRT1: INTERNAL_KLDSCP_DAC1
[ 11.442816] [drm] Connector 1:
[ 11.442852] [drm] HDMI-A-1
[ 11.443780] [drm] HPD3
[ 11.443817] [drm] DDC: 0x7e50 0x7e50 0x7e54 0x7e54 0x7e58 0x7e58
0x7e5c 0x7e5c
[ 11.443857] [drm] Encoders:
[ 11.443893] [drm] DFP3: INTERNAL_KLDSCP_LVTMA
[ 11.492369] [drm] fb mappable at 0xD0359000
[ 11.492402] [drm] vram apper at 0xD0000000
[ 11.492430] [drm] size 8294400
[ 11.492458] [drm] fb depth is 24
[ 11.492487] [drm] pitch is 7680
[ 11.492697] fbcon: radeondrmfb (fb0) is primary device
[ 11.548492] radeon 0000:01:05.0: fb0: radeondrmfb frame buffer device
[ 11.548581] radeon 0000:01:05.0: registered panic notifier
[ 11.557161] [drm] Initialized radeon 2.43.0 20080528 for 0000:01:05.0
on minor 0
[ 12.615061] Linux video capture interface: v2.00

Arthur.

2015-08-21 12:43:09

by Arthur Marsh

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7



Vlastimil Babka wrote on 21/08/15 21:18:
> On 08/21/2015 01:37 PM, Vlastimil Babka wrote:
>>
>> That said, looking at the memory values:
>>
>> rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
>> rc7: ... = 4714MB
>>
>> That's 2GB unaccounted for.
>
> So one brute-force way to see who allocated those 2GB is to use the
> page_owner debug feature. You need to enable CONFIG_PAGE_OWNER and then
> follow the Usage part of Documentation/vm/page_owner.txt
> If you can do that, please send the sorted_page_owner.txt for rc7 when
> it's semi-nearing the exhausted swap. Then you could start doing a
> comparison run with rc6, but maybe it will be easy to figure from the
> rc7 log already. Thanks.
>

I'm currently rebuilding the rc7 kernel with CONFIG_PAGE_OWNER=y and
will test that.

Arthur.

2015-08-22 04:48:24

by Arthur Marsh

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7



Vlastimil Babka wrote on 21/08/15 21:18:
> On 08/21/2015 01:37 PM, Vlastimil Babka wrote:
>>
>> That said, looking at the memory values:
>>
>> rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
>> rc7: ... = 4714MB
>>
>> That's 2GB unaccounted for.
>
> So one brute-force way to see who allocated those 2GB is to use the
> page_owner debug feature. You need to enable CONFIG_PAGE_OWNER and then
> follow the Usage part of Documentation/vm/page_owner.txt
> If you can do that, please send the sorted_page_owner.txt for rc7 when
> it's semi-nearing the exhausted swap. Then you could start doing a
> comparison run with rc6, but maybe it will be easy to figure from the
> rc7 log already. Thanks.
>

Documentation/vm/page_owner.txt does not mention the need to do:

mount -t debugfs none /sys/kernel/debug

Having done that when about 1.5 GiB swap was in use, the output of
sorted_page_owner.txt with the rc7+ kernel starts with:

699487 times:
Page allocated via order 0, mask 0x280da
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff8118d35b>] handle_mm_fault+0x11bb/0x1480
[<ffffffff8104c3e8>] __do_page_fault+0x178/0x480
[<ffffffff8104c71b>] do_page_fault+0x2b/0x40
[<ffffffff815b01a8>] page_fault+0x28/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

457823 times:
Page allocated via order 0, mask 0x202d0
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff8100a844>] dma_generic_alloc_coherent+0xa4/0xf0
[<ffffffff810481fd>] x86_swiotlb_alloc_coherent+0x2d/0x60
[<ffffffff8100a5ae>] dma_alloc_attrs+0x4e/0x90
[<ffffffffa0427d72>] ttm_dma_populate+0x502/0x900 [ttm]
[<ffffffffa046bf26>] radeon_ttm_tt_populate+0x216/0x2b0 [radeon]
[<ffffffffa041dd74>] ttm_tt_bind+0x44/0x80 [ttm]
[<ffffffffa0420316>] ttm_bo_handle_move_mem+0x3b6/0x440 [ttm]

213933 times:
Page allocated via order 0, mask 0x10200da
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff81158b7c>] pagecache_get_page+0x9c/0x1f0
[<ffffffff81158cf7>] grab_cache_page_write_begin+0x27/0x40
[<ffffffffa01228cd>] ext4_write_begin+0xbd/0x580 [ext4]
[<ffffffff8115773a>] generic_perform_write+0xaa/0x1a0
[<ffffffff8115a063>] __generic_file_write_iter+0x193/0x1f0
[<ffffffffa0115c85>] ext4_file_write_iter+0xf5/0x490 [ext4]
[<ffffffff811c6cf5>] __vfs_write+0xa5/0xe0

120198 times:
Page allocated via order 0, mask 0x200da
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff81175ab1>] shmem_getpage_gfp+0x381/0xa30
[<ffffffff811767f2>] shmem_fault+0x62/0x1b0
[<ffffffff81189648>] __do_fault+0x38/0x80
[<ffffffff8118c4ac>] handle_mm_fault+0x30c/0x1480
[<ffffffff8104c3e8>] __do_page_fault+0x178/0x480
[<ffffffff8104c71b>] do_page_fault+0x2b/0x40
[<ffffffff815b01a8>] page_fault+0x28/0x30

82253 times:
Page allocated via order 0, mask 0x213da
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff81167b41>] __do_page_cache_readahead+0x101/0x320
[<ffffffff8115a778>] filemap_fault+0x388/0x400
[<ffffffff81189648>] __do_fault+0x38/0x80
[<ffffffff8118ce7f>] handle_mm_fault+0xcdf/0x1480
[<ffffffff8104c3e8>] __do_page_fault+0x178/0x480
[<ffffffff8104c71b>] do_page_fault+0x2b/0x40
[<ffffffff815b01a8>] page_fault+0x28/0x30

47542 times:
Page allocated via order 0, mask 0x200da
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff81189baa>] wp_page_copy.isra.62+0x7a/0x5c0
[<ffffffff8118b6fd>] do_wp_page+0xbd/0x600
[<ffffffff8118c92c>] handle_mm_fault+0x78c/0x1480
[<ffffffff8104c3e8>] __do_page_fault+0x178/0x480
[<ffffffff8104c71b>] do_page_fault+0x2b/0x40
[<ffffffff815b01a8>] page_fault+0x28/0x30
[<ffffffffffffffff>] 0xffffffffffffffff

43843 times:
Page allocated via order 0, mask 0x0
[<ffffffff81b0a34a>] page_ext_init+0xe5/0xea
[<ffffffff81ae4ebe>] start_kernel+0x392/0x459
[<ffffffff81ae4315>] x86_64_start_reservations+0x2a/0x2c
[<ffffffff81ae444e>] x86_64_start_kernel+0x137/0x146
[<ffffffffffffffff>] 0xffffffffffffffff

28075 times:
Page allocated via order 0, mask 0x8
[<ffffffff81160388>] split_free_page+0x38/0x50
[<ffffffff81183865>] isolate_freepages_block+0x205/0x4b0
[<ffffffff81183cb1>] compaction_alloc+0x1a1/0x280
[<ffffffff811b2a51>] migrate_pages+0x241/0x9a0
[<ffffffff8118553e>] compact_zone+0x55e/0xea0
[<ffffffff81185ed8>] compact_zone_order+0x58/0x70
[<ffffffff81186317>] try_to_compact_pages+0x127/0x5b0
[<ffffffff811611f9>] __alloc_pages_direct_compact+0x49/0xf0

19845 times:
Page allocated via order 0, mask 0x2a4050
[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
[<ffffffff811ad53b>] cache_alloc_refill+0x33b/0x5b0
[<ffffffff811ad095>] kmem_cache_alloc+0x1a5/0x310
[<ffffffffa0132dca>] ext4_alloc_inode+0x1a/0x210 [ext4]
[<ffffffff811e3f88>] alloc_inode+0x18/0x90
[<ffffffff811e58a8>] iget_locked+0xd8/0x190
[<ffffffffa011e85c>] ext4_iget+0x3c/0xa70 [ext4]
[<ffffffffa011f2bb>] ext4_iget_normal+0x2b/0x40 [ext4]
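
For reading a listing like the one above, here is a rough Python sketch that turns each "N times:" record into a size in MiB. It is not part of the kernel tooling; `parse_records` is a hypothetical helper. It assumes 4 KiB pages and that each record stands for one allocation of 2^order pages.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB base pages, as on x86_64

HEADER_RE = re.compile(r"^(\d+) times:$")
ORDER_RE = re.compile(r"order (\d+), mask (0x[0-9a-f]+)")
# Captures the symbol name of a frame such as
# "[<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00".
FRAME_RE = re.compile(r"^\[<[0-9a-f]+>\] (\S+?)\+0x")


def parse_records(text):
    """Return a list of dicts with count, order, frames and estimated MiB."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER_RE.match(line)
        if m:
            current = {"count": int(m.group(1)), "order": 0, "frames": []}
            records.append(current)
            continue
        if current is None:
            continue
        m = ORDER_RE.search(line)
        if m:
            current["order"] = int(m.group(1))
            continue
        m = FRAME_RE.match(line)
        if m:
            current["frames"].append(m.group(1))
    for rec in records:
        pages = rec["count"] << rec["order"]
        rec["mib"] = pages * PAGE_KIB / 1024
    return records


if __name__ == "__main__":
    import sys
    for rec in parse_records(sys.stdin.read()):
        caller = rec["frames"][1] if len(rec["frames"]) > 1 else "?"
        print(f'{rec["mib"]:10.1f} MiB  {rec["count"]:8d} x order {rec["order"]}  {caller}')
```

Under those assumptions the second record (457823 order-0 pages allocated via ttm_dma_populate) works out to roughly 1788 MiB.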

Also, once when attempting to do:

cat /sys/kernel/debug/page_owner > page_owner_full.txt

I received the following error:


[18410.829060] cat: page allocation failure: order:5, mode:0x2040d0
[18410.829068] CPU: 3 PID: 1732 Comm: cat Not tainted 4.2.0-rc7+ #1907
[18410.829070] Hardware name: System manufacturer System Product
Name/M3A78 PRO, BIOS 1701 01/27/2011
[18410.829073] 0000000000000005 ffff88001f4d7a58 ffffffff815a554d
0000000000000034
[18410.829078] 00000000002040d0 ffff88001f4d7ae8 ffffffff8115dedc
ffff8800360b4540
[18410.829082] 0000000000000005 ffff8800360b4540 00000000002040d0
ffff88001f4d7bc0
[18410.829085] Call Trace:
[18410.829091] [<ffffffff815a554d>] dump_stack+0x4f/0x7b
[18410.829096] [<ffffffff8115dedc>] warn_alloc_failed+0xdc/0x130
[18410.829099] [<ffffffff81161298>] ?
__alloc_pages_direct_compact+0xe8/0xf0
[18410.829101] [<ffffffff81161b31>] __alloc_pages_nodemask+0x891/0xb00
[18410.829104] [<ffffffff810aa375>] ? __lock_acquire+0xc05/0x1c70
[18410.829107] [<ffffffff811ad53b>] cache_alloc_refill+0x33b/0x5b0
[18410.829110] [<ffffffff811c1764>] ? print_page_owner+0x54/0x350
[18410.829112] [<ffffffff811adc7e>] __kmalloc+0x1be/0x330
[18410.829114] [<ffffffff811c1764>] print_page_owner+0x54/0x350
[18410.829116] [<ffffffff8115f786>] ? drain_pages_zone+0x76/0xa0
[18410.829118] [<ffffffff8115f860>] ? page_alloc_cpu_notify+0x50/0x50
[18410.829119] [<ffffffff8115f7cf>] ? drain_pages+0x1f/0x60
[18410.829122] [<ffffffff810a93c6>] ? trace_hardirqs_on_caller+0x136/0x1c0
[18410.829123] [<ffffffff8115f860>] ? page_alloc_cpu_notify+0x50/0x50
[18410.829126] [<ffffffff81084403>] ? preempt_count_sub+0x23/0x60
[18410.829129] [<ffffffff810eb58f>] ? on_each_cpu_mask+0x5f/0xd0
[18410.829131] [<ffffffff811c1bbf>] read_page_owner+0x15f/0x180
[18410.829134] [<ffffffff811c6ba3>] __vfs_read+0x23/0xd0
[18410.829137] [<ffffffff8126581b>] ? security_file_permission+0x9b/0xc0
[18410.829139] [<ffffffff811c71ba>] ? rw_verify_area+0x4a/0xe0
[18410.829141] [<ffffffff811c72dd>] vfs_read+0x8d/0x140
[18410.829143] [<ffffffff810acc51>] ? lockdep_sys_exit+0x1/0x90
[18410.829146] [<ffffffff811c7cbd>] SyS_read+0x4d/0xb0
[18410.829149] [<ffffffff815ae26e>] entry_SYSCALL_64_fastpath+0x12/0x76
[18410.829151] Mem-Info:
[18410.829157] active_anon:715055 inactive_anon:205953 isolated_anon:15
active_file:215967 inactive_file:199708 isolated_file:0
unevictable:5132 dirty:4186 writeback:5030 unstable:0
slab_reclaimable:49019 slab_unreclaimable:28035
mapped:168002 shmem:124895 pagetables:20296 bounce:0
free:14378 free_pcp:127 free_cma:0
[18410.829164] DMA free:15872kB min:20kB low:24kB high:28kB
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB
managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB
pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[18410.829166] lowmem_reserve[]: 0 2966 7692 7692
[18410.829174] DMA32 free:28064kB min:4296kB low:5368kB high:6444kB
active_anon:1064284kB inactive_anon:359240kB active_file:348344kB
inactive_file:326552kB unevictable:9228kB isolated(anon):60kB
isolated(file):0kB present:3129024kB managed:3040444kB mlocked:9228kB
dirty:6408kB writeback:7312kB mapped:276504kB shmem:197144kB
slab_reclaimable:78148kB slab_unreclaimable:44692kB kernel_stack:5216kB
pagetables:33052kB unstable:0kB bounce:0kB free_pcp:24kB local_pcp:20kB
free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[18410.829176] lowmem_reserve[]: 0 0 4725 4725
[18410.829183] Normal free:13576kB min:6844kB low:8552kB high:10264kB
active_anon:1795936kB inactive_anon:464572kB active_file:515524kB
inactive_file:472280kB unevictable:11300kB isolated(anon):0kB
isolated(file):0kB present:4980736kB managed:4839416kB mlocked:11300kB
dirty:10336kB writeback:12808kB mapped:395504kB shmem:302436kB
slab_reclaimable:117928kB slab_unreclaimable:67448kB kernel_stack:7392kB
pagetables:48132kB unstable:0kB bounce:0kB free_pcp:484kB local_pcp:0kB
free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[18410.829184] lowmem_reserve[]: 0 0 0 0
[18410.829188] DMA: 0*4kB 0*8kB 0*16kB 2*32kB (U) 1*64kB (U) 1*128kB (U)
1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (EM) = 15872kB
[18410.829201] DMA32: 1914*4kB (UEM) 835*8kB (UEM) 390*16kB (UEM)
170*32kB (EM) 34*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 28320kB
[18410.829213] Normal: 1049*4kB (UEM) 511*8kB (UEM) 222*16kB (UEM)
48*32kB (UM) 6*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 13756kB
[18410.829225] Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
[18410.829226] 555867 total pagecache pages
[18410.829228] 10431 pages in swap cache
[18410.829229] Swap cache stats: add 395263, delete 384832, find 34692/49351
[18410.829231] Free swap = 2820524kB
[18410.829232] Total swap = 4194288kB
[18410.829275] 2031438 pages RAM
[18410.829276] 0 pages HighMem/MovableOnly
[18410.829277] 57497 pages reserved

I'll try to repeat the process with the 4.2.0-rc6 kernel also.

Arthur.

2015-08-22 07:05:30

by Vlastimil Babka

Subject: Re: difficult to pinpoint exhaustion of swap between 4.2.0-rc6 and 4.2.0-rc7

On 22.8.2015 6:48, Arthur Marsh wrote:
>
>
> Vlastimil Babka wrote on 21/08/15 21:18:
>> On 08/21/2015 01:37 PM, Vlastimil Babka wrote:
>>>
>>> That said, looking at the memory values:
>>>
>>> rc6: Free+Buffers+A/I(Anon)+A/I(File)+Slab = 6769MB
>>> rc7: ... = 4714MB
>>>
>>> That's 2GB unaccounted for.
>>
>> So one brute-force way to see who allocated those 2GB is to use the
>> page_owner debug feature. You need to enable CONFIG_PAGE_OWNER and then
>> follow the Usage part of Documentation/vm/page_owner.txt
>> If you can do that, please send the sorted_page_owner.txt for rc7 when
>> it's semi-nearing the exhausted swap. Then you could start doing a
>> comparison run with rc6, but maybe it will be easy to figure from the
>> rc7 log already. Thanks.
>>
>
> Documentation/vm/page_owner.txt does not mention the need to do:
>
> mount -t debugfs none /sys/kernel/debug

Ah, right...

> Having done that when about 1.5 GiB swap was in use, the output of
> sorted_page_owner.txt with the rc7+ kernel starts with:
>
> 699487 times:
> Page allocated via order 0, mask 0x280da
> [<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
> [<ffffffff8118d35b>] handle_mm_fault+0x11bb/0x1480
> [<ffffffff8104c3e8>] __do_page_fault+0x178/0x480
> [<ffffffff8104c71b>] do_page_fault+0x2b/0x40
> [<ffffffff815b01a8>] page_fault+0x28/0x30
> [<ffffffffffffffff>] 0xffffffffffffffff

That's userspace, that's fine.

> 457823 times:
> Page allocated via order 0, mask 0x202d0
> [<ffffffff8116147e>] __alloc_pages_nodemask+0x1de/0xb00
> [<ffffffff8100a844>] dma_generic_alloc_coherent+0xa4/0xf0
> [<ffffffff810481fd>] x86_swiotlb_alloc_coherent+0x2d/0x60
> [<ffffffff8100a5ae>] dma_alloc_attrs+0x4e/0x90
> [<ffffffffa0427d72>] ttm_dma_populate+0x502/0x900 [ttm]
> [<ffffffffa046bf26>] radeon_ttm_tt_populate+0x216/0x2b0 [radeon]
> [<ffffffffa041dd74>] ttm_tt_bind+0x44/0x80 [ttm]
> [<ffffffffa0420316>] ttm_bo_handle_move_mem+0x3b6/0x440 [ttm]

There. Roughly 1800MB of RAM (457823 order-0 pages of 4kB each) was allocated
through ttm/radeon in rc7(+). And apparently that doesn't happen with rc6.
The problem is, there were no commits between rc6 and rc7 in
drivers/gpu/drm/radeon/ or drivers/gpu/drm/ttm/. I'm CC'ing dri and some radeon
devs anyway. Please find the rest of this thread on lkml.

[...]

>
> Also, once when attempting to do:
>
> cat /sys/kernel/debug/page_owner > page_owner_full.txt
>
> I received the following error:
>
>
> [18410.829060] cat: page allocation failure: order:5, mode:0x2040d0
> [18410.829068] CPU: 3 PID: 1732 Comm: cat Not tainted 4.2.0-rc7+ #1907
> [18410.829070] Hardware name: System manufacturer System Product
> Name/M3A78 PRO, BIOS 1701 01/27/2011
> [18410.829073] 0000000000000005 ffff88001f4d7a58 ffffffff815a554d
> 0000000000000034
> [18410.829078] 00000000002040d0 ffff88001f4d7ae8 ffffffff8115dedc
> ffff8800360b4540
> [18410.829082] 0000000000000005 ffff8800360b4540 00000000002040d0
> ffff88001f4d7bc0
> [18410.829085] Call Trace:
> [18410.829091] [<ffffffff815a554d>] dump_stack+0x4f/0x7b
> [18410.829096] [<ffffffff8115dedc>] warn_alloc_failed+0xdc/0x130
> [18410.829099] [<ffffffff81161298>] ?
> __alloc_pages_direct_compact+0xe8/0xf0
> [18410.829101] [<ffffffff81161b31>] __alloc_pages_nodemask+0x891/0xb00
> [18410.829104] [<ffffffff810aa375>] ? __lock_acquire+0xc05/0x1c70
> [18410.829107] [<ffffffff811ad53b>] cache_alloc_refill+0x33b/0x5b0
> [18410.829110] [<ffffffff811c1764>] ? print_page_owner+0x54/0x350
> [18410.829112] [<ffffffff811adc7e>] __kmalloc+0x1be/0x330
> [18410.829114] [<ffffffff811c1764>] print_page_owner+0x54/0x350
> [18410.829116] [<ffffffff8115f786>] ? drain_pages_zone+0x76/0xa0
> [18410.829118] [<ffffffff8115f860>] ? page_alloc_cpu_notify+0x50/0x50
> [18410.829119] [<ffffffff8115f7cf>] ? drain_pages+0x1f/0x60
> [18410.829122] [<ffffffff810a93c6>] ? trace_hardirqs_on_caller+0x136/0x1c0
> [18410.829123] [<ffffffff8115f860>] ? page_alloc_cpu_notify+0x50/0x50
> [18410.829126] [<ffffffff81084403>] ? preempt_count_sub+0x23/0x60
> [18410.829129] [<ffffffff810eb58f>] ? on_each_cpu_mask+0x5f/0xd0
> [18410.829131] [<ffffffff811c1bbf>] read_page_owner+0x15f/0x180
> [18410.829134] [<ffffffff811c6ba3>] __vfs_read+0x23/0xd0
> [18410.829137] [<ffffffff8126581b>] ? security_file_permission+0x9b/0xc0
> [18410.829139] [<ffffffff811c71ba>] ? rw_verify_area+0x4a/0xe0
> [18410.829141] [<ffffffff811c72dd>] vfs_read+0x8d/0x140
> [18410.829143] [<ffffffff810acc51>] ? lockdep_sys_exit+0x1/0x90
> [18410.829146] [<ffffffff811c7cbd>] SyS_read+0x4d/0xb0
> [18410.829149] [<ffffffff815ae26e>] entry_SYSCALL_64_fastpath+0x12/0x76
> [18410.829151] Mem-Info:
> [18410.829157] active_anon:715055 inactive_anon:205953 isolated_anon:15
> active_file:215967 inactive_file:199708 isolated_file:0
> unevictable:5132 dirty:4186 writeback:5030 unstable:0
> slab_reclaimable:49019 slab_unreclaimable:28035
> mapped:168002 shmem:124895 pagetables:20296 bounce:0
> free:14378 free_pcp:127 free_cma:0
> [18410.829164] DMA free:15872kB min:20kB low:24kB high:28kB
> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB
> managed:15904kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB
> pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB
> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> [18410.829166] lowmem_reserve[]: 0 2966 7692 7692
> [18410.829174] DMA32 free:28064kB min:4296kB low:5368kB high:6444kB
> active_anon:1064284kB inactive_anon:359240kB active_file:348344kB
> inactive_file:326552kB unevictable:9228kB isolated(anon):60kB
> isolated(file):0kB present:3129024kB managed:3040444kB mlocked:9228kB
> dirty:6408kB writeback:7312kB mapped:276504kB shmem:197144kB
> slab_reclaimable:78148kB slab_unreclaimable:44692kB kernel_stack:5216kB
> pagetables:33052kB unstable:0kB bounce:0kB free_pcp:24kB local_pcp:20kB
> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [18410.829176] lowmem_reserve[]: 0 0 4725 4725
> [18410.829183] Normal free:13576kB min:6844kB low:8552kB high:10264kB
> active_anon:1795936kB inactive_anon:464572kB active_file:515524kB
> inactive_file:472280kB unevictable:11300kB isolated(anon):0kB
> isolated(file):0kB present:4980736kB managed:4839416kB mlocked:11300kB
> dirty:10336kB writeback:12808kB mapped:395504kB shmem:302436kB
> slab_reclaimable:117928kB slab_unreclaimable:67448kB kernel_stack:7392kB
> pagetables:48132kB unstable:0kB bounce:0kB free_pcp:484kB local_pcp:0kB
> free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [18410.829184] lowmem_reserve[]: 0 0 0 0
> [18410.829188] DMA: 0*4kB 0*8kB 0*16kB 2*32kB (U) 1*64kB (U) 1*128kB (U)
> 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (EM) = 15872kB
> [18410.829201] DMA32: 1914*4kB (UEM) 835*8kB (UEM) 390*16kB (UEM)
> 170*32kB (EM) 34*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 28320kB
> [18410.829213] Normal: 1049*4kB (UEM) 511*8kB (UEM) 222*16kB (UEM)
> 48*32kB (UM) 6*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 13756kB
> [18410.829225] Node 0 hugepages_total=0 hugepages_free=0
> hugepages_surp=0 hugepages_size=2048kB
> [18410.829226] 555867 total pagecache pages
> [18410.829228] 10431 pages in swap cache
> [18410.829229] Swap cache stats: add 395263, delete 384832, find 34692/49351
> [18410.829231] Free swap = 2820524kB
> [18410.829232] Total swap = 4194288kB
> [18410.829275] 2031438 pages RAM
> [18410.829276] 0 pages HighMem/MovableOnly
> [18410.829277] 57497 pages reserved

OK, we should look at this. It's annoying to rely on an order-5 allocation when you
are debugging a memory leak issue. There really should be an order-0 fallback...
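
To put the order-5 request in numbers: it needs 2^5 = 32 contiguous pages, i.e. 128kB with 4kB pages. The following Python sketch (the helper `free_blocks_at_or_above` is hypothetical, and the 4kB page size is an assumption) counts free blocks of at least that size in the per-zone buddy lines quoted above.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages (x86_64)

# Matches entries such as "1049*4kB" in the per-zone free list line.
BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB")


def free_blocks_at_or_above(buddy_line, order):
    """Count free blocks large enough to satisfy an allocation of the given order."""
    need_kb = PAGE_KB << order  # order 5 -> 32 pages -> 128 kB
    return sum(int(n) for n, size in BLOCK_RE.findall(buddy_line)
               if int(size) >= need_kb)


# The two zone lines from the allocation failure report, joined onto one line each.
normal = ("Normal: 1049*4kB (UEM) 511*8kB (UEM) 222*16kB (UEM) 48*32kB (UM) "
          "6*64kB (M) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13756kB")
dma32 = ("DMA32: 1914*4kB (UEM) 835*8kB (UEM) 390*16kB (UEM) 170*32kB (EM) "
         "34*64kB (M) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 28320kB")

print("Normal:", free_blocks_at_or_above(normal, 5))
print("DMA32: ", free_blocks_at_or_above(dma32, 5))
```

By this count the Normal zone had no free block of 128kB or larger and DMA32 had a single one, which looks more like fragmentation than a plain lack of free memory. Why that one DMA32 block was not used is not determined here.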

> I'll try to repeat the process with the 4.2.0-rc6 kernel also.

Hm I guess the memory stats for rc6 already rule out such high usage in ttm.

In rc7 it might be interesting to know how the page owner stats change after you
kill 1) the chrome/iceweasel processes, and then 2) the whole X. If the memory
is recovered, it might not be a full leak, but something like insufficient
shrinker response to memory pressure in the system.
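
One possible way to compare page owner stats across those steps is to reduce each snapshot to a mapping from allocation path to page count and diff the two. This is only a sketch: `diff_snapshots` is a hypothetical helper, the keys are whatever label you choose for a call path, and the MiB figure assumes 4 KiB pages and order-0 records.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages, order-0 records only


def diff_snapshots(before, after):
    """Return [(key, delta_pages, delta_mib)], largest release first.

    `before` and `after` map an allocation path label to its page count,
    e.g. taken from the "N times:" headers of two sorted page_owner dumps.
    """
    rows = []
    for key in set(before) | set(after):
        delta = after.get(key, 0) - before.get(key, 0)
        if delta:
            rows.append((key, delta, delta * PAGE_KIB / 1024))
    # Most negative delta (most pages given back) comes first.
    return sorted(rows, key=lambda row: row[1])
```

If the ttm_dma_populate entry shrinks substantially after X is killed, that would point towards memory that is held but recoverable rather than a true leak.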

Unless of course the drm devs have better ideas what to try...

> Arthur.
>