2013-09-13 00:51:39

by Fengguang Wu

Subject: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

Hi Peter,

FYI, we noticed much increased vmap_area_lock contentions since this
commit:

commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
Author: Peter Hurley <[email protected]>
Date: Sat Jun 15 10:21:19 2013 -0400

n_tty: Move buffers into n_tty_data

Reduce pointer reloading and improve locality-of-reference;
allocate read_buf and echo_buf within struct n_tty_data.

Signed-off-by: Peter Hurley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

:040000 040000 96d92e4e242c4b2ff11b25c005bccd093865b350 2822d87b2425c3e7adc7b722a20d739c9d4a3046 M drivers
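
For context, the change embeds the two 4 kB ldisc buffers (read_buf and echo_buf)
directly in struct n_tty_data and, because the structure then spans several pages,
allocates it with a single vmalloc() in the ldisc open path and frees it with
vfree() on close. A simplified sketch of the shape of the change (not the verbatim
patch; see the attached patch for that):

struct n_tty_data {
        /* ... ring-buffer indices, echo state, locks ... */
        char read_buf[N_TTY_BUF_SIZE];            /* 4 kB, now embedded */
        unsigned char echo_buf[N_TTY_BUF_SIZE];   /* 4 kB, now embedded */
};

static int n_tty_open(struct tty_struct *tty)
{
        struct n_tty_data *ldata;

        ldata = vmalloc(sizeof(*ldata));  /* previously a small kzalloc() plus
                                             separately allocated buffers */
        if (!ldata)
                return -ENOMEM;
        /* ... initialize ldata, tty->disc_data = ldata ... */
        return 0;
}

static void n_tty_close(struct tty_struct *tty)
{
        struct n_tty_data *ldata = tty->disc_data;

        /* ... */
        vfree(ldata);                     /* previously kfree() */
}

Every ldisc open/close therefore now goes through the vmalloc allocator, and with
it vmap_area_lock, which is the lock the statistics below are about.
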
bisect run success

# bad: [9cdfea961eeaf9b6b939fb9dc822a204a1d94d08] Merge remote-tracking branch 'ipvs-next/master' into devel-hourly-2013090112
# good: [d4e4ab86bcba5a72779c43dc1459f71fea3d89c8] Linux 3.11-rc5
git bisect start '9cdfea961eeaf9b6b939fb9dc822a204a1d94d08' 'd4e4ab86bcba5a72779c43dc1459f71fea3d89c8' '--'
# good: [903bc7e8c2a4f9ef8e40aed615ed48023f9460e3] Merge remote-tracking branch 'spi/for-next'
git bisect good 903bc7e8c2a4f9ef8e40aed615ed48023f9460e3
# bad: [d926318427c1fa8df182d3dcd7f802d7d0609225] Merge remote-tracking branch 'bcon/master'
git bisect bad d926318427c1fa8df182d3dcd7f802d7d0609225
# bad: [adcb2bcd0dd2bb88f21180c8ffb9ec2549f4d883] Merge remote-tracking branch 'usb/usb-next'
git bisect bad adcb2bcd0dd2bb88f21180c8ffb9ec2549f4d883
# good: [a83605266423437d8afe73cbd538df33cf8fb32b] Merge remote-tracking branch 'driver-core/driver-core-next'
git bisect good a83605266423437d8afe73cbd538df33cf8fb32b
# good: [224563b6ce034b82f8511969d9496113da34fb2c] Merge tag 'for-usb-next-2013-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into usb-next
git bisect good 224563b6ce034b82f8511969d9496113da34fb2c
# bad: [5284eba7b37dfeffe75bfdf81a13e9efebe0480a] serial/arc-uart: Handle Rx Error Interrupts w/o any data
git bisect bad 5284eba7b37dfeffe75bfdf81a13e9efebe0480a
# bad: [7d88d637a3c75257b7611f9feee0c17fa187ecc9] n_tty: Factor standard per-char i/o into separate fn
git bisect bad 7d88d637a3c75257b7611f9feee0c17fa187ecc9
# good: [11b9faa44df76189b8346ff602a2c01c610c37eb] tty: Merge tty_buffer_find() into tty_buffer_alloc()
git bisect good 11b9faa44df76189b8346ff602a2c01c610c37eb
# good: [019ebdf9f26fd2e43b9e1af576835183e95dc82e] n_tty: Eliminate echo_commit memory barrier
git bisect good 019ebdf9f26fd2e43b9e1af576835183e95dc82e
# bad: [1bb9d562856279a217a4e581a803dee9bb48a708] n_tty: Rename process_char_map to char_map
git bisect bad 1bb9d562856279a217a4e581a803dee9bb48a708
# good: [9dfd16ddea9bdbc8343340e543732db0a467ae32] n_tty: Avoid false-sharing echo buffer indices
git bisect good 9dfd16ddea9bdbc8343340e543732db0a467ae32
# good: [8cb06c983822103da1cfe57b9901e60a00e61f67] n_tty: Remove alias ptrs in __receive_buf()
git bisect good 8cb06c983822103da1cfe57b9901e60a00e61f67
# bad: [20bafb3d23d108bc0a896eb8b7c1501f4f649b77] n_tty: Move buffers into n_tty_data
git bisect bad 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
# first bad commit: [20bafb3d23d108bc0a896eb8b7c1501f4f649b77] n_tty: Move buffers into n_tty_data


Compare of all good/bad commits in this bisect:

lock_stat.vmap_area_lock.contentions

    [ASCII plot: the good commits (*) cluster around 4000 contentions, the bad commits (O) around 20000]


lock_stat.vmap_area_lock.contentions.get_vmalloc_info

    [ASCII plot: the good commits (*) cluster around 5000, the bad commits (O) around 40000]


Attachment: 0001-n_tty-Move-buffers-into-n_tty_data.patch (2.41 kB)

2013-09-13 01:09:49

by Fengguang Wu

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> Hi Peter,
>
> FYI, we noticed much increased vmap_area_lock contentions since this
> commit:
>
> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
> Author: Peter Hurley <[email protected]>
> Date: Sat Jun 15 10:21:19 2013 -0400
>
> n_tty: Move buffers into n_tty_data
>
> Reduce pointer reloading and improve locality-of-reference;
> allocate read_buf and echo_buf within struct n_tty_data.

Here are some comparisons between this commit [o] and its parent commit [*].

slabinfo.kmalloc-128.active_objs

    [per-testbox ASCII plots: this commit (O) runs visibly above the parent (*); exact values in the tables below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
976.67 +108.3% 2034.67 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
8971.36 +11.4% 9997.05 nhm-white/micro/aim7/exec_test
9948.03 +20.9% 12031.72 TOTAL slabinfo.kmalloc-128.active_objs

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
976.67 +108.3% 2034.67 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
9127.64 +11.4% 10164.15 nhm-white/micro/aim7/exec_test
10104.31 +20.7% 12198.82 TOTAL slabinfo.kmalloc-128.num_objs

lock_stat.vmap_area_lock.holdtime-total

    [one ASCII plot per testbox/testcase: this commit (O) is roughly 10x-27x the parent (*) in every case; exact values in the comparison table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
3470.31 +1631.0% 60070.49 fat/micro/dd-write/1HDD-cfq-ext4-10dd
3339.57 +1676.9% 59340.71 fat/micro/dd-write/1HDD-cfq-ext4-1dd
2848.79 +1489.1% 45269.54 lkp-a04/micro/netperf/120s-200%-TCP_CRR
3563.01 +1515.2% 57548.50 lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
2678.64 +1346.0% 38733.80 lkp-a04/micro/netperf/120s-200%-TCP_RR
2839.84 +1450.2% 44022.08 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
3417.37 +1571.4% 57116.34 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
2558.59 +1450.8% 39677.58 lkp-a04/micro/netperf/120s-200%-UDP_RR
3737.24 +1558.0% 61963.62 lkp-a04/micro/netperf/120s-200%-UDP_STREAM
20219.50 +1488.7% 321218.02 lkp-a06/crypto/tcrypt/2s-200-204
21017.17 +1457.1% 327257.41 lkp-a06/crypto/tcrypt/2s-205-210
22109.84 +1240.3% 296346.33 lkp-a06/crypto/tcrypt/2s-401-417
17909.60 +1467.3% 280693.71 lkp-a06/micro/dbench/100%
489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
12105.00 +2453.6% 309110.42 nhm-white/micro/aim7/dbase
822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
3452.91 +1685.5% 61650.74 nhm-white/micro/aim7/fserver
300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
2694.48 +1974.4% 55894.19 nhm-white/sysbench/oltp/100%-600-1000000
4415.67 +1202.2% 57501.52 nhm8/micro/dbench/100%
2284.65 +1505.2% 36672.75 snb-drag/crypto/tcrypt/2s-200-204
2446.02 +1537.1% 40042.87 snb-drag/crypto/tcrypt/2s-205-210
2484.11 +1599.6% 42219.71 snb-drag/crypto/tcrypt/2s-500-504
2118.55 +1155.8% 26604.99 vpx/crypto/tcrypt/2s-200-204
2713.48 +1198.5% 35234.77 vpx/crypto/tcrypt/2s-205-210
2711.31 +973.8% 29114.07 vpx/crypto/tcrypt/2s-301-319
2369.23 +940.3% 24648.12 vpx/crypto/tcrypt/2s-401-417
2620.64 +1428.7% 40060.71 vpx/crypto/tcrypt/2s-500-504
1713.98 +1624.3% 29553.72 vpx/crypto/tcrypt/2s-505-509
3423353.12 +1184.9% 43985148.08 TOTAL lock_stat.vmap_area_lock.holdtime-total

lock_stat.vmap_area_lock.contentions.get_vmalloc_info

    [per-testcase ASCII plots: this commit (O) is several times the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
4952.40 +447.0% 27090.40 lkp-ne04/micro/aim7/shell_rtns_1
28410.80 +556.2% 186423.00 lkp-snb01/micro/aim7/exec_test
8142.00 +615.4% 58247.33 nhm-white/micro/aim7/exec_test
1386.00 +762.6% 11955.20 nhm-white/micro/aim7/shell_rtns_1
42891.20 +561.5% 283715.93 TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info

lock_stat.vmap_area_lock.waittime-total

    [per-testcase ASCII plots: this commit (O) is one to two orders of magnitude above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
1990.41 +5378.7% 109049.26 lkp-ne04/micro/aim7/shell_rtns_1
18611.42 +5349.4% 1014205.59 lkp-snb01/micro/aim7/exec_test
4171.17 +5040.7% 214427.74 lkp-snb01/micro/aim7/shell_rtns_1
10.18 +14277.7% 1463.17 nhm-white/micro/aim7/dbase
1715.53 +12791.3% 221153.70 nhm-white/micro/aim7/exec_test
325.21 +15376.6% 50331.77 nhm-white/micro/aim7/shell_rtns_1
26823.92 +5904.5% 1610631.24 TOTAL lock_stat.vmap_area_lock.waittime-total



lock_stat.vmap_area_lock.contentions

    [per-testcase ASCII plots: this commit (O) is several times the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
2476.20 +447.0% 13545.20 lkp-ne04/micro/aim7/shell_rtns_1
14205.40 +556.2% 93211.50 lkp-snb01/micro/aim7/exec_test
4071.00 +615.4% 29123.67 nhm-white/micro/aim7/exec_test
20752.60 +554.8% 135880.37 TOTAL lock_stat.vmap_area_lock.contentions



lock_stat.vmap_area_lock.con-bounces

    [per-testcase ASCII plots: same pattern as the contentions above; exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
2476.20 +447.0% 13545.20 lkp-ne04/micro/aim7/shell_rtns_1
14205.40 +556.2% 93211.50 lkp-snb01/micro/aim7/exec_test
4071.00 +615.4% 29123.67 nhm-white/micro/aim7/exec_test
20752.60 +554.8% 135880.37 TOTAL lock_stat.vmap_area_lock.con-bounces

time.system_time

    [per-testcase ASCII plots: this commit (O) modestly above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
677.85 +7.4% 728.28 lkp-ne04/micro/aim7/shell_rtns_1
459.71 +2.4% 470.66 lkp-snb01/micro/aim7/exec_test
8.05 +80.1% 14.49 vpx/crypto/tcrypt/2s-505-509
1145.61 +5.9% 1213.42 TOTAL time.system_time



time.user_time

    [per-testcase ASCII plots: this commit (O) slightly above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
148.19 +7.8% 159.75 lkp-ne04/micro/aim7/shell_rtns_1
41.96 +3.6% 43.49 lkp-snb01/micro/aim7/exec_test
190.15 +6.9% 203.23 TOTAL time.user_time



iostat.cpu.idle

    [ASCII plot: this commit (O) slightly below the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
33.75 -1.3% 33.31 lkp-ne04/micro/aim7/shell_rtns_1
33.75 -1.3% 33.31 TOTAL iostat.cpu.idle



time.percent_of_cpu_this_job_got

    [per-testcase ASCII plots: this commit (O) slightly above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
510.40 +7.1% 546.60 lkp-ne04/micro/aim7/shell_rtns_1
244.60 +2.1% 249.80 lkp-snb01/micro/aim7/exec_test
755.00 +5.5% 796.40 TOTAL time.percent_of_cpu_this_job_got



iostat.cpu.system

    [per-testcase ASCII plots: this commit (O) marginally above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
32.94 +0.4% 33.07 lkp-a06/crypto/tcrypt/2s-200-204
33.31 +0.4% 33.44 lkp-a06/crypto/tcrypt/2s-205-210
54.91 +0.8% 55.33 lkp-ne04/micro/aim7/shell_rtns_1
121.17 +0.6% 121.83 TOTAL iostat.cpu.system



vmstat.system.in

    [per-testcase ASCII plots: this commit (O) slightly below the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
20557.20 -0.8% 20384.60 lkp-ne04/micro/aim7/shell_rtns_1
65159.10 -1.0% 64507.71 lkp-snb01/micro/aim7/exec_test
1088.00 -0.9% 1078.50 snb-drag/crypto/tcrypt/2s-200-204
86804.30 -1.0% 85970.81 TOTAL vmstat.system.in



time.elapsed_time

    [per-testcase ASCII plots: this commit (O) marginally above the parent (*); exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
600.51 +0.1% 601.28 fat/micro/dd-write/1HDD-cfq-ext4-10dd
204.81 +0.3% 205.38 lkp-snb01/micro/aim7/exec_test
159.50 +0.2% 159.77 lkp-snb01/micro/aim7/shell_rtns_1
271.47 +0.4% 272.49 nhm-white/micro/aim7/exec_test
1236.29 +0.2% 1238.92 TOTAL time.elapsed_time



vmstat.system.cs

    [per-testcase ASCII plots: small shifts in either direction; exact values in the table below]

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
18644.50 +0.3% 18702.67 lkp-a04/micro/netperf/120s-200%-UDP_RR
48938.00 -0.8% 48539.80 lkp-ne04/micro/aim7/shell_rtns_1
60751.18 -0.6% 60359.04 lkp-nex05/micro/aim7/exec_test
731.00 -0.4% 728.25 snb-drag/crypto/tcrypt/2s-200-204
129064.68 -0.6% 128329.76 TOTAL vmstat.system.cs

2013-09-13 03:16:51

by Greg Kroah-Hartman

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> Hi Peter,
>
> FYI, we noticed much increased vmap_area_lock contentions since this
> commit:

What does that mean? What is happening, are we allocating/removing more
memory now?

What type of load were you running that showed this problem?

thanks,

greg k-h

2013-09-13 03:38:13

by Fengguang Wu

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> > Hi Peter,
> >
> > FYI, we noticed much increased vmap_area_lock contentions since this
> > commit:
>
> What does that mean? What is happening, are we allocating/removing more
> memory now?

// leave this question to Peter and Tejun. :)

> What type of load were you running that showed this problem?

The increased contentions and lock hold/wait time showed up in a
number of test cases.

The second email has the details, and this section of data is the most
illustrative.

8cb06c983822103da1cf 20bafb3d23d108bc0a89
------------------------ ------------------------
3470.31 +1631.0% 60070.49 fat/micro/dd-write/1HDD-cfq-ext4-10dd
3339.57 +1676.9% 59340.71 fat/micro/dd-write/1HDD-cfq-ext4-1dd
2848.79 +1489.1% 45269.54 lkp-a04/micro/netperf/120s-200%-TCP_CRR
3563.01 +1515.2% 57548.50 lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
2678.64 +1346.0% 38733.80 lkp-a04/micro/netperf/120s-200%-TCP_RR
2839.84 +1450.2% 44022.08 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
3417.37 +1571.4% 57116.34 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
2558.59 +1450.8% 39677.58 lkp-a04/micro/netperf/120s-200%-UDP_RR
3737.24 +1558.0% 61963.62 lkp-a04/micro/netperf/120s-200%-UDP_STREAM
20219.50 +1488.7% 321218.02 lkp-a06/crypto/tcrypt/2s-200-204
21017.17 +1457.1% 327257.41 lkp-a06/crypto/tcrypt/2s-205-210
22109.84 +1240.3% 296346.33 lkp-a06/crypto/tcrypt/2s-401-417
17909.60 +1467.3% 280693.71 lkp-a06/micro/dbench/100%
489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
12105.00 +2453.6% 309110.42 nhm-white/micro/aim7/dbase
822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
3452.91 +1685.5% 61650.74 nhm-white/micro/aim7/fserver
300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
2694.48 +1974.4% 55894.19 nhm-white/sysbench/oltp/100%-600-1000000
4415.67 +1202.2% 57501.52 nhm8/micro/dbench/100%
2284.65 +1505.2% 36672.75 snb-drag/crypto/tcrypt/2s-200-204
2446.02 +1537.1% 40042.87 snb-drag/crypto/tcrypt/2s-205-210
2484.11 +1599.6% 42219.71 snb-drag/crypto/tcrypt/2s-500-504
2118.55 +1155.8% 26604.99 vpx/crypto/tcrypt/2s-200-204
2713.48 +1198.5% 35234.77 vpx/crypto/tcrypt/2s-205-210
2711.31 +973.8% 29114.07 vpx/crypto/tcrypt/2s-301-319
2369.23 +940.3% 24648.12 vpx/crypto/tcrypt/2s-401-417
2620.64 +1428.7% 40060.71 vpx/crypto/tcrypt/2s-500-504
1713.98 +1624.3% 29553.72 vpx/crypto/tcrypt/2s-505-509
3423353.12 +1184.9% 43985148.08 TOTAL lock_stat.vmap_area_lock.holdtime-total

The format of the report is:

parent commit commit
------------------------ ------------------------
number %change number testbox/testcase/test-params
...
...
...
total number %change total number TOTAL perf-metric-name being compared

Thanks,
Fengguang

2013-09-13 03:44:06

by Greg Kroah-Hartman

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> > On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> > > Hi Peter,
> > >
> > > FYI, we noticed much increased vmap_area_lock contentions since this
> > > commit:
> >
> > What does that mean? What is happening, are we allocating/removing more
> > memory now?
>
> // leave this question to Peter and Tejun. :)
>
> > What type of load were you running that showed this problem?
>
> The increased contentions and lock hold/wait time showed up in a
> number of test cases.
>
> The second email has the details, and this section of data is most
> illustrating.
>
> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> ------------------------ ------------------------
> 3470.31 +1631.0% 60070.49 fat/micro/dd-write/1HDD-cfq-ext4-10dd
> 3339.57 +1676.9% 59340.71 fat/micro/dd-write/1HDD-cfq-ext4-1dd
> 2848.79 +1489.1% 45269.54 lkp-a04/micro/netperf/120s-200%-TCP_CRR
> 3563.01 +1515.2% 57548.50 lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
> 2678.64 +1346.0% 38733.80 lkp-a04/micro/netperf/120s-200%-TCP_RR
> 2839.84 +1450.2% 44022.08 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
> 3417.37 +1571.4% 57116.34 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
> 2558.59 +1450.8% 39677.58 lkp-a04/micro/netperf/120s-200%-UDP_RR
> 3737.24 +1558.0% 61963.62 lkp-a04/micro/netperf/120s-200%-UDP_STREAM
> 20219.50 +1488.7% 321218.02 lkp-a06/crypto/tcrypt/2s-200-204
> 21017.17 +1457.1% 327257.41 lkp-a06/crypto/tcrypt/2s-205-210
> 22109.84 +1240.3% 296346.33 lkp-a06/crypto/tcrypt/2s-401-417
> 17909.60 +1467.3% 280693.71 lkp-a06/micro/dbench/100%
> 489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
> 1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
> 12105.00 +2453.6% 309110.42 nhm-white/micro/aim7/dbase
> 822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
> 9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
> 3452.91 +1685.5% 61650.74 nhm-white/micro/aim7/fserver
> 300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
> 345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
> 2694.48 +1974.4% 55894.19 nhm-white/sysbench/oltp/100%-600-1000000
> 4415.67 +1202.2% 57501.52 nhm8/micro/dbench/100%
> 2284.65 +1505.2% 36672.75 snb-drag/crypto/tcrypt/2s-200-204
> 2446.02 +1537.1% 40042.87 snb-drag/crypto/tcrypt/2s-205-210
> 2484.11 +1599.6% 42219.71 snb-drag/crypto/tcrypt/2s-500-504
> 2118.55 +1155.8% 26604.99 vpx/crypto/tcrypt/2s-200-204
> 2713.48 +1198.5% 35234.77 vpx/crypto/tcrypt/2s-205-210
> 2711.31 +973.8% 29114.07 vpx/crypto/tcrypt/2s-301-319
> 2369.23 +940.3% 24648.12 vpx/crypto/tcrypt/2s-401-417
> 2620.64 +1428.7% 40060.71 vpx/crypto/tcrypt/2s-500-504
> 1713.98 +1624.3% 29553.72 vpx/crypto/tcrypt/2s-505-509
> 3423353.12 +1184.9% 43985148.08 TOTAL lock_stat.vmap_area_lock.holdtime-total
>
> The format of report is
>
> parent commit commit
> ------------------------ ------------------------
> number %change number testbox/testcase/test-params
> ...
> ...
> ...
> total number %change total number TOTAL perf-metric-name being compared

That's a lot of slowdowns, especially for such a simple patch.

Peter, any ideas?

2013-09-13 09:55:55

by Peter Hurley

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/12/2013 11:44 PM, Greg KH wrote:
> On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
>> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
>>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>>> Hi Peter,
>>>>
>>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>>> commit:
>>>
>>> What does that mean? What is happening, are we allocating/removing more
>>> memory now?

No. Same amount of memory, allocated and freed with the same frequency as
before.

>>> What type of load were you running that showed this problem?
>>
>> The increased contentions and lock hold/wait time showed up in a
>> number of test cases.

[...]

> That's a lot of slowdowns, especially for such a simple patch.
>
> Peter, any ideas?

Looks like this patch incidentally triggers some worst-case behavior in
the memory manager. I'm not sure how this is possible with two 4k buffers,
but the evidence is substantial.

This patch isn't critical, so I suggest we back it out for mainline but keep
using it to find out what's wrong in the vmap area.

Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST)
so I'll get a revert to you then. Sorry 'bout that.

Regards,
Peter Hurley

2013-09-13 12:33:55

by Greg Kroah-Hartman

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Fri, Sep 13, 2013 at 05:55:47AM -0400, Peter Hurley wrote:
> On 09/12/2013 11:44 PM, Greg KH wrote:
> > On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
> >> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> >>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>>> Hi Peter,
> >>>>
> >>>> FYI, we noticed much increased vmap_area_lock contentions since this
> >>>> commit:
> >>>
> >>> What does that mean? What is happening, are we allocating/removing more
> >>> memory now?
>
> No. Same amount of memory, allocated and freed with the same frequency as
> before.

Ok, I thought so.

> >>> What type of load were you running that showed this problem?
> >>
> >> The increased contentions and lock hold/wait time showed up in a
> >> number of test cases.
>
> [...]
>
> > That's a lot of slowdowns, especially for such a simple patch.
> >
> > Peter, any ideas?
>
> Looks like this patch incidentally triggers some worst-case behavior in
> the memory manager. I'm not sure how this is possible with two 4k buffers,
> but the evidence is substantial.
>
> This patch isn't critical so I suggest we back out this patch for mainline
> but use the patch to find out what's wrong in the vmap area.
>
> Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST)
> so I'll get a revert to you then. Sorry 'bout that.

No rush, we have plenty of time.

I think it would be good to track down the real root cause of this, so
that the allocators don't run into this problem again with some other
innocuous change.

thanks,

greg k-h

>
> Regards,
> Peter Hurley

2013-09-17 02:42:19

by Peter Hurley

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/12/2013 11:38 PM, Fengguang Wu wrote:
> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>> Hi Peter,
>>>
>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>> commit:
>>
>> What does that mean? What is happening, are we allocating/removing more
>> memory now?
>
> // leave this question to Peter and Tejun. :)
>
>> What type of load were you running that showed this problem?
>
> The increased contentions and lock hold/wait time showed up in a
> number of test cases.

How is the raw data acquired?

For example, is the test apparatus reading from /proc/vmallocinfo
(which claims the vmap_area_lock) during the test execution?

I'm wondering if a Heisenberg effect is happening here.

Regards,
Peter Hurley


> The second email has the details, and this section of data is most
> illustrating.
>
> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> ------------------------ ------------------------
> 3470.31 +1631.0% 60070.49 fat/micro/dd-write/1HDD-cfq-ext4-10dd
> 3339.57 +1676.9% 59340.71 fat/micro/dd-write/1HDD-cfq-ext4-1dd
> 2848.79 +1489.1% 45269.54 lkp-a04/micro/netperf/120s-200%-TCP_CRR
> 3563.01 +1515.2% 57548.50 lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
> 2678.64 +1346.0% 38733.80 lkp-a04/micro/netperf/120s-200%-TCP_RR
> 2839.84 +1450.2% 44022.08 lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
> 3417.37 +1571.4% 57116.34 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
> 2558.59 +1450.8% 39677.58 lkp-a04/micro/netperf/120s-200%-UDP_RR
> 3737.24 +1558.0% 61963.62 lkp-a04/micro/netperf/120s-200%-UDP_STREAM
> 20219.50 +1488.7% 321218.02 lkp-a06/crypto/tcrypt/2s-200-204
> 21017.17 +1457.1% 327257.41 lkp-a06/crypto/tcrypt/2s-205-210
> 22109.84 +1240.3% 296346.33 lkp-a06/crypto/tcrypt/2s-401-417
> 17909.60 +1467.3% 280693.71 lkp-a06/micro/dbench/100%
> 489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
> 1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
> 12105.00 +2453.6% 309110.42 nhm-white/micro/aim7/dbase
> 822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
> 9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
> 3452.91 +1685.5% 61650.74 nhm-white/micro/aim7/fserver
> 300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
> 345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
> 2694.48 +1974.4% 55894.19 nhm-white/sysbench/oltp/100%-600-1000000
> 4415.67 +1202.2% 57501.52 nhm8/micro/dbench/100%
> 2284.65 +1505.2% 36672.75 snb-drag/crypto/tcrypt/2s-200-204
> 2446.02 +1537.1% 40042.87 snb-drag/crypto/tcrypt/2s-205-210
> 2484.11 +1599.6% 42219.71 snb-drag/crypto/tcrypt/2s-500-504
> 2118.55 +1155.8% 26604.99 vpx/crypto/tcrypt/2s-200-204
> 2713.48 +1198.5% 35234.77 vpx/crypto/tcrypt/2s-205-210
> 2711.31 +973.8% 29114.07 vpx/crypto/tcrypt/2s-301-319
> 2369.23 +940.3% 24648.12 vpx/crypto/tcrypt/2s-401-417
> 2620.64 +1428.7% 40060.71 vpx/crypto/tcrypt/2s-500-504
> 1713.98 +1624.3% 29553.72 vpx/crypto/tcrypt/2s-505-509
> 3423353.12 +1184.9% 43985148.08 TOTAL lock_stat.vmap_area_lock.holdtime-total
>
> The format of report is
>
> parent commit commit
> ------------------------ ------------------------
> number %change number testbox/testcase/test-params
> ...
> ...
> ...
> total number %change total number TOTAL perf-metric-name being compared
>
> Thanks,
> Fengguang
>

2013-09-17 02:56:49

by Fengguang Wu

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Mon, Sep 16, 2013 at 10:42:11PM -0400, Peter Hurley wrote:
> On 09/12/2013 11:38 PM, Fengguang Wu wrote:
> >On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> >>On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>>Hi Peter,
> >>>
> >>>FYI, we noticed much increased vmap_area_lock contentions since this
> >>>commit:
> >>
> >>What does that mean? What is happening, are we allocating/removing more
> >>memory now?
> >
> >// leave this question to Peter and Tejun. :)
> >
> >>What type of load were you running that showed this problem?
> >
> >The increased contentions and lock hold/wait time showed up in a
> >number of test cases.
>
> How is the raw data acquired?
>
> For example, is the test apparatus reading from /proc/vmallocinfo
> (which claims the vmap_area_lock) during the test execution?

We don't read /proc/vmallocinfo; we only read /proc/lock_stat once, when the
test ends.

> I'm wondering if a Heisenberg effect is happening here.

Thanks,
Fengguang

2013-09-17 15:34:27

by Peter Hurley

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>> Hi Peter,
>>
>> FYI, we noticed much increased vmap_area_lock contentions since this
>> commit:
>>
>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>> Author: Peter Hurley <[email protected]>
>> Date: Sat Jun 15 10:21:19 2013 -0400
>>
>> n_tty: Move buffers into n_tty_data
>>
>> Reduce pointer reloading and improve locality-of-reference;
>> allocate read_buf and echo_buf within struct n_tty_data.
>
> Here are some comparison between this commit [o] with its parent commit [*].

Hi Fenguang,

Can you give the particulars of the aim7 test runs below?
I ask because I get _no_ added contention on the vmap_area_lock when I run
these tests on a dual-socket xeon.

What is the machine configuration(s)?
Are you using the aim7 'multitask' test driver or your own custom driver?
What is the load configuration (i.e., constant, linearly increasing, convergence)?
How many loads are you simulating?

lock_stat.vmap_area_lock.holdtime-total
[...]
> 489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
> 1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
[...]
> 822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
> 9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
[...]
> 300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
> 345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1


None of the tests below execute a code path that leads to get_vmalloc_info().
The only in-kernel user of get_vmalloc_info() is a sysfs read of /proc/meminfo,
which none of the tests below perform.

What is reading /proc/meminfo?
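
For reference, get_vmalloc_info() produces the VmallocUsed/VmallocChunk figures in
/proc/meminfo by walking the global vmap area list with vmap_area_lock held, so
every read of /proc/meminfo serializes against every vmalloc()/vfree() in the
system. A greatly simplified sketch of that path (paraphrased from the 3.11-era
mm/vmalloc.c, not copied verbatim):

void get_vmalloc_info(struct vmalloc_info *vmi)
{
        struct vmap_area *va;
        unsigned long prev_end = VMALLOC_START;

        vmi->used = 0;
        vmi->largest_chunk = 0;

        spin_lock(&vmap_area_lock);       /* same lock vmalloc()/vfree() take */

        list_for_each_entry(va, &vmap_area_list, list) {
                unsigned long hole = va->va_start - prev_end;

                /* track the largest free hole between mapped areas */
                if (va->va_start >= prev_end && hole > vmi->largest_chunk)
                        vmi->largest_chunk = hole;

                vmi->used += va->va_end - va->va_start;
                prev_end = va->va_end;
        }

        spin_unlock(&vmap_area_lock);     /* held for the entire list walk */
}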

Regards,
Peter Hurley

> lock_stat.vmap_area_lock.contentions.get_vmalloc_info
>
> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> ------------------------ ------------------------
> 4952.40 +447.0% 27090.40 lkp-ne04/micro/aim7/shell_rtns_1
> 28410.80 +556.2% 186423.00 lkp-snb01/micro/aim7/exec_test
> 8142.00 +615.4% 58247.33 nhm-white/micro/aim7/exec_test
> 1386.00 +762.6% 11955.20 nhm-white/micro/aim7/shell_rtns_1
> 42891.20 +561.5% 283715.93 TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info

2013-09-17 23:22:19

by Fengguang Wu

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Tue, Sep 17, 2013 at 11:34:21AM -0400, Peter Hurley wrote:
> On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> >On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>Hi Peter,
> >>
> >>FYI, we noticed much increased vmap_area_lock contentions since this
> >>commit:
> >>
> >>commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
> >>Author: Peter Hurley <[email protected]>
> >>Date: Sat Jun 15 10:21:19 2013 -0400
> >>
> >> n_tty: Move buffers into n_tty_data
> >>
> >> Reduce pointer reloading and improve locality-of-reference;
> >> allocate read_buf and echo_buf within struct n_tty_data.
> >
> >Here are some comparison between this commit [o] with its parent commit [*].
>
> Hi Fenguang,
>
> Can you give the particulars of the aim7 test runs below?
> I ask because I get _no_ added contention on the vmap_area_lock when I run
> these tests on a dual-socket xeon.
>
> What is the machine configuration(s)?
> Are you using the aim7 'multitask' test driver or your own custom driver?
> What is the load configuration (ie., constant, linearly increasing, convergence)?
> How many loads are you simulating?

The aim7 tests are basically

(
echo $HOSTNAME
echo $workfile

echo 1
echo 2000
echo 2
echo 2000
echo 1
) | ./multitask -t

> lock_stat.vmap_area_lock.holdtime-total
> [...]
> > 489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
> > 1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
> [...]
> > 822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
> > 9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
> [...]
> > 300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
> > 345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
>
>
> None of the tests below execute a code path that leads to get_vmalloc_info().
> The only in-kernel user of get_vmalloc_info() is a sysfs read of /proc/meminfo,
> which none of the tests below perform.
>
> What is reading /proc/meminfo?

Good point! That may explain it: I'm running a

loop:
cat /proc/meminfo
sleep 1

in all the tests.

Thanks,
Fengguang

> > lock_stat.vmap_area_lock.contentions.get_vmalloc_info
> >
> > 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> >------------------------ ------------------------
> > 4952.40 +447.0% 27090.40 lkp-ne04/micro/aim7/shell_rtns_1
> > 28410.80 +556.2% 186423.00 lkp-snb01/micro/aim7/exec_test
> > 8142.00 +615.4% 58247.33 nhm-white/micro/aim7/exec_test
> > 1386.00 +762.6% 11955.20 nhm-white/micro/aim7/shell_rtns_1
> > 42891.20 +561.5% 283715.93 TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info

2013-09-18 00:22:47

by Peter Hurley

Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/17/2013 07:22 PM, Fengguang Wu wrote:
> On Tue, Sep 17, 2013 at 11:34:21AM -0400, Peter Hurley wrote:
>> On 09/12/2013 09:09 PM, Fengguang Wu wrote:
>>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>>> Hi Peter,
>>>>
>>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>>> commit:
>>>>
>>>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>>>> Author: Peter Hurley <[email protected]>
>>>> Date: Sat Jun 15 10:21:19 2013 -0400
>>>>
>>>> n_tty: Move buffers into n_tty_data
>>>>
>>>> Reduce pointer reloading and improve locality-of-reference;
>>>> allocate read_buf and echo_buf within struct n_tty_data.
>>>
>>> Here are some comparison between this commit [o] with its parent commit [*].
>>
>> Hi Fengguang,

Sorry for misspelling your name earlier. Fixed.

>> Can you give the particulars of the aim7 test runs below?
>> I ask because I get _no_ added contention on the vmap_area_lock when I run
>> these tests on a dual-socket xeon.
>>
>> What is the machine configuration(s)?
>> Are you using the aim7 'multitask' test driver or your own custom driver?
>> What is the load configuration (i.e., constant, linearly increasing, convergence)?
>> How many loads are you simulating?
>
> The aim7 tests are basically
>
> (
> echo $HOSTNAME
> echo $workfile
>
> echo 1
> echo 2000
> echo 2
> echo 2000
> echo 1
> ) | ./multitask -t

Thanks for the profile. I ran the aim7 tests with these load parameters (2000 loads!)
and saw no significant contention on the vmap_area_lock (162 contentions).

I had to run a subset of the aim7 tests (just those below) because I don't have
anything fast enough to simulate 2000 loads on the entire workfile.shared testsuite.


>> lock_stat.vmap_area_lock.holdtime-total
>> [...]
>>> 489739.50 +978.5% 5281916.05 lkp-ne04/micro/aim7/shell_rtns_1
>>> 1601675.63 +906.7% 16123642.52 lkp-snb01/micro/aim7/exec_test
>> [...]
>>> 822461.02 +1585.0% 13858430.62 nhm-white/micro/aim7/exec_test
>>> 9858.11 +2715.9% 277595.41 nhm-white/micro/aim7/fork_test
>> [...]
>>> 300.14 +2621.5% 8168.53 nhm-white/micro/aim7/misc_rtns_1
>>> 345479.21 +1624.5% 5957828.25 nhm-white/micro/aim7/shell_rtns_1
>>
>>
>> None of the tests below execute a code path that leads to get_vmalloc_info().
>> The only in-kernel user of get_vmalloc_info() is a procfs read of /proc/meminfo,
>> which none of the tests below perform.
>>
>> What is reading /proc/meminfo?
>
> Good point! That may explain it: I'm running a
>
> while true; do
>         cat /proc/meminfo
>         sleep 1
> done
>
> in all the tests.

Yep. That's what's creating the contention -- while the aim7 test is creating
ttys for each and every process (exec_test, shell_rtns_1, ...), the read of
/proc/meminfo is contending with the allocations/frees of 2000 tty ldisc buffers.

Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
could definitely be reduced (even nearly eliminated), but that's a project for
another day :)

Regards,
Peter Hurley


>>> lock_stat.vmap_area_lock.contentions.get_vmalloc_info
>>>
>>> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
>>> ------------------------ ------------------------
>>> 4952.40 +447.0% 27090.40 lkp-ne04/micro/aim7/shell_rtns_1
>>> 28410.80 +556.2% 186423.00 lkp-snb01/micro/aim7/exec_test
>>> 8142.00 +615.4% 58247.33 nhm-white/micro/aim7/exec_test
>>> 1386.00 +762.6% 11955.20 nhm-white/micro/aim7/shell_rtns_1
>>> 42891.20 +561.5% 283715.93 TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info

2013-09-18 00:49:58

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>> Hi Peter,
>>
>> FYI, we noticed much increased vmap_area_lock contentions since this
>> commit:
>>
>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>> Author: Peter Hurley <[email protected]>
>> Date: Sat Jun 15 10:21:19 2013 -0400
>>
>> n_tty: Move buffers into n_tty_data
>>
>> Reduce pointer reloading and improve locality-of-reference;
>> allocate read_buf and echo_buf within struct n_tty_data.
>
> Here are some comparisons between this commit [o] and its parent commit [*].

[...]

> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> ------------------------ ------------------------
> 976.67 +108.3% 2034.67 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
> 8971.36 +11.4% 9997.05 nhm-white/micro/aim7/exec_test
> 9948.03 +20.9% 12031.72 TOTAL slabinfo.kmalloc-128.active_objs
>
> 8cb06c983822103da1cf 20bafb3d23d108bc0a89
> ------------------------ ------------------------
> 976.67 +108.3% 2034.67 lkp-a04/micro/netperf/120s-200%-TCP_STREAM
> 9127.64 +11.4% 10164.15 nhm-white/micro/aim7/exec_test
> 10104.31 +20.7% 12198.82 TOTAL slabinfo.kmalloc-128.num_objs

The dramatic increase in 128-byte kmalloc objects comes from the vmalloc bookkeeping
overhead associated with each allocation. On x86_64, struct vmap_area is 104 bytes,
rounded up to a 128-byte slab object, and one is allocated with every vmalloc
allocation. That works out to roughly 1% overhead (128 bytes of bookkeeping for each
~12 KiB vmalloc'd n_tty_data), which seems high to me.

The reason this is still visible after the test completes is that the vmap areas are
lazily reclaimed (see mm/vmalloc.c:__purge_vmap_area_lazy()).

Given the ~1% memory overhead coupled with the unwanted vmap_area_lock contention
(even though it is test-induced), I might revert this anyway.

Regards,
Peter Hurley

2013-09-25 09:04:24

by Ming Lin

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <[email protected]> wrote:
[snip]
>
> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> could definitely be reduced (even nearly eliminated), but that's a project
> for
> another day :)

Hi Peter,

I also looked over vmalloc.c, but didn't find an obvious way to reduce
the critical section footprint.
Could you share some hints on how to do this?

Thanks,
Lin Ming

>
> Regards,
> Peter Hurley

2013-09-25 11:30:46

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/25/2013 05:04 AM, Lin Ming wrote:
> On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <[email protected]> wrote:
> [snip]
>>
>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>> could definitely be reduced (even nearly eliminated), but that's a project
>> for
>> another day :)
>
> Hi Peter,
>
> I also looked over vmalloc.c, but didn't find an obvious way to reduce
> the critical section footprint.
> Could you share some hints on how to do this?

vmap_area_list is an RCU list.
get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
Look at __purge_vmap_area_lazy() for a howto.
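
For illustration, here is a minimal sketch of the pattern being pointed at: walk the
RCU-protected vmap_area_list under rcu_read_lock() instead of the writers' spinlock.
The function name (vmalloc_used_rcu) and the accounting it does are illustrative only,
not kernel code; Lin Ming's actual patch appears later in the thread.

#include <linux/rculist.h>
#include <linux/vmalloc.h>

/*
 * Sketch only: sum the size of every vmap area without taking
 * vmap_area_lock, relying on vmap_area_list being RCU-protected
 * (see __purge_vmap_area_lazy() for the in-tree example).
 */
static unsigned long vmalloc_used_rcu(void)
{
	struct vmap_area *va;
	unsigned long used = 0;

	rcu_read_lock();
	list_for_each_entry_rcu(va, &vmap_area_list, list)
		used += va->va_end - va->va_start;
	rcu_read_unlock();

	return used;
}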

Regards,
Peter Hurley

2013-09-25 14:53:25

by Ming Lin

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Wed, Sep 25, 2013 at 7:30 PM, Peter Hurley <[email protected]> wrote:
> On 09/25/2013 05:04 AM, Lin Ming wrote:
>>
>> On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <[email protected]>
>> wrote:
>> [snip]
>>>
>>>
>>> Looking over vmalloc.c, the critical section footprint of the
>>> vmap_area_lock
>>> could definitely be reduced (even nearly eliminated), but that's a
>>> project
>>> for
>>> another day :)
>>
>>
>> Hi Peter,
>>
>> I also looked over vmalloc.c, but didn't find an obvious way to reduce
>> the critical section footprint.
>> Could you share some hints on how to do this?
>
>
> vmap_area_list is an RCU list.
> get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
> Look at __purge_vmap_area_lazy() for a howto.

Looking at it now ...

Thanks,
Lin Ming

>
> Regards,
> Peter Hurley
>

2013-09-25 16:02:43

by Ming Lin

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Wed, 2013-09-25 at 07:30 -0400, Peter Hurley wrote:
> On 09/25/2013 05:04 AM, Lin Ming wrote:
> > On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <[email protected]> wrote:
> > [snip]
> >>
> >> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> >> could definitely be reduced (even nearly eliminated), but that's a project
> >> for
> >> another day :)
> >
> > Hi Peter,
> >
> > I also looked over vmalloc.c, but didn't find an obvious way to reduce
> > the critical section footprint.
> > Could you share some hints on how to do this?
>
> vmap_area_list is an RCU list.
> get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
> Look at __purge_vmap_area_lazy() for a howto.

Would you like the patch below?

From: Lin Ming <[email protected]>
Date: Wed, 25 Sep 2013 23:48:19 +0800
Subject: [PATCH] mm/vmalloc.c: eliminate vmap_area_lock in get_vmalloc_info()

vmap_area_list is an RCU list.
get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
Use RCU to protect list scan.

Signed-off-by: Lin Ming <[email protected]>
---
mm/vmalloc.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 13a5495..4523c9c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2684,14 +2684,14 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 
 	prev_end = VMALLOC_START;
 
-	spin_lock(&vmap_area_lock);
+	rcu_read_lock();
 
 	if (list_empty(&vmap_area_list)) {
 		vmi->largest_chunk = VMALLOC_TOTAL;
 		goto out;
 	}
 
-	list_for_each_entry(va, &vmap_area_list, list) {
+	list_for_each_entry_rcu(va, &vmap_area_list, list) {
 		unsigned long addr = va->va_start;
 
 		/*
@@ -2718,7 +2718,7 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 		vmi->largest_chunk = VMALLOC_END - prev_end;
 
 out:
-	spin_unlock(&vmap_area_lock);
+	rcu_read_unlock();
 }
 #endif

--
1.7.2.5



>
> Regards,
> Peter Hurley
>

2013-09-26 03:20:39

by Andi Kleen

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

Lin Ming <[email protected]> writes:
>
> Would you like the patch below?

The loop body keeps rather complex state. It could easily
get confused by parallel RCU changes.

So if the list changes in parallel you may suddenly
report very bogus values, as the va_start - prev_end
computation may be bogus.

Perhaps it's ok (may report bogus gaps), but it seems a bit risky.

-Andi

--
[email protected] -- Speaking for myself only

2013-09-26 07:33:08

by Andrew Morton

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <[email protected]> wrote:

> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> could definitely be reduced (even nearly eliminated), but that's a project for
> another day :)

20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
kmalloc (which is very fast) to a vmalloc (which is very slow) without
so much as mentioning it in the changelog. This should have been
picked up at review, btw.

Revert that part of the patch and the problem will be solved.

If we are really really worried that a ~9k kmalloc might fail or will
be slow, then implement a fallback to vmalloc() if kmalloc(__GFP_NOWARN)
failed. This kinda sucks, but is practical, but really should only be
done if necessary - ie, if problems with using plain old kmalloc are
demonstrable.
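
For concreteness, a minimal sketch of that kmalloc-with-vmalloc-fallback (the helper
names n_tty_alloc_buf/n_tty_free_buf are illustrative, not from any posted patch; the
same shape was later standardized in the kernel as kvmalloc()/kvfree()):

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

/*
 * Sketch of the fallback: try the slab allocator first, without warning
 * on failure, and only fall back to vmalloc() when a physically
 * contiguous allocation cannot be satisfied.
 */
static void *n_tty_alloc_buf(size_t size)
{
	void *p = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);

	if (!p)
		p = vmalloc(size);
	return p;
}

static void n_tty_free_buf(void *p)
{
	/* is_vmalloc_addr() tells the two allocation paths apart at free time */
	if (is_vmalloc_addr(p))
		vfree(p);
	else
		kfree(p);
}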

Or just revert all of 20bafb3d23d10 - it was supposed to be a small
performance improvement but turned out to be a significant performance
loss. Therefore zap.

2013-09-26 11:31:52

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/26/2013 03:33 AM, Andrew Morton wrote:
> On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <[email protected]> wrote:
>
>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>> could definitely be reduced (even nearly eliminated), but that's a project for
>> another day :)
>
> 20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
> kmalloc (which is very fast) to a vmalloc (which is very slow) without
> so much as mentioning it in the changelog. This should have been
> picked up at review, btw.
>
> Revert that part of the patch and the problem will be solved.
>
> If we are really really worried that a ~9k kmalloc might fail or will
> be slow, then implement a fallback to vmalloc() if kmalloc(__GFP_NOWARN)
> failed. This kinda sucks, but is practical, but really should only be
> done if necessary - ie, if problems with using plain old kmalloc are
> demonstrable.
>
> Or just revert all of 20bafb3d23d10 - it was supposed to be a small
> performance improvement but turned out to be a significant performance
> loss. Therefore zap.

I have no particular objection to reverting the entire patch.

However, it's a mischaracterization to suggest that the reason is that
vmalloc() is very slow; without reading /proc/meminfo,
there is no performance loss.

IOW, the lock contention this patch precipitated needs to get fixed
regardless.


Regards,
Peter Hurley

2013-09-26 11:52:27

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/25/2013 11:20 PM, Andi Kleen wrote:
> Lin Ming <[email protected]> writes:
>>
>> Would you like the patch below?
>
> The loop body keeps rather complex state. It could easily
> get confused by parallel RCU changes.
>
> So if the list changes in parallel you may suddenly
> report very bogus values, as the va_start - prev_end
> computation may be bogus.
>
> Perhaps it's ok (may report bogus gaps), but it seems a bit risky.

I don't understand how the computed gap would be bogus; there
_was_ a list state in which that particular gap existed. The fact
that it may not exist anymore can also happen in the existing
algorithm the instant get_vmalloc_info() drops the vmap_area_lock.

OTOH, parallel list changes could cause an rcu-based get_vmalloc_info()
to over-report or under-report used memory.

If this is a problem in practice, then usage and largest chunk
should be tracked by the allocator instead, obviating the need for
get_vmalloc_info() to traverse the vmap_area_list at all.

Regards,
Peter Hurley

2013-09-26 15:04:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Thu, Sep 26, 2013 at 07:31:47AM -0400, Peter Hurley wrote:
> On 09/26/2013 03:33 AM, Andrew Morton wrote:
> >On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <[email protected]> wrote:
> >
> >>Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> >>could definitely be reduced (even nearly eliminated), but that's a project for
> >>another day :)
> >
> >20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
> >kmalloc (which is very fast) to a vmalloc (which is very slow) without
> >so much as mentioning it in the changelog. This should have been
> >picked up at review, btw.
> >
> >Revert that part of the patch and the problem will be solved.
> >
> >If we are really really worried that a ~9k kmalloc might fail or will
> >be slow, then implement a fallback to vmalloc() if kmalloc(__GFP_NOWARN)
> >failed. This kinda sucks, but is practical, but really should only be
> >done if necessary - ie, if problems with using plain old kmalloc are
> >demonstrable.
> >
> >Or just revert all of 20bafb3d23d10 - it was supposed to be a small
> >performance improvement but turned out to be a significant performance
> >loss. Therefore zap.
>
> I have no particular objection to reverting the entire patch.

How about just switching the call to vmalloc to kmalloc? Yes, it's a
larger size that is being allocated here, but we were allocating that
much memory anyway before, so it should be the same "speed", if not
faster than before (1 call to kmalloc instead of 3).

thanks,

greg k-h

2013-09-26 15:32:12

by Andi Kleen

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Thu, Sep 26, 2013 at 07:52:23AM -0400, Peter Hurley wrote:
> On 09/25/2013 11:20 PM, Andi Kleen wrote:
> >Lin Ming <[email protected]> writes:
> >>
> >>Would you like the patch below?
> >
> >The loop body keeps rather complex state. It could easily
> >get confused by parallel RCU changes.
> >
> >So if the list changes in parallel you may suddenly
> >report very bogus values, as the va_start - prev_end
> >computation may be bogus.
> >
> >Perhaps it's ok (may report bogus gaps), but it seems a bit risky.
>
> I don't understand how the computed gap would be bogus; there
> _was_ a list state in which that particular gap existed. The fact

It could change any time as you don't have an atomic view
of vm_end / vm_start. It is valid to change the fields
with the lock held.

-Andi

2013-09-26 17:22:16

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/26/2013 11:32 AM, Andi Kleen wrote:
> On Thu, Sep 26, 2013 at 07:52:23AM -0400, Peter Hurley wrote:
>> On 09/25/2013 11:20 PM, Andi Kleen wrote:
>>> Lin Ming <[email protected]> writes:
>>>>
>>>> Would you like the patch below?
>>>
>>> The loop body keeps rather complex state. It could easily
>>> get confused by parallel RCU changes.
>>>
>>> So if the list changes in parallel you may suddenly
>>> report very bogus values, as the va_start - prev_end
>>> computation may be bogus.
>>>
>>> Perhaps it's ok (may report bogus gaps), but it seems a bit risky.
>>
>> I don't understand how the computed gap would be bogus; there
>> _was_ a list state in which that particular gap existed. The fact
>
> It could change any time as you don't have an atomic view
> of vm_end / vm_start. It is valid to change the fields
> with the lock held.

va_start and va_end are constant for the lifetime of their vmap_area
(if it's accessible by traversing the vmap_area_list), so it is
not possible for an rcu-based list traversal to see different
values of these individual fields than the spin-locked version.

In addition, for the rcu-based traversal to have arrived at any given
vmap_area requires that the previous vmap_area was its adjacent
lower range at the instant in time when the list cursor was advanced;
again, this is no different than if the spin-locked version had
happened to begin at that same instant.

Regards,
Peter Hurley


2013-09-26 17:35:37

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/26/2013 11:04 AM, Greg KH wrote:
> On Thu, Sep 26, 2013 at 07:31:47AM -0400, Peter Hurley wrote:
>> On 09/26/2013 03:33 AM, Andrew Morton wrote:
>>> On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <[email protected]> wrote:
>>>
>>>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>>>> could definitely be reduced (even nearly eliminated), but that's a project for
>>>> another day :)
>>>
>>> 20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
>>> kmalloc (which is very fast) to a vmalloc (which is very slow) without
>>> so much as mentioning it in the changelog. This should have been
>>> picked up at review, btw.
>>>
>>> Revert that part of the patch and the problem will be solved.
>>>
>>> If we are really really worried that a ~9k kmalloc might fail or will
>>> be slow, then implement a fallback to vmalloc() if kmalloc(__GFP_NOWARN)
>>> failed. This kinda sucks, but is practical, but really should only be
>>> done if necessary - ie, if problems with using plain old kmalloc are
>>> demonstrable.
>>>
>>> Or just revert all of 20bafb3d23d10 - it was supposed to be a small
>>> performance improvement but turned out to be a significant performance
>>> loss. Therefore zap.
>>
>> I have no particular objection to reverting the entire patch.
>
> How about just switching the call to vmalloc to kmalloc? Yes, it's a
> larger size that is being allocated here, but we were allocating that
> much memory anyway before, so it should be the same "speed", if not
> faster than before (1 call to kmalloc instead of 3).

The allocation itself isn't performance-critical. The speed difference
between 1 kmalloc and 3 kmallocs here will be unmeasurable from any
user-space test.

And the only reason vmalloc has any measurable impact stems from the way
reads of /proc/meminfo behave (to which there are a number of appropriate
solutions).

The issue with a single large kmalloc is that it may fail where
3 separate, page-or-less kmallocs would not have.

Regards,
Peter Hurley

2013-09-26 18:05:41

by Andrew Morton

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <[email protected]> wrote:

> The issue with a single large kmalloc is that it may fail where
> 3 separate, page-or-less kmallocs would not have.

Or vmalloc fails first, because of internal fragmentation of the vmap
arena. This problem plus vmalloc's slowness are the reasons why
vmalloc should be avoided.

A tremendous number of places in the kernel perform higher-order
allocations nowadays. The page allocator works damn hard to service
them and I expect that switching to kmalloc here will be OK.

2013-09-26 21:42:56

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/26/2013 02:05 PM, Andrew Morton wrote:
> On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <[email protected]> wrote:
>
>> The issue with a single large kmalloc is that it may fail where
>> 3 separate, page-or-less kmallocs would not have.
>
> Or vmalloc fails first, because of internal fragmentation of the vmap
> arena. This problem plus vmalloc's slowness are the reasons why
> vmalloc should be avoided.

Ok, no vmalloc.

> A tremendous number of places in the kernel perform higher-order
> allocations nowadays. The page allocator works damn hard to service
> them and I expect that switching to kmalloc here will be OK.

I've had order-4 allocation failures before on 10Gb. In fact, the
nouveau driver switched to vmalloc for that very reason (commit
d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
for pgt allocation).

Regards,
Peter Hurley

2013-09-26 21:58:25

by Andrew Morton

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On Thu, 26 Sep 2013 17:42:52 -0400 Peter Hurley <[email protected]> wrote:

> On 09/26/2013 02:05 PM, Andrew Morton wrote:
> > On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <[email protected]> wrote:
> >
> >> The issue with a single large kmalloc is that it may fail where
> >> 3 separate, page-or-less kmallocs would not have.
> >
> > Or vmalloc fails first, because of internal fragmentation of the vmap
> > arena. This problem plus vmalloc's slowness are the reasons why
> > vmalloc should be avoided.
>
> Ok, no vmalloc.
>
> > A tremendous number of places in the kernel perform higher-order
> > allocations nowadays. The page allocator works damn hard to service
> > them and I expect that switching to kmalloc here will be OK.
>
> I've had order-4 allocation failures before on 10Gb.

Yep. But this allocation will be order=2, yes? And
PAGE_ALLOC_COSTLY_ORDER=3. So if that thing is working correctly,
order=2 will do a lot better than order=4.

> In fact, the
> nouveau driver switched to vmalloc for that very reason (commit
> d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
> for pgt allocation).

Sigh. I'm not aware of any reports of anyone hitting arena
fragmentation problems yet, so it remains a theoretical thing. But the
more we use vmalloc, the more likely it becomes. And because the usage
sites are so disparate, fixing it will be pretty horrid.

For this reason (plus vmalloc is slow), I do think it's better to do
the old

foo = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
if (!foo)
	foo = vmalloc(size);

thing. It's ugly, but will greatly reduce the amount of vmallocing
which happens.

Someone had a patch a while back which wraps this operation (and the
corresponding free) into library functions. I said yuk and it wasn't
merged. Perhaps that was a mistake.

2013-09-26 22:21:42

by Peter Hurley

[permalink] [raw]
Subject: Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"

On 09/26/2013 05:58 PM, Andrew Morton wrote:
> On Thu, 26 Sep 2013 17:42:52 -0400 Peter Hurley <[email protected]> wrote:
>
>> On 09/26/2013 02:05 PM, Andrew Morton wrote:
>>> On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <[email protected]> wrote:
>>>
>>>> The issue with a single large kmalloc is that it may fail where
>>>> 3 separate, page-or-less kmallocs would not have.
>>>
>>> Or vmalloc fails first, because of internal fragmentation of the vmap
>>> arena. This problem plus vmalloc's slowness are the reasons why
>>> vmalloc should be avoided.
>>
>> Ok, no vmalloc.
>>
>>> A tremendous number of places in the kernel perform higher-order
>>> allocations nowadays. The page allocator works damn hard to service
>>> them and I expect that switching to kmalloc here will be OK.
>>
>> I've had order-4 allocation failures before on 10Gb.
>
> Yep. But this allocation will be order=2, yes? And
> PAGE_ALLOC_COSTLY_ORDER=3. So if that thing is working correctly,
> order=2 will do a lot better than order=4.

PAGE_ALLOC_COSTLY_ORDER was a subtlety I wasn't aware of; thanks
for the info.

>> In fact, the
>> nouveau driver switched to vmalloc for that very reason (commit
>> d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
>> for pgt allocation).
>
> Sigh. I'm not aware of any reports of anyone hitting arena
> fragmentation problems yet, so it remains a theoretical thing. But the
> more we use vmalloc, the more likely it becomes. And because the usage
> sites are so disparate, fixing it will be pretty horrid.
>
> For this reason (plus vmalloc is slow), I do think it's better to do
> the old
>
> foo = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
> if (!foo)
> 	foo = vmalloc(size);
>
> thing. It's ugly, but will greatly reduce the amount of vmallocing
> which happens.
>
> Someone had a patch a while back which wraps this operation (and the
> corresponding free) into library functions. I said yuk and it wasn't
> merged. Perhaps that was a mistake.

I would suggest either
1. documenting the bulk of our conversation in either/both
mm/vmalloc.c:vmalloc() and include/linux/slab.h
or
2. requiring that new vmalloc() users get your ack.

Regards,
Peter Hurley