There is no need to find the very first area within alloc_vmap_area;
whether a node is followed by a suitable hole can be judged for each node
visited during the search.
free_vmap_cache miss:
     vmap_area_root
         /      \
    tmp_next     U
        /   (T1)
     tmp
     /
   ...  (T2)
   /
first

vmap_area_list->first->......->tmp->tmp_next->...->vmap_area_list
                  |-----(T3)----|
On a free_vmap_cache miss, the total time spent finding a suitable hole is
T = T1 + T2 + T3, while this commit reduces it to T1.
In fact, 'vmalloc' always starts from a fixed address (VMALLOC_START), which
puts 'first' close to the beginning of the list (vmap_area_list) and makes
T3 large.
The commit helps especially with a large and almost full vmalloc area.
It does NOT affect the existing fast path through free_vmap_cache, because it
only takes effect on a free_vmap_cache miss, and the cache is re-established
afterwards.
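
The cost argument above can be sketched in user space. This is only a
simplified model under stated assumptions, not the kernel code: a sorted
array stands in for both vmap_area_root and vmap_area_list, the search
always starts below the lowest area (as with VMALLOC_START), the upper
bound of the range is ignored, and the names struct area, align_up,
fits_after, find_hole and sample are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vmap_area: one busy range [start, end). */
struct area {
	unsigned long start;
	unsigned long end;
};

/* Round x up to a multiple of a; a must be a power of two. */
static unsigned long align_up(unsigned long x, unsigned long a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Does a block of `size` fit between area i and its successor? */
static int fits_after(const struct area *a, size_t n, size_t i,
		      unsigned long size, unsigned long align)
{
	if (i + 1 == n)
		return 1;	/* last area: this model ignores the upper bound */
	return align_up(a[i].end, align) + size <= a[i + 1].start;
}

/*
 * Return the start address of a fitting hole and count visited nodes in
 * *steps. With early_exit set, every node seen during the descent is
 * tested and the search stops at the first one followed by a big enough hole.
 */
static unsigned long find_hole(const struct area *a, size_t n,
			       unsigned long size, unsigned long align,
			       int early_exit, size_t *steps)
{
	size_t lo = 0, hi = n, i;

	*steps = 0;
	/* T1 + T2: descend toward the lowest area. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		(*steps)++;
		if (early_exit && fits_after(a, n, mid, size, align))
			return align_up(a[mid].end, align);
		hi = mid;	/* models n = n->rb_left */
	}
	/* T3: walk the areas in address order from the lowest one. */
	for (i = 0; i < n; i++) {
		(*steps)++;
		if (fits_after(a, n, i, size, align))
			return align_up(a[i].end, align);
	}
	return 0;	/* only reached when n == 0 in this model */
}

/* Eight areas, tightly packed except for a hole after the fifth one. */
static const struct area sample[] = {
	{ 0x0000, 0x1000 }, { 0x1000, 0x2000 }, { 0x2000, 0x3000 },
	{ 0x3000, 0x4000 }, { 0x4000, 0x5000 }, { 0x7000, 0x8000 },
	{ 0x8000, 0x9000 }, { 0x9000, 0xa000 },
};
```

For the sample array with size and align both 0x1000, the search without
early exit visits 9 nodes and the one with early exit visits 1; both
return 0x5000. Note that in this model the early exit returns the first
fitting hole met on the descent, which for other layouts need not be the
lowest one.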
Signed-off-by: Zhaoyang Huang <[email protected]>
---
mm/vmalloc.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 8698c1c..f58f445 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -471,9 +471,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 		while (n) {
 			struct vmap_area *tmp;
+			struct vmap_area *tmp_next;
 			tmp = rb_entry(n, struct vmap_area, rb_node);
+			tmp_next = list_next_entry(tmp, list);
 			if (tmp->va_end >= addr) {
 				first = tmp;
+				if (ALIGN(tmp->va_end, align) + size
+						< tmp_next->va_start) {
+					/*
+					 * free_vmap_cache miss now,don't
+					 * update cached_hole_size here,
+					 * as __free_vmap_area does
+					 */
+					goto found;
+				}
 				if (tmp->va_start <= addr)
 					break;
 				n = n->rb_left;
--
1.9.1
FYI, we noticed the following commit:
commit: dec0553d3100b4b624dbbc2e5bf1d7df8d70b09f ("mm/vmalloc: terminate searching since one node found")
url: https://github.com/0day-ci/linux/commits/Zhaoyang-Huang/mm-vmalloc-terminate-searching-since-one-node-found/20170721-235704
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+--------------------------------------------------+-----------+------------+
| | v4.13-rc1 | dec0553d31 |
+--------------------------------------------------+-----------+------------+
| boot_successes | 3557 | 0 |
| boot_failures | 460 | 50 |
| BUG:kernel_in_stage | 26 | 1 |
| BUG:kernel_hang_in_test_stage | 397 | |
| general_protection_fault:#[##] | 4 | |
| Kernel_panic-not_syncing:Fatal_exception | 13 | 49 |
| BUG:unable_to_handle_kernel | 9 | |
| Oops:#[##] | 9 | |
| BUG:kernel_reboot-without-warning_in_test_stage | 12 | |
| invoked_oom-killer:gfp_mask=0x | 12 | |
| Mem-Info | 12 | 49 |
| Out_of_memory:Kill_process | 3 | |
| WARNING:at_mm/vmalloc.c:#vmap_page_range_noflush | 0 | 49 |
| kernel_BUG_at_net/core/ptp_classifier.c | 0 | 49 |
| invalid_opcode:#[##] | 0 | 49 |
+--------------------------------------------------+-----------+------------+
[ 0.214071] WARNING: CPU: 0 PID: 1 at mm/vmalloc.c:152 vmap_page_range_noflush+0x2d3/0x30b
[ 0.215090] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc1-00001-gdec0553 #35
[ 0.215856] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[ 0.216005] task: ffff883f40058000 task.stack: ffffad1b00008000
[ 0.216586] RIP: 0010:vmap_page_range_noflush+0x2d3/0x30b
[ 0.217123] RSP: 0000:ffffad1b0000bcf0 EFLAGS: 00010282
[ 0.217635] RAX: ffff883f40021000 RBX: ffffad1b00001000 RCX: ffff883f5d7ba600
[ 0.218394] RDX: ffff883f40021000 RSI: ffff883f40096088 RDI: 0000000000021000
[ 0.219107] RBP: ffffad1b0000bd70 R08: ffff883f4017d2c0 R09: 0000000000002000
[ 0.220004] R10: 8000000000000163 R11: 0000000000000002 R12: ffffad1b00000000
[ 0.220727] R13: ffffffffa1416ad0 R14: ffffad1b00001000 R15: ffffad1b00001000
[ 0.221440] FS: 0000000000000000(0000) GS:ffff883f5d200000(0000) knlGS:0000000000000000
[ 0.222602] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.223198] CR2: 0000000000000000 CR3: 000000000b416000 CR4: 00000000000006f0
[ 0.224008] Call Trace:
[ 0.224264] map_vm_area+0x31/0x3d
[ 0.224611] __vmalloc_node_range+0x15d/0x1e5
[ 0.225067] __vmalloc_node+0x2d/0x2f
[ 0.225430] ? bpf_prog_alloc+0x37/0xa0
[ 0.225817] __vmalloc+0x1b/0x1d
[ 0.226169] bpf_prog_alloc+0x37/0xa0
[ 0.226533] ? set_debug_rodata+0x12/0x12
[ 0.226961] bpf_prog_create+0x41/0x90
[ 0.228007] ptp_classifier_init+0x26/0x2e
[ 0.228437] ? dmi_save_dev_pciaddr+0x42/0xa2
[ 0.228924] sock_init+0x95/0x9a
[ 0.229245] ? bsp_pm_check_init+0x14/0x14
[ 0.229672] do_one_initcall+0x8b/0x132
[ 0.230056] ? set_debug_rodata+0x12/0x12
[ 0.230471] kernel_init_freeable+0x19f/0x224
[ 0.230925] ? rest_init+0x143/0x143
[ 0.231308] kernel_init+0x9/0xe6
[ 0.231667] ret_from_fork+0x25/0x30
[ 0.232004] Code: 48 8b 7d c8 4c 89 e6 e8 2a ef fe ff 85 c0 74 99 eb 28 48 63 45 c4 48 8b 75 a0 48 8b 0c c6 48 8b 45 b0 48 f7 00 9f ff ff ff 74 04 <0f> ff eb 0b 48 85 c9 0f 85 0c fe ff ff 0f ff c7 45 c4 f4 ff ff
[ 0.234047] ---[ end trace 4503c1098d1f2449 ]---
To reproduce:
        git clone https://github.com/01org/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> job-script  # job-script is attached in this email
Thanks,
Xiaolong