2018-06-21 08:11:59

by kernel test robot

Subject: [lkp-robot] [Kbuild] 050e9baa9d: netperf.Throughput_total_tps -5.6% regression (FYI)


FYI, we noticed a -5.6% regression of netperf.Throughput_total_tps due to commit 050e9b ("Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables")

This commit changed the auto-generated config with respect to the stack
protector: the stack protector is enabled after this commit.

We have verified the two kernel config files and stack protector is
disabled for the 'good' commit:

$ grep STACKPROTECTOR config-4.17.0-11782-gbe779f0
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

And for the 'bad' commit, it is enabled (as expected):

$ grep STACKPROTECTOR config-4.17.0-11783-g050e9baa
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

Stack protector is a security feature and we believe it has some
performance impact, so the regression should not be surprising.
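
The difference between the two configs can also be checked mechanically.
Below is a minimal sketch in Python, assuming the two grep outputs above are
pasted in as strings. The helper names parse_kconfig and diff_configs are
hypothetical and are not part of any LKP tooling.

```python
def parse_kconfig(text):
    """Map option name -> value; '# CONFIG_X is not set' is recorded as 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(" is not set"):
            name = line[len("# "):-len(" is not set")]
            options[name] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(good, bad):
    """Return {name: (value_in_good, value_in_bad)} for options that differ."""
    changes = {}
    for name in sorted(set(good) | set(bad)):
        before = good.get(name, "absent")
        after = bad.get(name, "absent")
        if before != after:
            changes[name] = (before, after)
    return changes


# grep output for the 'good' commit, copied from this report
GOOD = """\
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
# CONFIG_CC_STACKPROTECTOR is not set
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
"""

# grep output for the 'bad' commit, copied from this report
BAD = """\
CONFIG_HAVE_CC_STACKPROTECTOR=y
CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
CONFIG_STACKPROTECTOR=y
CONFIG_STACKPROTECTOR_STRONG=y
CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
"""

changes = diff_configs(parse_kconfig(GOOD), parse_kconfig(BAD))
for name, (before, after) in changes.items():
    print(f"{name}: {before} -> {after}")
```

Only the renamed option and the two newly enabled options should show up as
differences; the three CC_HAS/HAVE lines are identical in both configs.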


Details are as below:
-------------------------------------------------------------------------------------------------->

commit: 050e9baa9dc9fbd9ce2b27f0056990fc9e0a08a0 ("Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: netperf
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 8G memory
with following parameters:

ip: ipv4
runtime: 300s
nr_threads: 1
cluster: cs-localhost
test: TCP_CRR
cpufreq_governor: performance

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/


=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
cs-localhost/gcc-7/performance/ipv4/x86_64-rhel-7.2/1/debian-x86_64-2018-04-03.cgz/300s/lkp-bdw-de1/TCP_CRR/netperf

commit:
be779f03d5 (" Kbuild updates for v4.18 (2nd)")
050e9baa9d ("Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables")

be779f03d563981c 050e9baa9dc9fbd9ce2b27f005
---------------- --------------------------
%stddev %change %stddev
\ | \
23291 -5.6% 21998 netperf.Throughput_total_tps
23291 -5.6% 21998 netperf.Throughput_tps
56273 ± 3% +6.1% 59722 ± 3% netperf.time.involuntary_context_switches
85.00 +1.2% 86.00 netperf.time.percent_of_cpu_this_job_got
247.10 +1.9% 251.73 netperf.time.system_time
13899258 -5.3% 13156454 netperf.time.voluntary_context_switches
6987525 -5.6% 6599526 netperf.workload
5767 ± 3% +3.9% 5990 proc-vmstat.nr_mapped
22353589 -11.2% 19856856 softirqs.NET_RX
159854 +7.8% 172284 vmstat.system.cs
11623858 ± 4% -18.8% 9435045 cpuidle.C1E.time
151982 ± 26% -81.5% 28075 ± 6% cpuidle.C1E.usage
13315 ± 6% -46.0% 7188 ± 52% sched_debug.cpu.avg_idle.min
2.57 ± 16% +25.0% 3.21 ± 7% sched_debug.cpu.cpu_load[4].min
693.50 ± 15% +26.3% 876.00 ± 5% slabinfo.nsproxy.active_objs
693.50 ± 15% +26.3% 876.00 ± 5% slabinfo.nsproxy.num_objs
270.75 +1.3% 274.25 turbostat.Avg_MHz
151936 ± 26% -81.6% 28031 ± 6% turbostat.C1E
0.24 ± 3% -0.0 0.20 ± 2% turbostat.C1E%
2.89 -0.1 2.84 perf-stat.branch-miss-rate%
5.153e+09 -1.9% 5.053e+09 perf-stat.branch-misses
2.231e+10 +13.8% 2.539e+10 perf-stat.cache-misses
2.231e+10 +13.8% 2.539e+10 perf-stat.cache-references
49230432 +7.8% 53058273 perf-stat.context-switches
1.42 +1.7% 1.44 perf-stat.cpi
1.28e+12 +1.7% 1.302e+12 perf-stat.cpu-cycles
0.10 +0.0 0.11 ± 2% perf-stat.dTLB-load-miss-rate%
2.685e+08 +8.6% 2.917e+08 ± 4% perf-stat.dTLB-load-misses
0.03 -0.0 0.03 ± 3% perf-stat.dTLB-store-miss-rate%
58736841 -17.9% 48210911 ± 2% perf-stat.dTLB-store-misses
8.478e+08 -3.6% 8.174e+08 perf-stat.iTLB-loads
0.70 -1.7% 0.69 perf-stat.ipc
129050 +5.9% 136660 perf-stat.path-length
4.73 ± 4% -0.7 4.02 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.70 ± 4% -0.7 3.98 ± 2% perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.50 ± 4% -0.7 3.81 ± 2% perf-profile.calltrace.cycles-pp.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.46 ± 4% -0.7 3.77 ± 2% perf-profile.calltrace.cycles-pp.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
3.08 ± 5% -0.7 2.40 ± 4% perf-profile.calltrace.cycles-pp.sk_wait_data.tcp_recvmsg.inet_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
0.59 ± 2% +0.1 0.70 ± 9% perf-profile.calltrace.cycles-pp.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
0.60 ± 7% +0.1 0.71 ± 5% perf-profile.calltrace.cycles-pp.__sys_bind.__x64_sys_bind.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.54 ± 3% +0.1 0.64 ± 9% perf-profile.calltrace.cycles-pp.__release_sock.release_sock.tcp_sendmsg.sock_sendmsg.__sys_sendto
0.83 ± 2% +0.1 0.93 ± 6% perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_child_process.tcp_v4_rcv
0.80 +0.1 0.91 ± 6% perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
0.81 ± 2% +0.1 0.93 ± 6% perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.tcp_child_process
0.61 ± 6% +0.1 0.73 ± 5% perf-profile.calltrace.cycles-pp.__x64_sys_bind.do_syscall_64.entry_SYSCALL_64_after_hwframe
0.58 ± 5% +0.2 0.74 ± 5% perf-profile.calltrace.cycles-pp.tcp_transmit_skb.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
1.90 ± 2% +0.2 2.07 ± 3% perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
1.62 ± 6% +0.2 1.86 ± 6% perf-profile.calltrace.cycles-pp.wait_woken.sk_wait_data.tcp_recvmsg.inet_recvmsg.__sys_recvfrom
1.37 ± 18% +0.3 1.67 ± 3% perf-profile.calltrace.cycles-pp.tcp_data_queue.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish
0.26 ±100% +0.3 0.60 ± 7% perf-profile.calltrace.cycles-pp.ip_queue_xmit.tcp_transmit_skb.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
0.32 ±100% +0.4 0.68 perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.tcp_data_queue.tcp_rcv_established.tcp_v4_do_rcv
6.75 ± 3% +0.4 7.12 ± 2% perf-profile.calltrace.cycles-pp.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish.ip_local_deliver.ip_rcv
0.34 ±100% +0.4 0.72 ± 2% perf-profile.calltrace.cycles-pp.sock_def_readable.tcp_data_queue.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv
3.42 ± 3% +0.5 3.87 ± 2% perf-profile.calltrace.cycles-pp.tcp_rcv_established.tcp_v4_do_rcv.tcp_v4_rcv.ip_local_deliver_finish.ip_local_deliver
8.44 +0.5 8.96 perf-profile.calltrace.cycles-pp.ip_output.ip_queue_xmit.tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames
4.13 ± 2% +0.5 4.65 perf-profile.calltrace.cycles-pp.tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg
3.95 +0.6 4.50 ± 2% perf-profile.calltrace.cycles-pp.ip_queue_xmit.tcp_transmit_skb.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked
8.29 +0.6 8.86 perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_output.ip_queue_xmit.tcp_transmit_skb.tcp_write_xmit
12.79 ± 3% +0.7 13.45 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output.ip_queue_xmit
12.72 ± 3% +0.7 13.39 perf-profile.calltrace.cycles-pp.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2.ip_output
12.71 ± 3% +0.7 13.38 perf-profile.calltrace.cycles-pp.__softirqentry_text_start.do_softirq_own_stack.do_softirq.__local_bh_enable_ip.ip_finish_output2
5.08 +0.7 5.81 perf-profile.calltrace.cycles-pp.tcp_write_xmit.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg
5.08 +0.7 5.82 perf-profile.calltrace.cycles-pp.__tcp_push_pending_frames.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto
5.79 +0.8 6.61 ± 2% perf-profile.calltrace.cycles-pp.tcp_sendmsg_locked.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
6.73 ± 2% +0.9 7.62 ± 2% perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.59 +0.9 7.49 ± 2% perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.46 +0.9 7.36 ± 2% perf-profile.calltrace.cycles-pp.tcp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
6.69 ± 2% +0.9 7.60 ± 2% perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.41 ± 5% -1.0 5.42 ± 2% perf-profile.children.cycles-pp.release_sock
5.90 ± 5% -0.9 5.04 ± 3% perf-profile.children.cycles-pp.__release_sock
3.17 ± 5% -0.8 2.41 ± 3% perf-profile.children.cycles-pp.sk_wait_data
4.75 ± 4% -0.7 4.02 ± 2% perf-profile.children.cycles-pp.__x64_sys_recvfrom
4.71 ± 4% -0.7 4.01 ± 2% perf-profile.children.cycles-pp.__sys_recvfrom
4.51 ± 4% -0.7 3.82 ± 2% perf-profile.children.cycles-pp.inet_recvmsg
4.47 ± 4% -0.7 3.78 ± 2% perf-profile.children.cycles-pp.tcp_recvmsg
1.06 ± 2% -0.2 0.81 ± 4% perf-profile.children.cycles-pp._raw_spin_lock_bh
0.81 ± 6% -0.2 0.62 ± 7% perf-profile.children.cycles-pp.lock_sock_nested
1.12 ± 8% -0.1 1.00 ± 6% perf-profile.children.cycles-pp.loopback_xmit
0.83 ± 6% -0.1 0.72 ± 6% perf-profile.children.cycles-pp.__kfree_skb
0.67 ± 5% -0.1 0.59 ± 6% perf-profile.children.cycles-pp.__inet_lookup_established
0.18 ± 17% -0.1 0.11 ± 6% perf-profile.children.cycles-pp.sock_has_perm
0.18 ± 22% -0.1 0.11 ± 39% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.11 ± 13% -0.1 0.04 ±106% perf-profile.children.cycles-pp.sock_put
0.24 ± 8% -0.1 0.18 ± 16% perf-profile.children.cycles-pp.kfree
0.17 ± 20% -0.1 0.12 ± 7% perf-profile.children.cycles-pp.security_inet_conn_request
0.24 ± 14% -0.1 0.19 ± 7% perf-profile.children.cycles-pp.vfs_read
0.14 ± 16% -0.0 0.10 ± 25% perf-profile.children.cycles-pp.validate_xmit_skb
0.08 ± 8% -0.0 0.05 ± 58% perf-profile.children.cycles-pp.tcp_established_options
0.09 ± 15% -0.0 0.06 ± 20% perf-profile.children.cycles-pp.__intel_pmu_enable_all
0.09 ± 17% -0.0 0.06 ± 16% perf-profile.children.cycles-pp.add_wait_queue
0.16 ± 5% -0.0 0.14 ± 10% perf-profile.children.cycles-pp.pipe_write
0.09 ± 19% -0.0 0.06 ± 17% perf-profile.children.cycles-pp.iput
0.07 ± 7% +0.0 0.09 ± 14% perf-profile.children.cycles-pp.eth_type_trans
0.12 ± 8% +0.0 0.15 ± 8% perf-profile.children.cycles-pp.inet_csk_route_req
0.07 ± 5% +0.0 0.11 ± 17% perf-profile.children.cycles-pp._copy_to_user
0.04 ± 57% +0.0 0.08 ± 8% perf-profile.children.cycles-pp.rb_insert_color
0.09 ± 9% +0.0 0.12 ± 16% perf-profile.children.cycles-pp.bictcp_acked
0.11 ± 21% +0.0 0.15 ± 7% perf-profile.children.cycles-pp.__update_load_avg_se
0.04 ± 58% +0.0 0.08 ± 15% perf-profile.children.cycles-pp.kfree_skbmem
0.32 ± 5% +0.0 0.36 ± 5% perf-profile.children.cycles-pp.tcp_event_new_data_sent
0.18 ± 10% +0.0 0.23 ± 5% perf-profile.children.cycles-pp.finish_task_switch
0.13 ± 8% +0.0 0.17 ± 13% perf-profile.children.cycles-pp.pm_qos_request
0.54 ± 6% +0.0 0.58 ± 4% perf-profile.children.cycles-pp.mod_timer
0.08 ± 20% +0.0 0.12 ± 17% perf-profile.children.cycles-pp.ip_rcv_finish
0.17 ± 8% +0.1 0.23 ± 12% perf-profile.children.cycles-pp.sock_getsockopt
0.03 ±105% +0.1 0.09 ± 12% perf-profile.children.cycles-pp.selinux_file_alloc_security
0.44 ± 5% +0.1 0.50 ± 2% perf-profile.children.cycles-pp.kmem_cache_alloc
0.03 ±100% +0.1 0.08 ± 10% perf-profile.children.cycles-pp.inet_sock_destruct
0.29 ± 17% +0.1 0.35 ± 6% perf-profile.children.cycles-pp.__sk_dst_check
0.03 ±102% +0.1 0.09 ± 15% perf-profile.children.cycles-pp.activate_task
0.02 ±173% +0.1 0.08 ± 10% perf-profile.children.cycles-pp.security_file_alloc
0.37 ± 9% +0.1 0.44 ± 8% perf-profile.children.cycles-pp.tcp_create_openreq_child
0.08 ± 23% +0.1 0.15 ± 22% perf-profile.children.cycles-pp.inet_csk_get_port
0.39 ± 6% +0.1 0.47 ± 9% perf-profile.children.cycles-pp.ttwu_do_wakeup
0.38 ± 7% +0.1 0.45 ± 9% perf-profile.children.cycles-pp.check_preempt_curr
0.26 ± 14% +0.1 0.34 ± 7% perf-profile.children.cycles-pp.alloc_file
0.10 ± 14% +0.1 0.19 ± 10% perf-profile.children.cycles-pp.filp_close
0.17 ± 8% +0.1 0.26 ± 8% perf-profile.children.cycles-pp.__switch_to
0.00 +0.1 0.09 ± 5% perf-profile.children.cycles-pp.default_wake_function
0.07 ± 14% +0.1 0.16 ± 11% perf-profile.children.cycles-pp.fput
0.18 ± 12% +0.1 0.28 ± 7% perf-profile.children.cycles-pp.__x64_sys_close
0.04 ± 58% +0.1 0.14 ± 13% perf-profile.children.cycles-pp.task_work_add
0.01 ±173% +0.1 0.11 ± 26% perf-profile.children.cycles-pp.skb_entail
0.83 ± 2% +0.1 0.93 ± 5% perf-profile.children.cycles-pp.autoremove_wake_function
0.21 ± 21% +0.1 0.31 ± 14% perf-profile.children.cycles-pp.ip_route_output_flow
0.60 ± 8% +0.1 0.71 ± 5% perf-profile.children.cycles-pp.__sys_bind
0.44 ± 12% +0.1 0.54 ± 11% perf-profile.children.cycles-pp.switch_mm
0.36 ± 8% +0.1 0.46 ± 14% perf-profile.children.cycles-pp.selinux_ip_postroute_compat
0.19 ± 10% +0.1 0.30 ± 10% perf-profile.children.cycles-pp.__inet_bind
0.61 ± 7% +0.1 0.73 ± 6% perf-profile.children.cycles-pp.__x64_sys_bind
0.59 ± 13% +0.1 0.72 ± 6% perf-profile.children.cycles-pp.dequeue_task_fair
0.38 ± 10% +0.2 0.54 ± 7% perf-profile.children.cycles-pp.ip_route_output_key_hash
0.36 ± 11% +0.2 0.53 ± 6% perf-profile.children.cycles-pp.ip_route_output_key_hash_rcu
2.06 ± 2% +0.2 2.23 ± 2% perf-profile.children.cycles-pp.menu_select
2.69 ± 2% +0.2 2.91 ± 3% perf-profile.children.cycles-pp.__wake_up_common
1.66 ± 5% +0.2 1.90 ± 6% perf-profile.children.cycles-pp.wait_woken
10.49 +0.8 11.29 ± 2% perf-profile.children.cycles-pp.tcp_write_xmit
10.48 +0.8 11.29 ± 2% perf-profile.children.cycles-pp.__tcp_push_pending_frames
5.81 +0.8 6.63 ± 2% perf-profile.children.cycles-pp.tcp_sendmsg_locked
6.60 +0.9 7.49 ± 2% perf-profile.children.cycles-pp.sock_sendmsg
6.73 ± 2% +0.9 7.62 ± 2% perf-profile.children.cycles-pp.__x64_sys_sendto
6.71 ± 2% +0.9 7.61 ± 2% perf-profile.children.cycles-pp.__sys_sendto
6.47 +0.9 7.38 ± 2% perf-profile.children.cycles-pp.tcp_sendmsg
0.93 ± 2% -0.2 0.77 ± 4% perf-profile.self.cycles-pp._raw_spin_lock_bh
0.40 ± 11% -0.1 0.31 ± 17% perf-profile.self.cycles-pp.__softirqentry_text_start
0.62 ± 4% -0.1 0.54 ± 7% perf-profile.self.cycles-pp.__inet_lookup_established
0.18 ± 17% -0.1 0.11 ± 6% perf-profile.self.cycles-pp.sock_has_perm
0.18 ± 22% -0.1 0.11 ± 39% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
0.11 ± 13% -0.1 0.04 ±106% perf-profile.self.cycles-pp.sock_put
0.24 ± 8% -0.1 0.18 ± 14% perf-profile.self.cycles-pp.kfree
0.19 ± 3% -0.0 0.14 ± 17% perf-profile.self.cycles-pp.__local_bh_enable_ip
0.10 ± 30% -0.0 0.06 ± 15% perf-profile.self.cycles-pp.inet_csk_accept
0.26 ± 13% -0.0 0.21 ± 9% perf-profile.self.cycles-pp.loopback_xmit
0.25 ± 4% -0.0 0.21 ± 6% perf-profile.self.cycles-pp.try_to_wake_up
0.08 ± 8% -0.0 0.05 ± 58% perf-profile.self.cycles-pp.tcp_established_options
0.11 ± 13% +0.0 0.14 ± 8% perf-profile.self.cycles-pp.tsc_verify_tsc_adjust
0.07 ± 7% +0.0 0.09 ± 14% perf-profile.self.cycles-pp.eth_type_trans
0.21 ± 9% +0.0 0.24 ± 5% perf-profile.self.cycles-pp.ip_output
0.16 ± 7% +0.0 0.20 ± 5% perf-profile.self.cycles-pp.finish_task_switch
0.04 ± 57% +0.0 0.08 ± 8% perf-profile.self.cycles-pp.rb_insert_color
0.09 ± 9% +0.0 0.12 ± 16% perf-profile.self.cycles-pp.bictcp_acked
0.34 ± 3% +0.0 0.38 ± 5% perf-profile.self.cycles-pp.selinux_socket_sock_rcv_skb
0.04 ± 58% +0.0 0.08 ± 15% perf-profile.self.cycles-pp.kfree_skbmem
0.10 ± 18% +0.0 0.14 ± 8% perf-profile.self.cycles-pp.__update_load_avg_se
0.11 ± 12% +0.0 0.15 ± 16% perf-profile.self.cycles-pp.tcp_connect
0.08 ± 23% +0.0 0.12 ± 12% perf-profile.self.cycles-pp.dequeue_task_fair
0.13 ± 8% +0.0 0.17 ± 13% perf-profile.self.cycles-pp.pm_qos_request
0.07 ± 20% +0.0 0.11 ± 18% perf-profile.self.cycles-pp.ip_rcv_finish
0.28 ± 9% +0.1 0.33 ± 3% perf-profile.self.cycles-pp.set_next_entity
0.14 ± 8% +0.1 0.19 ± 12% perf-profile.self.cycles-pp.mod_timer
0.11 ± 9% +0.1 0.17 ± 12% perf-profile.self.cycles-pp.ip_local_deliver
0.00 +0.1 0.06 ± 7% perf-profile.self.cycles-pp.alloc_file
0.01 ±173% +0.1 0.07 ± 17% perf-profile.self.cycles-pp.__sys_recvfrom
0.03 ±100% +0.1 0.09 ± 11% perf-profile.self.cycles-pp.check_preempt_curr
0.03 ±102% +0.1 0.09 ± 15% perf-profile.self.cycles-pp.activate_task
0.11 ± 17% +0.1 0.17 ± 8% perf-profile.self.cycles-pp.ip_route_output_key_hash_rcu
0.61 ± 4% +0.1 0.68 ± 8% perf-profile.self.cycles-pp.switch_mm_irqs_off
0.00 +0.1 0.07 ± 36% perf-profile.self.cycles-pp.inet_csk_get_port
0.18 ± 17% +0.1 0.26 ± 17% perf-profile.self.cycles-pp.wait_woken
0.01 ±173% +0.1 0.09 ± 19% perf-profile.self.cycles-pp.skb_entail
0.17 ± 8% +0.1 0.26 ± 8% perf-profile.self.cycles-pp.__switch_to
0.00 +0.1 0.09 ± 5% perf-profile.self.cycles-pp.default_wake_function
0.03 ±100% +0.1 0.14 ± 15% perf-profile.self.cycles-pp.task_work_add
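
The %change column for the absolute counters is the relative difference
between the two commits' mean values. A minimal sketch in Python that
recomputes three rows from the table above; the percent_change helper is a
hypothetical name, and the formula is the usual (new - base) / base, assumed
here rather than taken from the LKP source:

```python
def percent_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (good commit mean, bad commit mean), copied from the comparison table
SAMPLES = {
    "netperf.Throughput_total_tps": (23291, 21998),
    "softirqs.NET_RX": (22353589, 19856856),
    "perf-stat.context-switches": (49230432, 53058273),
}

for name, (base, new) in SAMPLES.items():
    print(f"{name}: {percent_change(base, new):+.1f}%")
```

Rows whose change is shown without a percent sign (the perf-profile and
miss-rate rows) are already percentages, so the table lists their absolute
difference instead.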



netperf.Throughput_tps

[ASCII plot, y-axis 21800 to 23800: bisect-good samples fall between roughly
22600 and 23700; bisect-bad samples cluster between roughly 21800 and 22100]


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong


Attachments:
(No filename) (22.54 kB)
config-4.17.0-11783-g050e9baa (168.81 kB)
job-script (7.36 kB)
job.yaml (4.96 kB)
reproduce (340.00 B)

2018-06-21 08:17:44

by Linus Torvalds

Subject: Re: [lkp-robot] [Kbuild] 050e9baa9d: netperf.Throughput_total_tps -5.6% regression (FYI)

On Thu, Jun 21, 2018 at 5:10 PM kernel test robot wrote:
>
> FYI, we noticed a -5.6% regression of netperf.Throughput_total_tps due to commit 050e9b ("Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables")

That's perhaps a surprisingly large cost to stack protector, but you
did move from "no stack protector at all":

> $ grep STACKPROTECTOR config-4.17.0-11782-gbe779f0
> CONFIG_HAVE_CC_STACKPROTECTOR=y
> CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
> # CONFIG_CC_STACKPROTECTOR is not set
> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

To having the *strong* stack protector enabled:

> $ grep STACKPROTECTOR config-4.17.0-11783-g050e9baa
> CONFIG_HAVE_CC_STACKPROTECTOR=y
> CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
> CONFIG_STACKPROTECTOR=y
> CONFIG_STACKPROTECTOR_STRONG=y
> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y

so you're testing the "no overhead" case to the "worst overhead" case.

Linus

2018-06-21 08:31:43

by Huang, Ying

Subject: Re: [lkp-robot] [Kbuild] 050e9baa9d: netperf.Throughput_total_tps -5.6% regression (FYI)

Hi, Linus,

Linus Torvalds writes:

> On Thu, Jun 21, 2018 at 5:10 PM kernel test robot wrote:
>>
>> FYI, we noticed a -5.6% regression of netperf.Throughput_total_tps
>> due to commit 050e9b ("Kbuild: rename CC_STACKPROTECTOR[_STRONG]
>> config variables")
>
> That's perhaps a surprisingly large cost to stack protector, but you
> did move from "no stack protector at all":
>
>> $ grep STACKPROTECTOR config-4.17.0-11782-gbe779f0
>> CONFIG_HAVE_CC_STACKPROTECTOR=y
>> CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
>> # CONFIG_CC_STACKPROTECTOR is not set
>> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
>
> To having the *strong* stack protector enabled:
>
>> $ grep STACKPROTECTOR config-4.17.0-11783-g050e9baa
>> CONFIG_HAVE_CC_STACKPROTECTOR=y
>> CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
>> CONFIG_STACKPROTECTOR=y
>> CONFIG_STACKPROTECTOR_STRONG=y
>> CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
>
> so you're testing the "no overhead" case to the "worst overhead" case.

Do you have interest in some other comparison?

Best Regards,
Huang, Ying

2018-06-21 08:39:56

by Linus Torvalds

Subject: Re: [lkp-robot] [Kbuild] 050e9baa9d: netperf.Throughput_total_tps -5.6% regression (FYI)

On Thu, Jun 21, 2018 at 5:25 PM Huang, Ying wrote:
> Do you have interest in some other comparison?

No, I think the overhead of the strong stackprotector is a bit sad,
but I assume it's because of the nasty code to load the stack canary
from a cacheline that has absolutely nothing else in it.

Oh well.

Linus
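
For readers unfamiliar with the mechanism: a protected function loads the
canary on entry, places a copy next to its locals, and compares that copy
against the original before returning. The following is only a toy model in
Python, not kernel code; the Frame class and call_with_protector are
hypothetical names, and the model says nothing about cache behaviour. It only
shows where the extra load and the extra compare sit.

```python
import secrets

CANARY_SIZE = 8  # bytes; an assumption for this toy model


class Frame:
    """Toy stack frame: a local buffer immediately followed by a canary copy."""

    def __init__(self, buf_size, canary):
        self.buf_size = buf_size
        self.memory = bytearray(buf_size) + bytearray(canary)

    def write(self, data):
        # Deliberately unchecked against buf_size, like an unsafe C copy:
        # a long payload runs past the buffer and into the canary copy.
        for i, byte in enumerate(data[:len(self.memory)]):
            self.memory[i] = byte

    def canary_intact(self, canary):
        return bytes(self.memory[self.buf_size:]) == canary


def call_with_protector(payload, buf_size=16, canary=None):
    if canary is None:
        canary = secrets.token_bytes(CANARY_SIZE)
    frame = Frame(buf_size, canary)        # prologue: load canary, store copy
    frame.write(payload)                   # function body
    if not frame.canary_intact(canary):    # epilogue: reload and compare
        raise RuntimeError("stack smashing detected")
    return "ok"


print(call_with_protector(b"A" * 16))      # fits in the buffer
try:
    call_with_protector(b"A" * 24)         # overruns into the canary copy
except RuntimeError as err:
    print(err)
```

The prologue and epilogue steps run on every call of a protected function,
whether or not an overflow happens, which is why the cost shows up in a
syscall-heavy benchmark.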