2021-09-12 13:16:00

by kernel test robot

Subject: [drm/i915] 36b6b68169: phoronix-test-suite.supertuxkart.1024x768.Windowed.Basic.1.ZenGarden.frames_per_second 14.4% improvement



Greetings,

FYI, we noticed a 14.4% improvement of phoronix-test-suite.supertuxkart.1024x768.Windowed.Basic.1.ZenGarden.frames_per_second due to commit:


commit: 36b6b6816989cf6f468eea82694e83211a066fa4 ("drm/i915: Fix MOCS PTE setting for gen9+")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: phoronix-test-suite
on test machine: 12 threads 1 socket Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz with 16G memory
with the following parameters:

need_x: true
test: supertuxkart-1.5.2
option_a: 1024 x 768
option_b: Windowed
option_c: Basic
option_d: 1
option_e: Zen Garden [Low poly]
cpufreq_governor: performance
ucode: 0xde

test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
bin/lkp run generated-yaml-file

=========================================================================================
compiler/cpufreq_governor/kconfig/need_x/option_a/option_b/option_c/option_d/option_e/rootfs/tbox_group/test/testcase/ucode:
gcc-9/performance/x86_64-rhel-8.3/true/1024 x 768/Windowed/Basic/1/Zen Garden [Low poly]/debian-x86_64-phoronix/lkp-cfl-d1/supertuxkart-1.5.2/phoronix-test-suite/0xde

commit:
d46b60a2e8 ("drm/i915: Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init")
36b6b68169 ("drm/i915: Fix MOCS PTE setting for gen9+")

d46b60a2e8d246f1 36b6b6816989cf6f468eea82694
---------------- ---------------------------
         %stddev    %change          %stddev
             \         |                 \
    638.12 ±  2%     +14.4%     730.32 ±  2%  phoronix-test-suite.supertuxkart.1024x768.Windowed.Basic.1.ZenGarden.frames_per_second
      5.98 ±  5%     +52.2%       9.10 ±  3%  turbostat.GFXWatt
     31.18 ±  6%     +12.8%      35.19 ±  2%  turbostat.PkgWatt
      0.08 ±  6%     +29.7%       0.11 ± 15%  perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      5.42 ±  3%     -12.0%       4.77 ±  5%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      6.68 ±  4%     -10.6%       5.97 ±  8%  perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork
      6287 ±  3%     +11.0%       6976 ±  3%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      5.40 ±  3%     -12.1%       4.75 ±  5%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      6.67 ±  4%     -10.6%       5.96 ±  8%  perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork
     37.42 ±  2%     -12.2       25.26 ±  5%  perf-stat.i.cache-miss-rate%
  78410775 ±  3%     -26.6%   57551017 ±  6%  perf-stat.i.cache-misses
      0.79           +10.8%       0.88 ±  2%  perf-stat.i.ipc
      0.00 ± 63%      +0.0        0.00 ± 93%  perf-stat.i.node-load-miss-rate%
   7810539 ±  3%     -40.4%    4657311 ± 10%  perf-stat.i.node-loads
   1653601 ±  2%     -26.6%    1213931 ±  5%  perf-stat.i.node-stores
      0.82 ± 11%      -0.4        0.47 ± 45%  perf-profile.calltrace.cycles-pp.eb_lookup_vmas.i915_gem_do_execbuffer.i915_gem_execbuffer2_ioctl.drm_ioctl_kernel.drm_ioctl
      0.76 ± 12%      +0.2        0.97 ± 11%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      1.11 ± 15%      -0.4        0.74 ± 20%  perf-profile.children.cycles-pp.__queue_work
      0.87 ± 17%      -0.4        0.51 ± 26%  perf-profile.children.cycles-pp.__i915_vma_retire
      0.91 ± 17%      -0.3        0.59 ± 24%  perf-profile.children.cycles-pp.queue_work_on
      0.83 ± 11%      -0.3        0.55 ± 10%  perf-profile.children.cycles-pp.eb_lookup_vmas
      0.68 ± 26%      -0.2        0.46 ± 24%  perf-profile.children.cycles-pp.process_csb
      0.60 ± 30%      -0.2        0.37 ± 20%  perf-profile.children.cycles-pp.execlists_schedule_out
      0.72 ± 24%      -0.2        0.51 ± 23%  perf-profile.children.cycles-pp.execlists_submission_tasklet
      0.36 ± 16%      -0.1        0.22 ± 21%  perf-profile.children.cycles-pp.run_posix_cpu_timers
      0.43 ± 15%      -0.1        0.29 ± 22%  perf-profile.children.cycles-pp.eb_pin_engine
      0.39 ± 14%      -0.1        0.26 ± 23%  perf-profile.children.cycles-pp.__intel_context_do_pin_ww
      0.33 ±  5%      -0.1        0.22 ± 25%  perf-profile.children.cycles-pp.__active_lookup
      0.24 ± 13%      -0.1        0.15 ± 25%  perf-profile.children.cycles-pp.__perf_event_header__init_id
      0.11 ± 11%      -0.1        0.05 ± 76%  perf-profile.children.cycles-pp.security_socket_recvmsg
      0.07 ± 50%      +0.0        0.10 ± 15%  perf-profile.children.cycles-pp.irqentry_exit
      0.19 ± 10%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.printk
      0.19 ± 10%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.vprintk_emit
      0.19 ± 10%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.console_unlock
      0.19 ± 10%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.serial8250_console_write
      0.19 ± 10%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.uart_console_write
      0.08 ± 33%      +0.1        0.13 ± 34%  perf-profile.children.cycles-pp.rcu_gp_kthread
      0.64 ± 12%      +0.2        0.80 ±  7%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.87 ± 17%      -0.4        0.51 ± 27%  perf-profile.self.cycles-pp.__i915_vma_retire
      0.58 ± 17%      -0.2        0.38 ±  9%  perf-profile.self.cycles-pp.eb_lookup_vmas
      0.36 ± 16%      -0.1        0.22 ± 21%  perf-profile.self.cycles-pp.run_posix_cpu_timers
      0.32 ±  6%      -0.1        0.20 ± 28%  perf-profile.self.cycles-pp.__active_lookup
      0.39 ± 15%      -0.1        0.28 ± 16%  perf-profile.self.cycles-pp.__radix_tree_lookup
      0.12 ± 18%      -0.1        0.06 ± 48%  perf-profile.self.cycles-pp.kmem_cache_alloc
      0.12 ± 16%      -0.0        0.08 ± 20%  perf-profile.self.cycles-pp.rcu_nmi_enter
      0.59 ± 12%      +0.2        0.74 ± 10%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
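
For readers checking the %change column by hand, here is a minimal Python sketch. It assumes, and the report does not state this, that entries carrying a "%" sign are relative changes between the two per-commit means, while entries without one (rows whose metric is already a percentage, such as perf-stat.i.cache-miss-rate%) are plain differences in percentage points. The function names are illustrative only and are not part of lkp-tests.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def point_change(before: float, after: float) -> float:
    """Absolute difference, for metrics already expressed in percent."""
    return after - before


# Means copied from the table above (parent commit first, then 36b6b68169).
fps = relative_change(638.12, 730.32)               # table prints +14.4%
gfx_watt = relative_change(5.98, 9.10)              # table prints +52.2%
cache_misses = relative_change(78410775, 57551017)  # table prints -26.6%
miss_rate = point_change(37.42, 25.26)              # table prints -12.2

print(f"frames_per_second: {fps:+.1f}%")
print(f"GFXWatt:           {gfx_watt:+.1f}%")
print(f"cache-misses:      {cache_misses:+.1f}%")
print(f"cache-miss-rate%:  {miss_rate:+.1f} points")
```

Because the table prints rounded means, recomputing from them can differ from the printed change in the last digit; turbostat.PkgWatt, for example, comes out near 12.9% here against the printed +12.8%.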





[ASCII plot; its column layout was lost in this archive. The y-axis runs from 0 to 800. The O markers sit at about 700 and above, and the +/. trace lies between about 600 and 700.]


[*] bisect-good sample
[O] bisect-bad sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (8.70 kB)
config-5.9.0-rc2-00398-g36b6b6816989c (171.18 kB)
job-script (7.81 kB)
job.yaml (5.20 kB)
reproduce (314.00 B)