Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1762059AbcLPXNe (ORCPT );
	Fri, 16 Dec 2016 18:13:34 -0500
Received: from mga02.intel.com ([134.134.136.20]:44065 "EHLO mga02.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1757808AbcLPXNR (ORCPT );
	Fri, 16 Dec 2016 18:13:17 -0500
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.33,360,1477983600"; d="scan'208";a="40673321"
From: Vikas Shivappa
To: vikas.shivappa@intel.com, vikas.shivappa@linux.intel.com
Cc: linux-kernel@vger.kernel.org, x86@kernel.org, tglx@linutronix.de,
	peterz@infradead.org, ravi.v.shankar@intel.com, tony.luck@intel.com,
	fenghua.yu@intel.com, andi.kleen@intel.com, davidcc@google.com,
	eranian@google.com, hpa@zytor.com
Subject: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation
Date: Fri, 16 Dec 2016 15:12:55 -0800
Message-Id: <1481929988-31569-2-git-send-email-vikas.shivappa@linux.intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1481929988-31569-1-git-send-email-vikas.shivappa@linux.intel.com>
References: <1481929988-31569-1-git-send-email-vikas.shivappa@linux.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4337
Lines: 108

Add documentation for the usage of cqm and mbm events, continuous
monitoring, and lazy and non-lazy monitoring.
Signed-off-by: Vikas Shivappa
---
 Documentation/x86/intel_rdt_mon_ui.txt | 91 ++++++++++++++++++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100644 Documentation/x86/intel_rdt_mon_ui.txt

diff --git a/Documentation/x86/intel_rdt_mon_ui.txt b/Documentation/x86/intel_rdt_mon_ui.txt
new file mode 100644
index 0000000..7d68a65
--- /dev/null
+++ b/Documentation/x86/intel_rdt_mon_ui.txt
@@ -0,0 +1,91 @@
+User Interface for Resource Monitoring in Intel Resource Director Technology
+
+Vikas Shivappa
+David Carrillo-Cisneros
+Stephane Eranian
+
+This feature is enabled by the CONFIG_INTEL_RDT_M Kconfig option and is
+indicated by the x86 /proc/cpuinfo flags cqm_llc, cqm_occup_llc,
+cqm_mbm_total and cqm_mbm_local.
+
+Resource Monitoring
+-------------------
+Resource Monitoring includes cqm (cache quality monitoring) and
+mbm (memory bandwidth monitoring) and uses the perf interface. A
+lightweight interface to enable monitoring without perf is provided as
+well.
+
+CQM provides the OS/VMM a way to monitor LLC occupancy. It measures the
+amount of L3 cache filled per task or cgroup.
+
+MBM provides the OS/VMM a way to monitor bandwidth from one level of
+cache to another. The current patches support L3 external bandwidth
+monitoring, with both 'local bandwidth' and 'total bandwidth'
+monitoring for the socket. Local bandwidth measures the amount of data
+sent through the memory controller on the socket, while total bandwidth
+measures the total system bandwidth.
+
+To check the monitoring events enabled:
+
+# ./tools/perf/perf list | grep -i cqm
+intel_cqm/llc_occupancy/		[Kernel PMU event]
+intel_cqm/local_bytes/			[Kernel PMU event]
+intel_cqm/total_bytes/			[Kernel PMU event]
+
+Monitoring tasks and cgroups using perf
+---------------------------------------
+Monitoring tasks and cgroups is like using any other perf event.
+
+# perf stat -I 1000 -e intel_cqm/local_bytes/ -p PID1
+
+This will monitor the local_bytes event of task PID1 and report once
+every 1000ms.
+
+# mkdir /sys/fs/cgroup/perf_event/p1
+# echo PID1 > /sys/fs/cgroup/perf_event/p1/tasks
+# echo PID2 > /sys/fs/cgroup/perf_event/p1/tasks
+
+# perf stat -I 1000 -e intel_cqm/llc_occupancy/ -a -G p1
+
+This will monitor the llc_occupancy event of the perf cgroup p1 in
+interval mode.
+
+Hierarchical monitoring should work just like other events: users can
+monitor a task within a cgroup together with the cgroup itself, or
+monitor different cgroups in the same hierarchy together.
+
+Continuous monitoring
+---------------------
+A new file, perf_event.cqm_cont_monitoring, is added to the perf_event
+cgroup to enable cqm continuous monitoring. Writing 1 to this file
+starts monitoring of the cgroup without perf being launched. This can
+be used for long-term lightweight monitoring of tasks/cgroups.
+
+To enable continuous monitoring of cgroup p1:
+# echo 1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
+
+To disable continuous monitoring of cgroup p1:
+# echo 0 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_cont_monitoring
+
+perf can be used to read the counters at the end of monitoring.
+
+LAZY and NOLAZY Monitoring
+--------------------------
+LAZY:
+By default, when monitoring is enabled, RMIDs are not allocated
+immediately; they are allocated lazily, only at the first sched_in.
+There are 2-4 RMIDs per logical processor on each package. So if each
+package of a dual-package system has 48 logical processors, there would
+be up to 192 RMIDs on each package, for a total of 192x2 RMIDs.
+RMIDs can therefore run out, in which case a read reports an error
+because no RMID was available to monitor the event.
+
+NOLAZY:
+When a user wants guaranteed monitoring, they can set the 'monitoring
+mask', which specifies the packages to monitor.
+The RMIDs are statically allocated at open, and failure is indicated
+if RMIDs are not available.
+
+To specify monitoring on package 0 and package 1:
+# echo 0-1 > /sys/fs/cgroup/perf_event/p1/perf_event.cqm_mon_mask
+
+An error is returned if packages that are not online are specified.
-- 
1.9.1
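[Editorial aside: the RMID budget arithmetic in the LAZY section above can be sketched in shell. The figures below are the example values from the text (the upper bound of 4 RMIDs per logical processor, 48 logical processors per package, 2 packages), not values probed from hardware.]

```shell
#!/bin/sh
# RMID budget estimate using the example figures from the LAZY section
# (assumed values, not read from hardware):
RMIDS_PER_CPU=4    # upper bound of the "2-4 RMIDs per logical processor"
CPUS_PER_PKG=48    # logical processors per package in the example
PACKAGES=2         # dual-package system

PER_PKG=$((RMIDS_PER_CPU * CPUS_PER_PKG))   # up to 192 RMIDs per package
TOTAL=$((PER_PKG * PACKAGES))               # 192x2 = 384 RMIDs in total

echo "up to $PER_PKG RMIDs per package, $TOTAL total"
```

Once all RMIDs on a package are in use, any further lazily-allocated event on that package has nothing to be assigned, which is why a read can report an error under LAZY monitoring.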