Message-ID: <1439913332.4239.134.camel@citrix.com>
Subject: [PATCH RFC] xen: if on Xen, "flatten" the scheduling domain hierarchy
From: Dario Faggioli
To: "xen-devel@lists.xenproject.org"
CC: Juergen Gross, Andrew Cooper, "Luis R. Rodriguez", David Vrabel, Boris Ostrovsky, Konrad Rzeszutek Wilk, linux-kernel, Stefano Stabellini, George Dunlap
Date: Tue, 18 Aug 2015 08:55:32 -0700
Organization: Citrix Inc.

Hey everyone,

So, as a follow-up to what we were discussing in this thread:

 [Xen-devel] PV-vNUMA issue: topology is misinterpreted by the guest
 http://lists.xenproject.org/archives/html/xen-devel/2015-07/msg03241.html

I started looking in more detail at scheduling domains in the Linux
kernel. Now, that thread was about CPUID and vNUMA, and their weird way
of interacting, while what I'm proposing here is completely independent
of both.

In fact, no matter whether vNUMA is supported and enabled, and no matter
whether CPUID is reporting accurate, random, meaningful or completely
misleading information, I think that we should do something about how
scheduling domains are built.

Fact is, unless we use 1:1 pinning that stays immutable across the whole
guest lifetime, scheduling domains should not be constructed, in Linux,
by looking at *any* topology information, because that just does not
make any sense when vCPUs move around.

Let me state this again (hoping to make myself as clear as possible): no
matter how good a shape we put CPUID support in, and no matter how
beautifully and consistently it interacts with vNUMA, licensing
requirements and whatever else, it will always be possible for vCPU #0
and vCPU #3 to be scheduled on two SMT threads at time t1, and on two
different NUMA nodes at time t2. Hence, the Linux scheduler should
really not skew its load balancing logic toward either of those two
situations, as neither of them could be considered correct (since
nothing is!).

For now, this only covers the PV case. The HVM case shouldn't be any
different, but I haven't looked at how to make the same thing happen
there as well.

OVERALL DESCRIPTION
===================
What this RFC patch does is, in the Xen PV case, configure scheduling
domains in such a way that there is only one of them, spanning all the
vCPUs of the guest.

Note that the patch deals directly with scheduling domains, and there is
no need to alter the masks that will then be used for building and
reporting the topology (via CPUID, /proc/cpuinfo, sysfs, etc.). That is
the main difference between it and the patch proposed by Juergen here:
http://lists.xenproject.org/archives/html/xen-devel/2015-07/msg05088.html

This means that when, in future, we fix CPUID handling and make it
comply with whatever logic or requirements we want, that won't have any
unexpected side effects on scheduling domains.
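
To make the "only one domain" idea a bit more concrete, here is a small
userspace model of a topology table. It is only a sketch, not kernel
code: every name in it (vcpu_mask_t, topo_level, sibling_mask, and so
on) is made up for illustration, and the "SMT"/"DIE" levels are just an
assumed example of a topology-derived hierarchy. The only thing it
borrows from the patch is the shape of the table, i.e., an array of
{mask function, name} entries closed by an entry with a NULL mask.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of these names are kernel API. */
#define NR_VCPUS 4

/* One bit per vCPU; stands in for the kernel's struct cpumask. */
typedef unsigned int vcpu_mask_t;

struct topo_level {
	vcpu_mask_t (*mask)(int cpu);	/* span of this level, as seen from 'cpu' */
	const char *name;
};

/* Flat level: every vCPU sees all the vCPUs. */
static vcpu_mask_t all_vcpus_mask(int cpu)
{
	(void)cpu;
	return (1u << NR_VCPUS) - 1;
}

/* Topology-derived level: pretend (0,1) and (2,3) are sibling pairs. */
static vcpu_mask_t sibling_mask(int cpu)
{
	return 3u << (cpu & ~1);
}

/* What a hierarchy built from (possibly bogus) topology could look like. */
static struct topo_level hierarchical[] = {
	{ sibling_mask, "SMT" },
	{ all_vcpus_mask, "DIE" },
	{ NULL, NULL },
};

/* The flattened variant: a single level spanning everything. */
static struct topo_level flat[] = {
	{ all_vcpus_mask, "VCPU" },
	{ NULL, NULL },
};

/* Count the levels by walking the table up to the NULL-mask terminator. */
static int count_levels(const struct topo_level *tl)
{
	int n = 0;

	assert(tl != NULL);
	for (; tl->mask; tl++)
		n++;
	return n;
}
```

In this model, count_levels(flat) is 1 and the single level spans all
four vCPUs from any of them, while the hierarchical table has a first
level that groups vCPUs in pairs, which is exactly the kind of grouping
that stops being meaningful once vCPUs move between pCPUs.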

Information about how the scheduling domains are constructed during boot
is available in `dmesg', if the kernel is booted with the 'sched_debug'
parameter. It is also possible to look at
/proc/sys/kernel/sched_domain/cpu*, and at /proc/schedstat.

With the patch applied, only one scheduling domain is created, called
the 'VCPU' domain, spanning all the guest's (or Dom0's) vCPUs. You can
tell that from the fact that every cpu* directory in
/proc/sys/kernel/sched_domain/ has only one subdirectory ('domain0'),
with all the tweaks and the tunables for our scheduling domain.

EVALUATION
==========
I've tested this with UnixBench, and by looking at Xen build time, on
hosts with 16, 24 and 48 pCPUs. I've run the benchmarks in Dom0 only,
for now, but I plan to re-run them in DomUs soon (Juergen may be doing
something similar to this in DomU already, AFAIU).

I've run the benchmarks with and without the patch applied ('patched'
and 'vanilla', respectively, in the tables below), and with different
numbers of build jobs (in the case of the Xen build) or of parallel
copies of the benchmarks (in the case of UnixBench).

What I get from the numbers is that the patch almost always brings
benefits, in some cases even huge ones. There are a couple of cases
where we regress, but always only slightly so, especially when compared
to the magnitude of some of the improvements that we get.

Bear also in mind that these results are gathered from Dom0, and without
any overcommitment at the vCPU level (i.e., nr. vCPUs == nr. pCPUs). If
we move things to DomU and do overcommit at the Xen scheduler level, I
am expecting even better results.

RESULTS
=======
To have a quick idea of how a benchmark went, look at the
'% improvement' (or '% increase') row of each table.

I'll put these results online, in a googledoc spreadsheet or something
like that, to make them easier to read, as soon as possible.
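
For reference, the percentage rows in the tables below appear to be
derived from the averages (or from the Index Score) in the usual way.
This is inferred from the numbers rather than stated anywhere, so take
it as an assumption: for build times, where lower is better, it is
(vanilla - patched) / vanilla * 100, and for the UnixBench Index Score,
where higher is better, it is (patched - vanilla) / vanilla * 100. The
two helper names in this sketch are hypothetical.

```c
#include <assert.h>

/* Hypothetical helpers, not part of the patch. */

/*
 * Lower is better (e.g., build time in seconds): a positive result
 * means the patched kernel was faster.
 */
static double pct_improvement_lower_better(double vanilla, double patched)
{
	assert(vanilla > 0.0);
	return (vanilla - patched) / vanilla * 100.0;
}

/*
 * Higher is better (e.g., UnixBench Index Score): a positive result
 * means the patched kernel scored higher.
 */
static double pct_increase_higher_better(double vanilla, double patched)
{
	assert(vanilla > 0.0);
	return (patched - vanilla) / vanilla * 100.0;
}
```

As a check against the first box, the "-j1" build averages 154.067
(vanilla) and 153.241 (patched) give about 0.536, and the 1-copy Index
Scores 496.8 and 567.5 give about 14.231, matching the reported rows.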

*** Intel(R) Xeon(R) E5620 @ 2.40GHz
*** pCPUs      16        DOM0 vCPUS  16
*** RAM        12285 MB  DOM0 Memory 9955 MB
*** NUMA nodes 2

===================================================================================================================
MAKE XEN (lower == better)
===================================================================================================================
# of build jobs   -j1                 -j6                 -j8                 -j16**              -j24
vanilla/patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-------------------------------------------------------------------------------------------------------------------
                  153.72    152.41    35.33     34.93     30.7      30.33     26.79     25.97     26.88     26.21
                  153.81    152.76    35.37     34.99     30.81     30.36     26.83     26.08     27        26.24
                  153.93    152.79    35.37     35.25     30.92     30.39     26.83     26.13     27.01     26.28
                  153.94    152.94    35.39     35.28     31.05     30.43     26.9      26.14     27.01     26.44
                  153.98    153.06    35.45     35.31     31.17     30.5      26.95     26.18     27.02     26.55
                  154.01    153.23    35.5      35.35     31.2      30.59     26.98     26.2      27.05     26.61
                  154.04    153.34    35.56     35.42     31.45     30.76     27.12     26.21     27.06     26.78
                  154.16    153.5     37.79     35.58     31.68     30.83     27.16     26.23     27.16     26.78
                  154.18    153.71    37.98     35.61     33.73     30.9      27.49     26.32     27.16     26.8
                  154.9     154.67    38.03     37.64     34.69     31.69     29.82     26.38     27.2      28.63
-------------------------------------------------------------------------------------------------------------------
Avg.              154.067   153.241   36.177    35.536    31.74     30.678    27.287    26.184    27.055    26.732
-------------------------------------------------------------------------------------------------------------------
Std. Dev.         0.325     0.631     1.215     0.771     1.352     0.410     0.914     0.116     0.095     0.704
-------------------------------------------------------------------------------------------------------------------
% improvement     0.536               1.772               3.346               4.042               1.194
===================================================================================================================

=========================================================================================================================================
UNIXBENCH
=========================================================================================================================================
# parallel copies                       1 parallel          6 parallel          8 parallel          16 parallel**       24 parallel
vanilla/patched                         vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-----------------------------------------------------------------------------------------------------------------------------------------
Dhrystone 2 using register variables    2302.2    2302.1    13157.8   12262.4   15691.5   15860.1   18927.7   19078.5   18654.3   18855.6
Double-Precision Whetstone              620.2     620.2     3481.2    3566.9    4669.2    4551.5    7610.1    7614.3    11558.9   11561.3
Execl Throughput                        184.3     186.7     884.6     905.3     1168.4    1213.6    2134.6    2210.2    2250.9    2265
File Copy 1024 bufsize 2000 maxblocks   780.8     783.3     1243.7    1255.5    1250.6    1215.7    1080.9    1094.2    1069.8    1062.5
File Copy 256 bufsize 500 maxblocks     479.8     482.8     781.8     803.6     806.4     781       682.9     707.7     698.2     694.6
File Copy 4096 bufsize 8000 maxblocks   1617.6    1593.5    2739.7    2943.4    2818.3    2957.8    2389.6    2412.6    2371.6    2423.8
Pipe Throughput                         363.9     361.6     2068.6    2065.6    2622      2633.5    4053.3    4085.9    4064.7    4076.7
Pipe-based Context Switching            70.6      207.2     369.1     1126.8    623.9     1431.3    1970.4    2082.9    1963.8    2077
Process Creation                        103.1     135       503       677.6     618.7     855.4     1138      1113.7    1195.6    1199
Shell Scripts (1 concurrent)            723.2     765.3     4406.4    4334.4    5045.4    5002.5    5861.9    5844.2    5958.8    5916.1
Shell Scripts (8 concurrent)            2243.7    2715.3    5694.7    5663.6    5694.7    5657.8    5637.1    5600.5    5582.9    5543.6
System Call Overhead                    330       330.1     1669.2    1672.4    2028.6    1996.6    2920.5    2947.1    2923.9    2952.5
System Benchmarks Index Score           496.8     567.5     1861.9    2106      2220.3    2441.3    2972.5    3007.9    3103.4    3125.3
-----------------------------------------------------------------------------------------------------------------------------------------
% increase (of the Index Score)         14.231              13.110              9.954               1.191               0.706
=========================================================================================================================================

*** Intel(R) Xeon(R) X5650 @ 2.67GHz
*** pCPUs      24        DOM0 vCPUS  16
*** RAM        36851 MB  DOM0 Memory 9955 MB
*** NUMA nodes 2

===================================================================================================================
MAKE XEN (lower == better)
===================================================================================================================
# of build jobs   -j1                 -j8                 -j12                -j24**              -j32
vanilla/patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-------------------------------------------------------------------------------------------------------------------
                  119.49    119.47    23.37     23.29     20.12     19.85     17.99     17.9      17.82     17.8
                  119.59    119.64    23.52     23.31     20.16     19.99     18.19     18.05     18.23     17.89
                  119.59    119.65    23.53     23.35     20.19     20.08     18.26     18.09     18.35     17.91
                  119.72    119.75    23.63     23.41     20.2      20.14     18.54     18.1      18.4      17.95
                  119.95    119.86    23.68     23.42     20.24     20.19     18.57     18.15     18.44     18.03
                  119.97    119.9     23.72     23.51     20.38     20.31     18.61     18.21     18.49     18.03
                  119.97    119.91    25.03     23.53     20.38     20.42     18.75     18.28     18.51     18.08
                  120.01    119.98    25.05     23.93     20.39     21.69     19.99     18.49     18.52     18.6
                  120.24    119.99    25.12     24.19     21.67     21.76     20.08     19.74     19.73     19.62
                  120.66    121.22    25.16     25.36     21.94     21.85     20.26     20.3      19.92     19.81
-------------------------------------------------------------------------------------------------------------------
Avg.              119.919   119.937   24.181    23.73     20.567    20.628    18.924    18.531    18.641    18.372
-------------------------------------------------------------------------------------------------------------------
Std. Dev.         0.351     0.481     0.789     0.642     0.663     0.802     0.851     0.811     0.658     0.741
-------------------------------------------------------------------------------------------------------------------
% improvement     -0.015              1.865               -0.297              2.077               1.443
===================================================================================================================

=========================================================================================================================================
UNIXBENCH
=========================================================================================================================================
# parallel copies                       1 parallel          8 parallel          12 parallel         24 parallel**       32 parallel
vanilla/patched                         vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-----------------------------------------------------------------------------------------------------------------------------------------
Dhrystone 2 using register variables    2650.1    2664.6    18967.8   19060.4   27534.1   27046.8   30077.9   30110.6   30542.1   30358.7
Double-Precision Whetstone              713.7     713.5     5463.6    5455.1    7863.9    7923.8    12725.1   12727.8   17474.3   17463.3
Execl Throughput                        280.9     283.8     1724.4    1866.5    2029.5    2367.6    2370      2521.3    2453      2506.8
File Copy 1024 bufsize 2000 maxblocks   891.1     894.2     1423      1457.7    1385.6    1482.2    1226.1    1224.2    1235.9    1265.5
File Copy 256 bufsize 500 maxblocks     546.9     555.4     949       972.1     882.8     878.6     821.9     817.7     784.7     810.8
File Copy 4096 bufsize 8000 maxblocks   1743.4    1722.8    3406.5    3438.9    3314.3    3265.9    2801.9    2788.3    2695.2    2781.5
Pipe Throughput                         426.8     423.4     3207.9    3234      4635.1    4708.9    7326      7335.3    7327.2    7319.7
Pipe-based Context Switching            110.2     223.5     680.8     1602.2    998.6     2324.6    3122.1    3252.7    3128.6    3337.2
Process Creation                        130.7     224.4     1001.3    1043.6    1209      1248.2    1337.9    1380.4    1338.6    1280.1
Shell Scripts (1 concurrent)            1140.5    1257.5    5462.8    6146.4    6435.3    7206.1    7425.2    7636.2    7566.1    7636.6
Shell Scripts (8 concurrent)            3492      3586.7    7144.9    7307      7258      7320.2    7295.1    7296.7    7248.6    7252.2
System Call Overhead                    387.7     387.5     2398.4    2367      2793.8    2752.7    3735.7    3694.2    3752.1    3709.4
System Benchmarks Index Score           634.8     712.6     2725.8    3005.7    3232.4    3569.7    3981.3    4028.8    4085.2    4126.3
-----------------------------------------------------------------------------------------------------------------------------------------
% increase (of the Index Score)         12.256              10.269              10.435              1.193               1.006
=========================================================================================================================================

*** Intel(R) Xeon(R) X5650 @ 2.67GHz
*** pCPUs      48        DOM0 vCPUS  16
*** RAM        393138 MB DOM0 Memory 9955 MB
*** NUMA nodes 2

===================================================================================================================
MAKE XEN (lower == better)
===================================================================================================================
# of build jobs   -j1                 -j20                -j24                -j48**              -j62
vanilla/patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-------------------------------------------------------------------------------------------------------------------
                  267.78    233.25    36.53     35.53     35.98     34.99     33.46     32.13     33.57     32.54
                  268.42    233.92    36.82     35.56     36.12     35.2      34.24     32.24     33.64     32.56
                  268.85    234.39    36.92     35.75     36.15     35.35     34.48     32.86     33.67     32.74
                  268.98    235.11    36.96     36.01     36.25     35.46     34.73     32.89     33.97     32.83
                  269.03    236.48    37.04     36.16     36.45     35.63     34.77     32.97     34.12     33.01
                  269.54    237.05    40.33     36.59     36.57     36.15     34.97     33.09     34.18     33.52
                  269.99    238.24    40.45     36.78     36.58     36.22     34.99     33.69     34.28     33.63
                  270.11    238.48    41.13     39.98     40.22     36.24     38        33.92     34.35     33.87
                  270.96    239.07    41.66     40.81     40.59     36.35     38.99     34.19     34.49     37.24
                  271.84    240.89    42.07     41.24     40.63     40.06     39.07     36.04     34.69     37.59
-------------------------------------------------------------------------------------------------------------------
Avg.              269.55    236.688   38.991    37.441    37.554    36.165    35.77     33.402    34.096    33.953
-------------------------------------------------------------------------------------------------------------------
Std. Dev.         1.213     2.503     2.312     2.288     2.031     1.452     2.079     1.142     0.379     1.882
-------------------------------------------------------------------------------------------------------------------
% improvement     12.191              3.975               3.699               6.620               0.419
===================================================================================================================

=========================================================================================================================================
UNIXBENCH
=========================================================================================================================================
# parallel copies                       1 parallel          20 parallel         24 parallel         48 parallel**       62 parallel
vanilla/patched                         vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched   vanilla   patched
-----------------------------------------------------------------------------------------------------------------------------------------
Dhrystone 2 using register variables    2037.6    2037.5    39615.4   38990.5   43976.8   44660.8   51238     51117.4   51672.5   52332.5
Double-Precision Whetstone              525.1     521.6     10389.7   10429.3   12236.5   12188.8   20897.1   20921.9   26957.5   27035.7
Execl Throughput                        112.1     113.6     799       786.5     715.1     702.3     758.2     744       756.3     765.6
File Copy 1024 bufsize 2000 maxblocks   605.5     622       671.6     630.4     624.3     605.8     599       581.2     447.4     433.7
File Copy 256 bufsize 500 maxblocks     384       382.7     447.2     429.1     464.5     404.3     416.1     428.5     313.8     305.6
File Copy 4096 bufsize 8000 maxblocks   883.7     1100.5    1326      1307      1343.2    1305.9    1260.4    1245.3    1001.4    920.1
Pipe Throughput                         283.7     282.8     5636.6    5634.2    6551      6571      10390     10437.4   10459     10498.9
Pipe-based Context Switching            41.5      143.7     518.5     1899.1    737.5     2068.8    2877.1    3093.2    2949.3    3184.1
Process Creation                        58.5      78.4      370.7     389.4     338       355.8     380.1     375.5     383.8     369.6
Shell Scripts (1 concurrent)            443.7     475.5     1901.9    1945      1765.1    1789.6    2417      2354.4    2395.3    2362.2
Shell Scripts (8 concurrent)            1283.1    1319.1    2265.4    2209.8    2263.3    2209      2202.7    2216.1    2190.4    2206.5
System Call Overhead                    254.1     254.3     891.6     881.6     971.1     958.3     1446.8    1409.5    1461.7    1429.2
System Benchmarks Index Score           340.8     398.6     1690.6    1866.3    1770.6    1902      2303.5    2300.8    2208.3    2189.8
-----------------------------------------------------------------------------------------------------------------------------------------
% increase (of the Index Score)         16.960              10.393              7.421               -0.117              -0.838
=========================================================================================================================================

OVERHEAD EVALUATION
===================
For the Xen build case only, I quickly checked some scheduling related
metrics with `perf stat'. I only did this on the biggest box, for now,
as it is there that we show both the largest improvement (in the "-j1"
case) and a couple of slight regressions (although those happen in
UnixBench).

We see that using only one, "flat", scheduling domain always means fewer
migrations, while it seems to increase the number of context switches.

========================================================================================================================================================================
          "-j1"                                   "-j24"                                  "-j48"                                  "-j62"
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          cpu-migrations      context-switches    cpu-migrations      context-switches    cpu-migrations      context-switches    cpu-migrations      context-switches
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
vanilla   21,242 (0.074 K/s)  46,196 (0.160 K/s)  22,992 (0.066 K/s)  48,684 (0.140 K/s)  24,516 (0.064 K/s)  63,391 (0.166 K/s)  23,164 (0.062 K/s)  68,239 (0.182 K/s)
patched   19,522 (0.077 K/s)  50,871 (0.201 K/s)  20,593 (0.059 K/s)  57,688 (0.167 K/s)  21,137 (0.056 K/s)  63,822 (0.169 K/s)  20,830 (0.055 K/s)  69,783 (0.185 K/s)
========================================================================================================================================================================

REQUEST FOR COMMENTS
====================
Basically, the kind of feedback I'd be really glad to hear is:
 - what you guys think of the approach,
 - whether you think, looking at this preliminary set of numbers, that
   this is something worth investigating further,
 - if yes, what other workloads and benchmarks it would make sense to
   throw at it.

Thanks and Regards,
Dario

-- 
<> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

---
commit 3240f68a08511c3db616cfc2a653e6761e23ff7f
Author: Dario Faggioli
Date:   Tue Aug 18 08:41:38 2015 -0700

    xen: if on Xen, "flatten" the scheduling domain hierarchy

    With this patch applied, only one scheduling domain is
    created (called the 'VCPU' domain) spanning all the
    guest's vCPUs.

    This is because, since vCPUs are moving around on pCPUs,
    there is no point in building a full hierarchy based on
    *any* topology information, which will just never be
    accurate. Having only one "flat" domain is really the
    only thing that looks sensible.

    Signed-off-by: Dario Faggioli

diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 8648438..34f39f1 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -55,6 +55,21 @@ static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id);
 static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id);
 static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id);
 
+const struct cpumask *xen_pcpu_sched_domain_mask(int cpu)
+{
+	return cpu_online_mask;
+}
+
+static struct sched_domain_topology_level xen_sched_domain_topology[] = {
+	{ xen_pcpu_sched_domain_mask, SD_INIT_NAME(VCPU) },
+	{ NULL, },
+};
+
+static void xen_set_sched_topology(void)
+{
+	set_sched_topology(xen_sched_domain_topology);
+}
+
 /*
  * Reschedule call back.
  */
@@ -335,6 +350,8 @@ static void __init xen_smp_prepare_cpus(unsigned int max_cpus)
 	}
 	set_cpu_sibling_map(0);
 
+	xen_set_sched_topology();
+
 	if (xen_smp_intr_init(0))
 		BUG();
 