From: Dario Faggioli
To: "mgorman@suse.de"
CC: "torvalds@linux-foundation.org", "linux-kernel@vger.kernel.org", David Vrabel, "Xen-devel@lists.xen.org", Wei Liu
Subject: Re: [Xen-devel] NUMA_BALANCING and Xen PV guest regression in 3.20-rc0
Date: Mon, 23 Feb 2015 15:13:48 +0000
Message-ID: <1424704425.5819.38.camel@citrix.com>
References: <54E5DFED.9050700@citrix.com> <20150219170104.GS3087@suse.de>
In-Reply-To: <20150219170104.GS3087@suse.de>

Hi everyone,

On Thu, 2015-02-19 at 17:01 +0000, Mel Gorman wrote:
> On Thu, Feb 19, 2015 at 01:06:53PM +0000, David Vrabel wrote:

> I cannot think of a reason why this would fail for NUMA balancing on bare
> metal. The PAGE_NONE protection clears the present bit on p[te|md]_modify
> so the expectations are matched before or after the patch is applied.
> So, for bare metal at least
>
> Acked-by: Mel Gorman
>
> I *think* this will work ok with Xen but I cannot 100% convince myself.
> I'm adding Wei Liu to the cc who may have a Xen PV setup handy that
> supports NUMA and may be able to test the patch to confirm.
>
I'm not Wei, but I've been able to test a kernel with David's patch in
the following conditions:

 1. as Dom0 kernel, when Xen does not have any virtual NUMA support
 2. as DomU PV kernel, when Xen does not have any virtual NUMA support
 3. as DomU PV kernel, when Xen _does_ _have_ virtual NUMA support
    (i.e., Wei's code)

Cases 1. and 2. have been, I believe, tested by David already, but
anyways... :-)

Case 3. worked well for me, as the following commands show. In fact,
with this in guest config file:

 vnuma = [ [ "pnode=0","size=1000","vcpus=0-3","vdistances=10,20" ],
           [ "pnode=1","size=1000","vcpus=4-7","vdistances=20,10" ],
         ]

This is what I get from inside the guest:

 root@test-pv:~# numactl --hardware
 available: 2 nodes (0-1)
 node 0 cpus: 0 1 2 3
 node 0 size: 951 MB
 node 0 free: 868 MB
 node 1 cpus: 4 5 6 7
 node 1 size: 968 MB
 node 1 free: 924 MB
 node distances:
 node   0   1
   0:  10  20
   1:  20  10

And this is it from the host:

 root@Zhaman:~# xl debug-keys u ; xl dmesg |tail -12
 (XEN) Memory location of each domain:
 (XEN) Domain 0 (total: 1047417):
 (XEN)     Node 0: 1031009
 (XEN)     Node 1: 16408
 (XEN) Domain 1 (total: 512000):
 (XEN)     Node 0: 256000
 (XEN)     Node 1: 256000
 (XEN)     2 vnodes, 8 vcpus, guest physical layout:
 (XEN)         0: pnode 0, vcpus 0-3
 (XEN)             0000000000000000 - 000000003e800000
 (XEN)         1: pnode 1, vcpus 4-7
 (XEN)             000000003e800000 - 000000007d000000

Still inside the guest, I see this:

 root@test-pv:~# cat /proc/sys/kernel/numa_balancing
 1

And this:

 root@test-pv:~# grep numa /proc/vmstat
 numa_hit 65987
 numa_miss 0
 numa_foreign 0
 numa_interleave 14473
 numa_local 58642
 numa_other 7345
 numa_pte_updates 596
 numa_huge_pte_updates 0
 numa_hint_faults 479
 numa_hint_faults_local 420
 numa_pages_migrated 51

So, yes, I would say this works with Xen; is that correct, Mel?

I'll also try running more complex stuff like 'perf bench numa'
inside the guest and see what happens...

Regards,
Dario
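
For anyone repeating the test, the /proc/vmstat counters quoted above can be summarised with a few lines of Python. This is only a minimal sketch, assuming Python 3 is available in the guest; the function names `parse_numa_vmstat` and `local_hint_fault_ratio` are hypothetical helpers invented for illustration, not part of any kernel or Xen tool. The sample text is copied from the output in this message.

```python
def parse_numa_vmstat(text):
    """Return {counter: value} for each 'numa_*' line of /proc/vmstat-style text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Each vmstat line is "<name> <integer>"; keep only the NUMA counters.
        if len(parts) == 2 and parts[0].startswith("numa_"):
            counters[parts[0]] = int(parts[1])
    return counters


def local_hint_fault_ratio(counters):
    """Fraction of NUMA hinting faults that were node-local, or None if there were none."""
    total = counters.get("numa_hint_faults", 0)
    if total == 0:
        return None
    return counters.get("numa_hint_faults_local", 0) / total


# Counters as reported inside the PV guest in this message.
SAMPLE = """\
numa_pte_updates 596
numa_huge_pte_updates 0
numa_hint_faults 479
numa_hint_faults_local 420
numa_pages_migrated 51
"""

if __name__ == "__main__":
    counters = parse_numa_vmstat(SAMPLE)
    ratio = local_hint_fault_ratio(counters)
    print(f"hint faults: {counters['numa_hint_faults']}, local: {ratio:.1%}")
```

To run it against a live system, the sample string could be replaced with the contents of /proc/vmstat. Non-zero numa_pte_updates, numa_hint_faults and numa_pages_migrated are what indicate that automatic NUMA balancing is actually doing work in the guest.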