Subject: Re: [PATCH v9 1/8] mm: Add per-cpu logic to page shuffling
From: Alexander Duyck
To: Michal Hocko
Cc: David Hildenbrand, Alexander Duyck, virtio-dev@lists.oasis-open.org,
    kvm@vger.kernel.org, mst@redhat.com, catalin.marinas@arm.com,
    dave.hansen@intel.com, linux-kernel@vger.kernel.org, willy@infradead.org,
    linux-mm@kvack.org, akpm@linux-foundation.org, will@kernel.org,
    linux-arm-kernel@lists.infradead.org, osalvador@suse.de,
    yang.zhang.wz@gmail.com, pagupta@redhat.com, konrad.wilk@oracle.com,
    nitesh@redhat.com, riel@surriel.com, lcapitulino@redhat.com,
    wei.w.wang@intel.com, aarcange@redhat.com, ying.huang@intel.com,
    pbonzini@redhat.com, dan.j.williams@intel.com, fengguang.wu@intel.com,
    kirill.shutemov@linux.intel.com
Date: Tue, 10 Sep 2019 15:14:47 -0700
In-Reply-To: <20190910121130.GU2063@dhcp22.suse.cz>
References: <20190907172225.10910.34302.stgit@localhost.localdomain>
    <20190907172512.10910.74435.stgit@localhost.localdomain>
    <0df2e5d0-af92-04b4-aa7d-891387874039@redhat.com>
    <0ca58fea280b51b83e7b42e2087128789bc9448d.camel@linux.intel.com>
    <20190910121130.GU2063@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2019-09-10 at 14:11 +0200, Michal Hocko wrote:
> On Mon 09-09-19 08:11:36, Alexander Duyck wrote:
> > On Mon, 2019-09-09 at 10:14 +0200, David Hildenbrand wrote:
> > > On 07.09.19 19:25, Alexander Duyck wrote:
> > > > From: Alexander Duyck
> > > >
> > > > Change the logic used to generate randomness in the shuffle path so that we
> > > > can avoid cache line bouncing. The previous logic was sharing the offset
> > > > and entropy word between all CPUs. As such this can result in cache line
> > > > bouncing and will ultimately hurt performance when enabled.
> > >
> > > So, usually we perform such changes if there is real evidence. Do you
> > > have any such performance numbers to back your claims?
> >
> > I'll have to go rerun the test to get the exact numbers. The reason this
> > came up is that my original test was spanning NUMA nodes and that made
> > this more expensive as a result since the memory was both not local to the
> > CPU and was being updated by multiple sockets.
>
> What was the pattern of page freeing in your testing? I am wondering
> because order 0 pages should be prevailing and those usually go via pcp
> lists so they do not get shuffled unless the batch is full IIRC.

So I am pretty sure my previous data was faulty. One side effect of the
page reporting is that it was evicting pages out of the guest, and when
the pages were faulted back in they were coming from local page pools.
This was throwing off my early numbers and making tests look better than
they should have for the reported case.

I had this patch previously merged with another one, so I wasn't testing
it on its own; it was instead part of a bigger set. Now that I have tried
testing it on its own I can see that it has no significant impact on
performance. With that being the case I will probably just drop it.
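
For anyone following the thread, the idea in the quoted patch description (one offset and entropy word per CPU rather than one shared by all CPUs) could be sketched in userspace C roughly as below. This is only an illustration under stated assumptions, not the code from the patch: `shuffle_state`, `shuffle_pick_tail`, `next_random_u64`, the 64-byte cache line size, and the fixed CPU count are all hypothetical, and a simple xorshift generator stands in for a real entropy source.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64  /* assumed cache line size */
#define NR_CPUS_SKETCH  4   /* hypothetical CPU count for this sketch */

/*
 * Hypothetical per-CPU state. Aligning the first member to the cache
 * line size pads the struct, so two CPUs never write the same line.
 */
struct shuffle_state {
    alignas(CACHE_LINE_SIZE) uint64_t rand; /* cached entropy word */
    uint8_t rand_bits;                      /* unused bits left in rand */
    uint64_t seed;                          /* stand-in PRNG state */
};

static struct shuffle_state shuffle_states[NR_CPUS_SKETCH];

/* Stand-in for a real entropy source (xorshift64); illustration only. */
static uint64_t next_random_u64(struct shuffle_state *s)
{
    uint64_t x = s->seed ? s->seed : 0x9E3779B97F4A7C15ULL;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    s->seed = x;
    return x;
}

/* Return one random bit, touching only the given CPU's own state. */
static bool shuffle_pick_tail(unsigned int cpu)
{
    struct shuffle_state *s = &shuffle_states[cpu % NR_CPUS_SKETCH];
    bool bit;

    /* Refill the entropy word once all of its bits are consumed. */
    if (s->rand_bits == 0) {
        s->rand = next_random_u64(s);
        s->rand_bits = 64;
    }

    bit = s->rand & 1;
    s->rand >>= 1;
    s->rand_bits--;
    return bit;
}
```

A caller would pass its own CPU index, so each CPU consumes bits only from its own padded slot. As the numbers discussed above indicate, whether this separation is measurable depends on how often the shuffle path is actually hit.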