Message-ID: <55403484.8060906@hp.com>
Date: Tue, 28 Apr 2015 21:31:48 -0400
From: Waiman Long
To: Daniel J Blueman
CC: Mel Gorman, Linux-MM, Nathan Zimmer, Dave Hansen, Scott Norton, Andrew Morton, LKML, "'Steffen Persvold'"
Subject: Re: [PATCH 0/13] Parallel struct page initialisation v3
References: <1429785196-7668-1-git-send-email-mgorman@suse.de> <1429804437.24139.3@cpanel21.proisp.no>
In-Reply-To: <1429804437.24139.3@cpanel21.proisp.no>

On 04/23/2015 11:53 AM, Daniel J Blueman wrote:
> On Thu, Apr 23, 2015 at 6:33 PM, Mel Gorman wrote:
>> The big change here is an adjustment to the topology_init path that
>> caused soft lockups on Waiman and Daniel Blue had reported it was an
>> expensive function.
>>
>> Changelog since v2
>> o Reduce overhead of topology_init
>> o Remove boot-time kernel parameter to enable/disable
>> o Enable on UMA
>>
>> Changelog since v1
>> o Always initialise low zones
>> o Typo corrections
>> o Rename parallel mem init to parallel struct page init
>> o Rebase to 4.0
> []
>
> Splendid work! On this 256c setup, topology_init now takes 185ms.
>
> This brings the kernel boot time down to 324s [1].
> It turns out that one memset is responsible for most of the time
> setting up the PUDs and PMDs; adapting memset to using non-temporal
> writes [3] avoids generating RMW cycles, bringing boot time down to
> 186s [2].
>
> If this is a possibility, I can split this patch and map other arch's
> memset_nocache to memset, or change the callsite as preferred;
> comments welcome.
>
> Thanks,
> Daniel
>
> [1] https://resources.numascale.com/telemetry/defermem/h8qgl-defer2.txt
> [2] https://resources.numascale.com/telemetry/defermem/h8qgl-defer2-nontemporal.txt
>
> -- [3]
>
> From f822139736cab8434302693c635fa146b465273c Mon Sep 17 00:00:00 2001
> From: Daniel J Blueman
> Date: Thu, 23 Apr 2015 23:26:27 +0800
> Subject: [RFC] Speedup PMD setup
>
> Using non-temporal writes prevents read-modify-write cycles,
> which are much slower over large topologies.
>
> Adapt the existing memset() function into a _nocache variant and use
> when setting up PMDs during early boot to reduce boot time.
>
> Signed-off-by: Daniel J Blueman
> ---
>  arch/x86/include/asm/string_64.h |  3 ++
>  arch/x86/lib/memset_64.S         | 90 ++++++++++++++++++++++++++++++++++++++++
>  mm/memblock.c                    |  2 +-
>  3 files changed, 94 insertions(+), 1 deletion(-)
>

I tried your patch on my 12-TB IvyBridge-EX test machine and the bootup
time increased from 265s to 289s (a 24s increase). I think my
IvyBridge-EX box was using the optimized memset_c_e (rep stosb) code,
which turned out to perform better than the non-temporal moves in your
code. I think that may be due to the temporal moves that need to be done
at the beginning and end of the memory range.

I had tried to replace clear_page() with non-temporal moves. I generally
got only a few percentage points of improvement compared with the
optimized clear_page_c() and clear_page_c_e() code. That is not a lot.

Anyway, I think the AMD box that you used wasn't setting the
X86_FEATURE_REP_GOOD or X86_FEATURE_ERMS bits, resulting in poor memset
performance.
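
For reference, whether a given CPU advertises ERMS can be checked from
userspace. The following is only a minimal sketch, assuming GCC or Clang
(it relies on the compiler's <cpuid.h>); the helper names erms_from_ebx()
and cpu_has_erms() are made up for illustration and are not kernel
functions. ERMS is reported in CPUID leaf 7, subleaf 0, EBX bit 9.
X86_FEATURE_REP_GOOD is a Linux-defined flag rather than a CPUID bit, so
it is not covered here.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper: __get_cpuid_count() */
#endif

/* ERMS is reported in CPUID.(EAX=7,ECX=0):EBX bit 9. */
#define CPUID7_EBX_ERMS (UINT32_C(1) << 9)

/* Hypothetical helper: decode the ERMS bit from a raw EBX value. */
static int erms_from_ebx(uint32_t ebx)
{
    return (ebx & CPUID7_EBX_ERMS) != 0;
}

/* Hypothetical helper: 1 if the running CPU advertises ERMS, else 0. */
static int cpu_has_erms(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Returns 0 when leaf 7 is not supported by this CPU. */
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return erms_from_ebx(ebx);
#endif
    return 0;   /* non-x86 build, or leaf 7 unavailable */
}
```

On Linux the same information shows up as the "erms" flag in
/proc/cpuinfo, which is probably the quicker check on a test box.
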
If such a feature is supported in the AMD CPU (albeit in a different
way), you may consider sending in a patch to set those feature bits.
Alternatively, you will need to duplicate the alternative instruction
stuff in your memset_nocache() to make sure that it can use the
optimized code, if appropriate.

Cheers,
Longman
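
To illustrate the point about temporal moves at the beginning and end of
the range, here is a rough userspace sketch of a non-temporal memset.
It is not the memset_nocache() from the patch (that is x86-64 assembly);
the name memset_nt_sketch() is hypothetical, and the use of SSE2
intrinsics with 16-byte streaming stores is an assumption made for the
example. Any unaligned head and the sub-16-byte tail fall back to
ordinary cached stores, which is the extra work referred to above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_set1_epi8, _mm_stream_si128, _mm_sfence */
#endif

/* Hypothetical helper: fill n bytes at dst with c, streaming the
 * aligned middle of the range past the cache where SSE2 is available. */
static void *memset_nt_sketch(void *dst, int c, size_t n)
{
    unsigned char *p = dst;

    assert(dst != NULL || n == 0);

#if defined(__SSE2__)
    /* Head: ordinary (temporal) stores up to the next 16-byte boundary. */
    size_t head = (size_t)(-(uintptr_t)p & 15);
    if (head > n)
        head = n;
    memset(p, c, head);
    p += head;
    n -= head;

    /* Body: 16-byte non-temporal stores; p is 16-byte aligned here. */
    const __m128i v = _mm_set1_epi8((char)c);
    for (; n >= 16; p += 16, n -= 16)
        _mm_stream_si128((__m128i *)p, v);

    /* Streaming stores are weakly ordered; fence before returning. */
    _mm_sfence();
#endif

    /* Tail (or the whole range without SSE2): ordinary stores. */
    memset(p, c, n);
    return dst;
}
```

A call such as memset_nt_sketch(buf + 3, 0, len) exercises all three
phases. For small or badly aligned ranges the head and tail make up most
of the work, so the streaming stores buy little there.
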