Date: Thu, 23 Apr 2015 17:30:39 +0100
From: Mel Gorman
To: Daniel J Blueman
Cc: Linux-MM, Nathan Zimmer, Dave Hansen, Waiman Long, Scott Norton,
    Andrew Morton, LKML, Steffen Persvold
Subject: Re: [PATCH 0/13] Parallel struct page initialisation v3
Message-ID: <20150423163039.GB2449@suse.de>
References: <1429785196-7668-1-git-send-email-mgorman@suse.de>
    <1429804437.24139.3@cpanel21.proisp.no>
In-Reply-To: <1429804437.24139.3@cpanel21.proisp.no>

On Thu, Apr 23, 2015 at 11:53:57PM +0800, Daniel J Blueman wrote:
> On Thu, Apr 23, 2015 at 6:33 PM, Mel Gorman wrote:
> >The big change here is an adjustment to the topology_init path
> >that caused
> >soft lockups on Waiman and Daniel Blue had reported it was an
> >expensive
> >function.
> >
> >Changelog since v2
> >o Reduce overhead of topology_init
> >o Remove boot-time kernel parameter to enable/disable
> >o Enable on UMA
> >
> >Changelog since v1
> >o Always initialise low zones
> >o Typo corrections
> >o Rename parallel mem init to parallel struct page init
> >o Rebase to 4.0
>
> []
>
> Splendid work! On this 256c setup, topology_init now takes 185ms.
>
> This brings the kernel boot time down to 324s [1].

Good stuff. Am I correct in thinking that the vanilla kernel takes 732s?

> It turns out that
> one memset is responsible for most of the time setting up the
> PUDs and PMDs; adapting memset to using non-temporal writes [3]
> avoids generating RMW cycles, bringing boot time down to 186s [2].
>
> If this is a possibility, I can split this patch and map other
> arch's memset_nocache to memset, or change the callsite as
> preferred; comments welcome.
>

In general, I see no problem with the patch, and it would be useful
whether it goes in before or after this series. I would suggest you
split this into three patches. The first would be an asm-generic alias
of memset_nocache to memset, with documentation saying it's optional for
an architecture to implement. The second would be your implementation
for x86, which needs to go to the x86 maintainers. The third would then
be the memblock.c change.

Thanks.

-- 
Mel Gorman
SUSE Labs
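
For illustration only, the asm-generic fallback suggested above could
look roughly like the sketch below. It is written as plain userspace C
so it can be compiled on its own. The guard macro
__HAVE_ARCH_MEMSET_NOCACHE is an assumed name, chosen by analogy with
the existing __HAVE_ARCH_MEMSET convention. The signature of
memset_nocache is also an assumption, since the patch under discussion
is not quoted in this thread.

```c
#include <stddef.h>
#include <string.h>

/*
 * Generic fallback sketch. An architecture with a non-temporal
 * implementation would define __HAVE_ARCH_MEMSET_NOCACHE (assumed name)
 * and provide its own memset_nocache(). Every other architecture gets
 * this plain alias, so implementing it stays optional.
 */
#ifndef __HAVE_ARCH_MEMSET_NOCACHE
static inline void *memset_nocache(void *s, int c, size_t n)
{
	/* No cache-bypassing variant available: use the ordinary memset. */
	return memset(s, c, n);
}
#endif
```

With a fallback of this shape, a call site such as the one in
memblock.c could call memset_nocache() unconditionally, and only x86
would need to carry an architecture-specific version.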