Date: Thu, 11 Apr 2013 15:08:20 -0500
From: Russ Anderson
To: "Srivatsa S. Bhat"
Cc: Paul Mackerras, Linus Torvalds, Ingo Molnar, Robin Holt,
    "H. Peter Anvin", Andrew Morton, Linux Kernel Mailing List,
    Shawn Guo, Thomas Gleixner, Ingo Molnar, the arch/x86 maintainers,
    "Paul E. McKenney", Tejun Heo, Oleg Nesterov, Lai Jiangshan,
    Michel Lespinasse, "rusty@rustcorp.com.au", Peter Zijlstra
Subject: Re: Bulk CPU Hotplug (Was Re: [PATCH] Do not force shutdown/reboot to boot cpu.)
Message-ID: <20130411200820.GA10167@sgi.com>
In-Reply-To: <5166CC87.5060301@linux.vnet.ibm.com>

On Thu, Apr 11, 2013 at 08:15:27PM +0530, Srivatsa S. Bhat wrote:
> On 04/11/2013 07:53 PM, Russ Anderson wrote:
> > On Thu, Apr 11, 2013 at 06:15:18PM +0530, Srivatsa S. Bhat wrote:
> >>
> >> One more thing we have to note is that, there are 4 notifiers for taking a
> >> CPU offline:
> >>
> >> CPU_DOWN_PREPARE
> >> CPU_DYING
> >> CPU_DEAD
> >> CPU_POST_DEAD
> >>
> >> The first can be run in parallel as mentioned above.
> >> The second is run in
> >> parallel in the stop_machine() phase as shown in Russ' patch. But the third
> >> and fourth set of notifications all end up running only on CPU0, which will
> >> again slow down things.
> >
> > In my testing the third and fourth set were a small part of the overall
> > time. Less than 10%, with cpu notifiers 90+% of the time.
>
> *All* of them are cpu notifiers! All of them invoke __cpu_notify() internally.
> So how did you differentiate between them and find out that the third and
> fourth sets take less time?

I reran a test on a 1024 cpu system, using my test patch to only call
__stop_machine() once. Added printks to show the kernel timestamp
at various points.

When calling disable_nonboot_cpus() and enable_nonboot_cpus() just after
booting the system:

  The loop calling __cpu_notify(CPU_DOWN_PREPARE) took 376.6 seconds.
  The loop calling cpu_notify_nofail(CPU_DEAD) took 8.1 seconds.

My guess is that notifiers do more work in the CPU_DOWN_PREPARE case.

I also added a loop calling a new notifier (CPU_TEST) which none of
the notifiers would recognize, to measure the time it took to spin
through the call chain without the notifiers doing any work. It took
0.0067 seconds.

On the actual reboot, as the system was shutting down:

  The loop calling __cpu_notify(CPU_DOWN_PREPARE) took 333.8 seconds.
  The loop calling cpu_notify_nofail(CPU_DEAD) took 2.7 seconds.

I don't know how many notifiers are on the chain, or if there is
one heavy hitter accounting for much of the time in the
CPU_DOWN_PREPARE case.

FWIW, the overall cpu stop times are somewhat longer than what I
measured before. Not sure if the difference is due to changes in
my test patch, other kernel changes pulled in, or some difference
on the test system.

> > So you may
> > not need the added complexity, or at least fix the cpu notifier part
> > first.
> >
>
> To make the 3rd and 4th run fast, the only thing we need to do is take CPUs
> offline in smaller steps, like 512, 256 etc..
> It doesn't add any extra
> complexity over and above what is necessary to make the cpu notifiers run
> in parallel in the first place.
>
> Regards,
> Srivatsa S. Bhat

-- 
Russ Anderson, OS RAS/Partitioning Project Lead
SGI - Silicon Graphics Inc          rja@sgi.com
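
For scale, the timings reported in the message can be put in proportion with
a short calculation. This is only a sketch in Python: the figures are the
ones quoted above, while the iteration count of 1023 (one pass per non-boot
CPU on the 1024 cpu system) and comparing the CPU_TEST walk against the
after-boot run are assumptions, not something the message states. The
shares cover only the two loops that were timed, not CPU_DYING,
CPU_POST_DEAD or the stop_machine() phase.

```python
# Timings quoted in the message, in seconds.
RUNS = {
    "after boot": {"CPU_DOWN_PREPARE": 376.6, "CPU_DEAD": 8.1},
    "at reboot": {"CPU_DOWN_PREPARE": 333.8, "CPU_DEAD": 2.7},
}

# Time for the CPU_TEST loop, where no notifier did any work.
EMPTY_CHAIN_WALK = 0.0067

# Assumption (not stated in the message): each timed loop ran once per
# non-boot CPU, i.e. 1023 iterations on the 1024 cpu system.
ASSUMED_ITERATIONS = 1023


def summarize(run):
    """Return the CPU_DEAD share of the two timed loops and the average
    CPU_DOWN_PREPARE cost per assumed iteration."""
    total = sum(run.values())
    return {
        "dead_share": run["CPU_DEAD"] / total,
        "per_cpu_down_prepare": run["CPU_DOWN_PREPARE"] / ASSUMED_ITERATIONS,
    }


if __name__ == "__main__":
    for name, run in RUNS.items():
        s = summarize(run)
        print(f"{name}: CPU_DEAD is {s['dead_share']:.1%} of the two loops, "
              f"CPU_DOWN_PREPARE averages {s['per_cpu_down_prepare']:.3f} s "
              f"per assumed iteration")

    # Assumption: the CPU_TEST walk is compared with the after-boot run.
    walk = EMPTY_CHAIN_WALK / RUNS["after boot"]["CPU_DOWN_PREPARE"]
    print(f"empty chain walk is {walk:.4%} of the after-boot "
          f"CPU_DOWN_PREPARE loop")
```

Under these assumptions CPU_DEAD is a few percent or less of the two timed
loops in both runs, and walking the chain itself is negligible, which fits
the guess that the notifier callbacks in the CPU_DOWN_PREPARE case are
where the time goes.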