Message-ID: <5540ECDA.6060308@redhat.com>
Date: Wed, 29 Apr 2015 10:38:18 -0400
From: Rik van Riel
To: Jason Low, Peter Zijlstra, Ingo Molnar, Thomas Gleixner
CC: linux-kernel@vger.kernel.org, "Paul E. McKenney", Andrew Morton,
    Oleg Nesterov, Frederic Weisbecker, Mel Gorman, Steven Rostedt,
    Preeti U Murthy, Mike Galbraith, Davidlohr Bueso, Waiman Long,
    Aswin Chandramouleeswaran, Scott J Norton
Subject: Re: [PATCH v2 3/5] sched, timer: Use atomics in thread_group_cputimer to improve scalability
References: <1430251224-5764-1-git-send-email-jason.low2@hp.com>
    <1430251224-5764-4-git-send-email-jason.low2@hp.com>
In-Reply-To: <1430251224-5764-4-git-send-email-jason.low2@hp.com>

On 04/28/2015 04:00 PM, Jason Low wrote:
> While running a database workload, we found a scalability issue with itimers.
>
> Much of the problem was caused by the thread_group_cputimer spinlock.
> Each time we account for group system/user time, we need to obtain a
> thread_group_cputimer's spinlock to update the timers. On larger systems
> (such as a 16 socket machine), this caused more than 30% of total time
> spent trying to obtain this kernel lock to update these group timer stats.
>
> This patch converts the timers to 64 bit atomic variables and use
> atomic add to update them without a lock.
> With this patch, the percent of total time spent updating thread group
> cputimer timers was reduced from 30% down to less than 1%.
>
> Note: On 32 bit systems using the generic 64 bit atomics, this causes
> sample_group_cputimer() to take locks 3 times instead of just 1 time.
> However, we tested this patch on a 32 bit system ARM system using the
> generic atomics and did not find the overhead to be much of an issue.
> An explanation for why this isn't an issue is that 32 bit systems usually
> have small numbers of CPUs, and cacheline contention from extra spinlocks
> called periodically is not really apparent on smaller systems.

I don't see 32 bit systems ever getting so many CPUs that this
becomes an issue :)

> Signed-off-by: Jason Low

Acked-by: Rik van Riel
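
For readers following the thread, the lock-free update described in the
quoted changelog can be sketched in user-space C11. This is only an
illustration under stated assumptions, not the kernel patch: the struct and
function names (group_cputime, account_user_time, and so on) and the three
field names are hypothetical stand-ins, and C11 <stdatomic.h> is used in
place of the kernel's 64 bit atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the group timer totals; each field is an
 * independent 64 bit atomic, so no spinlock guards the struct. */
struct group_cputime {
    _Atomic uint64_t utime;
    _Atomic uint64_t stime;
    _Atomic uint64_t sum_exec_runtime;
};

/* Plain (non-atomic) snapshot handed back to the caller. */
struct cputime_sample {
    uint64_t utime;
    uint64_t stime;
    uint64_t sum_exec_runtime;
};

static void group_cputime_init(struct group_cputime *t)
{
    assert(t);
    atomic_init(&t->utime, 0);
    atomic_init(&t->stime, 0);
    atomic_init(&t->sum_exec_runtime, 0);
}

/* Updaters: one atomic add each, no lock taken. Relaxed ordering is an
 * assumption here; the counters are statistics, not synchronization. */
static void account_user_time(struct group_cputime *t, uint64_t delta)
{
    atomic_fetch_add_explicit(&t->utime, delta, memory_order_relaxed);
}

static void account_system_time(struct group_cputime *t, uint64_t delta)
{
    atomic_fetch_add_explicit(&t->stime, delta, memory_order_relaxed);
}

static void account_runtime(struct group_cputime *t, uint64_t delta)
{
    atomic_fetch_add_explicit(&t->sum_exec_runtime, delta,
                              memory_order_relaxed);
}

/* Sampler: three separate atomic loads. Where 64 bit atomics are emulated
 * with a lock, each load takes that lock once, which is the "3 times
 * instead of just 1 time" cost mentioned in the changelog. The three
 * values are not read as one consistent snapshot. */
static struct cputime_sample sample_group_cputime(struct group_cputime *t)
{
    struct cputime_sample s;

    s.utime = atomic_load_explicit(&t->utime, memory_order_relaxed);
    s.stime = atomic_load_explicit(&t->stime, memory_order_relaxed);
    s.sum_exec_runtime = atomic_load_explicit(&t->sum_exec_runtime,
                                              memory_order_relaxed);
    return s;
}
```

A caller would run group_cputime_init() once, call the account_*()
helpers from any number of threads, and read the totals with
sample_group_cputime(). The trade-off is the one the changelog names:
updates no longer contend on a shared lock, while sampling gives up the
single consistent read that the spinlock provided.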