Message-ID: <5180A37E.8010701@gmail.com>
Date: Wed, 01 May 2013 13:09:18 +0800
From: Ric Mason
To: Tim Chen
CC: Andrew Morton, Tejun Heo, Christoph Lameter, Al Viro, Dave Hansen, Andi Kleen, linux-kernel, linux-mm
Subject: Re: [PATCH 2/2] Make batch size for memory accounting configured according to size of memory
In-Reply-To: <8c9bc7d4646d48154604820a3ec5952ba8949de4.1367254913.git.tim.c.chen@linux.intel.com>

Hi Tim,

On 04/30/2013 01:12 AM, Tim Chen wrote:
> Currently the per cpu counter's batch size for memory accounting is
> configured as twice the number of cpus in the system. However,
> for system with very large memory, it is more appropriate to make it
> proportional to the memory size per cpu in the system.
>
> For example, for a x86_64 system with 64 cpus and 128 GB of memory,
> the batch size is only 2*64 pages (0.5 MB). So any memory accounting
> changes of more than 0.5MB will overflow the per cpu counter into
> the global counter. Instead, for the new scheme, the batch size
> is configured to be 0.4% of the memory/cpu = 8MB (128 GB/64 /256),

Will a larger batch size make the global counter less accurate?

> which is more inline with the memory size.
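
To check the figures quoted above, here is a rough userspace sketch of the arithmetic. It is not kernel code. The helper names and the 4 KB page size are my assumptions, and the last helper rests on my reading of the percpu counter code: a CPU folds its local delta into the global count once the delta reaches the batch size in magnitude, so each CPU holds back less than one batch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KB pages, the default base page size on x86_64. */
#define SKETCH_PAGE_BYTES 4096ULL

/* Old scheme described in the patch: batch = 2 * number of CPUs, in pages. */
static uint64_t old_batch_pages(unsigned int nr_cpus)
{
	return 2ULL * nr_cpus;
}

/* New scheme from the patch: (total pages / number of CPUs) / 256. */
static uint64_t new_batch_pages(uint64_t total_ram_bytes, unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return (total_ram_bytes / SKETCH_PAGE_BYTES / nr_cpus) / 256;
}

/*
 * Hypothetical helper: upper bound on the pages that can sit in per-CPU
 * deltas without being folded into the global count, assuming every
 * CPU's delta stays below the batch size in magnitude.
 */
static uint64_t max_unfolded_pages(uint64_t batch_pages, unsigned int nr_cpus)
{
	return batch_pages * nr_cpus;
}
```

For 64 cpus and 128 GB this gives 128 pages (0.5 MB) for the old batch and 2048 pages (8 MB) for the new one, matching the changelog. Under the assumption above, the global count could then lag the exact sum by less than 64 * 2048 pages = 512 MB, which is 1/256 of total memory, compared with less than 32 MB under the old scheme. In the kernel, totalram_pages is somewhat smaller than installed memory, so the real numbers would be slightly lower.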
>
> Signed-off-by: Tim Chen
> ---
>  mm/mmap.c  | 13 ++++++++++++-
>  mm/nommu.c | 13 ++++++++++++-
>  2 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 0db0de1..082836e 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -89,6 +89,7 @@ int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
>   * other variables. It can be updated by several CPUs frequently.
>   */
>  struct percpu_counter vm_committed_as ____cacheline_aligned_in_smp;
> +int vm_committed_batchsz ____cacheline_aligned_in_smp;
>
>  /*
>   * The global memory commitment made in the system can be a metric
> @@ -3090,10 +3091,20 @@ void mm_drop_all_locks(struct mm_struct *mm)
>  /*
>   * initialise the VMA slab
>   */
> +static inline int mm_compute_batch(void)
> +{
> +	int nr = num_present_cpus();
> +
> +	/* batch size set to 0.4% of (total memory/#cpus) */
> +	return (int) (totalram_pages/nr) / 256;
> +}
> +
>  void __init mmap_init(void)
>  {
>  	int ret;
>
> -	ret = percpu_counter_init(&vm_committed_as, 0);
> +	vm_committed_batchsz = mm_compute_batch();
> +	ret = percpu_counter_and_batch_init(&vm_committed_as, 0,
> +			&vm_committed_batchsz);
>  	VM_BUG_ON(ret);
>  }
> diff --git a/mm/nommu.c b/mm/nommu.c
> index 2f3ea74..a87a99c 100644
> --- a/mm/nommu.c
> +++ b/mm/nommu.c
> @@ -59,6 +59,7 @@ unsigned long max_mapnr;
>  unsigned long num_physpages;
>  unsigned long highest_memmap_pfn;
>  struct percpu_counter vm_committed_as;
> +int vm_committed_batchsz;
>  int sysctl_overcommit_memory = OVERCOMMIT_GUESS; /* heuristic overcommit */
>  int sysctl_overcommit_ratio = 50; /* default is 50% */
>  int sysctl_max_map_count = DEFAULT_MAX_MAP_COUNT;
> @@ -526,11 +527,21 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
>  /*
>   * initialise the VMA and region record slabs
>   */
> +static inline int mm_compute_batch(void)
> +{
> +	int nr = num_present_cpus();
> +
> +	/* batch size set to 0.4% of (total memory/#cpus) */
> +	return (int) (totalram_pages/nr) / 256;
> +}
> +
>  void __init mmap_init(void)
>  {
>  	int ret;
>
> -	ret = percpu_counter_init(&vm_committed_as, 0);
> +	vm_committed_batchsz = mm_compute_batch();
> +	ret = percpu_counter_and_batch_init(&vm_committed_as, 0,
> +			&vm_committed_batchsz);
>  	VM_BUG_ON(ret);
>  	vm_region_jar = KMEM_CACHE(vm_region, SLAB_PANIC);
>  }
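
To make the accuracy concern concrete, below is a small single-threaded userspace model of a batched per-CPU counter. It is only a sketch of the folding behaviour as I understand it, not the kernel implementation: all names (model_counter, model_add, and so on) are hypothetical, there is no locking or preemption handling, and the CPU is passed in explicitly.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 64

/* Hypothetical model type; not the kernel's struct percpu_counter. */
struct model_counter {
	int64_t global;                 /* shared count, cheap to read */
	int32_t local[MODEL_MAX_CPUS];  /* per-CPU deltas not yet folded */
	int32_t batch;                  /* fold threshold, in pages */
	unsigned int nr_cpus;
};

static void model_init(struct model_counter *c, unsigned int nr_cpus,
		       int32_t batch)
{
	assert(nr_cpus > 0 && nr_cpus <= MODEL_MAX_CPUS && batch > 0);
	c->global = 0;
	c->batch = batch;
	c->nr_cpus = nr_cpus;
	for (unsigned int i = 0; i < nr_cpus; i++)
		c->local[i] = 0;
}

/* Add to one CPU's delta; fold into the global count at the threshold. */
static void model_add(struct model_counter *c, unsigned int cpu,
		      int32_t amount)
{
	int64_t count = (int64_t)c->local[cpu] + amount;

	if (count >= c->batch || count <= -c->batch) {
		c->global += count;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = (int32_t)count;
	}
}

/* Approximate read: the global count only. */
static int64_t model_read(const struct model_counter *c)
{
	return c->global;
}

/* Exact read: the global count plus every per-CPU delta. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->global;

	for (unsigned int i = 0; i < c->nr_cpus; i++)
		sum += c->local[i];
	return sum;
}
```

With 4 cpus and a batch of 8, adding 7 pages on each cpu leaves model_read() at 0 while model_sum() is 28. One more page on cpu 0 folds 8 into the global count, so model_read() returns 8 against an exact 29. In this model the gap between the two stays below batch * nr_cpus, so it grows in proportion to the batch size.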