Date: Sat, 5 Mar 2016 17:34:47 +1100
From: Dave Chinner
To: Waiman Long
Cc: Tejun Heo, Christoph Lameter, xfs@oss.sgi.com,
    linux-kernel@vger.kernel.org, Ingo Molnar, Peter Zijlstra,
    Scott J Norton, Douglas Hatch
Subject: Re: [RFC PATCH 0/2] percpu_counter: Enable switching to global counter
Message-ID: <20160305063447.GB2235@devil.localdomain>
In-Reply-To: <1457146299-1601-1-git-send-email-Waiman.Long@hpe.com>

On Fri, Mar 04, 2016 at 09:51:37PM -0500, Waiman Long wrote:
> This patchset allows the degeneration of per-cpu counters back to
> global counters when:
>
>  1) The number of CPUs in the system is large, hence a high cost for
>     calling percpu_counter_sum().
>  2) The initial count value is small so that it has a high chance of
>     excessive percpu_counter_sum() calls.
>
> When the above 2 conditions are true, this patchset allows the user of
> per-cpu counters to selectively degenerate them into global counters
> with lock. This is done by calling the new percpu_counter_set_limit()
> API after percpu_counter_set(). Without this call, there is no change
> in the behavior of the per-cpu counters.
>
> Patch 1 implements the new percpu_counter_set_limit() API.
>
> Patch 2 modifies XFS to call the new API for the m_ifree and m_fdblocks
> per-cpu counters.
>
> Waiman Long (2):
>   percpu_counter: Allow falling back to global counter on large system
>   xfs: Allow degeneration of m_fdblocks/m_ifree to global counters

NACK.

This change turns off the per-cpu XFS free block counters on 32p
machines. We proved 10 years ago that a global lock for these counters
was a massive scalability limitation for concurrent buffered writes on
16p machines.

IOWs, this change is going to cause fast path concurrent sequential
write regressions for just about everyone, even on empty filesystems.

The behaviour you are seeing only occurs when the filesystem is near
to ENOSPC. As I asked you last time - if you want to make this
problem go away, please increase the size of the filesystem you are
running your massively concurrent benchmarks on.

IOWs, please stop trying to optimise a filesystem slow path that:

	a) 99.9% of production workloads never execute,
	b) where we expect performance to degrade as allocation gets
	   computationally expensive as we close in on ENOSPC,
	c) where we start to execute blocking data flush operations
	   that slow everything down massively, and
	d) is indicative that the workload is about to suffer
	   from a fatal, unrecoverable error (i.e. ENOSPC).

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com