Date: Wed, 29 May 2013 14:16:30 -0700
From: Andrew Morton
To: Seth Jennings
Cc: Greg Kroah-Hartman, Nitin Gupta, Minchan Kim, Konrad Rzeszutek Wilk,
 Dan Magenheimer, Robert Jennings, Jenifer Hopper, Mel Gorman,
 Johannes Weiner, Rik van Riel, Larry Woodman, Benjamin Herrenschmidt,
 Dave Hansen, Joe Perches, Joonsoo Kim, Cody P Schafer, Hugh Dickens,
 Paul Mackerras, Heesub Shin, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, devel@driverdev.osuosl.org
Subject: Re: [PATCHv12 3/4] zswap: add to mm/
Message-Id: <20130529141630.8f2d1aa9b16d05e60e4a7ada@linux-foundation.org>
In-Reply-To: <20130529210820.GF428@cerebellum>

On Wed, 29 May 2013 16:08:20 -0500 Seth Jennings wrote:

> On Wed, May 29, 2013 at 12:57:47PM -0700, Andrew Morton wrote:
> > On Wed, 29 May 2013 14:50:27 -0500 Seth Jennings wrote:
> > >
> > > On Wed, May 29, 2013 at 11:29:29AM -0700, Andrew Morton wrote:
> > > > On Wed, 29 May 2013 09:57:20 -0500 Seth Jennings wrote:
> > > >
> > > > > > > +/*********************************
> > > > > > > +* helpers
> > > > > > > +**********************************/
> > > > > > > +static inline bool zswap_is_full(void)
> > > > > > > +{
> > > > > > > +	return (totalram_pages * zswap_max_pool_percent / 100 <
> > > > > > > +		zswap_pool_pages);
> > > > > > > +}
> > > > > >
> > > > > > We have had issues in the past where percentage-based tunables were
> > > > > > too coarse on very large machines.  For example, a terabyte machine
> > > > > > where 0 bytes is too small and 10GB is too large.
> > > > >
> > > > > Yes, this is a known limitation of the code right now and it is a
> > > > > high priority to come up with something better.  It isn't clear what
> > > > > dynamic sizing policy should be used so, until such time as that
> > > > > policy can be determined, this is a simple stop-gap that works well
> > > > > enough for simple setups.
> > > >
> > > > It's a module parameter and hence is part of the userspace interface.
> > > > It's undesirable that the interface be changed, and it would be rather
> > > > dumb to merge it as-is when we *know* that it will be changed.
> > > >
> > > > I don't think we can remove the parameter altogether (or can we?), so I
> > > > suggest we finalise it ASAP.  Perhaps rename it to
> > > > zswap_max_pool_ratio, with a range 1..999999.  Better ideas needed :(
> > >
> > > zswap_max_pool_ratio is fine with me.  I'm not entirely clear on the
> > > change though.  Would that just be a name change or a change in meaning?
> >
> > It would be a change in behaviour.  The problem which I'm suggesting we
> > address is that a 1% increment is too coarse.
>
> Sorry, but I'm not getting this.  This zswap_max_pool_ratio is a ratio of
> what to what?  Maybe if you wrote out the calculation of the max pool size
> using this ratio I'll get it.
>

This:

	totalram_pages * zswap_max_pool_percent / 100

means that we are able to control the pool size in 10GB increments on a
1TB machine.  Past experience with other tunables tells us that this can
be a problem.  Hence my (lame) suggestion that we replace it with

	totalram_pages * zswap_max_pool_ratio / 1000000

Another approach would be to stop using a ratio altogether, and make the
tunable specify an absolute number of bytes.  That's how we approached
this problem in the case of /proc/sys/vm/dirty_background_ratio.  See
https://lkml.org/lkml/2008/11/23/160.

(And it's "bytes", not "pages", because PAGE_SIZE can vary by a factor
of 16, which is a lot.)