Date: Thu, 14 Mar 2013 12:16:29 -0700 (PDT)
From: Dan Magenheimer
To: Dan Magenheimer, Robert Jennings
Cc: minchan@kernel.org, sjenning@linux.vnet.ibm.com, Nitin Gupta,
    Konrad Wilk, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Bob Liu, Luigi Semenzato, Mel Gorman
Subject: RE: zsmalloc limitations and related topics
References: <0efe9610-1aa5-4aa9-bde9-227acfa969ca@default>
    <20130313151359.GA3130@linux.vnet.ibm.com>
    <4ab899f6-208c-4d61-833c-d1e5e8b1e761@default>
In-Reply-To: <4ab899f6-208c-4d61-833c-d1e5e8b1e761@default>

> From: Dan Magenheimer
> Subject: RE: zsmalloc limitations and related topics
>
> > > I would welcome ideas on how to evaluate workloads for
> > > "representativeness".  Personally I don't believe we should
> > > be making decisions about selecting the "best" algorithms
> > > or merging code without an agreement on workloads.
> >
> > I'd argue that there is no such thing as a "representative workload".
> > Instead, we try different workloads to validate the design and illustrate
> > the performance characteristics and impacts.
>
> Sorry for repeatedly hammering my point in the above, but
> there have been many design choices driven by what was presumed
> to be representative (kernbench and now SPECjbb) workload
> that may be entirely wrong for a different workload (as
> Seth once pointed out using the text of Moby Dick as a source
> data stream).
>
> Further, the value of different designs can't be measured here just
> by the workload because the pages chosen to swap may be completely
> independent of the intended workload-driver... i.e. if you track
> the pid of the pages intended for swap, the pages can be mostly
> pages from long-running or periodic system services, not pages
> generated by kernbench or SPECjbb.  So it is the workload PLUS the
> environment that is being measured and evaluated.  That makes
> the problem especially tough.
>
> Just to clarify, I'm not suggesting that there is any single
> workload that can be called representative, just that we may
> need both a broad set of workloads (not silly benchmarks) AND
> some theoretical analysis to drive design decisions.  And, without
> this, arguing about whether zsmalloc is better than zbud or not
> is silly.  Both zbud and zsmalloc have strengths and weaknesses.
>
> That said, it should also be pointed out that the stream of
> pages-to-compress from cleancache ("file pages") may be dramatically
> different than for frontswap ("anonymous pages"), so unless you
> and Seth are going to argue upfront that cleancache pages should
> NEVER be candidates for compression, the evaluation criteria
> to drive design decisions needs to encompass both anonymous
> and file pages.  It is currently impossible to evaluate that
> with zswap.

Sorry to reply to myself here, but I realized last night
that I left off another related important point:

We have a tendency to run benchmarks on a "cold" system
so that the results are reproducible.
For compression however, this may unnaturally skew the entropy of
data-pages-to-be-compressed and so also the density measurements.

I can't prove it, but I suspect that soon after boot the number
of anonymous pages containing all (or nearly all) zeroes is large,
i.e. entropy is low.  As the length of time grows since the system
booted, more anonymous pages will be written with non-zero data,
thus increasing entropy and decreasing compressibility.

So, over time, the distribution of zsize may slowly skew right
(toward PAGE_SIZE).

If so, this effect may be very real but very hard to observe.

Dan
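
A rough userspace sketch of how the cold-versus-warm suspicion above
could be probed.  Everything in it is an assumption made for
illustration and not something from this thread: zlib stands in for
whatever compressor the kernel would use, PAGE_SIZE is taken to be
4096, and synthetic_pages() is a hypothetical stand-in that mixes
zero-filled pages with random (incompressible) ones.  Real anonymous
pages fall somewhere between those two extremes, so this only shows
the direction of the skew, not its size.

```python
import os
import zlib
from collections import Counter

PAGE_SIZE = 4096  # assumed page size, not taken from the thread


def zsize(page: bytes) -> int:
    """Compressed size of one page, capped at PAGE_SIZE.

    zlib is only a stand-in for the in-kernel compressor.
    """
    return min(len(zlib.compress(page)), PAGE_SIZE)


def zsize_histogram(pages, bucket=512):
    """Count pages per zsize bucket; keys are bucket lower bounds in bytes."""
    hist = Counter()
    for page in pages:
        hist[zsize(page) // bucket * bucket] += 1
    return dict(sorted(hist.items()))


def mean_zsize(pages):
    """Average compressed size over a list of pages."""
    return sum(zsize(p) for p in pages) / len(pages)


def synthetic_pages(n, zero_fraction):
    """Hypothetical page stream: zero_fraction of the pages are all zeroes,
    the rest are random bytes (the worst case for compression)."""
    n_zero = int(n * zero_fraction)
    pages = [bytes(PAGE_SIZE)] * n_zero
    pages += [os.urandom(PAGE_SIZE) for _ in range(n - n_zero)]
    return pages


if __name__ == "__main__":
    cold = synthetic_pages(1000, zero_fraction=0.9)  # "soon after boot"
    warm = synthetic_pages(1000, zero_fraction=0.1)  # "long-running system"
    print("cold mean zsize:", mean_zsize(cold), zsize_histogram(cold))
    print("warm mean zsize:", mean_zsize(warm), zsize_histogram(warm))
```

Feeding the same two functions with page contents sampled from a
real system at increasing uptimes, instead of synthetic_pages(),
would be the actual experiment; how to collect such samples is left
open here.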