Message-ID: <370eb593-1a2f-41a6-8b16-163f54634f19@default>
Date: Tue, 14 May 2013 13:54:41 -0700 (PDT)
From: Dan Magenheimer
To: Seth Jennings
Cc: Bob Liu, Mel Gorman, Andrew Morton, Greg Kroah-Hartman, Nitin Gupta,
    Minchan Kim, Konrad Wilk, Robert Jennings, Jenifer Hopper,
    Johannes Weiner, Rik van Riel, Larry Woodman, Benjamin Herrenschmidt,
    Dave Hansen, Joe Perches, Joonsoo Kim, Cody P Schafer, Hugh Dickens,
    Paul Mackerras, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    devel@driverdev.osuosl.org
Subject: RE: [PATCHv11 3/4] zswap: add to mm/
In-Reply-To: <20130514172827.GE4024@medulla>

> From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
> Subject: Re: [PATCHv11 3/4] zswap: add to mm/
>
> On Tue, May 14, 2013 at 09:37:08AM -0700, Dan Magenheimer wrote:
> > > From: Seth Jennings [mailto:sjenning@linux.vnet.ibm.com]
> > > Subject: Re: [PATCHv11 3/4] zswap: add to mm/
> > >
> > > On Tue, May 14, 2013 at 05:19:19PM +0800, Bob Liu wrote:
> > > > Hi Seth,
> > >
> > > Hi Bob,
> > > thanks for the review!
> > >
> > > >
> > > > > +	/* reclaim space if needed */
> > > > > +	if (zswap_is_full()) {
> > > > > +		zswap_pool_limit_hit++;
> > > > > +		if (zbud_reclaim_page(tree->pool, 8)) {
> > > >
> > > > My idea is to wake up a kernel thread here to do the reclaim.
> > > > Once zswap is full(20% percent of total mem currently), the kernel
> > > > thread should reclaim pages from it. Not only reclaim one page, it
> > > > should depend on the current memory pressure.
> > > > And then the API in zbud may like this:
> > > > zbud_reclaim_page(pool, nr_pages_to_reclaim, nr_retry);
> > >
> > > So kswapd for zswap. I'm not opposed to the idea if a case can be
> > > made for the complexity. I must say, I don't see that case though.
> > >
> > > The policy can evolve as deficiencies are demonstrated and solutions are
> > > found.
> >
> > Hmmm... it is fairly easy to demonstrate the deficiency if
> > one tries. I actually first saw it occur on a real (though
> > early) EL6 system which started some graphics-related service
> > that caused a very brief swapstorm that was invisible during
> > normal boot but clogged up RAM with compressed pages which
> > later caused reduced weird benchmarking performance.
>
> Without any specifics, I'm not sure what I can do with this.

Well, I think it's customary for the author of a patch to know
the limitations of the patch. I suggest you synthesize a workload
that attempts to measure the worst case. That's exactly what I did
a year ago, and it led me to the realization that zcache needed
to solve some issues before it was ready to be promoted out of
staging.

> I'm hearing you say that the source of the benchmark degradation
> are the idle pages in zswap. In that case, the periodic writeback
> patches I have in the wings should address this.
>
> I think we are on the same page without realizing it. Right now
> zswap supports a kind of "direct reclaim" model at allocation time.
> The periodic writeback patches will handle the proactive writeback
> part to free up the zswap pool when it has idle pages in it.

I don't think we are on the same page, though maybe you are
heading in the same direction now. I won't repeat the comments
from the previous email.

> > I think Mel's unpredictability concern applies equally here...
> > this may be a "long-term source of bugs and strange memory
> > management behavior."
> >
> > > Can I get your ack on this pending the other changes?
> >
> > I'd like to hear Mel's feedback about this, but perhaps
> > a compromise to allow for zswap merging would be to add
> > something like the following to zswap's Kconfig comment:
> >
> > "Zswap reclaim policy is still primitive. Until it improves,
> > zswap should be considered experimental and is not recommended
> > for production use."
>
> Just for the record, an "experimental" tag in the Kconfig won't
> work for me.
>
> The reclaim policy for zswap is not primitive, it's simple. There
> is a difference. Plus zswap is already runtime disabled by default.
> If distros/customers enabled it, it is because they purposely
> enabled it.

Hmmm... I think you are proposing to users/distros the following
use model: "If zswap works for you, turn it on. If it sucks,
turn it off. I can't tell you in advance whether it will work
or suck for your distro/workload, but it will probably work,
so please try it." That sounds awfully experimental to me.

The problem is not simple. Your solution is simple because
you are simply pretending that the harder parts of the problem
don't exist.

Dan