Date: Thu, 22 Aug 2013 09:42:50 +0900
From: Minchan Kim
To: Bob Liu
Cc: Greg Kroah-Hartman, Andrew Morton, Jens Axboe, Seth Jennings,
 Nitin Gupta, Konrad Rzeszutek Wilk, Luigi Semenzato,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, Pekka Enberg,
 Mel Gorman, lliubbo@gmail.com
Subject: Re: [PATCH v7 0/5] zram/zsmalloc promotion
Message-ID: <20130822004250.GB4665@bbox>
References: <1377065791-2959-1-git-send-email-minchan@kernel.org>
 <52148730.4000709@oracle.com>
In-Reply-To: <52148730.4000709@oracle.com>

Hi Bob,

On Wed, Aug 21, 2013 at 05:24:00PM +0800, Bob Liu wrote:
> Hi Minchan,
>
> On 08/21/2013 02:16 PM, Minchan Kim wrote:
> > This is the 7th trial of zram/zsmalloc promotion.
> > I rewrote the cover letter completely based on the previous
> > discussion.
> >
> > The main obstacle to zram promotion was that the zsmalloc part had
> > received no review, while Jens, the block maintainer, had already
> > acked the zram part.
> >
> > At that time, zsmalloc was used by zram, zcache and zswap, so
> > everybody wanted to make it general, and at last Mel reviewed it
> > when zswap was submitted for merging into mainline a few months
> > ago. Most of the review concerned the zswap writeback mechanism,
> > which can page compressed pages out to real swap storage at
> > runtime, and the conclusion was that zsmalloc isn't good for zswap
> > writeback, so zswap borrowed the zbud allocator from zcache to
> > replace zsmalloc. zbud is worse for memory compression ratio (at
> > most 2:1) but its behavior is very predictable, because we can
> > expect a zpage to include at most two pages. The other review
> > points were not major.
> > http://lkml.indiana.edu/hypermail/linux/kernel/1304.1/04334.html
> >
> > Zcache doesn't use zsmalloc either, so zram is now zsmalloc's only
> > user, and this patchset moves it into the zram directory.
> > Recently, Bob tried to move zsmalloc under the mm directory to
> > unify zram and zswap by adding a pseudo block device to zswap
> > (which looks very weird to me), but he simply ignored zram's block
> > device (a.k.a. zram-blk) feature and considered only the swap use
> > case of zram, which in turn loses zram's good concept.
> >
>
> Yes, I didn't notice the feature that zram can be used as a normal
> block device.
>
> > Mel raised another issue in v6: "maintenance headache".
> > He claimed zswap and zram have a similar goal, which is to compress
> > swap pages, so if we promote zram, a maintenance headache will
> > arise at some point as the implementations of zswap and zram
> > diverge; therefore he wants to unify zram and zswap. To that end,
> > he wants zswap to implement a pseudo block device, like Bob did, to
> > emulate zram, so that zswap can have the advantage of writeback as
> > well as zram's benefits.
>
> If we consider zram as a swap device only, I still think it's better
> to add a pseudo block device to zswap and just disable zswap's
> writeback.

Why do you think zswap is better?

> But I have no idea of zram's block device feature.
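zram-blk needs nothing from the swap subsystem: you size the device
through sysfs and put any filesystem on it. A minimal sketch of that
mode, with the device name, size and mount point only illustrative:

    modprobe zram num_devices=1
    echo $((1024*1024*1024)) > /sys/block/zram0/disksize  # 1G zram-blk
    mkfs.ext2 /dev/zram0
    mount /dev/zram0 /mnt/zram

That is the "ext2 on zram-blk" setup used in the test scenarios below.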
> > But I wonder whether frontswap-based zswap writeback is really a
> > good approach from a writeback point of view. I don't think that
> > problem is specific to zswap. If we want to configure a multi-level
> > swap hierarchy with devices of various speeds, such as RAM, NVRAM,
> > SSD, eMMC, NAS and so on, it is a general problem, so we should
> > think of a more general approach. At a glance, I can see two
> > approaches.
> >
> > First, the VM could be made aware of heterogeneous swap
> > configurations, so it could aim to configure a cache hierarchy
> > among swap devices. That may need an indirection layer on swap,
> > which has already been talked about, so that the VM can migrate a
> > block from A to B easily. It could support various configurations
> > with VM hints, maybe, in the future.
> > http://lkml.indiana.edu/hypermail/linux/kernel/1203.3/03812.html
> >
> > Second, as a more practical solution, we could use a device mapper
> > target like dm-cache (https://lwn.net/Articles/540996/), which is
> > very flexible. It already supports various configurations and
> > cache policies (block size, writeback/writethrough, LRU, MFU,
> > although only MQ is merged now), so it would be a good fit for our
> > purpose. It can even make zram support writeback. I tested the
> > following scenarios in a KVM guest with 4 CPUs and 1G of DRAM,
> > with a background 800M memory hogger that allocates random data
> > up to 800M.
> >
> > 1) zram swap disk 1G; untar kernel.tgz to tmpfs; build -j 4
> >    The untar fails due to a shortage of space caused by tmpfs's
> >    default size limit.
> >
> > 2) zram swap disk 1G; untar kernel.tgz to ext2 on zram-blk;
> >    build -j 4
> >    OOM happens while building the kernel, but the untar succeeds
> >    on the ext2 filesystem based on zram-blk. The reason OOM
> >    happened is that zram could not find free pages in main memory
> >    to store swapped-out pages, although plenty of empty swap space
> >    remained.
> >
> > 3) dm-cache swap disk 1G; untar kernel.tgz to ext2 on zram-blk;
> >    build -j 4
> >    The dm-cache device consists of zram-meta 10M, zram-cache 1G
> >    and real swap storage 1G. No OOM happens and the build
> >    completes successfully.
> >
> > The tests above prove that zram can support writeback to real swap
> > storage, so that zram-cache can always have free space. If
> > necessary, we could add a new plugin to dm-cache. I find it a
> > really flexible and well-layered architecture, so zram-blk's
> > concept is good for us and it has lots of potential to be enhanced
> > by MM/FS/Block developers.
> >
>
> That's an exciting direction!
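For reference, the scenario 3 stack can be assembled roughly as
follows. This is a sketch only: the /dev/sdb1 origin device, the 256KB
cache block size, writeback mode and the default policy are
assumptions, not the exact parameters of the test above.

    modprobe zram num_devices=2
    echo $((10*1024*1024))   > /sys/block/zram0/disksize  # 10M zram-meta
    echo $((1024*1024*1024)) > /sys/block/zram1/disksize  # 1G  zram-cache
    # dm-cache table: <start> <len> cache <metadata dev> <cache dev>
    #   <origin dev> <block size> <#features> <features> <policy> <#args>
    dmsetup create swapcache --table "0 $(blockdev --getsz /dev/sdb1) \
        cache /dev/zram0 /dev/zram1 /dev/sdb1 512 1 writeback default 0"
    mkswap /dev/mapper/swapcache
    swapon /dev/mapper/swapcache  # 1G of swap, cached in compressed RAM

Thanks!

--
Kind regards,
Minchan Kim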