Date: Tue, 15 Jun 2010 18:19:55 +0530
From: Balbir Singh
Reply-To: balbir@linux.vnet.ibm.com
To: Avi Kivity
Cc: Dave Hansen, kvm, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache control
Message-ID: <20100615124955.GF4306@balbir.in.ibm.com>
In-Reply-To: <4C174DD7.3000608@redhat.com>

* Avi Kivity [2010-06-15 12:54:31]:

> On 06/15/2010 10:52 AM, Balbir Singh wrote:
> >>>
> >>>That is why the policy (in the next set) will come from the host. As
> >>>to whether the data is truly duplicated, my experiments show up to 60%
> >>>of the page cache is duplicated.
> >>Isn't that incredibly workload dependent?
> >>
> >>We can't expect the host admin to know whether duplication will
> >>occur or not.
> >>
> >I was referring to cache = (policy) we use based on the setup. I don't
> >think the duplication is too workload specific. Moreover, we could use
> >aggressive policies and restrict page cache usage or do it selectively
> >on ballooning. We could also add other options to make the ballooning
> >option truly optional, so that the system management software decides.
>
> Consider a read-only workload that exactly fits in guest cache.
> Without trimming, the guest will keep hitting its own cache, and the
> host will see no access to the cache at all. So the host (assuming
> it is under even low pressure) will evict those pages, and the guest
> will happily use its own cache. If we start to trim, the guest will
> have to go to disk. That's the best case.
>
> Now for the worst case. A random access workload that misses the
> cache on both guest and host. Now every page is duplicated, and
> trimming guest pages allows the host to increase its cache, and
> potentially reduce misses. In this case trimming duplicated pages
> works.
>
> Real life will see a mix of this. Often used pages won't be
> duplicated, and less often used pages may see some duplication,
> especially if the host cache portion dedicated to the guest is
> bigger than the guest cache.
>
> I can see that trimming duplicate pages helps, but (a) I'd like to
> be sure they are duplicates and (b) often trimming them from the
> host is better than trimming them from the guest.
>

Let's look at the behaviour with these patches.

The first patch is a proactive approach to keep more memory around.
Enabling the parameter implies we are OK paying the cost of some
overhead. My data shows that this leaves a significant amount of free
memory with a small 5% (in my case) overhead. This brings us back to
what you can do with free memory.

The second patch shows no overhead and selectively tries to use free
cache to return memory under memory pressure (as indicated by the
balloon driver). We've discussed the reasons for doing this:

1. In the situations where cache is duplicated, this should benefit
   us. Your contention is that we need to be specific about the
   duplication. That falls under the realm of CMM.
2. In the case of slab cache, duplication does not matter. It is a
   free page that should ideally be reclaimed ahead of mapped pages.
   If the slab grows, it will get another new page.

What is the cost of (1)?

In the worst case, we select a non-duplicated page. But for us to
select it, it must be inactive, and in that case we do I/O to bring
the page back.

> Trimming from the guest is worthwhile if the pages are not used very
> often (but enough that caching them in the host is worth it) and if
> the host cache can serve more than one guest. If we can identify
> those pages, we don't risk degrading best-case workloads (as defined
> above).
>
> (note ksm to some extent identifies those pages, though it is a bit
> expensive, and doesn't share with the host pagecache).
>

I see that you are hinting towards finding exact duplicates; I don't
know if the cost and complexity justify it. I hope more users can try
the patches with and without the boot parameter and provide additional
feedback.

-- 
	Three Cheers,
	Balbir