Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759031AbZKYRgs (ORCPT ); Wed, 25 Nov 2009 12:36:48 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1758986AbZKYRgr (ORCPT ); Wed, 25 Nov 2009 12:36:47 -0500
Received: from e23smtp03.au.ibm.com ([202.81.31.145]:37794 "EHLO
	e23smtp03.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1758876AbZKYRgp (ORCPT );
	Wed, 25 Nov 2009 12:36:45 -0500
Date: Wed, 25 Nov 2009 23:06:45 +0530
From: Balbir Singh
To: Hugh Dickins
Cc: Andrew Morton, Izik Eidus, Andrea Arcangeli, Chris Wright,
	KAMEZAWA Hiroyuki, Daisuke Nishimura,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 6/9] ksm: mem cgroup charge swapin copy
Message-ID: <20091125173645.GF2970@balbir.in.ibm.com>
Reply-To: balbir@linux.vnet.ibm.com
References: <20091125142355.GD2970@balbir.in.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.20 (2009-08-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2791
Lines: 64

* Hugh Dickins [2009-11-25 17:12:13]:

> On Wed, 25 Nov 2009, Balbir Singh wrote:
> > * Hugh Dickins [2009-11-24 16:51:13]:
> >
> > > But ksm swapping does require one small change in mem cgroup handling.
> > > When do_swap_page()'s call to ksm_might_need_to_copy() does indeed
> > > substitute a duplicate page to accommodate a different anon_vma (or a
> > > different index), that page escaped mem cgroup accounting, because of
> > > the !PageSwapCache check in mem_cgroup_try_charge_swapin().
> > >
> >
> > The duplicate page doesn't show up as PageSwapCache
>
> That's right.
>
> > or are we optimizing
> > for the race condition where the page is not in SwapCache?
>
> No, optimization wasn't on my mind at all.
> To be honest, it's slightly
> worsening the case of the race in which another thread has independently
> faulted it in, and then removed it from swap cache.  But I think we'll
> agree that that's rare enough a case that a few more cycles doing it
> won't matter.
>

Thanks for clarifying, yes I agree that the condition is rare and
nothing for us to worry about at the moment.

> > I should probably look at the full series.
>
> 2/9 is the one which brings the problem: it's ksm_might_need_to_copy()
> (an inline which tests for the condition) and ksm_does_need_to_copy()
> (which makes a duplicate page when the condition has been found so).
>
> The problem arises because an Anon struct page contains a pointer to
> its anon_vma, used to locate its ptes when swapping.  Suddenly, with
> KSM swapping, an anon page may get read in from swap, faulted in and
> pointed to its anon_vma, everything fine; but then faulted in again
> somewhere else, and needs to be pointed to a different anon_vma...
>
> Lose its anon_vma and it becomes unswappable, not a good choice when
> trying to extend swappability: so instead we allocate a duplicate page
> just to point to the different anon_vma; and if they last long enough,
> unchanged, KSM will come around again to find them the same and
> remerge them.  Not an efficient solution, but a simple solution,
> much in keeping with the way KSM already works.
>
> The duplicate page is not PageSwapCache: certainly it crossed my mind
> to try making it PageSwapCache like the original, but I think that
> raises lots of other problems (how do we make the radix_tree slot
> for that offset hold two page pointers?).
>

Thanks for the detailed explanation, it does help me understand what
is going on.

--
	Balbir
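
For anyone following the thread without the patch at hand, here is a
minimal userspace sketch of the accounting gap discussed above. It is
a toy model, not the kernel code: struct toy_page, struct toy_cgroup
and every toy_* function are hypothetical names made up for
illustration, and the real mem_cgroup_try_charge_swapin() does far
more than this. The sketch only assumes what the thread states: the
duplicate page is not PageSwapCache, and a swapin charge path that
skips such pages leaves the duplicate unaccounted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only: NOT the kernel's struct page or mem cgroup types. */
struct toy_page {
	bool in_swap_cache;	/* models PageSwapCache(page) */
	bool charged;		/* models "accounted to a mem cgroup" */
};

struct toy_cgroup {
	size_t charged_pages;	/* pages currently accounted to this group */
};

/* Hypothetical helper: the duplicate is new, uncharged, not in swap cache. */
static struct toy_page toy_ksm_duplicate(void)
{
	struct toy_page dup = { .in_swap_cache = false, .charged = false };
	return dup;
}

/* Account the page once; charging an already charged page is a no-op. */
static void toy_charge(struct toy_cgroup *cg, struct toy_page *page)
{
	if (!page->charged) {
		page->charged = true;
		cg->charged_pages++;
	}
}

/* Models the problem: a page that is not in swap cache is skipped. */
static void toy_charge_swapin_skip(struct toy_cgroup *cg, struct toy_page *page)
{
	assert(cg && page);
	if (!page->in_swap_cache)
		return;		/* the duplicate escapes accounting here */
	toy_charge(cg, page);
}

/* Models charging even when the page is not in swap cache. */
static void toy_charge_swapin_always(struct toy_cgroup *cg, struct toy_page *page)
{
	assert(cg && page);
	toy_charge(cg, page);	/* costs a few more cycles in the rare race */
}
```

With a page from toy_ksm_duplicate(), toy_charge_swapin_skip() leaves
charged_pages at 0, while toy_charge_swapin_always() raises it to 1.
The second variant also charges in the rare race where another thread
has already faulted the page in and removed it from swap cache, which
is the "few more cycles" cost mentioned above.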