Date: Fri, 16 Mar 2018 16:51:30 +0300
From: "Kirill A. Shutemov"
To: Claudio Imbrenda
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
 aarcange@redhat.com, minchan@kernel.org, linux-mm@kvack.org,
 hughd@google.com, borntraeger@de.ibm.com, gerald.schaefer@de.ibm.com
Subject: Re: [PATCH v2 1/1] mm/ksm: fix interaction with THP
Message-ID: <20180316135130.dlyn6patvgvwaf4r@black.fi.intel.com>
In-Reply-To: <1521207755-28381-1-git-send-email-imbrenda@linux.vnet.ibm.com>

On Fri, Mar 16, 2018 at 01:42:35PM +0000, Claudio Imbrenda wrote:
> This patch fixes a corner case for KSM. When two pages belong or
> belonged to the same transparent hugepage, and they should be merged,
> KSM fails to split the page, and therefore no merging happens.
>
> This bug can be reproduced by:
> * making sure ksm is running (in case disabling ksmtuned)
> * enabling transparent hugepages
> * allocating a THP-aligned 1-THP-sized buffer
>   e.g. on amd64: posix_memalign(&p, 1<<21, 1<<21)
> * filling it with the same values
>   e.g. memset(p, 42, 1<<21)
> * performing madvise to make it mergeable
>   e.g. madvise(p, 1<<21, MADV_MERGEABLE)
> * waiting for KSM to perform a few scans
>
> The expected outcome is that the all the pages get merged (1 shared and
> the rest sharing); the actual outcome is that no pages get merged (1
> unshared and the rest volatile)
>
> The reason of this behaviour is that we increase the reference count
> once for both pages we want to merge, but if they belong to the same
> hugepage (or compound page), the reference counter used in both cases
> is the one of the head of the compound page.
> This means that split_huge_page will find a value of the reference
> counter too high and will fail.
>
> This patch solves this problem by testing if the two pages to merge
> belong to the same hugepage when attempting to merge them. If so, the
> hugepage is split safely. This means that the hugepage is not split if
> not necessary.
>
> Co-authored-by: Gerald Schaefer
> Signed-off-by: Claudio Imbrenda
> ---
>  mm/ksm.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 293721f..882d6ec 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2082,8 +2082,22 @@ static void cmp_and_merge_page(struct page *page, struct rmap_item *rmap_item)
>  	tree_rmap_item =
>  		unstable_tree_search_insert(rmap_item, page, &tree_page);
>  	if (tree_rmap_item) {
> +		bool split;
> +
>  		kpage = try_to_merge_two_pages(rmap_item, page,
>  						tree_rmap_item, tree_page);
> +		/*
> +		 * If both pages we tried to merge belong to the same compound
> +		 * page, then we actually ended up increasing the reference
> +		 * count of the same compound page twice, and split_huge_page
> +		 * failed.
> +		 * Here we set a flag if that happened, and we use it later to
> +		 * try split_huge_page again. Since we call put_page right
> +		 * afterwards, the reference count will be correct and
> +		 * split_huge_page should succeed.
> +		 */
> +		split = PageTransCompound(page) && PageTransCompound(tree_page)
> +			&& compound_head(page) == compound_head(tree_page);

You don't need to check whether *both* pages are compound if they share
compound_head(). One check is enough.

-- 
 Kirill A. Shutemov
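
For reference, the userspace reproduction steps listed in the quoted
patch description could be sketched roughly as below. This is only an
illustration and is not taken from the patch or from this thread: the
helper names (alloc_filled_thp_buffer, mark_mergeable, read_ksm_counter)
are hypothetical, the 2 MiB THP size assumes amd64, and the counters
under /sys/kernel/mm/ksm/ are assumed to be present (kernel built with
CONFIG_KSM).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MADV_MERGEABLE on glibc */
#endif
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: 2 MiB transparent hugepages, as on amd64. */
#define THP_SIZE ((size_t)1 << 21)

/*
 * Allocate one THP-aligned, THP-sized buffer and fill every byte with
 * the same value, so that all of its small pages have identical content.
 * Returns NULL if the allocation fails.
 */
void *alloc_filled_thp_buffer(int fill)
{
	void *p = NULL;

	if (posix_memalign(&p, THP_SIZE, THP_SIZE) != 0)
		return NULL;
	/* Sanity check: posix_memalign must honour the requested alignment. */
	assert(((unsigned long)p & (THP_SIZE - 1)) == 0);
	memset(p, fill, THP_SIZE);
	return p;
}

/*
 * Ask KSM to consider the buffer for merging.
 * Returns 0 on success, -1 with errno set otherwise (for example
 * EINVAL when the kernel has no KSM support).
 */
int mark_mergeable(void *p)
{
	return madvise(p, THP_SIZE, MADV_MERGEABLE);
}

/*
 * Read one numeric KSM counter from sysfs, e.g. "pages_shared",
 * "pages_sharing", "pages_unshared", "pages_volatile" or "full_scans".
 * Returns -1 if the file cannot be opened or parsed.
 */
long read_ksm_counter(const char *name)
{
	char path[128];
	long val = -1;
	FILE *f;

	snprintf(path, sizeof(path), "/sys/kernel/mm/ksm/%s", name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &val) != 1)
		val = -1;
	fclose(f);
	return val;
}
```

A caller would allocate the buffer with alloc_filled_thp_buffer(42),
call mark_mergeable() on it, wait until full_scans has advanced a few
times, and then compare pages_shared/pages_sharing against
pages_unshared/pages_volatile to see which of the two outcomes described
in the patch description occurred. The counters are system-wide, so
other mergeable memory on the machine will also show up in them.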