Date: Sat, 19 Mar 2016 04:01:01 +0300
From: "Kirill A. Shutemov"
To: "Aneesh Kumar K.V"
Cc: "Kirill A. Shutemov", Hugh Dickins, Andrea Arcangeli, Andrew Morton,
	Dave Hansen, Vlastimil Babka, Christoph Lameter, Naoya Horiguchi,
	Jerome Marchand, Yang Shi, Sasha Levin, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCHv4 04/25] rmap: support file thp
Message-ID: <20160319010101.GA29883@node.shutemov.name>
References: <1457737157-38573-1-git-send-email-kirill.shutemov@linux.intel.com>
	<1457737157-38573-5-git-send-email-kirill.shutemov@linux.intel.com>
	<87d1qs9lah.fsf@linux.vnet.ibm.com>
In-Reply-To: <87d1qs9lah.fsf@linux.vnet.ibm.com>

On Fri, Mar 18, 2016 at 03:10:06PM +0530, Aneesh Kumar K.V wrote:
> "Kirill A. Shutemov" writes:
>
> > Naive approach: on mapping/unmapping the page as compound we update
> > ->_mapcount on each 4k page. That's not efficient, but it's not obvious
> > how we can optimize this. We can look into optimization later.
> >
> > PG_double_map optimization doesn't work for file pages since lifecycle
> > of file pages is different comparing to anon pages: file page can be
> > mapped again at any time.
> >
>
> Can you explain this more ?. We added PG_double_map so that we can keep
> page_remove_rmap simpler.
> So if it isn't a compound page we still can do
>
> 	if (!atomic_add_negative(-1, &page->_mapcount))
>
> I am trying to understand why we can't use that with file pages ?

First, for non-compound pages we still have the simple
atomic_inc_and_test() / atomic_add_negative(-1); nothing changed there.

About compound pages:

For anon-THP, PG_double_map allows us to avoid touching _mapcount in all
subpages until a PMD which maps the page is split. This way we significantly
lower the refcounting overhead as long as the page is mapped with PMDs only,
since we only need to increment compound_mapcount().

The optimization is possible due to the relatively simple lifecycle of an
anonymous THP:

  - anon-THPs are always mapped with a PMD first;
  - a new mapping of a THP can only be created via fork();
  - the page can only get mapped with PTEs via split_huge_pmd().

For file-THP the situation is different. Once we have allocated a huge page
and put it on the radix tree, the page can be mapped with PTEs or PMDs at any
time. That makes the same optimization inapplicable there.

I think there *can* be some room for optimization, but I don't want to
invest more time here until it's identified as a bottleneck. It can lead to
more complex code on the rmap side.

-- 
 Kirill A. Shutemov
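
For illustration only, here is a toy userspace model of the cost difference
described above. It is not kernel code: struct toy_thp, the toy_* helpers and
the counter_updates field are hypothetical names made up for this sketch, the
counters are zero-based rather than starting at -1, and SUBPAGES = 512
assumes 2M huge pages with 4k base pages. It does not model the PG_double_map
transitions on PMD split; it only counts how many counters each scheme has to
touch when a PMD mapping is added.

```c
#include <assert.h>
#include <stdbool.h>

#define SUBPAGES 512	/* assumption: 2M huge page / 4k base pages */

/* Toy stand-in for a compound page; NOT the kernel's struct page. */
struct toy_thp {
	int compound_mapcount;		/* number of PMD mappings */
	int mapcount[SUBPAGES];		/* per-4k-page counters, zero-based */
	long counter_updates;		/* how many counters were touched */
};

/* Naive scheme (file THP in this patch): a PMD map touches every subpage. */
static void toy_map_pmd_naive(struct toy_thp *p)
{
	p->compound_mapcount++;
	p->counter_updates++;
	for (int i = 0; i < SUBPAGES; i++) {
		p->mapcount[i]++;
		p->counter_updates++;
	}
}

/* Lazy scheme (anon THP, PMD-only case): a PMD map touches one counter. */
static void toy_map_pmd_lazy(struct toy_thp *p)
{
	p->compound_mapcount++;
	p->counter_updates++;
}

/* Map a single subpage with a PTE; same cost in both schemes. */
static void toy_map_pte(struct toy_thp *p, int idx)
{
	assert(idx >= 0 && idx < SUBPAGES);
	p->mapcount[idx]++;
	p->counter_updates++;
}

/*
 * Total number of mappings of one subpage. In the lazy scheme the PMD
 * mappings are recorded only in compound_mapcount, so they are added here.
 */
static int toy_subpage_mapcount(const struct toy_thp *p, int idx, bool lazy)
{
	assert(idx >= 0 && idx < SUBPAGES);
	return p->mapcount[idx] + (lazy ? p->compound_mapcount : 0);
}
```

In this model both schemes report the same per-subpage total, but one PMD
mapping costs SUBPAGES + 1 counter updates in the naive scheme and a single
update in the lazy one.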