From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Hugh Dickins, "Kirill A. Shutemov", Jerome Glisse, Konstantin Khlebnikov, Matthew Wilcox, Andrew Morton, Linus Torvalds, Sasha Levin
Subject: [PATCH 4.19 008/139] mm/khugepaged: collapse_shmem() without freezing new_page
Date: Tue, 4 Dec 2018 11:48:09 +0100
Message-Id: <20181204103650.325542785@linuxfoundation.org>
In-Reply-To: <20181204103649.950154335@linuxfoundation.org>
References: <20181204103649.950154335@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

4.19-stable review patch.  If anyone has any objections, please let me know.

------------------

commit 87c460a0bded56195b5eb497d44709777ef7b415 upstream.

khugepaged's collapse_shmem() does almost all of its work, to assemble
the huge new_page from 512 scattered old pages, with the new_page's
refcount frozen to 0 (and refcounts of all old pages so far also frozen
to 0): including shmem_getpage() to read in any which were out on swap,
memory reclaim if necessary to allocate their intermediate pages, and
copying over all the data from old to new.

Imagine the frozen refcount as a spinlock held, but without any lock
debugging to highlight the abuse: it's not good, and under serious load
heads into lockups - speculative getters of the page are not expecting
to spin while khugepaged is rescheduled.

One can get a little further under load by hacking around elsewhere;
but fortunately, freezing the new_page turns out to have been entirely
unnecessary, with no hacks needed elsewhere.

The huge new_page lock is already held throughout, and guards all its
subpages as they are brought one by one into the page cache tree; and
anything reading the data in that page, without the lock, before it has
been marked PageUptodate, would already be in the wrong.
So simply eliminate the freezing of the new_page.

Each of the old pages remains frozen with refcount 0 after it has been
replaced by a new_page subpage in the page cache tree, until they are
all unfrozen on success or failure: just as before. They could be
unfrozen sooner, but cause no problem once no longer visible to
find_get_entry(), filemap_map_pages() and other speculative lookups.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261527570.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
Cc: Jerome Glisse
Cc: Konstantin Khlebnikov
Cc: Matthew Wilcox
Cc: [4.8+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 mm/khugepaged.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index d0a347e6fd08..e2b13c04626e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1287,7 +1287,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * collapse_shmem - collapse small tmpfs/shmem pages into huge one.
  *
  * Basic scheme is simple, details are more complex:
- *  - allocate and freeze a new huge page;
+ *  - allocate and lock a new huge page;
  *  - scan over radix tree replacing old pages the new one
  *    + swap in pages if necessary;
  *    + fill in gaps;
@@ -1295,11 +1295,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *  - if replacing succeed:
  *    + copy data over;
  *    + free old pages;
- *    + unfreeze huge page;
+ *    + unlock huge page;
  *  - if replacing failed;
  *    + put all pages back and unfreeze them;
  *    + restore gaps in the radix-tree;
- *    + free huge page;
+ *    + unlock and free huge page;
  */
 static void collapse_shmem(struct mm_struct *mm,
 		struct address_space *mapping, pgoff_t start,
@@ -1334,13 +1334,11 @@ static void collapse_shmem(struct mm_struct *mm,
 	__SetPageSwapBacked(new_page);
 	new_page->index = start;
 	new_page->mapping = mapping;
-	BUG_ON(!page_ref_freeze(new_page, 1));
 
 	/*
-	 * At this point the new_page is 'frozen' (page_count() is zero), locked
-	 * and not up-to-date. It's safe to insert it into radix tree, because
-	 * nobody would be able to map it or use it in other way until we
-	 * unfreeze it.
+	 * At this point the new_page is locked and not up-to-date.
+	 * It's safe to insert it into the page cache, because nobody would
+	 * be able to map it or use it in another way until we unlock it.
 	 */
 
 	index = start;
@@ -1517,9 +1515,8 @@ static void collapse_shmem(struct mm_struct *mm,
 			index++;
 		}
 
-		/* Everything is ready, let's unfreeze the new_page */
 		SetPageUptodate(new_page);
-		page_ref_unfreeze(new_page, HPAGE_PMD_NR);
+		page_ref_add(new_page, HPAGE_PMD_NR - 1);
 		set_page_dirty(new_page);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
 		lru_cache_add_anon(new_page);
@@ -1566,8 +1563,6 @@ static void collapse_shmem(struct mm_struct *mm,
 		VM_BUG_ON(nr_none);
 		xa_unlock_irq(&mapping->i_pages);
 
-		/* Unfreeze new_page, caller would take care about freeing it */
-		page_ref_unfreeze(new_page, 1);
 		mem_cgroup_cancel_charge(new_page, memcg, true);
 		new_page->mapping = NULL;
 	}
-- 
2.17.1