From: "Huang, Ying"
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying,
    "Kirill A. Shutemov", Andrea Arcangeli, Michal Hocko, Johannes Weiner,
    Shaohua Li, Hugh Dickins, Minchan Kim, Rik van Riel, Dave Hansen,
    Naoya Horiguchi, Zi Yan
Subject: [PATCH -mm -V3 04/21] mm, THP, swap: Support PMD swap mapping in swapcache_free_cluster()
Date: Wed, 23 May 2018 16:26:08 +0800
Message-Id: <20180523082625.6897-5-ying.huang@intel.com>
X-Mailer: git-send-email 2.16.1
In-Reply-To: <20180523082625.6897-1-ying.huang@intel.com>
References: <20180523082625.6897-1-ying.huang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Huang Ying

Previously, during swapout, all PMD page mappings were split and
replaced with PTE swap mappings, and the huge swap cluster was split
when the SWAP_HAS_CACHE flag was cleared for it in
swapcache_free_cluster().

Now, during swapout, a PMD page mapping is changed to a PMD swap
mapping.  So when clearing the SWAP_HAS_CACHE flag, the huge swap
cluster is split only if the PMD swap mapping count is 0.  Otherwise it
is kept as a huge swap cluster, so that the THP can be swapped in as a
whole later.

Signed-off-by: "Huang, Ying"
Cc: "Kirill A. Shutemov"
Cc: Andrea Arcangeli
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Shaohua Li
Cc: Hugh Dickins
Cc: Minchan Kim
Cc: Rik van Riel
Cc: Dave Hansen
Cc: Naoya Horiguchi
Cc: Zi Yan
---
 mm/swapfile.c | 41 ++++++++++++++++++++++++++++++-----------
 1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 075048032383..8dbc0f9b2f90 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -514,6 +514,18 @@ static void dec_cluster_info_page(struct swap_info_struct *p,
 		free_cluster(p, idx);
 }
 
+#ifdef CONFIG_THP_SWAP
+static inline int cluster_swapcount(struct swap_cluster_info *ci)
+{
+	if (!ci || !cluster_is_huge(ci))
+		return 0;
+
+	return cluster_count(ci) - SWAPFILE_CLUSTER;
+}
+#else
+#define cluster_swapcount(ci)			0
+#endif
+
 /*
  * It's possible scan_swap_map() uses a free cluster in the middle of free
  * cluster list. Avoiding such abuse to avoid list corruption.
@@ -905,6 +917,7 @@ static void swap_free_cluster(struct swap_info_struct *si, unsigned long idx)
 	struct swap_cluster_info *ci;
 
 	ci = lock_cluster(si, offset);
+	memset(si->swap_map + offset, 0, SWAPFILE_CLUSTER);
 	cluster_set_count_flag(ci, 0, 0);
 	free_cluster(si, idx);
 	unlock_cluster(ci);
@@ -1288,24 +1301,30 @@ static void swapcache_free_cluster(swp_entry_t entry)
 
 	ci = lock_cluster(si, offset);
 	VM_BUG_ON(!cluster_is_huge(ci));
+	VM_BUG_ON(!is_cluster_offset(offset));
+	VM_BUG_ON(cluster_count(ci) < SWAPFILE_CLUSTER);
 	map = si->swap_map + offset;
-	for (i = 0; i < SWAPFILE_CLUSTER; i++) {
-		val = map[i];
-		VM_BUG_ON(!(val & SWAP_HAS_CACHE));
-		if (val == SWAP_HAS_CACHE)
-			free_entries++;
+	if (!cluster_swapcount(ci)) {
+		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
+			val = map[i];
+			VM_BUG_ON(!(val & SWAP_HAS_CACHE));
+			if (val == SWAP_HAS_CACHE)
+				free_entries++;
+		}
+		if (free_entries != SWAPFILE_CLUSTER)
+			cluster_clear_huge(ci);
 	}
 	if (!free_entries) {
-		for (i = 0; i < SWAPFILE_CLUSTER; i++)
-			map[i] &= ~SWAP_HAS_CACHE;
+		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
+			val = map[i];
+			VM_BUG_ON(!(val & SWAP_HAS_CACHE) ||
+				  val == SWAP_HAS_CACHE);
+			map[i] = val & ~SWAP_HAS_CACHE;
+		}
 	}
-	cluster_clear_huge(ci);
 	unlock_cluster(ci);
 	if (free_entries == SWAPFILE_CLUSTER) {
 		spin_lock(&si->lock);
-		ci = lock_cluster(si, offset);
-		memset(map, 0, SWAPFILE_CLUSTER);
-		unlock_cluster(ci);
 		mem_cgroup_uncharge_swap(entry, SWAPFILE_CLUSTER);
 		swap_free_cluster(si, idx);
 		spin_unlock(&si->lock);
-- 
2.16.1
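
Illustrative note, not part of the patch: the decision made in
swapcache_free_cluster() above can be modelled in userspace roughly as
follows.  This is only a sketch under stated assumptions.  struct
model_cluster, model_init(), model_swapcount() and
model_swapcache_free_cluster() are hypothetical names, locking and memcg
accounting are left out, assert() stands in for VM_BUG_ON(), and the
constants are assumed values chosen for the model.  The reading that a
huge cluster's count is SWAPFILE_CLUSTER plus the number of PMD swap
mappings is inferred from cluster_swapcount() in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Assumed values for this model; the kernel takes them from its own headers. */
#define SWAPFILE_CLUSTER 512
#define SWAP_HAS_CACHE   0x40

/* Hypothetical stand-in for struct swap_cluster_info plus its swap_map slice. */
struct model_cluster {
	unsigned int count;	/* SWAPFILE_CLUSTER + PMD swap mappings while huge */
	bool huge;
	unsigned char map[SWAPFILE_CLUSTER];
};

/* Set up a huge cluster whose entries are all in the swap cache. */
static void model_init(struct model_cluster *ci, unsigned int pmd_mappings,
		       unsigned char entry_count)
{
	ci->huge = true;
	ci->count = SWAPFILE_CLUSTER + pmd_mappings;
	memset(ci->map, entry_count | SWAP_HAS_CACHE, sizeof(ci->map));
}

/* Mirrors cluster_swapcount() from the patch. */
static int model_swapcount(const struct model_cluster *ci)
{
	if (!ci || !ci->huge)
		return 0;
	return (int)ci->count - SWAPFILE_CLUSTER;
}

/* Returns true when the whole cluster was freed. */
static bool model_swapcache_free_cluster(struct model_cluster *ci)
{
	int i, free_entries = 0;

	assert(ci->huge);
	assert(ci->count >= SWAPFILE_CLUSTER);

	/* Scan and possibly split only when no PMD swap mapping remains. */
	if (!model_swapcount(ci)) {
		for (i = 0; i < SWAPFILE_CLUSTER; i++) {
			assert(ci->map[i] & SWAP_HAS_CACHE);
			if (ci->map[i] == SWAP_HAS_CACHE)
				free_entries++;
		}
		if (free_entries != SWAPFILE_CLUSTER)
			ci->huge = false;	/* split the huge cluster */
	}
	/* Every entry is still mapped: just drop the cache flag. */
	if (!free_entries) {
		for (i = 0; i < SWAPFILE_CLUSTER; i++)
			ci->map[i] &= ~SWAP_HAS_CACHE;
	}
	/* Nothing references the cluster any more: free it as a whole. */
	if (free_entries == SWAPFILE_CLUSTER) {
		memset(ci->map, 0, sizeof(ci->map));
		ci->count = 0;
		ci->huge = false;
		return true;
	}
	/* A mix of free and mapped entries is handled outside the hunk shown. */
	return false;
}
```

With one PMD swap mapping left, the model keeps the cluster huge and only
clears SWAP_HAS_CACHE.  With none left, it either splits the cluster
(entries still mapped) or frees it entirely (no entry mapped).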