From: huang ying
Date: Thu, 8 Feb 2018 23:27:50 +0800
Subject: Re: [PATCH -mm -v2] mm, swap, frontswap: Fix THP swap if frontswap enabled
To: Konrad Rzeszutek Wilk, Minchan Kim, Andrew Morton
Cc: linux-mm@kvack.org, LKML, Dan Streetman, Seth Jennings, Tetsuo Handa, Shaohua Li, Michal Hocko, Johannes Weiner, Mel Gorman, Shakeel Butt, stable@vger.kernel.org, Sergey Senozhatsky, "Huang, Ying"
In-Reply-To: <20180207070035.30302-1-ying.huang@intel.com>

On Wed, Feb 7, 2018 at 3:00 PM, Huang, Ying wrote:
> From: Huang Ying
>
> It was reported by Sergey Senozhatsky that if THP (Transparent Huge
> Page) and frontswap (via zswap) are both enabled, when memory goes low
> so that swap is triggered, segfault and memory corruption will occur
> in random user space applications as follows,
>
> kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
> #0 0x00007fc08889ae0d _int_malloc (libc.so.6)
> #1 0x00007fc08889c2f3 malloc (libc.so.6)
> #2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
> #3 0x0000560e6005e75c n/a (urxvt)
> #4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
> #5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
> #6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
> #7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
> #8 0x0000560e6005cb55 ev_run (urxvt)
> #9 0x0000560e6003b9b9 main (urxvt)
> #10 0x00007fc08883af4a __libc_start_main (libc.so.6)
> #11 0x0000560e6003f9da _start (urxvt)
>
> After bisection, it was found the first bad commit is
> bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped
> out").
>
> The root cause is as follows.
>
> When the pages are written to swap device during swapping out in
> swap_writepage(), zswap (frontswap) is tried first to compress the
> pages, to improve the performance. But zswap (frontswap) will treat
> THP as normal page, so only the head page is saved. After swapping
> in, tail pages will not be restored to their original contents, which
> causes the memory corruption in the applications.
>
> This is fixed via splitting THP before writing the page to swap device
> if frontswap is enabled. To deal with the situation where frontswap
> is enabled at runtime, whether the page is THP is checked before using
> frontswap during swapping out too.
>
> Reported-and-tested-by: Sergey Senozhatsky
> Signed-off-by: "Huang, Ying"
> Cc: Konrad Rzeszutek Wilk
> Cc: Dan Streetman
> Cc: Seth Jennings
> Cc: Minchan Kim
> Cc: Tetsuo Handa
> Cc: Shaohua Li
> Cc: Michal Hocko
> Cc: Johannes Weiner
> Cc: Mel Gorman
> Cc: Shakeel Butt
> Cc: stable@vger.kernel.org # 4.14
> Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
>
> Changelog:
>
> v2:
>
> - Move frontswap check into swapfile.c to avoid making vmscan.c
>   depend on frontswap.
> ---
>  mm/page_io.c  | 2 +-
>  mm/swapfile.c | 3 +++
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_io.c b/mm/page_io.c
> index b41cf9644585..6dca817ae7a0 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -250,7 +250,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc)
>  		unlock_page(page);
>  		goto out;
>  	}
> -	if (frontswap_store(page) == 0) {
> +	if (!PageTransHuge(page) && frontswap_store(page) == 0) {
>  		set_page_writeback(page);
>  		unlock_page(page);
>  		end_page_writeback(page);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 006047b16814..0b7c7883ce64 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -934,6 +934,9 @@ int get_swap_pages(int n_goal, bool cluster, swp_entry_t swp_entries[])
>
>  	/* Only single cluster request supported */
>  	WARN_ON_ONCE(n_goal > 1 && cluster);
> +	/* Frontswap doesn't support THP */
> +	if (frontswap_enabled() && cluster)
> +		goto noswap;

I found that this will turn off the THP swap optimization forever if
CONFIG_ZSWAP=y (it cannot be =m), because frontswap is enabled
statically rather than dynamically. Once frontswap_ops is registered,
frontswap is enabled unconditionally and forever, and zswap registers
its frontswap_ops during initialization regardless of whether zswap
itself is enabled.

So I think it would be better to remove the swapfile.c changes from
this patch and keep only the page_io.c changes. THP is more dynamic:
by default it is usually used only if madvised, and even if it is set
to always by default, this can be changed at runtime. And even when
THP is used, zswap can still be used for non-THP pages.

Best Regards,
Huang, Ying

>
>  	avail_pgs = atomic_long_read(&nr_swap_pages) / nr_pages;
>  	if (avail_pgs <= 0)
> --
> 2.15.1
>
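
For readers who want to see the root cause in isolation, here is a
minimal userspace sketch of it. This is not kernel code: every toy_*
name, the 4-subpage "huge page" and the 8-byte page size are
assumptions made up for illustration. It only models the behaviour
described in the commit message (a backend that saves the head page
alone) and the effect of a guard shaped like the
!PageTransHuge(page) check in the page_io.c hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_SUBPAGES   4	/* toy value; a real THP has far more subpages */
#define TOY_PAGE_BYTES 8	/* toy page size, chosen only for the sketch */

/* Hypothetical compound page: subpage 0 is the head, the rest are tails. */
struct toy_page {
	bool huge;
	unsigned char data[TOY_SUBPAGES][TOY_PAGE_BYTES];
};

/* Toy frontswap backend: it has room for exactly one base page. */
static unsigned char toy_backend[TOY_PAGE_BYTES];
/* Toy swap device: it stores the whole compound page. */
static unsigned char toy_device[TOY_SUBPAGES][TOY_PAGE_BYTES];
static bool toy_in_backend;

/* Treats every page as a normal page, so only the head page is saved. */
static int toy_frontswap_store(const struct toy_page *page)
{
	memcpy(toy_backend, page->data[0], TOY_PAGE_BYTES);
	return 0;
}

/* With guard set, huge pages bypass the backend and go to the device. */
static void toy_swap_out(const struct toy_page *page, bool guard)
{
	assert(page != NULL);
	if ((!guard || !page->huge) && toy_frontswap_store(page) == 0) {
		toy_in_backend = true;
		return;
	}
	memcpy(toy_device, page->data, sizeof(toy_device));
	toy_in_backend = false;
}

/* Restores whatever was saved; anything that was not saved stays zeroed. */
static void toy_swap_in(struct toy_page *page)
{
	memset(page->data, 0, sizeof(page->data));
	if (toy_in_backend)
		memcpy(page->data[0], toy_backend, TOY_PAGE_BYTES);
	else
		memcpy(page->data, toy_device, sizeof(toy_device));
}

/* Returns true if every subpage of a huge page survives a round trip. */
static bool toy_round_trip_ok(bool guard)
{
	struct toy_page page = { .huge = true };
	unsigned char orig[TOY_SUBPAGES][TOY_PAGE_BYTES];

	for (int i = 0; i < TOY_SUBPAGES; i++)
		for (int j = 0; j < TOY_PAGE_BYTES; j++)
			page.data[i][j] = (unsigned char)(i * TOY_PAGE_BYTES + j + 1);
	memcpy(orig, page.data, sizeof(orig));

	toy_swap_out(&page, guard);
	toy_swap_in(&page);
	return memcmp(orig, page.data, sizeof(orig)) == 0;
}
```

In this model toy_round_trip_ok(false) returns false, because the tail
subpages come back zeroed, while toy_round_trip_ok(true) returns true.
The guard is evaluated per page at swap-out time, so small pages still
reach the backend, which matches the point above that zswap remains
usable for non-THP pages.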