Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755093AbcKVDqi (ORCPT ); Mon, 21 Nov 2016 22:46:38 -0500 Received: from mail-pg0-f43.google.com ([74.125.83.43]:35768 "EHLO mail-pg0-f43.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754957AbcKVDqg (ORCPT ); Mon, 21 Nov 2016 22:46:36 -0500 Date: Mon, 21 Nov 2016 19:46:28 -0800 (PST) From: Hugh Dickins X-X-Sender: hugh@eggly.anvils To: Minchan Kim cc: Hugh Dickins , Andrew Morton , "Darrick J . Wong" , Hyeoncheol Lee , yjay.kim@lge.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH v2] mm: support anonymous stable page In-Reply-To: <20161120233015.GA14113@bbox> Message-ID: References: <20161120233015.GA14113@bbox> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3810 Lines: 103 On Mon, 21 Nov 2016, Minchan Kim wrote: > From: Minchan Kim > Date: Fri, 11 Nov 2016 15:02:57 +0900 > Subject: [PATCH v2] mm: support anonymous stable page > > For developemnt for zram-swap asynchronous writeback, I found > strange corruption of compressed page. With investigation, it > reveals currently stable page doesn't support anonymous page. > IOW, reuse_swap_page can reuse the page without waiting > writeback completion so that it can corrupt data during > zram compression. It can affect every swap device which supports > asynchronous writeback and CRC checking as well as zRAM. > > Unfortunately, reuse_swap_page should be atomic so that we > cannot wait on writeback in there so the approach in this patch > is simply return false if we found it needs stable page. > Although it increases memory footprint temporarily, it happens > rarely and it should be reclaimed easily althoug it happened. > Also, It would be better than waiting of IO completion, which > is critial path for application latency. > > Cc: Hugh Dickins > Cc: Darrick J. Wong > Signed-off-by: Minchan Kim Acked-by: Hugh Dickins Looks good, thanks: we can always optimize away that little overhead in the PageWriteback case, if it ever shows up in someone's testing. Andrew might ask if we should Cc stable (haha): I think we agree that it's a defect we've been aware of ever since stable pages were first proposed, but nobody has actually been troubled by it before your async zram development: so, you're right to be fixing it ahead of your zram changes, but we don't see a call for backporting. > --- > * from v1 > * use swap_info_struct instead of swapper_space->host inode - Hugh > > include/linux/swap.h | 3 ++- > mm/swapfile.c | 20 +++++++++++++++++++- > 2 files changed, 21 insertions(+), 2 deletions(-) > > diff --git a/include/linux/swap.h b/include/linux/swap.h > index a56523cefb9b..55ff5593c193 100644 > --- a/include/linux/swap.h > +++ b/include/linux/swap.h > @@ -150,8 +150,9 @@ enum { > SWP_FILE = (1 << 7), /* set after swap_activate success */ > SWP_AREA_DISCARD = (1 << 8), /* single-time swap area discards */ > SWP_PAGE_DISCARD = (1 << 9), /* freed swap page-cluster discards */ > + SWP_STABLE_WRITES = (1 << 10), /* no overwrite PG_writeback pages */ > /* add others here before... */ > - SWP_SCANNING = (1 << 10), /* refcount in scan_swap_map */ > + SWP_SCANNING = (1 << 11), /* refcount in scan_swap_map */ > }; > > #define SWAP_CLUSTER_MAX 32UL > diff --git a/mm/swapfile.c b/mm/swapfile.c > index 2210de290b54..66bc330c0b65 100644 > --- a/mm/swapfile.c > +++ b/mm/swapfile.c > @@ -943,11 +943,25 @@ bool reuse_swap_page(struct page *page, int *total_mapcount) > count = page_trans_huge_mapcount(page, total_mapcount); > if (count <= 1 && PageSwapCache(page)) { > count += page_swapcount(page); > - if (count == 1 && !PageWriteback(page)) { > + if (count != 1) > + goto out; > + if (!PageWriteback(page)) { > delete_from_swap_cache(page); > SetPageDirty(page); > + } else { > + swp_entry_t entry; > + struct swap_info_struct *p; > + > + entry.val = page_private(page); > + p = swap_info_get(entry); > + if (p->flags & SWP_STABLE_WRITES) { > + spin_unlock(&p->lock); > + return false; > + } > + spin_unlock(&p->lock); > } > } > +out: > return count <= 1; > } > > @@ -2447,6 +2461,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags) > error = -ENOMEM; > goto bad_swap; > } > + > + if (bdi_cap_stable_pages_required(inode_to_bdi(inode))) > + p->flags |= SWP_STABLE_WRITES; > + > if (p->bdev && blk_queue_nonrot(bdev_get_queue(p->bdev))) { > int cpu; > > -- > 2.7.4