In-Reply-To: <20130926075725.GA22339@bbox>
References: <52118042.30101@oracle.com> <20130819054742.GA28062@bbox> <20130821074939.GE3022@bbox> <20130926055802.GA20634@bbox> <20130926075725.GA22339@bbox>
Date: Thu, 26 Sep 2013 16:48:03 +0800
Subject: Re: [BUG REPORT] ZSWAP: theoretical race condition issues
From: Weijie Yang
To: Minchan Kim
Cc: Bob Liu, Bob Liu, Seth Jennings, Linux-MM, Linux-Kernel
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 26, 2013 at 3:57 PM, Minchan Kim wrote:
> On Thu, Sep 26, 2013 at 03:26:33PM +0800, Weijie Yang wrote:
>> On Thu, Sep 26, 2013 at 1:58 PM, Minchan Kim wrote:
>> > Hello Weijie,
>> >
>> > On Wed, Sep 25, 2013 at 05:33:43PM +0800, Weijie Yang wrote:
>> >> On Wed, Sep 25, 2013 at 4:31 PM, Bob Liu wrote:
>> >> > On Wed, Sep 25, 2013 at 4:09 PM, Weijie Yang wrote:
>> >> >> I think I have found a new issue. To keep this mail thread intact, I am
>> >> >> replying to this mail.
>> >> >>
>> >> >> It is also a concurrency issue, occurring when a duplicate store and
>> >> >> reclaim run concurrently.
>> >> >>
>> >> >> zswap entry x with offset A is already stored in the zswap backend.
>> >> >> Consider the following scenario:
>> >> >>
>> >> >> thread 0: reclaim entry x (get refcount, but zswap_get_swap_cache_page not yet called)
>> >> >>
>> >> >> thread 1: store new page with the same offset A, alloc a new zswap entry y.
>> >> >> store finished.
>> >> >> shrink_page_list() calls __remove_mapping(), and now
>> >> >> it is not in swap_cache
>> >> >>
>> >> >
>> >> > But I don't think the swap layer will call zswap with the same offset A.
>> >>
>> >> 1. store page of offset A in zswap
>> >> 2. some time later, a page fault occurs; load page data from zswap.
>> >> But notice that zswap entry x is still in zswap because
>> >> frontswap_tmem_exclusive_gets_enabled is not set.
>> >
>> > frontswap_tmem_exclusive_gets_enabled is just an option to trade off
>> > CPU burning by frequent swapout against memory footprint from a duplicate
>> > copy in swap cache and frontswap backend, so it shouldn't affect stability.
>>
>> Thanks for explaining this.
>> I don't mean to say this option affects stability, but that zswap
>> only realize one option. Maybe it's better to realize both options
>> for different workloads.
>
> "zswap only realize one option"
> What does it mean? Sorry. I couldn't parse your intention. :)
> You mean zswap should do something special to support frontswap_tmem_exclusive_gets?

Yes. But I am not sure whether it is worth it.

>>
>> >> this page is with PageSwapCache(page) and page_private(page) = entry.val
>> >> 3. change this page data, and it becomes dirty
>> >
>> > If a non-shared swapin page becomes dirty again, it should remove the page from
>> > swapcache. If a shared swapin page becomes dirty again, it should do CoW, so it's a
>> > new page that doesn't live in swap cache. It means it should have a new
>> > offset, different from the old one, for swap out.
>> >
>> > What's wrong with that?
>>
>> It is really not a valid scenario for a duplicate store, and I cannot think of one.
>> If a duplicate store is impossible, how about deleting the handling code in zswap?
>> If it does exist, I think there is a potential issue as I described.
>
> You mean "zswap_duplicate_entry"?
> AFAIR, I already put this question to Seth when zswap was born, but AFAIRC
> he said that he didn't know the exact reason but he saw that case during
> an experiment, so he copied the code piece from zcache.
>
> Do you see the case, too?

Yes, I mean duplicate store. I checked /Documentation/vm/frontswap.txt; it
mentions "duplicate stores", but I am still confused.

I wrote a zcache variant which swaps out compressed pages to a swapfile.
I did see that case when I tested it on an Android smartphone (ARMv7), and
it happens only rarely. In one test, only 1 duplicate store occurred in
about 3157162 stores.

> Anyway, we need to dive into that to know what happens and then open
> our eyes for a clear solution before dumping a meaningless patch.
>
> I hope Seth or Bob already know it.
>
>>
>> >> 4. some time later again, swap this page out on the same offset A.
>> >>
>> >> so, a duplicate store happens.
>> >>
>> >> What I can think of is using flags and CAS to protect against store and
>> >> reclaim on the same offset happening concurrently.
>> >>
>> >> >> thread 0: zswap_get_swap_cache_page called. old page data is added to swap_cache
>> >> >>
>> >> >> Now, swap cache has old data rather than new data for offset A.
>> >> >> An error will happen if do_swap_page() gets the page from swap_cache.
>> >> >>
>> >> >
>> >> > --
>> >> > Regards,
>> >> > --Bob
>> >>
>> >> --
>> >> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> >> the body to majordomo@kvack.org. For more info on Linux MM,
>> >> see: http://www.linux-mm.org/ .
>> >> Don't email: email@kvack.org
>> >
>> > --
>> > Kind regards,
>> > Minchan Kim
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org. For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: email@kvack.org
>
> --
> Kind regards,
> Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
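
The "flags and CAS" idea mentioned in the thread could look roughly like the
userspace sketch below. It is only an illustration under assumptions: the
state names, struct entry, and all four functions are hypothetical and do not
exist in zswap, and C11 atomics stand in for the kernel's cmpxchg(). It
replays the thread 0 / thread 1 interleaving in a single thread and shows
reclaim noticing that the entry was replaced before it publishes old data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-entry states; none of these names exist in zswap. */
enum entry_state {
    ENTRY_LIVE,       /* stored, nobody is writing it back */
    ENTRY_RECLAIMING, /* reclaim holds the entry, swap cache untouched */
    ENTRY_WRITTEN,    /* reclaim published the data to the swap cache */
    ENTRY_STALE       /* a newer store for the same offset replaced it */
};

struct entry {
    unsigned long offset;
    atomic_int state;
};

/* Reclaim side, step 1: claim the entry. Fails unless it is LIVE. */
static bool reclaim_begin(struct entry *e)
{
    int expected = ENTRY_LIVE;
    return atomic_compare_exchange_strong(&e->state, &expected,
                                          ENTRY_RECLAIMING);
}

/*
 * Store side: a duplicate store retires the old entry. The returned
 * previous state tells the caller whether reclaim was in flight.
 */
static int store_retire_old(struct entry *e)
{
    return atomic_exchange(&e->state, ENTRY_STALE);
}

/* Reclaim side, step 2: publish only if nobody retired the entry. */
static bool reclaim_finish(struct entry *e)
{
    int expected = ENTRY_RECLAIMING;
    return atomic_compare_exchange_strong(&e->state, &expected,
                                          ENTRY_WRITTEN);
}

/*
 * Replays the interleaving from the scenario in one thread.
 * Returns true if reclaim detected the duplicate store and backed off.
 */
static bool replay_race(void)
{
    struct entry x;

    x.offset = 0xA;
    atomic_init(&x.state, ENTRY_LIVE);

    bool claimed = reclaim_begin(&x);   /* thread 0: reclaim entry x */
    assert(claimed);
    int prev = store_retire_old(&x);    /* thread 1: duplicate store */
    assert(prev == ENTRY_RECLAIMING);
    return !reclaim_finish(&x);         /* thread 0: must not publish */
}
```

In this sketch replay_race() returns true, i.e. the second CAS fails and
reclaim would skip adding the old data to the swap cache. What the store side
should do when store_retire_old() returns ENTRY_WRITTEN is left open here; a
real fix would also have to cover the window between the CAS and the actual
swap cache insertion.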