From: huang ying
Date: Sat, 4 Mar 2017 19:53:10 +0800
Subject: Re: [PATCH] mm, swap: Fix a race in free_swap_and_cache()
To: Andrew Morton
Cc: "Huang, Ying", Hugh Dickins, Shaohua Li, Minchan Kim, Rik van Riel, Tim Chen, linux-mm@kvack.org, LKML
In-Reply-To: <20170303144329.94d47b1015ba2f18f64c5893@linux-foundation.org>
References: <20170301143905.12846-1-ying.huang@intel.com> <20170303144329.94d47b1015ba2f18f64c5893@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Andrew,

Sorry, I clicked the wrong button in my mail client, so I forgot to Cc
the mailing list.  Sorry for the duplicate mail.

On Sat, Mar 4, 2017 at 6:43 AM, Andrew Morton wrote:
> On Wed, 1 Mar 2017 22:38:09 +0800 "Huang, Ying" wrote:
>
>> Before using cluster lock in free_swap_and_cache(), the
>> swap_info_struct->lock will be held during freeing the swap entry and
>> acquiring page lock, so the page swap count will not change when
>> testing page information later.  But after using cluster lock, the
>> cluster lock (or swap_info_struct->lock) will be held only during
>> freeing the swap entry.  So before acquiring the page lock, the page
>> swap count may be changed in another thread.  If the page swap count
>> is not 0, we should not delete the page from the swap cache.  This is
>> fixed via checking page swap count again after acquiring the page
>> lock.
>
> What are the user-visible runtime effects of this bug?  Please always
> include this info when fixing things, thanks.

Sure.
I found the race while reviewing the code, so I have not triggered it
with a test program.  If the race occurs for an anonymous page shared by
multiple processes via fork, multiple pages will be allocated and
swapped in from the swap device for what was previously one shared
page.  That is, the user-visible runtime effect is that more memory is
used and the access latency for the page is higher, i.e. a performance
regression.

Best Regards,
Huang, Ying
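
To make the window concrete, here is a rough user-space sketch of the pattern described in the quoted patch text: the swap count is dropped under a short-lived lock, and then checked again once the page lock is held. This is only a toy model under my own assumptions. The `toy_*` names, `struct toy_page`, and the pthread mutexes are hypothetical stand-ins, not the kernel's data structures or the actual patch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Toy user-space model; all names here are hypothetical, not kernel APIs. */
struct toy_page {
    pthread_mutex_t page_lock;  /* stands in for the page lock */
    int swap_count;             /* stands in for the page swap count */
    bool in_swap_cache;         /* true while the page is in the swap cache */
};

/* Stands in for the cluster lock (or swap_info_struct->lock). */
pthread_mutex_t toy_cluster_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read the swap count under the cluster lock. */
int toy_swap_count(struct toy_page *page)
{
    pthread_mutex_lock(&toy_cluster_lock);
    int count = page->swap_count;
    pthread_mutex_unlock(&toy_cluster_lock);
    return count;
}

/* Step 1: drop one reference. The lock is held only for the decrement,
 * so the returned count may be stale as soon as we return. */
int toy_put_swap_ref(struct toy_page *page)
{
    pthread_mutex_lock(&toy_cluster_lock);
    int count = --page->swap_count;
    pthread_mutex_unlock(&toy_cluster_lock);
    return count;
}

/* What another thread may do in the window between step 1 and step 2. */
void toy_get_swap_ref(struct toy_page *page)
{
    pthread_mutex_lock(&toy_cluster_lock);
    page->swap_count++;
    pthread_mutex_unlock(&toy_cluster_lock);
}

/* Step 2: take the page lock, check the swap count again, and remove the
 * page from the swap cache only if the count is still 0. */
bool toy_try_drop_cache(struct toy_page *page)
{
    bool dropped = false;

    pthread_mutex_lock(&page->page_lock);
    if (page->in_swap_cache && toy_swap_count(page) == 0) {
        page->in_swap_cache = false;
        dropped = true;
    }
    pthread_mutex_unlock(&page->page_lock);
    return dropped;
}
```

In this model, if one caller sees `toy_put_swap_ref()` return 0 and another thread calls `toy_get_swap_ref()` before the page lock is taken, `toy_try_drop_cache()` leaves the page in the cache. Without the second check, the page would be removed even though it is still referenced, which corresponds to the extra allocation and swap-in described above.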