Date: Wed, 20 Sep 2017 11:01:47 +0200
From: Artem Savkov
To: Shaohua Li, Minchan Kim
Cc: Michal Hocko, Johannes Weiner, Hillf Danton, Jan Stancek, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: MADV_FREE is broken
Message-ID: <20170920090147.5iqdkctmw7ujlmt3@shodan.usersys.redhat.com>

Hi All,

We recently started noticing the madvise09[1] test from ltp failing strangely. The test does the following: it maps 32 pages, sets MADV_FREE for the range it got, dirties 2 of the pages, creates memory pressure and checks that the non-dirty pages have been freed.

The test hung while accessing the last 4 pages (29-32) of the madvised range at line 121 [2]. Any other process (gdb/cat) accessing those pages would also hang, as would rebooting the machine. It doesn't trigger any debug warnings or KASAN reports.

The issue bisected to "802a3a92ad7a mm: reclaim MADV_FREE pages" (so 4.12 and up are affected).

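For readers who don't have the test at hand, the sequence described above can be sketched roughly as follows. This is only an illustration, not the code from madvise09.c: the helper name map_and_madvise_free(), the fill bytes and the choice of which two pages get dirtied (DIRTY_A, DIRTY_B) are assumptions, and the memory-pressure and verification steps are left out entirely.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_FREE
#define MADV_FREE 8	/* Linux value, in case the libc headers predate it */
#endif

#define NPAGES	32	/* size of the range, as in the report */
#define DIRTY_A	2	/* assumption: which pages to re-dirty is arbitrary here */
#define DIRTY_B	10

/*
 * Hypothetical helper (not from LTP): map NPAGES anonymous pages, touch
 * them all, mark the whole range MADV_FREE and then dirty two pages again.
 * Returns the mapping, or NULL if mmap() or madvise() fails.
 */
static char *map_and_madvise_free(size_t page_size)
{
	size_t len = NPAGES * page_size;
	char *p;

	assert(page_size > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return NULL;

	/* Fault in every page so there is something to lazily free. */
	memset(p, 'a', len);

	if (madvise(p, len, MADV_FREE) != 0) {
		munmap(p, len);
		return NULL;
	}

	/* A write after MADV_FREE cancels the lazy free for that page. */
	p[DIRTY_A * page_size] = 'b';
	p[DIRTY_B * page_size] = 'b';

	return p;
}
```

Under memory pressure the untouched pages may be reclaimed and read back as zeroes, while the two re-dirtied pages have to keep their contents. On an affected kernel, reading the tail of such a range is where the hang showed up.
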
I did some poking around and found out that the "bad" pages had the SwapBacked flag set in shrink_page_list(), which confused it a lot. It looks like mark_page_lazyfree() only calls lru_lazyfree_fn() when the pagevec is full (that is, in batches of 14) and never drains the rest (so the last four in the madvise09 case, since 32 = 2 * 14 + 4).

The patch below greatly reduces the failure rate but doesn't fix it completely: the problem still shows up with the same symptoms (hanging when trying to access the last 4 pages) after a bunch of retries.

[1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise09.c
[2] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise09.c#L121

diff --git a/mm/madvise.c b/mm/madvise.c
index 21261ff0466f..a0b868e8b7d2 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -453,6 +453,7 @@ static void madvise_free_page_range(struct mmu_gather *tlb,
 
 	tlb_start_vma(tlb, vma);
 	walk_page_range(addr, end, &free_walk);
+	lru_add_drain();
 	tlb_end_vma(tlb, vma);
 }
 
-- 
Regards,
  Artem