Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752326AbdFEQWR (ORCPT ); Mon, 5 Jun 2017 12:22:17 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:58474 "EHLO
	mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752026AbdFEQWP (ORCPT );
	Mon, 5 Jun 2017 12:22:15 -0400
From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Yisheng Xie,
	Kefeng Wang, Vlastimil Babka, Joern Engel, Mel Gorman,
	Michel Lespinasse, Hugh Dickins, Rik van Riel, Johannes Weiner,
	Michal Hocko, Xishi Qiu, zhongjiang, Hanjun Guo, Andrew Morton,
	Linus Torvalds
Subject: [PATCH 4.4 38/53] mlock: fix mlock count can not decrease in race condition
Date: Mon, 5 Jun 2017 18:17:36 +0200
Message-Id: <20170605153039.683884560@linuxfoundation.org>
X-Mailer: git-send-email 2.13.0
In-Reply-To: <20170605153037.105331684@linuxfoundation.org>
References: <20170605153037.105331684@linuxfoundation.org>
User-Agent: quilt/0.65
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3231
Lines: 113

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yisheng Xie

commit 70feee0e1ef331b22cc51f383d532a0d043fbdcc upstream.

Kefeng reported that when running the following test, the mlock count in
meminfo increases permanently:

 [1] testcase
 linux:~ # cat test_mlockal
 grep Mlocked /proc/meminfo
 for j in `seq 0 10`
 do
 	for i in `seq 4 15`
 	do
 		./p_mlockall >> log &
 	done
 	sleep 0.2
 done
 # wait some time to let the mlock counter decrease; 5s may not be enough
 sleep 5
 grep Mlocked /proc/meminfo

 linux:~ # cat p_mlockall.c
 #include <sys/mman.h>
 #include <stdlib.h>
 #include <stdio.h>

 #define SPACE_LEN	4096

 int main(int argc, char ** argv)
 {
 	int ret;
 	void *adr = malloc(SPACE_LEN);
 	if (!adr)
 		return -1;

 	ret = mlockall(MCL_CURRENT | MCL_FUTURE);
 	printf("mlockall ret = %d\n", ret);

 	ret = munlockall();
 	printf("munlockall ret = %d\n", ret);

 	free(adr);
 	return 0;
 }

In __munlock_pagevec() we should decrement NR_MLOCK for each page where
we clear the PageMlocked flag.  Commit 1ebb7cc6a583 ("mm: munlock: batch
NR_MLOCK zone state updates") introduced a bug where we don't decrement
NR_MLOCK for pages where we clear the flag, but fail to isolate them
from the lru list (e.g. when the pages are on some other cpu's percpu
pagevec).  Since PageMlocked stays cleared, the NR_MLOCK accounting gets
permanently disrupted by this.

Fix it by counting the number of pages whose PageMlocked flag is cleared.

Fixes: 1ebb7cc6a583 ("mm: munlock: batch NR_MLOCK zone state updates")
Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie
Reported-by: Kefeng Wang
Tested-by: Kefeng Wang
Cc: Vlastimil Babka
Cc: Joern Engel
Cc: Mel Gorman
Cc: Michel Lespinasse
Cc: Hugh Dickins
Cc: Rik van Riel
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Xishi Qiu
Cc: zhongjiang
Cc: Hanjun Guo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

---
 mm/mlock.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -277,7 +277,7 @@ static void __munlock_pagevec(struct pag
 {
 	int i;
 	int nr = pagevec_count(pvec);
-	int delta_munlocked;
+	int delta_munlocked = -nr;
 	struct pagevec pvec_putback;
 	int pgrescued = 0;
 
@@ -297,6 +297,8 @@ static void __munlock_pagevec(struct pag
 				continue;
 			else
 				__munlock_isolation_failed(page);
+		} else {
+			delta_munlocked++;
 		}
 
 		/*
@@ -308,7 +310,6 @@ static void __munlock_pagevec(struct pag
 		pagevec_add(&pvec_putback, pvec->pages[i]);
 		pvec->pages[i] = NULL;
 	}
-	delta_munlocked = -nr + pagevec_count(&pvec_putback);
 	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
 	spin_unlock_irq(&zone->lru_lock);
 
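
The accounting difference between the old and new computation of
delta_munlocked can be illustrated with a small userspace model.  This
is only a sketch and not kernel code: struct model_page, its on_lru
field, test_clear_mlocked(), delta_before_fix() and delta_after_fix()
are hypothetical names made up for illustration, and a failed LRU
isolation is modelled simply as a page with on_lru == false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two facts that matter. */
struct model_page {
	bool mlocked;	/* models the PageMlocked flag */
	bool on_lru;	/* if false, isolating the page from the LRU fails */
};

/* Models TestClearPageMlocked(): clear the flag, return its old value. */
static bool test_clear_mlocked(struct model_page *page)
{
	bool was_set = page->mlocked;

	page->mlocked = false;
	return was_set;
}

/*
 * Pre-fix accounting: delta = -nr + (number of pages put back).
 * A page whose flag was cleared but which could not be isolated is put
 * back as well, so it is wrongly excluded from the decrement.
 */
static int delta_before_fix(struct model_page *pages, int nr)
{
	int putback = 0;

	assert(nr >= 0);
	for (int i = 0; i < nr; i++) {
		if (test_clear_mlocked(&pages[i]) && pages[i].on_lru)
			continue;	/* isolated, handled in the next phase */
		putback++;
	}
	return -nr + putback;
}

/*
 * Post-fix accounting: start from -nr and add one back only for pages
 * whose flag was not set, so every cleared flag is counted.
 */
static int delta_after_fix(struct model_page *pages, int nr)
{
	int delta = -nr;

	assert(nr >= 0);
	for (int i = 0; i < nr; i++) {
		if (!test_clear_mlocked(&pages[i]))
			delta++;
	}
	return delta;
}
```

For a batch of four pages where three are mlocked and one of those three
is not on the LRU, delta_before_fix() yields -2 while delta_after_fix()
yields -3.  The missing decrement in the first case corresponds to the
page whose flag was cleared without NR_MLOCK being adjusted.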