From: Mike Kravetz
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Naoya Horiguchi, Davidlohr Bueso, David Rientjes, Luiz Capitulino,
    Andrew Morton, Mike Kravetz
Subject: [PATCH v2 0/2] alloc_huge_page/hugetlb_reserve_pages race
Date: Fri, 22 May 2015 20:55:02 -0700
Message-Id: <1432353304-12767-1-git-send-email-mike.kravetz@oracle.com>
X-Mailer: git-send-email 2.1.0

This updated patch set includes new documentation for the region/reserve
map routines.  Since I am not the original author of this code, comments
would be appreciated.

While working on hugetlbfs fallocate support, I noticed the following
race in the existing code.  It is unlikely that this race is hit very
often in the current code.  However, if more functionality to add and
remove pages in hugetlbfs mappings (such as fallocate) is added, the
likelihood of hitting this race will increase.

alloc_huge_page and hugetlb_reserve_pages use information from the
reserve map to determine if there are enough available huge pages to
complete the operation, as well as to adjust global reserve and subpool
usage counts.  The order of operations is as follows:
- call region_chg() to determine the expected change based on the
  reserve map
- determine if enough resources are available for this operation
- adjust global counts based on the expected change
- call region_add() to update the reserve map
The issue is that the reserve map could change between the calls to
region_chg() and region_add().
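
To make the window concrete, here is a small user-space sketch of the
ordering above.  It is not the kernel code: the toy_* names, the flat
per-page array and the single counter are all made up for illustration,
and locking and subpool accounting are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAP_PAGES 16

/* Hypothetical stand-in for a reserve map: one flag per huge page index. */
struct toy_resv_map {
	bool reserved[TOY_MAP_PAGES];
};

/* In the spirit of region_chg(): count pages in [from, to) not yet in the map. */
static long toy_region_chg(const struct toy_resv_map *map, size_t from, size_t to)
{
	long chg = 0;

	assert(from <= to && to <= TOY_MAP_PAGES);
	for (size_t i = from; i < to; i++)
		if (!map->reserved[i])
			chg++;
	return chg;
}

/* In the spirit of region_add(): record [from, to) in the map. */
static void toy_region_add(struct toy_resv_map *map, size_t from, size_t to)
{
	assert(from <= to && to <= TOY_MAP_PAGES);
	for (size_t i = from; i < to; i++)
		map->reserved[i] = true;
}

/*
 * Walk the four steps for pages [0, 4) while a second task updates the
 * same map in between.  Returns how many pages the (toy) global count
 * was over-adjusted by.
 */
static long toy_stale_delta(void)
{
	struct toy_resv_map map = { .reserved = { false } };
	long global_resv = 0;	/* hypothetical global reserve count */

	/* Step 1: expected change, computed against the map as it is now. */
	long chg = toy_region_chg(&map, 0, 4);

	/* Steps 2-3: assume resources suffice, adjust the count by chg. */
	global_resv += chg;

	/* Race window: another task adds pages [0, 2) to the same map. */
	toy_region_add(&map, 0, 2);

	/* What the change would be if it were computed only now. */
	long still_missing = toy_region_chg(&map, 0, 4);

	/* Step 4: update the map; only still_missing pages are new. */
	toy_region_add(&map, 0, 4);

	return global_resv - still_missing;
}
```

In this toy run the count is adjusted by 4 in steps 2-3, but by the time
step 4 runs only 2 pages are new to the map, so toy_stale_delta()
returns 2.
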
In this case, the counters which were adjusted based on the output of
region_chg() will not be correct.

In order to hit this race today, there must be an existing shared hugetlb
mmap created with the MAP_NORESERVE flag.  A page fault to allocate a huge
page via this mapping must occur at the same time another task is mapping
the same region without the MAP_NORESERVE flag.

The patch set does not prevent the race from happening.  Rather, it adds
simple functionality to detect when the race has occurred.  If a race is
detected, then the incorrect counts are adjusted.

v2:
  Added documentation for the region/reserve map routines
  Created common routine for vma_needs_reservation and
    vma_commit_reservation to help prevent them from drifting
    apart in the future.

Mike Kravetz (2):
  mm/hugetlb: compute/return the number of regions added by region_add()
  mm/hugetlb: handle races in alloc_huge_page and hugetlb_reserve_pages

 mm/hugetlb.c | 154 +++++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 124 insertions(+), 30 deletions(-)

-- 
2.1.0