Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756567AbcJWLh1 (ORCPT ); Sun, 23 Oct 2016 07:37:27 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:51980 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754033AbcJWLhY (ORCPT ); Sun, 23 Oct 2016 07:37:24 -0400 From: "Aneesh Kumar K.V" To: Mike Kravetz , linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Naoya Horiguchi , Michal Hocko , "Kirill A . Shutemov" , Hillf Danton , Dave Hansen , Jan Stancek , Mike Kravetz , stable@vger.kernel.org Subject: Re: [PATCH 1/1] mm/hugetlb: fix huge page reservation leak in private mapping error paths In-Reply-To: <1476933077-23091-2-git-send-email-mike.kravetz@oracle.com> References: <1476933077-23091-1-git-send-email-mike.kravetz@oracle.com> <1476933077-23091-2-git-send-email-mike.kravetz@oracle.com> Date: Sun, 23 Oct 2016 17:06:08 +0530 MIME-Version: 1.0 Content-Type: text/plain X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16102311-0004-0000-0000-000010AC1416 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00005963; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000187; SDB=6.00771641; UDB=6.00370229; IPR=6.00548377; BA=6.00004824; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00013082; XFM=3.00000011; UTC=2016-10-23 11:37:22 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16102311-0005-0000-0000-000079F68D2F Message-Id: <87vawjthzr.fsf@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2016-10-23_08:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=5 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1609300000 definitions=main-1610230211 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2935 Lines: 71 Mike Kravetz writes: > Error paths in hugetlb_cow() and hugetlb_no_page() may free a newly > allocated huge page. If a reservation was associated with the huge > page, alloc_huge_page() consumed the reservation while allocating. > When the newly allocated page is freed in free_huge_page(), it will > increment the global reservation count. However, the reservation entry > in the reserve map will remain. This is not an issue for shared > mappings as the entry in the reserve map indicates a reservation exists. > But, an entry in a private mapping reserve map indicates the reservation > was consumed and no longer exists. This results in an inconsistency > between the reserve map and the global reservation count. This 'leaks' > a reserved huge page. > > Create a new routine restore_reserve_on_error() to restore the reserve > entry in these specific error paths. This routine makes use of a new > function vma_add_reservation() which will add a reserve entry for a > specific address/page. > > In general, these error paths were rarely (if ever) taken on most > architectures. However, powerpc contained arch specific code that > that resulted in an extra fault and execution of these error paths > on all private mappings. > > Fixes: 67961f9db8c4 ("mm/hugetlb: fix huge page reserve accounting for private mappings) > > Cc: stable@vger.kernel.org > Reported-by: Jan Stancek > Signed-off-by: Mike Kravetz > --- > mm/hugetlb.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 66 insertions(+) > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index ec49d9e..418bf01 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -1826,11 +1826,17 @@ static void return_unused_surplus_pages(struct hstate *h, > * is not the case is if a reserve map was changed between calls. It > * is the responsibility of the caller to notice the difference and > * take appropriate action. > + * > + * vma_add_reservation is used in error paths where a reservation must > + * be restored when a newly allocated huge page must be freed. It is > + * to be called after calling vma_needs_reservation to determine if a > + * reservation exists. > */ > enum vma_resv_mode { > VMA_NEEDS_RESV, > VMA_COMMIT_RESV, > VMA_END_RESV, > + VMA_ADD_RESV, > }; > static long __vma_reservation_common(struct hstate *h, > struct vm_area_struct *vma, unsigned long addr, > @@ -1856,6 +1862,14 @@ static long __vma_reservation_common(struct hstate *h, > region_abort(resv, idx, idx + 1); > ret = 0; > break; > + case VMA_ADD_RESV: > + if (vma->vm_flags & VM_MAYSHARE) > + ret = region_add(resv, idx, idx + 1); > + else { > + region_abort(resv, idx, idx + 1); > + ret = region_del(resv, idx, idx + 1); > + } It is confusing to find ADD_RESV doing region_del, but I don't have suggestion for a better name. -aneesh