In-Reply-To: <20170329141711.50c183a7bb1bfa75e24d4426@linux-foundation.org>
References: <1490821682-23228-1-git-send-email-mike.kravetz@oracle.com> <20170329141711.50c183a7bb1bfa75e24d4426@linux-foundation.org>
From: Dmitry Vyukov
Date: Thu, 30 Mar 2017 14:28:58 +0200
Subject: Re: [PATCH RESEND] mm/hugetlb: Don't call region_abort if region_chg fails
To: Andrew Morton
Cc: Mike Kravetz, "linux-mm@kvack.org", LKML, Hillf Danton, Michal Hocko, "Kirill A . Shutemov", Andrey Ryabinin, Naoya Horiguchi

On Wed, Mar 29, 2017 at 11:17 PM, Andrew Morton wrote:
> On Wed, 29 Mar 2017 14:08:02 -0700 Mike Kravetz wrote:
>
>> Resending because of typo in Andrew's e-mail when first sent
>>
>> Changes to hugetlbfs reservation maps is a two step process. The first
>> step is a call to region_chg to determine what needs to be changed, and
>> prepare that change. This should be followed by a call to call to
>> region_add to commit the change, or region_abort to abort the change.
>>
>> The error path in hugetlb_reserve_pages called region_abort after a
>> failed call to region_chg. As a result, the adds_in_progress counter
>> in the reservation map is off by 1. This is caught by a VM_BUG_ON
>> in resv_map_release when the reservation map is freed.
>>
>> syzkaller fuzzer found this bug, that resulted in the following:
>
> I'll change the above to
>
> : syzkaller fuzzer (when using an injected kmalloc failure) found this bug,
> : that resulted in the following:
>
> it's important, because this bug won't be triggered (at all easily, at
> least) in real-world workloads.

I wonder if memory-constrained cgroups make such bugs much easier to trigger.
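
For anyone following the thread, the counter imbalance described in the changelog can be sketched with a toy user-space model. This is not the code in mm/hugetlb.c: every toy_* name is made up for illustration, the inject_failure flag stands in for the injected kmalloc failure, and the model assumes that a failed region_chg leaves adds_in_progress unchanged while region_add and region_abort each decrement it.

```c
#include <stdbool.h>

/* Toy stand-in for the reservation map; only the counter is modeled. */
struct toy_resv_map {
	long adds_in_progress;
};

/* Step 1: prepare a change. Assumption: on failure nothing stays in progress. */
static int toy_region_chg(struct toy_resv_map *resv, bool inject_failure)
{
	if (inject_failure)
		return -1;	/* models the injected allocation failure */
	resv->adds_in_progress++;
	return 0;
}

/* Step 2a: commit a change that a successful toy_region_chg prepared. */
static void toy_region_add(struct toy_resv_map *resv)
{
	resv->adds_in_progress--;
}

/* Step 2b: abort a change that a successful toy_region_chg prepared. */
static void toy_region_abort(struct toy_resv_map *resv)
{
	resv->adds_in_progress--;
}

/*
 * Hypothetical caller in the shape of the error path under discussion.
 * abort_on_chg_failure == true models the old behaviour, false the fix.
 */
static int toy_reserve(struct toy_resv_map *resv, bool inject_failure,
		       bool abort_on_chg_failure)
{
	if (toy_region_chg(resv, inject_failure) < 0) {
		if (abort_on_chg_failure)
			toy_region_abort(resv);	/* unbalanced: nothing was prepared */
		return -1;
	}
	toy_region_add(resv);
	return 0;
}

/* Models the release-time sanity check: the counter must be back at zero. */
static bool toy_release_check_passes(const struct toy_resv_map *resv)
{
	return resv->adds_in_progress == 0;
}

/* Runs one failed reservation and reports the resulting counter value. */
static long toy_counter_after_failed_reserve(bool abort_on_chg_failure)
{
	struct toy_resv_map resv = { 0 };

	toy_reserve(&resv, true, abort_on_chg_failure);
	return resv.adds_in_progress;
}
```

In this model, toy_counter_after_failed_reserve(true) leaves the counter at -1, so the release-time check would fail, while toy_counter_after_failed_reserve(false) leaves it at 0.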