MIME-Version: 1.0
In-Reply-To: <20170404201334.GV15132@dhcp22.suse.cz>
References: <20170331164028.GA118828@beast> <20170404113022.GC15490@dhcp22.suse.cz>
 <alpine.DEB.2.20.1704041005570.23420@east.gentwo.org> <20170404151600.GN15132@dhcp22.suse.cz>
 <alpine.DEB.2.20.1704041412050.27424@east.gentwo.org> <20170404194220.GT15132@dhcp22.suse.cz>
 <alpine.DEB.2.20.1704041457030.28085@east.gentwo.org> <20170404201334.GV15132@dhcp22.suse.cz>
From: Kees Cook <keescook@chromium.org>
Date: Mon, 10 Apr 2017 21:58:22 -0700
Message-ID: <CAGXu5jL1t2ZZkwnGH9SkFyrKDeCugSu9UUzvHf3o_MgraDFL1Q@mail.gmail.com>
Subject: Re: [PATCH] mm: Add additional consistency check
To: Michal Hocko <mhocko@kernel.org>
Cc: Christoph Lameter <cl@linux.com>,
        Andrew Morton <akpm@linux-foundation.org>,
        Pekka Enberg <penberg@kernel.org>,
        David Rientjes <rientjes@google.com>,
        Joonsoo Kim <iamjoonsoo.kim@lge.com>, Linux-MM <linux-mm@kvack.org>,
        LKML <linux-kernel@vger.kernel.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1676
Lines: 40

On Tue, Apr 4, 2017 at 1:13 PM, Michal Hocko <mhocko@kernel.org> wrote:
> On Tue 04-04-17 14:58:06, Christoph Lameter wrote:
>> On Tue, 4 Apr 2017, Michal Hocko wrote:
>>
>> > On Tue 04-04-17 14:13:06, Christoph Lameter wrote:
>> > > On Tue, 4 Apr 2017, Michal Hocko wrote:
>> > >
>> > > > Yes, but we do not have to blow up the kernel, right? Why cannot we simply
>> > > > leak that memory?
>> > >
>> > > Because it is a serious bug to attempt to free a non slab object using
>> > > slab operations. This is often the result of memory corruption, coding
>> > > errors etc. The system needs to stop right there.
>> >
>> > Why, when the alternative is a memory leak?
>>
>> Because the slab allocators fail also in case you free an object multiple
>> times etc etc. Continuation is supported by enabling a special resiliency
>> feature via the kernel command line. The alternative is selectable but not
>> the default.
>
> I disagree! We should try to continue as long as we _know_ that the
> internal state of the allocator is still consistent and a further
> operation will not spread the corruption even more. This is clearly not
> the case for an invalid pointer to kfree.
>
> I can see why checking for an early allocator corruption is not always
> feasible and you can only detect it after the fact, but this is not the case
> here and putting your system down just because some buggy code is trying
> to free something it hasn't allocated is not really useful. I completely
> agree with Linus that we overuse BUG way too much and this is just
> another example of it.

Instead of the proposed BUG here, what's the correct "safe" return value?
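
To make the "warn and leak" option from the thread concrete, here is a
rough userspace sketch. It is not kernel code and not the patch under
discussion: toy_pool, toy_alloc and toy_free are made-up names, and the
-1 return value is only an assumption about what a "safe" result could
look like for a free path that can report failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define POOL_OBJS 4
#define OBJ_SIZE  32

/* Hypothetical toy pool standing in for a slab cache; not kernel code. */
struct toy_pool {
	unsigned char mem[POOL_OBJS][OBJ_SIZE];
	bool in_use[POOL_OBJS];
	unsigned long rejected;	/* frees refused because the pointer was bad */
};

/* Hand out the first unused object, or NULL when the pool is exhausted. */
void *toy_alloc(struct toy_pool *p)
{
	for (size_t i = 0; i < POOL_OBJS; i++) {
		if (!p->in_use[i]) {
			p->in_use[i] = true;
			return p->mem[i];
		}
	}
	return NULL;
}

/*
 * Free ptr if it is a live object of this pool. Otherwise warn, leave the
 * pool state untouched (whatever ptr referred to is leaked), and return -1
 * instead of aborting the program.
 */
int toy_free(struct toy_pool *p, void *ptr)
{
	uintptr_t addr = (uintptr_t)ptr;
	uintptr_t base = (uintptr_t)&p->mem[0][0];
	uintptr_t end = base + sizeof(p->mem);

	if (addr < base || addr >= end || (addr - base) % OBJ_SIZE != 0 ||
	    !p->in_use[(addr - base) / OBJ_SIZE]) {
		fprintf(stderr, "toy_free: rejecting bad pointer %p\n", ptr);
		p->rejected++;
		return -1;
	}

	p->in_use[(addr - base) / OBJ_SIZE] = false;
	return 0;
}
```

In this toy, a foreign pointer or a double free is counted and refused,
and the pool's own bookkeeping is never modified on the error path. What
the equivalent return value should be in the real code is exactly the
open question.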

-Kees

-- 
Kees Cook
Pixel Security