From: ebiederm@xmission.com (Eric W. Biederman)
To: "Reshetova\, Elena" <elena.reshetova@intel.com>
Cc: "linux-kernel\@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        "peterz\@infradead.org" <peterz@infradead.org>,
        "gregkh\@linuxfoundation.org" <gregkh@linuxfoundation.org>,
        "akpm\@linux-foundation.org" <akpm@linux-foundation.org>,
        "mingo\@redhat.com" <mingo@redhat.com>,
        "adobriyan\@gmail.com" <adobriyan@gmail.com>,
        "serge\@hallyn.com" <serge@hallyn.com>,
        "arozansk\@redhat.com" <arozansk@redhat.com>,
        "dave\@stgolabs.net" <dave@stgolabs.net>,
        "keescook\@chromium.org" <keescook@chromium.org>,
        Hans Liljestrand <ishkamiel@gmail.com>,
        "David Windsor" <dwindsor@gmail.com>
References: <1499417992-3238-1-git-send-email-elena.reshetova@intel.com>
        <1499417992-3238-2-git-send-email-elena.reshetova@intel.com>
        <87bmottgo4.fsf@xmission.com>
        <2236FBA76BA1254E88B949DDB74E612B6FF2269B@IRSMSX102.ger.corp.intel.com>
        <87k23gsn4u.fsf@xmission.com>
        <2236FBA76BA1254E88B949DDB74E612B6FF22730@IRSMSX102.ger.corp.intel.com>
        <878tjwpm7l.fsf@xmission.com>
        <2236FBA76BA1254E88B949DDB74E612B6FF227D2@IRSMSX102.ger.corp.intel.com>
Date: Mon, 10 Jul 2017 15:32:36 -0500
In-Reply-To: <2236FBA76BA1254E88B949DDB74E612B6FF227D2@IRSMSX102.ger.corp.intel.com>
        (Elena Reshetova's message of "Mon, 10 Jul 2017 12:11:32 +0000")
Message-ID: <874lukkp7f.fsf@xmission.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [PATCH 1/3] ipc: convert ipc_namespace.count from atomic_t to refcount_t
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 7046
Lines: 156

"Reshetova, Elena" <elena.reshetova@intel.com> writes:

>> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>> 
>> >> "Reshetova, Elena" <elena.reshetova@intel.com> writes:
>> >>
>> >> 2>> Elena Reshetova <elena.reshetova@intel.com> writes:
>> >> >>
>> >> >> > refcount_t type and corresponding API should be
>> >> >> > used instead of atomic_t when the variable is used as
>> >> >> > a reference counter. This allows to avoid accidental
>> >> >> > refcounter overflows that might lead to use-after-free
>> >> >> > situations.
>> >> >>
>> >> >> In this patch you can see all of the uses of the count.
>> >> >> What accidental refcount overflows are possible?
>> >> >
>> >> > Even if one can guarantee and prove that in the current implementation
>> >> > there are no overflows possible, we can't say that for
>> >> > sure for any future implementation. Bugs might always happen
>> >> > unfortunately, but if we convert the refcounter to a safer
>> >> > type we can be sure that overflows are not possible.
>> >> >
>> >> > Does it make sense to you?
>> >>
>> >> Not for code that is likely to remain unchanged for a decade no.
>> >
>> > Can we really be sure for any kernel code about this? And does it make
>> > sense to trust our security on a fact like this?
>> 
>> But refcount_t doesn't fix anything.  At best it changes a bad bug to a
>> less bad bug.  So now my machine OOMS instead of allows a memory
>> overwrite.   It still doesn't work.
>
> Well, it is a step forward from security standpoint. OOMS is really hard
> to exploit vs. memory overwrites. Pretty much all exploits need either
> memory write or memory read, out of memory is really much harder to
> exploit.

OOM in production is a denial of service attack.  From a serverity point
of view an OOM can be considered equivalent to a kernel panic and
something that requires a box to reboot.

>From a long term perspective I expect we will need to change all
reference counters to a type where that is not saturating but instead
fails to increment and returns an error.  If we want to keep a system
functioning in the face of maxing out a reference count that is the only
way it can truly be done.

>> Plus refcount_t does not provide any safety on the architectures where
>> it is a noop.
>
> Not sure I understood this. What do you mean by "noop"? 
> refcount_t is currently architecture independent.

noop being short for no operation.

I believe there were some/most archicture implementations that define
refcount_t to atomic_t out of performance concerns.  I know I saw
patches fly by to that effect.  On an architecture where refcount_t is
equivalent to atomic_t the change in these patches is a noop.

>> >> This looks like a large set of unautomated changes without any real
>> >> thought put into it.
>> >
>> > We are soon into the end of the first year that we started to look into
>> > refcounter overflow/underflow problem and coming up this far was
>> > not easy enough (just check all the millions of emails on kernel-hardening
>> > mailing list). Each refcount_t conversion candidate was first found by Coccinelle
>> > analysis and then manually checked and converted. The story of
>> > refcount_t API and all discussions go even further.
>> > So you can't really claim that there is no " thought put into it " :)
>> 
>> But the conversion of the instance happens without thought and manually.
>> Which is a good recipe for typos.  Which is what I am saying.
>> 
>> There have been lots of conversions like that in the kernel and
>> practically every one has introduced at least one typo.
>
> What do you exactly mean by "typo"? Typos should be detected at these
> stages:

An unintentional mistake.  The term "thinko" might be more what I am
thinking of.   Many human mistakes are not slips of the fingers but
accidentally giving the wrong command at some level.  

> 1) typos like wrong function name etc. can be found at compile time
>     (trust me I have found a number of these on the very first iteration with patches)
> 2) "typos" (not sure if it is correct to call them typos) like usage of wrong refcount_t
>       API vs. original atomic_t API can be found during internal reviews or reviews by maintainers

This I worry about most as the mental distance from xxx_inc to xxx_dec
can be very short.

> 3) much bigger problem is actually not any typos, but hidden issues that show up only
>     in run-time that detect underflows/overflows or inability to increment from zero. 
>  These only are nasty, but given that refcount_t WARNs left and right about them,
>   we can detect them fast.

Which means I worry about those less.

> I don't know what is a better recipe for doing API changes like this?
> Do you have any suggestions?

I would think a semantic patch targeting a specific lock would be less
error prone.  I would think that the same semantic patch could be used
from lock to lock with just a change of the lock that is being targeted.

I strongly suspect that would reduce the chance of accident when dealing
with a particular API and being scripted would increase the confidence
in the changes.

>> So from an engineering standpoint it is a very valid question to ask
>> about.  And I find the apparent insistence that you don't make typos
>> very disturbing.
>> 
>> >  That almost always results in a typo somewhere
>> >> that breaks things.
>> >>
>> >> So there is no benefit to the code, and a non-zero chance that there
>> >> will be a typo breaking the code.
>> >
>> > The code is very active on issuing WARNs when anything goes wrong.
>> > Using this feature we have not only found errors in conversions, but
>> > sometimes errors in code itself. So, any bug would be actually much
>> > faster visible than using old atomic_t interface.
>> >
>> > In addition by default refcount_t equals to atomic, which also gives a
>> > possibility to make a softer transition and catch all related bugs in couple
>> > of cycles when enabling CONFIG_REFCOUNT_FULL.
>> 
>> But if you make a typo and change one operation for another I don't see
>> how any of that applies.
>
> It is hard to make a "typo" to change one operation to another. It is not a
> one-two char mismatch error.

Those might be more properly "thinkos" but they do happen.

> When doing these patches we followed the
> logic of "less code changes - better" (since less chances of making mistake),
> so if in some cases functions are changed (like from atomic_sub to
> refcount_sub_and_test(), or from atomic_inc_not_zero() to atomic_inc() etc.)
> there was a reason for making it and the change wasn't automatic and without
> thinking at all. Again, we do have our maintainers also to catch if a change that
> we did doesn't actually work for them right?

In general a maintainers job is make certain the appropriate code review
and due dilligence has happened not nececessarily to perform that code
review themselves.

Changing from atomoic_inc_not_zero to a simple refcount_inc really
troubles me because it makes it harder to switch to something fully
correct.

Eric