Date: Thu, 17 Nov 2016 08:19:40 -0800
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>,
        Greg KH <gregkh@linuxfoundation.org>,
        Will Deacon <will.deacon@arm.com>,
        "Reshetova, Elena" <elena.reshetova@intel.com>,
        Arnd Bergmann <arnd@arndb.de>, Thomas Gleixner <tglx@linutronix.de>,
        Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>,
        David Windsor <dave@progbits.org>, LKML <linux-kernel@vger.kernel.org>,
        Daniel Borkmann <daniel@iogearbox.net>
Subject: Re: [RFC][PATCH 2/7] kref: Add kref_read()
Message-ID: <20161117161937.GA46515@ast-mbp.thefacebook.com>
References: <CAADnVQJ4mQJ8XCq-fq2hRMchK9Q7zQNET4RqG3LXTcE2TU=k7Q@mail.gmail.com>
 <20161117085342.GB3142@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161117085342.GB3142@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 1642
Lines: 43

On Thu, Nov 17, 2016 at 09:53:42AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 16, 2016 at 12:08:52PM -0800, Alexei Starovoitov wrote:
> 
> > I prefer to avoid 'fixing' things that are not broken.
> > Note, prog->aux->refcnt already has explicit checks for overflow.
> > locked_vm is used for resource accounting and not refcnt,
> > so I don't see issues there either.
> 
> The idea is to use something along the lines of:
> 
>   http://lkml.kernel.org/r/20161115104608.GH3142@twins.programming.kicks-ass.net
> 
> for all refcounts in the kernel.

I understand the idea. I'm advocating to fix refcnts
explicitly the way we did in bpf land instead of leaking memory,
making processes unkillable and so on.
If refcnt can be bounds checked, it should be done that way, since
it's a clean error path without odd side effects.
Therefore I'm against unconditionally applying refcount to all atomics.

> Also note that your:
> 
> struct bpf_prog *bpf_prog_add(struct bpf_prog *prog, int i)
> {
>         if (atomic_add_return(i, &prog->aux->refcnt) > BPF_MAX_REFCNT) {
>                 atomic_sub(i, &prog->aux->refcnt);
>                 return ERR_PTR(-EBUSY);
>         }
>         return prog;
> }
> 
> is actually broken in the face of an actual overflow. Suppose @i is big
> enough to wrap refcnt into negative space.

'i' is not controlled by user. It's a number of nic hw queues
and BPF_MAX_REFCNT is 32k, so above is always safe.

> Also, the current sentiment is to strongly discourage add/sub operations
> for refcounts.

I agree with this reasoning as well, but it's not hard and fast rule.
If we know we can do 'add' safely, we should.