Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753002AbdCPRik (ORCPT ); Thu, 16 Mar 2017 13:38:40 -0400 Received: from mail-it0-f41.google.com ([209.85.214.41]:38802 "EHLO mail-it0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752622AbdCPRih (ORCPT ); Thu, 16 Mar 2017 13:38:37 -0400 MIME-Version: 1.0 In-Reply-To: <1489683534.28631.231.camel@edumazet-glaptop3.roam.corp.google.com> References: <1489678147-21404-1-git-send-email-elena.reshetova@intel.com> <1489678147-21404-8-git-send-email-elena.reshetova@intel.com> <1489683534.28631.231.camel@edumazet-glaptop3.roam.corp.google.com> From: Kees Cook Date: Thu, 16 Mar 2017 11:38:25 -0600 X-Google-Sender-Auth: 3xTF87TEtWOqW4OsddduWEHk9d0 Message-ID: Subject: Re: [PATCH 07/17] net: convert sock.sk_refcnt from atomic_t to refcount_t To: Eric Dumazet , Peter Zijlstra Cc: Elena Reshetova , Network Development , bridge@lists.linux-foundation.org, LKML , Alexey Kuznetsov , James Morris , Patrick McHardy , Stephen Hemminger , Hans Liljestrand , David Windsor Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2269 Lines: 60 On Thu, Mar 16, 2017 at 10:58 AM, Eric Dumazet wrote: > On Thu, 2017-03-16 at 17:28 +0200, Elena Reshetova wrote: >> refcount_t type and corresponding API should be >> used instead of atomic_t when the variable is used as >> a reference counter. This allows to avoid accidental >> refcounter overflows that might lead to use-after-free >> situations. > > > ... > >> static __always_inline void sock_hold(struct sock *sk) >> { >> - atomic_inc(&sk->sk_refcnt); >> + refcount_inc(&sk->sk_refcnt); >> } >> > > While I certainly see the value of these refcount_t, we have a very > different behavior on these atomic_inc() which were doing a single > inlined LOCK RMW on x86. I think we can certainly investigate arch-specific ways to improve the performance, but the consensus seemed to be that getting the infrastructure in and doing the migration was the first set of steps. > We now call an external function performing a > atomic_read(), various ops/tests, then atomic_cmpxchg_relaxed(), in a > loop, loosing the nice ability for x86 of preventing live locks. > > Looks a lot of bloat, just to be able to chase hypothetical bugs in the > kernel. > > I would love to have a way to enable extra debugging when I want a debug > kernel, like LOCKDEP or KASAN. > > By adding all this bloat, we assert linux kernel is terminally buggy and > every atomic_inc() we did was suspicious, and need to be always > instrumented/validated. This IS the assertion, unfortunately. With average 5 year lifetimes on security flaws[1], and many of the last couple years' public exploits being refcount flaws[2], this is something we have to get done. We need the default kernel to be much more self-protective, and this is one of many places to make it happen. I am, of course, biased, but I think the evidence of actual refcounting attacks outweighs the theoretical performance cost of these changes. If there is a realistic workflow that shows actual problems, let's examine it and find a solution for that as a separate part of this work without blocking this migration. -Kees [1] https://outflux.net/blog/archives/2016/10/18/security-bug-lifetime/ [2] http://kernsec.org/wiki/index.php/Bug_Classes/Integer_overflow -- Kees Cook Pixel Security