Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753104AbcLLUSp (ORCPT );
	Mon, 12 Dec 2016 15:18:45 -0500
Received: from mail-vk0-f67.google.com ([209.85.213.67]:36790 "EHLO
	mail-vk0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752463AbcLLUSn (ORCPT );
	Mon, 12 Dec 2016 15:18:43 -0500
MIME-Version: 1.0
X-Originating-IP: [96.230.190.88]
In-Reply-To: <5714bd7468cfec225407a6c367e658478d590495.1481534171.git.rgb@redhat.com>
References: <20161212100215.GA1305@madcap2.tricolour.ca>
	<5714bd7468cfec225407a6c367e658478d590495.1481534171.git.rgb@redhat.com>
From: Paul Moore
Date: Mon, 12 Dec 2016 15:18:41 -0500
Message-ID:
Subject: Re: [PATCH v2] audit: use proper refcount locking on audit_sock
To: Richard Guy Briggs
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-audit@redhat.com, edumazet@google.com,
	xiyou.wangcong@gmail.com, dvyukov@google.com
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3778
Lines: 91

On Mon, Dec 12, 2016 at 5:03 AM, Richard Guy Briggs wrote:
> Resetting audit_sock appears to be racy.
>
> audit_sock was being copied and dereferenced without using a refcount on
> the source sock.
>
> Bump the refcount on the underlying sock when we store a refrence in
> audit_sock and release it when we reset audit_sock.  audit_sock
> modification needs the audit_cmd_mutex.
>
> See: https://lkml.org/lkml/2016/11/26/232
>
> Thanks to Eric Dumazet and Cong Wang
> on ideas how to fix it.
>
> Signed-off-by: Richard Guy Briggs
> ---
> There has been a lot of change in the audit code that is about to go
> upstream to address audit queue issues.
> This patch is based on the
> source tree: git://git.infradead.org/users/pcmoore/audit#next
> ---
>  kernel/audit.c |   34 ++++++++++++++++++++++++++++------
>  1 files changed, 28 insertions(+), 6 deletions(-)

My previous question about testing still stands, but I took a closer
look and have some additional comments, see below ...

> diff --git a/kernel/audit.c b/kernel/audit.c
> index f20eee0..439f7f3 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -452,7 +452,9 @@ static void auditd_reset(void)
>  	struct sk_buff *skb;
>
>  	/* break the connection */
> +	sock_put(audit_sock);
>  	audit_pid = 0;
> +	audit_nlk_portid = 0;
>  	audit_sock = NULL;
>
>  	/* flush all of the retry queue to the hold queue */
> @@ -478,6 +480,12 @@ static int kauditd_send_unicast_skb(struct sk_buff *skb)
>  	if (rc >= 0) {
>  		consume_skb(skb);
>  		rc = 0;
> +	} else {
> +		if (rc & (-ENOMEM|-EPERM|-ECONNREFUSED)) {

I dislike the way you wrote this because instead of simply looking at
this to see if it is correct I need to sort out all the bits and find
out if there are other error codes that could run afoul of this check
... make it simple, e.g. (rc == -ENOMEM || rc == -EPERM || ...).

Actually, since EPERM is 1, -EPERM (-1, which is 0xffffffff in two's
complement) is going to cause this to be true for pretty much any
value of rc, yes?

> +			mutex_lock(&audit_cmd_mutex);
> +			auditd_reset();
> +			mutex_unlock(&audit_cmd_mutex);
> +		}

The code in audit#next handles netlink_unicast() errors in
kauditd_thread() and you are adding error handling code here in
kauditd_send_unicast_skb() ... that's messy.  I don't care too much
where the auditd_reset() call is made, but let's only do it in one
function; FWIW, I originally put the error handling code in
kauditd_thread() because there was other error handling code that
needed to be done in that scope so it resulted in cleaner code.

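To make the bitmask point above concrete, here is a small userspace
sketch (not kernel code).  The helper names mask_check(),
explicit_check() and mask_demo() are made up for illustration, and the
set of error codes is simply the one from the quoted hunk, not a
statement about which codes ought to be treated as fatal.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: mirrors the bitmask test from the quoted hunk. */
static bool mask_check(int rc)
{
	/*
	 * -EPERM is -1, i.e. all bits set in two's complement, so the
	 * OR-ed mask is also -1 and (rc & mask) is just rc.  The test is
	 * therefore true for every non-zero rc.
	 */
	return (rc & (-ENOMEM | -EPERM | -ECONNREFUSED)) != 0;
}

/* Hypothetical helper: the explicit comparison form suggested above. */
static bool explicit_check(int rc)
{
	return rc == -ENOMEM || rc == -EPERM || rc == -ECONNREFUSED;
}

/* Exercises both helpers with an error code that is not in the list. */
void mask_demo(void)
{
	assert(mask_check(-EINVAL));      /* mask form matches it anyway */
	assert(!explicit_check(-EINVAL)); /* explicit form does not */
	assert(explicit_check(-EPERM));   /* listed codes still match */
}
```

The explicit form also has the advantage that a single predicate like
this could be shared by every caller that needs to classify a
netlink_unicast() return code.
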
Related, I see you are now considering ENOMEM to be a fatal condition;
that differs from the AUDITD_BAD macro in kauditd_thread(), and this
difference needs to be reconciled.

Finally, you should update the comment header block for auditd_reset()
to note that it needs to be called with the audit_cmd_mutex held.

> @@ -1004,17 +1018,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  				return -EACCES;
>  			}
>  			if (audit_pid && new_pid &&
> -			    audit_replace(requesting_pid) != -ECONNREFUSED) {
> +			    (audit_replace(requesting_pid) & (-ECONNREFUSED|-EPERM|-ENOMEM))) {

Do we simply want to treat any error here as fatal, and not just
ECONN/EPERM/ENOMEM?  If not, let's come up with a single macro to
handle the fatal netlink_unicast() return codes so we have some chance
to keep things consistent in the future.

-- 
paul moore
www.paul-moore.com