From: Cong Wang
Date: Fri, 9 Dec 2016 20:13:44 -0800
Subject: Re: netlink: GPF in sock_sndtimeo
To: Richard Guy Briggs
Cc: linux-audit@redhat.com, Paul Moore, Dmitry Vyukov, David Miller, Johannes Berg, Florian Westphal, Eric Dumazet, Herbert Xu, netdev, LKML, syzkaller
In-Reply-To: <20161209110155.GW22655@madcap2.tricolour.ca>

On Fri, Dec 9, 2016 at 3:01 AM, Richard Guy Briggs wrote:
> On 2016-12-08 22:57, Cong Wang wrote:
>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs wrote:
>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
>> > Eliminating the lock since the sock is dead anyway eliminates the error.
>> >
>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to
>> > get the test case to compile.
>>
>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
>> are updated as a whole and race between audit_receive_msg() and
>> NETLINK_URELEASE.
>
> This is what I expected and why I originally added the mutex lock in the
> callback...
> The dumps I got were bare with no wrapper identifying the
> process context or specific error, so I'm at a bit of a loss how to
> solve this (without thinking more about it) other than instinctively
> removing the mutex.

The netlink notifier can safely be converted to a blocking one; I will
send a patch.

But I seriously doubt you really need NETLINK_URELEASE here. It adds
nothing but overhead, because the netlink notifier is called for every
netlink socket in the system, whereas the net exit path is a relatively
slow path.

Also, kauditd_send_skb() needs audit_cmd_mutex too.

I will send a formal patch.

Thanks.
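
To illustrate the "updated as a whole" point, here is a rough userspace sketch, not kernel code and not the patch mentioned above. It assumes a pthread mutex standing in for audit_cmd_mutex and a plain struct standing in for audit_sock, audit_nlk_portid and audit_pid; the names audit_snapshot, audit_lock, audit_state_set(), audit_state_release() and audit_state_read() are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the three kernel globals. */
struct audit_snapshot {
	const void *sock;     /* models audit_sock */
	uint32_t nlk_portid;  /* models audit_nlk_portid */
	int pid;              /* models audit_pid */
};

/* Plays the role of audit_cmd_mutex in this sketch. */
static pthread_mutex_t audit_lock = PTHREAD_MUTEX_INITIALIZER;
static struct audit_snapshot audit_state; /* zeroed: no daemon registered */

/* Set path (roughly what audit_receive_msg() does): all three fields
 * change inside one critical section. */
static void audit_state_set(const void *sock, uint32_t portid, int pid)
{
	assert(sock != NULL); /* resets go through audit_state_release() */
	pthread_mutex_lock(&audit_lock);
	audit_state.sock = sock;
	audit_state.nlk_portid = portid;
	audit_state.pid = pid;
	pthread_mutex_unlock(&audit_lock);
}

/* Release path (roughly the NETLINK_URELEASE case): reset only if the
 * dying socket is the recorded one, under the same lock.
 * Returns 1 if the state was reset, 0 otherwise. */
static int audit_state_release(const void *sock)
{
	int was_ours;

	pthread_mutex_lock(&audit_lock);
	was_ours = (audit_state.sock == sock);
	if (was_ours) {
		audit_state.sock = NULL;
		audit_state.nlk_portid = 0;
		audit_state.pid = 0;
	}
	pthread_mutex_unlock(&audit_lock);
	return was_ours;
}

/* Readers copy the triple under the lock so they never see a mix of
 * old and new values. */
static struct audit_snapshot audit_state_read(void)
{
	struct audit_snapshot snap;

	pthread_mutex_lock(&audit_lock);
	snap = audit_state;
	pthread_mutex_unlock(&audit_lock);
	return snap;
}
```

If the lock were dropped from the release path only, a reader could observe, for example, a new sock paired with a stale portid, which is the race described above. Build with -pthread. In the kernel the callback would additionally have to run in a context that may sleep, which is why the notifier type matters.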