From: Dmitry Vyukov
Date: Mon, 12 Dec 2016 11:07:41 +0100
Subject: Re: netlink: GPF in sock_sndtimeo
To: syzkaller
Cc: Richard Guy Briggs , linux-audit@redhat.com, Paul Moore , David Miller , Johannes Berg , Florian Westphal , Eric Dumazet , Herbert Xu , netdev , LKML
References: <20161129164859.GD26673@madcap2.tricolour.ca> <20161130045207.GE26673@madcap2.tricolour.ca> <20161209060248.GT22655@madcap2.tricolour.ca> <20161209110155.GW22655@madcap2.tricolour.ca>

On Sat, Dec 10, 2016 at 8:40 AM, Cong Wang wrote:
>>> On 2016-12-08 22:57, Cong Wang wrote:
>>>> On Thu, Dec 8, 2016 at 10:02 PM, Richard Guy Briggs wrote:
>>>> > I also tried to extend Cong Wang's idea to attempt to proactively respond to a
>>>> > NETLINK_URELEASE on the audit_sock and reset it, but ran into a locking error
>>>> > stack dump using mutex_lock(&audit_cmd_mutex) in the notifier callback.
>>>> > Eliminating the lock since the sock is dead anyways eliminates the error.
>>>> >
>>>> > Is it safe? I'll resubmit if this looks remotely sane. Meanwhile I'll try to
>>>> > get the test case to compile.
>>>>
>>>> It doesn't look safe, because 'audit_sock', 'audit_nlk_portid' and 'audit_pid'
>>>> are updated as a whole and race between audit_receive_msg() and
>>>> NETLINK_URELEASE.
>>>
>>> This is what I expected and why I originally added the mutex lock in the
>>> callback... The dumps I got were bare with no wrapper identifying the
>>> process context or specific error, so I'm at a bit of a loss how to
>>> solve this (without thinking more about it) other than instinctively
>>> removing the mutex.
>>
>> Netlink notifier can safely be converted to blocking one, I will send
>> a patch.
>>
>> But I seriously doubt you really need NETLINK_URELEASE here,
>> it adds nothing but overhead, b/c the netlink notifier is called on
>> every netlink socket in the system, but for net exit path, that is
>> relatively a slow path.
>>
>> Also, kauditd_send_skb() needs audit_cmd_mutex too.
>
> Please let me know what you think about the attached patch?

Applied the patch locally and have not seen the bug since then (~24 hours of testing).
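
For readers following the thread, the race described in the quoted text can be illustrated with a minimal userspace sketch in C. This is not the kernel patch referred to above, which is not reproduced here. It assumes a pthread mutex standing in for audit_cmd_mutex and a hypothetical struct audit_conn whose fields stand in for audit_pid, audit_nlk_portid and audit_sock; the helper names conn_set, conn_release and conn_snapshot are invented for illustration only.

```c
/* Userspace analogy only: a pthread mutex stands in for audit_cmd_mutex. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three related kernel variables. */
struct audit_conn {
    pthread_mutex_t lock;  /* plays the role of audit_cmd_mutex */
    int pid;               /* cf. audit_pid; 0 means "no daemon" */
    uint32_t portid;       /* cf. audit_nlk_portid */
    int sock_id;           /* cf. audit_sock (an id here, not a pointer) */
};

static struct audit_conn conn = { .lock = PTHREAD_MUTEX_INITIALIZER };

/* Register a daemon: all three fields change under one lock hold. */
static void conn_set(int pid, uint32_t portid, int sock_id)
{
    assert(pid != 0); /* 0 is reserved for the "unset" state */
    pthread_mutex_lock(&conn.lock);
    conn.pid = pid;
    conn.portid = portid;
    conn.sock_id = sock_id;
    pthread_mutex_unlock(&conn.lock);
}

/*
 * Release callback: reset the state only if the released port is the
 * registered one. Returns true if the state was reset.
 */
static bool conn_release(uint32_t portid)
{
    bool reset = false;

    pthread_mutex_lock(&conn.lock);
    if (conn.pid != 0 && conn.portid == portid) {
        conn.pid = 0;
        conn.portid = 0;
        conn.sock_id = 0;
        reset = true;
    }
    pthread_mutex_unlock(&conn.lock);
    return reset;
}

/* Read a consistent snapshot: never a mix of old and new values. */
static void conn_snapshot(int *pid, uint32_t *portid, int *sock_id)
{
    pthread_mutex_lock(&conn.lock);
    *pid = conn.pid;
    *portid = conn.portid;
    *sock_id = conn.sock_id;
    pthread_mutex_unlock(&conn.lock);
}
```

Dropping the lock in conn_release() would let a concurrent conn_set() interleave with the reset and leave the three fields mismatched, which is the concern raised about removing the mutex from the notifier callback. In the kernel a mutex may sleep, so it cannot be taken from an atomic notifier callback; that is presumably why converting the netlink notifier to a blocking one is proposed in the thread.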