Subject: Re: net: BUG still has locks held in unix_stream_splice_read
From: Hannes Frederic Sowa
To: Cong Wang, Al Viro
Cc: Dmitry Vyukov, David Miller, Eric Dumazet, netdev, LKML, syzkaller, Colin Cross, Mandeep Singh Baines
Date: Thu, 17 Nov 2016 23:27:17 +0100
References: <20161010024607.GV19539@ZenIV.linux.org.uk> <20161010031450.GW19539@ZenIV.linux.org.uk>

On 17.11.2016 22:44, Cong Wang wrote:
> On Sun, Oct 9, 2016 at 8:14 PM, Al Viro wrote:
>> E.g what will happen if some code does a read on AF_UNIX socket with
>> some local mutex held? AFAICS, there are exactly two callers of
>> freezable_schedule_timeout() - this one and one in XFS; the latter is
>> in a kernel thread where we do have good warranties about the locking
>> environment, but here it's in the bleeding ->recvmsg/->splice_read and
>> for those assumption that caller doesn't hold any locks is pretty
>> strong, especially since it's not documented anywhere.
>>
>> What's going on there?
>
> Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
> converts schedule_timeout() to its freezable version, it was probably correct
> at that time, but later, commit 2b514574f7e88c8498027ee366
> ("net: af_unix: implement splice for stream af_unix sockets") breaks its
> requirement for a freezable sleep:
>
> commit 0f9548ca10916dec166eaf74c816bded7d8e611d
>
>     lockdep: check that no locks held at freeze time
>
>     We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
>     deadlock if the lock is later acquired in the suspend or hibernate path
>     (e.g. by dpm). Holding a lock can also cause a deadlock in the case of
>     cgroup_freezer if a lock is held inside a frozen cgroup that is later
>     acquired by a process outside that group.
>
> So probably we just need to revert commit 2b15af6f95 now.
>
> I am going to send a revert for at least -net and -stable, since Dmitry
> saw this warning again.

I am not an expert on freezing but this looks around right from the
freezer code. Awesome, thanks a lot for spotting this one!
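
For readers following the thread, the rule from the quoted commit message can be sketched as a small userspace model. This is not kernel code: struct task_model and every model_* function below are hypothetical names made up for illustration, and the sketch assumes (as the subject line suggests) that the splice path reaches the freezable sleep in the read path while still holding a lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task state a lock checker tracks. */
struct task_model {
    int held_locks;       /* number of locks currently held */
    int freeze_warnings;  /* freeze attempts made while holding a lock */
};

static void model_lock(struct task_model *t)
{
    t->held_locks++;
}

static void model_unlock(struct task_model *t)
{
    assert(t->held_locks > 0);  /* unbalanced unlock is a bug in the model */
    t->held_locks--;
}

/*
 * Models the check described in the quoted commit message: freezing is
 * only treated as safe when no locks are held. Returns true if the task
 * may freeze; otherwise records a warning and returns false.
 */
static bool model_try_to_freeze(struct task_model *t)
{
    if (t->held_locks > 0) {
        t->freeze_warnings++;
        return false;
    }
    return true;
}

/* A read path whose blocking sleep is freezable. */
static bool model_read_with_freezable_sleep(struct task_model *t)
{
    return model_try_to_freeze(t);
}

/* A splice-like caller: takes a lock, then calls into the read path. */
static bool model_splice_read(struct task_model *t)
{
    bool ok;

    model_lock(t);
    ok = model_read_with_freezable_sleep(t);
    model_unlock(t);
    return ok;
}
```

In this model a plain read with no locks held may freeze, while the splice-like caller trips the check once per call. That is the shape of the argument for the revert: the freezable sleep was fine until a caller started holding a lock across it.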