Subject: Re: linux-next: zillions of lockdep whinges in include/net/sock.h:1408
To: David Miller
References: <43037.1461229555@turing-police.cc.vt.edu> <1461245496.7627.17.camel@edumazet-glaptop3.roam.corp.google.com> <5718DA71.7050902@stressinduktion.org> <20160424.143833.2292980084570149367.davem@davemloft.net>
Cc: eric.dumazet@gmail.com, Valdis.Kletnieks@vt.edu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
From: Hannes Frederic Sowa
Message-ID: <571D14F8.6070306@stressinduktion.org>
Date: Sun, 24 Apr 2016 20:48:24 +0200
In-Reply-To: <20160424.143833.2292980084570149367.davem@davemloft.net>

On 24.04.2016 20:38, David Miller wrote:
> From: Hannes Frederic Sowa
> Date: Thu, 21 Apr 2016 15:49:37 +0200
>
>> On 21.04.2016 15:31, Eric Dumazet wrote:
>>> On Thu, 2016-04-21 at 05:05 -0400, Valdis.Kletnieks@vt.edu wrote:
>>>> On Thu, 21 Apr 2016 09:42:12 +0200, Hannes Frederic Sowa said:
>>>>> Hi,
>>>>>
>>>>> On Thu, Apr 21, 2016, at 02:30, Valdis Kletnieks wrote:
>>>>>> linux-next 20160420 is whining at an incredible rate - in 20 minutes of
>>>>>> uptime, I piled up some 41,000 hits from all over the place (cleaned up
>>>>>> to skip the CPU and PID so the list isn't quite so long):
>>>>>
>>>>> Thanks for the report. Can you give me some more details:
>>>>>
>>>>> Is this an nfs socket? Do you by accident know if this socket went
>>>>> through xs_reclassify_socket at any point? We do hold the appropriate
>>>>> locks at that point but I fear that the lockdep reinitialization
>>>>> confused lockdep.
>>>>
>>>> It wasn't an NFS socket, as NFS wasn't even active at the time. I'm reasonably
>>>> sure that multiple sockets were in play, given that tcp_v6_rcv and
>>>> udpv6_queue_rcv_skb were both implicated. I strongly suspect that pretty much
>>>> any IPv6 traffic could do it - the frequency dropped off quite a bit when I
>>>> closed firefox, which is usually a heavy network hitter on my laptop.
>>>
>>>
>>> Looks like the following patch is needed, can you try it please ?
>>>
>>> Thanks !
>>>
>>> diff --git a/include/net/sock.h b/include/net/sock.h
>>> index d997ec13a643..db8301c76d50 100644
>>> --- a/include/net/sock.h
>>> +++ b/include/net/sock.h
>>> @@ -1350,7 +1350,8 @@ static inline bool lockdep_sock_is_held(const struct sock *csk)
>>>  {
>>>          struct sock *sk = (struct sock *)csk;
>>>
>>> -        return lockdep_is_held(&sk->sk_lock) ||
>>> +        return !debug_locks ||
>>> +                lockdep_is_held(&sk->sk_lock) ||
>>>                  lockdep_is_held(&sk->sk_lock.slock);
>>>  }
>>>  #endif
>>
>> I would prefer to add debug_locks at the WARN_ON level, like
>> WARN_ON(debug_locks && !lockdep_sock_is_held(sk)), but I am not sure if
>> this fixes the initial splat.
>
> Can we finish this conversation out and come up with a final patch
> for this soon?

Eric's patch is worth applying anyway, but I am not sure if it solves
the (fundamental) problem. I couldn't reproduce it with the exact
next-tag provided in the initial mail. All other reports also only
happened with linux-next and not net-next.

I hope Valdis provides his config soon, and I will continue my analysis
on this then.

Thanks,
Hannes
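
For readers following the thread, the two placements of the debug_locks
check discussed above can be compared in a small user-space sketch. This
is not kernel code and not either author's final patch: struct mock_sock,
its two fields, and the warns_*() function names are hypothetical
stand-ins, and the debug_locks variable here only imitates the kernel
flag that is cleared once lockdep has turned itself off.

```c
#include <stdbool.h>

/* Stand-in for the kernel's debug_locks flag (assumption: nonzero while
 * lockdep is still active, cleared after lockdep disables itself). */
static bool debug_locks = true;

/* Hypothetical mock: each field imitates the result of one
 * lockdep_is_held() call from the quoted patch. */
struct mock_sock {
    bool owner_held; /* imitates lockdep_is_held(&sk->sk_lock) */
    bool slock_held; /* imitates lockdep_is_held(&sk->sk_lock.slock) */
};

/* Helper as it looked before the quoted patch. */
static bool sock_is_held_plain(const struct mock_sock *sk)
{
    return sk->owner_held || sk->slock_held;
}

/* Helper with the check folded in, as in the quoted diff: once lockdep
 * is off, the helper reports "held" to every caller. */
static bool sock_is_held_patched(const struct mock_sock *sk)
{
    return !debug_locks || sk->owner_held || sk->slock_held;
}

/* Placement 1: the call site stays WARN_ON(!helper(sk)). */
static bool warns_helper_variant(const struct mock_sock *sk)
{
    return !sock_is_held_patched(sk);
}

/* Placement 2: the call site becomes WARN_ON(debug_locks && !helper(sk)). */
static bool warns_callsite_variant(const struct mock_sock *sk)
{
    return debug_locks && !sock_is_held_plain(sk);
}
```

In this model both placements fire for an unheld socket while debug_locks
is set and stay silent once it is cleared, so they agree at that one
warning site. The difference is scope: the helper variant changes the
answer for every caller of the helper, while the call-site variant only
changes the site that was edited.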