Date: Fri, 25 Aug 2017 10:11:14 +0900
From: Byungchul Park
To: Peter Zijlstra
Cc: mingo@kernel.org, tj@kernel.org, boqun.feng@gmail.com, david@fromorbit.com, johannes@sipsolutions.net, oleg@redhat.com, linux-kernel@vger.kernel.org, kernel-team@lge.com
Subject: Re: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation
Message-ID: <20170825011114.GA3858@X58A-UD3R>
In-Reply-To: <20170824140240.t4imrpvussebfimm@hirez.programming.kicks-ass.net>

On Thu, Aug 24, 2017 at 04:02:40PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 24, 2017 at 11:18:40AM +0900, Byungchul Park wrote:
> > On Wed, Aug 23, 2017 at 01:58:47PM +0200, Peter Zijlstra wrote:
> >
> > > Also, unconditionally switching to recursive-read here would fail to
> > > detect the actual deadlock on single-threaded workqueues, which do
> >
> > Do you mean it's true even in case having fixed lockdep properly?
> > Could you explain why if so? IMHO, I don't think so.
>
> I'm saying that if lockdep is fixed it should be:
>
> 	if (wq->saved_max_active == 1 || wq->rescuer) {
> 		lock_map_acquire(wq->lockdep_map);
> 		lock_map_acquire(lockdep_map);
> 	} else {
> 		lock_map_acquire_read(wq->lockdep_map);
> 		lock_map_acquire_read(lockdep_map);
> 	}
>
> or something like that, because for a single-threaded workqueue, the
> following _IS_ a deadlock:
>
> 	work-n:
> 		wait_for_completion(C);
>
> 	work-n+1:
> 		complete(C);
>
> And that is the only case we now fail to catch.

Thank you for the explanation.

> > > +void crossrelease_hist_start(enum xhlock_context_t c, bool force)
> > >  {
> > >  	struct task_struct *cur = current;
> > >
> > > -	if (cur->xhlocks) {
> > > -		cur->xhlock_idx_hist[c] = cur->xhlock_idx;
> > > -		cur->hist_id_save[c] = cur->hist_id;
> > > +	if (!cur->xhlocks)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * We call this at an invariant point, no current state, no history.
> > > +	 */
> >
> > This very work-around code _must_ be removed after fixing read-recursive
> > thing in lockdep. I think it would be better to add a tag(comment)
> > saying it.
> >
> > > +	if (c == XHLOCK_PROC) {
> > > +		/* verified the former, ensure the latter */
> > > +		WARN_ON_ONCE(!force && cur->lockdep_depth);
> > > +		invalidate_xhlock(&xhlock(cur->xhlock_idx));
> > >  	}
>
> No, this is not a work around, this is fundamentally so. It's not going
> away. The only thing that should go away is the .force argument.

I meant that this seems to stem from a misunderstanding of
crossrelease_hist_{start, end}(). Users of force == 1 should not exist,
or do not have to exist.

I am sure you haven't read my replies. Please read at least the following:

https://lkml.org/lkml/2017/8/24/126
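
To make the work-n / work-n+1 case quoted above concrete, here is a small
userspace sketch. It is not kernel code and it is not how lockdep detects
anything. It only models a FIFO queue with a fixed number of worker slots and
reports whether the queue gets stuck. Every name in it (struct work, OP_WAIT,
OP_COMPLETE, queue_deadlocks, MAX_WORKS, MAX_COMPLETIONS) is hypothetical and
made up for illustration, and it models only the max_active limit, not the
rescuer case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not a kernel API. */
enum op { OP_WAIT, OP_COMPLETE };

struct work {
	enum op op;
	int completion;		/* index of the completion this item touches */
};

#define MAX_WORKS	16
#define MAX_COMPLETIONS	8

/*
 * Run the items in FIFO order with at most max_active items in flight.
 * An OP_WAIT item occupies its worker slot until its completion has been
 * signalled. Returns true if some item can never finish.
 */
bool queue_deadlocks(const struct work *items, size_t n, size_t max_active)
{
	bool done[MAX_COMPLETIONS] = { false };
	bool blocked[MAX_WORKS] = { false };
	size_t next = 0, active = 0, finished = 0;
	bool progress = true;

	assert(n <= MAX_WORKS);
	for (size_t i = 0; i < n; i++)
		assert(items[i].completion >= 0 &&
		       items[i].completion < MAX_COMPLETIONS);

	while (progress) {
		progress = false;

		/* Wake blocked waiters whose completion has been signalled. */
		for (size_t i = 0; i < next; i++) {
			if (blocked[i] && done[items[i].completion]) {
				blocked[i] = false;
				active--;
				finished++;
				progress = true;
			}
		}

		/* Start queued items while a worker slot is free. */
		while (next < n && active < max_active) {
			const struct work *w = &items[next];

			if (w->op == OP_COMPLETE) {
				done[w->completion] = true;
				finished++;
			} else if (done[w->completion]) {
				finished++;	/* nothing to wait for */
			} else {
				blocked[next] = true;
				active++;	/* waiter holds its slot */
			}
			next++;
			progress = true;
		}
	}

	return finished != n;
}
```

With the items { OP_WAIT, 0 } followed by { OP_COMPLETE, 0 }, the model
reports a deadlock for max_active == 1, because the waiter holds the only
slot and the completer never starts. With max_active == 2 the same sequence
finishes, and so does the reversed order on a single slot.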