Date: Sun, 30 Oct 2016 17:09:41 -0700 (PDT)
From: Hugh Dickins
To: Mike Krinkin
cc: peterz@infradead.org, jason.low2@hpe.com, mingo@kernel.org,
    tglx@linutronix.de, chris@chris-wilson.co.uk,
    linux-kernel@vger.kernel.org
Subject: Re: Commit "locking/drm: Kill mutex trickery" causes hangs
In-Reply-To: <20161030220604.GA6834@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 31 Oct 2016, Mike Krinkin wrote:
>
> i faced system hangs with recent linux-next versions, bisect points at the
> commit 3ab7c086d5ec72585ef0 ("locking/drm: Kill mutex trickery"), bisect log
> attached. System just hangs after few minutes when i compile kernel with -j4
> and watch some video simultaneously.
[...]
> also lspci -vvv output:
[...]
>	Kernel driver in use: i915
>	Kernel modules: i915

Yes, that's hit me too, on mmotm on i915.  i915_gem_shrinker_lock()
is broken: but copy the pattern from msm_gem_shrinker_lock() and
it's okay - patch below.

Well, okay-ish: I'm reluctant to sign off on that as more than a
quick fix for i915 linux-next users, since the unlock variable and
those _gem_shrinker_lock() wrappers should just be deleted (if the
mutex trickery is indeed to be killed).

And I'm still left with a "sleeping function called from invalid context"
warning, which seems easier to live with: I've not looked to see whether
that's a consequence of the mutex trickery killage or something else.

[   12.887922] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:956
[   12.887925] in_atomic(): 1, irqs_disabled(): 0, pid: 787, name: X
[   12.887927] 1 lock held by X/787:
[   12.887928]  #0:
[   12.887929]  (
[   12.887930] &dev->struct_mutex
[   12.887931] ){+.+.+.}
[   12.887932] , at:
[   12.887937] [] i915_mutex_lock_interruptible+0x23/0x26
[   12.887939] Preemption disabled at:
[   12.887943] [] i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
[   12.887947] CPU: 2 PID: 787 Comm: X Not tainted 4.9.0-rc2-mm1 #5
[   12.887948] Hardware name: LENOVO 4174EH1/4174EH1, BIOS 8CET51WW (1.31 ) 11/29/2011
[   12.887950] Call Trace:
[   12.887955]  dump_stack+0x67/0x90
[   12.887958]  ? i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
[   12.887961]  ___might_sleep+0x223/0x23a
[   12.887963]  __might_sleep+0x6d/0x81
[   12.887966]  __pm_runtime_resume+0x35/0x7a
[   12.887970]  intel_runtime_pm_get+0x20/0x7f
[   12.887973]  aliasing_gtt_bind_vma+0x4d/0xb1
[   12.887975]  i915_vma_bind+0x67/0xbd
[   12.887977]  i915_gem_execbuffer_relocate_entry+0xc6/0x70f
[   12.887981]  ? _raw_spin_unlock_irq+0x27/0x45
[   12.887984]  i915_gem_execbuffer_relocate_vma+0x128/0x1dd
[   12.887987]  ? nommu_map_sg+0x9e/0xca
[   12.887990]  ? __i915_vma_do_pin+0x3da/0x421
[   12.887994]  ? i915_gem_execbuffer_reserve_vma.isra.34+0xbc/0x189
[   12.887996]  ? i915_gem_execbuffer_reserve.isra.35+0x32f/0x3da
[   12.887999]  i915_gem_do_execbuffer.isra.36+0x64c/0x10a9
[   12.888002]  i915_gem_execbuffer2+0x15d/0x203
[   12.888005]  drm_ioctl+0x25a/0x38b
[   12.888007]  ? i915_gem_execbuffer+0x2d3/0x2d3
[   12.888011]  vfs_ioctl+0x1c/0x33
[   12.888014]  do_vfs_ioctl+0x5c5/0x601
[   12.888016]  ? __fget+0x17e/0x18f
[   12.888019]  ? expand_files+0x23e/0x23e
[   12.888021]  SyS_ioctl+0x38/0x60
[   12.888023]  entry_SYSCALL_64_fastpath+0x18/0xad

--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -229,8 +229,9 @@ unsigned long i915_gem_shrink_all(struct
 
 static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
 {
 	if (!mutex_trylock(&dev->struct_mutex))
-		*unlock = false;
+		return false;
 
+	*unlock = true;
 	return true;
 }
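
For readers without a kernel tree at hand, the difference between the two wrapper bodies in the patch can be sketched in userspace. This is only an illustration, under these assumptions: pthread_mutex_trylock() stands in for the kernel's mutex_trylock() (note the inverted return convention: 0 on success for pthreads), and the names shrinker_lock_broken() and shrinker_lock_fixed() are hypothetical, not kernel functions.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for dev->struct_mutex (assumption: a plain pthread mutex). */
static pthread_mutex_t struct_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Mirrors the pre-patch body: when the trylock fails, the wrapper
 * still returns true, so the caller proceeds without holding the lock.
 * On success it never sets *unlock, so the caller cannot tell that it
 * now owns the mutex.
 */
static bool shrinker_lock_broken(bool *unlock)
{
	if (pthread_mutex_trylock(&struct_mutex) != 0)
		*unlock = false;

	return true;
}

/*
 * Mirrors the patched body: report failure when the mutex is busy,
 * and tell the caller to unlock only when the lock was really taken.
 */
static bool shrinker_lock_fixed(bool *unlock)
{
	if (pthread_mutex_trylock(&struct_mutex) != 0)
		return false;

	*unlock = true;
	return true;
}
```

With the mutex already held, shrinker_lock_broken() still reports success while shrinker_lock_fixed() reports failure, so a caller of the fixed version can back off instead of touching state it does not own.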