Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S942783AbcJaNTg (ORCPT ); Mon, 31 Oct 2016 09:19:36 -0400
Received: from mail-lf0-f44.google.com ([209.85.215.44]:36165 "EHLO
        mail-lf0-f44.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S942617AbcJaNTe (ORCPT );
        Mon, 31 Oct 2016 09:19:34 -0400
Date: Mon, 31 Oct 2016 16:19:30 +0300
From: Mike Krinkin
To: Hugh Dickins
Cc: peterz@infradead.org, jason.low2@hpe.com, mingo@kernel.org,
        tglx@linutronix.de, chris@chris-wilson.co.uk,
        linux-kernel@vger.kernel.org
Subject: Re: Commit "locking/drm: Kill mutex trickery" causes hangs
Message-ID: <20161031131928.GA9699@gmail.com>
References: <20161030220604.GA6834@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3556
Lines: 80

On Sun, Oct 30, 2016 at 05:09:41PM -0700, Hugh Dickins wrote:
> On Mon, 31 Oct 2016, Mike Krinkin wrote:
> >
> > i faced system hangs with recent linux-next versions, bisect points at the
> > commit 3ab7c086d5ec72585ef0 ("locking/drm: Kill mutex trickery"), bisect log
> > attached. System just hangs after few minutes when i compile kernel with -j4
> > and watch some video simultaneously.
> [...]
> > also lspci -vvv output:
> [...]
> > Kernel driver in use: i915
> > Kernel modules: i915
>
> Yes, that's hit me too, on mmotm on i915. i915_gem_shrinker_lock()
> is broken: but copy the pattern from msm_gem_shrinker_lock() and it's
> okay - patch below. Well, okay-ish: I'm reluctant to sign off on that
> as more than a quick fix for i915 linux-next users, since the unlock
> variable and those _gem_shrinker_lock() wrappers should just be deleted
> (if the mutex trickery is indeed to be killed).
>
> And I'm still left with a "sleeping function called from invalid context"
> warning, which seems easier to live with: I've not looked to see whether
> that's a consequence of the mutex trickery killage or something else.
>
> [   12.887922] BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:956
> [   12.887925] in_atomic(): 1, irqs_disabled(): 0, pid: 787, name: X
> [   12.887927] 1 lock held by X/787:
> [   12.887928]  #0:
> [   12.887929]  (
> [   12.887930] &dev->struct_mutex
> [   12.887931] ){+.+.+.}
> [   12.887932] , at:
> [   12.887937] [] i915_mutex_lock_interruptible+0x23/0x26
> [   12.887939] Preemption disabled at:
> [   12.887943] [] i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
> [   12.887947] CPU: 2 PID: 787 Comm: X Not tainted 4.9.0-rc2-mm1 #5
> [   12.887948] Hardware name: LENOVO 4174EH1/4174EH1, BIOS 8CET51WW (1.31 ) 11/29/2011
> [   12.887950] Call Trace:
> [   12.887955]  dump_stack+0x67/0x90
> [   12.887958]  ? i915_gem_execbuffer_relocate_entry+0x5fb/0x70f
> [   12.887961]  ___might_sleep+0x223/0x23a
> [   12.887963]  __might_sleep+0x6d/0x81
> [   12.887966]  __pm_runtime_resume+0x35/0x7a
> [   12.887970]  intel_runtime_pm_get+0x20/0x7f
> [   12.887973]  aliasing_gtt_bind_vma+0x4d/0xb1
> [   12.887975]  i915_vma_bind+0x67/0xbd
> [   12.887977]  i915_gem_execbuffer_relocate_entry+0xc6/0x70f
> [   12.887981]  ? _raw_spin_unlock_irq+0x27/0x45
> [   12.887984]  i915_gem_execbuffer_relocate_vma+0x128/0x1dd
> [   12.887987]  ? nommu_map_sg+0x9e/0xca
> [   12.887990]  ? __i915_vma_do_pin+0x3da/0x421
> [   12.887994]  ? i915_gem_execbuffer_reserve_vma.isra.34+0xbc/0x189
> [   12.887996]  ? i915_gem_execbuffer_reserve.isra.35+0x32f/0x3da
> [   12.887999]  i915_gem_do_execbuffer.isra.36+0x64c/0x10a9
> [   12.888002]  i915_gem_execbuffer2+0x15d/0x203
> [   12.888005]  drm_ioctl+0x25a/0x38b
> [   12.888007]  ? i915_gem_execbuffer+0x2d3/0x2d3
> [   12.888011]  vfs_ioctl+0x1c/0x33
> [   12.888014]  do_vfs_ioctl+0x5c5/0x601
> [   12.888016]  ? __fget+0x17e/0x18f
> [   12.888019]  ? expand_files+0x23e/0x23e
> [   12.888021]  SyS_ioctl+0x38/0x60
> [   12.888023]  entry_SYSCALL_64_fastpath+0x18/0xad
>
> --- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
> +++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
> @@ -229,8 +229,9 @@ unsigned long i915_gem_shrink_all(struct
>  static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
>  {
>  	if (!mutex_trylock(&dev->struct_mutex))
> -		*unlock = false;
> +		return false;
>
> +	*unlock = true;
>  	return true;
>  }

Works for me, no warnings noted yet. Thank you.
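
For anyone reading along, here is a rough userspace sketch of why the one-line change matters. It is only an illustration under stated assumptions, not the i915 code: struct fake_dev and the two shrinker_lock_*() helpers are hypothetical names, and pthread_mutex_trylock() stands in for the kernel's mutex_trylock() (note that it returns 0 on success, the opposite sense of the kernel call). The pre-patch shape returns true even when the trylock failed, so the caller carries on without holding the lock; the patched shape reports the failure and sets *unlock only once the lock is really held.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct drm_device (assumption). */
struct fake_dev {
	pthread_mutex_t struct_mutex;
};

/*
 * Pre-patch shape: a failed trylock only clears *unlock, and the
 * function still returns true, so the caller believes it may proceed
 * even though it does not hold the mutex.
 */
static bool shrinker_lock_buggy(struct fake_dev *dev, bool *unlock)
{
	/* pthread_mutex_trylock() returns 0 on success, nonzero on failure. */
	if (pthread_mutex_trylock(&dev->struct_mutex) != 0)
		*unlock = false;

	return true;
}

/*
 * Patched shape: a failed trylock is reported to the caller, and
 * *unlock is set only when the mutex was actually acquired here.
 */
static bool shrinker_lock_fixed(struct fake_dev *dev, bool *unlock)
{
	if (pthread_mutex_trylock(&dev->struct_mutex) != 0)
		return false;

	*unlock = true;
	return true;
}
```

A caller would initialise the mutex with pthread_mutex_init(), call one of the helpers, skip the work if it returns false, and call pthread_mutex_unlock() afterwards only if *unlock was set.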