Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755282Ab3ILVu6 (ORCPT ); Thu, 12 Sep 2013 17:50:58 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:49041 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754781Ab3ILVu4 (ORCPT ); Thu, 12 Sep 2013 17:50:56 -0400 Message-ID: <52323734.4070908@canonical.com> Date: Thu, 12 Sep 2013 23:50:44 +0200 From: Maarten Lankhorst User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130804 Thunderbird/17.0.8 MIME-Version: 1.0 To: Thomas Hellstrom CC: Daniel Vetter , Peter Zijlstra , Dave Airlie , intel-gfx , dri-devel , Linux Kernel Mailing List , Ingo Molnar , Thomas Gleixner Subject: Re: [BUG] completely bonkers use of set_need_resched + VM_FAULT_NOPAGE References: <20130912150645.GZ31370@twins.programming.kicks-ass.net> <5231E18D.7070306@canonical.com> <5231EF5A.7010901@vmware.com> In-Reply-To: <5231EF5A.7010901@vmware.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3535 Lines: 68 Op 12-09-13 18:44, Thomas Hellstrom schreef: > On 09/12/2013 05:45 PM, Maarten Lankhorst wrote: >> Op 12-09-13 17:36, Daniel Vetter schreef: >>> On Thu, Sep 12, 2013 at 5:06 PM, Peter Zijlstra wrote: >>>> So I'm poking around the preemption code and stumbled upon: >>>> >>>> drivers/gpu/drm/i915/i915_gem.c: set_need_resched(); >>>> drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); >>>> drivers/gpu/drm/ttm/ttm_bo_vm.c: set_need_resched(); >>>> drivers/gpu/drm/udl/udl_gem.c: set_need_resched(); >>>> >>>> All these sites basically do: >>>> >>>> while (!trylock()) >>>> yield(); >>>> >>>> which is a horrible and broken locking pattern. >>>> >>>> Firstly its deadlock prone, suppose the faulting process is a FIFOn+1 >>>> task that preempted the lock holder at FIFOn. >>>> >>>> Secondly the implementation is worse than usual by abusing >>>> VM_FAULT_NOPAGE, which is supposed to install a PTE so that the fault >>>> doesn't retry, but you're using it as a get out of fault path. And >>>> you're using set_need_resched() which is not something a driver should >>>> _ever_ touch. >>>> >>>> Now I'm going to take away set_need_resched() -- and while you can >>>> 'reimplement' it using set_thread_flag() you're not going to do that >>>> because it will be broken due to changes to the preempt code. >>>> >>>> So please as to fix ASAP and don't allow anybody to trick you into >>>> merging silly things like that again ;-) >>> The set_need_resched in i915_gem.c:i915_gem_fault can actually be >>> removed. It was there to give the error handler a chance to sneak in >>> and reset the hw/sw tracking when the gpu is dead. That hack goes back >>> to the days when the locking around our error handler was somewhere >>> between nonexistent and totally broken, nowadays we keep things from >>> live-locking by a bit of magic in i915_mutex_lock_interruptible. I'll >>> whip up a patch to rip this out. I'll also check that our testsuite >>> properly exercises this path (needs a bit of work on a quick look for >>> better coverage). >>> >>> The one in ttm is just bonghits to shut up lockdep: ttm can recurse >>> into it's own pagefault handler and then deadlock, the trylock just >>> keeps lockdep quiet. We've had that bug arise in drm/i915 due to some >>> fun userspace did and now have testcases for them. The right solution >>> to fix this is to use copy_to|from_user_atomic in ttm everywhere it >>> holds locks and have slowpaths which drops locks, copies stuff into a >>> temp allocation and then continues. At least that's how we've fixed >>> all those inversions in i915-gem. I'm not volunteering to fix this ;-) >> Ah the case where a mmap'd address is passed to the execbuf ioctl? :P >> >> Fine I'll look into it a bit, hopefully before tuesday. Else it might take a bit longer since I'll be on my way to plumbers.. > > I think a possible fix would be if fault() were allowed to return an error and drop the mmap_sem() before returning. > > Otherwise we need to track down all copy_to_user / copy_from_user which happen with bo::reserve held. CONFIG_PROVE_LOCKING=y and hard grab that reserve lock within the fault handler, done.. lockdep will spit it out for you :p ~Maarten -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/