Date: Wed, 16 Mar 2016 16:22:17 -0700
Subject: RFC on fixing mutex spinning on owner
From: Joel Fernandes
To: linux-rt-users@vger.kernel.org, Linux Kernel Mailing List, kernelnewbies
Cc: Steven Rostedt, Ingo Molnar, Greg Kroah-Hartman, Peter Zijlstra, Thomas Gleixner
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On a fairly recent kernel with Android userspace, I am seeing the i915 driver stuck in a spin loop waiting for the owner of a mutex to release it (mutex_spin_on_owner). I believe this happens because the owner of the mutex is running on another CPU, and the expectation is that the owner will release the mutex or go to sleep soon. So when we fail to acquire the mutex we avoid sleeping and instead keep spinning and retrying, much like a spinlock, with preemption disabled throughout the spinning.

My question is: what if the owner cannot or does not want to sleep, and keeps running for a while with the mutex held? (Let's also assume that all other tasks on the mutex owner's CPU are sleeping, so the owner is not preempted.) In this case, does it make sense to time out the spinning after a while? Preemption is disabled for the whole duration of the spin, so spinning this long seems like a very bad thing. Should the code that holds the mutex while running (the owner) be fixed so that it does not hold the mutex for so long? Or would a patch introducing a timeout of a certain threshold on the spinning be welcomed?
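
To make the timeout idea concrete, here is a rough userspace sketch of a bounded spin. It is not kernel code and not a proposed patch: `struct toy_lock`, `toy_trylock()`, `toy_unlock()`, `toy_spin_lock_timeout()` and the nanosecond budget are all hypothetical names made up for illustration, and the sketch leaves out what the real mutex_spin_on_owner does (checking whether the owner is still running on a CPU). It only shows the shape of "spin until a deadline, then give up so the caller can take the sleeping path".

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime() */

#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical toy lock: 0 = free, 1 = held. Not the kernel's struct mutex. */
struct toy_lock {
    atomic_int held;
};

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Try to take the lock once; returns true on success. */
static bool toy_trylock(struct toy_lock *lock)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&lock->held, &expected, 1);
}

static void toy_unlock(struct toy_lock *lock)
{
    atomic_store(&lock->held, 0);
}

/*
 * Spin on the lock for at most budget_ns nanoseconds.
 * Returns true if the lock was acquired, false if the budget ran out,
 * in which case the caller would fall back to blocking (sleeping).
 * At least one acquisition attempt is always made, even with budget_ns == 0.
 */
static bool toy_spin_lock_timeout(struct toy_lock *lock, uint64_t budget_ns)
{
    uint64_t deadline;

    assert(lock != NULL);
    deadline = now_ns() + budget_ns;

    do {
        if (toy_trylock(lock))
            return true;
    } while (now_ns() < deadline);

    return false;
}
```

A caller would try `toy_spin_lock_timeout(&lock, budget)` first and only go to sleep on the lock if it returns false. What a sensible budget would be is exactly the open question; the sketch does not answer it.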
To give numbers, I am seeing spinning for as long as 20 ms in the worst case, while the mutex owner holds the mutex for 22 ms. The ftrace preemptoff tracer goes off.

Thanks for any advice on what the right fix for the problem should be.

Best,
Joel