Date: Sat, 27 Jun 2015 15:52:56 +0200
Subject: Re: drm/mgag200: doesn't work in panic context
From: Daniel Vetter
To: "Luck, Tony"
Cc: "Wang, Rui Y", Dave Airlie, "Clark, Rob", "Roper, Matthew D", "Chen, Gong", Borislav Petkov, Linux Kernel Mailing List
In-Reply-To: <3908561D78D1C84285E8C5FCA982C28F32AA062A@ORSMSX114.amr.corp.intel.com>
References: <1435305314-14337-1-git-send-email-rui.y.wang@intel.com> <3908561D78D1C84285E8C5FCA982C28F32AA062A@ORSMSX114.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 26, 2015 at 8:30 PM, Luck, Tony wrote:
>>> I'm here to report two panics which hang forever (the machine cannot reboot). It is because mgag200 doesn't work in panic context. It sleeps and allocates memory non-atomically.
>>
>> This is the same for all drm drivers, the drm atomic handling with
>> fbcon/fbdev is totally broken. It would be serious work to fix this
>> properly.
>
> It's a serious problem when a server crashes ... even worse when it hangs while doing so
> because we have to rely on some other agent to notice the hung server and go poke it
> with a stick.
>
> If it is too hard to fix all of the drivers, is it possible to attack this in the allocator?

Hm, what do you mean by fixing this in the allocator? I've made a
rough sketch of the problem space in
http://www.x.org/wiki/DRMJanitors/ under "Make panic handling work".
Problem is that the folks who know what to do (drm hackers) have zero
incentive to fix it (since if you blow up a drm driver any kind of
fbcon panic handling is hopeless anyway). The other problem is that
this is a serious effort with tons of little things all over to
consider. My gut estimate is that it'll probably take something on the
order of a man-year to fix this for real.

David Herrmann has supplied parts of the required puzzle to actually
be able to somewhat reliably show panics on drm modesetting drivers,
but that didn't contain any of the work to make fbdev not totally suck
at panic handling first. And I guess for general distros and servers
that's needed - developers simply disable all of fbdev to be able to
debug kms hangs.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch