From: Rui Wang
To: daniel.vetter@ffwll.ch
Cc: bp@alien8.de, tony.luck@intel.com, airlied@redhat.com, robdclark@gmail.com,
    matthew.d.roper@intel.com, gong.chen@intel.com, linux-kernel@vger.kernel.org,
    Rui Wang
Subject: Re: drm/mgag200: doesn't work in panic context
Date: Tue, 30 Jun 2015 15:23:24 +0800
Message-Id: <1435649004-379-1-git-send-email-rui.y.wang@intel.com>

On Tuesday, June 30, 2015 2:37 PM, Daniel Vetter wrote:
> On Tue, Jun 30, 2015 at 4:53 AM, Rui Wang wrote:
> >
> > I think testing can be done by injecting a fatal machine check
> > exception via einj's debugfs interface. I can reproduce the hard hang
> > every time.
> > I think it can be a simple script or C program to do the automated testing.
> > If anyone has any patch I'll be happy to help test it out.
>
> Testing shouldn't kill the machine ;-)

Yes :) What I assumed was that after applying a future patch the machine
should be able to reboot instead of hanging itself, so the testing can repeat.

>
> The idea I had is to just exercise the drm panic code (since we'd need to
> shunt everything else), and that can be done by calling the relevant
> functions from a hardirq context. And hardirq context is simplest to get
> with an IPI to the local cpu. This way we don't depend upon the entire
> panic path to be recoverable, but only upon the drm bits being sane.

Yes, if it can be tested without rebooting then it'll be more efficient. But
einj does more than an IPI can: it injects hardware errors which trigger
exceptions in NMI context, and the exception handler usually panics on
fatal errors. The display may then be the only way to see what has
happened. I'm just hoping that a future version will work in NMI context.

Thanks
Rui
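
For reference, the einj-based reproduction mentioned in this thread could be
scripted roughly as below. This is only a sketch under assumptions, not the
script actually used: it assumes debugfs is mounted at /sys/kernel/debug, the
einj module is loaded, and the files available_error_type, error_type and
error_inject behave as described in the kernel's APEI EINJ documentation. The
choice of the "Processor Uncorrectable fatal" type (0x4) and the helper names
read_available_types() and inject() are illustrative. Whether the injection
ends in a fatal machine check depends on the platform.

```python
import os

# Assumed location of the APEI EINJ debugfs directory.
EINJ_DIR = "/sys/kernel/debug/apei/einj"

# EINJ error type value for "Processor Uncorrectable fatal" (assumed choice).
PROCESSOR_UNCORRECTABLE_FATAL = 0x4


def read_available_types(einj_dir=EINJ_DIR):
    """Return the set of error type values listed in available_error_type."""
    types = set()
    with open(os.path.join(einj_dir, "available_error_type")) as f:
        for line in f:
            fields = line.split()
            if fields:
                # First column is the hex value, the rest is a description.
                types.add(int(fields[0], 16))
    return types


def inject(error_type, einj_dir=EINJ_DIR):
    """Select error_type, then trigger the injection.

    Raises ValueError if the platform does not advertise error_type.
    """
    if error_type not in read_available_types(einj_dir):
        raise ValueError("error type 0x%08x not supported" % error_type)
    with open(os.path.join(einj_dir, "error_type"), "w") as f:
        f.write("0x%08x\n" % error_type)
    # Writing a value to error_inject fires the previously selected error.
    with open(os.path.join(einj_dir, "error_inject"), "w") as f:
        f.write("1\n")
```

Calling inject(PROCESSOR_UNCORRECTABLE_FATAL) as root on a test machine would
be expected to take it down, so the einj_dir parameter exists to let the
logic be exercised against a scratch directory first.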