Date: Wed, 13 May 2015 16:26:06 +0200 (CEST)
From: Jiri Kosina
To: 王龙
Cc: rostedt, paulmck, pmladek, dzickus, johannes, koct9i, tglx, mingo, hpa, x86, atomlin, akpm, "sasha.levin", linux-kernel, peifeiyue, "long.wanglong", "morgan.wang"
Subject: Re: [RFC] how to perform a safe NMI stack trace on all CPUs on x86?

On Wed, 13 May 2015, 王龙 wrote:

> Hi all,
>
> In kernels before 3.19, when trigger_all_cpu_backtrace() is called on x86,
> it triggers an NMI on each CPU and calls show_regs(). But this can lead
> to a hard lockup if the NMI comes in during another printk().
>
> The commit a9edc88093287183ac934be44f295f183b2c62dd (x86/nmi: Perform a safe
> NMI stack trace on all CPUs) fixes this problem in mainline. When the NMI
> triggers, it switches the printk routine for that CPU to an NMI-safe printk
> function that records the output in a per_cpu seq_buf descriptor. After all
> NMIs have finished recording their data, the seq_bufs are printed in a safe
> context. But how do we fix this problem in older kernel versions (e.g. 3.10 stable)?
> 3.10 stable has neither the "switch printk routine" nor the "seq_buf" infrastructure.
>
> Could anyone give me some ideas?

Either you backport the seq_buf-based approach to the older kernel, or, if
you are working on a 3.4 kernel or earlier (basically any kernel preceding
the printk() revamp that happened in 7ff9554bb57 and after), you can use a
slightly simpler approach.

It's the approach we used initially, when we first found the issue, and it
is proven to work as well (but it's not applicable after Kay added all the
complexity to printk()).

You can see it in our SLE11 kernel tree, available at

	http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP4&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9

for example.

It's up to you to judge which is the least painful way :)

--
Jiri Kosina
SUSE Labs
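
For anyone weighing the first option, the idea from the quoted message (swap the printk routine for the CPU while it is in the NMI, record into a per-CPU buffer, print later from a safe context) can be sketched in plain userspace C roughly as below. This is not the code from a9edc880 and not kernel code: every name in it (trace_buf, sketch_printk, fake_nmi_handler, flush_nmi_bufs, the CPU count and the buffer size) is made up for illustration, and a fixed char array stands in for seq_buf.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS  4      /* assumed CPU count, illustration only */
#define BUF_SIZE 256    /* assumed per-CPU buffer size */

/* Hypothetical stand-in for a per-CPU seq_buf. */
struct trace_buf {
	char   data[BUF_SIZE];
	size_t len;
};

static struct trace_buf nmi_bufs[NR_CPUS];

typedef int (*printk_fn)(int cpu, const char *fmt, va_list args);

/* Per-CPU output routine; NULL means "use the default path". */
static printk_fn printk_func[NR_CPUS];

/* Default path: stands in for the regular, lock-taking printk(). */
static int default_vprintk(int cpu, const char *fmt, va_list args)
{
	(void)cpu;
	return vprintf(fmt, args);
}

/* NMI path: append to this CPU's own buffer; no locks, no console. */
static int nmi_vprintk(int cpu, const char *fmt, va_list args)
{
	struct trace_buf *b = &nmi_bufs[cpu];
	size_t room = BUF_SIZE - b->len;	/* always >= 1 (room for NUL) */
	int n = vsnprintf(b->data + b->len, room, fmt, args);

	if (n < 0)
		return 0;
	if ((size_t)n >= room)			/* output was truncated */
		n = (int)room - 1;
	b->len += (size_t)n;
	return n;
}

/* Dispatch through whatever routine is currently set for this CPU. */
static int sketch_printk(int cpu, const char *fmt, ...)
{
	printk_fn fn = printk_func[cpu] ? printk_func[cpu] : default_vprintk;
	va_list args;
	int n;

	va_start(args, fmt);
	n = fn(cpu, fmt, args);
	va_end(args);
	return n;
}

/* What the NMI handler does: switch routine, dump, switch back. */
static void fake_nmi_handler(int cpu)
{
	printk_fn saved = printk_func[cpu];

	printk_func[cpu] = nmi_vprintk;
	sketch_printk(cpu, "CPU %d: backtrace line\n", cpu);
	printk_func[cpu] = saved;
}

/* Called from a safe (non-NMI) context once all CPUs have recorded. */
static void flush_nmi_bufs(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		fputs(nmi_bufs[cpu].data, stdout);
		nmi_bufs[cpu].len = 0;
		nmi_bufs[cpu].data[0] = '\0';
	}
}
```

The caller would invoke fake_nmi_handler() once per CPU and then flush_nmi_bufs() afterwards. The point of the split is that nothing on the NMI side takes a lock that an interrupted printk() on the same CPU might already hold; a real backport also has to deal with waiting for all CPUs to finish and with real per-CPU storage, both of which this sketch leaves out.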