Message-ID: <55548503.2050406@huawei.com>
Date: Thu, 14 May 2015 19:20:35 +0800
From: "long.wanglong"
To: Jiri Kosina
CC: 王龙, rostedt, paulmck, pmladek, dzickus, johannes, koct9i, tglx, mingo, hpa, x86, atomlin, akpm, "sasha.levin", linux-kernel, peifeiyue, "morgan.wang"
Subject: Re: [RFC] how to perform a safe NMI stack trace on all CPUs on x86?

On 2015/5/13 22:26, Jiri Kosina wrote:
> On Wed, 13 May 2015, ???? wrote:
>
>> Hi all,
>>
>> In kernels before 3.19, when trigger_all_cpu_backtrace() is called on x86,
>> it will trigger an NMI on each CPU and call show_regs(). But this can lead
>> to a hard lockup if the NMI comes in on another printk().
>>
>> The commit a9edc88093287183ac934be44f295f183b2c62dd (x86/nmi: Perform a safe
>> NMI stack trace on all CPUs) fixes this problem in mainline. When the NMI
>> triggers, it switches the printk routine for that CPU to call an NMI-safe
>> printk function that records the printk in a per_cpu seq_buf descriptor.
>> After all NMIs have finished recording their data, the seq_bufs are printed
>> in a safe context. But how do we fix this problem in older kernel versions
>> (e.g., 3.10 stable)?
>> The 3.10 stable has no "switch printk routine" and "seq_buf" infrastructure.
>>
>> Could anyone give me some ideas?
>
> Either you backport the seq_buf-based approach to the older kernel, or, if
> you are working on a 3.4 kernel or earlier (basically any kernel preceding
> the printk() revamp that happened in 7ff9554bb57 and after), you can use a
> slightly simpler approach.
>
> It's an approach we used initially when finding out the issue for the first
> time, and it is proven to work as well (but it's not applicable after Kay
> added all the complexity to printk()).
>
> You can see it in our SLE11 kernel tree, available on
>
> http://kernel.suse.com/cgit/kernel/commit/?h=SLE11-SP4&id=8d62ae68ff61d77ae3c4899f05dbd9c9742b14c9
>
> for example.
>
> It's up to you to judge which is the least painful way :)
>

Hi Jiri Kosina,

For 3.10 stable, the only way to solve this problem is to backport the
seq_buf-based approach. I will backport the necessary patches to 3.10 stable.
You are welcome to review my backport patches.

Best Regards
Wang Long