From: Wang Long
Subject: [RFC PATCH 00/17][request for stable 3.10 inclusion] x86/nmi: Print all cpu stacks from NMI safely
Date: Thu, 14 May 2015 11:34:47 +0000
Message-ID: <1431603304-162571-1-git-send-email-long.wanglong@huawei.com>
X-Mailer: git-send-email 1.8.3.4
X-Mailing-List: linux-kernel@vger.kernel.org

This is my backport patch series for 3.10 stable. It fixes the following
problem:

  "When trigger_all_cpu_backtrace() is called on x86, it will trigger an
   NMI on each CPU and call show_regs(). But this can lead to a hard lock
   up if the NMI comes in on another printk()."

The solution is described in commit
a9edc88093287183ac934be44f295f183b2c62dd: when the NMI triggers, it
switches the printk routine for that CPU to an NMI-safe printk function
that records the printk in a per_cpu seq_buf descriptor. After all NMIs
have finished recording their data, the trace_seqs are printed in a safe
context.

The solution relies on two pieces of infrastructure, the "switch printk
routine" mechanism and "seq_buf", and 3.10 stable has neither of them.

Patches 1-13 backport the "seq_buf" infrastructure. For patches 1, 2 and
6, only the "seq_buf"-related code is backported.
Patches 14-15 backport the "switch printk routine" mechanism.
Patches 16-17 are the patches that print all CPU stacks from NMI safely.

As discussed in https://lkml.org/lkml/2015/5/13/497, this is the only way
to solve the problem in 3.10 stable, although it requires backporting a
fair amount of code.

Any thoughts?

Sasha Levin (1):
  x86/nmi: Fix use of unallocated cpumask_var_t

Steven Rostedt (Red Hat) (16):
  tracing: Create seq_buf layer in trace_seq
  tracing: Convert seq_buf_path() to be like seq_path()
  tracing: Convert seq_buf fields to be like seq_file fields
  tracing: Add a seq_buf_clear() helper and clear len and readpos in init
  seq_buf: Create seq_buf_used() to find out how much was written
  tracing: Use trace_seq_used() and seq_buf_used() instead of len
  seq_buf: Add seq_buf_can_fit() helper function
  tracing: Have seq_buf use full buffer
  tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
  seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
  seq_buf: Move the seq_buf code to lib/
  seq_buf: Fix seq_buf_vprintf() truncation
  seq_buf: Fix seq_buf_bprintf() truncation
  printk: Add per_cpu printk func to allow printk to be diverted
  printk/percpu: Define printk_func when printk is not defined
  x86/nmi: Perform a safe NMI stack trace on all CPUs

 arch/x86/kernel/apic/hw_nmi.c |  86 +++++++++-
 include/linux/percpu.h        |   4 +
 include/linux/printk.h        |   2 +
 include/linux/seq_buf.h       | 136 ++++++++++++++++
 kernel/printk.c               |  41 +++--
 lib/Makefile                  |   2 +-
 lib/seq_buf.c                 | 359 ++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 617 insertions(+), 13 deletions(-)
 create mode 100644 include/linux/seq_buf.h
 create mode 100644 lib/seq_buf.c

-- 
1.8.3.4
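
For readers who have not looked at the upstream commit, the mechanism
described in the cover letter (divert printk on a CPU while it is in NMI
context, record into a per-CPU buffer, print later from a safe context)
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not the code in the series: every name below
(sketch_seq_buf, sketch_printk, nmi_vprintk, sketch_nmi_handler,
sketch_flush_all, NR_CPUS, BUF_SIZE) is hypothetical, an array indexed by
CPU number stands in for real per_cpu variables, and vprintf() stands in
for the normal printk path.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS  4    /* hypothetical CPU count for this sketch */
#define BUF_SIZE 256  /* hypothetical per-CPU buffer size */

/* Simplified stand-in for the kernel's struct seq_buf. */
struct sketch_seq_buf {
	char   data[BUF_SIZE];
	size_t len;            /* bytes recorded so far, excluding the NUL */
};

typedef int (*printk_func_t)(int cpu, const char *fmt, va_list args);

/* Arrays indexed by CPU stand in for per_cpu variables. */
static struct sketch_seq_buf nmi_buf[NR_CPUS];
static printk_func_t printk_func[NR_CPUS];  /* NULL means "use the default" */

/* Normal path: stands in for the regular printk, which may take locks. */
static int default_vprintk(int cpu, const char *fmt, va_list args)
{
	(void)cpu;
	return vprintf(fmt, args);
}

/* NMI path: only appends to this CPU's own buffer and takes no locks. */
static int nmi_vprintk(int cpu, const char *fmt, va_list args)
{
	struct sketch_seq_buf *s = &nmi_buf[cpu];
	size_t room = BUF_SIZE - s->len;  /* always >= 1, room for the NUL */
	int n = vsnprintf(s->data + s->len, room, fmt, args);

	if (n < 0)
		return n;
	if ((size_t)n >= room)
		n = (int)(room - 1);      /* truncated: keep only what fit */
	s->len += (size_t)n;
	return n;
}

/* printk entry point: dispatch through the per-CPU function pointer. */
static int sketch_printk(int cpu, const char *fmt, ...)
{
	printk_func_t fn = printk_func[cpu] ? printk_func[cpu] : default_vprintk;
	va_list args;
	int n;

	va_start(args, fmt);
	n = fn(cpu, fmt, args);
	va_end(args);
	return n;
}

/* What each CPU does on NMI: divert printk, dump, then restore. */
static void sketch_nmi_handler(int cpu)
{
	printk_func_t saved = printk_func[cpu];

	printk_func[cpu] = nmi_vprintk;
	sketch_printk(cpu, "NMI backtrace for cpu %d\n", cpu);
	printk_func[cpu] = saved;
}

/* Safe context: print every recorded buffer and reset it.
 * Returns the total number of bytes that were flushed. */
static size_t sketch_flush_all(void)
{
	size_t total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		struct sketch_seq_buf *s = &nmi_buf[cpu];

		fwrite(s->data, 1, s->len, stdout);
		total += s->len;
		s->len = 0;
		s->data[0] = '\0';
	}
	return total;
}
```

Calling sketch_nmi_handler() once per CPU and then sketch_flush_all()
from ordinary context mirrors the flow in the cover letter: nothing is
printed while a CPU is "in NMI", and all output appears afterwards. The
point of the design is that the NMI path touches only state owned by its
own CPU, so it cannot deadlock against a printk that the NMI interrupted.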