From: Wang ShaoBo
Subject: Question: livepatch failed for new fork() task stack unreliable
Date: Fri, 29 May 2020 18:10:59 +0800
Message-ID: <20200529101059.39885-1-bobo.shaobowang@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

stack_trace_save_tsk_reliable() reports an unreliable stack when we try to
insmod a hot patch for a module modification, so loading the patch fails
intermittently. We found that the 'unreliable' stack belongs to a task that
has just been forked.

A freshly forked task has to get through the following steps before the
problem stops appearing:

  _do_fork
    -=> copy_process
    ...
    -=> ret_from_fork
      -=> UNWIND_HINT_REGS

The call trace when stack_trace_save_tsk_reliable() returns failure is as
follows:

  [ 896.214710] livepatch: klp_check_stack: monitor-process:41642 has an unreliable stack
  [ 896.214735] livepatch: Call Trace:    # print trace entries by myself
  [ 896.214760] Call Trace:               # call show_stack()
  [ 896.214763]  ? __switch_to_asm+0x70/0x70

This only concerns user-mode tasks. There are two cases for a task that has
just been created:

1) The task has not actually been scheduled to execute yet. At this point
   UNWIND_HINT_EMPTY in ret_from_fork() has not reset unwind_hint, so its
   sp_reg and end fields keep their default values, and unwind_next_frame()
   ends up reporting an error when called from arch_stack_walk_reliable().

2) The task has been scheduled but has not yet passed UNWIND_HINT_REGS. At
   this point arch_stack_walk_reliable() terminates its backtracing loop
   because pt_regs is unknown, and returns -EINVAL because it is a user task.

So for a user task there is a window in which the ORC unwinder cannot capture
the task's stack state: the task has already been created, but
ret_from_fork() has not yet completed its work.

We tried adding a bit field, orc_info_prepared, to task_struct to detect when
the relevant steps in ret_from_fork have finished, and found that scenarios
1) and 2) can both be caught this way. It is an informal solution, meant only
to test our conjecture. I am eager to pursue a proper fix and welcome any
ideas.

Another similar question: https://lkml.org/lkml/2020/3/12/590

Following is the draft modification:

1. Add a bit field orc_info_prepared to task_struct.

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4418f5cb8324..3ff1368b8877 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -791,6 +791,9 @@ struct task_struct {
 	/* Stalled due to lack of memory */
 	unsigned			in_memstall:1;
 #endif
+#ifdef CONFIG_UNWINDER_ORC
+	unsigned			orc_info_prepared:1;
+#endif
 
 	unsigned long			atomic_flags; /* Flags requiring atomic access. */
 

2. Once UNWIND_HINT_REGS has completed, pt_regs can be found by the ORC
   unwinder, so set orc_info_prepared = 1 in orc_info_prepared_fini().

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3063aa9090f9..637bdb091090 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -339,6 +339,7 @@ SYM_CODE_START(ret_from_fork)
 
 2:
 	UNWIND_HINT_REGS
+	call	orc_info_prepared_fini
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */

3. Simply check orc_info_prepared if the task is a user-mode process.

diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 6ad43fc44556..bf1d2887f00b 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -77,6 +77,10 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 			return -EINVAL;
 	}
 
+	if (!(task->flags & (PF_KTHREAD | PF_IDLE)) &&
+	    !task_orc_info_prepared(task))
+		return 0;
+
 	/* Check for stack corruption */
 	if (unwind_error(&state))
 		return -EINVAL;
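
The draft calls orc_info_prepared_fini() and task_orc_info_prepared() without
showing their bodies. The following is only a user-space model of how the
three steps might fit together, not kernel code: struct mock_task, the
fork_stage enum, the MOCK_PF_* values and every mock_* function are
hypothetical names made up for illustration, and the bodies of the two helpers
are a guess. It models user tasks only, treating both cases 1) and 2) as
"walk returns -EINVAL" until the bit is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Mock flag values for this model only; the kernel defines the real
 * PF_IDLE and PF_KTHREAD in include/linux/sched.h. */
#define MOCK_PF_IDLE    0x00000002u
#define MOCK_PF_KTHREAD 0x00200000u

/* Hypothetical stages of a freshly forked user task. */
enum fork_stage {
	STAGE_NEVER_SCHEDULED,  /* case 1): ret_from_fork() not entered yet */
	STAGE_BEFORE_HINT_REGS, /* case 2): running, UNWIND_HINT_REGS not reached */
	STAGE_AFTER_HINT_REGS,  /* pt_regs can now be found by the unwinder */
};

struct mock_task {
	unsigned int flags;
	enum fork_stage stage;
	unsigned orc_info_prepared:1; /* the bit added in step 1 */
};

/* Assumed behaviour of orc_info_prepared_fini(): it runs right after
 * UNWIND_HINT_REGS (step 2) and marks the current task. */
static inline void mock_orc_info_prepared_fini(struct mock_task *t)
{
	t->stage = STAGE_AFTER_HINT_REGS;
	t->orc_info_prepared = 1;
}

/* Assumed behaviour of task_orc_info_prepared(): just read the bit. */
static inline bool mock_task_orc_info_prepared(const struct mock_task *t)
{
	return t->orc_info_prepared;
}

/* Outcome of the reliable walk for a user task without the draft,
 * as described in cases 1) and 2): unreliable until the hint is passed. */
static inline int mock_walk_unpatched(const struct mock_task *t)
{
	return t->stage == STAGE_AFTER_HINT_REGS ? 0 : -EINVAL;
}

/* Same walk with the step 3 condition evaluated first. */
static inline int mock_walk_patched(const struct mock_task *t)
{
	if (!(t->flags & (MOCK_PF_KTHREAD | MOCK_PF_IDLE)) &&
	    !mock_task_orc_info_prepared(t))
		return 0;
	return mock_walk_unpatched(t);
}

/* Walk a user task through the window described above. */
static inline void mock_demo(void)
{
	struct mock_task t = { .flags = 0, .stage = STAGE_NEVER_SCHEDULED };

	assert(mock_walk_unpatched(&t) == -EINVAL); /* case 1) */
	assert(mock_walk_patched(&t) == 0);

	t.stage = STAGE_BEFORE_HINT_REGS;
	assert(mock_walk_unpatched(&t) == -EINVAL); /* case 2) */
	assert(mock_walk_patched(&t) == 0);

	mock_orc_info_prepared_fini(&t);            /* window closed */
	assert(mock_walk_unpatched(&t) == 0);
	assert(mock_walk_patched(&t) == 0);
}
```

Calling mock_demo() from a main() exercises both cases. In this model the
step 3 check reports success for any user task whose bit is still clear, which
is exactly the behaviour the draft uses to test the conjecture; whether that
is safe for a real fix is the open question of this mail.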