Received: by 2002:a05:6a10:1a4d:0:0:0:0 with SMTP id nk13csp2071839pxb; Wed, 9 Feb 2022 10:15:38 -0800 (PST) X-Google-Smtp-Source: ABdhPJzT7AmmyQ27eZC+GL05pQWxVhy21o6o0mNe06Y6z4lgzLxkNzNf9KqlHcYKxOZbX6WE5E4o X-Received: by 2002:a63:5460:: with SMTP id e32mr2850453pgm.330.1644430538389; Wed, 09 Feb 2022 10:15:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1644430538; cv=none; d=google.com; s=arc-20160816; b=UenOfMh+Rjyr5IK/R3CNGtiM8Du5IliuZg8VGxYFOIlG2HE8pLfVYuKgIf3d/+PSoq vurfJNFWn/hin/5i/Og9Yp/L0/X6EattnXOkZVMxPvroSWa27wDJhkQviw5nTN3U0X99 g6JAkWtaSCUf7Jv/TJ5++ulYFkARDLRBbk9dvyJqMcaF+7Megix6TVJtptv57y6MrikY UYrOgO1vQ52npaAMX1HTTQrRXDZRrfG8P4bX+cRRmyjpgBasCUX6dZyCyX4HkS8a5dRz VvYcqFwI3cdc56hUJ25kxDQ5scZhCnTsilQuQm8W2yXW+piUMSoAbEX7VURpVtiHvETf 4+nA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=exKBn0mZkZM1PCp9qzmAhE7kb7Yo7PBdQWExqGwCblo=; b=TDeG/LSidYY4TsppYQr1RA1VF4c5JrUO03D9LM+hgoVmvQUgBtIyEk7gkzJhUMzLFZ MCq4NUEC8M9i18iC13wTecbYjGcsVn+I5prMZKlaX9OFDJtxYllqJ/Me22cgrM5lYQJb U5boUMCdn3d/tk5cPvgNf5OsQU1QMs/GmycSEpOsa24ULUj1I4QEd8Thpq6gaTyTXy7s gFl0FgqvrRClbbjt9N+caxNvOekobHwFS11clP7HyI76LoxT59+bp2ZVJ/0s+jCY37l+ ue18Fb+W8oFyl2j+1dsxaJiL6vxXeQj8LOdge1rCAj7yr3CgZsePldWghDWRx9VKzPIG wj/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=bLoJxufY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 15si16821804pgt.260.2022.02.09.10.15.25; Wed, 09 Feb 2022 10:15:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=bLoJxufY; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238391AbiBIRkw (ORCPT + 99 others); Wed, 9 Feb 2022 12:40:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35258 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238284AbiBIRkt (ORCPT ); Wed, 9 Feb 2022 12:40:49 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 56D04C0613C9 for ; Wed, 9 Feb 2022 09:40:52 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E8CA3B82322 for ; Wed, 9 Feb 2022 17:40:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CF287C340E7; Wed, 9 Feb 2022 17:40:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1644428449; bh=AlopD8g/IxO/TUJsCJ2/uUlPLB4X9pN9HSSfhXOXIRI=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=bLoJxufYJICvZfk36laW9vZUmtzElmgSphRskprZx4SqZTlfShVvcpL7IIUBBO2EU mL03JFh7fRXaEyEHon3W0bd2sGklgjmpNCmaltRoctiUYSEqL8Oc77DdcD7of9u20a yJxgmhv/pGBtACLDPa7dX8R3hwSKH5I42phpvCuiYmzX6zVXpqBOsPA2siFwhK9B6s PBs03GZuSs1ek+L7eDIkcW0jooUANG1eD29PiQBpTLrY/SPjRough0oNINJIBdjPKc 87HzNRut3b1UfSaW4GaA8Sc7AlZplas396NUVtOOsu0twHaljRFUCM0Ydv1DNftfIH c+WPnWPNnxOnA== Date: Thu, 10 Feb 2022 01:32:59 +0800 From: Jisheng Zhang To: Changbin Du Cc: Paul Walmsley , Palmer Dabbelt , Albert Ou , linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] riscv: fix oops caused by irq on/off tracer Message-ID: References: <20220129004226.32868-1-changbin.du@gmail.com> <20220207123850.l4r5qjswaegwisbx@mail.google.com> <20220208003502.62gi5xhyg6bk2t2h@mail.google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20220208003502.62gi5xhyg6bk2t2h@mail.google.com> X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Feb 08, 2022 at 08:35:02AM +0800, Changbin Du wrote: > On Mon, Feb 07, 2022 at 11:31:41PM +0800, Jisheng Zhang wrote: > > On Mon, Feb 07, 2022 at 08:38:50PM +0800, ChangbinCONFIG_IRQSOFF_TRACER Du wrote: > > > On Mon, Feb 07, 2022 at 01:25:03AM +0800, Jisheng Zhang wrote: > > > > On Sat, Jan 29, 2022 at 08:42:26AM +0800, Changbin Du wrote: > > > > > The trace_hardirqs_on/off requires at least two parent call frames. > > > > > If not, the code generated by CALLER_ADDR1 (aka. ftrace_return_address(1)) > > > > > could trigger memory access fault. > > > > > > > > > > [ 0.039615][ T0] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000f8 > > > > > [ 0.041925][ T0] Oops [#1] > > > > > [ 0.042063][ T0] Modules linked in: > > > > > [ 0.042864][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-rc1-00233-g9a20c48d1ed2 #29 > > > > > [ 0.043568][ T0] Hardware name: riscv-virtio,qemu (DT) > > > > > [ 0.044343][ T0] epc : trace_hardirqs_on+0x56/0xe2 > > > > > [ 0.044601][ T0] ra : restore_all+0x12/0x6e > > > > > [ 0.044721][ T0] epc : ffffffff80126a5c ra : ffffffff80003b94 sp : ffffffff81403db0 > > > > > [ 0.044801][ T0] gp : ffffffff8163acd8 tp : ffffffff81414880 t0 : 0000000000000020 > > > > > [ 0.044882][ T0] t1 : 0098968000000000 t2 : 0000000000000000 s0 : ffffffff81403de0 > > > > > [ 0.044967][ T0] s1 : 0000000000000000 a0 : 0000000000000001 a1 : 0000000000000100 > > > > > [ 0.045046][ T0] a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000 > > > > > [ 0.045124][ T0] a5 : 0000000000000000 a6 : 0000000000000000 a7 : 0000000054494d45 > > > > > [ 0.045210][ T0] s2 : ffffffff80003b94 s3 : ffffffff81a8f1b0 s4 : ffffffff80e27b50 > > > > > [ 0.045289][ T0] s5 : ffffffff81414880 s6 : ffffffff8160fa00 s7 : 00000000800120e8 > > > > > [ 0.045389][ T0] s8 : 0000000080013100 s9 : 000000000000007f s10: 0000000000000000 > > > > > [ 0.045474][ T0] s11: 0000000000000000 t3 : 7fffffffffffffff t4 : 0000000000000000 > > > > > [ 0.045548][ T0] t5 : 0000000000000000 t6 : ffffffff814aa368 > > > > > [ 0.045620][ T0] status: 0000000200000100 badaddr: 00000000000000f8 cause: 000000000000000d > > > > > [ 0.046402][ T0] [] restore_all+0x12/0x6e > > > > > > > > > > > > > Hi Changbin, > > > > > > > > Could you please provide the reproduce steps? It looks a bit > > > > interesting. > > > > > > > Just enable CONFIG_IRQSOFF_TRACER and rebuild kernel with llvm. Then boot the > > > new kernel. > > > > Thanks for the information. I tried IRQSOFF_TRACER with gcc+binutils, > > can't reproduce the issue. I forget to try clang+llvm. From another side > > The fact that gcc+bintuils can't reproduce it means this is a clang+llvm > > speicial case, no? > The behaviour of GCC is a bit different, please refer to another disccusion: > https://lore.kernel.org/lkml/C2470F2D-9E45-49D7-A03B-E6A7BB4B9738@jrtc27.com/T/ > > But I suppose it still has similar issue. Make sure FRAME_POINTER is enabled > also. > Hi Changbin, I read the code and find that current riscv frame records during exception isn't as completed as other architectures. riscv only records frames from the ret_from_exception(). If we add completed frame records as other arch do, then the issue you saw can also be fixed at the same time. However, I'm not sure what's the best choice now. A simple demo to this incomplete frames: add dump_stack() in any ISR, then in riscv: [ 2.961294] Call Trace: [ 2.961460] [] dump_backtrace+0x1c/0x24 [ 2.961823] [] show_stack+0x2c/0x38 [ 2.962153] [] dump_stack_lvl+0x40/0x58 [ 2.962483] [] dump_stack+0x14/0x1c [ 2.962792] [] serial8250_interrupt+0x20/0x82 [ 2.963139] [] __handle_irq_event_percpu+0x4c/0x106 [ 2.963526] [] handle_irq_event+0x38/0x80 [ 2.963856] [] handle_fasteoi_irq+0x96/0x188 [ 2.964198] [] generic_handle_domain_irq+0x28/0x3a [ 2.964567] [] plic_handle_irq+0x88/0xec [ 2.964896] [] generic_handle_domain_irq+0x28/0x3a [ 2.965264] [] riscv_intc_irq+0x34/0x5c [ 2.965584] [] generic_handle_arch_irq+0x4a/0x74 [ 2.966068] [] ret_from_exception+0x0/0xc in x86: [ 1.191274] Call Trace: [ 1.192223] [ 1.192758] dump_stack_lvl+0x45/0x59 [ 1.192982] serial8250_interrupt+0x24/0x88 [ 1.193105] __handle_irq_event_percpu+0x66/0x1b0 [ 1.193239] handle_irq_event+0x34/0x70 [ 1.193345] handle_edge_irq+0x85/0x1e0 [ 1.193455] __common_interrupt+0x38/0x90 [ 1.193573] common_interrupt+0x73/0x90 [ 1.193809] [ 1.193889] [ 1.193956] asm_common_interrupt+0x1b/0x40 [ 1.194318] RIP: 0010:_raw_spin_unlock_irqrestore+0x1b/0x40 [ 1.194566] Code: 24 be 01 02 00 00 e9 54 20 bf ff 0f 1f 40 00 0f 1f 44 00 00 f7 c6 00f [ 1.195137] RSP: 0000:ffff888000243b68 EFLAGS: 00000246 [ 1.195314] RAX: 0000000000000000 RBX: ffffffff82025840 RCX: 0000000000000000 [ 1.195482] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001 [ 1.195645] RBP: 0000000000000202 R08: ffffffffffffffff R09: 0000000000000000 [ 1.195808] R10: 00000000000000eb R11: 0000000000000000 R12: 0000000000000000 [ 1.195972] R13: 0000000000000040 R14: 0000000000000000 R15: ffff888000c39000 [ 1.196245] ? _raw_spin_unlock_irqrestore+0x15/0x40 [ 1.196373] serial8250_do_startup+0x42d/0x600 [ 1.196502] uart_port_startup+0x11b/0x270 [ 1.196619] uart_port_activate+0x3f/0x60 [ 1.196729] tty_port_open+0x7e/0xd0 [ 1.196835] ? _raw_spin_unlock+0x12/0x30 [ 1.196942] uart_open+0x1a/0x30 [ 1.197036] tty_open+0x153/0x7c0 [ 1.197144] chrdev_open+0xbf/0x230 [ 1.197253] ? cdev_device_add+0x90/0x90 [ 1.197359] do_dentry_open+0x13c/0x360 [ 1.197470] path_openat+0xb0c/0xe00 [ 1.197577] ? update_load_avg+0x5f/0x640 [ 1.197691] ? finish_task_switch.isra.0+0xac/0x240 [ 1.197821] do_filp_open+0xb2/0x150 [ 1.197935] ? preempt_schedule_thunk+0x16/0x18 [ 1.198049] ? preempt_schedule_common+0x90/0xd0 [ 1.198167] ? preempt_schedule_thunk+0x16/0x18 [ 1.198291] file_open_name+0xf1/0x1b0 [ 1.198397] filp_open+0x2c/0x50 [ 1.198495] console_on_rootfs+0x19/0x52 [ 1.198648] kernel_init_freeable+0x19a/0x1c7 [ 1.198765] ? rest_init+0xc0/0xc0 [ 1.198867] kernel_init+0x16/0x110 [ 1.198965] ret_from_fork+0x1f/0x30 [ 1.199131]