Date: Fri, 11 Feb 2022 11:21:35 +0800
From: Changbin Du
To: Jessica Clarke
Cc: Jisheng Zhang, Changbin Du, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] riscv: fix oops caused by irq on/off tracer
Message-ID: <20220211032135.i7jwvtb75pultwsp@mail.google.com>
References: <20220129004226.32868-1-changbin.du@gmail.com> <20220207123850.l4r5qjswaegwisbx@mail.google.com> <20220208003502.62gi5xhyg6bk2t2h@mail.google.com> <20220210133758.yzebffln6j76zme6@mail.google.com> <0D51738E-C4C0-4D30-BCDF-55786E0CC201@jrtc27.com>
In-Reply-To: <0D51738E-C4C0-4D30-BCDF-55786E0CC201@jrtc27.com>

On Thu, Feb 10, 2022 at 03:59:35PM +0000, Jessica Clarke wrote:
> On 10 Feb 2022, at 15:27, Jisheng Zhang wrote:
> > On Thu, Feb 10, 2022 at 09:37:58PM +0800, Changbin Du wrote:
> >> On Thu, Feb 10, 2022 at 01:32:59AM +0800, Jisheng Zhang wrote:
> >> [snip]
> >>> Hi Changbin,
> >>>
> >>> I read the code and find that current riscv frame records during
> >>> exception isn't as completed as other architectures. riscv only
> >>> records frames from the ret_from_exception(). If we add completed
> >> What do you mean for 'record'?
> >>
> >
> > stack frame record.
> >
> >>> frame records as other arch do, then the issue you saw can also
> >>> be fixed at the same time.
> >>>
> >> I don't think so. The problem is __builtin_return_address(1) trigger page fault
> >> here.
> >
> > There's misunderstanding here. I interpret this bug as incomplete
> > stackframes.
> >
> > This is current riscv stackframe during exception:
> >
> > high
> >       ----------------
> > top   |              |  <- ret_from_exception
> >       ----------------
> >       |              |  <- trace_hardirqs_on
> >       -----------------
> > low
> >
> > As you said, the CALLER_ADDR1 a.k.a __builtin_return_address(1) needs
> > at least two parent call frames.
>
> No it doesn’t, you’re off by one, it only needs a valid current frame.
>
> Jess
>
yes, it is two frames not two 'parent' frames. My fault.

> > If we complete the stackframes during exception as other arch does:
> >
> > high
> >       ----------------
> > top   |              |  <- the synthetic stackframe from the interrupted point
> >       ----------------
> >             .....
> >       ----------------
> >       |              |  <- ret_from_exception
> >       ----------------
> >       |              |  <- trace_hardirqs_on
> >       -----------------
> > low
> >
> >
> > Then we meet the "at least two parent call frames" requirement. IOW, my
> > solution solve the problem from the entry.S side.
> > One of the advantages
> > would be we let interrupted point show up in dump_stack() as other arch
> > do. What I'm not sure is whether it's safe to do so now since rc3 is
> > released.
> >
> >>
> >>> However, I'm not sure what's the best choice now.
> >>>
> >>> A simple demo to this incomplete frames:
> >>> add dump_stack() in any ISR, then
> >>>
> >>> in riscv:
> >>> [ 2.961294] Call Trace:
> >>> [ 2.961460] [] dump_backtrace+0x1c/0x24
> >>> [ 2.961823] [] show_stack+0x2c/0x38
> >>> [ 2.962153] [] dump_stack_lvl+0x40/0x58
> >>> [ 2.962483] [] dump_stack+0x14/0x1c
> >>> [ 2.962792] [] serial8250_interrupt+0x20/0x82
> >>> [ 2.963139] [] __handle_irq_event_percpu+0x4c/0x106
> >>> [ 2.963526] [] handle_irq_event+0x38/0x80
> >>> [ 2.963856] [] handle_fasteoi_irq+0x96/0x188
> >>> [ 2.964198] [] generic_handle_domain_irq+0x28/0x3a
> >>> [ 2.964567] [] plic_handle_irq+0x88/0xec
> >>> [ 2.964896] [] generic_handle_domain_irq+0x28/0x3a
> >>> [ 2.965264] [] riscv_intc_irq+0x34/0x5c
> >>> [ 2.965584] [] generic_handle_arch_irq+0x4a/0x74
> >>> [ 2.966068] [] ret_from_exception+0x0/0xc
> >>>
> >>> in x86:
> >>> [ 1.191274] Call Trace:
> >>> [ 1.192223]
> >>> [ 1.192758] dump_stack_lvl+0x45/0x59
> >>> [ 1.192982] serial8250_interrupt+0x24/0x88
> >>> [ 1.193105] __handle_irq_event_percpu+0x66/0x1b0
> >>> [ 1.193239] handle_irq_event+0x34/0x70
> >>> [ 1.193345] handle_edge_irq+0x85/0x1e0
> >>> [ 1.193455] __common_interrupt+0x38/0x90
> >>> [ 1.193573] common_interrupt+0x73/0x90
> >>> [ 1.193809]
> >>> [ 1.193889]
> >>> [ 1.193956] asm_common_interrupt+0x1b/0x40
> >>> [ 1.194318] RIP: 0010:_raw_spin_unlock_irqrestore+0x1b/0x40
> >>> [ 1.194566] Code: 24 be 01 02 00 00 e9 54 20 bf ff 0f 1f 40 00 0f 1f
> >>> 44 00 00 f7 c6 00f
> >>> [ 1.195137] RSP: 0000:ffff888000243b68 EFLAGS: 00000246
> >>> [ 1.195314] RAX: 0000000000000000 RBX: ffffffff82025840 RCX:
> >>> 0000000000000000
> >>> [ 1.195482] RDX: 0000000000000001 RSI: 0000000000000000 RDI:
> >>> 0000000000000001
> >>> [ 1.195645] RBP: 0000000000000202 R08: ffffffffffffffff R09:
> >>> 0000000000000000
> >>> [ 1.195808] R10: 00000000000000eb R11: 0000000000000000 R12:
> >>> 0000000000000000
> >>> [ 1.195972] R13: 0000000000000040 R14: 0000000000000000 R15:
> >>> ffff888000c39000
> >>> [ 1.196245] ? _raw_spin_unlock_irqrestore+0x15/0x40
> >>> [ 1.196373] serial8250_do_startup+0x42d/0x600
> >>> [ 1.196502] uart_port_startup+0x11b/0x270
> >>> [ 1.196619] uart_port_activate+0x3f/0x60
> >>> [ 1.196729] tty_port_open+0x7e/0xd0
> >>> [ 1.196835] ? _raw_spin_unlock+0x12/0x30
> >>> [ 1.196942] uart_open+0x1a/0x30
> >>> [ 1.197036] tty_open+0x153/0x7c0
> >>> [ 1.197144] chrdev_open+0xbf/0x230
> >>> [ 1.197253] ? cdev_device_add+0x90/0x90
> >>> [ 1.197359] do_dentry_open+0x13c/0x360
> >>> [ 1.197470] path_openat+0xb0c/0xe00
> >>> [ 1.197577] ? update_load_avg+0x5f/0x640
> >>> [ 1.197691] ? finish_task_switch.isra.0+0xac/0x240
> >>> [ 1.197821] do_filp_open+0xb2/0x150
> >>> [ 1.197935] ? preempt_schedule_thunk+0x16/0x18
> >>> [ 1.198049] ? preempt_schedule_common+0x90/0xd0
> >>> [ 1.198167] ? preempt_schedule_thunk+0x16/0x18
> >>> [ 1.198291] file_open_name+0xf1/0x1b0
> >>> [ 1.198397] filp_open+0x2c/0x50
> >>> [ 1.198495] console_on_rootfs+0x19/0x52
> >>> [ 1.198648] kernel_init_freeable+0x19a/0x1c7
> >>> [ 1.198765] ? rest_init+0xc0/0xc0
> >>> [ 1.198867] kernel_init+0x16/0x110
> >>> [ 1.198965] ret_from_fork+0x1f/0x30
> >>> [ 1.199131]
> >>>
> >> As I said before, this issue is not related to stackdump.
> >>
> >> Besides, you can see more calltrace on x86 that because x86 iterate all stacks
> >> (kernel, irq or exception) when dumping stacktrace. While RISCV only show
> >> calltrace of current stack.
> >>
> >
> > I'm not sure whether there's misunderstanding. See above.
> >
> > Thanks
>

-- 
Cheers,
Changbin Du
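
P.S. To make the "two frames" point concrete, here is a small user-space model of the frame-record walk, not kernel code. Everything in it is an assumption made for illustration: the struct frame_record layout, the BAD_FP marker, the return_address() helper, and the addresses 0x1000 and 0x2000. It also adds a validity check that the real __builtin_return_address() does not have; where the model returns 0, the real builtin would simply dereference the bad frame pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified frame record. Real layouts are ABI-specific. */
struct frame_record {
	int       prev_fp; /* index of the caller's record, or BAD_FP */
	uintptr_t ra;      /* return address saved by this frame */
};

#define BAD_FP (-1)

/*
 * Model of __builtin_return_address(level) starting at record 'fp'.
 * Returns 0 where an unchecked walk would follow an invalid frame pointer.
 */
uintptr_t return_address(const struct frame_record *stack, int nframes,
			 int fp, unsigned int level)
{
	unsigned int i;

	assert(stack != NULL);
	for (i = 0; i < level; i++) {
		if (fp < 0 || fp >= nframes)
			return 0;
		fp = stack[fp].prev_fp; /* step to the caller's record */
	}
	if (fp < 0 || fp >= nframes)
		return 0;
	return stack[fp].ra;
}

/* Case 1: the tracer is entered from assembly that set up no frame. */
const struct frame_record bare_entry[] = {
	{ BAD_FP, 0x1000 }, /* tracer frame; ra points back into the asm */
};

/* Case 2: a synthetic record for the interrupted point is added. */
const struct frame_record with_synthetic[] = {
	{ 1,      0x1000 }, /* tracer frame, linked to the record below */
	{ BAD_FP, 0x2000 }, /* synthetic record; ra = interrupted pc */
};

/* Print what each case yields for level 0 and level 1. */
void demo(void)
{
	printf("bare:      level0=%#lx level1=%#lx\n",
	       (unsigned long)return_address(bare_entry, 1, 0, 0),
	       (unsigned long)return_address(bare_entry, 1, 0, 1));
	printf("synthetic: level0=%#lx level1=%#lx\n",
	       (unsigned long)return_address(with_synthetic, 2, 0, 0),
	       (unsigned long)return_address(with_synthetic, 2, 0, 1));
}
```

Calling demo() from a main() shows the difference: level 0 works in both cases, while level 1 only resolves once the current frame links to a second valid record.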