Date: Fri, 11 Feb 2022 11:06:13 +0800
From: Changbin Du
To: Jisheng Zhang
Cc: Changbin Du, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] riscv: fix oops caused by irq on/off tracer
Message-ID: <20220211030613.s75irqxhflc25t7a@mail.google.com>
References: <20220129004226.32868-1-changbin.du@gmail.com>
 <20220207123850.l4r5qjswaegwisbx@mail.google.com>
 <20220208003502.62gi5xhyg6bk2t2h@mail.google.com>
 <20220210133758.yzebffln6j76zme6@mail.google.com>
X-Mailing-List:
 linux-kernel@vger.kernel.org

I reconsidered the problem and found that my previous analysis was flawed, so
let me explain it again.

The fault happens in the code generated for CALLER_ADDR1
(a.k.a. __builtin_return_address(1)):

   0xffffffff8011510e <+80>:    ld      a1,-16(s0)
   0xffffffff80115112 <+84>:    ld      s2,-8(a1)  # <-- page fault here, a1=0x0000000000000100

This is because the assembly entry code doesn't set up a valid frame pointer,
and the fp (a.k.a. s0) register is used for other purposes:

resume_kernel:
        REG_L s0, TASK_TI_PREEMPT_COUNT(tp)
        bnez s0, restore_all
        REG_L s0, TASK_TI_FLAGS(tp)
        andi s0, s0, _TIF_NEED_RESCHED
        beqz s0, restore_all
        call preempt_schedule_irq
        j restore_all

So there are two solutions:
  1) Invoke trace_hardirqs_on/off from a C function, so the compiler takes
     care of the frame pointer. This is what I did.
  2) Always set up a valid frame pointer in the assembly entry code. I think
     this is what Jisheng suggested?

I prefer #1, since we don't need to set up a frame pointer if the irqoff
tracer is not enabled.

On Thu, Feb 10, 2022 at 11:37:06PM +0800, Jisheng Zhang wrote:
> On Thu, Feb 10, 2022 at 11:27:21PM +0800, Jisheng Zhang wrote:
> > On Thu, Feb 10, 2022 at 09:37:58PM +0800, Changbin Du wrote:
> > > On Thu, Feb 10, 2022 at 01:32:59AM +0800, Jisheng Zhang wrote:
> > > [snip]
> > > > Hi Changbin,
> > > >
> > > > I read the code and find that current riscv frame records during
> > > > exception isn't as completed as other architectures. riscv only
> > > > records frames from the ret_from_exception(). If we add completed
> > > What do you mean for 'record'?
> > >
> >
> > stack frame record.
> >
> > > > frame records as other arch do, then the issue you saw can also
> > > > be fixed at the same time.
> > > >
> > > I don't think so. The problem is __builtin_return_address(1) trigger page fault
> > > here.
> >
> > There's misunderstanding here. I interpret this bug as incomplete
> > stackframes.
> >
> > This is current riscv stackframe during exception:
> >
> >     high
> >         ----------------
> > top     |              |  <- ret_from_exception
> >         ----------------
> >         |              |  <- trace_hardirqs_on
> >         -----------------
> >     low
>
> sorry, the "top" is wrongly placed.
>
>     high
>         ----------------
>         |              |  <- ret_from_exception
>         ----------------
>         |              |  <- trace_hardirqs_on
>         -----------------
> top
>
>     low
>
> >
> >
> > As you said, the CALLER_ADDR1 a.k.a __builtin_return_address(1) needs
> > at least two parent call frames.
> >
> > If we complete the stackframes during exception as other arch does:
> >
> >     high
> >         ----------------
> > top     |              |  <- the synthetic stackframe from the interrupted point
> >         ----------------
> >             .....
> >         ----------------
> >         |              |  <- ret_from_exception
> >         ----------------
> >         |              |  <- trace_hardirqs_on
> >         -----------------
> >     low
>
> ditto
>
> >
> >
> > Then we meet the "at least two parent call frames" requirement. IOW, my
> > solution solve the problem from the entry.S side. One of the advantages
> > would be we let interrupted point show up in dump_stack() as other arch
> > do. What I'm not sure is whether it's safe to do so now since rc3 is
> > released.
> >

-- 
Cheers,
Changbin Du
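
For illustration only, the faulting pair of loads can be modeled in plain C.
This is a sketch under assumptions, not kernel code: the fake stack, its
addresses, and the names fake_stack, addr_ok, load_word, store_word,
return_address_1 and simulate are all made up for this example. It assumes
the frame layout visible in the disassembly above (saved fp at fp-16, return
address at fp-8).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a tiny "stack" living at made-up addresses. */
#define STACK_BASE  0x1000UL
#define STACK_WORDS 16
#define STACK_END   (STACK_BASE + STACK_WORDS * 8UL)

uint64_t fake_stack[STACK_WORDS];

/* Return 1 if addr is an aligned word inside the fake stack. */
int addr_ok(uint64_t addr)
{
    return addr >= STACK_BASE && addr < STACK_END && (addr & 7) == 0;
}

/* Model of "ld": returns 0 on success, -1 where real hardware would fault. */
int load_word(uint64_t addr, uint64_t *out)
{
    if (!addr_ok(addr))
        return -1;
    *out = fake_stack[(addr - STACK_BASE) / 8];
    return 0;
}

/* Store a word into the fake stack; the address must be valid. */
void store_word(uint64_t addr, uint64_t val)
{
    assert(addr_ok(addr));
    fake_stack[(addr - STACK_BASE) / 8] = val;
}

/*
 * Model of the two loads emitted for __builtin_return_address(1):
 *   ld a1,-16(s0)   the fp value saved by the current function
 *   ld s2,-8(a1)    the return address stored below that fp
 */
int return_address_1(uint64_t s0, uint64_t *ra)
{
    uint64_t a1;

    if (load_word(s0 - 16, &a1))
        return -1;
    return load_word(a1 - 8, ra);
}

/* Build a callee frame whose saved-fp slot holds saved_fp, then walk it. */
int simulate(uint64_t saved_fp, uint64_t *ra)
{
    uint64_t callee_fp = STACK_BASE + 0x40;

    store_word(callee_fp - 16, saved_fp);
    return return_address_1(callee_fp, ra);
}
```

With a caller that left a real frame pointer in s0, simulate() succeeds and
yields the stored return address. With saved_fp = 0x100, i.e. a scratch value
left in s0 by the entry code, the second load goes out of range and
simulate() returns -1, which corresponds to the page fault at a1=0x100.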