Received: by 2002:ab2:6857:0:b0:1ef:ffd0:ce49 with SMTP id l23csp335017lqp; Thu, 21 Mar 2024 02:44:34 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCUMjaOB/1ACbKspGrTWKtREAcHmot5yCSxluJILHVf/Ak3GSzBFmvS7Y8BmSSx3CWbLl4Rc3i7sQLFNy7n2z0aDSdslyFSEh4nqiUcbFQ== X-Google-Smtp-Source: AGHT+IH93oEaJZyMClRXA/nfUY88kpKovCJYRoo6Irr1RSUBjOET9G85uz2Pny3C+nxO85Z0zdh5 X-Received: by 2002:a17:903:41cc:b0:1e0:378c:43e1 with SMTP id u12-20020a17090341cc00b001e0378c43e1mr3132643ple.27.1711014274179; Thu, 21 Mar 2024 02:44:34 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1711014274; cv=pass; d=google.com; s=arc-20160816; b=jXy2AmYipmi9uYQ3qieDj157y9C4hKf6NpBz0Xbhc+v0IEzDzNDRfRvPzltmJEL/2m fQu2vNgThIBShcUkhWSWfvtDfxTt7Lvnrklh9Ev6hPm1OQJ4IOVfD3494dywwCcadB8i rE/0gxHRDrMZQtUgWBWhrxm631uwoPRBNwfW7MyYmSV6yQuUbpW8coOb79pdRLHdSoOO sF13TnArmnbWgw0t2cN5RWcRcIGMNilNgavLazbKV0elZC11UBOf3jcIdOJJVOzyE/ka RLydiJdUNz5wRKEp8gRz1jcdaAsDtL73uUc3uOyCrA6/M3E3RkUTobV+HymGfq2YHjn2 ZGvw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:user-agent:date:message-id:from :cc:references:to:subject; bh=G8W851vGJgllndMVT93YKSPiOHm5O+GzZntaTnMX6F0=; fh=/4dw+fzGWhJ5PwC1WdNlHGW9bZ9yn+PzrJiLAXdgmSY=; b=TiHHZSTKLo6E5ZSvrPM1d1EtEz+oVzJf4XEho5jHNOdzpQUXCwz16lYXo45sUTxjK2 uqv17IgrZbZwtYplBDybDmzykz4Bzg9OCO+Qi7kjOc5nz7zdmBXCwH8zpC8KFU2Yuxk5 mzqAYM9P9MgvvTwPkPniz2Bix9/0ru7nFRDg8k90Vi61rjsKYCkti061321f8q48XJdY CC+0WdgMp1fsJehwm9qdbvSvUQqwP1kOOBH0ApWdS6An4bF6ImiPdpK3rKcePhON/blV avc30PROQtlNooQvGwSZyBxJDI7MlxozfNBGmM3gTPFmyv28ZFyOpR7WovesD/tHglcs Y+CQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huawei.com dmarc=pass fromdomain=huawei.com); spf=pass (google.com: domain of linux-kernel+bounces-109860-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-109860-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id m18-20020a170902f21200b001dda53c3867si14155944plc.648.2024.03.21.02.44.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Mar 2024 02:44:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-109860-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huawei.com dmarc=pass fromdomain=huawei.com); spf=pass (google.com: domain of linux-kernel+bounces-109860-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-109860-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id D07252832D4 for ; Thu, 21 Mar 2024 09:44:33 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 7494E55E75; Thu, 21 Mar 2024 09:44:20 +0000 (UTC) Received: from szxga07-in.huawei.com (szxga07-in.huawei.com [45.249.212.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 948FE53384; Thu, 21 Mar 2024 09:44:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.35 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711014259; cv=none; b=fERYQJks0jZCzMtOQrf359UofY84YgAIW5FFAkdCWb6Osg0MMHwR8wSTw7/IHhH3NMQiVCNekZSLNd1vRCjODYaYHQd7TkUjCIhX/bn1oUT6g+UaXqb+RvMbIIPgops8IBHtpgeDfwHrTKJYD1RlIse/2iRnV8j0Jg4hFJNlJTs= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1711014259; c=relaxed/simple; bh=nC4zC/ekvm31nkrj1AOh27fmuR8c5ikECCO9VZEASp4=; h=Subject:To:References:CC:From:Message-ID:Date:MIME-Version: In-Reply-To:Content-Type; b=Q+sIBKMBhheq52xcBM/pTNdV8qW0Cx8/vy3+Txk/ldBRgcysthnOON6UyBDoLpym6ofCO3CCPe8Y0wUggefk8BUaAIOaBUDlY+yij29lxJRyx1V/GahFxq/QJYV+Mi/8xyRDduP94AhSvu1F3Uapu+uqj2dX5nsIifDZ+Iuhvug= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=45.249.212.35 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.19.163.17]) by szxga07-in.huawei.com (SkyGuard) with ESMTP id 4V0gTH1hZzz1R7hs; Thu, 21 Mar 2024 17:41:39 +0800 (CST) Received: from canpemm500010.china.huawei.com (unknown [7.192.105.118]) by mail.maildlp.com (Postfix) with ESMTPS id CD0C81A0188; Thu, 21 Mar 2024 17:44:13 +0800 (CST) Received: from [10.67.111.82] (10.67.111.82) by canpemm500010.china.huawei.com (7.192.105.118) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Thu, 21 Mar 2024 17:44:13 +0800 Subject: Re: [PATCH v2] ARM: unwind: improve unwinders for noreturn case To: "Russell King (Oracle)" References: <1709516385-7778-1-git-send-email-xiaojiangfeng@huawei.com> <1710906278-23851-1-git-send-email-xiaojiangfeng@huawei.com> <84a57ca8-8963-ca24-8bd1-ddc5c33bf4da@huawei.com> CC: , , , , , , , , , , , , , , , , , , , , , , From: Jiangfeng Xiao Message-ID: Date: Thu, 21 Mar 2024 17:44:07 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.7.1 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: 7bit X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To canpemm500010.china.huawei.com (7.192.105.118) On 2024/3/21 3:40, Russell King (Oracle) wrote: > On Wed, Mar 20, 2024 at 11:30:05PM +0800, Jiangfeng Xiao wrote: >> >> >> On 2024/3/20 16:45, Russell King (Oracle) wrote: >>> On Wed, Mar 20, 2024 at 11:44:38AM +0800, Jiangfeng Xiao wrote: >>>> This is an off-by-one bug which is common in unwinders, >>>> due to the fact that the address on the stack points >>>> to the return address rather than the call address. >>>> >>>> So, for example, when the last instruction of a function >>>> is a function call (e.g., to a noreturn function), it can >>>> cause the unwinder to incorrectly try to unwind from >>>> the function after the callee. >>>> >>>> foo: >>>> ... >>>> bl bar >>>> ... end of function and thus next function ... >>>> >>>> which results in LR pointing into the next function. >>>> >>>> Fixed this by subtracting 1 from frmae->pc in the call frame >>>> (but not exception frames) like ORC on x86 does. >>> >>> The reason that I'm not accepting this patch is because the above says >>> that it fixes it by subtracting 1 from the PC value, but the patch is >>> *way* more complicated than that and there's no explanation why. >>> >>> For example, the following are unexplained: >>> >>> - Why do we always need ex_frame >> >> ``` >> bar: >> ... >> ... end of function bar ... >> >> foo: >> BUG(); >> ... end of function foo ... >> ``` >> >> For example, when the first instruction of function 'foo' >> is a undefined instruction, after function 'foo' is executed >> to trigger an exception, 'arm_get_current_stackframe' assigns >> 'regs->ARM_pc' to 'frame.pc'. >> >> If we always decrement frame.pc by 1, unwinder will incorrectly >> try to unwind from the function 'bar' before the function 'foo'. >> >> So we need to 'ext_frame' to distinguish this case >> where we don't need to subtract 1. > > This just sounds wrong to me. We have two different cases: > > 1) we're unwinding a frame where PC points at the offending instruction. > This may or may not be in the exception text. > 2) we're unwinding a frame that has been created because of a branch, > where the PC points at the next instruction _after_ that callsite. > > While we're unwinding, we will mostly hit the second type of frames, but > we'll only hit the first type on the initial frame. Some exception > entries will have the PC pointing at the next instruction to be > executed, others will have it pointing at the offending instruction > (e.g. if it needs to be retried.) Thank you for your summary. Now we try to enumerate all cases: 1) When we hit the first type of frames 1.1) If the pc pointing at the next instruction of the offending instruction 1.1.1) If the offending instruction is the first instruction of the function pc: cause to unwind from current function pc-1: casue to unwind from current function 1.1.2) If the offending instruction is neither the first instruction nor the last instruction of the function pc: cause to unwind from current function pc-1: casue to unwind from current function 1.1.3) If the offending instruction is the last instruction of the function pc: cause to unwind from next function pc-1: casue to unwind from current function 1.2) If the pc pointing at the offending instruction 1.2.1) If the offending instruction is the first instruction of the function pc: cause to unwind from current function pc-1: casue to unwind from previous function 1.2.2) If the offending instruction is neither the first instruction nor the last instruction of the function pc: cause to unwind from current function pc-1: casue to unwind from current function 1.2.3) If the offending instruction is the last instruction of the function pc: cause to unwind from current function pc-1: casue to unwind from current function 2) When we hit the second type of frames 2.1) pc always pointing at the next instruction after that callsite 2.1.1) If the callsite is the first instruction pc: cause to unwind from current function pc-1: casue to unwind from current function 2.1.2) If the callsite is neither the first nor last instruction pc: cause to unwind from current function pc-1: casue to unwind from current function 2.1.3) If the callsite is the last instruction pc: cause to unwind from next function pc-1: casue to unwind from current function All in all, I think you are right. In case 2), We can always unwind by 'pc-1'. In case 1), If we unwind by 'pc', case 1.1.3) is problematic. If we unwind by 'pc-1', 1.2.1) is problematic. > > So, I don't see what being in the exception/entry text really has much > to do with any decision making here. I think you've found that it works > for your case, but it won't _always_ work and you're just shifting the > "bug" with how these traces work from one issue to a different but > similar issue.