Message-ID: <8140c4c7-10d5-46dc-8c32-8bee7bf95918@gmail.com>
Date: Thu, 21 Dec 2023 12:19:15 +0100
X-Mailing-List: linux-embedded@vger.kernel.org
Subject: Re: Debugging early SError exception
To: Lior Weintraub, "linux-embedded@vger.kernel.org"
References: <375eeb75-dde5-4806-a2d7-7f4e97342ee8@gmail.com>
From: Dirk Behme

On 21.12.23 at 11:04, Lior Weintraub wrote:
> Thanks Dirk,
>
> Regarding the earlyprintk, not sure I know how to make it work.
> I have defined CONFIG_EARLY_PRINTK=y and CONFIG_DEBUG_LL=y on my config but it doesn't seem to work.
> Do I need to pass something in the bootargs from the U-BOOT?
> Do I need to add that into my device tree?
> (Tried to set bootargs = "console=ttyS0,115200 earlyprintk"; under "chosen" on my DT but it didn't work)

Yes, which options have to be enabled and how they have to be set is often confusing. I don't think this is the same for all systems, so to be on the safe side you have to look into the code for your system. In short: the code is the documentation ;)

> The UART I am using is "snps,dw-apb-uart".
>
> Last week, to output the early logs I have implemented this hack:
> 1. Modify printk macro to run my print_func
> 2. This print_func wrote the characters into a single global variable (u32 simul_uart;)
> 3. Get the address location of this global variable and extract all writes to it from the Tarmac logs.
>
> This is a very slow and tedious process but it helped me identify the initial SError.
> Initially I thought I can write directly into the UART FIFO register (which I know the address) but this didn't work because Linux already setup the MMU so I guess I need to know the virtual address of this FIFO.
> Do I need to use __phys_to_virt of some sort?

Yes, I think so. Have a look at the existing serial driver, too. It should do what's needed, and you can borrow from it.

Best regards

Dirk

> Cheers,
> Lior.
>
>> -----Original Message-----
>> From: Dirk Behme
>> Sent: Thursday, December 21, 2023 10:30 AM
>> To: Lior Weintraub; linux-embedded@vger.kernel.org
>> Subject: Re: Debugging early SError exception
>>
>> On 21.12.23 at 08:43, Lior Weintraub wrote:
>>> Hi Dirk,
>>>
>>> We found that the issue was at the early stages of Barebox (a.k.a U-BOOT v2).
>>
>> Glad to hear that! :)
>>
>>> Our implementation of putc_ll (on debug_ll) was writing into the UART Tx FIFO without checking if the FIFO is full.
>>> Once the fifo got full it caused this SError probably because the UART IP generated an apberror signal.
>>
>> Thanks for the report!
>>
>>> Now the Linux is running and doesn't report the SError again but now we face another issue.
>>> We see that the PC is getting into a "report_bug" function.
>>> The Linux doesn't print anything to the UART (probably since it hasn't got to the point where the console is configured?).
>>
>> For cases like this using earlyprintk is usually a good option. Check whether the Linux kernel serial console (UART) driver of your SoC supports it. In the end it should be "just" a function in the serial console driver which outputs the console data via polling before (later) the interrupt driven console part takes over.
>>
>> Best regards
>>
>> Dirk
>>
>>
>>> Since our debug means are limited it can take some time to find the root cause.
>>>
>>> I will keep you posted and update our findings.
>>> Love to hear your thoughts,
>>>
>>> Cheers,
>>> Lior.
>>>
>>>
>>>> -----Original Message-----
>>>> From: Dirk Behme
>>>> Sent: Tuesday, December 19, 2023 3:37 PM
>>>> To: Lior Weintraub; linux-embedded@vger.kernel.org
>>>> Subject: Re: Debugging early SError exception
>>>>
>>>> On 19.12.23 at 14:23, Lior Weintraub wrote:
>>>>> Thanks Dirk,
>>>>
>>>> Welcome :)
>>>>
>>>> In case you find the root cause it would be nice to get some generic description of it so that we can learn something :)
>>>>
>>>> Best regards
>>>>
>>>> Dirk
>>>>
>>>>
>>>>>> -----Original Message-----
>>>>>> From: Dirk Behme
>>>>>> Sent: Tuesday, December 19, 2023 9:09 AM
>>>>>> To: Lior Weintraub; linux-embedded@vger.kernel.org
>>>>>> Subject: Re: Debugging early SError exception
>>>>>>
>>>>>> On 17.12.23 at 22:32, Lior Weintraub wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> We have a new SoC with eLinux porting (kernel v6.5).
>>>>>>> This SoC is ARM64 (A53) single core based device.
>>>>>>> It runs correctly on QEMU but fails with SError on emulation platform (Synopsys Zebu running our SoC model).
>>>>>>> There is no debugger connected to this emulation but there are several debug capabilities we can use:
>>>>>>> 1. Generating wave dump of CPU signals
>>>>>>> 2. Generate a Tarmac log
>>>>>>> 3. UART
>>>>>>>
>>>>>>> Since the SError happens at early stages of Linux boot the UART is not enabled yet.
>>>>>>> From the Tarmac log we can see:
>>>>>>> 3824884521 ps ES (ffff800080760888:d65f03c0) O el1h_ns: ret (parse_early_param)
>>>>>>> 3824884522 ps ES (ffff800080763a60:d2801800) O el1h_ns: mov x0, #0xc0 // #192 (setup_arch)
>>>>>>>     R X0 (AARCH64) 00000000 000000c0
>>>>>>> 3824884523 ps ES (ffff800080763a64:d51b4220) O el1h_ns: msr daif, x0 (setup_arch)
>>>>>>>     R CPSR 600000c5
>>>>>>> 3824884529 ps ES System Error (Abort)
>>>>>>>     EXC [0x380] SError/vSError Current EL with SP_ELx
>>>>>>>     R ESR_EL1 (AARCH64) bf000002
>>>>>>>     R CPSR 600003c5
>>>>>>>     R SPSR_EL1 (AARCH64) 600000c5
>>>>>>>     R ELR_EL1 (AARCH64) ffff8000 80763a68
>>>>>>> 3824884925 ps ES (ffff800080010b80:d10543ff) O el1h_ns: sub sp, sp, #0x150 (vectors)
>>>>>>>     R SP_EL1 (AARCH64) ffff8000 808f3c50
>>>>>>> 3824884925 ps ES (ffff800080010b84:8b2063ff) O el1h_ns: add sp, sp, x0 (vectors)
>>>>>>>     R SP_EL1 (AARCH64) ffff8000 808f3d10
>>>>>>> 3824884926 ps ES (ffff800080010b88:cb2063e0) O el1h_ns: sub x0, sp, x0 (vectors)
>>>>>>>     R X0 (AARCH64) ffff8000 808f3c50
>>>>>>> 3824884927 ps ES (ffff800080010b8c:37700080) O el1h_ns: tbnz w0, #14, ffff800080010b9c (vectors)
>>>>>>> 3824884935 ps ES (ffff800080010b90:cb2063e0) O el1h_ns: sub x0, sp, x0 (vectors)
>>>>>>>     R X0 (AARCH64) 00000000 000000c0
>>>>>>> 3824884937 ps ES (ffff800080010b94:cb2063ff) O el1h_ns: sub sp, sp, x0 (vectors)
>>>>>>>     R SP_EL1 (AARCH64) ffff8000 808f3c50
>>>>>>> 3824884938 ps ES (ffff800080010b98:140001ef) O el1h_ns: b ffff800080011354 (vectors)
>>>>>>>
>>>>>>> If I understand correctly, the exception happened sometime earlier and only now Linux boot code (setup_arch) opened the exception handling and as a result we immediately jump to the SError exception handler.
>>>>>>
>>>>>>
>>>>>> Yes, that sounds reasonable.
>>>>>> If I understood correctly, you are running something "quite new" on some software (QEMU) and hardware (Synopsys) simulators.
>>>>>>
>>>>>> That would mean that you have new hardware with e.g. a new memory map not used before. What you describe sounds like something in the code before Linux (boot loader) is resulting in the SError. This might be an access to non-existing or non-enabled hardware. I.e. it might be that you try to access (read/write) an address which is not available yet (or just invalid). It's hard to debug that. In case you are able to modify the code before Linux (the boot loader?) you might try to enable SError exceptions there, too, to get the exception earlier and with that make the search window smaller. I'm not that familiar with QEMU, but could you try to trace which (all?) hardware accesses your code does? And with that analyse all accesses and check if all of them are valid even on the hardware (Synopsys) emulation system? That should be checked from the valid address and from the hardware subsystem enablement point of view.
>>>>>>
>>>>>> Hth,
>>>>>>
>>>>>> Dirk
>>>>>>
>>>>>>
>>>>>>> From the Linux source:
>>>>>>> parse_early_param();
>>>>>>>
>>>>>>> dynamic_scs_init();
>>>>>>>
>>>>>>> /*
>>>>>>>  * Unmask asynchronous aborts and fiq after bringing up possible
>>>>>>>  * earlycon. (Report possible System Errors once we can report this
>>>>>>>  * occurred).
>>>>>>>  */
>>>>>>> local_daif_restore(DAIF_PROCCTX_NOIRQ); <---- This is when we get the exception.
>>>>>>>
>>>>>>> After some kernel hacking (replacing printk) we could extract the logs:
>>>>>>> 6Booting Linux on physical CPU 0x0000000000 [0x410fd034]
>>>>>>> 5Linux version 6.5.0 (pliops@dev-liorw) (aarch64-buildroot-linux-gnu-gcc.br_real (Buildroot 2023.02.1-95-g8391404e23) 11.3.0, GNU ld (GNU Binutils) 2.38) #101 SMP Sun Dec 17 20:09:06 IST 2023
>>>>>>> 6Machine model: Pliops Spider MK-I EVK
>>>>>>> 2SError Interrupt on CPU0, code 0x00000000bf000002 -- SError
>>>>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0 #101
>>>>>>> Hardware name: Pliops Spider MK-I EVK (DT)
>>>>>>> pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>>> pc : setup_arch+0x13c/0x5ac
>>>>>>> lr : setup_arch+0x134/0x5ac
>>>>>>> sp : ffff8000808f3da0
>>>>>>> x29: ffff8000808f3da0c x28: 0000000008758074c x27: 0000000005e31b58c
>>>>>>> x26: 0000000000000001c x25: 0000000007e5f728c x24: ffff8000808f8000c
>>>>>>> x23: ffff8000808f8600c x22: ffff8000807b6000c x21: ffff800080010000c
>>>>>>> x20: ffff800080a1e000c x19: fffffbfffddfe190c x18: 000000002266684ac
>>>>>>> x17: 00000000fcad60bbc x16: 0000000000001800c x15: 0000000000000008c
>>>>>>> x14: ffffffffffffffffc x13: 0000000000000000c x12: 0000000000000003c
>>>>>>> x11: 0101010101010101c x10: ffffffffffee87dfc x9 : 0000000000000038c
>>>>>>> x8 : 0101010101010101c x7 : 7f7f7f7f7f7f7f7fc x6 : 0000000000000001c
>>>>>>> x5 : 0000000000000000c x4 : 8000000000000000c x3 : 0000000000000065c
>>>>>>> x2 : 0000000000000000c x1 : 0000000000000000c x0 : 00000000000000c0c
>>>>>>> 0Kernel panic - not syncing: Asynchronous SError Interrupt
>>>>>>> CPU: 0 PID: 0 Comm: swapper Not tainted 6.5.0 #101
>>>>>>> Hardware name: Pliops Spider MK-I EVK (DT)
>>>>>>> Call trace:
>>>>>>> dump_backtrace+0x9c/0xd0
>>>>>>> show_stack+0x14/0x1c
>>>>>>> dump_stack_lvl+0x44/0x58
>>>>>>> dump_stack+0x14/0x1c
>>>>>>> panic+0x2e0/0x33c
>>>>>>> nmi_panic+0x68/0x6c
>>>>>>> arm64_serror_panic+0x68/0x78
>>>>>>> do_serror+0x24/0x54
>>>>>>> el1h_64_error_handler+0x2c/0x40
>>>>>>> el1h_64_error+0x64/0x68
>>>>>>> setup_arch+0x13c/0x5ac
>>>>>>> start_kernel+0x5c/0x5b8
>>>>>>> __primary_switched+0xb4/0xbc
>>>>>>> 0---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---
>>>>>>>
>>>>>>> Can you please advise how to proceed with debugging?
>>>>>>>
>>>>>>> Thanks in advance,
>>>>>>> Cheers,
>>>>>>> Lior.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>
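
The root cause reported in this thread (a putc_ll that wrote into the UART Tx FIFO without checking whether it was full) can be illustrated with a small user-space sketch. It is not the Barebox or Linux code. The register indices follow the common 16550-style layout, which is assumed here to apply to the "snps,dw-apb-uart" block in its default mode. The sim_uart model, SIM_FIFO_DEPTH and all function names are hypothetical and exist only so that the example runs without hardware. On a real target uart_read() and uart_write() would be MMIO accesses to the mapped UART base.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define UART_THR        0     /* transmit holding register (write) */
#define UART_LSR        5     /* line status register (read) */
#define UART_LSR_THRE   0x20  /* transmitter holding register / Tx FIFO empty */
#define SIM_FIFO_DEPTH  16    /* assumption: FIFO depth of the modelled UART */

/* Hypothetical software model standing in for the real UART. */
struct sim_uart {
    unsigned fill;       /* bytes currently queued in the Tx FIFO */
    unsigned overflows;  /* writes that hit a full FIFO (the failing case) */
    char     wire[256];  /* bytes accepted by the UART, in order */
    size_t   sent;
};

static struct sim_uart sim;

static void sim_reset(void)
{
    memset(&sim, 0, sizeof(sim));
}

/* Each status poll lets one queued byte leave the FIFO ("time passes"). */
static uint32_t uart_read(unsigned reg)
{
    assert(reg == UART_LSR);  /* the model only implements LSR reads */
    if (sim.fill > 0)
        sim.fill--;
    return sim.fill == 0 ? UART_LSR_THRE : 0;
}

static void uart_write(unsigned reg, uint32_t val)
{
    assert(reg == UART_THR);  /* the model only implements THR writes */
    if (sim.fill == SIM_FIFO_DEPTH) {
        sim.overflows++;      /* real hardware may answer with a bus error */
        return;
    }
    sim.fill++;
    if (sim.sent < sizeof(sim.wire))
        sim.wire[sim.sent++] = (char)val;
}

/* The problematic variant: push the byte without looking at the status. */
static void putc_unchecked(char c)
{
    uart_write(UART_THR, (unsigned char)c);
}

/* The polled variant: wait until the transmitter reports empty, then write. */
static void putc_polled(char c)
{
    while (!(uart_read(UART_LSR) & UART_LSR_THRE))
        ;  /* busy-wait; no interrupts are needed this early in boot */
    uart_write(UART_THR, (unsigned char)c);
}

static void puts_polled(const char *s)
{
    while (*s)
        putc_polled(*s++);
}
```

Waiting for "empty" instead of "not full" is the conservative choice: it leaves most of the FIFO unused but cannot overrun it. Calling putc_unchecked() more than SIM_FIFO_DEPTH times in a row makes the model count overflows, while puts_polled() delivers every byte.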