Date: Wed, 20 Sep 2023 10:01:46 -0700
From: "Paul E. McKenney"
To: Robin Murphy
Cc: Mark Rutland, Naresh Kamboju, LTP List, open list, Linux ARM, rcu,
    lkft-triage@lists.linaro.org, chrubis, Greg Kroah-Hartman,
    Peter Zijlstra, Josh Poimboeuf, Jason Baron, Steven Rostedt,
    Ard Biesheuvel, Catalin Marinas, Will Deacon, Dan Carpenter,
    Arnd Bergmann
Subject: Re: arm64: Unable to handle kernel execute from non-executable memory at virtual address ffff8000834c13a0
Message-ID: <8474df43-0718-4ae5-b36e-2c3c1f19d5e9@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <7c85cbf5-efb2-9cc6-4a5c-9854f7db1b0e@arm.com>
In-Reply-To: <7c85cbf5-efb2-9cc6-4a5c-9854f7db1b0e@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Sep 20, 2023 at 05:26:33PM +0100, Robin Murphy wrote:
> On 20/09/2023 3:32 pm, Mark Rutland wrote:
> > Hi Naresh,
> >
> > On Wed, Sep 20, 2023 at 11:29:12AM +0200, Naresh Kamboju wrote:
> > > [ my two cents ]
> > > While running LTP pty07 test cases on arm64 juno-r2 with Linux next-20230919
> > > the following kernel crash was noticed.
> > >
> > > I have been noticing this issue intermittently on Juno-r2 for more than a month.
> > > Has anyone else noticed this crash?
> >
> > How intermittent is this? 1/2, 1/10, 1/100, rarer still?
> >
> > Are you running *just* the pty07 test, or are you running a whole LTP suite and
> > the issue first occurs around pty07?
> >
> > Given you've been hitting this for a month, have you tried testing mainline? Do
> > you have a known-good kernel that we can start a bisect from?
> >
> > Do you *only* see this on Juno-r2 and are you testing on other hardware?
> >
> > > Reported-by: Linux Kernel Functional Testing
> > >
> > > [ 0.000000] Linux version 6.6.0-rc2-next-20230919 (tuxmake@tuxmake)
> > > (aarch64-linux-gnu-gcc (Debian 13.2.0-2) 13.2.0, GNU ld (GNU Binutils
> > > for Debian) 2.41) #1 SMP PREEMPT @1695107157
> > > [ 0.000000] KASLR disabled due to lack of seed
> > > [ 0.000000] Machine model: ARM Juno development board (r2)
> > > ...
> > > LTP running pty
> > > ...
> > >
> > > pty07.c:92: TINFO: Saving active console 1
> > > ../../../include/tst_fuzzy_sync.h:640: TINFO: Stopped sampling at 552
> > > (out of 1024) samples, sampling time reached 50% of the total time
> > > limit
> > > ../../../include/tst_fuzzy_sync.h:307: TINFO: loop = 552, delay_bias = 0
> > > ../../../include/tst_fuzzy_sync.h:295: TINFO: start_a - start_b: { avg
> > > = 127ns, avg_dev = 84ns, dev_ratio = 0.66 }
> > > ../../../include/tst_fuzzy_sync.h:295: TINFO: end_a - start_a : { avg
> > > = 17296156ns, avg_dev = 5155058ns, dev_ratio = 0.30 }
> > > ../../../include/tst_fuzzy_sync.h:295: TINFO: end_b - start_b : { avg
> > > = 101202336ns, avg_dev = 6689286ns, dev_ratio = 0.07 }
> > > ../../../include/tst_fuzzy_sync.h:295: TINFO: end_a - end_b : { avg
> > > = -83906064ns, avg_dev = 10230694ns, dev_ratio = 0.12 }
> > > ../../../include/tst_fuzzy_sync.h:295: TINFO: spins : { avg
> > > = 2765565 , avg_dev = 339285 , dev_ratio = 0.12 }
> > > [ 384.133538] Unable to handle kernel execute from non-executable
> > > memory at virtual address ffff8000834c13a0
> > > [ 384.133559] Mem abort info:
> > > [ 384.133568] ESR = 0x000000008600000f
> > > [ 384.133578] EC = 0x21: IABT (current EL), IL = 32 bits
> > > [ 384.133590] SET = 0, FnV = 0
> > > [ 384.133600] EA = 0, S1PTW = 0
> > > [ 384.133610] FSC = 0x0f: level 3 permission fault
> > > [ 384.133621] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000082375000
> > > [ 384.133634] [ffff8000834c13a0] pgd=10000009fffff003,
> > > p4d=10000009fffff003, pud=10000009ffffe003, pmd=10000009ffff8003,
> > > pte=00780000836c1703
> > > [ 384.133697] Internal error: Oops: 000000008600000f [#1] PREEMPT SMP
> > > [ 384.133707] Modules linked in: tda998x onboard_usb_hub cec hdlcd
> > > crct10dif_ce drm_dma_helper drm_kms_helper fuse drm backlight dm_mod
> > > ip_tables x_tables
> > > [ 384.133767] CPU: 3 PID: 589 Comm: (udev-worker) Not tainted
> > > 6.6.0-rc2-next-20230919 #1
> > > [ 384.133779] Hardware name: ARM Juno development board (r2) (DT)
> > > [ 384.133784] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > [ 384.133796] pc : in_lookup_hashtable+0x178/0x2000
> >
> > This indicates that the faulting address ffff8000834c13a0 is
> > in_lookup_hashtable+0x178/0x2000, which would mean we've somehow marked the
> > kernel text as non-executable, which we never do intentionally.
> >
> > I suspect that implies memory corruption. Have you tried running this with
> > KASAN enabled?
> >
> > > [ 384.133818] lr : rcu_core (arch/arm64/include/asm/preempt.h:13
> > > (discriminator 1) kernel/rcu/tree.c:2146 (discriminator 1)
> > > kernel/rcu/tree.c:2403 (discriminator 1))
>
> For the record, this LR appears to be the expected return address of the
> "f(rhp);" call within rcu_do_batch() (if CONFIG_DEBUG_LOCK_ALLOC=n), so it
> looks like a case of a bogus or corrupted RCU callback. The PC is in the
> middle of a data symbol (in_lookup_hashtable is an array), so NX is expected
> and I wouldn't imagine the pagetables have gone wrong, just regular data
> corruption or use-after-free somewhere.

Is it possible to use either KASAN or CONFIG_DEBUG_OBJECTS_RCU_HEAD=y here?

Thanx, Paul

> Robin.
>
> > > [ 384.133832] sp : ffff800083533e60
> > > [ 384.133836] x29: ffff800083533e60 x28: ffff0008008a6180 x27: 000000000000000a
> > > [ 384.133854] x26: 0000000000000000 x25: 0000000000000000 x24: ffff800083533f10
> > > [ 384.133871] x23: ffff800082404008 x22: ffff800082ebea80 x21: ffff800082f55940
> > > [ 384.133889] x20: ffff00097ed75440 x19: 0000000000000001 x18: 0000000000000000
> > > [ 384.133905] x17: ffff8008fc95c000 x16: ffff800083530000 x15: 00003d0900000000
> > > [ 384.133922] x14: 0000000000030d40 x13: 0000000000000000 x12: 003d090000000000
> > > [ 384.133939] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff80008015b05c
> > > [ 384.133955] x8 : ffff800083533da8 x7 : 0000000000000000 x6 : 0000000000000100
> > > [ 384.133971] x5 : ffff800082ebf000 x4 : ffff800082ebf2e8 x3 : 0000000000000000
> > > [ 384.133987] x2 : ffff000825bf8618 x1 : ffff8000834c13a0 x0 : ffff00082b6d7170
> > > [ 384.134005] Call trace:
> > > [ 384.134009] in_lookup_hashtable+0x178/0x2000
> > > [ 384.134022] rcu_core_si (kernel/rcu/tree.c:2421)
> > > [ 384.134035] __do_softirq (arch/arm64/include/asm/jump_label.h:21
> > > include/linux/jump_label.h:207 include/trace/events/irq.h:142
> > > kernel/softirq.c:554)
> > > [ 384.134046] ____do_softirq (arch/arm64/kernel/irq.c:81)
> > > [ 384.134058] call_on_irq_stack (arch/arm64/kernel/entry.S:888)
> > > [ 384.134070] do_softirq_own_stack (arch/arm64/kernel/irq.c:86)
> > > [ 384.134082] irq_exit_rcu (arch/arm64/include/asm/percpu.h:44
> > > kernel/softirq.c:612 kernel/softirq.c:634 kernel/softirq.c:644)
> > > [ 384.134094] el0_interrupt (arch/arm64/include/asm/daifflags.h:28
> > > arch/arm64/kernel/entry-common.c:133
> > > arch/arm64/kernel/entry-common.c:144
> > > arch/arm64/kernel/entry-common.c:763)
> > > [ 384.134110] __el0_irq_handler_common (arch/arm64/kernel/entry-common.c:769)
> > > [ 384.134124] el0t_64_irq_handler (arch/arm64/kernel/entry-common.c:774)
> > > [ 384.134137] el0t_64_irq (arch/arm64/kernel/entry.S:592)
> > > [ 384.134153] Code: 00000000 00000000 00000000 00000000 (2b6d7170)
> > > All code
> > > ========
> > > ...
> > > 10: 70 71 jo 0x83
> > > 12: 6d insl (%dx),%es:(%rdi)
> > > 13: 2b .byte 0x2b
> > >
> > > Code starting with the faulting instruction
> > > ===========================================
> > > 0: 70 71 jo 0x73
> > > 2: 6d insl (%dx),%es:(%rdi)
> > > 3: 2b .byte 0x2b
> >
> > As a general thing, can you *please* fix this code dump to decode arm64 as
> > arm64?
> >
> > Given the instructions before this are all UDF #0, I suspect the page table
> > entry has been corrupted and this is pointing at entirely the wrong page.
> >
> > Thanks,
> > Mark.
> >
> > > [ 384.134161] ---[ end trace 0000000000000000 ]---
> > > [ 384.134168] Kernel panic - not syncing: Oops: Fatal exception in interrupt
> > > [ 384.134173] SMP: stopping secondary CPUs
> > > [ 384.134184] Kernel Offset: disabled
> > > [ 384.134187] CPU features: 0x8000020c,3c020000,0000421b
> > > [ 384.134194] Memory Limit: none
> > >
> > > Links:
> > > - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230919/testrun/20054202/suite/log-parser-test/tests/
> > > - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20230919/testrun/20054202/suite/log-parser-test/test/check-kernel-oops/log
> > > - https://storage.tuxsuite.com/public/linaro/lkft/builds/2VbZdpWwncUx8oSxsSXCWV3N5DH/
> > > - https://lkft.validation.linaro.org/scheduler/job/6666807#L2461
> > >
> > > --
> > > Linaro LKFT
> > > https://lkft.linaro.org
> >
> > _______________________________________________
> > linux-arm-kernel mailing list
> > linux-arm-kernel@lists.infradead.org
> > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
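
For anyone reading along in the archive, the "Mem abort info" fields and the pte value in the quoted oops can be cross-checked by hand. The sketch below is not part of the original message. It assumes the Armv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, FSC in ISS bits [5:0]) and the stage 1 page descriptor layout (valid in bit 0, PXN in bit 53, UXN in bit 54). The helper names decode_esr and pte_exec_bits are made up for illustration and do not exist in the kernel.

```python
# Values copied from the quoted oops.
ESR = 0x000000008600000F
PTE = 0x00780000836C1703


def decode_esr(esr):
    """Split an ESR_ELx value into the fields shown under 'Mem abort info'.

    Assumed layout: EC = bits [31:26], IL = bit 25, FSC = ISS bits [5:0].
    """
    return {
        "EC": (esr >> 26) & 0x3F,  # exception class (0x21 = IABT, current EL)
        "IL": (esr >> 25) & 0x1,   # 1 means a 32-bit instruction
        "FSC": esr & 0x3F,         # fault status code (0x0f = level 3 permission)
    }


def pte_exec_bits(pte):
    """Report the valid bit and the execute-never bits of a stage 1 descriptor.

    Assumed layout: valid = bit 0, PXN = bit 53, UXN = bit 54.
    """
    return {
        "valid": bool(pte & 0x1),
        "PXN": bool((pte >> 53) & 0x1),  # privileged execute-never
        "UXN": bool((pte >> 54) & 0x1),  # unprivileged execute-never
    }


if __name__ == "__main__":
    esr = decode_esr(ESR)
    print("EC=%#x IL=%d FSC=%#04x" % (esr["EC"], esr["IL"], esr["FSC"]))
    print(pte_exec_bits(PTE))
```

With the values from the log, this gives EC = 0x21, IL = 1 and FSC = 0x0f, and a valid pte with both PXN and UXN set. That fits Robin's reading that NX is expected for a data symbol such as in_lookup_hashtable, so the mapping itself looks sane and the bad jump target is the more likely problem.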