Date: Fri, 11 Nov 2022 11:54:22 +0000
Subject: Re: [GIT PULL] arm64 updates for 6.1-rc1
From: Robin Murphy
To: Catalin Marinas, Amit Pundir
Cc: Bjorn Andersson, Sibi Sankar, Manivannan Sadhasivam, Will Deacon,
 Linus Torvalds, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, Dmitry Baryshkov
References: <20221005144116.2256580-1-catalin.marinas@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022-11-11 11:15, Catalin Marinas wrote:
> On Tue, Nov 08, 2022 at 10:58:16PM +0530, Amit Pundir wrote:
>> On Tue, 25 Oct 2022 at 18:08, Amit Pundir wrote:
>>> On Wed, 12 Oct 2022 at 17:24, Catalin Marinas wrote:
>>>> On Sat, Oct 08, 2022 at 08:28:26PM +0530, Amit Pundir wrote:
>>>>> On Wed, 5 Oct 2022 at 20:11, Catalin Marinas wrote:
>>>>>> Will Deacon (2):
>>>>>>       arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
>>>>>
>>>>> This patch broke AOSP on Dragonboard 845c (SDM845). I don't see any
>>>>> relevant crash in the attached log and device silently reboots into
>>>>> USB crash dump mode. The crash is fairly reproducible on db845c. I
>>>>> could trigger it twice in 5 reboots and it always crash at the same
>>>>> point during the boot process. Reverting this patch fixes the crash.
>>>>>
>>>>> I'm happy to test run any debug patche(s), that would help narrow
>>>>> down this breakage.
> [...]
>>> Further narrowed down the breakage to the userspace daemon rmtfs
>>> https://github.com/andersson/rmtfs. Is there anything specific in the
>>> userspace code that I should be paying attention to?

FWIW, this scenario appears to have pretty much everything going on -
buffers allocated from no-map carveouts, being shared with firmware as
well as DMA devices, being poked by userspace through /dev/mem, and
presumably with the funky Qualcomm sort-of-coherent outer cache in the
mix too (where IIRC the outer non-cacheable attribute behaves
differently for CPUs vs. DMA).
If anything's ever going to go awry with mismatched attributes and stale
cachelines, it's probably in that setup somewhere.

> Since you don't see anything in the logs like a crash and the system
> restarts, I suspect it's some deadlock and that's triggering the
> watchdog. We have an erratum (826319) but that's for Cortex-A53. IIUC
> SDM845 has Kryo 3xx series which based on some random google searches is
> derived from A75/A55. Unfortunately the MIDR_EL1 register doesn't match
> the Arm Ltd numbering, so I have no idea what CPUs these are by looking
> at the boot log.

Note that the EL2 firmware on these things tends to happily reset the
system without warning if you so much as look at it funny, so I'd
imagine a straightforward timeout or other unexpected condition due to
coherency getting lost somewhere in the kernel/firmware/device handoff
process is probably more than enough.

Robin.
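
On the MIDR_EL1 point quoted above: a minimal sketch of splitting a
MIDR_EL1 value into its architected fields (implementer, variant,
architecture, part number, revision), which is roughly what one would do
by hand with the value from a boot log. The helper names are made up for
illustration, the implementer table is deliberately limited to the two
codes relevant to this thread, and the sample value 0x410fd034 (a
Cortex-A53 r0p4) is an assumption chosen for the example, not something
taken from the db845c log.

```python
from typing import NamedTuple


class Midr(NamedTuple):
    """Fields of the AArch64 MIDR_EL1 register (low 32 bits)."""
    implementer: int   # bits [31:24]
    variant: int       # bits [23:20], the "rN" in rNpM
    architecture: int  # bits [19:16]
    partnum: int       # bits [15:4], implementer-defined
    revision: int      # bits [3:0], the "pM" in rNpM


# Illustrative subset only: just the implementer codes relevant here.
IMPLEMENTERS = {0x41: "Arm", 0x51: "Qualcomm"}


def decode_midr(value: int) -> Midr:
    """Split a raw MIDR_EL1 value into its bit fields."""
    return Midr(
        implementer=(value >> 24) & 0xFF,
        variant=(value >> 20) & 0xF,
        architecture=(value >> 16) & 0xF,
        partnum=(value >> 4) & 0xFFF,
        revision=value & 0xF,
    )


def describe(value: int) -> str:
    """Render a MIDR_EL1 value as 'vendor part 0xNNN rXpY'."""
    m = decode_midr(value)
    vendor = IMPLEMENTERS.get(m.implementer, "unknown")
    return f"{vendor} part 0x{m.partnum:03x} r{m.variant}p{m.revision}"


if __name__ == "__main__":
    # Hypothetical sample value (Cortex-A53 r0p4), not from the db845c log.
    print(describe(0x410FD034))
```

The part number is only meaningful relative to the implementer, so a
value with implementer 0x51 has to be looked up in Qualcomm's own
numbering rather than in the Arm Ltd one, which is why the boot log
alone does not say which Arm design the core is derived from.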