Date: Mon, 14 Nov 2022 14:09:04 +0000
From: Will Deacon
To: Manivannan Sadhasivam
Cc: Catalin Marinas, Amit Pundir, Robin Murphy, Bjorn Andersson,
	Sibi Sankar, Linus Torvalds, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Dmitry Baryshkov
Subject: Re: [GIT PULL] arm64 updates for 6.1-rc1
Message-ID: <20221114140903.GF30263@willie-the-truck>
References: <20221005144116.2256580-1-catalin.marinas@arm.com> <20221111173952.GB5393@thinkpad> <20221111191820.GC5393@thinkpad>
In-Reply-To: <20221111191820.GC5393@thinkpad>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Nov 12, 2022 at 12:48:20AM +0530, Manivannan Sadhasivam wrote:
> On Fri, Nov 11, 2022 at 11:10:01PM +0530, Manivannan Sadhasivam wrote:
> > On Fri, Nov 11, 2022 at 11:15:11AM +0000, Catalin Marinas wrote:
> > > On Tue, Nov 08, 2022 at 10:58:16PM +0530, Amit Pundir wrote:
> > > > On Tue, 25 Oct 2022 at 18:08, Amit Pundir wrote:
> > > > > On Wed, 12 Oct 2022 at 17:24, Catalin Marinas wrote:
> > > > > > On Sat, Oct 08, 2022 at 08:28:26PM +0530, Amit Pundir wrote:
> > > > > > > On Wed, 5 Oct 2022 at 20:11, Catalin Marinas wrote:
> > > > > > > > Will Deacon (2):
> > > > > > > >   arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
> > > > > > >
> > > > > > > This patch broke AOSP on Dragonboard 845c (SDM845).
> > > > > > > I don't see any relevant crash in the attached log and the device
> > > > > > > silently reboots into USB crash dump mode. The crash is fairly
> > > > > > > reproducible on db845c. I could trigger it twice in 5 reboots and it
> > > > > > > always crashes at the same point during the boot process. Reverting
> > > > > > > this patch fixes the crash.
> > > > > > >
> > > > > > > I'm happy to test run any debug patch(es) that would help narrow
> > > > > > > down this breakage.
> > > [...]
> > > > > Further narrowed down the breakage to the userspace daemon rmtfs
> > > > > https://github.com/andersson/rmtfs. Is there anything specific in the
> > > > > userspace code that I should be paying attention to?
> > >
> > > Since you don't see anything in the logs like a crash and the system
> > > restarts, I suspect it's some deadlock and that's triggering the
> > > watchdog. We have an erratum (826319) but that's for Cortex-A53. IIUC
> > > SDM845 has the Kryo 3xx series which, based on some random google
> > > searches, is derived from A75/A55. Unfortunately the MIDR_EL1 register
> > > doesn't match the Arm Ltd numbering, so I have no idea what CPUs these
> > > are by looking at the boot log.
> > >
> > > I wouldn't be surprised if you hit a similar bug, though I couldn't find
> > > anything close in the A55 errata notice.
> > >
> > > While we could revert commit c44094eee32f ("arm64: dma: Drop cache
> > > invalidation from arch_dma_prep_coherent()"), if you hit a real hardware
> > > issue it may trigger in other scenarios where we only do cache cleaning
> > > (without invalidate), like arch_sync_dma_for_device(). So I'd rather get
> > > to the bottom of this and potentially enable the workaround for this
> > > chipset.
> > >
> > > You could give it a quick try by adding the MIDR ranges for SDM845 to
> > > struct midr_range workaround_clean_cache[].
> > >
> >
> > I gave it a shot and indeed it fixes the crash on DB845.
> >
> > > After that I suggest you raise it with Qualcomm to investigate. Normally
> > > we ask for an erratum number to enable a workaround and it's only
> > > Qualcomm that can provide one here.
> > >
> >
> > I will check with Qualcomm folks and update.
> >
>
> I dug a little further and found that the crash was due to a secure
> processor (XPU) violation. It happens because the CPU tried accessing the
> memory after sharing it with the modem for firmware metadata validation.

Can you share more details about this violation, please? For example, is it
a read or a write, what size is it, how is it detected?

> Sibi tried fixing this problem earlier by using a hack in the remoteproc driver
> [1], but I guess that got negated due to c44094eee32f?

Performing a clean rather than a clean+invalidate when the buffer is allocated
(which is what is achieved by c44094eee32f) shouldn't affect this afaict.

> This is a common issue for other Qcom remoteproc drivers as well, where the
> CPU shares a chunk of memory with the modem. There is one more hack in place
> where a chunk of memory is reserved and the driver will do memremap/copy the
> data/memunmap using it and share it with the modem.
>
> But is there a better solution overall that you could advise?

I think we need a better understanding of what Qualcomm's SCM firmware is
expecting about the state of the buffer pages being shared with the modem
before we can suggest other solutions.

Will
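
For readers following the MIDR discussion quoted above, here is a minimal
userspace sketch of how a MIDR_EL1 value splits into fields and how it can be
matched against a list of CPU model/revision ranges. It is not the kernel's
code: struct cpu_model_range, midr_in_range(), midr_matches_any() and
example_ranges[] are illustrative names, the Qualcomm part number is a
placeholder rather than a real SDM845 value, and the Cortex-A53 revision range
is chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * MIDR_EL1 field layout: implementer[31:24], variant[23:20],
 * architecture[19:16], part number[15:4], revision[3:0].
 */
#define MIDR_REVISION(m)     ((uint32_t)(m) & 0xfu)
#define MIDR_PARTNUM(m)      (((uint32_t)(m) >> 4) & 0xfffu)
#define MIDR_ARCHITECTURE(m) (((uint32_t)(m) >> 16) & 0xfu)
#define MIDR_VARIANT(m)      (((uint32_t)(m) >> 20) & 0xfu)
#define MIDR_IMPLEMENTER(m)  (((uint32_t)(m) >> 24) & 0xffu)

/* Everything except variant and revision identifies the CPU model. */
#define MIDR_MODEL_MASK 0xff0ffff0u
#define MIDR_MODEL(imp, part) \
	(((uint32_t)(imp) << 24) | (0xfu << 16) | ((uint32_t)(part) << 4))

#define IMPLEMENTER_ARM        0x41u
#define IMPLEMENTER_QCOM       0x51u
#define PART_CORTEX_A53        0xd03u
#define HYPOTHETICAL_QCOM_PART 0x123u /* placeholder, NOT a real part number */

/* Simplified stand-in for a model/revision range (hypothetical type). */
struct cpu_model_range {
	uint32_t model;  /* MIDR with variant and revision masked out */
	uint32_t rv_min; /* lowest (variant << 4 | revision) covered */
	uint32_t rv_max; /* highest (variant << 4 | revision) covered */
};

/* True if midr is the given model and its rNpM falls inside the range. */
bool midr_in_range(uint32_t midr, const struct cpu_model_range *r)
{
	uint32_t rv = (MIDR_VARIANT(midr) << 4) | MIDR_REVISION(midr);

	assert(r->rv_min <= r->rv_max);
	return (midr & MIDR_MODEL_MASK) == r->model &&
	       rv >= r->rv_min && rv <= r->rv_max;
}

/* True if midr matches any entry of a range list of length n. */
bool midr_matches_any(uint32_t midr, const struct cpu_model_range *list,
		      size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (midr_in_range(midr, &list[i]))
			return true;
	return false;
}

/* Example list: one Arm model with a revision window, one placeholder. */
const struct cpu_model_range example_ranges[] = {
	{ MIDR_MODEL(IMPLEMENTER_ARM, PART_CORTEX_A53), 0x00, 0x02 },
	{ MIDR_MODEL(IMPLEMENTER_QCOM, HYPOTHETICAL_QCOM_PART), 0x00, 0xff },
};
```

With this list, a Cortex-A53 r0p2 value (0x410fd032) matches the first entry
and an r0p4 value (0x410fd034) does not. Trying a workaround on another
chipset in this model would mean adding an entry built from that chipset's
actual implementer and part number, which have to be read from the hardware or
obtained from the vendor.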