Subject: Re: Nouveau crashes in 4.6-rc on arm64
From: Alexandre Courbot
Organization: NVIDIA
To: Robin Murphy
Date: Wed, 20 Apr 2016 13:35:11 +0900
Message-ID: <571706FF.1010300@nvidia.com>
In-Reply-To: <570B50B4.4020304@nvidia.com>
References: <57064992.1060509@arm.com> <570737F5.30105@nvidia.com> <5707FC9F.50905@arm.com> <570B50B4.4020304@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/11/2016 04:22 PM, Alexandre Courbot wrote:
> Hi Robin,
>
> On 04/09/2016 03:46 AM, Robin Murphy wrote:
>> Hi Alex,
>>
>> On 08/04/16 05:47, Alexandre Courbot wrote:
>>> Hi Robin,
>>>
>>> On 04/07/2016 08:50 PM, Robin Murphy wrote:
>>>> Hello,
>>>>
>>>> With 4.6-rc2 (and -rc1) I'm seeing Nouveau blowing up at boot, from the
>>>> look of it by dereferencing some offset from NULL inside
>>>> nouveau_fbcon_imageblit(). My setup is an old XFX 7600GT card plugged
>>>> into an ARM Juno r1 board, which works fine with 4.5 and earlier.
>>>>
>>>> Attached are a couple of logs from booting arm64 defconfig plus DRM and
>>>> Nouveau enabled - the second also has framebuffer console rotation
>>>> turned on, which interestingly seems to move the point of failure, and
>>>> the display does eventually come up to show the tail end of the panic
>>>> in that case.
>>>>
>>>> I might be able to find time for a full bisection next week if it isn't
>>>> something sufficiently obvious to anyone who knows this driver.
>>>
>>> Looking at the log it is not clear to me what could be causing this. I
>>> can boot 4.6-rc2 with a GM206 card without any issue. A bisect would
>>> indeed be useful here.
>>
>> OK, turns out the lure of writing something to remotely drive a Juno and
>> parse kernel bootlogs through an automatic bisection was too great to
>> resist on a Friday afternoon :D
>>
>> Bisection came down to 1733a2ad3674 ("drm/nouveau/device/pci: set as
>> non-CPU-coherent on ARM64"), and sure enough reverting that removes the
>> crash.
>
> Thanks for taking the time to bisect this. And apologies as it seems my
> commit is the reason for your troubles.
>
> The CPU coherency flag is used for two things: explicitly syncing buffer
> pages when required, and allocating buffers that are not explicitly
> synced (like fences or pushbuffers) using the DMA API. For this latter
> use, it also accesses the buffer's content using the mapping provided by
> dma_alloc_coherent() instead of creating a new one. All nouveau_bos are
> supposed to be accessed using nouveau_bo_rd32(), and this function
> handles the case of a DMA-API-allocated object by detecting that the
> result of ttm_kmap_obj_virtual() is NULL.
>
> But as it turns out, OUT_RINGp() also calls ttm_kmap_obj_virtual() in
> order to perform a memcpy and uses its result directly - which means we
> are doing memcpy on a NULL pointer. We never caught this because we
> typically do not use Nouveau's fbcon with an ARM setup.
>
> I don't really like this special access for coherent objects, and
> actually had a patch in my tree to attempt to remove it (attached).
> Although it is not the whole solution (see below), the issue should at
> least not be visible with it applied - could you confirm?

Hi Robin, could you confirm whether the attached patch in my previous
mail helps with your problem?

Thanks!
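
For readers following the thread, the failure mode described in the quoted mail can be illustrated with a small user-space sketch in C. This is not the Nouveau code and not the attached patch. The struct and every `sketch_*` name are hypothetical, invented for illustration. It only models the idea that a kmap-style lookup may return NULL for a DMA-API-allocated buffer, so any bulk copy has to fall back to the DMA mapping (or fail cleanly) instead of passing that NULL straight to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object; NOT the real struct nouveau_bo. */
struct sketch_bo {
    /* Models the result of a ttm_kmap_obj_virtual()-style lookup.
     * NULL when the buffer was allocated through the DMA API. */
    uint32_t *kmap_virtual;
    /* Models the CPU address returned by a dma_alloc_coherent()-style
     * allocation. NULL when the buffer was not allocated that way. */
    uint32_t *dma_virtual;
};

/* Pick whichever CPU mapping exists; returns NULL if there is none. */
static uint32_t *sketch_bo_cpu_addr(const struct sketch_bo *bo)
{
    return bo->kmap_virtual ? bo->kmap_virtual : bo->dma_virtual;
}

/* Single-word read that tolerates a NULL kmap result by falling back. */
static uint32_t sketch_bo_rd32(const struct sketch_bo *bo, size_t index)
{
    const uint32_t *base = sketch_bo_cpu_addr(bo);

    assert(base != NULL); /* caller must pass a buffer with a CPU mapping */
    return base[index];
}

/* Bulk copy into the buffer. Using bo->kmap_virtual directly here would
 * hand NULL to memcpy() for DMA-API buffers, which is the crash pattern
 * described above. Returns 0 on success, -1 if there is no CPU mapping. */
static int sketch_bo_copy_in(struct sketch_bo *bo, size_t index,
                             const uint32_t *data, size_t count)
{
    uint32_t *base = sketch_bo_cpu_addr(bo);

    if (!base)
        return -1;
    memcpy(base + index, data, count * sizeof(*data));
    return 0;
}
```

With a buffer whose kmap_virtual is NULL and whose dma_virtual points at valid storage, sketch_bo_copy_in() succeeds and sketch_bo_rd32() reads the copied words back. With both pointers NULL the copy is refused rather than dereferencing NULL.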