Subject: Re: drm/radeon spamming alloc_contig_range: [xxx, yyy) PFNs busy busy
To: "Robin H. Johnson", Michal Hocko, Michal Nazarewicz,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 dri-devel@lists.freedesktop.org
References: <20161130092239.GD18437@dhcp22.suse.cz>
 <20161130132848.GG18432@dhcp22.suse.cz>
Cc: Marek Szyprowski, Joonsoo Kim, Minchan Kim
From: Vlastimil Babka
Message-ID: <9d6e922b-d853-f24d-353c-25fbac38115b@suse.cz>
Date: Wed, 30 Nov 2016 22:24:59 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

[add more CC's]

On 11/30/2016 09:19 PM, Robin H. Johnson wrote:
> Somewhere in the Radeon/DRM codebase, CMA page allocation has either
> regressed in the timeline of 4.5->4.9, and/or the drm/radeon code is
> doing something different with pages.

Could be that it didn't use dma_generic_alloc_coherent() before, or you
didn't have the generic CMA pool configured. What's the output of
"grep CMA" on your .config? Or are there any kernel boot options with cma
in the name? With the default config this should not be used on x86.

> Given that I haven't seen ANY other reports of this, I'm inclined to
> believe the problem is drm/radeon specific (if I don't start X, I can't
> reproduce the problem).

It's rather CMA specific, the allocation attempts just can't be 100%
reliable due to how CMA works.
The question is whether it should be spewing in the log in the context of
dma-cma, which has a fallback allocation option. It even uses
__GFP_NOWARN, perhaps the CMA path should respect that?

> The rate of the problem starts slow, and also is relatively low on an idle
> system (my screens blank at night, no xscreensaver running), but it still ramps
> up over time (to the point of generating 2.5GB/hour of "(timestamp)
> alloc_contig_range: [83e4d9, 83e4da) PFNs busy"), with various addresses (~100
> unique ranges for a day).
>
> My X workload is ~50 chrome tabs and ~20 terminals (over 3x 24" monitors w/ 9
> virtual desktops per monitor).

So IIUC, except for the messages, everything actually works fine?

> I added a stack trace & rate limit to alloc_contig_range's PFNs busy message
> (patch in previous email on LKML/-MM lists); and they point to radeon.
>
> alloc_contig_range: [83f2a3, 83f2a4) PFNs busy
> CPU: 3 PID: 8518 Comm: X Not tainted 4.9.0-rc7-00024-g6ad4037e18ec #27
> Hardware name: System manufacturer System Product Name/P8Z68 DELUXE, BIOS 0501 05/09/2011
> ffffad50c3d7f730 ffffffffb236c873 000000000083f2a3 000000000083f2a4
> ffffad50c3d7f810 ffffffffb2183b38 ffff999dff4d8040 0000000020fca8c0
> 000000000083f400 000000000083f000 000000000083f2a3 0000000000000004
> Call Trace:
> [] dump_stack+0x85/0xc2
> [] alloc_contig_range+0x368/0x370
> [] cma_alloc+0x127/0x2e0
> [] dma_alloc_from_contiguous+0x38/0x40
> [] dma_generic_alloc_coherent+0x91/0x1d0
> [] x86_swiotlb_alloc_coherent+0x25/0x50
> [] ttm_dma_populate+0x48a/0x9a0 [ttm]
> [] ? __kmalloc+0x1b6/0x250
> [] radeon_ttm_tt_populate+0x22a/0x2d0 [radeon]
> [] ? ttm_dma_tt_init+0x67/0xc0 [ttm]
> [] ttm_tt_bind+0x37/0x70 [ttm]
> [] ttm_bo_handle_move_mem+0x528/0x5a0 [ttm]
> [] ? shmem_alloc_inode+0x1a/0x30
> [] ttm_bo_validate+0x114/0x130 [ttm]
> [] ? _raw_write_unlock+0xe/0x10
> [] ttm_bo_init+0x31d/0x3f0 [ttm]
> [] radeon_bo_create+0x19b/0x260 [radeon]
> [] ? radeon_update_memory_usage.isra.0+0x50/0x50 [radeon]
> [] radeon_gem_object_create+0xad/0x180 [radeon]
> [] radeon_gem_create_ioctl+0x5f/0xf0 [radeon]
> [] drm_ioctl+0x21b/0x4d0 [drm]
> [] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
> [] radeon_drm_ioctl+0x4c/0x80 [radeon]
> [] do_vfs_ioctl+0x92/0x5c0
> [] SyS_ioctl+0x79/0x90
> [] do_syscall_64+0x73/0x190
> [] entry_SYSCALL64_slow_path+0x25/0x25
>
> The Radeon card in my case is a VisionTek HD 7750 Eyefinity 6, which is
> reported as:
>
> 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cape Verde PRO [Radeon HD 7750/8740 / R7 250E] (prog-if 00 [VGA controller])
> Subsystem: VISIONTEK Cape Verde PRO [Radeon HD 7750/8740 / R7 250E]
> Flags: bus master, fast devsel, latency 0, IRQ 58
> Memory at c0000000 (64-bit, prefetchable) [size=256M]
> Memory at fbe00000 (64-bit, non-prefetchable) [size=256K]
> I/O ports at e000 [size=256]
> Expansion ROM at 000c0000 [disabled] [size=128K]
> Capabilities: [48] Vendor Specific Information: Len=08
> Capabilities: [50] Power Management version 3
> Capabilities: [58] Express Legacy Endpoint, MSI 00
> Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010
> Capabilities: [150] Advanced Error Reporting
> Kernel driver in use: radeon
> Kernel modules: radeon, amdgpu
>
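
To make the __GFP_NOWARN idea above a bit more concrete, here is a rough
user-space sketch, not a kernel patch. Everything prefixed with sketch_ or
SKETCH_ is a hypothetical name made up for illustration; the real
alloc_contig_range() and dma-cma code have different signatures and do not
currently take such a flag. It only models the behaviour being suggested:
the busy message is printed unless the caller asked for silence, and a
caller that has a fallback asks for silence.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's __GFP_NOWARN bit (value arbitrary). */
#define SKETCH_GFP_NOWARN 0x1u

/* Counts printed "PFNs busy" lines so the effect of the flag is observable. */
static unsigned int sketch_warnings_printed;

/*
 * Mock of a contiguous-range allocation attempt. range_busy simulates a
 * page in [start, end) that could not be isolated or migrated.
 * Returns 0 on success and -1 on failure; the message is printed only
 * when the caller did not pass SKETCH_GFP_NOWARN.
 */
static int sketch_alloc_contig_range(unsigned long start, unsigned long end,
                                     unsigned int gfp_flags, bool range_busy)
{
    if (!range_busy)
        return 0;

    if (!(gfp_flags & SKETCH_GFP_NOWARN)) {
        fprintf(stderr, "alloc_contig_range: [%lx, %lx) PFNs busy\n",
                start, end);
        sketch_warnings_printed++;
    }
    return -1;
}

/*
 * Mock of a caller that has a fallback, as the dma-cma path does: try the
 * contiguous area quietly, and use the ordinary allocation path if that
 * fails. Returns the name of the path that was used.
 */
static const char *sketch_dma_alloc(unsigned long start, unsigned long end,
                                    bool range_busy)
{
    if (sketch_alloc_contig_range(start, end, SKETCH_GFP_NOWARN,
                                  range_busy) == 0)
        return "cma";
    return "fallback";
}
```

With this shape, a busy range hit through sketch_dma_alloc() ends up on the
fallback path without adding anything to the log, while a caller that passes
no flag still gets the message.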