Date: Tue, 24 Aug 2021 18:37:41 +0100
From: Catalin Marinas
To: Alex Bee
Cc: Will Deacon, Andrew Morton, Anshuman Khandual,
    Linux Kernel Mailing List, linux-mm@kvack.org, Linux ARM,
    Mike Rapoport, Robin Murphy
Subject: Re: [BUG 5.14] arm64/mm: dma memory mapping fails (in some cases)
Message-ID: <20210824173741.GC623@arm.com>

Hi Alex,

Thanks for the report.

On Tue, Aug 24, 2021 at 03:40:47PM +0200, Alex Bee wrote:
> it seems there is a regression in arm64 memory mapping in 5.14, since it
> fails on Rockchip RK3328 when the pl330 dmac tries to map with:
>
> [    8.921909] ------------[ cut here ]------------
> [    8.921940] WARNING: CPU: 2 PID: 373 at kernel/dma/mapping.c:235 dma_map_resource+0x68/0xc0
> [    8.921973] Modules linked in: spi_rockchip(+) fuse
> [    8.921996] CPU: 2 PID: 373 Comm: systemd-udevd Not tainted 5.14.0-rc7 #1
> [    8.922004] Hardware name: Pine64 Rock64 (DT)
> [    8.922011] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [    8.922018] pc : dma_map_resource+0x68/0xc0
> [    8.922026] lr : pl330_prep_slave_fifo+0x78/0xd0
> [    8.922040] sp : ffff800012102ae0
> [    8.922043] x29: ffff800012102ae0 x28: ffff000005c94800 x27: 0000000000000000
> [    8.922056] x26: ffff000000566bd0 x25: 0000000000000001 x24: 0000000000000001
> [    8.922067] x23: 0000000000000002 x22: ffff000000628c00 x21: 0000000000000001
> [    8.922078] x20: ffff000000566bd0 x19: 0000000000000001 x18: 0000000000000000
> [    8.922089] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [    8.922100] x14: 0000000000000277 x13: 0000000000000001 x12: 0000000000000000
> [    8.922111] x11: 0000000000000001 x10: 00000000000008e0 x9 : ffff800012102a80
> [    8.922123] x8 : ffff000000d14b80 x7 : ffff0000fe7b12f0 x6 : ffff0000fe7b1100
> [    8.922134] x5 : fffffc000000000f x4 : 0000000000000000 x3 : 0000000000000001
> [    8.922145] x2 : 0000000000000001 x1 : 00000000ff190800 x0 : ffff000000628c00
> [    8.922158] Call trace:
> [    8.922163]  dma_map_resource+0x68/0xc0
> [    8.922173]  pl330_prep_slave_sg+0x58/0x220
> [    8.922181]  rockchip_spi_prepare_dma+0xd8/0x2c0 [spi_rockchip]
> [    8.922208]  rockchip_spi_transfer_one+0x294/0x3d8 [spi_rockchip]
[...]
> Note: This does not relate to the spi driver - when disabling this device in
> the device tree it fails for any other (i2s, for instance) which uses dma.
> Commenting out the failing check at [1], however, helps and the mapping
> works again.

Do you know which address dma_map_resource() is trying to map (maybe add
some printk())? It's not supposed to map RAM, hence the warning. Random
guess, the address is 0xff190800 (based on the x1 above but the regs
might as well be mangled).

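That guess can be sanity-checked against the memory node quoted further down. The following is only a minimal sketch, not kernel code: it decodes the `reg` cells from the report and tests whether 0xff190800 lies inside the described RAM. The helper names `decode_reg` and `is_ram` are made up for illustration, and it assumes #address-cells = 2 and #size-cells = 2, which fits the eight-cell property but is not stated in the report.

```python
# Sketch only: decode the DT "memory" reg cells quoted in the report and
# check whether a physical address falls inside the described RAM.
# Assumption: #address-cells = 2 and #size-cells = 2 (not stated in the report).

def decode_reg(cells, addr_cells=2, size_cells=2):
    """Return a list of (base, size) tuples from flat 32-bit DT cells."""
    stride = addr_cells + size_cells
    ranges = []
    for i in range(0, len(cells), stride):
        base = 0
        for cell in cells[i:i + addr_cells]:
            base = (base << 32) | cell
        size = 0
        for cell in cells[i + addr_cells:i + stride]:
            size = (size << 32) | cell
        if size:  # skip empty entries such as the trailing all-zero one
            ranges.append((base, size))
    return ranges


def is_ram(addr, ranges):
    """True if addr lies inside any [base, base + size) range."""
    return any(base <= addr < base + size for base, size in ranges)


# reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
REG = [0x00, 0x200000, 0x00, 0xfee00000, 0x00, 0x00, 0x00, 0x00]
ranges = decode_reg(REG)

for base, size in ranges:
    print(f"RAM: [{base:#x}, {base + size:#x})")
print("0xff190800 is RAM:", is_ram(0xff190800, ranges))
```

Under those assumptions the single range ends at 0xff000000, which agrees with the zone range in the boot log, so 0xff190800 would sit just above the end of RAM. If the x1 value really is the address being mapped, the warning would be firing for a non-RAM address.
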
> I tried to follow the recent changes for arm64 mm which could relate to the
> check failing at [1] and reverting
>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> helps and makes it work again, but I'm 100% uncertain if that commit is
> really the culprit.
>
> Note, that the firmware (legacy u-boot) injects memory configuration in the
> device tree as follows:
>
> /memreserve/    0x00000000fcefc000 0x000000000000d000;
> / {
> ..
>     compatible = "pine64,rock64\0rockchip,rk3328";
> ..
>     memory {
>         reg = <0x00 0x200000 0x00 0xfee00000 0x00 0x00 0x00 0x00>;
>         device_type = "memory";
>     };
>
> ..
> }

Either pfn_valid() gets confused in 5.14 or something is wrong with the
DT. I have a suspicion it's the former since reverting the above commit
makes it disappear.

> So: there is a "hole" in the mappable memory and reading the commit message
> of
>   commit a7d9f306ba70 ("arm64: drop pfn_valid_within() and simplify
> pfn_valid()")
> suggests, there was a change for that case recently.

I think the change from the arm64 pfn_valid() to the generic one is
avoiding the call to memblock_is_memory(). I wonder whether pfn_valid()
returns true just because we have a struct page available but the memory
may have been reserved.

Cc'ing Mike R.

> I also noticed there is a diff in the kernel log regarding memory init up
> until 5.13.12 it says
>
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] On node 0 totalpages: 1043968
> [    0.000000]   DMA zone: 16312 pages used for memmap
> [    0.000000]   DMA zone: 0 pages reserved
> [    0.000000]   DMA zone: 1043968 pages, LIFO batch:63
>
> In contrary in 5.14-rc7 it says:
>
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] On node 0, zone DMA: 512 pages in unavailable ranges
> [    0.000000] On node 0, zone DMA: 4096 pages in unavailable ranges
>
> (note the "unavailable ranges")
> I'm uncertain again here, if that diff is expected behavior because of those
> recent mm changes for arm64.
>
> After reverting
>   commit 16c9afc77660 ("arm64/mm: drop HAVE_ARCH_PFN_VALID")
> the log changes to
>
> [    0.000000] Zone ranges:
> [    0.000000]   DMA      [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000]   DMA32    empty
> [    0.000000]   Normal   empty
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000000200000-0x00000000feffffff]
> [    0.000000] Initmem setup node 0 [mem
> 0x0000000000200000-0x00000000feffffff]
>
> (no DMA zones here)
>
> As you might have noticed I have _zero_ clue about memory mapping and dma
> subsystem - so let me know if there is any more information needed for that
> and thanks for your help.

Adding Robin as well, he has a better clue than us on DMA ;).

> Alex
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/mapping.c?id=e22ce8eb631bdc47a4a4ea7ecf4e4ba499db4f93#n235

-- 
Catalin