Message-ID: <9df969a8-08b1-2b5a-3a86-9a1918f1949b@arm.com>
Date: Tue, 11 Oct 2022 13:15:57 +0100
Subject: Re: Issue seen since commit f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP")
From: Robin Murphy
To: Jerry Snitselaar, Christoph Hellwig
Cc: Joerg Roedel, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
References: <20221010185739.vgw27m7fpmftly3q@cantor>
In-Reply-To: <20221010185739.vgw27m7fpmftly3q@cantor>

On 2022-10-10 19:57, Jerry Snitselaar wrote:
> I've been looking at an odd issue that shows up with commit
> f5ff79fddf0e ("dma-mapping: remove CONFIG_DMA_REMAP"). What is being
> seen is the bnx2fc driver calling dma_free_coherent(), and eventually
> hits the BUG_ON() in vunmap(). bnx2fc_free_session_resc() does a
> spin_lock_bh() around the dma_free_coherent() calls, and looking at
> preempt.h that will trigger in_interrupt() to return positive, so that
> makes sense. The really odd part is this only happens with the
> shutdown of the kernel after a system install. Reboots after that do not
> hit the BUG_ON() in vunmap().

Most likely a difference in IOMMU config/parameters between the
installer and the installed kernel - if the latter is defaulting to
passthrough then it won't be remapping (assuming the device is
coherent).

> I still need to grab a system and try to see what it is doing on the
> subsequent shutdowns, because it seems to me that any time
> bnx2fc_free_session_resc() is called it will end up there, unless the
> allocs are not coming from vmalloc() in the later boots. Between the
> comments in dma_free_attrs(), and preempt.h, dma_free_coherent()
> shouldn't be called under a spin_lock_bh(), yes? I think the comments
> in dma_free_attrs() might be out of date with commit f5ff79fddf0e
> ("dma-mapping: remove CONFIG_DMA_REMAP") in place since now it is more
> general that you can land in vunmap(). Also, should that WARN_ON() in
> dma_free_attrs() trigger as well for the BH disabled case?
>
> It was also reproduced with a 6.0-rc5 kernel build[1].

Looking at the history of that comment I guess I was just trying to
capture the most common case to explain the original motivation for
having the WARN_ON(). It was never meant to imply that that's the *only*
reason, especially since iommu-dma was already well established by that
point. That warning has been present on x86 in one form or another for
15 years, so I guess the real issue at hand is the difference between
irqs_disabled() and in_interrupt()?

As far as that particular driver goes it looks rather questionable
anyway - it seems like a terrible design flaw if a race between
consuming things and freeing them can exist at all, but then it looks
like bnx2fc_arm_cq() might actually program the hardware to *reuse* a CQ
which may already be waiting to be freed as soon as the lock is
dropped... that can't be good :/

Thanks,
Robin.
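
For reference, the irqs_disabled() versus in_interrupt() distinction raised above can be modelled in userspace. This is only a rough sketch, not kernel code: the field widths are the ones I recall from include/linux/preempt.h (the NMI field is left out), struct cpu_model and every model_*() helper are hypothetical names invented for illustration, and the "+ 1" assumes a kernel built with preempt counting enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Field widths as recalled from include/linux/preempt.h; NMI field omitted. */
#define PREEMPT_BITS 8
#define SOFTIRQ_BITS 8
#define HARDIRQ_BITS 4

#define SOFTIRQ_SHIFT PREEMPT_BITS
#define HARDIRQ_SHIFT (SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define FIELD_MASK(bits, shift) (((1UL << (bits)) - 1) << (shift))
#define SOFTIRQ_MASK FIELD_MASK(SOFTIRQ_BITS, SOFTIRQ_SHIFT)
#define HARDIRQ_MASK FIELD_MASK(HARDIRQ_BITS, HARDIRQ_SHIFT)

#define SOFTIRQ_OFFSET (1UL << SOFTIRQ_SHIFT)
/* Disabling BHs adds twice SOFTIRQ_OFFSET; serving a softirq adds it once. */
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)

/* Hypothetical stand-in for per-CPU state; not a kernel structure. */
struct cpu_model {
    unsigned long preempt_count;
    bool irqs_off;
};

/* Models in_interrupt(): nonzero if any hardirq or softirq bits are set. */
static unsigned long model_in_interrupt(const struct cpu_model *cpu)
{
    return cpu->preempt_count & (HARDIRQ_MASK | SOFTIRQ_MASK);
}

/* Models irqs_disabled(): tracks the interrupt flag, not preempt_count. */
static bool model_irqs_disabled(const struct cpu_model *cpu)
{
    return cpu->irqs_off;
}

/* Models spin_lock_bh(): BHs disabled plus one preempt-disable level. */
static void model_spin_lock_bh(struct cpu_model *cpu)
{
    cpu->preempt_count += SOFTIRQ_DISABLE_OFFSET + 1;
}

/* Models spin_unlock_bh(): undoes exactly what the lock side added. */
static void model_spin_unlock_bh(struct cpu_model *cpu)
{
    assert(cpu->preempt_count >= SOFTIRQ_DISABLE_OFFSET + 1);
    cpu->preempt_count -= SOFTIRQ_DISABLE_OFFSET + 1;
}

/* Prints which of the two checks would fire in the current state. */
static void model_report(const char *when, const struct cpu_model *cpu)
{
    printf("%-22s in_interrupt=%s irqs_disabled=%s\n", when,
           model_in_interrupt(cpu) ? "yes" : "no",
           model_irqs_disabled(cpu) ? "yes" : "no");
}
```

Under these assumptions, a zero-initialised struct cpu_model passed through model_spin_lock_bh() has softirq bits set while irqs_off stays false. That would match the report: the in_interrupt() check in vunmap() fires under spin_lock_bh(), while a check based on irqs_disabled() stays quiet.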