From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Dominic DeMarco, Mahesh Salgaonkar, "Aneesh Kumar K.V", Michael Ellerman
Subject: [PATCH 5.12 307/384] powerpc/eeh: Fix EEH handling for hugepages in ioremap space.
Date: Mon, 10 May 2021 12:21:36 +0200
Message-Id: <20210510102024.922923014@linuxfoundation.org>
In-Reply-To: <20210510102014.849075526@linuxfoundation.org>
References: <20210510102014.849075526@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Mahesh Salgaonkar

commit 5ae5bc12d0728db60a0aa9b62160ffc038875f1a upstream.

During EEH MMIO error checking, the current implementation fails to map
the (virtual) MMIO address back to the PCI device on radix with hugepage
mappings for I/O. As a result, the EEH event is never dispatched and no
recovery takes place, even when EEH capability has been enabled on the
device.

    eeh_check_failure(token)            # token = virtual MMIO address
      addr = eeh_token_to_phys(token);
      edev = eeh_addr_cache_get_dev(addr);
      if (!edev)
            return 0;
      eeh_dev_check_failure(edev);      <= Dispatch the EEH event

In case of hugepage mappings, eeh_token_to_phys() has a bug in the
virt -> phys translation that results in a wrong physical address. That
address is then passed to eeh_addr_cache_get_dev(), which matches it
against the cached PCI I/O address ranges to find the PCI device. The
lookup therefore finds no match and the EEH event never gets dispatched,
leaving the device in a failed state.

The commit 33439620680be ("powerpc/eeh: Handle hugepages in ioremap
space") introduced the following logic to translate virt to phys for
hugepage mappings:

    eeh_token_to_phys():
    +   pa = pte_pfn(*ptep);
    +
    +   /* On radix we can do hugepage mappings for io, so handle that */
    +   if (hugepage_shift) {
    +           pa <<= hugepage_shift;          <= This is wrong
    +           pa |= token & ((1ul << hugepage_shift) - 1);
    +   }

This patch fixes the virt -> phys translation in the eeh_token_to_phys()
function.
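Worked example (an illustrative sketch, not part of the patch): pte_pfn()
yields a page frame number counted in base pages, so it has to be shifted
by PAGE_SHIFT even for a hugepage mapping; only the offset mask depends on
the hugepage size. The standalone C below compares the old and the new
arithmetic. The shift values (64K base pages, 2M hugepages), the PFN, the
token and all demo_* names are assumptions chosen for illustration. With
these values the old arithmetic yields the addr= value seen in the
"Before this patch" trace.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for this sketch only: 64K base pages and 2M hugepages. */
#define DEMO_PAGE_SHIFT     16
#define DEMO_HUGEPAGE_SHIFT 21

/* Pre-patch arithmetic: the PFN is shifted by the hugepage shift. */
static uint64_t demo_token_to_phys_old(uint64_t pfn, uint64_t token,
				       unsigned int hugepage_shift)
{
	uint64_t pa = pfn;

	if (hugepage_shift) {
		pa <<= hugepage_shift;	/* wrong: PFN counts base pages */
		pa |= token & ((UINT64_C(1) << hugepage_shift) - 1);
	} else {
		pa <<= DEMO_PAGE_SHIFT;
		pa |= token & ((UINT64_C(1) << DEMO_PAGE_SHIFT) - 1);
	}
	return pa;
}

/* Post-patch arithmetic: always shift by the base page shift. */
static uint64_t demo_token_to_phys_new(uint64_t pfn, uint64_t token,
				       unsigned int hugepage_shift)
{
	uint64_t pa = pfn;

	if (!hugepage_shift)
		hugepage_shift = DEMO_PAGE_SHIFT;

	pa <<= DEMO_PAGE_SHIFT;
	/* The offset within the mapping still uses the mapping size. */
	pa |= token & ((UINT64_C(1) << hugepage_shift) - 1);
	return pa;
}

/* Hypothetical inputs: a hugepage-aligned PFN and a made-up token. */
static void demo_check_translation(void)
{
	const uint64_t pfn = UINT64_C(0x4008180);
	const uint64_t token = UINT64_C(0xc00a00000000a510);

	assert(demo_token_to_phys_old(pfn, token, DEMO_HUGEPAGE_SHIFT) ==
	       UINT64_C(0x80103000a510));
	assert(demo_token_to_phys_new(pfn, token, DEMO_HUGEPAGE_SHIFT) ==
	       UINT64_C(0x4008180a510));
}
```

Calling demo_check_translation() from a small main() is enough to run the
comparison. The corrected result 0x4008180a510 falls inside the cached
range 0x0000040081800000-0x0000040081ffffff of 0030:01:00.0, while the old
result lies far outside every cached range.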
    $ cat /sys/kernel/debug/powerpc/eeh_address_cache
    mem addr range [0x0000040080000000-0x00000400807fffff]: 0030:01:00.1
    mem addr range [0x0000040080800000-0x0000040080ffffff]: 0030:01:00.1
    mem addr range [0x0000040081000000-0x00000400817fffff]: 0030:01:00.0
    mem addr range [0x0000040081800000-0x0000040081ffffff]: 0030:01:00.0
    mem addr range [0x0000040082000000-0x000004008207ffff]: 0030:01:00.1
    mem addr range [0x0000040082080000-0x00000400820fffff]: 0030:01:00.0
    mem addr range [0x0000040082100000-0x000004008210ffff]: 0030:01:00.1
    mem addr range [0x0000040082110000-0x000004008211ffff]: 0030:01:00.0

Above is the list of cached I/O address ranges of pci 0030:01:00..

Before this patch:

Tracing 'arg1' of function eeh_addr_cache_get_dev() during error
injection clearly shows that 'addr=' contains a wrong physical address:

    kworker/u16:0-7 [001] .... 108.883775: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x80103000a510

dmesg shows no EEH recovery messages:

    [ 108.563768] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x9ae) != mcp_pulse (0x7fff)
    [ 108.563788] bnx2x: [bnx2x_hw_stats_update:870(eth2)]NIG timer max (4294967295)
    [ 108.883788] bnx2x: [bnx2x_acquire_hw_lock:2013(eth1)]lock_status 0xffffffff resource_bit 0x1
    [ 108.884407] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
    [ 108.884976] bnx2x 0030:01:00.0 eth1: MDC/MDIO access timeout
    <..>

After this patch:

The eeh_addr_cache_get_dev() trace shows the correct physical address:

    -0 [001] ..s. 1043.123828: eeh_addr_cache_get_dev: (eeh_addr_cache_get_dev+0xc/0xf0) addr=0x40080bc7cd8

dmesg logs show EEH recovery getting triggered:

    [ 964.323980] bnx2x: [bnx2x_timer:5801(eth2)]MFW seems hanged: drv_pulse (0x746f) != mcp_pulse (0x7fff)
    [ 964.323991] EEH: Recovering PHB#30-PE#10000
    [ 964.324002] EEH: PE location: N/A, PHB location: N/A
    [ 964.324006] EEH: Frozen PHB#30-PE#10000 detected
    <..>

Fixes: 33439620680b ("powerpc/eeh: Handle hugepages in ioremap space")
Cc: stable@vger.kernel.org # v5.3+
Reported-by: Dominic DeMarco
Signed-off-by: Mahesh Salgaonkar
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/161821396263.48361.2796709239866588652.stgit@jupiter
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/kernel/eeh.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -362,14 +362,11 @@ static inline unsigned long eeh_token_to
 	pa = pte_pfn(*ptep);
 
 	/* On radix we can do hugepage mappings for io, so handle that */
-	if (hugepage_shift) {
-		pa <<= hugepage_shift;
-		pa |= token & ((1ul << hugepage_shift) - 1);
-	} else {
-		pa <<= PAGE_SHIFT;
-		pa |= token & (PAGE_SIZE - 1);
-	}
+	if (!hugepage_shift)
+		hugepage_shift = PAGE_SHIFT;
 
+	pa <<= PAGE_SHIFT;
+	pa |= token & ((1ul << hugepage_shift) - 1);
 	return pa;
 }
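
For illustration only (not part of the patch): the lookup done by
eeh_addr_cache_get_dev() can be thought of as a range match against the
cached ranges shown in the eeh_address_cache listing above. The sketch
below uses a plain linear scan over a static table. That is a
simplification assumed here for brevity, not the kernel's actual data
structure, and every demo_* name is hypothetical. The ranges and the two
addresses are taken from the listing and the traces in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct demo_range {
	uint64_t start;		/* first physical address of the range */
	uint64_t end;		/* last physical address, inclusive */
	const char *dev;	/* PCI device owning the range */
};

/* Ranges copied from the eeh_address_cache listing. */
static const struct demo_range demo_cache[] = {
	{ UINT64_C(0x0000040080000000), UINT64_C(0x00000400807fffff), "0030:01:00.1" },
	{ UINT64_C(0x0000040080800000), UINT64_C(0x0000040080ffffff), "0030:01:00.1" },
	{ UINT64_C(0x0000040081000000), UINT64_C(0x00000400817fffff), "0030:01:00.0" },
	{ UINT64_C(0x0000040081800000), UINT64_C(0x0000040081ffffff), "0030:01:00.0" },
	{ UINT64_C(0x0000040082000000), UINT64_C(0x000004008207ffff), "0030:01:00.1" },
	{ UINT64_C(0x0000040082080000), UINT64_C(0x00000400820fffff), "0030:01:00.0" },
	{ UINT64_C(0x0000040082100000), UINT64_C(0x000004008210ffff), "0030:01:00.1" },
	{ UINT64_C(0x0000040082110000), UINT64_C(0x000004008211ffff), "0030:01:00.0" },
};

/* Return the owning device name, or NULL when no range contains addr. */
static const char *demo_cache_get_dev(uint64_t addr)
{
	size_t i;

	for (i = 0; i < sizeof(demo_cache) / sizeof(demo_cache[0]); i++) {
		if (addr >= demo_cache[i].start && addr <= demo_cache[i].end)
			return demo_cache[i].dev;
	}
	return NULL;
}

/* Check the two addr= values reported by the traces. */
static void demo_check_lookup(void)
{
	/* Address from the "Before this patch" trace: no match. */
	assert(demo_cache_get_dev(UINT64_C(0x80103000a510)) == NULL);
	/* Address from the "After this patch" trace: matches a range. */
	assert(strcmp(demo_cache_get_dev(UINT64_C(0x40080bc7cd8)),
		      "0030:01:00.1") == 0);
}
```

With the pre-patch address the lookup returns NULL, which corresponds to
the early "return 0" in eeh_check_failure(); with the post-patch address
it resolves to a device, so the EEH event can be dispatched.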