From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Alexey Kardashevskiy, Michael Ellerman
Subject: [PATCH 5.10 615/717] powerpc/powernv/npu: Do not attempt NPU2 setup on POWER8NVL NPU
Date: Mon, 28 Dec 2020 13:50:13 +0100
Message-Id: <20201228125050.381726299@linuxfoundation.org>
In-Reply-To: <20201228125020.963311703@linuxfoundation.org>
References: <20201228125020.963311703@linuxfoundation.org>

From: Alexey Kardashevskiy

commit b1198a88230f2ce50c271e22b82a8b8610b2eea9 upstream.

We execute certain NPU2 setup code (such as mapping an LPID to a device
in NPU2) unconditionally if an Nvlink bridge is detected. However, this
cannot succeed on POWER8NVL machines, and errors appear in dmesg. This is
harmless, as skiboot returns an error and the only place we check it is
vfio-pci, but that code does not get called on P8+ either.

This adds a check that the pnv_npu2_xxx helpers are called on a machine
with NPU2, which initializes pnv_phb::npu in pnv_npu2_init();
pnv_phb::npu==NULL on POWER8/NVL (Naples).

While at it, fix the NULL dereference in pnv_npu_peers_take_ownership/
pnv_npu_peers_release_ownership, which occurs when GPUs on the mentioned
P8s cause EEH. That happens if "vfio-pci" disables devices using the D3
power state; vfio-pci's disable_idle_d3 module parameter controls this
and must be set on Naples. The EEH handling clears the entire
pnv_ioda_pe struct in pnv_ioda_free_pe(), hence the NULL dereference.
We cannot recover from that, but at least we stop crashing.

Tested on
- POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm,
  NVIDIA GV100 10de:1db1 driver 418.39
- POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm,
  NVIDIA P100 10de:15f9 driver 396.47

Fixes: 1b785611e119 ("powerpc/powernv/npu: Add release_ownership hook")
Cc: stable@vger.kernel.org # 5.0
Signed-off-by: Alexey Kardashevskiy
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20201122073828.15446-1-aik@ozlabs.ru
Signed-off-by: Greg Kroah-Hartman
---
 arch/powerpc/platforms/powernv/npu-dma.c |   16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -385,7 +385,8 @@ static void pnv_npu_peers_take_ownership
 	for (i = 0; i < npucomp->pe_num; ++i) {
 		struct pnv_ioda_pe *pe = npucomp->pe[i];
 
-		if (!pe->table_group.ops->take_ownership)
+		if (!pe->table_group.ops ||
+		    !pe->table_group.ops->take_ownership)
 			continue;
 		pe->table_group.ops->take_ownership(&pe->table_group);
 	}
@@ -401,7 +402,8 @@ static void pnv_npu_peers_release_owners
 	for (i = 0; i < npucomp->pe_num; ++i) {
 		struct pnv_ioda_pe *pe = npucomp->pe[i];
 
-		if (!pe->table_group.ops->release_ownership)
+		if (!pe->table_group.ops ||
+		    !pe->table_group.ops->release_ownership)
 			continue;
 		pe->table_group.ops->release_ownership(&pe->table_group);
 	}
@@ -623,6 +625,11 @@ int pnv_npu2_map_lpar_dev(struct pci_dev
 		return -ENODEV;
 
 	hose = pci_bus_to_host(npdev->bus);
+	if (hose->npu == NULL) {
+		dev_info_once(&npdev->dev, "Nvlink1 does not support contexts");
+		return 0;
+	}
+
 	nphb = hose->private_data;
 
 	dev_dbg(&gpdev->dev, "Map LPAR opalid=%llu lparid=%u\n",
@@ -670,6 +677,11 @@ int pnv_npu2_unmap_lpar_dev(struct pci_d
 		return -ENODEV;
 
 	hose = pci_bus_to_host(npdev->bus);
+	if (hose->npu == NULL) {
+		dev_info_once(&npdev->dev, "Nvlink1 does not support contexts");
+		return 0;
+	}
+
 	nphb = hose->private_data;
 
 	dev_dbg(&gpdev->dev, "destroy context opalid=%llu\n",
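
The two guards in this patch can be tried outside the kernel with a small
userspace sketch. Everything below is a hypothetical stand-in written for
illustration: struct pe, struct hose, demo_ops, peers_take_ownership() and
map_lpar_dev() are not the kernel's definitions, and the opal_calls counter
only models "the firmware call would have been made". The sketch assumes, as
the commit message describes, that a cleared PE has a NULL ops pointer and
that a machine without NPU2 has a NULL npu pointer.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct table_group;

struct table_group_ops {
	void (*take_ownership)(struct table_group *tg);
};

struct table_group {
	const struct table_group_ops *ops;	/* NULL once the PE was cleared */
	int owned;
};

struct pe {
	struct table_group table_group;
};

static void mark_owned(struct table_group *tg)
{
	tg->owned = 1;
}

/* Example ops table with the hook populated. */
const struct table_group_ops demo_ops = {
	.take_ownership = mark_owned,
};

/*
 * Call take_ownership on every peer, skipping peers whose ops pointer or
 * hook is NULL. Returns how many hooks were actually called.
 */
int peers_take_ownership(struct pe **pes, int pe_num)
{
	int i, called = 0;

	for (i = 0; i < pe_num; ++i) {
		struct pe *pe = pes[i];

		/* Check ops itself before dereferencing it for the hook. */
		if (!pe->table_group.ops ||
		    !pe->table_group.ops->take_ownership)
			continue;
		pe->table_group.ops->take_ownership(&pe->table_group);
		called++;
	}
	return called;
}

/* Hypothetical stand-in for the host bridge; only the npu field matters. */
struct hose {
	void *npu;	/* NULL on machines without NPU2 in this sketch */
};

/*
 * Early-return guard: with no NPU2 state there is nothing to map, so
 * report success without touching firmware. Otherwise count one call.
 */
int map_lpar_dev(const struct hose *hose, int *opal_calls)
{
	if (hose->npu == NULL)
		return 0;

	(*opal_calls)++;
	return 0;
}
```

Passing a peer array that mixes a populated PE with a zeroed one exercises
the first guard: only the populated PE is marked owned. Returning 0 rather
than an error in the second guard follows the patch, which treats the
missing NPU2 as "nothing to do" instead of a failure.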