Received: by 2002:a5b:505:0:0:0:0:0 with SMTP id o5csp3392318ybp; Sun, 6 Oct 2019 10:56:19 -0700 (PDT) X-Google-Smtp-Source: APXvYqz8xEO1awq3Oyy73R1VVfhtuBUGdrUj+B/ElOjCablxnD7JSx7OfKJWU3mf4Kccz3oUGoO5 X-Received: by 2002:a05:6402:1f4:: with SMTP id i20mr25556891edy.137.1570384579815; Sun, 06 Oct 2019 10:56:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570384579; cv=none; d=google.com; s=arc-20160816; b=KEJzNp53ryCuArm29tuZ6YmGsRZ+xXco37hwihMaQwwvaZ03hrCluPwxlWTfrPM5QQ GVRF/UPFRIRt4qc/gWHKeTI7rUr3J8P4tDV2hkwnYRY+3YA265sUdtE4aEG523esBe8Q zy09oxbv8qnT8qFYmdcoKntDUeaxolbswJkj8YgJafnsAqvZscySYXUhVZZBjMJiDP5K YaCWnSEvOvEBi2Z5B9s2EsAaXPTZeoDh2n91vtPAr8PEJArhgLhJiSG7gVpnvcFvBAX1 PnDuARWiJMzGsQXolFeUxZZMcbkRNB1U+fwUEqp1uGSIEQXz/xBdLBkNdw7UfBxUM8ua VUbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=Ohj4XXyMer1qOXaWi+5Wba81NWllOW499QO+eQkfCKA=; b=Fih89o3LXQdD7kfTJSzy9E3XTdeJqXfjPdCjfgvSTX0gDg+aN7PNFnZr/SZbizoMSy SQLUsIQtH2QKY+YfmDBvBk/tRhzI4cuqQYSinqrqoOyv+qDVj6zU6zifJHzUjmtSB57y GDDnXiigJohDqJnEq+sFSUn+xpX3T7xztt9O1A7vLA9l6rdIhTM72Z9mUIR6PtGFAbm8 +3ttWr4PYQ0eksH3EbKQslyOOfi7Jo6XlHx3bUM6+ijluklN03/RbcRTS8l9xAT76YHz Hz6b3ydMuvYSdG+7+l33F8soY0ywZFidGmb5xRSHcJy4+ebyeJ8SUcazV6RomGR9T5ko YwtQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wLJz+ctO; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b8si7643647edc.231.2019.10.06.10.55.55; Sun, 06 Oct 2019 10:56:19 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wLJz+ctO; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730047AbfJFRgC (ORCPT + 99 others); Sun, 6 Oct 2019 13:36:02 -0400 Received: from mail.kernel.org ([198.145.29.99]:34724 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730218AbfJFRf6 (ORCPT ); Sun, 6 Oct 2019 13:35:58 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id AB0462064A; Sun, 6 Oct 2019 17:35:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1570383357; bh=jiqWDvjr3y286qK70hFFRbHs5tN2gQs8Lk1YAW0uoAc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=wLJz+ctO6s9FgmAZIP0Zfd/BIkEVbf4HekMFACUj/NKwjXwflE0OTMf1DJz5tPtMP 2CW5tAvcvv442nRQr7exlJquMcIrTj8xqe0t5IjEXx057DmZw8Fydz3Od2Tj97u4li 21q5Dd3x3Trf/fEg9VRo1+udb0kSI3uHKYU5HF5I= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Oliver OHalloran , Michael Ellerman , Sasha Levin Subject: [PATCH 5.2 073/137] powerpc/eeh: Clean up EEH PEs after recovery finishes Date: Sun, 6 Oct 2019 19:20:57 +0200 Message-Id: <20191006171215.175180587@linuxfoundation.org> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191006171209.403038733@linuxfoundation.org> References: <20191006171209.403038733@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Oliver O'Halloran [ Upstream commit 799abe283e5103d48e079149579b4f167c95ea0e ] When the last device in an eeh_pe is removed the eeh_pe structure itself (and any empty parents) are freed since they are no longer needed. This results in a crash when a hotplug driver is involved since the following may occur: 1. Device is suprise removed. 2. Driver performs an MMIO, which fails and queues and eeh_event. 3. Hotplug driver receives a hotplug interrupt and removes any pci_devs that were under the slot. 4. pci_dev is torn down and the eeh_pe is freed. 5. The EEH event handler thread processes the eeh_event and crashes since the eeh_pe pointer in the eeh_event structure is no longer valid. Crashing is generally considered poor form. Instead of doing that use the fact PEs are marked as EEH_PE_INVALID to keep them around until the end of the recovery cycle, at which point we can safely prune any empty PEs. Signed-off-by: Oliver O'Halloran Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20190903101605.2890-2-oohall@gmail.com Signed-off-by: Sasha Levin --- arch/powerpc/kernel/eeh_driver.c | 36 ++++++++++++++++++++++++++++++-- arch/powerpc/kernel/eeh_event.c | 8 +++++++ arch/powerpc/kernel/eeh_pe.c | 23 +++++++++++++++++++- 3 files changed, 64 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c index 1fbe541856f5e..fe0c32fb9f96f 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc/kernel/eeh_driver.c @@ -744,6 +744,33 @@ static int eeh_reset_device(struct eeh_pe *pe, struct pci_bus *bus, */ #define MAX_WAIT_FOR_RECOVERY 300 + +/* Walks the PE tree after processing an event to remove any stale PEs. + * + * NB: This needs to be recursive to ensure the leaf PEs get removed + * before their parents do. Although this is possible to do recursively + * we don't since this is easier to read and we need to garantee + * the leaf nodes will be handled first. + */ +static void eeh_pe_cleanup(struct eeh_pe *pe) +{ + struct eeh_pe *child_pe, *tmp; + + list_for_each_entry_safe(child_pe, tmp, &pe->child_list, child) + eeh_pe_cleanup(child_pe); + + if (pe->state & EEH_PE_KEEP) + return; + + if (!(pe->state & EEH_PE_INVALID)) + return; + + if (list_empty(&pe->edevs) && list_empty(&pe->child_list)) { + list_del(&pe->child); + kfree(pe); + } +} + /** * eeh_handle_normal_event - Handle EEH events on a specific PE * @pe: EEH PE - which should not be used after we return, as it may @@ -782,8 +809,6 @@ void eeh_handle_normal_event(struct eeh_pe *pe) return; } - eeh_pe_state_mark(pe, EEH_PE_RECOVERING); - eeh_pe_update_time_stamp(pe); pe->freeze_count++; if (pe->freeze_count > eeh_max_freezes) { @@ -973,6 +998,12 @@ void eeh_handle_normal_event(struct eeh_pe *pe) return; } } + + /* + * Clean up any PEs without devices. While marked as EEH_PE_RECOVERYING + * we don't want to modify the PE tree structure so we do it here. + */ + eeh_pe_cleanup(pe); eeh_pe_state_clear(pe, EEH_PE_RECOVERING, true); } @@ -1045,6 +1076,7 @@ void eeh_handle_special_event(void) */ if (rc == EEH_NEXT_ERR_FROZEN_PE || rc == EEH_NEXT_ERR_FENCED_PHB) { + eeh_pe_state_mark(pe, EEH_PE_RECOVERING); eeh_handle_normal_event(pe); } else { pci_lock_rescan_remove(); diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c index 64cfbe41174b2..e36653e5f76b3 100644 --- a/arch/powerpc/kernel/eeh_event.c +++ b/arch/powerpc/kernel/eeh_event.c @@ -121,6 +121,14 @@ int __eeh_send_failure_event(struct eeh_pe *pe) } event->pe = pe; + /* + * Mark the PE as recovering before inserting it in the queue. + * This prevents the PE from being free()ed by a hotplug driver + * while the PE is sitting in the event queue. + */ + if (pe) + eeh_pe_state_mark(pe, EEH_PE_RECOVERING); + /* We may or may not be called in an interrupt context */ spin_lock_irqsave(&eeh_eventlist_lock, flags); list_add(&event->list, &eeh_eventlist); diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c index 854cef7b18f4d..f0813d50e0b1c 100644 --- a/arch/powerpc/kernel/eeh_pe.c +++ b/arch/powerpc/kernel/eeh_pe.c @@ -491,6 +491,7 @@ int eeh_add_to_parent_pe(struct eeh_dev *edev) int eeh_rmv_from_parent_pe(struct eeh_dev *edev) { struct eeh_pe *pe, *parent, *child; + bool keep, recover; int cnt; struct pci_dn *pdn = eeh_dev_to_pdn(edev); @@ -516,10 +517,21 @@ int eeh_rmv_from_parent_pe(struct eeh_dev *edev) */ while (1) { parent = pe->parent; + + /* PHB PEs should never be removed */ if (pe->type & EEH_PE_PHB) break; - if (!(pe->state & EEH_PE_KEEP)) { + /* + * XXX: KEEP is set while resetting a PE. I don't think it's + * ever set without RECOVERING also being set. I could + * be wrong though so catch that with a WARN. + */ + keep = !!(pe->state & EEH_PE_KEEP); + recover = !!(pe->state & EEH_PE_RECOVERING); + WARN_ON(keep && !recover); + + if (!keep && !recover) { if (list_empty(&pe->edevs) && list_empty(&pe->child_list)) { list_del(&pe->child); @@ -528,6 +540,15 @@ int eeh_rmv_from_parent_pe(struct eeh_dev *edev) break; } } else { + /* + * Mark the PE as invalid. At the end of the recovery + * process any invalid PEs will be garbage collected. + * + * We need to delay the free()ing of them since we can + * remove edev's while traversing the PE tree which + * might trigger the removal of a PE and we can't + * deal with that (yet). + */ if (list_empty(&pe->edevs)) { cnt = 0; list_for_each_entry(child, &pe->child_list, child) { -- 2.20.1