From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Amber Lin, Felix Kuehling, Alex Deucher, Sasha Levin
Subject: [PATCH 5.10 083/215] drm/amdkfd: Fix circular lock in nocpsch path
Date: Thu, 15 Jul 2021 20:37:35 +0200
Message-Id: <20210715182614.108260458@linuxfoundation.org>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20210715182558.381078833@linuxfoundation.org>
References: <20210715182558.381078833@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain;
 charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Amber Lin

[ Upstream commit a7b2451d31cfa2e8aeccf3b35612ce33f02371fc ]

Calling free_mqd inside of destroy_queue_nocpsch_locked can cause a
circular lock. destroy_queue_nocpsch_locked is called under a DQM lock,
which is taken in MMU notifiers, potentially in FS reclaim context.
Taking another lock, which is BO reservation lock from free_mqd, while
causing an FS reclaim inside the DQM lock creates a problematic circular
lock dependency. Therefore move free_mqd out of
destroy_queue_nocpsch_locked and call it after unlocking DQM.

Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---
 .../drm/amd/amdkfd/kfd_device_queue_manager.c  | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index b971532e69eb..ffb3d37881a8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -486,9 +486,6 @@ static int destroy_queue_nocpsch_locked(struct device_queue_manager *dqm,
 	if (retval == -ETIME)
 		qpd->reset_wavefronts = true;
 
-
-	mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
-
 	list_del(&q->list);
 	if (list_empty(&qpd->queues_list)) {
 		if (qpd->reset_wavefronts) {
@@ -523,6 +520,8 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	uint64_t sdma_val = 0;
 	struct kfd_process_device *pdd = qpd_to_pdd(qpd);
+	struct mqd_manager *mqd_mgr =
+		dqm->mqd_mgrs[get_mqd_type_from_queue_type(q->properties.type)];
 
 	/* Get the SDMA queue stats */
 	if ((q->properties.type == KFD_QUEUE_TYPE_SDMA) ||
@@ -540,6 +539,8 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 		pdd->sdma_past_activity_counter += sdma_val;
 	dqm_unlock(dqm);
 
+	mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
+
 	return retval;
 }
 
@@ -1632,7 +1633,7 @@ static int set_trap_handler(struct device_queue_manager *dqm,
 static int process_termination_nocpsch(struct device_queue_manager *dqm,
 		struct qcm_process_device *qpd)
 {
-	struct queue *q, *next;
+	struct queue *q;
 	struct device_process_node *cur, *next_dpn;
 	int retval = 0;
 	bool found = false;
@@ -1640,12 +1641,19 @@ static int process_termination_nocpsch(struct device_queue_manager *dqm,
 	dqm_lock(dqm);
 
 	/* Clear all user mode queues */
-	list_for_each_entry_safe(q, next, &qpd->queues_list, list) {
+	while (!list_empty(&qpd->queues_list)) {
+		struct mqd_manager *mqd_mgr;
 		int ret;
 
+		q = list_first_entry(&qpd->queues_list, struct queue, list);
+		mqd_mgr = dqm->mqd_mgrs[get_mqd_type_from_queue_type(
+				q->properties.type)];
 		ret = destroy_queue_nocpsch_locked(dqm, qpd, q);
 		if (ret)
 			retval = ret;
+		dqm_unlock(dqm);
+		mqd_mgr->free_mqd(mqd_mgr, q->mqd, q->mqd_mem_obj);
+		dqm_lock(dqm);
 	}
 
 	/* Unregister process */
-- 
2.30.2
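
For readers who want the locking pattern from the commit message in isolation, here is a minimal userspace sketch. It is not the driver code: a pthread mutex stands in for the DQM lock, plain free() stands in for free_mqd, and every name in it (struct manager, detach_first_locked, release_queue, manager_drain) is hypothetical. It only illustrates the ordering the patch establishes: unlink the entry while holding the lock, drop the lock, release the resource, then retake the lock and re-read the list head instead of relying on a saved "next" pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in the amdkfd driver. */
struct queue {
	struct queue *next;
	void *mqd;		/* resource whose release may need other locks */
};

struct manager {
	pthread_mutex_t lock;	/* plays the role of the DQM lock */
	struct queue *head;	/* singly linked list of live queues */
};

static void manager_init(struct manager *m)
{
	pthread_mutex_init(&m->lock, NULL);
	m->head = NULL;
}

/* Allocate a queue and push it onto the list. Returns 0 on success. */
static int manager_add(struct manager *m)
{
	struct queue *q = malloc(sizeof(*q));

	if (!q)
		return -1;
	q->mqd = malloc(64);
	if (!q->mqd) {
		free(q);
		return -1;
	}
	pthread_mutex_lock(&m->lock);
	q->next = m->head;
	m->head = q;
	pthread_mutex_unlock(&m->lock);
	return 0;
}

/* Caller holds m->lock. Only unlinks the entry; releases nothing. */
static struct queue *detach_first_locked(struct manager *m)
{
	struct queue *q = m->head;

	if (q)
		m->head = q->next;
	return q;
}

/* Must be called WITHOUT m->lock held. */
static void release_queue(struct queue *q)
{
	free(q->mqd);
	free(q);	/* simplification: the patch frees only the MQD here */
}

/* Drain the list, dropping the lock around each release. */
static int manager_drain(struct manager *m)
{
	int drained = 0;

	pthread_mutex_lock(&m->lock);
	while (m->head) {
		/* Re-read the head each pass; the list may change while unlocked. */
		struct queue *q = detach_first_locked(m);

		assert(q != NULL);
		pthread_mutex_unlock(&m->lock);
		release_queue(q);
		pthread_mutex_lock(&m->lock);
		drained++;
	}
	pthread_mutex_unlock(&m->lock);
	return drained;
}
```

Because the lock is dropped inside the loop, an iterator that caches the next element would be unsafe, which is presumably why the patch replaces list_for_each_entry_safe with a while loop over list_first_entry. The sketch does not reproduce the lockdep report itself, only the resulting call order.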