From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Qu Huang, Felix Kuehling, Alex Deucher
Subject: [PATCH 5.11 088/152] drm/amdkfd: dqm fence memory corruption
Date: Mon, 5 Apr 2021 10:53:57 +0200
Message-Id: <20210405085037.113739625@linuxfoundation.org>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210405085034.233917714@linuxfoundation.org>
References: <20210405085034.233917714@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Qu Huang

commit e92049ae4548ba09e53eaa9c8f6964b07ea274c9 upstream.

The amdgpu driver uses a 4-byte data type for the DQM fence memory and
passes the GPU address of that memory to the microcode through the query
status PM4 message. However, both the query status PM4 message definition
and the microcode handle the fence as 8 bytes. Only 4 bytes are allocated
for the fence, but the microcode writes 8 bytes, so memory is corrupted.

Changes since v1:
  * Change dqm->fence_addr to a u64 pointer to fix this issue, and update
    query_status and amdkfd_fence_wait_timeout to use a 64-bit fence
    value so that they are consistent.

Signed-off-by: Qu Huang
Reviewed-by: Felix Kuehling
Signed-off-by: Felix Kuehling
Signed-off-by: Alex Deucher
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman
---
 drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c               | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 6 +++---
 drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c       | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c    | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c    | 2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h                 | 8 ++++----
 7 files changed, 12 insertions(+), 12 deletions(-)

--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -155,7 +155,7 @@ static int dbgdev_diq_submit_ib(struct k
 
 	/* Wait till CP writes sync code: */
 	status = amdkfd_fence_wait_timeout(
-			(unsigned int *) rm_state,
+			rm_state,
 			QUEUESTATE__ACTIVE, 1500);
 
 	kfd_gtt_sa_free(dbgdev->dev, mem_obj);
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1167,7 +1167,7 @@ static int start_cpsch(struct device_que
 	if (retval)
 		goto fail_allocate_vidmem;
 
-	dqm->fence_addr = dqm->fence_mem->cpu_ptr;
+	dqm->fence_addr = (uint64_t *)dqm->fence_mem->cpu_ptr;
 	dqm->fence_gpu_addr = dqm->fence_mem->gpu_addr;
 
 	init_interrupts(dqm);
@@ -1340,8 +1340,8 @@ out:
 	return retval;
 }
 
-int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
-				unsigned int fence_value,
+int amdkfd_fence_wait_timeout(uint64_t *fence_addr,
+				uint64_t fence_value,
 				unsigned int timeout_ms)
 {
 	unsigned long end_jiffies = msecs_to_jiffies(timeout_ms) + jiffies;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.h
@@ -192,7 +192,7 @@ struct device_queue_manager {
 	uint16_t		vmid_pasid[VMID_NUM];
 	uint64_t		pipelines_addr;
 	uint64_t		fence_gpu_addr;
-	unsigned int		*fence_addr;
+	uint64_t		*fence_addr;
 	struct kfd_mem_obj	*fence_mem;
 	bool			active_runlist;
 	int			sched_policy;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -347,7 +347,7 @@ fail_create_runlist_ib:
 }
 
 int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
-			uint32_t fence_value)
+			uint64_t fence_value)
 {
 	uint32_t *buffer, size;
 	int retval = 0;
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_v9.c
@@ -283,7 +283,7 @@ static int pm_unmap_queues_v9(struct pac
 }
 
 static int pm_query_status_v9(struct packet_manager *pm, uint32_t *buffer,
-			uint64_t fence_address,	uint32_t fence_value)
+			uint64_t fence_address,	uint64_t fence_value)
 {
 	struct pm4_mes_query_status *packet;
 
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager_vi.c
@@ -263,7 +263,7 @@ static int pm_unmap_queues_vi(struct pac
 }
 
 static int pm_query_status_vi(struct packet_manager *pm, uint32_t *buffer,
-			uint64_t fence_address,	uint32_t fence_value)
+			uint64_t fence_address,	uint64_t fence_value)
 {
 	struct pm4_mes_query_status *packet;
 
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -1003,8 +1003,8 @@ int pqm_get_wave_state(struct process_qu
 		       u32 *ctl_stack_used_size,
 		       u32 *save_area_used_size);
 
-int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
-			      unsigned int fence_value,
+int amdkfd_fence_wait_timeout(uint64_t *fence_addr,
+			      uint64_t fence_value,
 			      unsigned int timeout_ms);
 
 /* Packet Manager */
@@ -1040,7 +1040,7 @@ struct packet_manager_funcs {
 			uint32_t filter_param, bool reset,
 			unsigned int sdma_engine);
 	int (*query_status)(struct packet_manager *pm, uint32_t *buffer,
-			uint64_t fence_address,	uint32_t fence_value);
+			uint64_t fence_address,	uint64_t fence_value);
 	int (*release_mem)(uint64_t gpu_addr, uint32_t *buffer);
 
 	/* Packet sizes */
@@ -1062,7 +1062,7 @@ int pm_send_set_resources(struct packet_
 				struct scheduling_resources *res);
 int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues);
 int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
-				uint32_t fence_value);
+				uint64_t fence_value);
 
 int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 			enum kfd_unmap_queues_filter mode,
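
Illustration only, not part of the patch: the size mismatch described in
the commit message can be reproduced in plain userspace C. This is a
minimal sketch under stated assumptions. The struct layouts and every name
in it (narrow_layout, wide_layout, fake_cp_write_fence, and the two check
functions) are hypothetical stand-ins, not amdkfd code; the "microcode" is
modelled as a helper that always stores 8 bytes at the fence address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer layout before the fix: a 4-byte fence slot that is
 * immediately followed by unrelated data. */
struct narrow_layout {
	uint32_t fence;
	uint32_t neighbour;
};

/* Hypothetical buffer layout after the fix: the fence slot is 8 bytes. */
struct wide_layout {
	uint64_t fence;
	uint32_t neighbour;
};

/* The 8-byte store below must stay inside the narrow struct. */
static_assert(sizeof(struct narrow_layout) == 8, "unexpected padding");

/* Stand-in for the microcode: it always writes a 64-bit fence value. */
void fake_cp_write_fence(void *fence_addr, uint64_t value)
{
	memcpy(fence_addr, &value, sizeof(value));
}

/* Returns 1 if the data next to a 4-byte fence survives the write. */
int neighbour_survives_narrow(void)
{
	struct narrow_layout buf = { .fence = 0, .neighbour = 0xdeadbeef };

	/* 8 bytes land on a 4-byte slot, so the neighbour is overwritten. */
	fake_cp_write_fence(&buf, 1);
	return buf.neighbour == 0xdeadbeef;
}

/* Returns 1 if the data next to an 8-byte fence survives the write. */
int neighbour_survives_wide(void)
{
	struct wide_layout buf = { .fence = 0, .neighbour = 0xdeadbeef };

	/* 8 bytes land on an 8-byte slot, so the neighbour is untouched. */
	fake_cp_write_fence(&buf, 1);
	return buf.fence == 1 && buf.neighbour == 0xdeadbeef;
}
```

In this sketch neighbour_survives_narrow() returns 0 and
neighbour_survives_wide() returns 1, which mirrors why the patch widens
fence_addr and the fence value to 64 bits instead of changing the writer.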