Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp1152768pxb; Wed, 6 Apr 2022 09:58:08 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxiEGmtQ3SywzBnKCQLrSQNJ8dwLwPiKwonlqQfkbLPU4H4uTzpRfwnF0eWitq6WCgtO32+ X-Received: by 2002:a05:6a00:1741:b0:4fa:f5bc:30bd with SMTP id j1-20020a056a00174100b004faf5bc30bdmr9704817pfc.0.1649264288531; Wed, 06 Apr 2022 09:58:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649264288; cv=none; d=google.com; s=arc-20160816; b=WVu+BgVepJHaEUgyisuYaKakg7du+3JLaf/UDIbI5C1jZqVCtNt0ZaCMMFn684ss4I qDjlgmWPh0y9mPDR3Fhxwxhcep8Xm2RW3fk8vpWNV6EwkG5RGH/OhrQbjYrwbeQTWOvT koVkFMqnXXQekJ6l27d7cqjcNmBdU/vhTc+g1WAKENSSSQvrrJO3ukWOYwaYvGt9sb3F dpKb5YkC7m7rpr6UjfYqgbY1YB8qm2NzvBCa1DMXpBTyS8y6RY6AnaCCC3mvxVB5yneD myMTbeF8W++yvsk7fMTr/C+8j5iF1waLdDNqsRGIDgAlfcptwrKP4vyfmYmQZDe52GeF ONOw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=0IeTkY1yhaJPK6VDqHKztL6/sdrLKeK5VaUPik01uco=; b=WlWozbFqzAcGIo1Ntg+guBq3az8uWEyXrwyI3f+gRD+3IYX5vJ0hq4MwURJvW5chzV bK9aGqOwtUuQs/tkyQzTTJCodCWFZJk0MycXs/ZUFKomdQtzj0oVJLkvn8pxzP0NgbwX aJ0jzJBjLwFnrruByu3WPLDxjYi/zVkEj+S4k3a+MWDzulwKFF8PmMhZTTyZKl+FfjW+ 8UKE2TcRJ0yoZFY+bbrrImrDTkFUTmc9Rvyo+Bi04PVNtf9GBxXwNFUrrQfEN6KpaT/V ZJAf9ZpsI8+PAc9D7BnIxmTiMxbBfGN5J3WPr2cViZul1HBZ8RJmOnjyyiNxYI6HvGBH OK/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=WbP9ySUZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id x37-20020a634a25000000b003828dd9d0fdsi16472961pga.843.2022.04.06.09.58.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Apr 2022 09:58:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=WbP9ySUZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 4DFD51F6219; Wed, 6 Apr 2022 09:13:25 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234900AbiDFQPT (ORCPT + 99 others); Wed, 6 Apr 2022 12:15:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237134AbiDFQPF (ORCPT ); Wed, 6 Apr 2022 12:15:05 -0400 Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 221902A26E for ; Wed, 6 Apr 2022 06:41:37 -0700 (PDT) Received: by mail-ej1-x62d.google.com with SMTP id l26so4402167ejx.1 for ; Wed, 06 Apr 2022 06:41:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=0IeTkY1yhaJPK6VDqHKztL6/sdrLKeK5VaUPik01uco=; b=WbP9ySUZSPgHLrHFRxIUWqAkKlwtz3/lphdxBTdMOVQuUT5+HfJKYrAg4eUTEsXZkz 8e8KhdnBO1xUEtEoMVUbtRmhxsR8BiyPPc5TfM+9lfBW5O/VYyOW+YLyas51CMQO09SJ sZRGZOroZ5LEsaydgXPFKgZgbG1xzpflqC0GHSWYpUCcjg8C4j+ZieSFKHPLTUOmbrOS FoxY0v0cP6iQBZ+ms6c4zzRlAar3PgroXfqXIn9AfuiaBKa3RqrlilCrfE7ZwizUQ6Gj m53Hcr11tJjGhloLapsO3Q2jt656PpXU7Rt7ClPkwJ/n3+3r+zLKxwgtFNDXTmcfVLZU ZYlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=0IeTkY1yhaJPK6VDqHKztL6/sdrLKeK5VaUPik01uco=; b=6Iw7zn8Y1Xtvd20tsENPlDb4tHQSagUGPRqJrltTo2QrwgzX42OaRnGC9RghfWwshd qRRVXr49J5rZS9WBhNDisxop1ZhOHVuljtmAwnh0kihFJ27WjGftNXoU9RRK3YL6Npcq MwCtSsAqiPrKOYrSDcxiiryBNrPS1eYzY1ohgLp6zRkYgyhS4wF0M62b0CoKOcHiQ/Qm PMfKCLJbD7RKuJi7s3Ao4MN5h5HpAU4kngFyMQ5SMwD5Vo5kr7MH0TZu8AO7pGqtNDcD fC99iu3/j4O5NjOyO+Ss5HdYOWd58CgdZ+vK5MbRegAN903zsQo6oyAH8tN9jAEw9bRN GgBw== X-Gm-Message-State: AOAM533wYYMyiYInkFQ8tvnPIVSWb4Tf3tHzELpUZxSdmAFbPkFb8g+m gtjvnj2VPHbn/tq7iasnt3iYfbP189Se29E2Xos= X-Received: by 2002:a17:907:980b:b0:6e0:71d9:c87e with SMTP id ji11-20020a170907980b00b006e071d9c87emr8142309ejc.510.1649252495645; Wed, 06 Apr 2022 06:41:35 -0700 (PDT) MIME-Version: 1.0 References: <20220404222132.12740-1-h0tc0d3@gmail.com> In-Reply-To: From: =?UTF-8?B?0JPRgNC40LPQvtGA0LjQuQ==?= Date: Wed, 6 Apr 2022 16:41:34 +0300 Message-ID: Subject: Re: [PATCH] drm/amdkfd: Fix potential NULL pointer dereference To: Felix Kuehling Cc: Rodrigo Siqueira , Melissa Wen , "Pan, Xinhui" , LKML , amd-gfx list , David Airlie , Maling list - DRI developers , Alex Deucher , =?UTF-8?Q?Christian_K=C3=B6nig?= Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Thanks. You are right. I found a potential bug, and as I understand it, the code only applies to the Aldebaran GPU and I can not check the correctness of the code. I only test code on my navi 10 and run GPU stress tests. My knowledge of amdgpu is limited, and fixing potential bugs allows me to learn more about amdgpu code. But there are many that I still don't understand. In any case, we need to fix the code to eliminate problems in the future. Regards, Grigory. =D0=B2=D1=82, 5 =D0=B0=D0=BF=D1=80. 2022 =D0=B3. =D0=B2 20:00, Felix Kuehli= ng : > > Am 2022-04-04 um 18:21 schrieb Grigory Vasilyev: > > In the amdgpu_amdkfd_get_xgmi_bandwidth_mbytes function, > > the peer_adev pointer can be NULL and is passed to amdgpu_xgmi_get_num_= links. > > > > In amdgpu_xgmi_get_num_links, peer_adev pointer is dereferenced > > without any checks: peer_adev->gmc.xgmi.node_id . > > What's worse, peer_adev is uninitialized with an undefined value if src > is NULL. So that code was definitely bogus. > > However, I think your patch will result in incorrect results. Currently > amdgpu_amdkfd_get_xgmi_bandwidth is always called with is_min=3Dtrue if > src=3D=3DNULL, and with is_min=3Dfalse if src!=3DNULL. The intention is, = that we > assume a single XGMI link in the case that src=3D=3DNULL. That means the > is_min parameter is redundant. What we should do instead is, assume that > num_links=3D=3D1 if src=3D=3DNULL, and drop the is_min parameter. > > That would keep things working the way they do now, and prevent > potential problems in the future. > > Regards, > Felix > > > > > > Signed-off-by: Grigory Vasilyev > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/d= rm/amd/amdgpu/amdgpu_amdkfd.c > > index be1a55f8b8c5..1a1006b18016 100644 > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > > @@ -541,11 +541,10 @@ int amdgpu_amdkfd_get_xgmi_bandwidth_mbytes(struc= t amdgpu_device *dst, > > struct amdgpu_device *adev =3D dst, *peer_adev; > > int num_links; > > > > - if (adev->asic_type !=3D CHIP_ALDEBARAN) > > + if (!src || adev->asic_type !=3D CHIP_ALDEBARAN) > > return 0; > > > > - if (src) > > - peer_adev =3D src; > > + peer_adev =3D src; > > > > /* num links returns 0 for indirect peers since indirect route is= unknown. */ > > num_links =3D is_min ? 1 : amdgpu_xgmi_get_num_links(adev, peer_a= dev);