Received: by 2002:a05:6358:3188:b0:123:57c1:9b43 with SMTP id q8csp38198582rwd; Wed, 12 Jul 2023 04:39:12 -0700 (PDT) X-Google-Smtp-Source: APBJJlHEF/1AA9BTxBwjsZLKNTXRvye4noHcDUx1bcO1HMktNBw1WxfB/chs8D6r6I4YaCwDXOm6 X-Received: by 2002:a05:6358:4306:b0:134:c739:75f7 with SMTP id r6-20020a056358430600b00134c73975f7mr15785048rwc.12.1689161952296; Wed, 12 Jul 2023 04:39:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689161952; cv=none; d=google.com; s=arc-20160816; b=G11xRGz7cRWszzBeLsDVxEpvVV1+XMUltSWhZgdElb8txlzJjgPtE71Mnyn8ctYnWA 2tlX24cmic0Q/Nk/KI0+x6vrtSQ6Or221oCJ6VmLtD+JxIdedzA7gxUfysMBx2xxmvgL PJhO9OyeLc0RaUmDeAl25MEvnLvau7JesHrVx01SbLUzWHtNNrzd5NbHmOju5E59flF0 4PTQ3Y4rMrLsuo0GXoeSg+k9R/sXkzn+bRGiRR9dhFUtu0hosgAs6z0GIyt/JecxkeaQ qkm862LN3f3XaGiMVu029hguDXVxlIsBzqcTzvW8izn63PTEwdjVH0RL5r3o/giD9/b1 scIQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:user-agent :content-transfer-encoding:references:in-reply-to:date:cc:to:from :subject:message-id; bh=BNaW/r4RGLPRrgSPcouYEKUax978ZWbgPsbsPlzVFSU=; fh=as83xKXz40zkMGkThvvAdXZoMmgBjHMDzpLxpmxp5DI=; b=RM4A9gip3xUBJjIA94pP62RupO+4RjOvQaax8VYy1yFD+7D0X5ifAJPCBNBt/SyLhv evokCo9qRj8DNss6cbgeopS4c57Uk8VR/rNPqVQlF3as9QLZ983xKMD/Hz38UkoOP0My KM9GR3DWJHnuWT8BTo7AOiV3is8s3Ucbt/l+7H2rO1UySRbeiBu/f4mYoCwO/uG48F2h dmIqWHK4HATG0V1t779NzPDqDkVQL0ju/kIaWH2KbNMC+j9Xg6I6SQJ8B12RU1eeRKHj +9pbk9YmwAUXYhN8RaTh182v2wrGlkf3SRitbgHYETdFzin3z5KgXHo9giLHvOE3Pbo/ RsuQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s25-20020a656459000000b0055ad2d6c8aesi2979485pgv.772.2023.07.12.04.38.59; Wed, 12 Jul 2023 04:39:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232798AbjGLK4W convert rfc822-to-8bit (ORCPT + 99 others); Wed, 12 Jul 2023 06:56:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36860 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231822AbjGLK4R (ORCPT ); Wed, 12 Jul 2023 06:56:17 -0400 Received: from metis.ext.pengutronix.de (metis.ext.pengutronix.de [IPv6:2001:67c:670:201:290:27ff:fe1d:cc33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0FA24E77 for ; Wed, 12 Jul 2023 03:56:15 -0700 (PDT) Received: from ptz.office.stw.pengutronix.de ([2a0a:edc0:0:900:1d::77] helo=[IPv6:::1]) by metis.ext.pengutronix.de with esmtps (TLS1.3:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1qJXVx-0005cG-C5; Wed, 12 Jul 2023 12:56:09 +0200 Message-ID: Subject: Re: [PATCH 3/6] drm/amdgpu: Rework coredump to use memory dynamically From: Lucas Stach To: Christian =?ISO-8859-1?Q?K=F6nig?= , =?ISO-8859-1?Q?Andr=E9?= Almeida , dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: pierre-eric.pelloux-prayer@amd.com, 'Marek =?UTF-8?Q?Ol=C5=A1=C3=A1k=27?= , Timur =?ISO-8859-1?Q?Krist=F3f?= , michel.daenzer@mailbox.org, Samuel Pitoiset , kernel-dev@igalia.com, alexander.deucher@amd.com Date: Wed, 12 Jul 2023 12:56:08 +0200 In-Reply-To: <3764d627-d632-5754-0bcc-a150c157d9f9@amd.com> References: <20230711213501.526237-1-andrealmeid@igalia.com> <20230711213501.526237-4-andrealmeid@igalia.com> <0e7f2b0cc29ac77d4a55d0de6a66c477d867fbf7.camel@pengutronix.de> <3764d627-d632-5754-0bcc-a150c157d9f9@amd.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8BIT User-Agent: Evolution 3.46.4 (3.46.4-1.fc37) MIME-Version: 1.0 X-SA-Exim-Connect-IP: 2a0a:edc0:0:900:1d::77 X-SA-Exim-Mail-From: l.stach@pengutronix.de X-SA-Exim-Scanned: No (on metis.ext.pengutronix.de); SAEximRunCond expanded to false X-PTX-Original-Recipient: linux-kernel@vger.kernel.org X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Sorry, accidentally hit sent on the previous mail. Am Mittwoch, dem 12.07.2023 um 12:39 +0200 schrieb Christian König: > Am 12.07.23 um 10:59 schrieb Lucas Stach: > > Am Mittwoch, dem 12.07.2023 um 10:37 +0200 schrieb Christian König: > > > Am 11.07.23 um 23:34 schrieb André Almeida: > > > > Instead of storing coredump information inside amdgpu_device struct, > > > > move if to a proper separated struct and allocate it dynamically. This > > > > will make it easier to further expand the logged information. > > > Verry big NAK to this. The problem is that memory allocation isn't > > > allowed during a GPU reset. > > > > > > What you could do is to allocate the memory with GFP_ATOMIC or similar, > > > but for a large structure that might not be possible. > > > > > I'm still not fully clear on what the rules are here. In etnaviv we do > > devcoredump allocation in the GPU reset path with __GFP_NOWARN | > > __GFP_NORETRY, which means the allocation will kick memory reclaim if > > necessary, but will just give up if no memory could be made available > > easily. This satisfies the forward progress guarantee in the absence of > > successful memory allocation, which is the most important property in > > this path, I think. > > > > However, I'm not sure if the reclaim could lead to locking issues or > > something like that with the more complex use-cases with MMU notifiers > > and stuff like that. Christian, do you have any experience or > > information that would be good to share in this regard? > > Yeah, very good question. > > __GFP_NORETRY isn't sufficient as far as I know. Reclaim must be > completely suppressed to be able to allocate in a GPU reset handler. > > Daniel added lockdep annotation to some of the dma-fence signaling paths > and this yielded quite a bunch of potential deadlocks. > > It's not even that reclaim itself waits for dma_fences (that can happen, > but is quite unlikely), but rather that reclaim needs locks and under > those locks we then wait for dma_fences. > > We should probably add a define somewhere which documents that > (GFP_ATOMIC | __NO_WARN) should be used in the GPU reset handlers when > allocating memory for coredumps. > Hm, if the problem is the direct reclaim path where we might recurse on a lock through those indirect dependencies then we should document this somewhere. kswapd reclaim should be fine as far as I can see, as we'll guarantee progress without waiting for the background reclaim. I don't think it's appropriate to dip into the atomic allocation reserves for a best-effort thing like writing the devcoredump, so I think this should be GFP_NOWAIT, which will also avoid the direct reclaim path. Regards, Lucas > Regards, > Christian. > > > > > Regards, > > Lucas > > > > > Regards, > > > Christian. > > > > > > > Signed-off-by: André Almeida > > > > --- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 14 +++-- > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 65 ++++++++++++++-------- > > > > 2 files changed, 51 insertions(+), 28 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > index dbe062a087c5..e1cc83a89d46 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h > > > > @@ -1068,11 +1068,6 @@ struct amdgpu_device { > > > > uint32_t *reset_dump_reg_list; > > > > uint32_t *reset_dump_reg_value; > > > > int num_regs; > > > > -#ifdef CONFIG_DEV_COREDUMP > > > > - struct amdgpu_task_info reset_task_info; > > > > - bool reset_vram_lost; > > > > - struct timespec64 reset_time; > > > > -#endif > > > > > > > > bool scpm_enabled; > > > > uint32_t scpm_status; > > > > @@ -1085,6 +1080,15 @@ struct amdgpu_device { > > > > uint32_t aid_mask; > > > > }; > > > > > > > > +#ifdef CONFIG_DEV_COREDUMP > > > > +struct amdgpu_coredump_info { > > > > + struct amdgpu_device *adev; > > > > + struct amdgpu_task_info reset_task_info; > > > > + struct timespec64 reset_time; > > > > + bool reset_vram_lost; > > > > +}; > > > > +#endif > > > > + > > > > static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev) > > > > { > > > > return container_of(ddev, struct amdgpu_device, ddev); > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > index e25f085ee886..23b9784e9787 100644 > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > > > > @@ -4963,12 +4963,17 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) > > > > return 0; > > > > } > > > > > > > > -#ifdef CONFIG_DEV_COREDUMP > > > > +#ifndef CONFIG_DEV_COREDUMP > > > > +static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, > > > > + struct amdgpu_reset_context *reset_context) > > > > +{ > > > > +} > > > > +#else > > > > static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, > > > > size_t count, void *data, size_t datalen) > > > > { > > > > struct drm_printer p; > > > > - struct amdgpu_device *adev = data; > > > > + struct amdgpu_coredump_info *coredump = data; > > > > struct drm_print_iterator iter; > > > > int i; > > > > > > > > @@ -4982,21 +4987,21 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, > > > > drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); > > > > drm_printf(&p, "kernel: " UTS_RELEASE "\n"); > > > > drm_printf(&p, "module: " KBUILD_MODNAME "\n"); > > > > - drm_printf(&p, "time: %lld.%09ld\n", adev->reset_time.tv_sec, adev->reset_time.tv_nsec); > > > > - if (adev->reset_task_info.pid) > > > > + drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); > > > > + if (coredump->reset_task_info.pid) > > > > drm_printf(&p, "process_name: %s PID: %d\n", > > > > - adev->reset_task_info.process_name, > > > > - adev->reset_task_info.pid); > > > > + coredump->reset_task_info.process_name, > > > > + coredump->reset_task_info.pid); > > > > > > > > - if (adev->reset_vram_lost) > > > > + if (coredump->reset_vram_lost) > > > > drm_printf(&p, "VRAM is lost due to GPU reset!\n"); > > > > - if (adev->num_regs) { > > > > + if (coredump->adev->num_regs) { > > > > drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); > > > > > > > > - for (i = 0; i < adev->num_regs; i++) > > > > + for (i = 0; i < coredump->adev->num_regs; i++) > > > > drm_printf(&p, "0x%08x: 0x%08x\n", > > > > - adev->reset_dump_reg_list[i], > > > > - adev->reset_dump_reg_value[i]); > > > > + coredump->adev->reset_dump_reg_list[i], > > > > + coredump->adev->reset_dump_reg_value[i]); > > > > } > > > > > > > > return count - iter.remain; > > > > @@ -5004,14 +5009,34 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, > > > > > > > > static void amdgpu_devcoredump_free(void *data) > > > > { > > > > + kfree(data); > > > > } > > > > > > > > -static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev) > > > > +static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, > > > > + struct amdgpu_reset_context *reset_context) > > > > { > > > > + struct amdgpu_coredump_info *coredump; > > > > struct drm_device *dev = adev_to_drm(adev); > > > > > > > > - ktime_get_ts64(&adev->reset_time); > > > > - dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_KERNEL, > > > > + coredump = kmalloc(sizeof(*coredump), GFP_KERNEL); > > > > + > > > > + if (!coredump) { > > > > + DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__); > > > > + return; > > > > + } > > > > + > > > > + memset(coredump, 0, sizeof(*coredump)); > > > > + > > > > + coredump->reset_vram_lost = vram_lost; > > > > + > > > > + if (reset_context->job && reset_context->job->vm) > > > > + coredump->reset_task_info = reset_context->job->vm->task_info; > > > > + > > > > + coredump->adev = adev; > > > > + > > > > + ktime_get_ts64(&coredump->reset_time); > > > > + > > > > + dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_KERNEL, > > > > amdgpu_devcoredump_read, amdgpu_devcoredump_free); > > > > } > > > > #endif > > > > @@ -5119,15 +5144,9 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle, > > > > goto out; > > > > > > > > vram_lost = amdgpu_device_check_vram_lost(tmp_adev); > > > > -#ifdef CONFIG_DEV_COREDUMP > > > > - tmp_adev->reset_vram_lost = vram_lost; > > > > - memset(&tmp_adev->reset_task_info, 0, > > > > - sizeof(tmp_adev->reset_task_info)); > > > > - if (reset_context->job && reset_context->job->vm) > > > > - tmp_adev->reset_task_info = > > > > - reset_context->job->vm->task_info; > > > > - amdgpu_reset_capture_coredumpm(tmp_adev); > > > > -#endif > > > > + > > > > + amdgpu_coredump(tmp_adev, vram_lost, reset_context); > > > > + > > > > if (vram_lost) { > > > > DRM_INFO("VRAM is lost due to GPU reset!\n"); > > > > amdgpu_inc_vram_lost(tmp_adev); >