Message-ID: <02789f9b-ff16-b419-097f-b97b56afad57@igalia.com>
Date: Thu, 29 Jun 2023 10:11:06 -0300
Subject: Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations
From: André Almeida
To: Christian König
Cc: pierre-eric.pelloux-prayer@amd.com, Randy Dunlap, Daniel Vetter, 'Marek Olšák', Michel Dänzer, Simon Ser, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Timur Kristóf, amd-gfx@lists.freedesktop.org, Pekka Paalanen, Daniel Stone, Rob Clark, Samuel Pitoiset, kernel-dev@igalia.com, Bas Nieuwenhuizen, alexander.deucher@amd.com, Pekka Paalanen, Dave Airlie, christian.koenig@amd.com
References: <20230627132323.115440-1-andrealmeid@igalia.com> <1dbeb507-3f18-1b5d-37be-fcfd60a1c0d4@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 27/06/2023 18:17, André Almeida wrote:
> On 27/06/2023 14:47, Christian König wrote:
>> On 27.06.23 at 15:23, André Almeida wrote:
>>> Create a section that specifies how to deal with DRM device resets for
>>> kernel and userspace drivers.
>>>
>>> Acked-by: Pekka Paalanen
>>> Signed-off-by: André Almeida
>>> ---
>>>
>>> v4: https://lore.kernel.org/lkml/20230626183347.55118-1-andrealmeid@igalia.com/
>>>
>>> Changes:
>>>  - Grammar fixes (Randy)
>>>
>>>  Documentation/gpu/drm-uapi.rst | 68 ++++++++++++++++++++++++++++++++++
>>>  1 file changed, 68 insertions(+)
>>>
>>> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
>>> index 65fb3036a580..3cbffa25ed93 100644
>>> --- a/Documentation/gpu/drm-uapi.rst
>>> +++ b/Documentation/gpu/drm-uapi.rst
>>> @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third handler for
>>>   mmapped regular files. Threads cause additional pain with signal
>>>   handling as well.
>>> +Device reset
>>> +============
>>> +
>>> +The GPU stack is really complex and is prone to errors, from hardware bugs,
>>> +faulty applications and everything in between the many layers. Some errors
>>> +require resetting the device in order to make the device usable again. This
>>> +sections describes the expectations for DRM and usermode drivers when a
>>> +device resets and how to propagate the reset status.
>>> +
>>> +Kernel Mode Driver
>>> +------------------
>>> +
>>> +The KMD is responsible for checking if the device needs a reset, and to perform
>>> +it as needed. Usually a hang is detected when a job gets stuck executing. KMD
>>> +should keep track of resets, because userspace can query any time about the
>>> +reset stats for an specific context.
>>
>> Maybe drop the part "for a specific context". Essentially the reset
>> query could use global counters instead and we won't need the context
>> any more here.
>>
>
> Right, I wrote it like this to reflect how it's currently implemented.
>
> If I follow correctly what you meant, KMD could always notify the global
> count for UMD, and we would move to the UMD the responsibility to manage
> the reset counters, right? This would also simplify my
> DRM_IOCTL_GET_RESET proposal. I'll apply your suggestion to the next doc
> version.
>

Actually, if we drop the context identifier, we would lose the ability to
track which context is the guilty one. The Vulkan API doesn't seem to care
about this, but OpenGL does.

>> Apart from that this sounds good to me, feel free to add my rb.
>>
>> Regards,
>> Christian.
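
To make the trade-off in my reply concrete, here is a toy model in Python.
It is only a sketch under my own assumptions: ToyKmd, ToyContext and the
two query methods are hypothetical names, and none of this reflects how
any real driver or the DRM_IOCTL_GET_RESET proposal is implemented. It
shows that a global counter alone lets userspace notice a reset but not
say who caused it, which is what OpenGL robustness reporting (guilty vs.
innocent context reset) needs.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResetStatus(Enum):
    NO_RESET = "no reset"
    GUILTY = "guilty context reset"
    INNOCENT = "innocent context reset"
    UNKNOWN = "unknown context reset"


@dataclass
class ToyKmd:
    """Hypothetical stand-in for a kernel driver's reset bookkeeping."""
    global_resets: int = 0
    guilty_resets: dict = field(default_factory=dict)  # context id -> count

    def reset(self, guilty_ctx):
        # Record one device reset and blame it on the given context.
        self.global_resets += 1
        self.guilty_resets[guilty_ctx] = self.guilty_resets.get(guilty_ctx, 0) + 1


@dataclass
class ToyContext:
    """Hypothetical UMD-side context that remembers the counters it last saw."""
    ctx_id: int
    seen_global: int = 0
    seen_guilty: int = 0

    def query_global_only(self, kmd):
        # With only a global counter, a reset is visible but blame is not.
        changed = kmd.global_resets != self.seen_global
        self.seen_global = kmd.global_resets
        return ResetStatus.UNKNOWN if changed else ResetStatus.NO_RESET

    def query_per_context(self, kmd):
        # With per-context data, the UMD can tell guilty from innocent.
        guilty = kmd.guilty_resets.get(self.ctx_id, 0)
        changed = kmd.global_resets != self.seen_global
        was_guilty = guilty != self.seen_guilty
        self.seen_global, self.seen_guilty = kmd.global_resets, guilty
        if not changed:
            return ResetStatus.NO_RESET
        return ResetStatus.GUILTY if was_guilty else ResetStatus.INNOCENT
```

In this model, after a reset blamed on context 1, every context using
query_global_only() can only report an unknown reset, while
query_per_context() lets context 1 report guilty and the others innocent.
A Vulkan-style UMD that only needs to know the device was lost would be
fine with the first variant.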