Message-ID: <9087ef09-e617-dcf3-343e-162f79dc3e51@mailbox.org>
Date: Wed, 26 Apr 2023 11:51:50 +0200
Subject: Re: [PATCH] drm/amdgpu: Mark contexts guilty for any reset type
From: Michel Dänzer
To: Marek Olšák, Christian König
Cc: Pierre-Eric Pelloux-Prayer, André Almeida,
 Linux Kernel Mailing List, dri-devel, "Tuikov, Luben",
 amd-gfx mailing list, kernel-dev@igalia.com, "Deucher, Alexander"
References: <20230424014324.218531-1-andrealmeid@igalia.com>
 <784561bb-0937-befc-3774-892d6f6a4318@mailbox.org>
 <19406ec5-79d6-e9e6-fbdd-eb2f4a872fc4@amd.com>
 <5262c73e-e77c-91f7-e49e-a9c3571e2cc9@mailbox.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/25/23 21:11, Marek Olšák wrote:
> The last 3 comments in this thread contain arguments that are false
> and were specifically pointed out as false 6 comments ago: Soft resets
> are just as fatal as hard resets. There is nothing better about soft
> resets.
> If the VRAM is lost completely, that's a different story, and if the
> hard reset is 100% unreliable, that's also a different story, but
> other than those two outliers, there is no difference between the two
> from the user point of view. Both can repeatedly hang if you don't
> prevent the app that caused the hang from using the GPU even if the
> app is not robust. The robustness context type doesn't matter here. By
> definition, no guilty app can continue after a reset, and no innocent
> apps affected by a reset can continue either because those can now
> hang too. That's how destructive all resets are. Personal anecdotes
> that the soft reset is better are just that, anecdotes.

You're trying to frame the situation as black or white, but reality is
shades of grey.

There's a similar situation with kernel Oopsen: In principle it's not
safe to continue executing the kernel after it hits an Oops, since it
might be in an inconsistent state, which could result in any kind of
misbehaviour. Still, the default behaviour is to continue executing, and
in most cases it turns out fine. Users who cannot accept the residual
risk can choose to make the kernel panic when it hits an Oops (either
via CONFIG_PANIC_ON_OOPS at build time, or via oops=panic on the kernel
command line). A kernel panic means that the machine basically freezes
from a user PoV, which would be worse as the default behaviour for most
users (because it would e.g. incur a higher risk of losing filesystem
data).

-- 
Earthling Michel Dänzer   | https://redhat.com
Libre software enthusiast | Mesa and Xwayland developer