Reply-To: christian.koenig@amd.com
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
To: "Eric W.
Biederman", Andrey Grodzovsky
Cc: David.Panariti@amd.com, Oleg Nesterov, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org, Alexander.Deucher@amd.com, akpm@linux-foundation.org, Christian.Koenig@amd.com
References: <1524583836-12130-1-git-send-email-andrey.grodzovsky@amd.com> <1524583836-12130-3-git-send-email-andrey.grodzovsky@amd.com> <87muxsbmkp.fsf@xmission.com> <8840ac96-50c4-f94d-eb7c-f007940163f3@amd.com> <877eowa5qh.fsf@xmission.com> <20180425135552.GD7592@redhat.com> <20180425171757.GA10441@redhat.com> <874ljyu98e.fsf@xmission.com>
From: Christian König
Date: Mon, 30 Apr 2018 14:08:44 +0200
In-Reply-To: <874ljyu98e.fsf@xmission.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Eric,

sorry for the late response, I was on vacation last week.

On 26.04.2018 at 02:01, Eric W. Biederman wrote:
> Andrey Grodzovsky writes:
>
>> On 04/25/2018 01:17 PM, Oleg Nesterov wrote:
>>> On 04/25, Andrey Grodzovsky wrote:
>>>> here (drm_sched_entity_fini) is also a bad idea, but we still want to be
>>>> able to exit immediately
>>>> and not wait for GPU jobs completion when the reason for reaching this code
>>>> is because of KILL
>>>> signal to the user process who opened the device file.
>>> Can you hook f_op->flush method?

Thanks! That sounds like a really good idea to me, and we haven't investigated in that direction yet.

>> But this one is called for each task releasing a reference to the the file, so
>> not sure I see how this solves the problem.
> The big question is why do you need to wait during the final closing a
> file?

As always, it's because of historical reasons.
Initially, user space pushed commands directly to a hardware queue, and when a process finished we didn't need to wait for anything. Then the GPU scheduler was introduced, which delayed pushing the jobs to the hardware queue to a later point in time. This wait was then added to maintain backward compatibility and not break userspace (but see below).

> The wait can be terminated so the wait does not appear to be simply a
> matter of correctness.

Well, when the process is killed we don't care about correctness any more; we just want to get rid of it as quickly as possible (OOM situation etc.).

But it is perfectly possible that a process submits some render commands and then calls exit() or terminates because of a SIGTERM, SIGINT, etc. In this case we need to wait here to make sure that all rendering is pushed to the hardware, because the scheduler might need resources/settings from the file descriptor.

For example, if you just remove that wait, you could close Firefox and get garbage on the screen for a millisecond because the remaining rendering commands were not executed.

So what we essentially need is to distinguish between a SIGKILL (which means stop processing as soon as possible) and any other reason, because in every other case we don't want to annoy the user with garbage on the screen (even if it's just for a few milliseconds).

Constructive ideas on how to handle this would be very welcome, because I completely agree that what we have at the moment, checking PF_SIGNALED, is just a very, very hacky workaround.

Thanks,
Christian.

>
> Eric
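
For illustration only, the distinction asked for above can be written down as a small decision function. This is a userspace model, not kernel code and not a proposed patch: `struct task_model` and `should_wait_for_jobs()` are hypothetical names, and the real teardown path would have to derive the same information from the exiting task's own state.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the per-task state the teardown path would
 * consult. This is NOT the kernel's task_struct.
 */
struct task_model {
    bool exiting;     /* the task is on its exit path */
    int  exit_signal; /* signal that terminated the task, 0 for a plain exit() */
};

/*
 * Returns true if teardown should wait until all queued jobs have been
 * pushed to the hardware, false if it should bail out immediately.
 */
bool should_wait_for_jobs(const struct task_model *t)
{
    /* SIGKILL: get rid of the process as quickly as possible. */
    if (t->exiting && t->exit_signal == SIGKILL)
        return false;

    /*
     * Everything else (close() on a live process, exit(), SIGTERM,
     * SIGINT, ...): let the remaining rendering drain so the user
     * does not see garbage on the screen.
     */
    return true;
}
```

With this model, a task that exits normally or dies from SIGTERM still waits for its jobs, and only a task terminated by SIGKILL skips the wait.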