Date: Wed, 25 Apr 2018 15:05:22 +0200
From: Oleg Nesterov
To: Andrey Grodzovsky
Cc: linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org, Alexander.Deucher@amd.com, Christian.Koenig@amd.com, David.Panariti@amd.com, akpm@linux-foundation.org, ebiederm@xmission.com
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
Message-ID: <20180425130522.GA7592@redhat.com>
References: <1524583836-12130-1-git-send-email-andrey.grodzovsky@amd.com> <1524583836-12130-3-git-send-email-andrey.grodzovsky@amd.com>
In-Reply-To: <1524583836-12130-3-git-send-email-andrey.grodzovsky@amd.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/24, Andrey Grodzovsky wrote:
>
> --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
>  		return;
>  	/**
>  	 * The client will not queue more IBs during this fini, consume existing
> -	 * queued IBs or discard them on SIGKILL
> +	 * queued IBs or discard them when in death signal state since
> +	 * wait_event_killable can't receive signals in that state.
>  	*/
> -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> +	if (current->flags & PF_SIGNALED)

please do not use PF_SIGNALED, it must die. Besides, you can't rely on this flag
in the multi-threaded case. current->exit_code doesn't look right either.

>  		entity->fini_status = -ERESTARTSYS;
>  	else
>  		entity->fini_status = wait_event_killable(sched->job_scheduled,

So afaics the problem is that fatal_signal_pending() is not necessarily true
after SIGKILL was already dequeued, and thus wait_event_killable() won't be
interrupted, right?

This was already discussed, but it is not clear what we can/should do. We can
probably change get_signal() to not dequeue SIGKILL, or do something else to
keep fatal_signal_pending() == true for the exiting killed thread.

But in this case we probably also want to discriminate the "real" SIGKILLs
from group_exit/exec/coredump.

Oleg.
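
To make the concern above concrete, here is a small userspace toy model in C. It is only a sketch: every toy_* name is hypothetical, it is not kernel code, and it does not claim to reproduce how get_signal() or wait_event_killable() are actually implemented. It only models the behaviour described in this mail: once SIGKILL has been dequeued, a check for a pending fatal signal can no longer see it, so a killable wait would have nothing to interrupt it. The keep_sigkill parameter stands in for the idea of not dequeuing SIGKILL.

```c
#include <assert.h>
#include <signal.h>   /* SIGKILL */
#include <stdbool.h>

/* Toy stand-in for a task's pending-signal state; NOT the kernel's task_struct. */
struct toy_task {
    unsigned long pending;   /* bit (sig - 1) set means sig is pending */
};

static unsigned long toy_sigmask(int sig)
{
    return 1UL << (sig - 1);
}

/* Mark a signal as pending on the toy task. */
static void toy_send_signal(struct toy_task *t, int sig)
{
    t->pending |= toy_sigmask(sig);
}

/* Toy version of the check a killable wait relies on: is SIGKILL still pending? */
static bool toy_fatal_signal_pending(const struct toy_task *t)
{
    return (t->pending & toy_sigmask(SIGKILL)) != 0;
}

/*
 * Toy dequeue: return the lowest-numbered pending signal (0 if none) and
 * clear its pending bit. With keep_sigkill set, SIGKILL is reported but left
 * pending, which models the "do not dequeue SIGKILL" idea (an assumption of
 * this sketch, not an existing kernel option).
 */
static int toy_get_signal(struct toy_task *t, bool keep_sigkill)
{
    for (int sig = 1; sig <= 31; sig++) {
        if (!(t->pending & toy_sigmask(sig)))
            continue;
        if (!(keep_sigkill && sig == SIGKILL))
            t->pending &= ~toy_sigmask(sig);
        return sig;
    }
    return 0;
}

/*
 * Toy killable wait on a condition that never becomes true: returns true if
 * the wait would be interrupted, false if it would keep sleeping.
 */
static bool toy_wait_would_be_interrupted(const struct toy_task *t)
{
    return toy_fatal_signal_pending(t);
}
```

In this model, sending SIGKILL and then calling toy_get_signal(&t, false) leaves toy_wait_would_be_interrupted(&t) false, while toy_get_signal(&t, true) keeps it true. The model says nothing about telling a "real" SIGKILL apart from group_exit/exec/coredump; that part of the question remains open.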