Date: Tue, 24 Apr 2018 23:40:27 +0200
From: Daniel Vetter
To: Andrey Grodzovsky
Cc: Michel Dänzer, linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, David.Panariti@amd.com, oleg@redhat.com, ebiederm@xmission.com, Alexander.Deucher@amd.com, akpm@linux-foundation.org, Christian.Koenig@amd.com
Subject: Re: [PATCH 2/3] drm/scheduler: Don't call wait_event_killable for signaled process.
Message-ID: <20180424214027.GG25142@phenom.ffwll.local>
References: <1524583836-12130-1-git-send-email-andrey.grodzovsky@amd.com> <1524583836-12130-3-git-send-email-andrey.grodzovsky@amd.com> <7313704c-0693-0bb9-8818-99cd2b7c0ca0@daenzer.net> <20180424194418.GE25142@phenom.ffwll.local>

On Tue, Apr 24, 2018 at 05:02:40PM -0400, Andrey Grodzovsky wrote:
>
>
> On 04/24/2018 03:44 PM, Daniel Vetter wrote:
> > On Tue, Apr 24, 2018 at 05:46:52PM +0200, Michel Dänzer wrote:
> > > Adding the dri-devel list, since this is driver independent code.
> > >
> > >
> > > On 2018-04-24 05:30 PM, Andrey Grodzovsky wrote:
> > > > Avoid calling wait_event_killable when you are possibly being called
> > > > from get_signal routine since in that case you end up in a deadlock
> > > > where you are alreay blocked in singla processing any trying to wait
> > > Multiple typos here, "[...] already blocked in signal processing and [...]"?
> > >
> > >
> > > > on a new signal.
> > > >
> > > > Signed-off-by: Andrey Grodzovsky
> > > > ---
> > > >  drivers/gpu/drm/scheduler/gpu_scheduler.c | 5 +++--
> > > >  1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > index 088ff2b..09fd258 100644
> > > > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > > > @@ -227,9 +227,10 @@ void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
> > > >  		return;
> > > >  	/**
> > > >  	 * The client will not queue more IBs during this fini, consume existing
> > > > -	 * queued IBs or discard them on SIGKILL
> > > > +	 * queued IBs or discard them when in death signal state since
> > > > +	 * wait_event_killable can't receive signals in that state.
> > > >  	*/
> > > > -	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
> > > > +	if (current->flags & PF_SIGNALED)
> > You want fatal_signal_pending() here, instead of inventing your own broken
> > version.
>
> I rely on current->flags & PF_SIGNALED because this being set from within
> get_signal,
> meaning I am within signal processing? in which case I want to avoid any
> signal based wait for that task,
> From what i see in the code, task_struct.pending.signal is being set for
> other threads in same
> group (zap_other_threads) or for other scenarios, those task are still able
> to receive signals
> so calling wait_event_killable there will not have problem.
> > > >  		entity->fini_status = -ERESTARTSYS;
> > > >  	else
> > > >  		entity->fini_status = wait_event_killable(sched->job_scheduled,
> > But really this smells like a bug in wait_event_killable, since
> > wait_event_interruptible does not suffer from the same bug. It will return
> > immediately when there's a signal pending.
>
> Even when wait_event_interruptible is called as following -
> ...->do_signal->get_signal->....->wait_event_interruptible ?
> I haven't tried it but wait_event_interruptible is very much alike to
> wait_event_killable so I would assume it will also
> not be interrupted if called like that. (Will give it a try just out of
> curiosity anyway)

wait_event_killable doesn't check for fatal_signal_pending before calling
schedule, so it definitely has a nice race there.

But if you're sure that you really need to check PF_SIGNALED, then I'm
honestly not clear on what you're trying to pull off here. Your sparse
explanation of what happens isn't enough, since I have no idea how you can
get from get_signal() to the above wait_event_killable callsite.
-Daniel

>
> Andrey
>
> >
> > I think this should be fixed in core code, not papered over in some
> > subsystem.
> > -Daniel
> >
> > >
> > >
> > > --
> > > Earthling Michel Dänzer               |               http://www.amd.com
> > > Libre software enthusiast             |             Mesa and X developer
> > > _______________________________________________
> > > dri-devel mailing list
> > > dri-devel@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/dri-devel

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
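
As a rough illustration of the disagreement in this thread, the userspace C sketch below models the two checks being debated: the patch's test of PF_SIGNALED and the reviewer's suggestion to test for a pending fatal signal. Everything in it is a hypothetical stand-in (struct model_task, MODEL_PF_SIGNALED, skip_wait_patch, skip_wait_review); none of it is kernel code, and it does not reproduce the real fatal_signal_pending() or wait_event_killable(). The "dying" case further assumes that the fatal signal has already been dequeued by the time the flag is set, which the thread does not confirm.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's PF_SIGNALED flag bit. */
#define MODEL_PF_SIGNALED 0x1u

/* Hypothetical, heavily simplified model of a task. */
struct model_task {
    unsigned int flags;        /* models current->flags */
    bool fatal_sig_pending;    /* models "a fatal signal is pending" */
};

/* Check used in the patch: skip the killable wait whenever the task
 * is already in the death-signal state. */
bool skip_wait_patch(const struct model_task *t)
{
    return (t->flags & MODEL_PF_SIGNALED) != 0;
}

/* Check suggested in the review: skip the killable wait whenever a
 * fatal signal is pending for the task. */
bool skip_wait_review(const struct model_task *t)
{
    return t->fatal_sig_pending;
}
```

In this model, a thread that was zapped by another thread in its group (pending set, flag clear) skips the wait only under the reviewer's check, while a task assumed to be inside signal processing (flag set, pending clear) skips it only under the patch's check. That difference is the point Andrey raises and the scenario Daniel asks to have explained in more detail.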