Subject: Re: drm_sched with panfrost crash on T820
To: "Grodzovsky, Andrey", Hillf Danton
References: <20190930145228.14000-1-hdanton@sina.com>
Cc: "daniel@ffwll.ch", "airlied@linux.ie", "Koenig, Christian", Erico Nunes, "linux-kernel@vger.kernel.org", "steven.price@arm.com", "dri-devel@lists.freedesktop.org", Rob Herring, Tomeu Vizoso, "open list:ARM/Amlogic Meson..."
From: Neil Armstrong
Message-ID: <7339b7a1-2d1c-4379-89a0-daf8b28d81c8@baylibre.com>
Date: Thu, 3 Oct 2019 10:34:19 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Andrey,

On 02/10/2019 at 16:40, Grodzovsky, Andrey wrote:
>
> On 9/30/19 10:52 AM, Hillf Danton wrote:
>> On Mon, 30 Sep 2019 11:17:45 +0200 Neil Armstrong wrote:
>>> Did a new run from 5.3:
>>>
>>> [ 35.971972] Call trace:
>>> [ 35.974391] drm_sched_increase_karma+0x5c/0xf0
>>> 		ffff000010667f38 FFFF000010667F94
>>> 		drivers/gpu/drm/scheduler/sched_main.c:335
>>>
>>> The crashing line is :
>>> 		if (bad->s_fence->scheduled.context ==
>>> 		    entity->fence_context) {
>>>
>>> Doesn't seem related to guilty job.
>> Bail out if s_fence is no longer fresh.
>>
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -333,6 +333,10 @@ void drm_sched_increase_karma(struct drm
>>
>>  			spin_lock(&rq->lock);
>>  			list_for_each_entry_safe(entity, tmp, &rq->entities, list) {
>> +				if (!smp_load_acquire(&bad->s_fence)) {
>> +					spin_unlock(&rq->lock);
>> +					return;
>> +				}
>>  				if (bad->s_fence->scheduled.context ==
>>  				    entity->fence_context) {
>>  					if (atomic_read(&bad->karma) >
>> @@ -543,7 +547,7 @@ EXPORT_SYMBOL(drm_sched_job_init);
>>  void drm_sched_job_cleanup(struct drm_sched_job *job)
>>  {
>>  	dma_fence_put(&job->s_fence->finished);
>> -	job->s_fence = NULL;
>> +	smp_store_release(&job->s_fence, 0);
>>  }
>>  EXPORT_SYMBOL(drm_sched_job_cleanup);
>

This fixed the problem on the 10 CI runs.

Neil

>
> Does this change help the problem ? Note that drm_sched_job_cleanup is
> called from scheduler thread which is stopped at all times when work_tdr
> thread is running and anyway the 'bad' job is still in the
> ring_mirror_list while it's being accessed from
> drm_sched_increase_karma so I don't think drm_sched_job_cleanup can be
> called for it BEFORE or while drm_sched_increase_karma is executed.
>
> Andrey
>
>
>>
>> --
>>
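
For readers following the quoted hunk, the guard it adds can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: `demo_fence`, `demo_job`, `demo_job_cleanup` and `demo_job_matches` are hypothetical stand-ins, not the drm_sched types or functions, and `atomic_store_explicit`/`atomic_load_explicit` with release/acquire ordering are used as rough analogues of `smp_store_release()`/`smp_load_acquire()`. It says nothing about whether cleanup can actually race with drm_sched_increase_karma, which is the open question above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler fence; only the field we compare. */
struct demo_fence {
	unsigned long context;
};

/* Hypothetical stand-in for the job; s_fence is cleared on cleanup. */
struct demo_job {
	_Atomic(struct demo_fence *) s_fence;
};

/* Analogue of the patched cleanup: publish "no fence" with release ordering. */
void demo_job_cleanup(struct demo_job *job)
{
	assert(job != NULL);
	atomic_store_explicit(&job->s_fence, NULL, memory_order_release);
}

/*
 * Analogue of the patched check: load the pointer once with acquire
 * ordering and bail out if it has already been cleared.
 */
bool demo_job_matches(struct demo_job *job, unsigned long entity_context)
{
	struct demo_fence *fence;

	assert(job != NULL);
	fence = atomic_load_explicit(&job->s_fence, memory_order_acquire);
	if (!fence)
		return false;	/* fence already dropped: nothing to compare */

	return fence->context == entity_context;
}
```

One design difference from the quoted hunk: the sketch dereferences the pointer it loaded rather than reading the field a second time, so the NULL check and the comparison see the same value. It does not address the lifetime of the fence object itself.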