Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp3214136ybn; Fri, 27 Sep 2019 02:58:50 -0700 (PDT) X-Google-Smtp-Source: APXvYqzOO+OahVAURoFY7os5Rqyeedqx0Mh7QlYxCcGaNSDK2euiGGnlzsuVlVdV0lQUZs5HtHuX X-Received: by 2002:a50:da44:: with SMTP id a4mr3540228edk.120.1569578330433; Fri, 27 Sep 2019 02:58:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1569578330; cv=none; d=google.com; s=arc-20160816; b=a+AvwV97GGGP8MQKd1etphqrkTjUHX3OPSJJwGw8SF3XPWn+Y3MUUUFOnj8CIc6v+5 hGoz3P0hf6ISpA4UyHybdH7+PJOh0amZ+4s9W/CNo6yhT+odmVG6E7R2Yp8OgqtAqjew qETkVTGK9lFvYqx3Cu6LsrIVBDxClEpyzq/IORStDCToDEDOlvDZFWRhvJg0R2ZamY07 r/OosgjGKKBVMD0COdOBCpPSK8mBoIVU3bhXR9SxLK28gwMmOsYTjkgIoK7qT9u4vSg8 B88oHxpK85NEd12zukf4j/Jg3oWXEONShlkoI/aHsAI8Ry0JgxASkRz9JMtB6wEQ01yX hsLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:subject:from; bh=CzYrz3YhWXB8XgmG1ptzFxNlga2fzd3dkRY7WrJ0TPk=; b=CGpwvHvXbudg92Z0iFCKSZuguAyGLfBD+bknUwqiq2NMILhaT4a1Pw0VtxqH+Q238p vk/B9F9ZVzSXbf/mmfNYJ6Fu1itFdlVVk9n/69dQE4+Yfwq+g6sSf8ZJxnjeHBFgZeFp GfoJm6IpfkFA2GmgNsuVrEWc4Vwo/c/xvA+woKQ/pJSA6+oI77y09DFEzDo8u8OiNGjV mnIIlHcj9+OTUVG5Cvn3FPLmqbEuXJ+GgmgsuA2O/hdZX78cUi/Cd0sVO0TVbKNDVI8m Cl125EJnEkx9czIyzdVe5G+mKJ0IIUtEx34who4l/0ipEgiGWKJqtYT29KGON4R+ma5Q 8AZA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k5si2392515ejp.157.2019.09.27.02.58.25; Fri, 27 Sep 2019 02:58:50 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726976AbfI0Jzh (ORCPT + 99 others); Fri, 27 Sep 2019 05:55:37 -0400 Received: from foss.arm.com ([217.140.110.172]:47594 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726116AbfI0Jzh (ORCPT ); Fri, 27 Sep 2019 05:55:37 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 73DCB28; Fri, 27 Sep 2019 02:55:36 -0700 (PDT) Received: from [10.1.196.133] (e112269-lin.cambridge.arm.com [10.1.196.133]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 1C1353F534; Fri, 27 Sep 2019 02:55:34 -0700 (PDT) From: Steven Price Subject: Re: drm_sched with panfrost crash on T820 To: Neil Armstrong , daniel@ffwll.ch, airlied@linux.ie, =?UTF-8?Q?Christian_K=c3=b6nig?= Cc: Tomeu Vizoso , linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, "open list:ARM/Amlogic Meson..." , Erico Nunes , Rob Herring References: Message-ID: <3fb178d8-f069-0ae2-1ed3-4ded84a71951@arm.com> Date: Fri, 27 Sep 2019 10:55:33 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 27/09/2019 09:12, Neil Armstrong wrote: > Hi Christian, > > In v5.3, running dEQP triggers the following kernel crash : > > [ 20.224982] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038 > [...] > [ 20.291064] Hardware name: Khadas VIM2 (DT) > [ 20.295217] Workqueue: events drm_sched_job_timedout > [...] > [ 20.304867] pc : drm_sched_increase_karma+0x5c/0xf0 > [ 20.309696] lr : drm_sched_increase_karma+0x44/0xf0 > [...] > [ 20.396720] Call trace: > [ 20.399138] drm_sched_increase_karma+0x5c/0xf0 > [ 20.403623] panfrost_job_timedout+0x12c/0x1e0 > [ 20.408021] drm_sched_job_timedout+0x48/0xa0 > [ 20.412336] process_one_work+0x1e0/0x320 > [ 20.416300] worker_thread+0x40/0x450 > [ 20.419924] kthread+0x124/0x128 > [ 20.423116] ret_from_fork+0x10/0x18 > [ 20.426653] Code: f9400001 540001c0 f9400a83 f9402402 (f9401c64) > [ 20.432690] ---[ end trace bd02f890139096a7 ]--- > > Which never happens, at all, on v5.2. > > I did a (very) long (7 days, ~100runs) bisect run using our LAVA lab (thanks tomeu !), but > bisecting was not easy since the bad commit landed on drm-misc-next after v5.1-rc6, and > then v5.2-rc1 was backmerged into drm-misc-next at: > [1] 374ed5429346 Merge drm/drm-next into drm-misc-next > > Thus bisecting between [1] ang v5.2-rc1 leads to commit based on v5.2-rc1... where panfrost was > not enabled in the Khadas VIM2 DT. > > Anyway, I managed to identify 3 possibly breaking commits : > [2] 290764af7e36 drm/sched: Keep s_fence->parent pointer > [3] 5918045c4ed4 drm/scheduler: rework job destruction > [4] a5343b8a2ca5 drm/scheduler: Add flag to hint the release of guilty job. My suspicion is that this is really a bug in Panfrost than the scheduler. I can see the following sequence: 1. Jobs submitted to *both* slot 0 and slot 1 on the hardware. 2. Hardware 'hangs' (most likely an unhandled page fault) 3. The reset timeouts occur. Because Panfrost uses a separate scheduler for each slot this means we have two threads racing to kill the respective jobs. 4. Each thread calls drm_sched_job_timedout which calls panfrost_job_timedout(). 5. Both threads calling panfrost_job_timedout() execute: for (i = 0; i < NUM_JOB_SLOTS; i++) drm_sched_stop(&pfdev->js->queue[i].sched, sched_job); 6. drm_sched_stop() will free the job from the other scheduler. So both jobs get freed (and sched->free_guilty is set). 7. Depending on the exact race either drm_sched_increase_karma() dereferences the freed job, or drm_sched_job_timedout() attempts to double-free the job (because free_guilty is set). Now there is a lock (pfdev->reset_lock) held during steps 5-6, but the problem is that the job which has timed out can be freed by the other scheduler's timeout before/after that critical section. One obvious issue with the DRM scheduler is that there is a call to cancel_delayed_work() in drm_sched_stop() which to me looks like it should be cancel_delayed_work_sync() to ensure that the timeout handling has completed. However in the above scenario a _sync() variety would then cause a deadlock (one thread has pfdev->reset_lock and is waiting for the other thread which is trying to take the lock). So we need to update Panfrost so that it can coordinate the reset between schedulers. Can you try something like the following (untested): ----8<--- diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index f503c566e99f..6441c7fba6c4 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -99,6 +99,8 @@ struct panfrost_device { unsigned long cur_volt; struct panfrost_devfreq_slot slot[NUM_JOB_SLOTS]; } devfreq; + + bool is_resetting; }; struct panfrost_mmu { diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c index 05c85f45a0de..8b935680c066 100644 --- a/drivers/gpu/drm/panfrost/panfrost_job.c +++ b/drivers/gpu/drm/panfrost/panfrost_job.c @@ -388,6 +388,10 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) mutex_lock(&pfdev->reset_lock); + if (pfdev->is_resetting) + goto out; + pfdev->is_resetting = true; + for (i = 0; i < NUM_JOB_SLOTS; i++) drm_sched_stop(&pfdev->js->queue[i].sched, sched_job); @@ -406,6 +410,8 @@ static void panfrost_job_timedout(struct drm_sched_job *sched_job) for (i = 0; i < NUM_JOB_SLOTS; i++) drm_sched_start(&pfdev->js->queue[i].sched, true); + pfdev->is_resetting = false; +out: mutex_unlock(&pfdev->reset_lock); }