Subject: Re: Oops in 3.10.99 -- NULL pointer dereference in radeon_fence_ref
To: Luis Henriques, Greg Kroah-Hartman
References: <20160307025014.GA9499@mail.codepoet.org> <20160307204654.GB6545@kroah.com> <56DDED67.2030801@amd.com> <20160307225851.GB25867@kroah.com> <20160309135612.GA20283@charon.olymp>
CC: Christian König, linux-kernel, Sasha Levin, Jiri Slaby, Kamal Mostafa
From: Nicolai Hähnle
Message-ID: <56E04FFA.7070906@amd.com>
Date: Wed, 9 Mar 2016 11:31:54 -0500
MIME-Version: 1.0
In-Reply-To: <20160309135612.GA20283@charon.olymp>
Content-Type: multipart/mixed; boundary="------------090506060305070802030409"
X-Mailing-List: linux-kernel@vger.kernel.org

--------------090506060305070802030409
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 8bit

On 09.03.2016 08:56, Luis Henriques wrote:
> On Mon, Mar 07, 2016 at 02:58:51PM -0800, Greg Kroah-Hartman wrote:
>> On Mon, Mar 07, 2016 at 10:06:47PM +0100, Christian König wrote:
>>> On 07.03.2016 at 21:46, Greg Kroah-Hartman wrote:
>>>> On Sun, Mar 06, 2016 at 07:50:14PM -0700, Erik Andersen wrote:
>>>>> The following patch to radeon_sa_bo_new that
>>>>> went into 3.10.99
>>>>>
>>>>> commit 8d5e1e5af0c667545c202e8f4051f77aa3bf31b7
>>>>> Author: Nicolai Hähnle
>>>>> Date: Fri Feb 5 14:35:53 2016 -0500
>>>>> drm/radeon: hold reference to fences in radeon_sa_bo_new
>>>>> commit f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
>>>>>
>>>>> is triggering an Oops for me right when xscreensaver
>>>>> first began doing 3D stuff. After reverting this
>>>>> patch, xscreensaver has been happily running 3D stuff.
>>>>>
>>>>> Mar 6 18:00:43 sage kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
>>>>> Mar 6 18:00:43 sage kernel: IP: [] radeon_fence_ref+0xd/0x50 [radeon]
>>>>> Mar 6 18:00:43 sage kernel: PGD 799e1d067 PUD 819186067 PMD 0
>>>>> Mar 6 18:00:43 sage kernel: Oops: 0002 [#1] SMP
>>>>>
>>>>> Mar 6 18:00:43 sage kernel: Stack:
>>>>> Mar 6 18:00:43 sage kernel: ffffffffa01607ec ffff88108a4e8000 ffff88108a4e8000 ffff880888fbc000
>>>>> Mar 6 18:00:43 sage kernel: ffff880ecbf11c78 0000fe2001000006 0000000000000000 0020000000000100
>>>>> Mar 6 18:00:43 sage kernel: 00000000000d1200 ffff880ecbf11c14 0000000000000000 0000000000000000
>>>>> Mar 6 18:00:43 sage kernel: Call Trace:
>>>>> Mar 6 18:00:43 sage kernel: [] ? radeon_sa_bo_new+0x2ac/0x4f0 [radeon]
>>>>> Mar 6 18:00:43 sage kernel: [] ? ttm_eu_list_ref_sub+0x3d/0x60 [ttm]
>>>>> Mar 6 18:00:43 sage kernel: [] radeon_ib_get+0x39/0x110 [radeon]
>>>>> Mar 6 18:00:43 sage kernel: [] radeon_cs_ioctl+0x69a/0xa70 [radeon]
>>>>> Mar 6 18:00:43 sage kernel: [] drm_ioctl+0x512/0x650 [drm]
>>>>> Mar 6 18:00:43 sage kernel: [] ? do_futex+0x111/0xc30
>>>>> Mar 6 18:00:43 sage kernel: [] do_vfs_ioctl+0x305/0x520
>>>>> Mar 6 18:00:43 sage kernel: [] ? vtime_account_user+0x69/0x80
>>>>> Mar 6 18:00:43 sage kernel: [] SyS_ioctl+0x81/0xa0
>>>>> Mar 6 18:00:43 sage kernel: [] tracesys+0xe1/0xe6
>>>>>
>>>>> $ lspci | grep VGA
>>>>> 03:00.0 VGA compatible controller: Advanced Micro Devices, Inc.
>>>>> [AMD/ATI] Redwood XT [Radeon HD 5670/5690/5730]
>>>> Next time, please cc: the people responsible for that patch as well...
>>>>
>>>> I can revert it, but maybe something else is going on here? Do you have
>>>> this same problem on 3.14, and 4.5-rc7?
>>>
>>> Hi Greg,
>>>
>>> yes, that's an already known issue. Feel free to revert that one for now.
>>>
>>> I've got it on my TODO list to provide a fixed patch for older kernels, but
>>> that can take a while.
>>>
>>> For background: Nicolai's patch is correct, but assumes that
>>> radeon_fence_unref() can safely take NULL as the fence, which is not the
>>> case for older kernels.

Actually, the call to radeon_fence_ref() is the culprit.

>>
>> Ok, thanks, now reverted.
>>
>
> And looks like a few more kernels may be affected as well. I'll
> revert it from 3.16 kernel, and I'm adding Kamal, Sasha and Jiri to
> the CC list.

Kernels that contain commit 954605ca "drm/radeon: use common fence
implementation for fences, v4" are safe; older kernels require a
NULL-pointer check around the call to radeon_fence_ref. This means
kernels 3.17 and older are affected and need the additional NULL pointer
check that I've already sent out on a different thread (I'm attaching it
again, hoping that Erik gets a chance to test it).

It would be nice to get confirmation that this really does fix the
observed bug; then I can prepare a fixed version of the patch for 3.17
and older (i.e. squash the original bad commit with the attached patch).

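For illustration only, the guard described above can be sketched in user-space C. This is not the kernel code: `mock_fence`, `mock_fence_ref`, `mock_ref_all` and `MOCK_NUM_RINGS` are hypothetical stand-ins, and the pointer-then-counter layout is an assumption, chosen because it would be consistent with the fault address 0x8 reported in the oops on a 64-bit machine.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NUM_RINGS 3 /* hypothetical stand-in for RADEON_NUM_RINGS */

/* Hypothetical stand-in for the fence structure of older kernels. */
struct mock_fence {
	void *rdev;   /* first member, at offset 0 */
	int refcount; /* stand-in for the embedded kref, right after the pointer */
};

/*
 * Modeled on a ref helper that touches the counter unconditionally.
 * Calling it with NULL would write to a small address near zero,
 * so callers have to filter out NULL themselves.
 */
static struct mock_fence *mock_fence_ref(struct mock_fence *fence)
{
	assert(fence != NULL);
	fence->refcount++;
	return fence;
}

/*
 * Guarded loop in the shape of the attached patch: empty slots are skipped.
 * Returns the number of references actually taken.
 */
static int mock_ref_all(struct mock_fence *fences[MOCK_NUM_RINGS])
{
	int i, taken = 0;

	for (i = 0; i < MOCK_NUM_RINGS; ++i) {
		if (fences[i]) {
			mock_fence_ref(fences[i]);
			++taken;
		}
	}
	return taken;
}
```

With an array such as `{ &a, NULL, &b }`, the guarded loop takes two references and never dereferences the empty slot, which is the behaviour the NULL check is meant to provide.
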
Cheers,
Nicolai

>
> Cheers,
> --
> Luís
>
>> greg k-h
>> --
>> To unsubscribe from this list: send the line "unsubscribe stable" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html

--------------090506060305070802030409
Content-Type: text/x-patch;
 name="0001-drm-radeon-guard-call-to-radeon_fence_ref-against-NU.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
 filename*0="0001-drm-radeon-guard-call-to-radeon_fence_ref-against-NU.pa";
 filename*1="tch"

From 85d028178d9772f2a07e4ed156820d95c4e0ad18 Mon Sep 17 00:00:00 2001
From: Nicolai Hähnle
Date: Mon, 7 Mar 2016 23:41:52 -0300
Subject: [PATCH] drm/radeon: guard call to radeon_fence_ref against NULL
 pointers
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Candidate fix for a kernel oops that was introduced by the backport of
commit f6ff4f67cdf8 "drm/radeon: hold reference to fences in
radeon_sa_bo_new" to kernels where radeon does not use the common fence
implementation for fences.

Reported-by: Lutz Euler
Signed-off-by: Nicolai Hähnle
---
 drivers/gpu/drm/radeon/radeon_sa.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 197b157..7d11901 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -349,8 +349,10 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 		/* see if we can skip over some allocations */
 	} while (radeon_sa_bo_next_hole(sa_manager, fences, tries));
 
-	for (i = 0; i < RADEON_NUM_RINGS; ++i)
-		radeon_fence_ref(fences[i]);
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		if (fences[i])
+			radeon_fence_ref(fences[i]);
+	}
 
 	spin_unlock(&sa_manager->wq.lock);
 	r = radeon_fence_wait_any(rdev, fences, false);
-- 
2.5.0

--------------090506060305070802030409--