From: Ralph Campbell
To: Jerome Glisse, David Airlie, "Ben Skeggs", Jason Gunthorpe
CC: "Ralph Campbell"
Subject: [PATCH v2] drm/nouveau/dmem: missing mutex_lock in error path
Date: Fri, 14 Jun 2019 13:20:03 -0700
Message-ID: <20190614202003.1642-1-rcampbell@nvidia.com>
X-Mailer: git-send-email 2.20.1
X-Mailing-List: linux-kernel@vger.kernel.org

In nouveau_dmem_pages_alloc(), the drm->dmem->mutex is unlocked before
calling nouveau_dmem_chunk_alloc() but is not reacquired afterwards, which
triggers the following warning when CONFIG_PROVE_LOCKING is enabled:

[ 1294.871933] =====================================
[ 1294.876656] WARNING: bad unlock balance detected!
[ 1294.881375] 5.2.0-rc3+ #5 Not tainted
[ 1294.885048] -------------------------------------
[ 1294.889773] test-malloc-vra/6299 is trying to release lock (&drm->dmem->mutex) at:
[ 1294.897482] [] nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau]
[ 1294.905782] but there are no more locks to release!
[ 1294.910690]
[ 1294.910690] other info that might help us debug this:
[ 1294.917249] 1 lock held by test-malloc-vra/6299:
[ 1294.921881]  #0: 0000000016e10454 (&mm->mmap_sem#2){++++}, at: nouveau_svmm_bind+0x142/0x210 [nouveau]
[ 1294.931313]
[ 1294.931313] stack backtrace:
[ 1294.935702] CPU: 4 PID: 6299 Comm: test-malloc-vra Not tainted 5.2.0-rc3+ #5
[ 1294.942786] Hardware name: ASUS X299-A/PRIME X299-A, BIOS 1401 05/21/2018
[ 1294.949590] Call Trace:
[ 1294.952059]  dump_stack+0x7c/0xc0
[ 1294.955469]  ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau]
[ 1294.962213]  print_unlock_imbalance_bug.cold.52+0xca/0xcf
[ 1294.967641]  lock_release+0x306/0x380
[ 1294.971383]  ? nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau]
[ 1294.978089]  ? lock_downgrade+0x2d0/0x2d0
[ 1294.982121]  ? find_held_lock+0xac/0xd0
[ 1294.985979]  __mutex_unlock_slowpath+0x8f/0x3f0
[ 1294.990540]  ? wait_for_completion+0x230/0x230
[ 1294.995002]  ? rwlock_bug.part.2+0x60/0x60
[ 1294.999197]  nouveau_dmem_migrate_alloc_and_copy+0x79f/0xbf0 [nouveau]
[ 1295.005751]  ? page_mapping+0x98/0x110
[ 1295.009511]  migrate_vma+0xa74/0x1090
[ 1295.013186]  ? move_to_new_page+0x480/0x480
[ 1295.017400]  ? __kmalloc+0x153/0x300
[ 1295.021052]  ? nouveau_dmem_migrate_vma+0xd8/0x1e0 [nouveau]
[ 1295.026796]  nouveau_dmem_migrate_vma+0x157/0x1e0 [nouveau]
[ 1295.032466]  ? nouveau_dmem_init+0x490/0x490 [nouveau]
[ 1295.037612]  ? vmacache_find+0xc2/0x110
[ 1295.041537]  nouveau_svmm_bind+0x1b4/0x210 [nouveau]
[ 1295.046583]  ? nouveau_svm_fault+0x13e0/0x13e0 [nouveau]
[ 1295.051912]  drm_ioctl_kernel+0x14d/0x1a0
[ 1295.055930]  ? drm_setversion+0x330/0x330
[ 1295.059971]  drm_ioctl+0x308/0x530
[ 1295.063384]  ? drm_version+0x150/0x150
[ 1295.067153]  ? find_held_lock+0xac/0xd0
[ 1295.070996]  ? __pm_runtime_resume+0x3f/0xa0
[ 1295.075285]  ? mark_held_locks+0x29/0xa0
[ 1295.079230]  ? _raw_spin_unlock_irqrestore+0x3c/0x50
[ 1295.084232]  ? lockdep_hardirqs_on+0x17d/0x250
[ 1295.088768]  nouveau_drm_ioctl+0x9a/0x100 [nouveau]
[ 1295.093661]  do_vfs_ioctl+0x137/0x9a0
[ 1295.097341]  ? ioctl_preallocate+0x140/0x140
[ 1295.101623]  ? match_held_lock+0x1b/0x230
[ 1295.105646]  ? match_held_lock+0x1b/0x230
[ 1295.109660]  ? find_held_lock+0xac/0xd0
[ 1295.113512]  ? __do_page_fault+0x324/0x630
[ 1295.117617]  ? lock_downgrade+0x2d0/0x2d0
[ 1295.121648]  ? mark_held_locks+0x79/0xa0
[ 1295.125583]  ? handle_mm_fault+0x352/0x430
[ 1295.129687]  ksys_ioctl+0x60/0x90
[ 1295.133020]  ? mark_held_locks+0x29/0xa0
[ 1295.136964]  __x64_sys_ioctl+0x3d/0x50
[ 1295.140726]  do_syscall_64+0x68/0x250
[ 1295.144400]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1295.149465] RIP: 0033:0x7f1a3495809b
[ 1295.153053] Code: 0f 1e fa 48 8b 05 ed bd 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd bd 0c 00 f7 d8 64 89 01 48
[ 1295.171850] RSP: 002b:00007ffef7ed1358 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1295.179451] RAX: ffffffffffffffda RBX: 00007ffef7ed1628 RCX: 00007f1a3495809b
[ 1295.186601] RDX: 00007ffef7ed13b0 RSI: 0000000040406449 RDI: 0000000000000004
[ 1295.193759] RBP: 00007ffef7ed13b0 R08: 0000000000000000 R09: 000000000157e770
[ 1295.200917] R10: 000000000151c010 R11: 0000000000000246 R12: 0000000040406449
[ 1295.208083] R13: 0000000000000004 R14: 0000000000000000 R15: 0000000000000000

Reacquire the lock before continuing to the next page. If
nouveau_dmem_chunk_alloc() fails after some pages have already been
allocated, return 0 directly instead of breaking out of the loop, since
the mutex is not held at that point.

Signed-off-by: Ralph Campbell
---

I found this while testing Jason Gunthorpe's hmm tree but this is
independent of those changes. Jason thinks it is best to go through
David Airlie's nouveau tree.

Changes for v2:
Updated change log to include console output.

 drivers/gpu/drm/nouveau/nouveau_dmem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 27aa4e72abe9..00f7236af1b9 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -379,9 +379,10 @@ nouveau_dmem_pages_alloc(struct nouveau_drm *drm,
 			ret = nouveau_dmem_chunk_alloc(drm);
 			if (ret) {
 				if (c)
-					break;
+					return 0;
 				return ret;
 			}
+			mutex_lock(&drm->dmem->mutex);
 			continue;
 		}
 
-- 
2.20.1
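
For anyone who wants to look at the locking pattern in isolation, here is a
minimal single-threaded userspace sketch of a loop that drops a lock around
a chunk allocation and must keep lock and unlock balanced on every exit
path. It is not the nouveau code: struct pool, struct toy_mutex, toy_lock(),
toy_unlock(), chunk_alloc(), pages_alloc() and PAGES_PER_CHUNK are all
hypothetical names made up for illustration. The sketch also assumes that
chunk_alloc() takes the lock itself, so the caller must not hold it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define PAGES_PER_CHUNK 4	/* arbitrary value for this sketch */

/* Toy lock that counts unbalanced unlocks (hypothetical, not a kernel API). */
struct toy_mutex {
	bool held;
	int bad_unlocks;
};

/* Hypothetical pool standing in for a list of device memory chunks. */
struct pool {
	struct toy_mutex lock;
	int free_pages;		/* pages left in already allocated chunks */
	int chunks_left;	/* chunks chunk_alloc() may still create */
};

void toy_lock(struct toy_mutex *m)
{
	assert(!m->held);	/* single-threaded model: no recursive locking */
	m->held = true;
}

void toy_unlock(struct toy_mutex *m)
{
	if (!m->held)
		m->bad_unlocks++;	/* unlock without a matching lock */
	m->held = false;
}

/* Assumption: takes the lock itself, so callers must not hold it. */
int chunk_alloc(struct pool *p)
{
	int ret = 0;

	toy_lock(&p->lock);
	if (p->chunks_left == 0) {
		ret = -ENOMEM;
	} else {
		p->chunks_left--;
		p->free_pages += PAGES_PER_CHUNK;
	}
	toy_unlock(&p->lock);
	return ret;
}

/*
 * Allocate up to npages pages and store the number obtained in *got.
 * Returns 0 if at least one page was allocated (or npages is 0),
 * otherwise the error returned by chunk_alloc().
 */
int pages_alloc(struct pool *p, int npages, int *got)
{
	int c = 0;
	int ret;

	*got = 0;
	toy_lock(&p->lock);
	while (c < npages) {
		if (p->free_pages == 0) {
			toy_unlock(&p->lock);
			ret = chunk_alloc(p);
			if (ret) {
				/* Lock is not held here: return, do not unlock. */
				*got = c;
				return c ? 0 : ret;
			}
			toy_lock(&p->lock);	/* reacquire before looping */
			continue;
		}
		p->free_pages--;
		c++;
	}
	*got = c;
	toy_unlock(&p->lock);
	return 0;
}
```

In this model, replacing the early return with a break, or dropping the
toy_lock() call before the continue, leaves bad_unlocks nonzero after
pages_alloc() returns, which mirrors the imbalance reported in the warning
above.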