From: Eric Anholt
To: zhoucm1, christian.koenig@amd.com, "dri-devel@lists.freedesktop.org"
Cc: Daniel Vetter, "linux-kernel@vger.kernel.org"
Subject: Re: [PATCH 2/2] drm: Revert syncobj timeline changes.
In-Reply-To: <87d0rex8h2.fsf@anholt.net>
References: <20181108160422.17743-1-eric@anholt.net> <20181108160422.17743-3-eric@anholt.net> <635caa27-eb0b-a4d6-5a1d-3fbe5382bd6b@amd.com> <87d0rex8h2.fsf@anholt.net>
User-Agent: Notmuch/0.22.2+1~gb0bcfaa (http://notmuchmail.org) Emacs/25.2.2 (x86_64-pc-linux-gnu)
Date: Fri, 09 Nov 2018 14:26:58 -0800
Message-ID: <87y3a1sx8t.fsf@anholt.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha512; protocol="application/pgp-signature"
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Eric Anholt writes:

> [ Unknown signature status ]
> zhoucm1 writes:
>
>> On 2018-11-09 00:52, Christian König wrote:
>>> On 08.11.18 at 17:07, Koenig, Christian wrote:
>>>> On 08.11.18 at 17:04, Eric Anholt wrote:
>>>>> Daniel suggested I submit this, since we're still seeing regressions
>>>>> from it.  This is a revert to before 48197bc564c7 ("drm: add syncobj
>>>>> timeline support v9") and its followon fixes.
>>>> This is a harmless false positive from lockdep, Chouming and I are
>>>> already working on a fix.
>>>
>>> On the other hand we had enough trouble with that patch, so if it
>>> really bothers you feel free to add my Acked-by: Christian König
>>> and push it.
>> NAK, please no, I don't think this needed, the Warning totally isn't
>> related to syncobj timeline, but fence-array implementation flaw, just
>> exposed by syncobj.
>> In addition, Christian already has a fix for this Warning, I've tested.
>> Please Christian send to public review.
>
> I backed out my revert of #2 (#1 still necessary) after adding the
> lockdep regression fix, and now my CTS run got oomkilled after just a
> few hours, with these notable lines in the unreclaimable slab info list:
>
> [ 6314.373099] drm_sched_fence     69095KB   69095KB
> [ 6314.373653] kmemleak_object    428249KB  428384KB
> [ 6314.373736] kmalloc-262144        256KB     256KB
> [ 6314.373743] kmalloc-131072        128KB     128KB
> [ 6314.373750] kmalloc-65536          64KB      64KB
> [ 6314.373756] kmalloc-32768        1472KB    1728KB
> [ 6314.373763] kmalloc-16384          64KB      64KB
> [ 6314.373770] kmalloc-8192          208KB     208KB
> [ 6314.373778] kmalloc-4096         2408KB    2408KB
> [ 6314.373784] kmalloc-2048          288KB     336KB
> [ 6314.373792] kmalloc-1024         1457KB    1512KB
> [ 6314.373800] kmalloc-512           854KB    1048KB
> [ 6314.373808] kmalloc-256           188KB     268KB
> [ 6314.373817] kmalloc-192         69141KB   69142KB
> [ 6314.373824] kmalloc-64          47703KB   47704KB
> [ 6314.373886] kmalloc-128         46396KB   46396KB
> [ 6314.373894] kmem_cache             31KB      35KB
>
> No results from kmemleak, though.

OK, it looks like the #2 revert probably isn't related to the OOM issue.
Running a single job on otherwise unused DRM, watching /proc/slabinfo
every second for drm_sched_fence, I get:

drm_sched_fence  0  0 192 21 1 : tunables 32 16 8 : slabdata 0 0 0 : globalstat  0  0 0 0 0 0 0 0 0 : cpustat 0 0 0 0
drm_sched_fence 16 21 192 21 1 : tunables 32 16 8 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0
drm_sched_fence 13 21 192 21 1 : tunables 32 16 8 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0
drm_sched_fence  6 21 192 21 1 : tunables 32 16 8 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0
drm_sched_fence  4 21 192 21 1 : tunables 32 16 8 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0
drm_sched_fence  2 21 192 21 1 : tunables 32 16 8 : slabdata 1 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0
drm_sched_fence  0 21 192 21 1 : tunables 32 16 8 : slabdata 0 1 0 : globalstat 16 16 1 0 0 0 0 0 0 : cpustat 5 1 6 0

So we generate a ton of fences, and I guess free them slowly because of
RCU?  And presumably kmemleak was sucking up lots of memory because of
how many of these objects were lying around.
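
For anyone who wants to repeat the measurement, here is a rough sketch
of that kind of once-a-second poll. It is not the command used for the
numbers above; the names SlabStat, parse_slab_line, find_cache and watch
are made up for illustration. It assumes the slabinfo 2.1 column order
(name, active_objs, num_objs, objsize, ...), that /proc/slabinfo is
readable (usually root only), and that the cache is listed under its own
name, which may not hold if the allocator merges caches.

```python
#!/usr/bin/env python3
"""Poll /proc/slabinfo for one cache (illustrative sketch, hypothetical names)."""
import sys
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class SlabStat:
    name: str
    active_objs: int  # objects currently allocated
    num_objs: int     # objects the cache has room for
    objsize: int      # bytes per object

    @property
    def active_kb(self) -> float:
        """Memory held by live objects, in KB."""
        return self.active_objs * self.objsize / 1024


def parse_slab_line(line: str) -> Optional[SlabStat]:
    """Parse one data line; return None for header or malformed lines."""
    fields = line.split()
    if len(fields) < 4 or fields[0].startswith("#") or fields[0] == "slabinfo":
        return None
    try:
        return SlabStat(fields[0], int(fields[1]), int(fields[2]), int(fields[3]))
    except ValueError:
        return None


def find_cache(text: str, cache: str) -> Optional[SlabStat]:
    """Return the entry for `cache` from a full slabinfo dump, if present."""
    for line in text.splitlines():
        stat = parse_slab_line(line)
        if stat is not None and stat.name == cache:
            return stat
    return None


def watch(cache: str, interval: float = 1.0, path: str = "/proc/slabinfo") -> None:
    """Print the cache's object counts every `interval` seconds until interrupted."""
    while True:
        with open(path) as f:
            stat = find_cache(f.read(), cache)
        if stat is None:
            print(f"{cache}: not listed")
        else:
            print(f"{stat.name} active={stat.active_objs} total={stat.num_objs} "
                  f"(~{stat.active_kb:.1f} KB live)")
        time.sleep(interval)


if __name__ == "__main__":
    watch(sys.argv[1] if len(sys.argv) > 1 else "drm_sched_fence")
```

Run as root with the cache name as the only argument (it defaults to
drm_sched_fence) and stop it with Ctrl-C. Only the first four columns
are parsed, so the extra globalstat/cpustat fields are ignored.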
--=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCgAdFiEE/JuuFDWp9/ZkuCBXtdYpNtH8nugFAlvmCbIACgkQtdYpNtH8 nug3FA//dSeupJhOUQlYzR/yEjezWSaHnRezKzRVfofkIOEpwvxuEJ52tkeUMR/G bqseYn3rZAREeDw0uj5gJenSHO0u06clcMuVjOMu/zbRXT89mAFYre2txEoShDwv YnH/vm9PxOAEGpXcSB/gquqmlyhBaig53hIXwKVUl0qA1G01biFiySgN6DYvu/Us pAy3UZ07xHBMn0zpm3ZpDdHShEXOS8XQEEeJm6Y3BOWqEhOVsb9M9WIiFFDwBm6H GvUnN4SA5QfloSlWHwBR1CL5shCAhmVf2XNd4RwsyqkrRi1aExXirxCL8d0u1dIK A0T2K919aokdvpiUWOHIBJJaw7BkCLFk3Ysdn2ygWxrV6Ipi2ux3wdmbmB/Ojbz2 FEV+6NDisp05a4wI03VZc2ZgkQQGkcg+k1pFfpAepDq1GqdjoWerkGM8OlvZFuYP m0qTrdawwcSLaYWT9N47gSOJm5QepxjAxtMA3uMXWSQH4+b8F8GO6RWagyxGEw1D ayZ/2ANtSNG9KTTETgdl75kJPUSdgZxm2OtZHCngsufIJM4Yj/kfonle3BJJskv0 UdIb4GKRhXE+T9sBCjcIazhWbJ4oftwbXbGCgGAFFSq+IFU+UWecG0AMGdILHItp KoL1bzoGWzBJAU3EOLB8cJmBzNYNlbFgOX1qQfHQ9fHx/I3QdPM= =a3+X -----END PGP SIGNATURE----- --=-=-=--