Reply-To: christian.koenig@amd.com
Subject: Re: AMD graphics performance regression in 4.15 and later
To: Jean-Marc Valin, Christian König, airlied@linux.ie, alexander.deucher@amd.com, Felix.Kuehling@amd.com, labbott@redhat.com, akpm@linux-foundation.org, michel.daenzer@amd.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
References: <9ca940f1-7f21-c420-de45-13d72e783ab6@amd.com> <6cebabff-908f-5ebe-4252-760773c4cd6f@amd.com> <312ed341-7052-a61e-331f-d1e8fd5b477e@mozilla.com> <77866d66-2728-8295-d7ee-9975dbf64b99@mozilla.com>
From: Christian König
Message-ID: <55e1712b-6567-50c5-3789-53dd1ccddb94@gmail.com>
Date: Mon, 9 Apr 2018 11:42:18 +0200
In-Reply-To: <77866d66-2728-8295-d7ee-9975dbf64b99@mozilla.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List:
 linux-kernel@vger.kernel.org

On 07.04.2018 at 00:00, Jean-Marc Valin wrote:
> Hi Christian,
>
> Thanks for the info. FYI, I've also opened a Firefox bug for that at:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1448778
> Feel free to comment since you have a better understanding of what's
> going on.
>
> One last question: right now I'm running 4.15.0 with the "offending"
> patch reverted. Is that safe to run or are there possible bad
> interactions with other changes?

That should work without problems.

But I just had another idea as well: if you want, you could still test the new code path which will be used in 4.17.

Backporting all the detection logic is too invasive, but you could just go into drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c and forcefully use the other code path.

Just look out for "#ifdef CONFIG_SWIOTLB" checks and disable those.

Regards,
Christian.

>
> Cheers,
>
> Jean-Marc
>
> On 04/06/2018 01:20 PM, Christian König wrote:
>> On 06.04.2018 at 18:42, Jean-Marc Valin wrote:
>>> Hi Christian,
>>>
>>> On 04/09/2018 07:48 AM, Christian König wrote:
>>>> On 06.04.2018 at 17:30, Jean-Marc Valin wrote:
>>>>> Hi Christian,
>>>>>
>>>>> Is there a way to turn off these huge pages at boot-time/run-time?
>>>> Only at compile time by not setting CONFIG_TRANSPARENT_HUGEPAGE.
>>> Any reason why
>>> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>>> doesn't solve the problem?
>> Because we unfortunately try to allocate huge pages anyway, we
>> unfortunately just fail in 100% of all cases.
>>
>> That basically gives you both, the extra allocation overhead and the
>> still bad throughput.
>>
>>> Also, I assume that disabling CONFIG_TRANSPARENT_HUGEPAGE will disable
>>> them for everything and not just what your patch added, right?
>> Correct, that's why I wrote that disabling SWIOTLBs might be better.
>>
>>>>> I'm not sure what you mean by "We mitigated the problem by avoiding the
>>>>> slow coherent DMA code path on almost all platforms on newer
>>>>> kernels". I tested up to 4.16 and the performance regression is just
>>>>> as bad as it is for 4.15.
>>>> Indeed 4.16 still doesn't have that. You could use the
>>>> amd-staging-drm-next branch or wait for 4.17.
>>> Is there a way to pull just that change or are there too many
>>> interactions with other changes?
>> It adds new detection of whether memory allocation needs to be coherent
>> or not; that is not something you can easily pull into older versions.
>>
>>>> That isn't related to the GFX hardware, but to your CPU/motherboard and
>>>> whatever else you have in the system.
>>> Well, I have an nvidia GPU in the same system (normally only used for
>>> CUDA) and if I use it instead of my RX 560 then I'm not seeing any
>>> performance issue with 4.15.
>> That's because you are probably using the Nvidia binary driver which has
>> a completely separate code base.
>>
>>>> Some part of your system needs SWIOTLB and that makes allocating memory
>>>> much slower.
>>> What would that part be? FTR, I have a complete description of my system
>>> at https://jmvalin.dreamwidth.org/15583.html
>>>
>>> I don't know if it's related, but I can maybe see one thing in common
>>> between my machine and the Core 2 Quad from the other bug report and
>>> that's the "NUMA part". I have a dual-socket Xeon and (AFAIK) the Core 2
>>> Quad is made of two two-core CPUs glued together with little
>>> communication between them.
>> Yeah, that is probably the reason.
>>
>>>> Intel doesn't use TTM because they don't have dedicated VRAM, but the
>>>> open source nvidia driver should be affected as well.
>>> I'm using the proprietary nvidia driver (because CUDA). Is that supposed
>>> to be affected as well?
>> No.
>>
>>>> We already mitigated that problem and I don't see any solution which
>>>> will arrive faster than 4.17.
>>> Is that supposed to make the slowdown unnoticeable or just slightly
>>> better?
>> It completely goes away. The issue with the coherent path is that it
>> tries to always allocate the lowest possible memory to make sure that it
>> fits into the DMA constraints of all devices in the system.
>>
>> But since AMD GPUs can handle 40 bits of addresses you would need at
>> least 1TB of memory in the system to trigger that (or a NUMA system
>> where some memory is in a low and some in a high area).
>>
>> Christian.
>>
>>>> The only quick workaround I can see is to avoid Firefox; Chrome, for
>>>> example, is reported to work perfectly fine.
>>> Or use an unaffected GPU/driver ;-)
>>>
>>> Cheers,
>>>
>>>     Jean-Marc
>>>
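
For reference, the 40-bit figure mentioned in the thread can be worked out in a few lines. This is only a minimal sketch in plain C and not code from the amdgpu driver: dma_mask_for_bits() and fits_in_mask() are hypothetical helper names made up for illustration. It shows that 40 address bits cover exactly 1 TiB, so a buffer only falls outside such a device's reach once it sits at or above that boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: these helpers are hypothetical, not kernel APIs. */

/* 40 address bits span 2^40 bytes, which is exactly 1 TiB. */
static_assert((UINT64_C(1) << 40) == UINT64_C(1099511627776),
              "2^40 bytes is 1 TiB");

/* Highest address reachable with `bits` address bits (bits in 1..64). */
static uint64_t dma_mask_for_bits(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}

/* True if the range [addr, addr + size) lies entirely at or below mask. */
static bool fits_in_mask(uint64_t addr, uint64_t size, uint64_t mask)
{
    if (addr > mask)
        return false;
    if (size == 0)
        return true;
    /* Compare against the remaining room to avoid overflow in addr + size. */
    return size - 1 <= mask - addr;
}
```

With a 40-bit mask, a page that ends right at the 1 TiB boundary still fits, while a page that starts at the boundary does not. That is consistent with the remark that the machine needs memory at or above roughly 1 TB before the restriction matters for such a device.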