Subject: Re: AMD graphics performance regression in 4.15 and later
From: Jean-Marc Valin
To: Christian König, airlied@linux.ie, alexander.deucher@amd.com, Felix.Kuehling@amd.com, labbott@redhat.com, akpm@linux-foundation.org, michel.daenzer@amd.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Date: Fri, 6 Apr 2018 18:00:58 -0400
Message-ID: <77866d66-2728-8295-d7ee-9975dbf64b99@mozilla.com>

Hi Christian,

Thanks for the info. FYI, I've also opened a Firefox bug for that at:
https://bugzilla.mozilla.org/show_bug.cgi?id=1448778

Feel free to comment since you have a better understanding of what's
going on.

One last question: right now I'm running 4.15.0 with the "offending"
patch reverted. Is that safe to run, or are there possible bad
interactions with other changes?

Cheers,

    Jean-Marc

On 04/06/2018 01:20 PM, Christian König wrote:
> On 06.04.2018 at 18:42, Jean-Marc Valin wrote:
>> Hi Christian,
>>
>> On 04/09/2018 07:48 AM, Christian König wrote:
>>> On 06.04.2018 at 17:30, Jean-Marc Valin wrote:
>>>> Hi Christian,
>>>>
>>>> Is there a way to turn off these huge pages at boot-time/run-time?
>>> Only at compile time by not setting CONFIG_TRANSPARENT_HUGEPAGE.
>> Any reason why
>> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>> doesn't solve the problem?
>
> Because we unfortunately try to allocate huge pages anyway, we
> unfortunately just fail in 100% of all cases.
>
> That basically gives you both, the extra allocation overhead and the
> still bad throughput.
>
>> Also, I assume that disabling CONFIG_TRANSPARENT_HUGEPAGE will disable
>> them for everything and not just what your patch added, right?
>
> Correct, that's why I wrote that disabling SWIOTLBs might be better.
>
>>>> I'm not sure what you mean by "We mitigated the problem by avoiding the
>>>> slow coherent DMA code path on almost all platforms on newer kernels".
>>>> I tested up to 4.16 and the performance regression is just as bad as
>>>> it is for 4.15.
>>> Indeed 4.16 still doesn't have that. You could use the
>>> amd-staging-drm-next branch or wait for 4.17.
>> Is there a way to pull just that change or are there too many
>> interactions with other changes?
>
> It adds a new detection if memory allocation needs to be coherent or
> not, that is not something you can easily pull into older versions.
>
>>> That isn't related to the GFX hardware, but to your CPU/motherboard and
>>> whatever else you have in the system.
>> Well, I have an nvidia GPU in the same system (normally only used for
>> CUDA) and if I use it instead of my RX 560 then I'm not seeing any
>> performance issue with 4.15.
>
> That's because you are probably using the Nvidia binary driver which has
> a completely separate code base.
>
>>> Some part of your system needs SWIOTLB and that makes allocating memory
>>> much slower.
>> What would that part be? FTR, I have a complete description of my system
>> at https://jmvalin.dreamwidth.org/15583.html
>>
>> I don't know if it's related, but I can maybe see one thing in common
>> between my machine and the Core 2 Quad from the other bug report and
>> that's the "NUMA part". I have a dual-socket Xeon and (AFAIK) the Core 2
>> Quad is made of two two-core CPUs glued together with little
>> communication between them.
>
> Yeah, that is probably the reason.
>
>>> Intel doesn't use TTM because they don't have dedicated VRAM, but the
>>> open source nvidia driver should be affected as well.
>> I'm using the proprietary nvidia driver (because CUDA). Is that supposed
>> to be affected as well?
>
> No.
>
>>> We already mitigated that problem and I don't see any solution which
>>> will arrive faster than 4.17.
>> Is that supposed to make the slowdown unnoticeable or just slightly
>> better?
>
> It completely goes away. The issue with the coherent path is that it
> tries to always allocate the lowest possible memory to make sure that it
> fits into the DMA constraints of all devices in the system.
>
> But since AMD GPU can handle 40 bits of addresses you would need at least
> 1TB of memory in the system to trigger that (or a NUMA where some system
> is low and some in a high area).
>
> Christian.
>
>>> The only quick workaround I can see is to avoid firefox, chrome for
>>> example is reported to work perfectly fine.
>> Or use an unaffected GPU/driver ;-)
>>
>> Cheers,
>>
>>     Jean-Marc
>>
>
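
For reference, the "40 bits means 1TB" arithmetic quoted above can be checked with a small sketch. This is only an illustration of the address-range reasoning, not the kernel's or the amdgpu driver's actual logic; the function names `dma_limit_bytes` and `fits_dma_mask` are made up for this example, and the sample addresses are arbitrary.

```python
# Illustrative only: these helpers are hypothetical and are not kernel APIs.

GIB = 1 << 30
TIB = 1 << 40


def dma_limit_bytes(address_bits: int) -> int:
    """Number of bytes a device can address with the given number of address bits."""
    return 1 << address_bits


def fits_dma_mask(phys_addr: int, size: int, address_bits: int) -> bool:
    """True if the buffer [phys_addr, phys_addr + size) lies entirely below the limit."""
    if phys_addr < 0 or size <= 0:
        return False
    return phys_addr + size <= dma_limit_bytes(address_bits)


if __name__ == "__main__":
    # A 40-bit device can reach exactly 1 TiB of physical address space.
    print(dma_limit_bytes(40) // TIB, "TiB reachable with 40 address bits")

    # A page at 64 GiB is reachable by a 40-bit device ...
    print(fits_dma_mask(64 * GIB, 4096, 40))
    # ... but not by a device limited to 32 address bits (4 GiB).
    print(fits_dma_mask(64 * GIB, 4096, 32))
```

Under these assumptions, memory only falls outside a 40-bit device's reach once physical addresses go past 1 TiB, which matches the point that most systems should not need the slow path for the GPU itself.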