Date: Tue, 6 Nov 2018 12:00:06 +0100
From: Jan Kara
To: Dave Chinner
Cc: John Hubbard, Jan Kara, Christoph Hellwig, Matthew Wilcox, Michal Hocko,
	Christopher Lameter, Jason Gunthorpe, Dan Williams, linux-mm@kvack.org,
	Andrew Morton, LKML, linux-rdma, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 4/6] mm: introduce page->dma_pinned_flags, _count
Message-ID: <20181106110006.GE25414@quack2.suse.cz>
References: <20181012060014.10242-1-jhubbard@nvidia.com>
 <20181012060014.10242-5-jhubbard@nvidia.com>
 <20181013035516.GA18822@dastard>
 <7c2e3b54-0b1d-6726-a508-804ef8620cfd@nvidia.com>
 <20181013164740.GA6593@infradead.org>
 <84811b54-60bf-2bc3-a58d-6a7925c24aad@nvidia.com>
 <20181105095447.GE6953@quack2.suse.cz>
 <20181106024715.GU6311@dastard>
In-Reply-To: <20181106024715.GU6311@dastard>

On Tue 06-11-18 13:47:15, Dave Chinner wrote:
> On Mon, Nov 05, 2018 at 04:26:04PM -0800, John Hubbard wrote:
> > On 11/5/18 1:54 AM, Jan Kara wrote:
> > > Hmm, have you tried larger buffer sizes? Because synchronous 8k IO isn't
> > > going to max out NVMe IOPS by far. Can I suggest you install fio [1] (it
> > > has the advantage that it is pretty much the standard for a test like
> > > this, so everyone knows what the test does at a glance) and run it with
> > > something like the following workfile:
> > >
> > > [reader]
> > > direct=1
> > > ioengine=libaio
> > > blocksize=4096
> > > size=1g
> > > numjobs=1
> > > rw=read
> > > iodepth=64
> > >
> > > And see how the numbers with and without your patches compare?
> > >
> > > 								Honza
> > >
> > > [1] https://github.com/axboe/fio
> >
> > That program is *very* good to have. Whew. Anyway, it looks like read
> > bandwidth is approximately 74 MiB/s with my patch (it varies a bit, run
> > to run), as compared to around 85 MiB/s without the patch, so still
> > showing about a 20% performance degradation, assuming I'm reading this
> > correctly.
> >
> > Raw data follows, using the fio options you listed above:
> >
> > Baseline (without my patch):
> > ----------------------------
> ....
> >     lat (usec): min=179, max=14003, avg=2913.65, stdev=1241.75
> >     clat percentiles (usec):
> >      |  1.00th=[ 2311],  5.00th=[ 2343], 10.00th=[ 2343], 20.00th=[ 2343],
> >      | 30.00th=[ 2343], 40.00th=[ 2376], 50.00th=[ 2376], 60.00th=[ 2376],
> >      | 70.00th=[ 2409], 80.00th=[ 2933], 90.00th=[ 4359], 95.00th=[ 5276],
> >      | 99.00th=[ 8291], 99.50th=[ 9110], 99.90th=[10945], 99.95th=[11469],
> >      | 99.99th=[12256]
> .....
> > Modified (with my patch):
> > ----------------------------
> .....
> >     lat (usec): min=81, max=15766, avg=3496.57, stdev=1450.21
> >     clat percentiles (usec):
> >      |  1.00th=[ 2835],  5.00th=[ 2835], 10.00th=[ 2835], 20.00th=[ 2868],
> >      | 30.00th=[ 2868], 40.00th=[ 2868], 50.00th=[ 2868], 60.00th=[ 2900],
> >      | 70.00th=[ 2933], 80.00th=[ 3425], 90.00th=[ 5080], 95.00th=[ 6259],
> >      | 99.00th=[10159], 99.50th=[11076], 99.90th=[12649], 99.95th=[13435],
> >      | 99.99th=[14484]
>
> So it's adding at least 500us of completion latency to every IO? I'd argue
> that the IO latency impact is far worse than the 20% throughput drop.

Hum, right. So for each IO we have to remove the page from the LRU on submit
and then put it back on IO completion (which is going to race with new
submits, so LRU lock contention might be an issue). Spending 500 us on that
is not unthinkable when the lock is contended, but it is more expensive than
I'd have thought. John, could you perhaps profile where the time is spent?

								Honza
--
Jan Kara
SUSE Labs, CR
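
[Editor's sketch, not part of the thread above: one way the suggested job file
could be run and the submission path profiled, assuming it is saved as
reader.fio; the file name is arbitrary and the perf invocation is only an
illustration of standard fio/perf usage, not the method agreed on here.]

    # run the workfile quoted above (saved here as reader.fio)
    fio reader.fio

    # while reproducing the slowdown, sample call stacks system-wide with
    # call graphs, then inspect where submission/completion time goes
    # (e.g. contention on the LRU lock)
    perf record -a -g -- fio reader.fio
    perf report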