Subject: Re: [PATCH v2 0/6] RFC: gup+dma: tracking dma-pinned pages
From: Tom Talpey
To: John Hubbard, john.hubbard@gmail.com, linux-mm@kvack.org
Cc: Andrew Morton, LKML, linux-rdma, linux-fsdevel@vger.kernel.org
Date: Thu, 29 Nov 2018 21:18:05 -0500
Message-ID: <1939f47a-eaec-3f2c-4ae7-f92d9fba7693@talpey.com>
In-Reply-To: <7a68b7fc-ff9d-381e-2444-909c9c2f6679@nvidia.com>
References: <20181110085041.10071-1-jhubbard@nvidia.com>
 <942cb823-9b18-69e7-84aa-557a68f9d7e9@talpey.com>
 <97934904-2754-77e0-5fcb-83f2311362ee@nvidia.com>
 <5159e02f-17f8-df8b-600c-1b09356e46a9@talpey.com>
 <15e4a0c0-cadd-e549-962f-8d9aa9fc033a@talpey.com>
 <313bf82d-cdeb-8c75-3772-7a124ecdfbd5@nvidia.com>
 <2aa422df-d5df-5ddb-a2e4-c5e5283653b5@talpey.com>
 <7a68b7fc-ff9d-381e-2444-909c9c2f6679@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/29/2018 8:39 PM, John Hubbard wrote:
> On 11/28/18 5:59 AM, Tom Talpey wrote:
>> On 11/27/2018 9:52 PM, John Hubbard wrote:
>>> On 11/27/18 5:21 PM, Tom Talpey wrote:
>>>> On 11/21/2018 5:06 PM, John Hubbard wrote:
>>>>> On 11/21/18 8:49 AM, Tom Talpey wrote:
>>>>>> On 11/21/2018 1:09 AM, John Hubbard wrote:
>>>>>>> On 11/19/18 10:57 AM, Tom Talpey wrote:
>>> [...]
>>>> I'm super-limited here this week hardware-wise and have not been able
>>>> to try testing with the patched kernel.
>>>>
>>>> I was able to compare my earlier quick test with a Bionic 4.15 kernel
>>>> (400K IOPS) against a similar 4.20rc3 kernel, and the rate dropped to
>>>> ~_375K_ IOPS. Which I found perhaps troubling. But it was only a quick
>>>> test, and without your change.
>>>>
>>>
>>> So just to double check (again): you are running fio with these parameters,
>>> right?
>>>
>>> [reader]
>>> direct=1
>>> ioengine=libaio
>>> blocksize=4096
>>> size=1g
>>> numjobs=1
>>> rw=read
>>> iodepth=64
>>
>> Correct, I copy/pasted these directly. I also ran with size=10g because
>> the 1g provides a really small sample set.
>>
>> There was one other difference, your results indicated fio 3.3 was used.
>> My Bionic install has fio 3.1. I don't find that relevant because our
>> goal is to compare before/after, which I haven't done yet.
>>
>
> OK, the 50 MB/s was due to my particular .config.
> I had some expensive debug options
> set in mm, fs and locking subsystems. Turning those off, I'm back up to the rated
> speed of the Samsung NVMe device, so now we should have a clearer picture of the
> performance that real users will see.

Oh, good! I'm especially glad because I was having a heck of a time
reconfiguring the one machine I have available for this.

> Continuing on, then: running a before and after test, I don't see any significant
> difference in the fio results:

Excerpting from below:

> Baseline 4.20.0-rc3 (commit f2ce1065e767), as before:
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1360msec)
>   cpu          : usr=16.26%, sys=48.05%, ctx=251258, majf=0, minf=73

vs

> With patches applied:
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1360msec)
>   cpu          : usr=16.26%, sys=48.05%, ctx=251258, majf=0, minf=73

Perfect results, not CPU limited, and full IOPS.

Curiously identical, so I trust you've checked that you measured
both targets, but if so, I say it's good.

Tom.

>
> fio.conf:
>
> [reader]
> direct=1
> ioengine=libaio
> blocksize=4096
> size=1g
> numjobs=1
> rw=read
> iodepth=64
>
> ---------------------------------------------------------
> Baseline 4.20.0-rc3 (commit f2ce1065e767), as before:
>
> $ fio ./experimental-fio.conf
> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
> fio-3.3
> Starting 1 process
> Jobs: 1 (f=1)
> reader: (groupid=0, jobs=1): err= 0: pid=1738: Thu Nov 29 17:20:07 2018
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1360msec)
>     slat (nsec): min=1381, max=46469, avg=1649.48, stdev=594.46
>     clat (usec): min=162, max=12247, avg=330.00, stdev=185.55
>      lat (usec): min=165, max=12253, avg=331.68, stdev=185.69
>     clat percentiles (usec):
>      |  1.00th=[  322],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
>      | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
>      | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
>      | 99.00th=[  379], 99.50th=[  594], 99.90th=[  603], 99.95th=[  611],
>      | 99.99th=[12125]
>    bw (  KiB/s): min=751640, max=782912, per=99.52%, avg=767276.00, stdev=22112.64, samples=2
>    iops        : min=187910, max=195728, avg=191819.00, stdev=5528.16, samples=2
>   lat (usec)   : 250=0.08%, 500=99.30%, 750=0.59%
>   lat (msec)   : 20=0.02%
>   cpu          : usr=16.26%, sys=48.05%, ctx=251258, majf=0, minf=73
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>    READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s), io=1024MiB (1074MB), run=1360-1360msec
>
> Disk stats (read/write):
>   nvme0n1: ios=220798/0, merge=0/0, ticks=71481/0, in_queue=71966, util=100.00%
>
> ---------------------------------------------------------
> With patches applied:
>
> fast_256GB $ fio ./experimental-fio.conf
> reader: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
> fio-3.3
> Starting 1 process
> Jobs: 1 (f=1)
> reader: (groupid=0, jobs=1): err= 0: pid=1738: Thu Nov 29 17:20:07 2018
>    read: IOPS=193k, BW=753MiB/s (790MB/s)(1024MiB/1360msec)
>     slat (nsec): min=1381, max=46469, avg=1649.48, stdev=594.46
>     clat (usec): min=162, max=12247, avg=330.00, stdev=185.55
>      lat (usec): min=165, max=12253, avg=331.68, stdev=185.69
>     clat percentiles (usec):
>      |  1.00th=[  322],  5.00th=[  326], 10.00th=[  326], 20.00th=[  326],
>      | 30.00th=[  326], 40.00th=[  326], 50.00th=[  326], 60.00th=[  326],
>      | 70.00th=[  326], 80.00th=[  326], 90.00th=[  326], 95.00th=[  326],
>      | 99.00th=[  379], 99.50th=[  594], 99.90th=[  603], 99.95th=[  611],
>      | 99.99th=[12125]
>    bw (  KiB/s): min=751640, max=782912, per=99.52%, avg=767276.00, stdev=22112.64, samples=2
>    iops        : min=187910, max=195728, avg=191819.00, stdev=5528.16, samples=2
>   lat (usec)   : 250=0.08%, 500=99.30%, 750=0.59%
>   lat (msec)   : 20=0.02%
>   cpu          : usr=16.26%, sys=48.05%, ctx=251258, majf=0, minf=73
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
>
> Run status group 0 (all jobs):
>    READ: bw=753MiB/s (790MB/s), 753MiB/s-753MiB/s (790MB/s-790MB/s), io=1024MiB (1074MB), run=1360-1360msec
>
> Disk stats (read/write):
>   nvme0n1: ios=220798/0, merge=0/0, ticks=71481/0, in_queue=71966, util=100.00%
>
>
> thanks,
>
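
For anyone re-reading the numbers in this thread, the fio summary line can be re-derived from the raw counters in the same output. The sketch below is only a sanity check, not part of fio or of the patch series. The function names are hypothetical; the constants are copied from the quoted fio.conf and fio output. It also applies Little's law (throughput is roughly queue depth divided by average latency), which assumes the queue stays full for the whole run, as the "IO depths" line suggests.

```python
# Sanity check of the fio figures quoted in this thread.
# Function names are illustrative (hypothetical); only the constants
# come from the quoted fio.conf and fio output.

BLOCK_SIZE = 4096        # blocksize=4096 (fio.conf)
TOTAL_IOS = 262144       # "issued rwts: total=262144"
RUNTIME_MS = 1360        # "run=1360-1360msec"
IODEPTH = 64             # iodepth=64 (fio.conf)
AVG_LAT_USEC = 331.68    # "lat (usec): ... avg=331.68"


def iops(total_ios, runtime_ms):
    """I/O operations per second over the whole run."""
    return total_ios / (runtime_ms / 1000.0)


def bandwidth_mib_s(total_ios, block_size, runtime_ms):
    """Bandwidth in MiB/s (binary units, as in fio's 'BW=...MiB/s')."""
    return total_ios * block_size / (1024 * 1024) / (runtime_ms / 1000.0)


def bandwidth_mb_s(total_ios, block_size, runtime_ms):
    """Bandwidth in MB/s (decimal units, the value fio prints in parentheses)."""
    return total_ios * block_size / 1e6 / (runtime_ms / 1000.0)


def littles_law_iops(iodepth, avg_lat_usec):
    """Estimated IOPS from Little's law, assuming the queue is always full."""
    return iodepth / (avg_lat_usec * 1e-6)


def relative_drop(before, after):
    """Fractional drop from 'before' to 'after'."""
    return (before - after) / before


if __name__ == "__main__":
    print("IOPS          : %.0f" % iops(TOTAL_IOS, RUNTIME_MS))
    print("BW (MiB/s)    : %.1f" % bandwidth_mib_s(TOTAL_IOS, BLOCK_SIZE, RUNTIME_MS))
    print("BW (MB/s)     : %.1f" % bandwidth_mb_s(TOTAL_IOS, BLOCK_SIZE, RUNTIME_MS))
    print("Little's law  : %.0f IOPS" % littles_law_iops(IODEPTH, AVG_LAT_USEC))
    # The earlier 4.15 vs 4.20rc3 quick test: 400K -> ~375K IOPS.
    print("4.15 -> 4.20rc3 drop: %.2f%%" % (100 * relative_drop(400_000, 375_000)))
```

The derived values round to the 193k IOPS, 753MiB/s and 790MB/s that fio reports, and the Little's law estimate lands within about one percent of the measured IOPS. That shows the report is internally consistent. It says nothing about whether the baseline and patched runs were really two separate measurements.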