Subject: Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v7
From: Eric Dumazet
To: Mel Gorman
Cc: Andrew Morton, Christoph Lameter, Michal Hocko, Vlastimil Babka,
    Johannes Weiner, Jesper Dangaard Brouer, Joonsoo Kim, Linux-MM,
    Linux-Kernel
Date: Wed, 07 Dec 2016 12:10:24 -0800
Message-ID: <1481141424.4930.71.camel@edumazet-glaptop3.roam.corp.google.com>
In-Reply-To: <20161207194801.krhonj7yggbedpba@techsingularity.net>

On Wed, 2016-12-07 at 19:48 +0000, Mel Gorman wrote:
>
>
> Interesting because it didn't match what I previous measured but then
> again, when I established that netperf on localhost was slab intensive,
> it was also an older kernel. Can you tell me if SLAB or SLUB was enabled
> in your test kernel?
>
> Either that or the baseline I used has since been changed from what you
> are testing and we're not hitting the same paths.

lpaa6:~# uname -a
Linux lpaa6 4.9.0-smp-DEV #429 SMP @1481125332 x86_64 GNU/Linux

lpaa6:~# perf record -g ./netperf -t UDP_STREAM -l 3 -- -m 16384
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost () port 0 AF_INET
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   16384   3.00       654644      0    28601.04
212992           3.00       654592           28598.77

[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.888 MB perf.data (~82481 samples) ]

perf report --stdio
...
     1.92%  netperf  [kernel.kallsyms]  [k] cache_alloc_refill
            |
            --- cache_alloc_refill
               |
               |--82.22%-- kmem_cache_alloc_node_trace
               |          __kmalloc_node_track_caller
               |          __alloc_skb
               |          alloc_skb_with_frags
               |          sock_alloc_send_pskb
               |          sock_alloc_send_skb
               |          __ip_append_data.isra.50
               |          ip_make_skb
               |          udp_sendmsg
               |          inet_sendmsg
               |          sock_sendmsg
               |          SYSC_sendto
               |          sys_sendto
               |          entry_SYSCALL_64_fastpath
               |          __sendto_nocancel
               |          |
               |           --100.00%-- 0x0
               |

Oh wait, sock_alloc_send_skb() asks for all the bytes to be in skb->head:

struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
				    int noblock, int *errcode)
{
	return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
}

Maybe one day we will avoid doing order-4 (or even order-5 in extreme
cases!) allocations for loopback as we did for af_unix :P

I mean, maybe some applications are sending 64KB UDP messages over
loopback right now...
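
To put rough numbers on the order-4 / order-5 remark, here is a minimal
userspace sketch of the arithmetic. It is not kernel code. The 4 KiB page
size and the 512-byte per-skb overhead are assumptions made for
illustration (the real overhead depends on header room and the size of
struct skb_shared_info), and order_for_size() and linear_head_order() are
hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on typical x86_64 configurations. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical per-skb overhead (headers plus trailing shared info). */
#define SKETCH_SKB_OVERHEAD 512UL

/* Smallest order such that (page_size << order) >= size. */
unsigned int order_for_size(unsigned long size, unsigned long page_size)
{
	unsigned int order = 0;
	unsigned long span = page_size;

	assert(page_size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/*
 * Order needed when the whole payload plus the assumed overhead has to
 * fit in one physically contiguous linear buffer (i.e. no page frags).
 */
unsigned int linear_head_order(unsigned long payload)
{
	return order_for_size(payload + SKETCH_SKB_OVERHEAD, SKETCH_PAGE_SIZE);
}
```

Under these assumptions, the 16384-byte messages from the netperf run above
land at order 3, a payload of about 60000 bytes at order 4, and a
maximum-size UDP datagram (65507 bytes of payload) tips over the 64KB
boundary into order 5.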