Date: Wed, 17 Apr 2019 15:38:52 +0200
From: Michal Hocko
To: Jesper Dangaard Brouer
Harding" , Andrew Morton , Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Tejun Heo , Qian Cai , Linus Torvalds , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mel Gorman , "netdev@vger.kernel.org" , Alexander Duyck Subject: Re: [PATCH 0/1] mm: Remove the SLAB allocator Message-ID: <20190417133852.GL5878@dhcp22.suse.cz> References: <20190410024714.26607-1-tobin@kernel.org> <20190410081618.GA25494@eros.localdomain> <20190411075556.GO10383@dhcp22.suse.cz> <262df687-c934-b3e2-1d5f-548e8a8acb74@iki.fi> <20190417105018.78604ad8@carbon> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190417105018.78604ad8@carbon> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed 17-04-19 10:50:18, Jesper Dangaard Brouer wrote: > On Thu, 11 Apr 2019 11:27:26 +0300 > Pekka Enberg wrote: > > > Hi, > > > > On 4/11/19 10:55 AM, Michal Hocko wrote: > > > Please please have it more rigorous then what happened when SLUB was > > > forced to become a default > > > > This is the hard part. > > > > Even if you are able to show that SLUB is as fast as SLAB for all the > > benchmarks you run, there's bound to be that one workload where SLUB > > regresses. You will then have people complaining about that (rightly so) > > and you're again stuck with two allocators. > > > > To move forward, I think we should look at possible *pathological* cases > > where we think SLAB might have an advantage. For example, SLUB had much > > more difficulties with remote CPU frees than SLAB. Now I don't know if > > this is the case, but it should be easy to construct a synthetic > > benchmark to measure this. > > I do think SLUB have a number of pathological cases where SLAB is > faster. If was significantly more difficult to get good bulk-free > performance for SLUB. SLUB is only fast as long as objects belong to > the same page. To get good bulk-free performance if objects are > "mixed", I coded this[1] way-too-complex fast-path code to counter > act this (joined work with Alex Duyck). > > [1] https://github.com/torvalds/linux/blob/v5.1-rc5/mm/slub.c#L3033-L3113 How often is this a real problem for real workloads? > > For example, have a userspace process that does networking, which is > > often memory allocation intensive, so that we know that SKBs traverse > > between CPUs. You can do this by making sure that the NIC queues are > > mapped to CPU N (so that network softirqs have to run on that CPU) but > > the process is pinned to CPU M. > > If someone want to test this with SKBs then be-aware that we netdev-guys > have a number of optimizations where we try to counter act this. (As > minimum disable TSO and GRO). > > It might also be possible for people to get inspired by and adapt the > micro benchmarking[2] kernel modules that I wrote when developing the > SLUB and SLAB optimizations: > > [2] https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/mm While microbenchmarks are good to see pathological behavior, I would be really interested to see some numbers for real world usecases. > > It's, of course, worth thinking about other pathological cases too. > > Workloads that cause large allocations is one. Workloads that cause lots > > of slab cache shrinking is another. > > I also worry about long uptimes when SLUB objects/pages gets too > fragmented... 
> > For example, have a userspace process that does networking, which is
> > often memory-allocation intensive, so that we know that SKBs traverse
> > between CPUs. You can do this by making sure that the NIC queues are
> > mapped to CPU N (so that network softirqs have to run on that CPU)
> > but the process is pinned to CPU M.
>
> If someone wants to test this with SKBs, be aware that we netdev guys
> have a number of optimizations that try to counteract exactly this (as
> a minimum, disable TSO and GRO).
>
> It might also be possible for people to get inspired by, and adapt,
> the micro-benchmarking[2] kernel modules that I wrote when developing
> the SLUB and SLAB optimizations:
>
> [2] https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/mm

While microbenchmarks are good for exposing pathological behavior, I
would be really interested to see some numbers for real-world use cases.

> > It's, of course, worth thinking about other pathological cases too.
> > Workloads that cause large allocations are one. Workloads that cause
> > lots of slab cache shrinking are another.
>
> I also worry about long uptimes where SLUB objects/pages get too
> fragmented... as I said, SLUB is only efficient when objects are
> returned to the same page, while SLAB is not constrained this way.

Is this something that has actually been measured in a real deployment?
--
Michal Hocko
SUSE Labs
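(Editor's postscript: a sketch of the remote-free experiment Pekka
proposes above, reduced to its slab-level core: one kthread pinned to
CPU 0 allocates, another pinned to CPU 1 frees, so every free is remote.
This is not one of Jesper's modules from [2]; all xfree_* names are
invented, and it assumes at least two online CPUs.)

/*
 * xfree: hypothetical sketch. A kthread pinned to CPU 0 allocates
 * objects and hands them over a lock-less list to a kthread pinned to
 * CPU 1, which frees them, so every free hits the remote-free path.
 */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/llist.h>
#include <linux/atomic.h>

#define XFREE_MAX_OUTSTANDING	16384

struct xfree_obj {
	struct llist_node node;
	char payload[248];		/* pad to a realistic object size */
};

static struct kmem_cache *xfree_cache;
static LLIST_HEAD(xfree_queue);
static atomic_t xfree_outstanding = ATOMIC_INIT(0);
static struct task_struct *xfree_alloc_task, *xfree_free_task;

static int xfree_alloc_fn(void *unused)
{
	while (!kthread_should_stop()) {
		/* Bound the backlog so the module cannot eat all memory. */
		if (atomic_read(&xfree_outstanding) < XFREE_MAX_OUTSTANDING) {
			struct xfree_obj *obj =
				kmem_cache_alloc(xfree_cache, GFP_KERNEL);

			if (obj) {
				atomic_inc(&xfree_outstanding);
				llist_add(&obj->node, &xfree_queue);
			}
		}
		cond_resched();
	}
	return 0;
}

static int xfree_free_fn(void *unused)
{
	while (!kthread_should_stop()) {
		struct llist_node *batch = llist_del_all(&xfree_queue);
		struct xfree_obj *obj, *tmp;

		/* Frees run on CPU 1, allocations ran on CPU 0: all remote. */
		llist_for_each_entry_safe(obj, tmp, batch, node) {
			kmem_cache_free(xfree_cache, obj);
			atomic_dec(&xfree_outstanding);
		}
		cond_resched();
	}
	return 0;
}

static int __init xfree_init(void)
{
	xfree_cache = kmem_cache_create("xfree", sizeof(struct xfree_obj),
					0, 0, NULL);
	if (!xfree_cache)
		return -ENOMEM;

	xfree_alloc_task = kthread_create(xfree_alloc_fn, NULL, "xfree-alloc");
	if (IS_ERR(xfree_alloc_task))
		goto out_cache;
	xfree_free_task = kthread_create(xfree_free_fn, NULL, "xfree-free");
	if (IS_ERR(xfree_free_task)) {
		kthread_stop(xfree_alloc_task);
		goto out_cache;
	}

	kthread_bind(xfree_alloc_task, 0);	/* "CPU N" */
	kthread_bind(xfree_free_task, 1);	/* "CPU M" */
	wake_up_process(xfree_alloc_task);
	wake_up_process(xfree_free_task);
	return 0;

out_cache:
	kmem_cache_destroy(xfree_cache);
	return -ENOMEM;
}

static void __exit xfree_exit(void)
{
	struct llist_node *batch;
	struct xfree_obj *obj, *tmp;

	kthread_stop(xfree_alloc_task);
	kthread_stop(xfree_free_task);

	/* Drain what is still queued before destroying the cache. */
	batch = llist_del_all(&xfree_queue);
	llist_for_each_entry_safe(obj, tmp, batch, node)
		kmem_cache_free(xfree_cache, obj);
	kmem_cache_destroy(xfree_cache);
}

module_init(xfree_init);
module_exit(xfree_exit);
MODULE_LICENSE("GPL");

The module only generates the pathological traffic pattern; one would
compare SLAB and SLUB kernels under it with perf stat or by timing the
free loop, and drop the kthread_bind() calls to get a local-free
baseline.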