Date: Thu, 26 Apr 2018 17:09:54 -0400 (EDT)
From: Mikulas Patocka
To: Christopher Lameter
cc: Mike Snitzer, Vlastimil Babka, Matthew Wilcox, Pekka Enberg,
    linux-mm@kvack.org, dm-devel@redhat.com, David Rientjes, Joonsoo Kim,
    Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE
References: <20c58a03-90a8-7e75-5fc7-856facfb6c8a@suse.cz>
 <20180413151019.GA5660@redhat.com> <20180416142703.GA22422@redhat.com>
 <20180416144638.GA22484@redhat.com>
User-Agent: Alpine 2.02 (LRH 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Thu, 26 Apr 2018, Christopher Lameter wrote:

> On Wed, 25 Apr 2018, Mikulas Patocka wrote:
>
> > Do you want this? It deletes slab_order and replaces it with the
> > "minimize_waste" logic directly.
>
> Well yes that looks better. Now we need to make it easy to read and less
> complicated. Maybe try to keep as much as possible of the old code
> and also the names of variables to make it easier to review?
>
> > It simplifies the code and it is very similar to the old algorithm;
> > most slab caches end up with the same order, so it shouldn't cause any
> > regressions.
> >
> > This patch changes the order of these slabs:
> > TCPv6: 3 -> 4
> > sighand_cache: 3 -> 4
> > task_struct: 3 -> 4
>
> Hmmm... order 4 for these caches may cause some concern. These should stay
> under costly order I think. Otherwise allocations are no longer
> guaranteed.

You said that slub has a fallback to smaller-order allocations.

The whole purpose of this "minimize waste" approach is to use higher-order
allocations so that memory is used more efficiently, so it is just doing
its job. (For these 3 caches, order-4 really does waste less memory than
order-3 - on my system TCPv6 and sighand_cache have size 2112 and
task_struct 2752.)

We could improve the fallback code so that if an order-4 allocation fails,
it tries an order-3 allocation and then falls back to order-0. But I think
these failures are rare enough that they are not a problem.
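To make the waste argument concrete: assuming 4k pages (so an order-3 slab
is 32k and an order-4 slab is 64k) and ignoring the per-slab reserved
bytes, here is a trivial stand-alone user-space calculation (the waste()
helper exists only for this illustration, it is not kernel code):

#include <stdio.h>

/* bytes left over in one slab of the given order for a given object size */
static void waste(unsigned int size, unsigned int order)
{
	unsigned int slab_size = 4096u << order;	/* assume 4k pages */
	unsigned int objects = slab_size / size;
	unsigned int wasted = slab_size - objects * size;

	printf("size %u, order %u: %u objects, %u bytes wasted (%.1f%%)\n",
	       size, order, objects, wasted, 100.0 * wasted / slab_size);
}

int main(void)
{
	waste(2112, 3);	/* TCPv6, sighand_cache: 15 objects, 1088 bytes wasted */
	waste(2112, 4);	/* 31 objects, only 64 bytes wasted */
	waste(2752, 3);	/* task_struct: 11 objects, 2496 bytes wasted */
	waste(2752, 4);	/* 23 objects, 2240 bytes wasted */
	return 0;
}

That is roughly 3.3% waste at order-3 vs. 0.1% at order-4 for the
2112-byte caches, and 7.6% vs. 3.4% for task_struct.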
> > @@ -3269,35 +3245,35 @@ static inline int calculate_order(unsign
> >  		max_objects = order_objects(slub_max_order, size, reserved);
> >  	min_objects = min(min_objects, max_objects);
> >
> > -	while (min_objects > 1) {
> > -		unsigned int fraction;
> > +	/* Get the minimum acceptable order for one object */
> > +	order = get_order(size + reserved);
> > +
> > +	for (test_order = order + 1; test_order < MAX_ORDER; test_order++) {
> > +		unsigned order_obj = order_objects(order, size, reserved);
> > +		unsigned test_order_obj = order_objects(test_order, size, reserved);
> > +
> > +		/* If there are too many objects, stop searching */
> > +		if (test_order_obj > MAX_OBJS_PER_PAGE)
> > +			break;
> >
> > -		fraction = 16;
> > -		while (fraction >= 4) {
> > -			order = slab_order(size, min_objects,
> > -				slub_max_order, fraction, reserved);
> > -			if (order <= slub_max_order)
> > -				return order;
> > -			fraction /= 2;
> > -		}
> > -		min_objects--;
> > +		/* Always increase up to slub_min_order */
> > +		if (test_order <= slub_min_order)
> > +			order = test_order;
>
> Well that is a significant change. In our current scheme the order
> boundary wins.

I think it's not a change.

The existing function slab_order() starts with min_order (unless that
overshoots MAX_OBJS_PER_PAGE) and then goes upwards.

My code does the same - it checks MAX_OBJS_PER_PAGE (and bails out if we
would overshoot it) and increases the order until it reaches
slub_min_order (and then increases it even more if it satisfies the other
conditions).

If you believe that it behaves differently, please describe the situation
in detail.

> > +
> > +		/* If we are below min_objects and slub_max_order, increase order */
> > +		if (order_obj < min_objects && test_order <= slub_max_order)
> > +			order = test_order;
> > +
> > +		/* Increase order even more, but only if it reduces waste */
> > +		if (test_order_obj <= 32 &&
>
> Where does the 32 come from?

It is there to avoid extremely high orders for extremely small slabs.

For example, take kmalloc-96:
  10922 96-byte objects would fit into 1MiB
  21845 96-byte objects would fit into 2MiB

The algorithm would treat the one extra object that fits into a 2MiB slab
as "waste reduction" and increase the order to 2MiB - and we don't want
that.

So the general reasoning is: once we have 32 objects in a slab, the wasted
space is already considered reasonably low and we don't want to increase
the order any further.

Currently, kmalloc-96 uses order-0 and that is reasonable - we already
have 42 objects in a 4k page, so we don't need a higher order, even if it
would waste one object less.

> > +		    test_order_obj > order_obj << (test_order - order))
>
> Add more () to make the condition better readable.
>
> > +			order = test_order;
>
> Can we just call test_order order and avoid using the long variable names
> here? Variable names in functions are typically short.

You need two variables - "order" and "test_order". "order" is the best
order found so far and "test_order" is the order that we are currently
testing. If "test_order" wastes less space than "order", we assign
order = test_order.

Mikulas
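PS: for anyone who wants to experiment with the heuristic outside the
kernel, below is a stand-alone user-space model of the proposed loop. The
constants are only the common defaults assumed for this sketch (4k pages,
MAX_ORDER 11, MAX_OBJS_PER_PAGE 32767, slub_min_order 0, slub_max_order 3,
min_objects 16), and order_objects()/model_get_order() are simplified
reimplementations, so it only approximates what the real calculate_order()
would pick:

#include <stdio.h>

#define MODEL_PAGE_SIZE		4096u
#define MODEL_MAX_ORDER		11
#define MAX_OBJS_PER_PAGE	32767u
#define SLUB_MIN_ORDER		0u
#define SLUB_MAX_ORDER		3u	/* "costly" order boundary */
#define MIN_OBJECTS		16u

/* how many objects fit into one slab of the given order */
static unsigned int order_objects(unsigned int order, unsigned int size,
				  unsigned int reserved)
{
	return ((MODEL_PAGE_SIZE << order) - reserved) / size;
}

/* smallest order whose slab holds at least one object */
static unsigned int model_get_order(unsigned int bytes)
{
	unsigned int order = 0;

	while ((MODEL_PAGE_SIZE << order) < bytes)
		order++;
	return order;
}

static unsigned int calculate_order(unsigned int size, unsigned int reserved)
{
	unsigned int order, test_order;

	/* Get the minimum acceptable order for one object */
	order = model_get_order(size + reserved);

	for (test_order = order + 1; test_order < MODEL_MAX_ORDER; test_order++) {
		unsigned int order_obj = order_objects(order, size, reserved);
		unsigned int test_order_obj = order_objects(test_order, size, reserved);

		/* If there are too many objects, stop searching */
		if (test_order_obj > MAX_OBJS_PER_PAGE)
			break;

		/* Always increase up to slub_min_order */
		if (test_order <= SLUB_MIN_ORDER)
			order = test_order;

		/* If we are below min_objects and slub_max_order, increase order */
		if (order_obj < MIN_OBJECTS && test_order <= SLUB_MAX_ORDER)
			order = test_order;

		/* Increase order even more, but only if it reduces waste */
		if (test_order_obj <= 32 &&
		    test_order_obj > (order_obj << (test_order - order)))
			order = test_order;
	}
	return order;
}

int main(void)
{
	printf("kmalloc-96 (96):        order %u\n", calculate_order(96, 0));
	printf("TCPv6/sighand (2112):   order %u\n", calculate_order(2112, 0));
	printf("task_struct (2752):     order %u\n", calculate_order(2752, 0));
	return 0;
}

With these assumptions it prints order 0 for kmalloc-96 and order 4 for
the 2112-byte and 2752-byte caches, which matches the numbers quoted
above.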