Date: Fri, 27 Apr 2018 15:19:31 -0400 (EDT)
From: Mikulas Patocka <mpatocka@redhat.com>
To: Christopher Lameter
Cc: Mike Snitzer, Vlastimil Babka, Matthew Wilcox, Pekka Enberg,
    linux-mm@kvack.org, dm-devel@redhat.com, David Rientjes, Joonsoo Kim,
    Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE
References: <20c58a03-90a8-7e75-5fc7-856facfb6c8a@suse.cz>
 <20180413151019.GA5660@redhat.com> <20180416142703.GA22422@redhat.com>
 <20180416144638.GA22484@redhat.com>

On Fri, 27 Apr 2018, Christopher Lameter wrote:

> On Thu, 26 Apr 2018, Mikulas Patocka wrote:
>
> > > Hmmm... order 4 for these caches may cause some concern. These should stay
> > > under costly order I think. Otherwise allocations are no longer
> > > guaranteed.
> >
> > You said that slub has fallback to smaller order allocations.
>
> Yes it does...
>
> > The whole purpose of this "minimize waste" approach is to use higher-order
> > allocations to use memory more efficiently, so it is just doing its job.
> > (for these 3 caches, order-4 really wastes less memory than order-3 - on
> > my system TCPv6 and sighand_cache have size 2112, task_struct 2752).
>
> Hmmm... Ok if the others are fine with this as well. I got some pushback
> there in the past.
>
> > We could improve the fallback code, so that if order-4 allocation fails,
> > it tries order-3 allocation, and then falls back to order-0. But I think
> > that these failures are rare enough that it is not a problem.
>
> I also think that would be too many fallbacks.

You are right - it's better to fall back straight to the minimum possible
order, so that the allocation is faster.

> The old code uses the concept of a "fraction" to calculate overhead. The
> code here uses absolute counts of bytes. Fraction looks better to me.

OK - I reworked the patch using the same "fraction" calculation as before.
The existing logic targets 1/16 wasted space, so I kept that target in
this patch too.

This patch increases only the order of task_struct (from 3 to 4); all the
other caches keep the same order as before.

Mikulas


From: Mikulas Patocka <mpatocka@redhat.com>
Subject: [PATCH] slub: use higher order to reduce wasted space

If we create a slub cache with a large object size (larger than
slub_max_order allows), the slub subsystem currently rounds the object
size up to the next power of two. This is inefficient, because it wastes
too much space. We use the slab cache as a buffer cache in dm-bufio; in
order to use the memory efficiently, we need to reduce wasted space.

This patch reworks the slub order calculation algorithm, so that it uses
higher-order allocations when they reduce wasted space.
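As a worked illustration of why a higher order pays off for the caches
discussed above (a stand-alone user-space sketch, not part of the patch;
it assumes PAGE_SIZE 4096 and no reserved bytes - the sizes 2112 and 2752
are the ones quoted in the thread):

#include <stdio.h>

int main(void)
{
	static const unsigned int sizes[] = { 2112, 2752 };
	unsigned int i, order;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		for (order = 3; order <= 4; order++) {
			unsigned int slab_size = 4096u << order; /* assumed page size */
			unsigned int objects = slab_size / sizes[i];
			unsigned int waste = slab_size - objects * sizes[i];

			printf("size %u, order %u: %u objects, %u bytes wasted (%.1f%%)\n",
			       sizes[i], order, objects, waste,
			       100.0 * waste / slab_size);
		}
	}
	return 0;
}

For size 2752 this prints 7.6% waste at order 3 versus 3.4% at order 4;
for size 2112, 3.3% versus 0.1%. Under the 1/16 (6.25%) target, only the
2752-byte cache exceeds the threshold at order 3, which is why only
task_struct gets bumped.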
The slub subsystem has a fallback path if a higher-order allocation fails,
so using an order above PAGE_ALLOC_COSTLY_ORDER is OK.

The new algorithm first calculates the minimum order that can be used for
a given object size and then increases the order according to these
conditions:
* if we would overshoot MAX_OBJS_PER_PAGE, don't increase
* if we are below slub_min_order, increase
* if we are below slub_max_order and below min_objects, increase
* we increase above slub_max_order only if it reduces wasted space and if
  we already waste at least 1/16 of the compound page
(a stand-alone sketch illustrating these rules follows the patch)

The new algorithm gives very similar results to the old one: all the
caches on my system keep the same order as before; only the order of
task_struct (size 2752) is increased from 3 to 4.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 mm/slub.c |   82 +++++++++++++++++++++++---------------------------------
 1 file changed, 31 insertions(+), 51 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2018-04-27 19:30:34.000000000 +0200
+++ linux-2.6/mm/slub.c	2018-04-27 21:05:53.000000000 +0200
@@ -3224,34 +3224,10 @@ static unsigned int slub_min_objects;
  * requested a higher mininum order then we start with that one instead of
  * the smallest order which will fit the object.
  */
-static inline unsigned int slab_order(unsigned int size,
-		unsigned int min_objects, unsigned int max_order,
-		unsigned int fract_leftover, unsigned int reserved)
+static int calculate_order(unsigned int size, unsigned int reserved)
 {
-	unsigned int min_order = slub_min_order;
-	unsigned int order;
-
-	if (order_objects(min_order, size, reserved) > MAX_OBJS_PER_PAGE)
-		return get_order(size * MAX_OBJS_PER_PAGE) - 1;
-
-	for (order = max(min_order, (unsigned int)get_order(min_objects * size + reserved));
-			order <= max_order; order++) {
-
-		unsigned int slab_size = (unsigned int)PAGE_SIZE << order;
-		unsigned int rem;
-
-		rem = (slab_size - reserved) % size;
-
-		if (rem <= slab_size / fract_leftover)
-			break;
-	}
-
-	return order;
-}
-
-static inline int calculate_order(unsigned int size, unsigned int reserved)
-{
-	unsigned int order;
+	unsigned int best_order;
+	unsigned int test_order;
 	unsigned int min_objects;
 	unsigned int max_objects;
@@ -3269,34 +3245,38 @@ static inline int calculate_order(unsign
 	max_objects = order_objects(slub_max_order, size, reserved);
 	min_objects = min(min_objects, max_objects);
 
-	while (min_objects > 1) {
-		unsigned int fraction;
+	/* Get the minimum acceptable order for one object */
+	best_order = get_order(size + reserved);
+
+	for (test_order = best_order + 1; test_order < MAX_ORDER; test_order++) {
+		unsigned best_order_obj = order_objects(best_order, size, reserved);
+		unsigned test_order_obj = order_objects(test_order, size, reserved);
+
+		unsigned best_order_slab_size = (unsigned int)PAGE_SIZE << best_order;
+		unsigned best_order_rem = (best_order_slab_size - reserved) % size;
+
+		/* If there would be too many objects, stop searching */
+		if (test_order_obj > MAX_OBJS_PER_PAGE)
+			break;
 
-		fraction = 16;
-		while (fraction >= 4) {
-			order = slab_order(size, min_objects,
-					slub_max_order, fraction, reserved);
-			if (order <= slub_max_order)
-				return order;
-			fraction /= 2;
-		}
-		min_objects--;
+		/* Always increase up to slub_min_order */
+		if (test_order <= slub_min_order)
+			best_order = test_order;
+
+		/* If we are below min_objects and slub_max_order, increase the order */
+		if (best_order_obj < min_objects && test_order <= slub_max_order)
+			best_order = test_order;
+
+		/* Increase the order even more, but only if it reduces waste */
+		/* If we already waste less than 1/16, don't increase it */
+		if (best_order_rem >= (best_order_slab_size / 16) &&
+				test_order_obj > (best_order_obj << (test_order - best_order)))
+			best_order = test_order;
 	}
-	/*
-	 * We were unable to place multiple objects in a slab. Now
-	 * lets see if we can place a single object there.
-	 */
-	order = slab_order(size, 1, slub_max_order, 1, reserved);
-	if (order <= slub_max_order)
-		return order;
+	if (best_order < MAX_ORDER)
+		return best_order;
 
-	/*
-	 * Doh this slab cannot be placed using slub_max_order.
-	 */
-	order = slab_order(size, 1, MAX_ORDER, 1, reserved);
-	if (order < MAX_ORDER)
-		return order;
 	return -ENOSYS;
 }
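For reference, here is a stand-alone user-space sketch of the selection
rules above (an illustration, not part of the patch; assumed constants:
PAGE_SIZE 4096, MAX_ORDER 11, reserved 0, slub_min_order 0, slub_max_order
3, and min_objects 16 - the kernel derives min_objects from the CPU count,
so real results can differ; the -ENOSYS error path is omitted):

#include <stdio.h>

#define PAGE_SIZE	4096u
#define MAX_ORDER	11
#define MAX_OBJS_PER_PAGE 32767
#define SLUB_MIN_ORDER	0u
#define SLUB_MAX_ORDER	3u
#define MIN_OBJECTS	16u

static unsigned int order_objects(unsigned int order, unsigned int size)
{
	return (PAGE_SIZE << order) / size;
}

static unsigned int pick_order(unsigned int size)
{
	unsigned int best_order = 0, test_order;

	/* like get_order(size): the minimum order that fits one object */
	while ((PAGE_SIZE << best_order) < size)
		best_order++;

	for (test_order = best_order + 1; test_order < MAX_ORDER; test_order++) {
		unsigned int best_obj = order_objects(best_order, size);
		unsigned int test_obj = order_objects(test_order, size);
		unsigned int slab_size = PAGE_SIZE << best_order;
		unsigned int rem = slab_size - best_obj * size;

		/* too many objects per slab: stop searching */
		if (test_obj > MAX_OBJS_PER_PAGE)
			break;

		/* always increase up to slub_min_order */
		if (test_order <= SLUB_MIN_ORDER)
			best_order = test_order;

		/* below min_objects and slub_max_order: increase */
		if (best_obj < MIN_OBJECTS && test_order <= SLUB_MAX_ORDER)
			best_order = test_order;

		/*
		 * Go above slub_max_order only if we currently waste at
		 * least 1/16 of the slab and the bigger slab really packs
		 * in extra objects.
		 */
		if (rem >= slab_size / 16 &&
		    test_obj > (best_obj << (test_order - best_order)))
			best_order = test_order;
	}
	return best_order;
}

int main(void)
{
	printf("size 2112 -> order %u\n", pick_order(2112)); /* prints 3 */
	printf("size 2752 -> order %u\n", pick_order(2752)); /* prints 4 */
	return 0;
}

With these assumptions the sketch picks order 3 for size 2112 and order 4
for size 2752, matching the behavior described in the changelog.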