Date: Tue, 17 Apr 2018 13:26:51 -0400 (EDT)
From: Mikulas Patocka
To: Vlastimil Babka
cc: Christopher Lameter, Mike Snitzer, Matthew Wilcox, Pekka Enberg,
    linux-mm@kvack.org, dm-devel@redhat.com, David Rientjes, Joonsoo Kim,
    Andrew Morton, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND] slab: introduce the flag SLAB_MINIMIZE_WASTE

On Tue, 17 Apr 2018, Vlastimil Babka wrote:

> On 04/17/2018 04:45 PM, Christopher Lameter wrote:
> > On Mon, 16 Apr 2018, Mikulas Patocka wrote:
> >
> >> This patch introduces a flag SLAB_MINIMIZE_WASTE for slab and slub. This
> >> flag causes allocation of larger slab caches in order to minimize wasted
> >> space.
> >>
> >> This is needed because we want to use dm-bufio for deduplication index and
> >> there are existing installations with non-power-of-two block sizes (such
> >> as 640KB). The performance of the whole solution depends on efficient
> >> memory use, so we must waste as little memory as possible.
> >
> > Hmmm. Can we come up with a generic solution instead?
>
> Yes please.
>
> > This may mean relaxing the enforcement of the allocation max order a bit
> > so that we can get dense allocation through higher order allocs.
> >
> > But then higher order allocs are generally seen as problematic.
>
> I think in this case they are better than wasting/fragmenting 384kB for
> 640kB object.

Wasting 37% of memory is still better than the kernel randomly returning
-ENOMEM when a higher-order allocation fails.

> > That
> > means that callers need to be able to tolerate failures.
>
> Is it any different from now? I suppose there would still be
> smallest-order fallback involved in sl*b itself? And if your allocation
> is so large it can fail even with the fallback (i.e. >= costly order),
> you need to tolerate failures anyway?
>
> One corner case I see is if there is anyone who would rather use their
> own fallback instead of the space-wasting smallest-order fallback.
> Maybe we could map some GFP flag to indicate that.

For example, if you create a cache with 17KB objects, the slab subsystem
will pad it up to 32KB. You are wasting almost half of the memory, but the
allocation is reliable and it won't fail.

If you use an order higher than 32KB, you get less wasted memory, but you
also get random -ENOMEMs (yes, we had a problem in dm-thin where it was
randomly failing during initialization due to a 64KB allocation).

Mikulas
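
The waste figures quoted in this thread (384kB for a 640kB object, and almost half for a 17KB object) can be checked with a few lines of arithmetic. The sketch below is only an illustration, not kernel code: it assumes the simplified model described above, in which each object is padded up to the next power-of-two size, and the helper names round_up_pow2 and padding_waste are made up for this example.

```python
def round_up_pow2(n):
    """Return the smallest power of two that is >= n (for n > 0)."""
    return 1 << (n - 1).bit_length()


def padding_waste(object_kb):
    """Return (padded size in KB, wasted KB, wasted fraction of the padded size).

    Assumption: the allocator pads every object to the next power of two,
    which is the simplified model used in the discussion, not the full
    slab/slub sizing logic.
    """
    padded = round_up_pow2(object_kb)
    wasted = padded - object_kb
    return padded, wasted, wasted / padded


if __name__ == "__main__":
    # The two object sizes mentioned in the thread.
    for size_kb in (640, 17):
        padded, wasted, frac = padding_waste(size_kb)
        print(f"{size_kb}KB object -> {padded}KB slot, "
              f"{wasted}KB wasted ({frac:.1%})")
```

Under this model a 640KB object occupies a 1024KB slot and a 17KB object occupies a 32KB slot, which matches the 384kB and "almost half" figures given in the messages.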