Received: by 10.213.65.68 with SMTP id h4csp1868417imn; Thu, 29 Mar 2018 12:34:37 -0700 (PDT) X-Google-Smtp-Source: AIpwx49OWo0x2fjEyNjHspVO/c4/c1cCFTXtpXk/OmxKW1Bxb6MzoFKwkeXpZ8VUjbm5lOFDOxPA X-Received: by 2002:a17:902:1e5:: with SMTP id b92-v6mr9699791plb.78.1522352077771; Thu, 29 Mar 2018 12:34:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522352077; cv=none; d=google.com; s=arc-20160816; b=Kk8G7Gmg8x5OW4dm/CYhg+/Az/BgiE6aCmtqHkeM9cEB1G8gEmrvg7fFoo6CTzpVlg rJ5Hi25EbtXSkiMlhjDf0rT9Rfjt9WFVhOdIho7lcj95P54myxJNIk+st8LGkN0poFES Ja6qOk5ugRH3HTtxSIx4eh6SlXEhMLd5I+BLBZMRp7GrgZ+rv9FJexcvQ3cUlsHTD3+w NKwYrRgyQljfZwm+qkwNPa0YquxOjtMUITzPlfcTKtsPHCWS5AeRqqVVbvIiNmw0qis9 Jck2w0rx96MvOkTyywg1uO537FnGR82Wsf3E4Z0lGqJoZq5KzjWkKP7KOkvRz4EnCmQ0 wJwQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature:arc-authentication-results; bh=xr4OxAjSYRy9CZm8jAP9bGB41TjCk2LxALdzQOFMVQo=; b=o1jx8hmx3DCnd3XQt8S51SdYnsqKWIelDRvufFxZq0hSyGfutiXMKjBp9Dt/dK77/S 234mRWYYIGgi/oQxsaD81sz1P+pQfbcstmWKFCb8BIkD8XK4wPDLJ3MnErnGT16DO/6+ uIq67LY6jUzcMu0pieQkMrQ0LDVPq2KEr7MjyTFUKnsyi3AOlf5TuYqLbe6S6PJwRz2r 6PZcJCtT8YafkfrqsWJz8Sebl5IQMhBV9TbSx3F6KIb4/4eeiOcydr9ZTomew2kBBxSi RpGU3UQSms4EyUXIxcWCWMseS89bFuPzRq1yEOClWYCxpNLuYe7ZlyDPUGa9ECpuz16Q uE/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=YI5qilYK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id v4si4448689pgt.83.2018.03.29.12.34.23; Thu, 29 Mar 2018 12:34:37 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=YI5qilYK; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752216AbeC2TdC (ORCPT + 99 others); Thu, 29 Mar 2018 15:33:02 -0400 Received: from bombadil.infradead.org ([198.137.202.133]:57140 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751096AbeC2TdA (ORCPT ); Thu, 29 Mar 2018 15:33:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=xr4OxAjSYRy9CZm8jAP9bGB41TjCk2LxALdzQOFMVQo=; b=YI5qilYKTohdA71ehtBgKBj3Z 0r26NaJ9hsuWZpQ3JLlUaEPTCL/uOLVT40JXhVMeCYZTt87WtbkFMNF1ZZsdsCTXVdRE//8uR6GHr pdsxem+1qH0bK3M4RPMk1BHzdDOjJ+LjVq8xTy3vv6ykz3O8+N+g26XSTz5n2NJW4SLQn1tLW2l+r BsBRl0+6FOYi+SY2VsrgpmCUSb2oXm7TDazhCvbzjlVQ71wM5uNEmhA/itqCDgtlBWpif2QxpXc4p WIWv8/AORpDz54YFPKFxVL61Lc3hVx8LYSMlPHFKMlWmz9swkp6BovifA765JIsQ+A4BRwcURSNAa 92F5Z8Chw==; Received: from willy by bombadil.infradead.org with local (Exim 4.90_1 #2 (Red Hat Linux)) id 1f1dI6-0005lG-8X; Thu, 29 Mar 2018 19:32:54 +0000 Date: Thu, 29 Mar 2018 12:32:54 -0700 From: Matthew Wilcox To: Manfred Spraul Cc: Davidlohr Bueso , Waiman Long , Michael Kerrisk , "Eric W. Biederman" , "Luis R. Rodriguez" , Kees Cook , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, Andrew Morton , Al Viro , Stanislav Kinsbursky , Linux Containers , linux-api@vger.kernel.org Subject: Re: [RFC][PATCH] ipc: Remove IPCMNI Message-ID: <20180329193254.GA22300@bombadil.infradead.org> References: <87o9jru3bf.fsf@xmission.com> <935a7c50-50cc-2dc0-33bb-92c000d039bc@redhat.com> <87woyego2u.fsf_-_@xmission.com> <047c6ed6-6581-b543-ba3d-cadc543d3d25@redhat.com> <87h8ph6u67.fsf@xmission.com> <7d3a1f93-f8e5-5325-f9a7-0079f7777b6f@redhat.com> <20180329021409.gcjjrmviw2lckbfk@linux-n805> <3e201de2-bed2-6f7d-0783-700d095142e0@colorfullife.com> <20180329105601.GA597@bombadil.infradead.org> <05772f83-d680-aea1-b222-cef2430dcc83@colorfullife.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <05772f83-d680-aea1-b222-cef2430dcc83@colorfullife.com> User-Agent: Mutt/1.9.2 (2017-12-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Mar 29, 2018 at 08:07:44PM +0200, Manfred Spraul wrote: > Hello Mathew, > > On 03/29/2018 12:56 PM, Matthew Wilcox wrote: > > On Thu, Mar 29, 2018 at 10:47:45AM +0200, Manfred Spraul wrote: > > > > > > > > This can be implemented trivially with the current code > > > > > > > > using idr_alloc_cyclic. > > > Is there a performance impact? > > > Right now, the idr tree is only large if there are lots of objects. > > > What happens if we have only 1 object, with id=INT_MAX-1? > > The radix tree uses a branching factor of 64 entries (6 bits) per level. > > The maximum ID is 31 bits (positive signed 32-bit integer). So the > > worst case for a single object is 6 pointer dereferences to find the > > object anywhere in the range (INT_MAX/2 - INT_MAX]. That will read 12 > > cachelines. If we were to constrain ourselves to a maximum of INT_MAX/2 > > (30 bits), we'd reduce that to 5 pointer dereferences and 10 cachelines. > I'm concerned about the up to 6 branches. > But this is just guessing, we need a test with a realistic workload. Yes, and once there's a realistic workload, I'll be happy to prioritise adapting the data structure to reduce the pointer chases. FWIW, the plan is this: There's currently an unused 32-bit field (on 64-bit machines) which I plan to make the 'base' field. So at each step down the tree, one subtracts that field from the index in order to decide which slot to look at next. Something like this: index=0x3000'0000 (root) -> order=24 offset=48 -> order=18 offset=0 -> order=12 offset=0 -> order=6 offset=0 -> order=0 offset=0 -> data compresses to a single node: (root) -> order=0 base=3000'0000 offset=0 -> data If one has one entry at 5 and another entry at 0x3000'0000, the tree looks like this (three nodes): (root) -> order=24 base=0 offset=0 -> order=0 base=0 offset=5 -> entry1 offset=48 -> order=0 base=0 offset=0 -> entry2 The trick is making sure that looking up offset 0x300'1000 returns NULL and not entry2, but I can make that work. An alternative is to go to something a little more libJudy and have mutating internal nodes in the tree that can represent this kind of situation in a more compact form. There's a tradeoff to be made between simplicity of implementation, cost of insertion, cost of lookup and memory consumption. I don't know where the right balance point is yet.