Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S941766AbcLWRTx (ORCPT ); Fri, 23 Dec 2016 12:19:53 -0500 Received: from mail-sn1nam02on0115.outbound.protection.outlook.com ([104.47.36.115]:57536 "EHLO NAM02-SN1-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932280AbcLWRTw (ORCPT ); Fri, 23 Dec 2016 12:19:52 -0500 From: Matthew Wilcox To: Rasmus Villemoes CC: Tejun Heo , "linux-kernel@vger.kernel.org" , Lai Jiangshan , "Jens Axboe" , Greg Kroah-Hartman , "linux-block@vger.kernel.org" , "dri-devel@lists.freedesktop.org" , "Andrew Morton" Subject: RE: [RFC 00/10] implement alternative and much simpler id allocator Thread-Topic: [RFC 00/10] implement alternative and much simpler id allocator Thread-Index: AQHSUme+SjIWPmjllE++nkCQM7WotKEKtg/QgABcHJ+AAAiNEIAACqdQgAmQ8VKAAPanYA== Date: Fri, 23 Dec 2016 17:03:51 +0000 Message-ID: References: <1481160187-9652-1-git-send-email-linux@rasmusvillemoes.dk> <20161209140140.5e0a68e2e1cf9861335bdf3b@linux-foundation.org> <87inqj8vjo.fsf@rasmusvillemoes.dk> <87pokjjznr.fsf@rasmusvillemoes.dk> In-Reply-To: <87pokjjznr.fsf@rasmusvillemoes.dk> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=mawilcox@microsoft.com; x-originating-ip: [76.10.168.227] x-ms-office365-filtering-correlation-id: 4c8b2f5e-0256-4979-0f30-08d42b55aba7 x-microsoft-antispam: UriScan:;BCL:0;PCL:0;RULEID:(22001);SRVR:BY2PR21MB0036; x-microsoft-exchange-diagnostics: 1;BY2PR21MB0036;7:nJqp6ZbUtizS4b4thiJaTH3XTbV/xpRlDuokheOKi9MfwFJjA/czBLjaqeP68ksAUk9u1nV++5T6wCi4Q9gf37R2WEhErw2FVCZPbAQJujq1Z0pSVFgmrvXJEUCJRZUPA3y2Rsu+pOK/fEAwiG1KjgqgIfigC6MNkfpXK3LL8tBlrQT2Z+B/fMhim2lmZ4pzK2zAB4LYIlgHjpMBmx0Ganmuxmf1mfFgRC5yHOpP0NvxyEl7FOA7us+G/VI2PxqiYrKksCY5sZWAvP2NH3QxvJ5saKdfj/Gx+7UVnmiBifmO1FDew1AXXyp/xwRCJdKrzJGwd8Wgopb7nXs93DiSQDQ9PvhkFzUMd2UaPCcmzL81K/H6GUQdUnYx6P0c4rBakwMG9EvoEvAow/bGKpdWOp5vssNPtfUw20bV5qVG7oiMFmg5OEa3QhbKkWgXSAaNfdPPN/1+MbDrMXVmvsTEKArL+SMFhOgiTAp2PTU/70A= x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:; x-exchange-antispam-report-cfa-test: BCL:0;PCL:0;RULEID:(61425038)(6040375)(601004)(2401047)(8121501046)(5005006)(3002001)(10201501046)(6055026)(61426038)(61427038)(6041248)(20161123564025)(20161123562025)(20161123555025)(20161123560025)(6072148);SRVR:BY2PR21MB0036;BCL:0;PCL:0;RULEID:;SRVR:BY2PR21MB0036; x-forefront-prvs: 016572D96D x-forefront-antispam-report: SFV:NSPM;SFS:(10019020)(6009001)(7916002)(39410400002)(39850400002)(39860400002)(39840400002)(39450400003)(199003)(189002)(3846002)(102836003)(8936002)(3280700002)(189998001)(101416001)(6116002)(4001150100001)(77096006)(38730400001)(2906002)(39060400001)(8676002)(97736004)(122556002)(6436002)(81166006)(54356999)(76176999)(4326007)(81156014)(50986999)(68736007)(76576001)(6506006)(7696004)(86362001)(110136003)(5660300001)(106356001)(6916009)(2950100002)(9686002)(86612001)(99286002)(33656002)(5005710100001)(74316002)(10290500002)(229853002)(7736002)(105586002)(10090500001)(106116001)(93886004)(66066001)(3660700001)(305945005)(25786008)(8990500004)(2900100001)(92566002);DIR:OUT;SFP:1102;SCL:1;SRVR:BY2PR21MB0036;H:BY2PR21MB0036.namprd21.prod.outlook.com;FPR:;SPF:None;PTR:InfoNoRecords;A:1;MX:1;LANG:en; spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 X-OriginatorOrg: microsoft.com X-MS-Exchange-CrossTenant-originalarrivaltime: 23 Dec 2016 17:03:51.5805 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: 72f988bf-86f1-41af-91ab-2d7cd011db47 X-MS-Exchange-Transport-CrossTenantHeadersStamped: BY2PR21MB0036 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by mail.home.local id uBNHJwUI002350 Content-Length: 2832 Lines: 69 From: Rasmus Villemoes [mailto:linux@rasmusvillemoes.dk] > Nice work! A few random comments/questions: > > - It does add some complexity, but I think a few comments would make it > more digestable. I'm open to adding some comments ... I need some time between writing the code and writing the comments to be sure what comments are useful. > - Hm, maybe I'm confused, and I certainly don't understand how the whole > radix tree works. But do you use every leaf node as an exceptional > entry initially, to allocate the first 62 ids from that level? This > code I do! And that question tells me one useful comment to add! > if ((bit + RADIX_TREE_EXCEPTIONAL_SHIFT) < > BITS_PER_LONG) { > bit += RADIX_TREE_EXCEPTIONAL_SHIFT; > radix_tree_iter_replace(root, &iter, slot, > (void *)((1UL << bit) | > RADIX_TREE_EXCEPTIONAL_ENTRY)); > *id = new; > return 0; > } > > operates on bit, which is the offset from index*IDA_BITMAP_BITS, and > it seems to create an exceptional entry somewhere down the tree > (which may of course be the root). > > But we don't seem to allocate another bit from that exceptional entry > ever unless it happened to sit at index 0; the code > > unsigned long tmp = (unsigned long)bitmap; > if (start < BITS_PER_LONG) { > unsigned tbit = find_next_zero_bit(&tmp, > BITS_PER_LONG, start); > if (tbit < BITS_PER_LONG) { > tmp |= 1UL << tbit; > rcu_assign_pointer(*slot, (void *)tmp); > *id = new + tbit - > RADIX_TREE_EXCEPTIONAL_SHIFT; > return 0; > } > } > > is only used for small values of start (though it does seem to > account for a non-zero value of new == iter.index * IDA_BITMAP_BITS). Ahh. You're reading the code wrong, and I wrote the code wrongly too, so it's clearly overly complex. I was testing with 'start' set to 0, allocating N entries, and then freeing them. If I'd set start to 1024 and allocated two entries, I'd've seen the failure. Please see the top commit here ("Improve IDA exceptional entry handling"): http://git.infradead.org/users/willy/linux-dax.git/shortlog/refs/heads/idr-2016-12-20 > - In the micro-optimization department, I think we should avoid > find_next_zero_bit on a single word, and instead use __ffs. Something > like (assuming we can use bit instead of start) > > if (bit + RADIX_TREE_EXCEPTIONAL_SHIFT < BITS_PER_LONG) { > tmp = (~(unsigned long)bitmap) >> (bit + RADIX_TREE_EXCEPTIONAL_SHIFT); > if (tmp) { > tbit = __ffs(tmp) + bit + RADIX_TREE_EXCEPTIONAL_SHIFT; > tmp = (unsigned long)bitmap | (1UL << tbit); > rcu_assign_pointer(*slot, (void *)tmp); > *id = new + tbit - RADIX_TREE_EXCEPTIONAL_SHIFT; > return 0; > } > } I'm certainly open to microoptimisations, but I'll have to think about this one for a bit.