Subject: Re: [PATCH] of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
To: Rob Herring
Cc: Chintan Pandya,
    "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS",
    "linux-kernel@vger.kernel.org"
References: <1517429142-25727-1-git-send-email-frowand.list@gmail.com>
    <5dd35d8f-c430-237e-9863-2e73556f92ec@gmail.com>
From: Frank Rowand
Message-ID: <4f2b3755-9ef1-4817-7436-9f5aafb38b60@gmail.com>
Date: Thu, 1 Feb 2018 13:09:56 -0800

On 02/01/18 06:24, Rob Herring wrote:
> On Wed, Jan 31, 2018 at 3:43 PM, Frank Rowand wrote:
>> On 01/31/18 12:05, frowand.list@gmail.com wrote:
>>> From: Frank Rowand
>>>
>>> Create a cache of the nodes that contain a phandle property.  Use
>>> this cache to find the node for a given phandle value instead of
>>> scanning the devicetree to find the node.  If the phandle value is
>>> not found in the cache, of_find_node_by_phandle() will fall back to
>>> the tree scan algorithm.
>>>
>>> The cache is initialized in of_core_init().
>>>
>>> The cache is freed via a late_initcall_sync().
>>>
>>> Signed-off-by: Frank Rowand
>>> ---
>>>
>>> Some of_find_node_by_phandle() calls may occur before the cache is
>>> initialized or after it is freed.  For example, for the Qualcomm
>>> qcom-apq8074-dragonboard, 11 calls occur before the initialization
>>> and 80 occur after the cache is freed (out of 516 total calls).
>>>
>>>  drivers/of/base.c       | 85 ++++++++++++++++++++++++++++++++++++++++++++++---
>>>  drivers/of/of_private.h |  5 +++
>>>  drivers/of/resolver.c   | 21 ------------
>>>  3 files changed, 86 insertions(+), 25 deletions(-)
>>
>> Some observations....
>>
>> The size of the cache for a normal device tree would be a couple of
>> words of overhead for the cache, plus one pointer per devicetree node
>> that contains a phandle property.  This will be less space than would
>> be used by adding a hash field to each device node.  It is also less
>> space than was used by the older algorithm (long gone) that added a
>> linked list through the nodes that contained a phandle property.
>>
>> This is assuming that the values of the phandle properties are the
>> default ones created by the dtc compiler.  In the case where a very
>> large phandle property value is hand-coded in a devicetree source,
>> the size of the cache is capped at one entry per node.  In this
>> case, a little bit of space will be wasted -- but this is just a
>> sanity fallback, it should not be encountered, and can be fixed by
>> fixing the devicetree source.
>
> I don't think we should rely on how dtc allocates phandles. dtc is not
> the only source of DeviceTrees. If we could do that, then let's make

It seems like a reasonable thing to rely on.  dtc is the in-tree
compiler used to create an FDT.

Are you thinking about the IBM PPC devicetrees as devicetrees created
in some manner other than dtc?  Are there other examples you are
aware of?
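(Aside, for anyone reading along without the patch in front of them:
the mechanism under discussion has roughly the following shape.  This
is only a sketch -- the names are invented, locking and the
reserved-value checks are omitted, and the actual patch differs in
detail.)

/* sketch in the style of drivers/of/base.c; assumes the
 * for_each_of_allnodes() helper from drivers/of/of_private.h */
static struct device_node **phandle_cache;
static u32 phandle_cache_size;

struct device_node *of_find_node_by_phandle(phandle handle)
{
	struct device_node *np = NULL;

	/* phandle 0 marks "no phandle property" on a node */
	if (!handle)
		return NULL;

	/* fast path: a cache hit is a single array index */
	if (phandle_cache && handle < phandle_cache_size)
		np = phandle_cache[handle];

	/* slow path: fall back to scanning the whole tree */
	if (!np)
		for_each_of_allnodes(np)
			if (np->phandle == handle)
				break;

	of_node_get(np);
	return np;
}

So a hit costs one array index, and a miss costs no more than the
existing full-tree scan.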
If non-dtc tools create phandle property values that are not a
contiguous range of values starting with one, then the devicetrees
they create may not benefit from this performance optimization.  But
no user of such a devicetree is complaining about performance issues
with of_find_node_by_phandle() against their tree, so until there is
an issue, no big deal.

In my effort to create a new version of the FDT, I would like to
include a rule to the effect of "phandle property values created by
the compiler _should_ be in the range of 1..n, where n is the number
of phandle properties in the tree".  That would provide some assurance
that future trees will be able to benefit from this specific
optimization.

Also, this specific implementation to decrease the cost of
of_find_node_by_phandle() is just an implementation, not an
architecture.  Other implementations to achieve the same goal have
existed in the past, and yet other methods could replace this one in
the future if needed.

> them have some known flag in the upper byte so we have some hint for
> phandle values. 2^24 phandles should be enough for anyone.(TM)

I don't understand.  What is the definition of the flag?  A flag that
says the phandle property values are in the range of 1..n, where n is
the number of phandle properties in the tree?

> Your cache size is also going to balloon if the dtb was built with
> '-@'.

"Balloon" is a bit strong.  Worst case is one entry per node, which is
comparable to the old method of a linked list of nodes with phandle
properties, but with a lower of_find_node_by_phandle() cost than the
linked list implementation.  And this assumes that every node has a
label on it.

< snip (retracted) >

> Freeing after boot is nice, but if someone has lots of modules or
> large overlays, this doesn't help them at all.

The cache has to be regenerated anyway after applying an overlay that
adds phandle properties to the live tree.

Modules are something I thought about, but I did not want to
complicate the patch until we decided whether this was a good
direction to follow.  Some ways to deal with this could be:

  - don't auto-free the cache if modules are configured in the kernel,
  - repopulate the cache any time a module is loaded, or
  - add a boot command line option to specify "do not free the cache"
    (or alternatively, do not automatically free the cache but provide
    an option of "do free the cache").

For now this seems like a lot of complexity for a non-problem.  And we
don't even have the performance numbers yet to see whether this solves
the reported problem.  I'd prefer to start simple, and add complexity
if needed.

> There's still more tweaks we could do with a cache based (i.e. can
> miss) approach. We could have an access count or some most recently
> used list to avoid evicting frequently accessed phandles (your data
> tends to indicate that would help). We could have cache sets.

That seems like a lot of complexity for little or no gain.

I actually like the elegance of the patch you created, thinking that
the cache population and freeing code in my patch added a level of
complexity.  In the end, though, I think the reduced overhead of my
approach justifies the slight increase in complexity.

> And so
> far, no one has explained why a bigger cache got slower.

Yes, I still find that surprising.

> Or we could do something decentralized and make the frequent callers
> cache their phandles.

That could be a good solution if we have just one or two uses of
of_find_node_by_phandle() that are the cause of the long boot time.
That is why I was trying to get Chintan to examine and understand the
use(s) of of_find_node_by_phandle() that were causing the slow boot,
or to determine whether the problem is dispersed across a wide variety
of callers.
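(Again as illustration only: the population side, with the
one-entry-per-node sanity cap described above, might look something
like this sketch.  Names are hypothetical; the real patch also skips
reserved/illegal phandle values.)

static void of_populate_phandle_cache(void)
{
	struct device_node *np;
	u32 max_phandle = 0;
	u32 nodes = 0;

	for_each_of_allnodes(np) {
		nodes++;
		if (np->phandle && np->phandle > max_phandle)
			max_phandle = np->phandle;
	}

	/* sanity cap: never more than one entry per node, so a huge
	 * hand-coded phandle value cannot blow up the allocation */
	phandle_cache_size = min(max_phandle + 1, nodes + 1);

	phandle_cache = kcalloc(phandle_cache_size,
				sizeof(*phandle_cache), GFP_KERNEL);
	if (!phandle_cache)
		return;

	for_each_of_allnodes(np)
		if (np->phandle && np->phandle < phandle_cache_size)
			phandle_cache[np->phandle] = np;
}

With dtc's default contiguous 1..n allocation the cache holds every
phandle node; with sparse hand-coded values it degrades to a partial
cache plus the tree-scan fallback.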
>
> Rob
>