Subject: Re: [PATCH] of: cache phandle nodes to decrease cost of of_find_node_by_phandle()
From: Frank Rowand
To: Chintan Pandya, Rob Herring
Cc: "open list:OPEN FIRMWARE AND FLATTENED DEVICE TREE BINDINGS", linux-kernel@vger.kernel.org
Date: Fri, 2 Feb 2018 19:55:35 -0800

On 02/01/18 21:53, Chintan Pandya wrote:
>
> On 2/2/2018 2:39 AM, Frank Rowand wrote:
>> On 02/01/18 06:24, Rob Herring wrote:
>>> And so far, no one has explained why a bigger cache got slower.
>>
>> Yes, I still find that surprising.
>
> I thought a bit about this and realized that increasing the cache size
> should improve performance only if there are too many misses with the
> smaller cache.  So, from my experiments some time back, I looked up the
> logs and examined the access pattern.  It seems there is *not_too_much*
> juggling during lookup by phandle.
>
> See the access pattern here:
> https://drive.google.com/file/d/1qfAD8OsswNJABgAwjJf6Gr_JZMeK7rLV/view?usp=sharing

Thanks!  Very interesting.

I was somewhat limited in playing detective with this, because the phandle
values are not consistent with the dts file you are currently working with
(arch/arm64/boot/dts/qcom/sda670-mtp.dts).  For example, I could not
determine the target nodes of the hot phandle values.  That information
_could_ possibly point at algorithms within the devicetree core code that
could be improved.  Or maybe not; hard to tell until actually looking at
the data.

Anyway, some observations were possible.  There are 485 unique phandle
values searched for.  The ten phandle values most frequently referenced
account for 3932 / 6745 (or 58%) of all references.  Without the
corresponding devicetree I can not tell how many nodes need to be scanned
to locate each of these ten values (using the existing algorithm), so I
can not determine how much scanning would be eliminated by caching just
the nodes corresponding to these ten phandle values.

There are 89 phandle values that were searched for 10 times or more,
accounting for 86% of the searches.  Only 164 phandle values were searched
for just once, and 303 were searched for only once or twice.
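As an aside, a breakdown like the one just below can be reproduced from the
"OF: want to search this <phandle>" trace lines with a short user-space
helper.  This is only a rough sketch, assuming the log format matches the
sample quoted later in this mail; the program name and everything in it are
made up for illustration:

/* tally_phandles.c - hypothetical helper, not part of any kernel tree.
 *
 * Reads a boot log on stdin, counts how often each phandle appears in
 * "OF: want to search this <phandle>" lines, and prints a cumulative
 * breakdown, most-referenced phandles first.
 *
 * Build: gcc -O2 -o tally_phandles tally_phandles.c
 * Use:   ./tally_phandles < boot.log
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PHANDLE 4096   /* assumption: phandle values stay below this */

static unsigned long counts[MAX_PHANDLE];

static int cmp_desc(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x < y) - (x > y);              /* sort counts descending */
}

int main(void)
{
	static const char tag[] = "want to search this ";
	char line[512];
	unsigned long total = 0, running = 0;

	while (fgets(line, sizeof(line), stdin)) {
		char *p = strstr(line, tag);
		unsigned long ph;

		if (!p)
			continue;
		ph = strtoul(p + strlen(tag), NULL, 10);
		if (ph < MAX_PHANDLE) {
			counts[ph]++;
			total++;
		}
	}

	qsort(counts, MAX_PHANDLE, sizeof(counts[0]), cmp_desc);

	for (unsigned int i = 0; i < MAX_PHANDLE && counts[i]; i++) {
		running += counts[i];
		printf("top %3u values cover %5lu searches (%3lu%%)\n",
		       i + 1, running, total ? running * 100 / total : 0);
	}
	return 0;
}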
Here is a more complete picture:

     10 values each used 100 or more times; searches: 3932   58%
     11 values each used  90 or more times; searches: 3994   59%
     12 values each used  80 or more times; searches: 4045   60%
     13 values each used  70 or more times; searches: 4093   61%
     14 values each used  60 or more times; searches: 4136   61%
     15 values each used  50 or more times; searches: 4178   62%
     18 values each used  40 or more times; searches: 4300   64%
     32 values each used  30 or more times; searches: 4774   71%
     54 values each used  20 or more times; searches: 5293   78%
     89 values each used  10 or more times; searches: 5791   86%
     93 values each used   9 or more times; searches: 5827   86%
    117 values each used   8 or more times; searches: 6019   89%
    122 values each used   7 or more times; searches: 6054   90%
    132 values each used   6 or more times; searches: 6114   91%
    144 values each used   5 or more times; searches: 6174   92%
    162 values each used   4 or more times; searches: 6246   93%
    181 values each used   3 or more times; searches: 6303   93%
    320 values each used   2 or more times; searches: 6581   98%
    484 values each used   1 or more times; searches: 6746  100%

A single system does not prove anything.  It is possible that other
devicetrees would exhibit similarly long-tailed behavior, but that is just
wild speculation on my part.

_If_ the long tail is representative of other systems, then identifying a
few hot spots could be useful, but fixing them is not likely to
significantly reduce the overhead of calls to of_find_node_by_phandle().
Some method of reducing the overhead of each call would be the answer for
a system of this class.

> Sample log is pasted below, where the number at the end of each line is
> the phandle value.
>     Line 8853: [   37.425405] OF: want to search this 262
>     Line 8854: [   37.425453] OF: want to search this 262
>     Line 8855: [   37.425499] OF: want to search this 262
>     Line 8856: [   37.425549] OF: want to search this 15
>     Line 8857: [   37.425599] OF: want to search this 5
>     Line 8858: [   37.429989] OF: want to search this 253
>     Line 8859: [   37.430058] OF: want to search this 253
>     Line 8860: [   37.430217] OF: want to search this 253
>     Line 8861: [   37.430278] OF: want to search this 253
>     Line 8862: [   37.430337] OF: want to search this 253
>     Line 8863: [   37.430399] OF: want to search this 254
>     Line 8864: [   37.430597] OF: want to search this 254
>     Line 8865: [   37.430656] OF: want to search this 254
>
> The above explains why cache sizes 64 and 128 give almost identical
> results.  For cache size 256, however, performance degrades.  I don't
> have a good theory here, but I'm assuming that by making the SW cache
> large we lose the benefit of the real HW cache, which is typically
> smaller than our array.  Also, in my setup I've set max_cpu=1 to reduce
> variance; that again should affect the cache behavior in HW and thus the
> perf numbers.
>
> Chintan
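For anyone skimming the thread, the software cache whose 64 / 128 / 256
entry sizes are being compared above can be pictured as a small array
indexed by (phandle % size) sitting in front of the existing full scan of
the tree.  The stand-alone sketch below only illustrates that idea with
invented names (fake_node, slow_lookup, and so on); it is not the code from
the patch under discussion:

/* phandle_cache_sketch.c - illustration only, NOT the code from the patch.
 *
 * Models a fixed-size, phandle-indexed software cache in front of a
 * "scan every node" lookup; the 64/128/256-entry sizes discussed in this
 * thread correspond to the CACHE_SZ knob below.
 */
#include <stdio.h>

#define NUM_NODES 485   /* roughly the number of unique phandles in the log */
#define CACHE_SZ  128   /* the size being varied: 64 / 128 / 256 */

struct fake_node {
	unsigned int phandle;
};

static struct fake_node nodes[NUM_NODES];
static struct fake_node *cache[CACHE_SZ];
static unsigned long scans;    /* nodes visited by the slow path */

/* Stand-in for the existing linear scan over every node in the tree. */
static struct fake_node *slow_lookup(unsigned int phandle)
{
	for (unsigned int i = 0; i < NUM_NODES; i++) {
		scans++;
		if (nodes[i].phandle == phandle)
			return &nodes[i];
	}
	return NULL;
}

static struct fake_node *cached_lookup(unsigned int phandle)
{
	unsigned int slot = phandle % CACHE_SZ;
	struct fake_node *np = cache[slot];

	if (np && np->phandle == phandle)      /* hit: no scanning at all */
		return np;

	np = slow_lookup(phandle);             /* miss: full scan, then cache */
	if (np)
		cache[slot] = np;
	return np;
}

int main(void)
{
	for (unsigned int i = 0; i < NUM_NODES; i++)
		nodes[i].phandle = i + 1;      /* phandles 1..485 */

	for (int i = 0; i < 1000; i++)         /* a hot phandle, e.g. 253 */
		cached_lookup(253);

	printf("nodes scanned for 1000 lookups of one phandle: %lu\n", scans);
	return 0;
}

With a hot phandle such as 253 from the log above, only the first lookup
pays for a scan; the remaining lookups are direct array hits, which is why
growing the cache well past the working set buys little.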