Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1030853AbdDSSeJ (ORCPT ); Wed, 19 Apr 2017 14:34:09 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:33615 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030837AbdDSSeD (ORCPT ); Wed, 19 Apr 2017 14:34:03 -0400 Subject: Re: [PATCH] of: introduce event tracepoints for dynamic device_node lifecyle To: Frank Rowand , Michael Ellerman , Tyrel Datwyler , robh+dt@kernel.org References: <1492475525-10827-1-git-send-email-tyreld@linux.vnet.ibm.com> <58F6AA35.2040902@gmail.com> <87efwp6v4e.fsf@concordia.ellerman.id.au> <58F6CBF9.7060000@gmail.com> Cc: devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, rostedt@goodmis.org, mingo@redhat.com, nfont@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org From: Tyrel Datwyler Date: Wed, 19 Apr 2017 11:33:47 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <58F6CBF9.7060000@gmail.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 x-cbid: 17041918-0008-0000-0000-000007A3F62A X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00006940; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000208; SDB=6.00849714; UDB=6.00419602; IPR=6.00628348; BA=6.00005304; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015095; XFM=3.00000013; UTC=2017-04-19 18:33:55 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17041918-0009-0000-0000-0000419CEED4 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-04-19_16:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1704190153 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4075 Lines: 96 On 04/18/2017 07:31 PM, Frank Rowand wrote: > On 04/18/17 18:31, Michael Ellerman wrote: >> Frank Rowand writes: >> >>> On 04/17/17 17:32, Tyrel Datwyler wrote: >>>> This patch introduces event tracepoints for tracking a device_nodes >>>> reference cycle as well as reconfig notifications generated in response >>>> to node/property manipulations. >>>> >>>> With the recent upstreaming of the refcount API several device_node >>>> underflows and leaks have come to my attention in the pseries (DLPAR) dynamic >>>> logical partitioning code (ie. POWER speak for hotplugging virtual and physcial >>>> resources at runtime such as cpus or IOAs). These tracepoints provide a >>>> easy and quick mechanism for validating the reference counting of >>>> device_nodes during their lifetime. >>>> >>>> Further, when pseries lpars are migrated to a different machine we >>>> perform a live update of our device tree to bring it into alignment with the >>>> configuration of the new machine. The of_reconfig_notify trace point >>>> provides a mechanism that can be turned for debuging the device tree >>>> modifications with out having to build a custom kernel to get at the >>>> DEBUG code introduced by commit 00aa3720. >>> >>> I do not like changing individual (or small groups of) printk() style >>> debugging information to tracepoint style. >> >> I'm not quite sure which printks() you're referring to. >> >> The only printks that are removed in this series are under #ifdef DEBUG, >> and so are essentially not there unless you build a custom kernel. > > Yes, I am talking about pr_debug(), pr_info(), pr_err(), etc. > > >> >> They also only cover the reconfig case, which is actually less >> interesting than the much more common and bug-prone get/put logic. > > When I was looking at the get/put issue I used pr_debug(). > > >>> As far as I know, there is no easy way to combine trace data and printk() >>> style data to create a single chronology of events. If some of the >>> information needed to debug an issue is trace data and some is printk() >>> style data then it becomes more difficult to understand the overall >>> situation. >> >> If you enable CONFIG_PRINTK_TIME then you should be able to just sort >> the trace and the printk output by the timestamp. If you're really >> trying to correlate the two then you should probably just be using >> trace_printk(). > > Except the existing debug code that uses pr_debug() does not use > trace_printk(). > > And "just sort" does not apply to multi-line output like: > > cpuhp/23-147 [023] .... 128.324827: > of_node_put: refcount=5, dn->full_name=/cpus/PowerPC,POWER8@10 > cpuhp/23-147 [023] .... 128.324829: > of_node_put: refcount=4, dn->full_name=/cpus/PowerPC,POWER8@10 > cpuhp/23-147 [023] .... 128.324829: > of_node_put: refcount=3, dn->full_name=/cpus/PowerPC,POWER8@10 > cpuhp/23-147 [023] .... 128.324831: > of_node_put: refcount=2, dn->full_name=/cpus/PowerPC,POWER8@10 > drmgr-7284 [009] .... 128.439000: > of_node_put: refcount=1, dn->full_name=/cpus/PowerPC,POWER8@10 > drmgr-7284 [009] .... 128.439002: > of_reconfig_notify: action=DETACH_NODE, dn->full_name=/cpus/PowerPC,POWER8@10, > prop->name=null, old_prop->name=null > drmgr-7284 [009] .... 128.439015: > of_node_put: refcount=0, dn->full_name=/cpus/PowerPC,POWER8@10 > drmgr-7284 [009] .... 128.439016: > of_node_release: dn->full_name=/cpus/PowerPC,POWER8@10, dn->_flags=4 > > I was kinda hoping that maybe someone had already created a tool to deal > with this issue. But not too optimistic. This output was actually broken into multiple lines for the commit message. Each trace point is actually a single line in the trace buffer. This output was pulled from the trace buffer with the following: cat /sys/kernel/debug/trace/tracing | grep "POWER8@10" -Tyrel > > >> But IMO this level of detail, tracing every get/put, does not belong in >> printk. Trace points are absolutely the right solution for this type of >> debugging. >> >> cheers >> . >> >