Subject: Re: llvm bpf debug info. Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event
To: Alexei Starovoitov, "Wangnan (F)", pi3orama
CC: "linux-kernel@vger.kernel.org"
From: He Kuang
Date: Wed, 29 Jul 2015 17:38:12 +0800
Message-ID: <55B89F04.5030304@huawei.com>
In-Reply-To: <55B6E685.30905@plumgrid.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi, Alexei

On 2015/7/28 10:18, Alexei Starovoitov wrote:
> On 7/25/15 3:04 AM, He Kuang wrote:
>> I noticed that for 64-bit elf format, the reloc sections have
>> 'Addend' in the entry, but there's no 'Addend' info in bpf elf
>> file(64bit).
>> I think there must be something wrong in the process
>> of .s -> .o, which related to 64bit/32bit. Anyway, we can parse out the
>> AT_name now, DW_AT_LOCATION still missed and need your help.
>
> looks like objdump/llvm-dwarfdump can only read known EM,
> but that that shouldn't be the problem for your dwarf reader right?
> It should be able to recognize id-s of ELF::R_X86_64_* relo used right?
> As far as AT_location for testprog.c it seems there is no info for
> local variables because they were optimized away.
> With -O0 I see AT_location being emitted.
> Also line number info seems to be good in both cases.
> But in our case, we don't need this anyway, no? we need to see
> the types of structs mainly or you have some other use cases?
> I think line number info would be great to correlate the error reported
> by verifier into specific line in C.
>

Yes, without AT_location we can look up the user output data type by
line number, but there are some issues when we look deeper.

Two steps of work should be done in user space: first we embed the
data type into the bpf output record, then we use this type, or an
index or some other identifier, to look up the type from the dwarf
info. So we have a few plans.

* Plan A. Use line number to identify the user data type

Predefined macros:

  #define DEFINE_BPF_OUTPUT_DATA(type, var) \
          const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
  #define BPF_OUTPUT_TRACE_DATA(data, size) \
          __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)

User defined BPF code:

  struct user_define_struct {
          ...
  };

  int testprog(int myvar_a, long myvar_b)
  {
          DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
          BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
          ...

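To make the expansion of the Plan A macros concrete, here is a minimal
host-side sketch. The stub __bpf_output_trace_data(), the last_line_id
variable, the struct members and plan_a_demo() are all assumptions made
for illustration; the real helper would live in the kernel and is not
defined in this mail.

```c
#include <stddef.h>

/* Hypothetical stand-in for the BPF helper: it only remembers the
 * identifier it was given so the macro expansion can be observed. */
static int last_line_id;

static int __bpf_output_trace_data(int line_id, const void *data, size_t size)
{
        (void)data;
        (void)size;
        last_line_id = line_id;
        return 0;
}

/* Macros as proposed in Plan A. */
#define DEFINE_BPF_OUTPUT_DATA(type, var) \
        const int BPF_OUTPUT_LINE__##var = __LINE__; type var;
#define BPF_OUTPUT_TRACE_DATA(data, size) \
        __bpf_output_trace_data(BPF_OUTPUT_LINE__##data, &data, size)

struct user_define_struct {
        int pid;        /* illustrative members, not from the original */
        long ts;
};

/* Stores the source line of the declaration in *decl_line and returns
 * the identifier that reached the helper. */
int plan_a_demo(int *decl_line)
{
        *decl_line = __LINE__ + 1;
        DEFINE_BPF_OUTPUT_DATA(struct user_define_struct, myvar_c);
        BPF_OUTPUT_TRACE_DATA(myvar_c, sizeof(myvar_c));
        return last_line_id;
}
```

With a single-line declaration, the identifier passed to the helper is
the line on which myvar_c was declared, which is the value user space
would then match against the line number recorded in the dwarf info.
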
We use macros to embed the line number implicitly, which adds an extra
restriction: the user must not define multiple variables on the same
line and must not split the macro over multiple lines, like this:

  22 DEFINE_BPF_OUTPUT_DATA(struct xxtype, a); DEFINE_BPF_OUTPUT_DATA(struct xxtype, b);

Or

  22 DEFINE_BPF_OUTPUT_DATA(struct user_define_struct,
  23                        myvar_c);

  DW_AT_decl_line = 22, while __LINE__ = 23

So we should add a check in the llvm BPF backend to warn on the above
code.

* Plan B. Lookup variable type from dwarf AT_location info

We can make use of the output data variable's address, which for bpf is
a negative offset from the frame base. We then look up the matching
offset in the location info (e.g. "DW_OP_fbreg: -32") to identify the
variable type.

To get the frame base address, we could use builtin functions like
__builtin_frame_base() and __builtin_dwarf_cfa(), which return the
call frame base address. Those builtin functions are not yet
implemented in the BPF lowering, so we tested our bpf program by using
a tag variable at the frame base, as follows:

  struct user_define_struct {
          ...
  };
  typedef struct {} frame_base_tag;

  int testprog(void)
  {
          frame_base_tag BPF_FRAME_BASE;
          struct user_define_struct myvar_a;
          __bpf_trace_output_data((void *)&myvar_a - (void *)&BPF_FRAME_BASE,
                                  &myvar_a, sizeof(myvar_a));
          ...

The first argument of __bpf_trace_output_data() will be calculated, and
it's easy to traverse the variable DIEs in the dwarf info and check each
DW_AT_location attribute to find the corresponding variable type.

What worries us is that optimization may reuse stack space, which can
cause different variables to share the same address. In some rough
tests, that kind of optimization did not appear.

* Comparison

Plan A needs less effort and is easy to implement, but requires more
checks to ensure that the user does not put multiple definitions on the
same line and does not split a macro across lines.
The advantage of plan B is that we do not need to introduce the macros
shown in the above example and everything is done implicitly, but the
AT_location info is a prerequisite of this plan, and I'm not sure
whether we can guarantee this info in dwarf or not.

Another way we can think of is adding a new builtin function that tells
the compiler to generate code returning the dwarf type index directly:

  __bpf_trace_output_data(__builtin_dwarf_type(myvar_a), &myvar_a, size);

What's your opinion on these plans, and do you have more suggestions?

Thank you.
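As a rough illustration of the user-space half of plan B, the sketch
below treats the first helper argument as a frame-base offset and
matches it against offsets taken from DW_AT_location. Everything here is
an assumption for illustration: struct var_loc, the hand-filled
testprog_vars table and lookup_type_by_fbreg() are hypothetical names,
and a real tool would fill the table by walking the variable DIEs with a
dwarf library.

```c
#include <stddef.h>

/* One entry per local variable, as it might be collected from
 * DW_TAG_variable DIEs (hypothetical, hand-filled here). */
struct var_loc {
        long fbreg_offset;      /* operand of DW_OP_fbreg */
        const char *type_name;  /* name reached through DW_AT_type */
};

static const struct var_loc testprog_vars[] = {
        { -32, "struct user_define_struct" },
        { -40, "long" },
};

/* Return the type name whose DW_OP_fbreg offset matches, or NULL when
 * no variable is recorded at that offset. */
const char *lookup_type_by_fbreg(const struct var_loc *vars, size_t n,
                                 long offset)
{
        size_t i;

        for (i = 0; i < n; i++)
                if (vars[i].fbreg_offset == offset)
                        return vars[i].type_name;
        return NULL;
}
```

With the tag-variable workaround, the value passed to the helper is
relative to BPF_FRAME_BASE rather than to the real frame base, so the
tag's own DW_OP_fbreg offset would presumably have to be added back
before the lookup. If stack slots were reused, two entries could share
one offset and this first-match lookup would be ambiguous.
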