2006-10-30 22:36:11

by Arnaldo Carvalho de Melo

Subject: [ANNOUNCE] pahole and other DWARF2 utilities

Hi,

I've been working on some DWARF2 utilities and think it is about
time I announce them to the community, so that what is already
available can be used by people interested in reducing structure sizes
and otherwise taking advantage of the information available in the ELF
sections of files compiled with 'gcc -g' or, in the case of the kernel,
with CONFIG_DEBUG_INFO enabled. Here is a description of the tools:

pahole: Poke-a-Hole is a tool to find holes in structures, holes
being defined as the space between struct members due to alignment
rules. That space could be used for new struct entries, or existing
structures could be reorganized to reduce their size. Without further
ado, let's see what that means:

[acme@newtoy net-2.6]$ pahole kernel/sched.o task_struct
/* include2/asm/system.h:11 */
struct task_struct {
volatile long int state; /* 0 4 */
struct thread_info * thread_info; /* 4 4 */
atomic_t usage; /* 8 4 */
long unsigned int flags; /* 12 4 */

<SNIP>

short unsigned int ioprio; /* 52 2 */

/* XXX 2 bytes hole, try to pack */

long unsigned int sleep_avg; /* 56 4 */

<SNIP>

unsigned char fpu_counter; /* 388 1 */

/* XXX 3 bytes hole, try to pack */

int oomkilladj; /* 392 4 */

<SNIP>

}; /* size: 1312, sum members: 1287, holes: 3, sum holes: 13, padding: 12 */

It doesn't use any source code files, just the DWARF2
information in the ELF sections, inserted by 'gcc -g', to print the
above. For now its main feature is showing where the holes are, so
that they can be used to reduce the struct size. This is even more
useful as we transition to 64-bit architectures, where such holes are
more frequent, as we can see in this example:

[acme@newtoy ~]$ file kdump.debug
kdump.debug: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV),
dynamically linked (uses shared libs), not stripped
[acme@newtoy ~]$ pahole kdump.debug _IO_FILE | head -7
/* /usr/include/stdio.h:46 */
struct _IO_FILE {
int _flags; /* 0 4 */

/* XXX 4 bytes hole, try to pack */

char *_IO_read_ptr; /* 8 8 */
[acme@newtoy ~]$


The columns in the comments are (offset, sizeof(member)).
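
For illustration, the numbers in the closing comment of a struct are
related as size = sum members + sum holes + padding (for task_struct
above: 1287 + 13 + 12 = 1312). A minimal Python sketch of that
bookkeeping, working on (name, offset, size) tuples like the ones in the
comments. The function name and the sample struct are made up for this
example, and this is not how pahole itself is implemented:

```python
def summarize(members, struct_size):
    """Summarize a layout given (name, offset, size) tuples sorted by offset."""
    holes = []          # (member that follows the hole, hole size in bytes)
    end = 0             # first byte after the previous member
    for name, offset, size in members:
        if offset > end:
            holes.append((name, offset - end))
        end = offset + size
    return {
        "size": struct_size,
        "sum members": sum(size for _, _, size in members),
        "holes": len(holes),
        "sum holes": sum(h for _, h in holes),
        "padding": struct_size - end,   # unused bytes after the last member
        "where": holes,
    }

# Hypothetical struct on a 32-bit target: char a; int b; short c;
example = [("a", 0, 1), ("b", 4, 4), ("c", 8, 2)]
info = summarize(example, struct_size=12)
print(info)
```

Here the 3 byte hole sits right before 'b', and the 2 trailing bytes are
reported as padding rather than as a hole.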

Tons more information is available in the DWARF2 ELF sections,
making it possible to use it for other purposes, and that's where the
next dwarf comes in, pfunct:

[acme@newtoy net-2.6]$ pfunct net/ipv4/tcp_ipv4.o tcp_v4_rcv
/* /pub/scm/linux/kernel/git/acme/net-2.6/net/ipv4/tcp_ipv4.c:1054 */
int tcp_v4_rcv(struct sk_buff * skb);
/* size: 2175 */

pfunct uses the DWARF2 information to get function details, such
as the full prototype and the function size, which allows us to do some
more interesting queries, such as:

[acme@newtoy net-2.6]$ pfunct --size net/ipv4/netfilter/ip_conntrack.ko | sort -k 2 -nr | head -10
tcp_packet: 3349
ip_conntrack_in: 1146
icmp_error: 874
ip_conntrack_expect_related: 804
__ip_conntrack_confirm: 586
tcp_new: 527
ip_conntrack_init: 525
tcp_error: 508
ip_conntrack_helper_unregister: 482
ip_conntrack_alloc: 469
[acme@newtoy net-2.6]$

The top ten functions by size (in bytes) in any ELF file with
debugging information!
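
The sort/head pipeline above can also be done in a few lines of Python
when further processing is wanted. A minimal sketch, assuming the
'name: size' line format shown above; the helper name is made up for
this example:

```python
def top_by_size(lines, n=10):
    """Parse 'name: size' lines and return the n largest as (name, size)."""
    entries = []
    for line in lines:
        name, sep, size = line.rpartition(":")
        if not sep or not size.strip().isdigit():
            continue                  # skip anything that is not 'name: size'
        entries.append((name.strip(), int(size)))
    return sorted(entries, key=lambda e: e[1], reverse=True)[:n]

# A few lines copied from the output above, deliberately out of order.
sample = """\
tcp_new: 527
tcp_packet: 3349
icmp_error: 874
ip_conntrack_in: 1146
""".splitlines()

for name, size in top_by_size(sample, n=3):
    print(f"{name}: {size}")
```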

The code is available in a git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/pahole.git

If you just want to browse the cset comments, which may well provide
hints on how this thingy can be useful:

http://www.kernel.org/git/?p=linux/kernel/git/acme/pahole.git;a=summary

Further ideas on how to use the DWARF2 information include tools
that will show where inlines are being used and how much code is added
by inline functions, possibly rewriting asm-offsets.c, converting ostra
(callgraph tool) to use this information, correlating valgrind's
cachegrind information to suggest struct member reorganization to
exploit cacheline locality, and more.

Documentation is very much a disaster, but I guess the current
state of things is useful for interested hackers, so I thought it
was time to announce this.

Ideas for additional tools are more than welcome!

- Arnaldo
Mandriva Labs
http://www.mandriva.com


2006-10-31 04:33:42

by Andrew Morton

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Mon, 30 Oct 2006 18:33:19 -0300
Arnaldo Carvalho de Melo <[email protected]> wrote:

> Hi,
>
> I've been working on some DWARF2 utilities and thought that it
> is about time I announce it to the community, so that what is already
> available can be used by people interested in reducing structure sizes
> and otherwise taking advantage of the information available in the elf
> sections of files compiled with 'gcc -g' or in the case of the kernel
> with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> tools:
>
> pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> being defined as the space between members of structures due to alignment
> rules that could be used for new struct entries or to reorganize
> existing structures to reduce its size, without more ado lets see what
> that means:
>
> ...
>
> Further ideas on how to use the DWARF2 information include tools
> that will show where inlines are being used, how much code is added by
> inline functions,

It would be quite useful to be able to identify inlined functions which are
good candidates for uninlining.

2006-10-31 16:05:11

by Thiago Galesi

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

> > Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
>
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.
>
> -

Arnaldo, can't we get a call count for functions? (Yes, it is not a
run-time call count, but rather how many times the function is called
in the code.) I guess this would help with finding candidates for
inlining or uninlining.

--
-
Thiago Galesi

2006-10-31 17:22:51

by Arnaldo Carvalho de Melo

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > I've been working on some DWARF2 utilities and thought that it
> > is about time I announce it to the community, so that what is already
> > available can be used by people interested in reducing structure sizes
> > and otherwise taking advantage of the information available in the elf
> > sections of files compiled with 'gcc -g' or in the case of the kernel
> > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > tools:
> >
> > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > being defined as the space between members of structures due to alignment
> > rules that could be used for new struct entries or to reorganize
> > existing structures to reduce its size, without more ado lets see what
> > that means:
> >
> > ...
> >
> > Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
>
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

I'm working on making good use of this information:

--------------- 8< --------------

3.3.8.2 Concrete Inlined Instances

Each inline expansion of an inlinable subroutine is represented by a
debugging information entry with the tag DW_TAG_inlined_subroutine.
Each such entry should be a direct child of the entry that represents
the scope within which the inlining occurs.

--------------- 8< --------------

To write this tool:

<Ralf> So imagine a tool which says function x was inlined y times
bloating the code by z bytes :)

:-)

- Arnaldo

2006-10-31 17:29:41

by Arnaldo Carvalho de Melo

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Tue, Oct 31, 2006 at 02:05:06PM -0200, Thiago Galesi wrote:
> >> Further ideas on how to use the DWARF2 information include tools
> >> that will show where inlines are being used, how much code is added by
> >> inline functions,
> >
> >It would be quite useful to be able to identify inlined functions which are
> >good candidates for uninlining.
> >
> >-
>
> Arnaldo, can't we get a call count for functions? (yes, it is not a
> run-time call count, but rather, how many times the function is called
> in the code) I guess this would help for this purpose of finding
> candidates for inlining, uninlining.

At least for inline expansions, yes. For normal function calls I have to
study the DWARF2 documentation some more, but I guess it's feasible.

- Arnaldo

2006-10-31 20:45:46

by Arnaldo Carvalho de Melo

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Tue, Oct 31, 2006 at 02:22:37PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > On Mon, 30 Oct 2006 18:33:19 -0300
> > Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > > I've been working on some DWARF2 utilities and thought that it
> > > is about time I announce it to the community, so that what is already
> > > available can be used by people interested in reducing structure sizes
> > > and otherwise taking advantage of the information available in the elf
> > > sections of files compiled with 'gcc -g' or in the case of the kernel
> > > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > > tools:
> > >
> > > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > > being defined as the space between members of structures due to alignment
> > > rules that could be used for new struct entries or to reorganize
> > > existing structures to reduce its size, without more ado lets see what
> > > that means:
> > >
> > > ...
> > >
> > > Further ideas on how to use the DWARF2 information include tools
> > > that will show where inlines are being used, how much code is added by
> > > inline functions,
> >
> > It would be quite useful to be able to identify inlined functions which are
> > good candidates for uninlining.

For now people can take a look at:

http://oops.merseine.nu:81/acme/net.ipv4.tcp.o.pahole

Where all the types in headers included from net/ipv4/tcp.c that have
holes can be seen, for instance:

/* /pub/scm/linux/kernel/git/acme/net-2.6/include/linux/dqblk_xfs.h:143 */
struct fs_quota_stat {
__s8 qs_version; /* 0 1 */

/* XXX 1 bytes hole, try to pack */

__u16 qs_flags; /* 2 2 */
__s8 qs_pad; /* 4 1 */

/* XXX 3 bytes hole, try to pack */

fs_qfilestat_t qs_uquota; /* 8 20 */
fs_qfilestat_t qs_gquota; /* 28 20 */
__u32 qs_incoredqs; /* 48 4 */
__s32 qs_btimelimit; /* 52 4 */
__s32 qs_itimelimit; /* 56 4 */
__s32 qs_rtbtimelimit; /* 60 4 */
__u16 qs_bwarnlimit; /* 64 2 */
__u16 qs_iwarnlimit; /* 66 2 */
}; /* size: 68, sum members: 64, holes: 2, sum holes: 4 */


See? Two holes that can be combined to reduce the size of this
struct by 4 bytes, just by moving qs_pad to be defined right before
qs_flags. Many more holes are there to harvest :-)
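
The effect of this reordering can be checked without a kernel tree. A
rough sketch using Python's ctypes, which lays out members following the
platform's C alignment rules. fs_qfilestat_t is replaced here by a
20 byte, 4 byte aligned stand-in so that the offsets match the listing
above, and the class and helper names are made up for this example:

```python
import ctypes

# Stand-in for fs_qfilestat_t (assumption): 20 bytes, 4-byte alignment.
QFileStat = ctypes.c_uint32 * 5

# Members after the first three are the same in both layouts.
TAIL = [
    ("qs_uquota", QFileStat),
    ("qs_gquota", QFileStat),
    ("qs_incoredqs", ctypes.c_uint32),
    ("qs_btimelimit", ctypes.c_int32),
    ("qs_itimelimit", ctypes.c_int32),
    ("qs_rtbtimelimit", ctypes.c_int32),
    ("qs_bwarnlimit", ctypes.c_uint16),
    ("qs_iwarnlimit", ctypes.c_uint16),
]

class Original(ctypes.Structure):
    _fields_ = [
        ("qs_version", ctypes.c_int8),
        ("qs_flags", ctypes.c_uint16),
        ("qs_pad", ctypes.c_int8),
    ] + TAIL

class Reordered(ctypes.Structure):
    _fields_ = [
        ("qs_version", ctypes.c_int8),
        ("qs_pad", ctypes.c_int8),      # moved up, fills the 1 byte hole
        ("qs_flags", ctypes.c_uint16),
    ] + TAIL

def holes(struct_type):
    """Return (member after the hole, hole size) for every hole."""
    found, end = [], 0
    for name, _ in struct_type._fields_:
        field = getattr(struct_type, name)
        if field.offset > end:
            found.append((name, field.offset - end))
        end = field.offset + field.size
    return found

for cls in (Original, Reordered):
    print(cls.__name__, ctypes.sizeof(cls), holes(cls))
```

With the stand-in type the original layout comes out at 68 bytes with
the same two holes, and the reordered one at 64 bytes with none.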

Of course, past mistakes in structs that are exported to userspace
have to be kept that way, and there are other cases where holes have to
stay, such as members grouped for cacheline locality optimizations, etc.

- Arnaldo

2006-11-03 15:52:03

by Arnaldo Carvalho de Melo

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > Hi,
> >
> > I've been working on some DWARF2 utilities and thought that it
> > is about time I announce it to the community, so that what is already
> > available can be used by people interested in reducing structure sizes
> > and otherwise taking advantage of the information available in the elf
> > sections of files compiled with 'gcc -g' or in the case of the kernel
> > with CONFIG_DEBUG_INFO enabled, so here it goes the description of said
> > tools:
> >
> > pahole: Poke-a-Hole is a tool to find out holes in structures, holes
> > being defined as the space between members of structures due to alignment
> > rules that could be used for new struct entries or to reorganize
> > existing structures to reduce its size, without more ado lets see what
> > that means:
> >
> > ...
> >
> > Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
>
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

Getting there, the next step is per-CU (Compilation Unit, .o files)
inlining stats :-)

Ah, the sizes are different because sometimes just some parts of inline
functions are "sourced", as indicated by the DW_AT_ranges DWARF
attribute.

Repo continues at:

git://git.kernel.org/pub/scm/linux/kernel/git/acme/pahole.git

Another suggestion was a stack hole finding tool, similar to what
pahole does for structs :-)

Another example, this time for schedule():

http://oops.merseine.nu:81/acme/schedule.inlines.txt

Regards,

- Arnaldo

commit a42afe1acffc5e57ab504c008b8b75c124bf07de
Author: Arnaldo Carvalho de Melo <[email protected]>
Date: Fri Nov 3 12:41:19 2006 -0300

[CLASSES]: Add support for DW_TAG_inlined_subroutine

Output of pfunct using this information (all for a make allyesconfig build):

Top 5 functions by size of inlined functions in net/ipv4:

[acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k3 -nr | head -5
ip_route_input: 19 7086
tcp_ack: 33 6415
do_ip_vs_set_ctl: 23 4193
q931_help: 8 3822
ip_defrag: 19 3318
[acme@newtoy guinea_pig-2.6]$

And by number of inline expansions:

[acme@newtoy guinea_pig-2.6]$ pfunct -I net/ipv4/built-in.o | sort -k2 -nr | head -5
dump_packet: 35 905
tcp_v4_rcv: 34 1773
tcp_recvmsg: 34 928
tcp_ack: 33 6415
tcp_rcv_established: 31 1195
[acme@newtoy guinea_pig-2.6]$

And the list of expansions on a specific function:

[acme@newtoy guinea_pig-2.6]$ pfunct -i net/ipv4/built-in.o tcp_v4_rcv
/* net/ipv4/tcp_ipv4.c:1054 */
int tcp_v4_rcv(struct sk_buff * skb);
/* size: 2189, variables: 8, goto labels: 6, inline expansions: 34 (1773 bytes) */

/* inline expansions in tcp_v4_rcv:
current_thread_info: 8
pskb_may_pull: 36
pskb_may_pull: 29
tcp_v4_checksum_init: 139
__fswab32: 2
__fswab32: 2
inet_iif: 12
__inet_lookup: 292
__fswab16: 20
inet_ehashfn: 25
inet_ehash_bucket: 18
prefetch: 4
prefetch: 4
prefetch: 4
sock_hold: 4
xfrm4_policy_check: 59
nf_reset: 66
sk_filter: 135
__skb_trim: 20
get_softnet_dma: 68
tcp_prequeue: 257
sk_add_backlog: 40
sock_put: 27
xfrm4_policy_check: 46
tcp_checksum_complete: 29
current_thread_info: 8
sock_put: 20
xfrm4_policy_check: 50
tcp_checksum_complete: 29
current_thread_info: 8
inet_iif: 9
inet_lookup_listener: 36
inet_twsk_put: 114
tcp_v4_timewait_ack: 153
*/
[acme@newtoy guinea_pig-2.6]$
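
A listing like the one above is easy to aggregate per inline function,
to see how many times each one was expanded and how many bytes that
added. A minimal Python sketch, assuming the 'name: size' lines of the
pfunct -i output shown above; only a few of those lines are repeated
here and the helper name is made up for this example:

```python
from collections import defaultdict

def aggregate(lines):
    """Map inline function name -> [number of expansions, total bytes]."""
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        name, sep, size = line.partition(":")
        if not sep:
            continue                  # not a 'name: size' line
        entry = stats[name.strip()]
        entry[0] += 1                 # one more expansion
        entry[1] += int(size)         # bytes added by this expansion
    return dict(stats)

# A few lines copied from the listing above.
listing = """\
pskb_may_pull: 36
pskb_may_pull: 29
__fswab32: 2
__fswab32: 2
prefetch: 4
prefetch: 4
prefetch: 4
""".splitlines()

stats = aggregate(listing)
for name, (count, total) in sorted(stats.items(), key=lambda kv: -kv[1][1]):
    print(f"{name} was inlined {count} times, adding {total} bytes")
```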

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

2006-11-03 19:07:43

by Arnaldo Carvalho de Melo

Subject: Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> On Mon, 30 Oct 2006 18:33:19 -0300
> Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > Further ideas on how to use the DWARF2 information include tools
> > that will show where inlines are being used, how much code is added by
> > inline functions,
>
> It would be quite useful to be able to identify inlined functions which are
> good candidates for uninlining.

Top 50 inline functions expanded more than once, sorted by the sum of
their expansions, in a vmlinux file built for qemu (most things are
modules). Columns are (inline function name, number of times it was
expanded, sum in bytes of its expansions, number of source files where
expansions occurred):

[acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr | head -50

get_current 676 5732 155
xfrm_selector_match 6 4778 2
__memcpy 177 4326 89
kmalloc 185 3991 119
__constant_c_memset 113 3556 69
__constant_c_and_count_memset 225 3161 156
prefetch 333 2915 101
__ext3_journal_dirty_metadata 44 2810 6
skb_put 34 2650 27
module_put 80 2613 42
strcmp 108 2506 49
__ext3_journal_get_write_access 41 2482 6
down 57 2253 19
__fswab16 96 2172 33
dst_release 34 2130 23
list_add_tail 88 2030 67
kzalloc 89 2007 76
__constant_memcpy 146 1930 118
tcp_done 8 1918 4
brelse 128 1897 16
__nlmsg_put 21 1856 13
INIT_LIST_HEAD 226 1848 88
pci_read_config_byte 54 1802 9
list_del_init 103 1782 39
ip_rt_put 27 1692 12
pci_read_config_word 50 1675 11
strlen 108 1671 64
__xfrm6_selector_match 3 1615 2
__skb_trim 25 1604 21
do_follow_link 2 1543 1
strncmp 48 1533 22
__xfrm4_selector_match 6 1525 2
outb_p 136 1518 9
tcp_set_state 14 1501 5
find_group_orlov 2 1456 2
inet_twsk_put 16 1448 5
pci_write_config_byte 38 1433 10
up 68 1372 19
pci_read_config_dword 42 1357 12
raw_local_irq_restore 366 1292 88
skb_tailroom 62 1239 23
set_bit 155 1232 68
put_task_struct 53 1227 11
print_irq_desc 2 1206 1
skb_trim 14 1192 13
__do_follow_link 2 1190 1
nf_hook_thresh 16 1164 8
dget 47 1147 19
__raw_local_irq_save 314 1145 85
__fswab32 130 1117 28

- Arnaldo

2006-11-04 21:03:44

by Arnaldo Carvalho de Melo

Subject: Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > On Mon, 30 Oct 2006 18:33:19 -0300
> > Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > > Further ideas on how to use the DWARF2 information include tools
> > > that will show where inlines are being used, how much code is added by
> > > inline functions,
> >
> > It would be quite useful to be able to identify inlined functions which are
> > good candidates for uninlining.
>
> Top 50 inline functions expanded more than once by sum of its expansions
> in a vmlinux file built for qemu, most things are modules, columns are
> (inline function name, number of times it was expanded, sum in bytes of
> its expansions, number of source files where expansions occurred):
>
> [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> head -50
>
> get_current 676 5732 155

Ok, this time for a 'make allyesconfig' build, top 100. For the list of
all 6021 inline functions that were expanded more than once in this 281
MB vmlinux image, download the 93 KB file at:

http://oops.merseine.nu:81/acme/vmlinux.allyesconfig.inlines.txt.gz

totsz = Total size of all expansions for this inline function
nrexp = Number of times this function was expanded (inlined)
avgsz = Average expansion size
nrsrc = number of source files where this function was expanded
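
Going by the rows of the table below, avgsz appears to be totsz / nrexp
with the fractional part dropped. A small Python check of that reading
against three rows copied from the table; the function name is made up
for this example:

```python
def average_size(totsz, nrexp):
    """Average expansion size in bytes, truncated to an integer."""
    return totsz // nrexp

# (name, totsz, nrexp, avgsz as printed in the table)
rows = [
    ("outb", 65077, 8180, 7),
    ("__fswab16", 58827, 2313, 25),
    ("load_module", 5933, 2, 2966),
]

for name, totsz, nrexp, printed in rows:
    print(name, average_size(totsz, nrexp), printed)
```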

Some cases are bogus due to namespace collisions. I'll work on mangling
the function name with the file where it was defined, but most should
be ok.

- Arnaldo

name totsz / nrexp = avgsz nrsrc
-------------------------------------------------------------
1 outb 65077 8180 7 505
2 __fswab16 58827 2313 25 459
3 __memcpy 54640 2241 24 1141
4 writel 49011 5989 8 364
5 __constant_c_and_count_memset 42967 3163 13 1708
6 skb_put 41512 741 56 416
7 kmalloc 38684 2345 16 1366
8 __constant_memcpy 35716 2536 14 1633
9 get_current 31233 4058 7 886
10 cfi_build_cmd 28439 131 217 7
11 strcmp 23324 1008 23 329
12 kzalloc 22413 1326 16 1014
13 current_thread_info 21816 2815 7 1319
14 readl 21575 3953 5 363
15 __constant_c_memset 21014 732 28 578
16 strcpy 20681 1420 14 611
17 __fswab32 19797 3566 5 442
18 init_hw 18441 7 2634 12
19 strncmp 18199 596 30 212
20 writeb 17825 2611 6 205
21 INIT_LIST_HEAD 15476 1713 9 746
22 __OUTPLL 15399 125 123 4
23 inb 15174 3246 4 499
24 snd_echo_create 15098 5 3019 5
25 NCR5380_information_transfer 14699 6 2449 6
26 __INPLL 14541 117 124 4
27 outw 14475 1437 10 135
28 up 14467 796 18 189
29 down 13710 396 34 152
30 do_write_buffer 13674 3 4558 3
31 pci_write_config_byte 13069 527 24 176
32 outb_p 12659 1241 10 97
33 strlen 12658 931 13 435
34 load_firmware 12616 7 1802 13
35 pci_read_config_byte 12369 560 22 218
36 cfi_send_gen_cmd 11559 41 281 5
37 module_put 11462 265 43 203
38 skb_push 11297 259 43 182
39 set_bit 11160 1296 8 676
40 readb 11015 1728 6 219
41 radeon_pll_errata_after_data 10855 127 85 4
42 skb_pull 10812 303 35 187
43 outl 10718 1151 9 128
44 netif_wake_queue 10355 348 29 173
45 add_timer 9896 390 25 252
46 pci_free_consistent 9834 376 26 130
47 clear_bit 9654 962 10 600
48 list_add_tail 9570 712 13 434
49 __fswab64 9553 521 18 97
50 ahd_outb 9538 295 32 4
51 hscx_int_main 9526 12 793 12
52 prefetch 9467 1764 5 720
53 pci_read_config_dword 9247 425 21 163
54 pci_write_config_dword 9116 402 22 140
55 dev_alloc_skb 9072 221 41 209
56 constant_test_bit 8764 1955 4 1034
57 dev_kfree_skb_irq 8617 98 87 104
58 writew 8569 1148 7 128
59 ahc_outb 8426 261 32 6
60 netif_stop_queue 8364 495 16 187
61 brelse 8354 822 10 155
62 skb_reserve 8187 444 18 320
63 ahc_inb 8182 303 27 6
64 pci_read_config_word 8174 359 22 178
65 skb_trim 8153 112 72 101
66 i_size_read 8071 181 44 67
67 list_del_init 7916 461 17 210
68 pci_map_single 7756 186 41 103
69 jedec_reset 7689 6 1281 1
70 WriteHSCXCMDR 7576 78 97 17
71 frontend_init 7352 7 1050 7
72 le_key_k_type 7302 60 121 12
73 pci_write_config_word 7123 284 25 137
74 dst_release 7059 125 56 71
75 ahd_set_modes 6930 55 126 4
76 strncpy 6651 296 22 153
77 usb_serial_debug_data 6380 61 104 25
78 pci_alloc_consistent 6352 214 29 129
79 atomic_inc 6235 1022 6 683
80 readw 6170 1001 6 122
81 block_til_ready 6146 10 614 10
82 load_dsp 6141 5 1228 12
83 test_and_set_bit 6096 605 10 394
84 try_module_get 6065 176 34 151
85 input_report_key 5939 208 28 65
86 load_module 5933 2 2966 2
87 usb_fill_bulk_urb 5846 114 51 62
88 skb_queue_head_init 5783 151 38 110
89 ahd_inb 5739 208 27 4
90 device_init 5689 2 2844 2
91 sctp_add_cmd_sf 5665 197 28 2
92 pci_set_drvdata 5595 496 11 268
93 skb_tailroom 5432 313 17 117
94 port_detect 5419 2 2709 2
95 dequeue_rx 5339 3 1779 3
96 dev_to_shost 5338 167 31 22
97 skb_header_pointer 5289 120 44 53
98 prism2_init_local_data 5226 3 1742 3
99 sb_bread 5216 164 31 87
100 strchr 5171 195 26 115

2006-11-05 06:30:37

by Adrian Bunk

Subject: Re: Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Sat, Nov 04, 2006 at 06:03:32PM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > > On Mon, 30 Oct 2006 18:33:19 -0300
> > > Arnaldo Carvalho de Melo <[email protected]> wrote:
> > >
> > > > Further ideas on how to use the DWARF2 information include tools
> > > > that will show where inlines are being used, how much code is added by
> > > > inline functions,
> > >
> > > It would be quite useful to be able to identify inlined functions which are
> > > good candidates for uninlining.
> >
> > Top 50 inline functions expanded more than once by sum of its expansions
> > in a vmlinux file built for qemu, most things are modules, columns are
> > (inline function name, number of times it was expanded, sum in bytes of
> > its expansions, number of source files where expansions occurred):
> >
> > [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> > ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> > head -50
> >
> > get_current 676 5732 155
>
> Ok, this time for a 'make allyesconfig' build, top 100, for the list of
> all 6021 inline functions that were expanded more than once in this 281
> MB vmlinux image download the 93 KB files at:
>...

Thanks, this is interesting data.

One thing you could do for improving the result:

allyesconfig turns on all debugging options, and there might be functions
that are significantly larger due to this fact.

Unsetting *DEBUG* options in the .config might bring a better focus
on the real-world problems.

> - Arnaldo
>...

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-11-05 16:42:47

by Arnaldo Carvalho de Melo

Subject: Re: Top 100 inline functions (make allyesconfig) was Re: [ANNOUNCE] pahole and other DWARF2 utilities

On Sun, Nov 05, 2006 at 07:30:37AM +0100, Adrian Bunk wrote:
> On Sat, Nov 04, 2006 at 06:03:32PM -0300, Arnaldo Carvalho de Melo wrote:
> > On Fri, Nov 03, 2006 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > > On Mon, Oct 30, 2006 at 08:33:34PM -0800, Andrew Morton wrote:
> > > > On Mon, 30 Oct 2006 18:33:19 -0300
> > > > Arnaldo Carvalho de Melo <[email protected]> wrote:
> > > >
> > > > > Further ideas on how to use the DWARF2 information include tools
> > > > > that will show where inlines are being used, how much code is added by
> > > > > inline functions,
> > > >
> > > > It would be quite useful to be able to identify inlined functions which are
> > > > good candidates for uninlining.
> > >
> > > Top 50 inline functions expanded more than once by sum of its expansions
> > > in a vmlinux file built for qemu, most things are modules, columns are
> > > (inline function name, number of times it was expanded, sum in bytes of
> > > its expansions, number of source files where expansions occurred):
> > >
> > > [acme@newtoy guinea_pig-2.6]$ pfunct --total_inline_stats
> > > ../../acme/OUTPUT/qemu/net-2.6/vmlinux | grep -v ': 1 ' | sort -k3 -nr |
> > > head -50
> > >
> > > get_current 676 5732 155
> >
> > Ok, this time for a 'make allyesconfig' build, top 100, for the list of
> > all 6021 inline functions that were expanded more than once in this 281
> > MB vmlinux image download the 93 KB files at:
> >...
>
> Thanks, this is interesting data.
>
> One thing you could do for improving the result:
>
> allyesconfig turns on all debugging options, and there might be functions
> that are significantly larger due to this fact.
>
> Unsetting *DEBUG* options in the .config might bring a better focus
> on the real-world problems.

Sure thing, I did it with allyesconfig to see if the tools were able to
handle that much data. It's not perfect, far from it, but it works on my
notebook :-) Nevertheless it's already a data point for lots of
interesting cases.

One thing I'll do is get the debug rpms from, say, Mandriva, Fedora,
etc. and use them as more down to earth guinea pigs. For that I'll add
support for multiple files, not just for single ELF files with multiple
objects. Using the config files from major distros is also on my TODO
list, of course enabling the extra config options needed to get the
DWARF2 ELF sections the tools need. These sections don't affect the
binary, they are just extra ELF sections that the 'strip(1)' tool loves
:-)

Stay tuned,

- Arnaldo