2018-11-11 01:22:54

by Joe Perches

[permalink] [raw]
Subject: [PATCH] xfs: Remove noinline from #define STATIC

Reduce total object size quite a bit (~32KB) and presumably
improve performance at the same time.

Total object size old vs new (x86-64 defconfig with xfs)

text data bss dec hex filename
- 959351 165573 632 1125556 112cb4 (TOTALS) (old)
+ 924683 165669 632 1090984 10a5a8 (TOTALS) (new)

Signed-off-by: Joe Perches <[email protected]>
---
fs/xfs/xfs_linux.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index edbd5a210df2..f33c8b626bca 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -242,7 +242,7 @@ static inline uint64_t howmany_64(uint64_t x, uint32_t y)
#endif /* XFS_WARN */
#endif /* DEBUG */

-#define STATIC static noinline
+#define STATIC static

#ifdef CONFIG_XFS_RT




2018-11-12 20:14:42

by Eric Sandeen

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On 11/10/18 7:21 PM, Joe Perches wrote:
> Reduce total object size quite a bit (~32KB) and presumably
> improve performance at the same time.
>
> Total object size old vs new (x86-64 defconfig with xfs)
>
> text data bss dec hex filename
> - 959351 165573 632 1125556 112cb4 (TOTALS) (old)
> + 924683 165669 632 1090984 10a5a8 (TOTALS) (new)

And what does it do to maximum stack excursions?

-Eric

> Signed-off-by: Joe Perches <[email protected]>
> ---
> fs/xfs/xfs_linux.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
> index edbd5a210df2..f33c8b626bca 100644
> --- a/fs/xfs/xfs_linux.h
> +++ b/fs/xfs/xfs_linux.h
> @@ -242,7 +242,7 @@ static inline uint64_t howmany_64(uint64_t x, uint32_t y)
> #endif /* XFS_WARN */
> #endif /* DEBUG */
>
> -#define STATIC static noinline
> +#define STATIC static
>
> #ifdef CONFIG_XFS_RT
>
>

2018-11-12 21:06:23

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, 2018-11-12 at 14:12 -0600, Eric Sandeen wrote:
> On 11/10/18 7:21 PM, Joe Perches wrote:
> > Reduce total object size quite a bit (~32KB) and presumably
> > improve performance at the same time.
> >
> > Total object size old vs new (x86-64 defconfig with xfs)
> >
> > text data bss dec hex filename
> > - 959351 165573 632 1125556 112cb4 (TOTALS) (old)
> > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new)
>
> And what does it do to maximum stack excursions?

It seems to add a maximum of 40 bytes in xfs_reclaim_inodes_ag
and xfs_readdir, but I didn't do more than visually scan
checkstack output.

Using scripts/checkstack.pl on defconfig x86-64 with xfs
and cutting out absolute addresses

diff -urN xfs_cs_old_2 xfs_cs_new_2
--- xfs_cs_old_2 2018-11-12 12:55:03.195282512 -0800
+++ xfs_cs_new_2 2018-11-12 12:55:09.219168923 -0800
@@ -1,18 +1,20 @@
xfs_ag_init_headers [vmlinux]: 464
xfs_ag_init_headers [vmlinux]: 464
-xfs_inode_ag_walk.isra.19 [vmlinux]: 416
-xfs_inode_ag_walk.isra.19 [vmlinux]: 416
+xfs_inode_ag_walk.isra.21 [vmlinux]: 424
+xfs_inode_ag_walk.isra.21 [vmlinux]: 424
+xfs_reclaim_inodes_ag [vmlinux]: 368
+xfs_reclaim_inodes_ag [vmlinux]: 368
xfs_trans_committed_bulk [vmlinux]: 336
-xfs_reclaim_inodes_ag [vmlinux]: 328
-xfs_reclaim_inodes_ag [vmlinux]: 328
xfs_ioc_getfsmap.isra.23 [vmlinux]: 320
xfs_ioc_getfsmap.isra.23 [vmlinux]: 320
+xfs_file_ioctl [vmlinux]: 320
+xfs_file_ioctl [vmlinux]: 320
+xfs_bmapi_write [vmlinux]: 312
xfs_getfsmap [vmlinux]: 304
xfs_qm_dquot_walk.isra.11 [vmlinux]: 296
xfs_qm_dquot_walk.isra.11 [vmlinux]: 296
-xfs_bmapi_write [vmlinux]: 288
-xfs_file_ioctl [vmlinux]: 288
-xfs_file_ioctl [vmlinux]: 288
+xfs_file_compat_ioctl [vmlinux]: 288
+xfs_file_compat_ioctl [vmlinux]: 288
xfs_sb_read_verify [vmlinux]: 280
xfs_sb_read_verify [vmlinux]: 280
xfs_sb_write_verify [vmlinux]: 272
@@ -21,36 +23,38 @@
xfs_rmap_convert [vmlinux]: 264
xfs_rmap_convert_shared [vmlinux]: 256
xfs_rmap_convert_shared [vmlinux]: 256
-xfs_file_compat_ioctl [vmlinux]: 256
-xfs_file_compat_ioctl [vmlinux]: 256
xfs_bmap_extents_to_btree [vmlinux]: 248
xfs_bmap_extents_to_btree [vmlinux]: 248
+xfs_bmap_del_extent_real [vmlinux]: 240
+xfs_alloc_fix_freelist [vmlinux]: 224
+xfs_alloc_fix_freelist [vmlinux]: 224
+xfs_bmap_add_extent_unwritten_real [vmlinux]:224
xfs_symlink [vmlinux]: 224
xfs_symlink [vmlinux]: 224
-xfs_alloc_fix_freelist [vmlinux]: 216
-xfs_alloc_fix_freelist [vmlinux]: 216
-xfs_bmap_local_to_extents.constprop.27 [vmlinux]:208
-xfs_bmap_local_to_extents.constprop.27 [vmlinux]:208
-xfs_ialloc_ag_alloc [vmlinux]: 208
-xfs_ialloc_ag_alloc [vmlinux]: 208
-xfs_bmap_add_extent_unwritten_real [vmlinux]:192
-xfs_bmap_add_extent_unwritten_real [vmlinux]:192
-xfs_bmap_add_extent_delay_real [vmlinux]:192
-xfs_bmap_add_extent_delay_real [vmlinux]:192
-xfs_bmap_del_extent_real [vmlinux]: 192
-xfs_bmap_del_extent_real [vmlinux]: 192
-xfs_bmap_btalloc [vmlinux]: 192
-xfs_bmap_btalloc [vmlinux]: 192
+xfs_bmap_btalloc [vmlinux]: 216
+xfs_bmap_btalloc [vmlinux]: 216
+xfs_attr_rmtval_set [vmlinux]: 208
+xfs_bmap_add_extent_delay_real [vmlinux]:208
+xfs_bmap_local_to_extents.constprop.28 [vmlinux]:208
+xfs_bmap_local_to_extents.constprop.28 [vmlinux]:208
+xfs_bulkstat [vmlinux]: 208
+xfs_ialloc_ag_alloc [vmlinux]: 200
+xfs_ialloc_ag_alloc [vmlinux]: 200
xfs_refcountbt_alloc_block [vmlinux]:192
xfs_refcountbt_alloc_block [vmlinux]:192
+xfs_reflink_remap_blocks [vmlinux]: 192
+xfs_reflink_remap_blocks [vmlinux]: 192
__xfs_bunmapi [vmlinux]: 184
__xfs_bunmapi [vmlinux]: 184
xfs_rmap_map [vmlinux]: 184
xfs_rmap_map [vmlinux]: 184
+xfs_readdir [vmlinux]: 184
+xfs_readdir [vmlinux]: 184
xfs_attr_set [vmlinux]: 176
xfs_attr_set [vmlinux]: 176
xfs_attr3_leaf_to_shortform [vmlinux]:176
xfs_attr3_leaf_to_shortform [vmlinux]:176
+xfs_bmap_add_extent_hole_real [vmlinux]:176
xfs_attr_shortform_to_leaf [vmlinux]:168
xfs_attr_shortform_to_leaf [vmlinux]:168
xfs_btree_delrec [vmlinux]: 168
@@ -59,89 +63,99 @@
xfs_rmap_unmap [vmlinux]: 168
xfs_rmap_map_shared [vmlinux]: 168
xfs_rmap_map_shared [vmlinux]: 168
-xfs_bulkstat [vmlinux]: 168
-xfs_bulkstat [vmlinux]: 168
xfs_scrub_metadata [vmlinux]: 168
xfs_scrub_metadata [vmlinux]: 168
xfs_free_extent_fix_freelist [vmlinux]:160
xfs_free_extent_fix_freelist [vmlinux]:160
-xfs_bmap_add_extent_hole_real [vmlinux]:160
-xfs_bmap_add_extent_hole_real [vmlinux]:160
+xfs_attr3_leaf_split [vmlinux]: 160
+xfs_attr3_leaf_split [vmlinux]: 160
xfs_bmap_insert_extents [vmlinux]: 160
xfs_bmap_insert_extents [vmlinux]: 160
xfs_bmbt_alloc_block [vmlinux]: 160
xfs_bmbt_alloc_block [vmlinux]: 160
xfs_btree_overlapped_query_range [vmlinux]:160
xfs_btree_overlapped_query_range [vmlinux]:160
+xfs_dialloc [vmlinux]: 160
+xfs_dialloc [vmlinux]: 160
__xfs_inobt_alloc_block.isra.9 [vmlinux]:160
__xfs_inobt_alloc_block.isra.9 [vmlinux]:160
xfs_ag_resv_rmapbt_alloc [vmlinux]: 160
xfs_ag_resv_rmapbt_alloc [vmlinux]: 160
+xfs_do_writepage [vmlinux]: 160
+xfs_do_writepage [vmlinux]: 160
xfs_dir2_leaf_readbuf [vmlinux]: 160
xfs_readlink_bmap_ilocked [vmlinux]: 160
xfs_readlink_bmap_ilocked [vmlinux]: 160
xfs_ag_resv_rmapbt_alloc [vmlinux]: 160
xfs_ag_resv_rmapbt_alloc [vmlinux]: 160
+xfs_da3_split [vmlinux]: 152
+xfs_da3_split [vmlinux]: 152
xfs_getbmap [vmlinux]: 152
xfs_getbmap [vmlinux]: 152
trace_raw_output_xfs_loggrant_class [vmlinux]:144
trace_raw_output_xfs_alloc_class [vmlinux]:144
xfs_attr_remove [vmlinux]: 144
xfs_attr_remove [vmlinux]: 144
-xfs_dialloc_ag_inobt [vmlinux]: 144
-xfs_dialloc_ag_inobt [vmlinux]: 144
+xfs_bmap_split_extent_at [vmlinux]: 144
+__xfs_btree_split.isra.45 [vmlinux]: 144
+__xfs_btree_split.isra.45 [vmlinux]: 144
+xfs_btree_insrec [vmlinux]: 144
+xfs_btree_insrec [vmlinux]: 144
+xfs_da_shrink_inode [vmlinux]: 144
+xfs_da_shrink_inode [vmlinux]: 144
+xfs_attr_list_int_ilocked [vmlinux]: 144
+xfs_attr_list_int_ilocked [vmlinux]: 144
xfs_swap_extent_rmap [vmlinux]: 144
xfs_swap_extent_rmap [vmlinux]: 144
-xfs_readdir [vmlinux]: 144
-xfs_readdir [vmlinux]: 144
+xfs_ioc_trim [vmlinux]: 144
+xfs_ioc_trim [vmlinux]: 144
xfs_inactive_symlink_rmt [vmlinux]: 144
xfs_inactive_symlink_rmt [vmlinux]: 144
xfs_attr_get [vmlinux]: 136
xfs_attr_get [vmlinux]: 136
+xfs_attr_rmtval_get [vmlinux]: 136
+xfs_attr_rmtval_get [vmlinux]: 136
xfs_bmap_add_attrfork_local [vmlinux]:136
xfs_bmap_add_attrfork_local [vmlinux]:136
-__xfs_btree_split.isra.38 [vmlinux]: 136
-__xfs_btree_split.isra.38 [vmlinux]: 136
xfs_btree_split [vmlinux]: 136
xfs_btree_split [vmlinux]: 136
-xfs_btree_insrec [vmlinux]: 136
-xfs_btree_insrec [vmlinux]: 136
+xfs_rmap_finish_one [vmlinux]: 136
+xfs_rmap_finish_one [vmlinux]: 136
+xfs_attr3_leaf_inactive [vmlinux]: 136
+xfs_attr3_leaf_inactive [vmlinux]: 136
xfs_rename [vmlinux]: 136
xfs_rename [vmlinux]: 136
xfs_attr3_leaf_unbalance [vmlinux]: 120
xfs_attr3_leaf_unbalance [vmlinux]: 120
-xfs_attr_rmtval_set [vmlinux]: 120
-xfs_attr_rmtval_set [vmlinux]: 120
-xfs_ioc_fsgeometry_v1 [vmlinux]: 120
-xfs_ioc_fsgeometry_v1 [vmlinux]: 120
-xfs_ioc_fsgeometry [vmlinux]: 120
-xfs_ioc_fsgeometry [vmlinux]: 120
+xfs_refcount_merge_extents [vmlinux]:120
+xfs_refcount_merge_extents [vmlinux]:120
xfs_file_iomap_begin [vmlinux]: 120
xfs_file_iomap_begin [vmlinux]: 120
-xfs_compat_ioc_fsgeometry_v1 [vmlinux]:120
-xfs_compat_ioc_fsgeometry_v1 [vmlinux]:120
+xfs_ifree [vmlinux]: 120
+xfs_ifree [vmlinux]: 120
+xfs_qm_reset_dqcounts_buf.part.15 [vmlinux]:120
+xfs_qm_reset_dqcounts_buf.part.15 [vmlinux]:120
+xfs_rtallocate_extent [vmlinux]: 120
+xfs_rtallocate_extent [vmlinux]: 120
+xfs_free_ag_extent [vmlinux]: 112
xfs_attr3_leaf_toosmall [vmlinux]: 112
xfs_attr3_leaf_toosmall [vmlinux]: 112
-xfs_attr3_leaf_rebalance [vmlinux]: 112
-xfs_attr3_leaf_rebalance [vmlinux]: 112
+xfs_bmap_del_extent_delay [vmlinux]: 112
+xfs_bmap_del_extent_delay [vmlinux]: 112
xfs_iread_extents [vmlinux]: 112
xfs_iread_extents [vmlinux]: 112
-xfs_bmap_split_extent_at [vmlinux]: 112
-xfs_bmap_split_extent_at [vmlinux]: 112
xfs_btree_insert [vmlinux]: 112
xfs_btree_query_range [vmlinux]: 112
-xfs_da3_swap_lastblock [vmlinux]: 112
-xfs_da3_swap_lastblock [vmlinux]: 112
xfs_dir2_node_addname [vmlinux]: 112
xfs_dir2_node_addname [vmlinux]: 112
xfs_dir2_node_removename [vmlinux]: 112
xfs_dir2_node_removename [vmlinux]: 112
-xfs_trim_extents [vmlinux]: 112
-xfs_trim_extents [vmlinux]: 112
+xfs_difree [vmlinux]: 112
+xfs_difree [vmlinux]: 112
xfs_growfs_data [vmlinux]: 112
xfs_growfs_data [vmlinux]: 112
-xfs_ifree_cluster.isra.21 [vmlinux]: 112
-xfs_ifree_cluster.isra.21 [vmlinux]: 112
+xfs_iget [vmlinux]: 112
+xfs_iget [vmlinux]: 112
xfs_inumbers [vmlinux]: 112
xfs_inumbers [vmlinux]: 112
xfs_reflink_end_cow [vmlinux]: 112
@@ -149,17 +163,19 @@
xfs_vn_listxattr [vmlinux]: 112
xfs_vn_listxattr [vmlinux]: 112
xfsaild [vmlinux]: 112
+xfs_growfs_rt [vmlinux]: 112
+xfs_growfs_rt [vmlinux]: 112
trace_raw_output_xfs_dquot_class [vmlinux]:104
xfs_bmapi_reserve_delalloc [vmlinux]:104
xfs_bmapi_reserve_delalloc [vmlinux]:104
+xfs_btree_lshift [vmlinux]: 104
+xfs_btree_lshift [vmlinux]: 104
xfs_da_grow_inode_int [vmlinux]: 104
xfs_da_grow_inode_int [vmlinux]: 104
xfs_dir2_leaf_removename [vmlinux]: 104
xfs_dir2_leaf_removename [vmlinux]: 104
xfs_dir2_leafn_split [vmlinux]: 104
xfs_dir2_leafn_split [vmlinux]: 104
-xfs_iget [vmlinux]: 104
-xfs_iget [vmlinux]: 104
xfs_ioc_space [vmlinux]: 104
xfs_ioc_space [vmlinux]: 104
xfs_emerg [vmlinux]: 104
@@ -169,15 +185,9 @@
xfs_warn [vmlinux]: 104
xfs_notice [vmlinux]: 104
xfs_info [vmlinux]: 104
-xfs_reflink_remap_extent [vmlinux]: 104
-xfs_reflink_remap_extent [vmlinux]: 104
-xfs_reflink_dirty_extents [vmlinux]: 104
-xfs_reflink_dirty_extents [vmlinux]: 104
xfs_reflink_cancel_cow_blocks [vmlinux]:104
xfs_reflink_cancel_cow_blocks [vmlinux]:104
xfs_log_quiesce [vmlinux]: 104
xfs_log_quiesce [vmlinux]: 104
-xfs_rtallocate_extent_near [vmlinux]:104
-xfs_rtallocate_extent_near [vmlinux]:104
xfs_set_acl [vmlinux]: 104
xfs_set_acl [vmlinux]: 104



2018-11-12 21:46:00

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote:
> On 11/10/18 7:21 PM, Joe Perches wrote:
> > Reduce total object size quite a bit (~32KB) and presumably
> > improve performance at the same time.
> >
> > Total object size old vs new (x86-64 defconfig with xfs)
> >
> > text data bss dec hex filename
> > - 959351 165573 632 1125556 112cb4 (TOTALS) (old)
> > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new)
>
> And what does it do to maximum stack excursions?

Better yet: what does it do to corruption stack traces and debugging
tools like profiling traces?

i.e. this noinline directive isn't about stack usage, this is about
being able to debug production code. Basically the compiler inliner
is so agressive on static functions that it makes it impossible to
decipher the stack traces. It flattens them way too much to
be able to tell how we got to a specific location in the code.

In reality, being able to find problems quickly and efficiently is
far more important to us than being able to run everything at
ludicrous speed....

Cheers,

Dave.
--
Dave Chinner
[email protected]

2018-11-12 22:30:44

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Tue, 2018-11-13 at 08:45 +1100, Dave Chinner wrote:
> On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote:
> > On 11/10/18 7:21 PM, Joe Perches wrote:
> > > Reduce total object size quite a bit (~32KB) and presumably
> > > improve performance at the same time.
> > >
> > > Total object size old vs new (x86-64 defconfig with xfs)
> > >
> > > text data bss dec hex filename
> > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old)
> > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new)
> >
> > And what does it do to maximum stack excursions?
>
> Better yet: what does it do to corruption stack traces and debugging
> tools like profiling traces?
>
> i.e. this noinline directive isn't about stack usage, this is about
> being able to debug production code. Basically the compiler inliner
> is so agressive on static functions that it makes it impossible to
> decipher the stack traces. It flattens them way too much to
> be able to tell how we got to a specific location in the code.
>
> In reality, being able to find problems quickly and efficiently is
> far more important to us than being able to run everything at
> ludicrous speed....

Is that really a compelling argument given thw ~50:50
split of static/STATIC uses in xfs?

$ git grep -w STATIC fs/xfs/ | wc -l
1064
$ git grep -w static fs/xfs/ | wc -l
942



2018-11-13 01:20:19

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, Nov 12, 2018 at 02:30:01PM -0800, Joe Perches wrote:
> On Tue, 2018-11-13 at 08:45 +1100, Dave Chinner wrote:
> > On Mon, Nov 12, 2018 at 02:12:08PM -0600, Eric Sandeen wrote:
> > > On 11/10/18 7:21 PM, Joe Perches wrote:
> > > > Reduce total object size quite a bit (~32KB) and presumably
> > > > improve performance at the same time.
> > > >
> > > > Total object size old vs new (x86-64 defconfig with xfs)
> > > >
> > > > text data bss dec hex filename
> > > > - 959351 165573 632 1125556 112cb4 (TOTALS) (old)
> > > > + 924683 165669 632 1090984 10a5a8 (TOTALS) (new)
> > >
> > > And what does it do to maximum stack excursions?
> >
> > Better yet: what does it do to corruption stack traces and debugging
> > tools like profiling traces?
> >
> > i.e. this noinline directive isn't about stack usage, this is about
> > being able to debug production code. Basically the compiler inliner
> > is so agressive on static functions that it makes it impossible to
> > decipher the stack traces. It flattens them way too much to
> > be able to tell how we got to a specific location in the code.
> >
> > In reality, being able to find problems quickly and efficiently is
> > far more important to us than being able to run everything at
> > ludicrous speed....
>
> Is that really a compelling argument given thw ~50:50
> split of static/STATIC uses in xfs?

Historically, yes. We're talking about code with call chains that
can go 50-60 functions deep here. If that gets flattened to 10-20
functions by compiler inlining (which is the sort of thing that
happens) then we lose a huge amount of visibility into the workings
of the code. This affects profiling, stack traces on corruption,
dynamic debug probes, kernel crash dump analysis, etc.

I'm not interested in making code fast if distro support engineers
can't debug problems on user systems easily. Optimising for
performance over debuggability is a horrible trade off for us to
make because it means users and distros end up much more reliant on
single points of expertise for debugging problems. And that means
the majority of the load of problem triage falls directly on very
limited resources - the core XFS development team. A little bit of
thought about how to make code easier to triage and debug goes a
long, long way....

Indeed, this is not a new problem - we've been using techniques like
STATIC in one form or another to stop compiler inlining and/or
function hiding since XFS was first ported to linux 20 years ago.
In fact, STATIC was inherited from Irix because it helped with
debugging via the userspace simulator that the initial XFS code was
developed on. i.e. STATIC was present in the initial XFS commit
made way back in 1993, and we've been using it ever since...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2018-11-13 01:55:20

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> I'm not interested in making code fast if distro support engineers
> can't debug problems on user systems easily. Optimising for
> performance over debuggability is a horrible trade off for us to
> make because it means users and distros end up much more reliant on
> single points of expertise for debugging problems. And that means
> the majority of the load of problem triage falls directly on very
> limited resources - the core XFS development team. A little bit of
> thought about how to make code easier to triage and debug goes a
> long, long way....

So at least in my experience, if the kernels are compiled with
CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
scripts/decode_stracktrace.sh seems to do a very nice job with inlined
functions. Now, ext4 generally only has about 3 or 4 nested inlines,
and so I don't know how it works with 20 or 30 nested inlined
functions, so perhaps this is not applicable for XFS.

But it perhaps toolchain technology has advanced since the Irix days
such that it's no longer as necessary to force the non-inlining of
functions for easing debugging?

- Ted





2018-11-13 03:10:35

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote:
> On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> > I'm not interested in making code fast if distro support engineers
> > can't debug problems on user systems easily. Optimising for
> > performance over debuggability is a horrible trade off for us to
> > make because it means users and distros end up much more reliant on
> > single points of expertise for debugging problems. And that means
> > the majority of the load of problem triage falls directly on very
> > limited resources - the core XFS development team. A little bit of
> > thought about how to make code easier to triage and debug goes a
> > long, long way....
>
> So at least in my experience, if the kernels are compiled with
> CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
> scripts/decode_stracktrace.sh seems to do a very nice job with inlined

That doesn't help with kernel profiling and other such things that
are based on callgraphs...

> functions. Now, ext4 generally only has about 3 or 4 nested inlines,
> and so I don't know how it works with 20 or 30 nested inlined
> functions, so perhaps this is not applicable for XFS.
>
> But it perhaps toolchain technology has advanced since the Irix days
> such that it's no longer as necessary to force the non-inlining of
> functions for easing debugging?

Not that I've noticed.

Indeed, modern toolchains are moving the opposite direction - have
you ever tried to debug a binary with gdb that was compiled with LTO
enabled? Or maybe even just tried to profile it with perf?

Cheers,

Dave.
--
Dave Chinner
[email protected]

2018-11-13 04:24:57

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote:
> On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote:
> > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> > > I'm not interested in making code fast if distro support engineers
> > > can't debug problems on user systems easily. Optimising for
> > > performance over debuggability is a horrible trade off for us to
> > > make because it means users and distros end up much more reliant on
> > > single points of expertise for debugging problems. And that means
> > > the majority of the load of problem triage falls directly on very
> > > limited resources - the core XFS development team. A little bit of
> > > thought about how to make code easier to triage and debug goes a
> > > long, long way....
> >
> > So at least in my experience, if the kernels are compiled with
> > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
> > scripts/decode_stracktrace.sh seems to do a very nice job with inlined
>
> That doesn't help with kernel profiling and other such things that
> are based on callgraphs...

If that's really the case:

I rather suspect the xfs static v STATIC function marking is not
particularly curated and the marking is somewhat arbitrary.

So perhaps given the large number of static, but not STATIC
functions, perhaps a sed of s/static/STATIC/ should be done
when it's not inline for all xfs functions.


2018-11-13 05:27:34

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote:
> On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote:
> > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote:
> > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> > > > I'm not interested in making code fast if distro support engineers
> > > > can't debug problems on user systems easily. Optimising for
> > > > performance over debuggability is a horrible trade off for us to
> > > > make because it means users and distros end up much more reliant on
> > > > single points of expertise for debugging problems. And that means
> > > > the majority of the load of problem triage falls directly on very
> > > > limited resources - the core XFS development team. A little bit of
> > > > thought about how to make code easier to triage and debug goes a
> > > > long, long way....
> > >
> > > So at least in my experience, if the kernels are compiled with
> > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
> > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined
> >
> > That doesn't help with kernel profiling and other such things that
> > are based on callgraphs...
>
> If that's really the case:
>
> I rather suspect the xfs static v STATIC function marking is not
> particularly curated and the marking is somewhat arbitrary.

That's a common opinion for an outsider to form when they come
across something unfamiliar they don't really understand. "I don't
understand this, so I must rewrite it" is an unfortunate habit that
programmers have.

> So perhaps given the large number of static, but not STATIC
> functions, perhaps a sed of s/static/STATIC/ should be done
> when it's not inline for all xfs functions.

That's just as bad as removing them all, if not worse.

If you are writing new code or reworking existing code, then we'll
consider the usage of STATIC/static in the context of that work.
Otherwise, we leave it alone.

It if ain't broke, don't fix it. And it sure as hell isn't broken
right now. We've got more than enough bugs to fix without having to
deal with drive-by bikeshed painting...

-Dave.

--
Dave Chinner
[email protected]

2018-11-13 05:32:49

by Joe Perches

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Tue, 2018-11-13 at 16:26 +1100, Dave Chinner wrote:
> On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote:
> > On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote:
> > > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote:
> > > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> > > > > I'm not interested in making code fast if distro support engineers
> > > > > can't debug problems on user systems easily. Optimising for
> > > > > performance over debuggability is a horrible trade off for us to
> > > > > make because it means users and distros end up much more reliant on
> > > > > single points of expertise for debugging problems. And that means
> > > > > the majority of the load of problem triage falls directly on very
> > > > > limited resources - the core XFS development team. A little bit of
> > > > > thought about how to make code easier to triage and debug goes a
> > > > > long, long way....
> > > >
> > > > So at least in my experience, if the kernels are compiled with
> > > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
> > > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined
> > >
> > > That doesn't help with kernel profiling and other such things that
> > > are based on callgraphs...
> >
> > If that's really the case:
> >
> > I rather suspect the xfs static v STATIC function marking is not
> > particularly curated and the marking is somewhat arbitrary.
>
> That's a common opinion for an outsider to form when they come
> across something unfamiliar they don't really understand. "I don't
> understand this, so I must rewrite it" is an unfortunate habit that
> programmers have.

Silly.

> > So perhaps given the large number of static, but not STATIC
> > functions, perhaps a sed of s/static/STATIC/ should be done
> > when it's not inline for all xfs functions.
>
> That's just as bad as removing them all, if not worse.

Why?

> If you are writing new code or reworking existing code, then we'll
> consider the usage of STATIC/static in the context of that work.
> Otherwise, we leave it alone.

If your statement is as described above, and
the STATIC use to enable call stack tracing i
useful, why shouldn't it be systemic?

> It if ain't broke, don't fix it.

A generically lazy statement.



2018-11-13 05:45:42

by Darrick J. Wong

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Mon, Nov 12, 2018 at 09:31:51PM -0800, Joe Perches wrote:
> On Tue, 2018-11-13 at 16:26 +1100, Dave Chinner wrote:
> > On Mon, Nov 12, 2018 at 08:23:42PM -0800, Joe Perches wrote:
> > > On Tue, 2018-11-13 at 14:09 +1100, Dave Chinner wrote:
> > > > On Mon, Nov 12, 2018 at 08:54:10PM -0500, Theodore Y. Ts'o wrote:
> > > > > On Tue, Nov 13, 2018 at 12:18:05PM +1100, Dave Chinner wrote:
> > > > > > I'm not interested in making code fast if distro support engineers
> > > > > > can't debug problems on user systems easily. Optimising for
> > > > > > performance over debuggability is a horrible trade off for us to
> > > > > > make because it means users and distros end up much more reliant on
> > > > > > single points of expertise for debugging problems. And that means
> > > > > > the majority of the load of problem triage falls directly on very
> > > > > > limited resources - the core XFS development team. A little bit of
> > > > > > thought about how to make code easier to triage and debug goes a
> > > > > > long, long way....
> > > > >
> > > > > So at least in my experience, if the kernels are compiled with
> > > > > CONFIG_DEBUG_INFO and/or CONFIG_DEBUG_INFO_REDUCED,
> > > > > scripts/decode_stracktrace.sh seems to do a very nice job with inlined
> > > >
> > > > That doesn't help with kernel profiling and other such things that
> > > > are based on callgraphs...
> > >
> > > If that's really the case:
> > >
> > > I rather suspect the xfs static v STATIC function marking is not
> > > particularly curated and the marking is somewhat arbitrary.

I disagree. I've added plenty of code over the past couple of years.
Short functions with few or no branches (e.g. converters) are 'static';
longer functions (loops, iterators, "decide what to do with this"
functions, etc.) with many branches are STATIC to make it easier for me
to ftrace their decisions over a given dataset.

> > That's a common opinion for an outsider to form when they come
> > across something unfamiliar they don't really understand. "I don't
> > understand this, so I must rewrite it" is an unfortunate habit that
> > programmers have.
>
> Silly.

Yet frequent.

> > > So perhaps given the large number of static, but not STATIC
> > > functions, perhaps a sed of s/static/STATIC/ should be done
> > > when it's not inline for all xfs functions.
> >
> > That's just as bad as removing them all, if not worse.
>
> Why?
>
> > If you are writing new code or reworking existing code, then we'll
> > consider the usage of STATIC/static in the context of that work.
> > Otherwise, we leave it alone.
>
> If your statement is as described above, and
> the STATIC use to enable call stack tracing i
> useful, why shouldn't it be systemic?
>
> > It if ain't broke, don't fix it.
>
> A generically lazy statement.

Please everyone let's take a breather from this thread for a few hours.
A 3.7% reduction in code size is not worth getting worked up over, IMO.

--D

>
>

2018-11-15 10:13:48

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] xfs: Remove noinline from #define STATIC

On Tue, Nov 13, 2018 at 04:26:51PM +1100, Dave Chinner wrote:
> That's just as bad as removing them all, if not worse.
>
> If you are writing new code or reworking existing code, then we'll
> consider the usage of STATIC/static in the context of that work.
> Otherwise, we leave it alone.
>
> It if ain't broke, don't fix it. And it sure as hell isn't broken
> right now. We've got more than enough bugs to fix without having to
> deal with drive-by bikeshed painting...

Agreed. That being said I think we should aim to add manual
noline annotations to those functions where we really care while we
go through the code instead of the weird STATIC that just seems to
cause this kind of confusion.

And if Joe or somone else can come up with a few patches where removing
the noinline (aka STATIC) makes a huge difference for a small number
of functions we should consider it as well.