2019-08-29 08:36:19

by Rasmus Villemoes

[permalink] [raw]
Subject: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

gcc 9 provides a way to override the otherwise crude heuristic that
gcc uses to estimate the size of the code represented by an asm()
statement. From the gcc docs

If you use 'asm inline' instead of just 'asm', then for inlining
purposes the size of the asm is taken as the minimum size, ignoring
how many instructions GCC thinks it is.

For compatibility with older compilers, we obviously want a

#if new enough
#define asm_inline asm inline
#else
#define asm_inline asm
#endif

But since we #define the identifier inline to attach some attributes,
we have to use the alternate spelling __inline__ of that
keyword. Unfortunately, we also currently #define that one (to
inline), so we first have to get rid of all (mis)uses of
__inline__. Hence the huge diffstat. I think patch 1 should be
regenerated and applied just before -rc1.

There are a few remaining users of __inline__, in particular in uapi
headers, which I'm not sure what to do about (kernel-side users of
those will get slightly different semantics since we no longer
automatically attach the __attribute__((__gnu_inline__)) etc.). So RFC.

The two x86 changes cause smaller code gen differences than I'd
expect, but I think we do want the asm_inline thing available sooner
or later, so this is just to get the ball rolling.

Rasmus Villemoes (5):
treewide: replace __inline__ by inline
compiler_types.h: don't #define __inline__
compiler-gcc.h: add asm_inline definition
x86: alternative.h: use asm_inline for all alternative variants
x86: bug.h: use asm_inline in _BUG_FLAGS definitions

arch/alpha/include/asm/atomic.h | 12 +++---
arch/alpha/include/asm/bitops.h | 6 +--
arch/alpha/include/asm/dma.h | 22 +++++-----
arch/alpha/include/asm/floppy.h | 2 +-
arch/alpha/include/asm/irq.h | 2 +-
arch/alpha/include/asm/local.h | 4 +-
arch/alpha/include/asm/smp.h | 2 +-
.../arm/mach-iop32x/include/mach/uncompress.h | 2 +-
.../arm/mach-iop33x/include/mach/uncompress.h | 2 +-
.../arm/mach-ixp4xx/include/mach/uncompress.h | 2 +-
arch/ia64/hp/common/sba_iommu.c | 2 +-
arch/ia64/hp/sim/simeth.c | 2 +-
arch/ia64/include/asm/atomic.h | 8 ++--
arch/ia64/include/asm/bitops.h | 34 ++++++++--------
arch/ia64/include/asm/delay.h | 14 +++----
arch/ia64/include/asm/irq.h | 2 +-
arch/ia64/include/asm/page.h | 2 +-
arch/ia64/include/asm/sn/leds.h | 2 +-
arch/ia64/include/asm/uaccess.h | 4 +-
arch/ia64/oprofile/backtrace.c | 4 +-
arch/m68k/include/asm/blinken.h | 2 +-
arch/m68k/include/asm/checksum.h | 2 +-
arch/m68k/include/asm/dma.h | 32 +++++++--------
arch/m68k/include/asm/floppy.h | 8 ++--
arch/m68k/include/asm/nettel.h | 8 ++--
arch/m68k/mac/iop.c | 14 +++----
arch/mips/include/asm/atomic.h | 16 ++++----
arch/mips/include/asm/checksum.h | 2 +-
arch/mips/include/asm/dma.h | 20 +++++-----
arch/mips/include/asm/jazz.h | 2 +-
arch/mips/include/asm/local.h | 4 +-
arch/mips/include/asm/string.h | 8 ++--
arch/mips/kernel/binfmt_elfn32.c | 2 +-
arch/nds32/include/asm/swab.h | 4 +-
arch/parisc/include/asm/atomic.h | 20 +++++-----
arch/parisc/include/asm/bitops.h | 18 ++++-----
arch/parisc/include/asm/checksum.h | 4 +-
arch/parisc/include/asm/compat.h | 2 +-
arch/parisc/include/asm/delay.h | 2 +-
arch/parisc/include/asm/dma.h | 20 +++++-----
arch/parisc/include/asm/ide.h | 8 ++--
arch/parisc/include/asm/irq.h | 2 +-
arch/parisc/include/asm/spinlock.h | 12 +++---
arch/powerpc/include/asm/atomic.h | 40 +++++++++----------
arch/powerpc/include/asm/bitops.h | 28 ++++++-------
arch/powerpc/include/asm/dma.h | 20 +++++-----
arch/powerpc/include/asm/edac.h | 2 +-
arch/powerpc/include/asm/irq.h | 2 +-
arch/powerpc/include/asm/local.h | 14 +++----
arch/sh/include/asm/pgtable_64.h | 2 +-
arch/sh/include/asm/processor_32.h | 4 +-
arch/sh/include/cpu-sh3/cpu/dac.h | 6 +--
arch/x86/include/asm/alternative.h | 14 +++----
arch/x86/include/asm/bug.h | 4 +-
arch/x86/um/asm/checksum.h | 4 +-
arch/x86/um/asm/checksum_32.h | 4 +-
arch/xtensa/include/asm/checksum.h | 14 +++----
arch/xtensa/include/asm/cmpxchg.h | 4 +-
arch/xtensa/include/asm/irq.h | 2 +-
block/partitions/amiga.c | 2 +-
drivers/atm/he.c | 6 +--
drivers/atm/idt77252.c | 6 +--
drivers/gpu/drm/mga/mga_drv.h | 2 +-
drivers/gpu/drm/mga/mga_state.c | 14 +++----
drivers/gpu/drm/r128/r128_drv.h | 2 +-
drivers/gpu/drm/r128/r128_state.c | 14 +++----
drivers/gpu/drm/via/via_irq.c | 2 +-
drivers/gpu/drm/via/via_verifier.c | 30 +++++++-------
drivers/media/pci/ivtv/ivtv-ioctl.c | 2 +-
drivers/net/ethernet/sun/sungem.c | 8 ++--
drivers/net/ethernet/sun/sunhme.c | 6 +--
drivers/net/hamradio/baycom_ser_fdx.c | 2 +-
drivers/net/wan/lapbether.c | 2 +-
drivers/net/wan/n2.c | 4 +-
drivers/parisc/led.c | 4 +-
drivers/parisc/sba_iommu.c | 2 +-
drivers/parport/parport_gsc.h | 4 +-
drivers/scsi/lpfc/lpfc_scsi.c | 2 +-
drivers/scsi/pcmcia/sym53c500_cs.c | 4 +-
drivers/scsi/qla2xxx/qla_inline.h | 2 +-
drivers/scsi/qla2xxx/qla_os.c | 2 +-
drivers/tty/amiserial.c | 2 +-
drivers/tty/serial/ip22zilog.c | 2 +-
drivers/tty/serial/sunsab.c | 4 +-
drivers/tty/serial/sunzilog.c | 2 +-
drivers/video/fbdev/core/fbcon.c | 20 +++++-----
drivers/video/fbdev/ffb.c | 2 +-
drivers/video/fbdev/intelfb/intelfbdrv.c | 8 ++--
drivers/video/fbdev/intelfb/intelfbhw.c | 2 +-
drivers/w1/masters/matrox_w1.c | 4 +-
fs/coda/coda_linux.h | 6 +--
fs/freevxfs/vxfs_inode.c | 2 +-
fs/nfsd/nfsfh.h | 4 +-
include/acpi/platform/acgcc.h | 2 +-
include/asm-generic/ide_iops.h | 8 ++--
include/linux/atalk.h | 4 +-
include/linux/compiler-gcc.h | 4 ++
include/linux/compiler_types.h | 5 ++-
include/linux/hdlc.h | 4 +-
include/linux/inetdevice.h | 6 +--
include/linux/parport.h | 4 +-
include/linux/parport_pc.h | 22 +++++-----
include/net/ax25.h | 2 +-
include/net/checksum.h | 2 +-
include/net/dn_nsp.h | 16 ++++----
include/net/ip.h | 2 +-
include/net/ip6_checksum.h | 2 +-
include/net/ipx.h | 10 ++---
include/net/llc_c_ev.h | 4 +-
include/net/llc_conn.h | 4 +-
include/net/llc_s_ev.h | 2 +-
include/net/netrom.h | 8 ++--
include/net/scm.h | 14 +++----
include/net/udplite.h | 2 +-
include/net/x25.h | 8 ++--
include/net/xfrm.h | 18 ++++-----
include/video/newport.h | 12 +++---
net/appletalk/atalk_proc.c | 4 +-
net/appletalk/ddp.c | 2 +-
net/core/neighbour.c | 2 +-
net/core/scm.c | 2 +-
net/decnet/dn_nsp_in.c | 2 +-
net/decnet/dn_nsp_out.c | 2 +-
net/decnet/dn_route.c | 2 +-
net/decnet/dn_table.c | 4 +-
net/ipv6/af_inet6.c | 2 +-
net/ipv6/icmp.c | 4 +-
net/ipv6/udp.c | 2 +-
net/lapb/lapb_iface.c | 4 +-
net/llc/llc_input.c | 2 +-
sound/sparc/amd7930.c | 6 +--
131 files changed, 449 insertions(+), 442 deletions(-)

--
2.20.1


2019-08-29 08:37:10

by Rasmus Villemoes

[permalink] [raw]
Subject: [RFC PATCH 5/5] x86: bug.h: use asm_inline in _BUG_FLAGS definitions

This helps preventing a BUG* or WARN* in some static inline from
preventing that (or one of its callers) being inlined, so should allow
gcc to make better informed inlining decisions.

For example, with gcc 9.2, tcp_fastopen_no_cookie() vanishes from
net/ipv4/tcp_fastopen.o. It does not itself have any BUG or WARN, but
it calls dst_metric() which has a WARN_ON_ONCE - and despite that
WARN_ON_ONCE vanishing since the condition is compile-time false,
dst_metric() is apparently sufficiently "large" that when it gets
inlined into tcp_fastopen_no_cookie(), the latter becomes too large
for inlining.

Overall, if one asks size(1), .text decreases a little and .data
increases by about the same amount (x86-64 defconfig)

$ size vmlinux.{before,after}
text data bss dec hex filename
19709726 5202600 1630280 26542606 195020e vmlinux.before
19709330 5203068 1630280 26542678 1950256 vmlinux.after

while bloat-o-meter says

add/remove: 10/28 grow/shrink: 103/51 up/down: 3669/-2854 (815)
...
Total: Before=14783683, After=14784498, chg +0.01%

Signed-off-by: Rasmus Villemoes <[email protected]>
---
arch/x86/include/asm/bug.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 6804d6642767..facba9bc30ca 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -32,7 +32,7 @@

#define _BUG_FLAGS(ins, flags) \
do { \
- asm volatile("1:\t" ins "\n" \
+ asm_inline volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
"\t" __BUG_REL(%c0) "\t# bug_entry::file\n" \
@@ -49,7 +49,7 @@ do { \

#define _BUG_FLAGS(ins, flags) \
do { \
- asm volatile("1:\t" ins "\n" \
+ asm_inline volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
"\t.word %c0" "\t# bug_entry::flags\n" \
--
2.20.1

2019-08-29 08:37:23

by Rasmus Villemoes

[permalink] [raw]
Subject: [RFC PATCH 2/5] compiler_types.h: don't #define __inline__

The spelling __inline__ should be reserved for uses where one really
wants to refer to the inline keyword, regardless of whether or not the
spelling "inline" has been #defined to something else. Most users of
__inline__ have been converted to inline, for the remaining, this
represents a small change of semantics.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
include/linux/compiler_types.h | 1 -
1 file changed, 1 deletion(-)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 599c27b56c29..4a8b63e3a31d 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -150,7 +150,6 @@ struct ftrace_likely_data {
__maybe_unused notrace
#endif

-#define __inline__ inline
#define __inline inline

/*
--
2.20.1

2019-08-29 16:07:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On Thu, Aug 29, 2019 at 1:32 AM Rasmus Villemoes
<[email protected]> wrote:
>
> But since we #define the identifier inline to attach some attributes,
> we have to use the alternate spelling __inline__ of that
> keyword. Unfortunately, we also currently #define that one (to
> inline), so we first have to get rid of all (mis)uses of
> __inline__. Hence the huge diffstat.

Ugh. Not pretty, but I guess we're stuck with it.

However, it worries me a bit that you excluide the UAPI headers where
we still use "__inline__", and now the semantics of that will change
for the kernel (for some odd gcc versions).

I suspect we should just bite the bullet and you should do it to the
uapi headers too. We already use "inline" in a lot of them, so it's
not the case that we're using __inline__ because of some namespace
issue, as far as I can tell.

One option might be to just use "__inline" for the asm_inline thing.
We have way fewer of those. That would make the noise much less for
this patch series.

Linus

2019-08-29 17:37:31

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

+ Masahiro, who sent patches earlier this week also looking to clean
up our redefinition of `inline` more.
https://lkml.org/lkml/2019/8/28/44

On Thu, Aug 29, 2019 at 1:32 AM Rasmus Villemoes
<[email protected]> wrote:
>
> gcc 9 provides a way to override the otherwise crude heuristic that
> gcc uses to estimate the size of the code represented by an asm()
> statement. From the gcc docs
>
> If you use 'asm inline' instead of just 'asm', then for inlining
> purposes the size of the asm is taken as the minimum size, ignoring
> how many instructions GCC thinks it is.

Hi Rasmus,
Thanks for the RFC and including me in the discussion. Can you link
me to the release notes this came from, and the bug related to this
features development?

I'm curious what "the size of the asm" means, and how it differs
precisely from "how many instructions GCC thinks it is." I would
think those are one and the same? Or maybe "the size of the asm"
means the size in bytes when assembled to machine code, as opposed to
the count of assembly instructions?

It looks like LLVM estimates based on instruction count, not assembled
byte size:
https://github.com/llvm/llvm-project/blob/6289ee941d6f8fc222225fb6845efce477bf5094/llvm/lib/CodeGen/TargetInstrInfo.cpp#L75-L125
(That's an arch independent version, that only has an override for hexagon)
I'd imagine it would be possible to implement a machine-code-size
based estimate.

That one snippet alludes to a problem with the existing `asm` GNU C
extension. (I personally have fixed bugs in LLVM where forgetting to
measure the estimated size of inline assembly leads to instruction
selection that can't represent the appropriate jump ranges. Knowing
the size of inline assembly is important for the inliner's cost model,
and AIUI gets difficult for some of the assembler directives) If the
internal heuristic had an issue, I'm curious to know more about the
design decision that led to the introduction of a new GNU C extension
rather than adjustments to the existing heuristic or cost model, and I
assume the bug would have more info? (Maybe giving developers choice
between two different cost models? Did actual code break changing the
existing `asm` cost model? Feels like a choice between an imprecise
and a precise cost model, but at this point I'm speculating too far
ahead.)

In particular the reuse of the keyword `inline` in the new GNU C
extension is problematic for our code, as your patchset clearly
illustrates. I understand that adding new keywords to a language is
impossibly difficult (how many different meanings does `static` have
now, in different contexts) and thus the reuse of them in differing
parse contexts is common (reuse of `auto` in C++11 comes to mind,
though it's not a differing parse context). I think if the GCC
developers were aware that we redefine the `inline` keyword via C
preprocessor, they may have chosen a different design. But as you
illustrate, the changes we'd have to make to the kernel are not
insurmountable.

Tangent:
There are many GNU C extensions I love and I hope ISO will adopt.
I've been talking with Hans Boehm about joining up ISO WG14
specifically to help standardize them. But then again, I don't like
standards that I can't download (the ratified version, not a draft
version). I also think it's better for implementers to have more say
over standards, and the W3C/WHATWG split is a very very interesting
case study in this regard. That said, I wish more LLVM folks were
included in the design discussion of GNU C extensions; as trying to
reimplement new language features can flush out edge cases that allow
us to nail down behavior in the spec (let alone have a spec) ASAP.
The only way I find out about new GNU C extensions is when someone in
the kernel goes to use them, and Clang doesn't support it, then the
build is broken until we support them. :(

>
> For compatibility with older compilers, we obviously want a
>
> #if new enough
> #define asm_inline asm inline
> #else
> #define asm_inline asm
> #endif

Requesting a feature test macro to the GCC developers would be great;
then we could use feature detection in one place rather than brittle
version checks in multiple places. Imagine the C preprocessor would
define something like HAS_GNU_ASM_INLINE, then writing a guard would
be simple, and other compilers could define the same preprocessor
token when they add support. GCC already does this for a recent
extension to inline asm for x86 (I forget the feature name, something
to do with hinting at the flags or output IIRC, Redhat had a blog post
and we recently implemented support).

>
> But since we #define the identifier inline to attach some attributes,
> we have to use the alternate spelling __inline__ of that
> keyword. Unfortunately, we also currently #define that one (to
> inline), so we first have to get rid of all (mis)uses of
> __inline__. Hence the huge diffstat. I think patch 1 should be
> regenerated and applied just before -rc1.
>
> There are a few remaining users of __inline__, in particular in uapi
> headers, which I'm not sure what to do about (kernel-side users of
> those will get slightly different semantics since we no longer
> automatically attach the __attribute__((__gnu_inline__)) etc.). So RFC.

No thoughts on uapi, but I think we should break this work logically into two:
1. remove our redefinition of inline
2. implement asm_inline for GCC

I think what you have for 2 so far is ok, but I need to spend more
time thinking about it.

For 1:
Our redefinition of inline currently looks like:

146 #define inline inline __attribute__((__always_inline__))
__gnu_inline \
147 __maybe_unused notrace

So we have:
1. always_inline attribute
2. gnu_inline attribute
3. maybe_unused
4. notrace

I'm not convinced that always_inline works as intended. An inliner
can still refuse to inline something if it doesn't have the machinery
to perform all of the transformations required for inlining or it's
not safe to do so. The C preprocessor is the one true always inline
(and type safety be damned). It would be interesting to study the
effects of removing that attribute. Android's Java runtime, ART uses
this everywhere for all functions, and it's not clear that adding
attribute always_inline everywhere is an "optimization." Research
folks at Google are playing with finding better inlining heuristics
via machine learning, which is quite exciting.

I introduced gnu_inline; it's like the one semantically different
thing from C89 to C99 IIRC. I introduced it because a few places in
the kernel were redefining KBUILD_CFLAGS, dropping `-std=gnu89`. It
seems now that there's only a few places left that do that, and
they're all under arch/x86/ (see:
https://github.com/ClangBuiltLinux/linux/issues/618#issuecomment-525712390).
Note that it's a little tricky to undo this; someone just reported
yesterday that I broke kexec, and we're working on cleaning that up,
but I think doing that then adding a check to not redefine
KBUILD_CFLAGS (cc Joe) to scripts/checkpatch.pl would doable. If we
fixed that, than we could use `-fgnu_inline` (or w/e the spelling is)
command line flag instead of the compiler attribute.

Masahiro is playing around with the maybe_unused part. Seems to be a
difference in GCC and Clang warning on unused static inline functions.
I think this can be solved with correct usage of #ifdef guards for the
appropriate CONFIG_'s, rather than __maybe_unused.

notrace's definition is pretty complicated, I have no idea what any of
those attributes do...

But maybe all of that is moot, if we just use __inline__. Looking
more at your patchset after typing this all out, seems like that will
work.

>
> The two x86 changes cause smaller code gen differences than I'd
> expect, but I think we do want the asm_inline thing available sooner
> or later, so this is just to get the ball rolling.

Is it a net win for all arch's? Or just x86? `differences` being an
improvement or a regression?


>
> Rasmus Villemoes (5):
> treewide: replace __inline__ by inline
> compiler_types.h: don't #define __inline__
> compiler-gcc.h: add asm_inline definition
> x86: alternative.h: use asm_inline for all alternative variants
> x86: bug.h: use asm_inline in _BUG_FLAGS definitions
>
> arch/alpha/include/asm/atomic.h | 12 +++---
> arch/alpha/include/asm/bitops.h | 6 +--
> arch/alpha/include/asm/dma.h | 22 +++++-----
> arch/alpha/include/asm/floppy.h | 2 +-
> arch/alpha/include/asm/irq.h | 2 +-
> arch/alpha/include/asm/local.h | 4 +-
> arch/alpha/include/asm/smp.h | 2 +-
> .../arm/mach-iop32x/include/mach/uncompress.h | 2 +-
> .../arm/mach-iop33x/include/mach/uncompress.h | 2 +-
> .../arm/mach-ixp4xx/include/mach/uncompress.h | 2 +-
> arch/ia64/hp/common/sba_iommu.c | 2 +-
> arch/ia64/hp/sim/simeth.c | 2 +-
> arch/ia64/include/asm/atomic.h | 8 ++--
> arch/ia64/include/asm/bitops.h | 34 ++++++++--------
> arch/ia64/include/asm/delay.h | 14 +++----
> arch/ia64/include/asm/irq.h | 2 +-
> arch/ia64/include/asm/page.h | 2 +-
> arch/ia64/include/asm/sn/leds.h | 2 +-
> arch/ia64/include/asm/uaccess.h | 4 +-
> arch/ia64/oprofile/backtrace.c | 4 +-
> arch/m68k/include/asm/blinken.h | 2 +-
> arch/m68k/include/asm/checksum.h | 2 +-
> arch/m68k/include/asm/dma.h | 32 +++++++--------
> arch/m68k/include/asm/floppy.h | 8 ++--
> arch/m68k/include/asm/nettel.h | 8 ++--
> arch/m68k/mac/iop.c | 14 +++----
> arch/mips/include/asm/atomic.h | 16 ++++----
> arch/mips/include/asm/checksum.h | 2 +-
> arch/mips/include/asm/dma.h | 20 +++++-----
> arch/mips/include/asm/jazz.h | 2 +-
> arch/mips/include/asm/local.h | 4 +-
> arch/mips/include/asm/string.h | 8 ++--
> arch/mips/kernel/binfmt_elfn32.c | 2 +-
> arch/nds32/include/asm/swab.h | 4 +-
> arch/parisc/include/asm/atomic.h | 20 +++++-----
> arch/parisc/include/asm/bitops.h | 18 ++++-----
> arch/parisc/include/asm/checksum.h | 4 +-
> arch/parisc/include/asm/compat.h | 2 +-
> arch/parisc/include/asm/delay.h | 2 +-
> arch/parisc/include/asm/dma.h | 20 +++++-----
> arch/parisc/include/asm/ide.h | 8 ++--
> arch/parisc/include/asm/irq.h | 2 +-
> arch/parisc/include/asm/spinlock.h | 12 +++---
> arch/powerpc/include/asm/atomic.h | 40 +++++++++----------
> arch/powerpc/include/asm/bitops.h | 28 ++++++-------
> arch/powerpc/include/asm/dma.h | 20 +++++-----
> arch/powerpc/include/asm/edac.h | 2 +-
> arch/powerpc/include/asm/irq.h | 2 +-
> arch/powerpc/include/asm/local.h | 14 +++----
> arch/sh/include/asm/pgtable_64.h | 2 +-
> arch/sh/include/asm/processor_32.h | 4 +-
> arch/sh/include/cpu-sh3/cpu/dac.h | 6 +--
> arch/x86/include/asm/alternative.h | 14 +++----
> arch/x86/include/asm/bug.h | 4 +-
> arch/x86/um/asm/checksum.h | 4 +-
> arch/x86/um/asm/checksum_32.h | 4 +-
> arch/xtensa/include/asm/checksum.h | 14 +++----
> arch/xtensa/include/asm/cmpxchg.h | 4 +-
> arch/xtensa/include/asm/irq.h | 2 +-
> block/partitions/amiga.c | 2 +-
> drivers/atm/he.c | 6 +--
> drivers/atm/idt77252.c | 6 +--
> drivers/gpu/drm/mga/mga_drv.h | 2 +-
> drivers/gpu/drm/mga/mga_state.c | 14 +++----
> drivers/gpu/drm/r128/r128_drv.h | 2 +-
> drivers/gpu/drm/r128/r128_state.c | 14 +++----
> drivers/gpu/drm/via/via_irq.c | 2 +-
> drivers/gpu/drm/via/via_verifier.c | 30 +++++++-------
> drivers/media/pci/ivtv/ivtv-ioctl.c | 2 +-
> drivers/net/ethernet/sun/sungem.c | 8 ++--
> drivers/net/ethernet/sun/sunhme.c | 6 +--
> drivers/net/hamradio/baycom_ser_fdx.c | 2 +-
> drivers/net/wan/lapbether.c | 2 +-
> drivers/net/wan/n2.c | 4 +-
> drivers/parisc/led.c | 4 +-
> drivers/parisc/sba_iommu.c | 2 +-
> drivers/parport/parport_gsc.h | 4 +-
> drivers/scsi/lpfc/lpfc_scsi.c | 2 +-
> drivers/scsi/pcmcia/sym53c500_cs.c | 4 +-
> drivers/scsi/qla2xxx/qla_inline.h | 2 +-
> drivers/scsi/qla2xxx/qla_os.c | 2 +-
> drivers/tty/amiserial.c | 2 +-
> drivers/tty/serial/ip22zilog.c | 2 +-
> drivers/tty/serial/sunsab.c | 4 +-
> drivers/tty/serial/sunzilog.c | 2 +-
> drivers/video/fbdev/core/fbcon.c | 20 +++++-----
> drivers/video/fbdev/ffb.c | 2 +-
> drivers/video/fbdev/intelfb/intelfbdrv.c | 8 ++--
> drivers/video/fbdev/intelfb/intelfbhw.c | 2 +-
> drivers/w1/masters/matrox_w1.c | 4 +-
> fs/coda/coda_linux.h | 6 +--
> fs/freevxfs/vxfs_inode.c | 2 +-
> fs/nfsd/nfsfh.h | 4 +-
> include/acpi/platform/acgcc.h | 2 +-
> include/asm-generic/ide_iops.h | 8 ++--
> include/linux/atalk.h | 4 +-
> include/linux/compiler-gcc.h | 4 ++
> include/linux/compiler_types.h | 5 ++-
> include/linux/hdlc.h | 4 +-
> include/linux/inetdevice.h | 6 +--
> include/linux/parport.h | 4 +-
> include/linux/parport_pc.h | 22 +++++-----
> include/net/ax25.h | 2 +-
> include/net/checksum.h | 2 +-
> include/net/dn_nsp.h | 16 ++++----
> include/net/ip.h | 2 +-
> include/net/ip6_checksum.h | 2 +-
> include/net/ipx.h | 10 ++---
> include/net/llc_c_ev.h | 4 +-
> include/net/llc_conn.h | 4 +-
> include/net/llc_s_ev.h | 2 +-
> include/net/netrom.h | 8 ++--
> include/net/scm.h | 14 +++----
> include/net/udplite.h | 2 +-
> include/net/x25.h | 8 ++--
> include/net/xfrm.h | 18 ++++-----
> include/video/newport.h | 12 +++---
> net/appletalk/atalk_proc.c | 4 +-
> net/appletalk/ddp.c | 2 +-
> net/core/neighbour.c | 2 +-
> net/core/scm.c | 2 +-
> net/decnet/dn_nsp_in.c | 2 +-
> net/decnet/dn_nsp_out.c | 2 +-
> net/decnet/dn_route.c | 2 +-
> net/decnet/dn_table.c | 4 +-
> net/ipv6/af_inet6.c | 2 +-
> net/ipv6/icmp.c | 4 +-
> net/ipv6/udp.c | 2 +-
> net/lapb/lapb_iface.c | 4 +-
> net/llc/llc_input.c | 2 +-
> sound/sparc/amd7930.c | 6 +--
> 131 files changed, 449 insertions(+), 442 deletions(-)
>
> --
> 2.20.1
>


--
Thanks,
~Nick Desaulniers

2019-08-29 18:28:41

by Linus Torvalds

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On Thu, Aug 29, 2019 at 10:36 AM Nick Desaulniers
<[email protected]> wrote:
>
> I'm curious what "the size of the asm" means, and how it differs
> precisely from "how many instructions GCC thinks it is." I would
> think those are one and the same? Or maybe "the size of the asm"
> means the size in bytes when assembled to machine code, as opposed to
> the count of assembly instructions?

The problem is that we do different sections in the inline asm, and
the instruction counts are completely bogus as a result.

The actual instruction in the code stream may be just a single
instruction. But the out-of-line sections can be multiple instructions
and/or a data section that contains exception information.

So we want the asm inlined, because the _inline_ part (and the hot
instruction) is small, even though the asm technically maybe generates
many more bytes of additional data.

The worst offenders for this tend to be

- various exception tables for user accesses etc

- "alternatives" where we list two or more different asm alternatives
and then pick the right one at boot time depending on CPU ID flags

- "BUG_ON()" instructions where there's a "ud2" instruction and
various data annotations going with it

so gcc may be "technically correct" that the inline asm statement
contains ten instructions or more, but the actual instruction _code_
footprint in the asm is likely just a single instruction or two.

The statement counting is also completely off by the fact that some of
the "statements" are assembler directives (ie the
".pushsection"/".popsection" lines etc). So some of it is that the
instruction counting is off, but the largest part is that it's just
not relevant to the code footprint in that function.

Un-inlining a function because it contains a single inline asm
instruction is not productive. Yes, it might result in a smaller
binary over-all (because all those other non-code sections do take up
some space), but it actually results in a bigger code footprint.

Linus

2019-08-29 18:28:42

by Nadav Amit

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

> On Aug 29, 2019, at 11:15 AM, Linus Torvalds <[email protected]> wrote:
>
> On Thu, Aug 29, 2019 at 10:36 AM Nick Desaulniers
> <[email protected]> wrote:
>> I'm curious what "the size of the asm" means, and how it differs
>> precisely from "how many instructions GCC thinks it is." I would
>> think those are one and the same? Or maybe "the size of the asm"
>> means the size in bytes when assembled to machine code, as opposed to
>> the count of assembly instructions?
>
> The problem is that we do different sections in the inline asm, and
> the instruction counts are completely bogus as a result.
>
> The actual instruction in the code stream may be just a single
> instruction. But the out-of-line sections can be multiple instructions
> and/or a data section that contains exception information.
>
> So we want the asm inlined, because the _inline_ part (and the hot
> instruction) is small, even though the asm technically maybe generates
> many more bytes of additional data.
>
> The worst offenders for this tend to be
>
> - various exception tables for user accesses etc
>
> - "alternatives" where we list two or more different asm alternatives
> and then pick the right one at boot time depending on CPU ID flags
>
> - "BUG_ON()" instructions where there's a "ud2" instruction and
> various data annotations going with it
>
> so gcc may be "technically correct" that the inline asm statement
> contains ten instructions or more, but the actual instruction _code_
> footprint in the asm is likely just a single instruction or two.
>
> The statement counting is also completely off by the fact that some of
> the "statements" are assembler directives (ie the
> ".pushsection"/".popsection" lines etc). So some of it is that the
> instruction counting is off, but the largest part is that it's just
> not relevant to the code footprint in that function.
>
> Un-inlining a function because it contains a single inline asm
> instruction is not productive. Yes, it might result in a smaller
> binary over-all (because all those other non-code sections do take up
> some space), but it actually results in a bigger code footprint.

For the record, here is my failing attempt to address the issue without GCC
support:

https://lore.kernel.org/lkml/[email protected]/T/

2019-08-29 18:43:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On Thu, Aug 29, 2019 at 11:15:04AM -0700, Linus Torvalds wrote:
> Un-inlining a function because it contains a single inline asm
> instruction is not productive. Yes, it might result in a smaller
> binary over-all (because all those other non-code sections do take up
> some space), but it actually results in a bigger code footprint.

... and also, like one of the gcc guys said at the time, we should be
careful when using this asm inlining, because, well, if we inline it
everywhere just like always_inline functions and the code footprint
grows considerably, then we get what we deserve.

So the onus is on us to keep such sequences small.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2019-08-29 19:43:14

by Masahiro Yamada

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On Fri, Aug 30, 2019 at 2:36 AM Nick Desaulniers
<[email protected]> wrote:
>
> + Masahiro, who sent patches earlier this week also looking to clean
> up our redefinition of `inline` more.
> https://lkml.org/lkml/2019/8/28/44

I think you noticed what I was trying to do.

Yes, my work for cleaning 'inline' was initially motivated
by the 'asm inline' of GCC 9.

The ultimate goal is to stop macrofying 'inline', but
I am not sure whether or not it is achievable.



> On Thu, Aug 29, 2019 at 1:32 AM Rasmus Villemoes
> <[email protected]> wrote:
> >
> > gcc 9 provides a way to override the otherwise crude heuristic that
> > gcc uses to estimate the size of the code represented by an asm()
> > statement. From the gcc docs
> >
> > If you use 'asm inline' instead of just 'asm', then for inlining
> > purposes the size of the asm is taken as the minimum size, ignoring
> > how many instructions GCC thinks it is.
>
> Hi Rasmus,
> Thanks for the RFC and including me in the discussion. Can you link
> me to the release notes this came from, and the bug related to this
> features development?

Just in care you are interested, this started from this thread:
https://lkml.org/lkml/2018/10/7/92



> I'm curious what "the size of the asm" means, and how it differs
> precisely from "how many instructions GCC thinks it is." I would
> think those are one and the same? Or maybe "the size of the asm"
> means the size in bytes when assembled to machine code, as opposed to
> the count of assembly instructions?
>
> It looks like LLVM estimates based on instruction count, not assembled
> byte size:
> https://github.com/llvm/llvm-project/blob/6289ee941d6f8fc222225fb6845efce477bf5094/llvm/lib/CodeGen/TargetInstrInfo.cpp#L75-L125
> (That's an arch independent version, that only has an override for hexagon)
> I'd imagine it would be possible to implement a machine-code-size
> based estimate.
>
> That one snippet alludes to a problem with the existing `asm` GNU C
> extension. (I personally have fixed bugs in LLVM where forgetting to
> measure the estimated size of inline assembly leads to instruction
> selection that can't represent the appropriate jump ranges. Knowing
> the size of inline assembly is important for the inliner's cost model,
> and AIUI gets difficult for some of the assembler directives) If the
> internal heuristic had an issue, I'm curious to know more about the
> design decision that led to the introduction of a new GNU C extension
> rather than adjustments to the existing heuristic or cost model, and I
> assume the bug would have more info? (Maybe giving developers choice
> between two different cost models? Did actual code break changing the
> existing `asm` cost model? Feels like a choice between an imprecise
> and a precise cost model, but at this point I'm speculating too far
> ahead.)
>
> In particular the reuse of the keyword `inline` in the new GNU C
> extension is problematic for our code, as your patchset clearly
> illustrates. I understand that adding new keywords to a language is
> impossibly difficult (how many different meanings does `static` have
> now, in different contexts) and thus the reuse of them in differing
> parse contexts is common (reuse of `auto` in C++11 comes to mind,
> though it's not a differing parse context). I think if the GCC
> developers were aware that we redefine the `inline` keyword via C
> preprocessor, they may have chosen a different design. But as you
> illustrate, the changes we'd have to make to the kernel are not
> insurmountable.
>
> Tangent:
> There are many GNU C extensions I love and I hope ISO will adopt.
> I've been talking with Hans Boehm about joining up ISO WG14
> specifically to help standardize them. But then again, I don't like
> standards that I can't download (the ratified version, not a draft
> version). I also think it's better for implementers to have more say
> over standards, and the W3C/WHATWG split is a very very interesting
> case study in this regard. That said, I wish more LLVM folks were
> included in the design discussion of GNU C extensions; as trying to
> reimplement new language features can flush out edge cases that allow
> us to nail down behavior in the spec (let alone have a spec) ASAP.
> The only way I find out about new GNU C extensions is when someone in
> the kernel goes to use them, and Clang doesn't support it, then the
> build is broken until we support them. :(
>
> >
> > For compatibility with older compilers, we obviously want a
> >
> > #if new enough
> > #define asm_inline asm inline
> > #else
> > #define asm_inline asm
> > #endif
>
> Requesting a feature test macro to the GCC developers would be great;
> then we could use feature detection in one place rather than brittle
> version checks in multiple places. Imagine the C preprocessor would
> define something like HAS_GNU_ASM_INLINE, then writing a guard would
> be simple, and other compilers could define the same preprocessor
> token when they add support. GCC already does this for a recent
> extension to inline asm for x86 (I forget the feature name, something
> to do with hinting at the flags or output IIRC, Redhat had a blog post
> and we recently implemented support).


If interested, my prototype-patch is here:
https://lkml.org/lkml/2018/12/13/212

I implemented the feature test in Kconfig.

config CC_HAS_ASM_INLINE
def_bool $(success,echo 'void foo(void) { asm inline (""); }' |
$(CC) -x c - -c -o /dev/null)

I am also fine with checking it by version
since we know GCC 9 started supporting it.


> >
> > But since we #define the identifier inline to attach some attributes,
> > we have to use the alternate spelling __inline__ of that
> > keyword. Unfortunately, we also currently #define that one (to
> > inline), so we first have to get rid of all (mis)uses of
> > __inline__. Hence the huge diffstat. I think patch 1 should be
> > regenerated and applied just before -rc1.
> >
> > There are a few remaining users of __inline__, in particular in uapi
> > headers, which I'm not sure what to do about (kernel-side users of
> > those will get slightly different semantics since we no longer
> > automatically attach the __attribute__((__gnu_inline__)) etc.). So RFC.

The exported headers must use '__inline__' instead of 'inline':
This is now compile-tested.
https://github.com/torvalds/linux/blob/v5.3-rc6/usr/include/Makefile#L5


At this moment, 'inline' is replaced with '__inline__' with sed:
https://github.com/torvalds/linux/blob/v5.3-rc6/scripts/headers_install.sh#L37

But, I'd like to reduce the sed patterns in the future.



I think using '__inline' for asm_inlie will be OK.




> No thoughts on uapi, but I think we should break this work logically into two:
> 1. remove our redefinition of inline
> 2. implement asm_inline for GCC
>
> I think what you have for 2 so far is ok, but I need to spend more
> time thinking about it.
>
> For 1:
> Our redefinition of inline currently looks like:
>
> 146 #define inline inline __attribute__((__always_inline__))
> __gnu_inline \
> 147 __maybe_unused notrace
>
> So we have:
> 1. always_inline attribute
> 2. gnu_inline attribute
> 3. maybe_unused
> 4. notrace
>
> I'm not convinced that always_inline works as intended. An inliner
> can still refuse to inline something if it doesn't have the machinery
> to perform all of the transformations required for inlining or it's
> not safe to do so. The C preprocessor is the one true always inline
> (and type safety be damned). It would be interesting to study the
> effects of removing that attribute. Android's Java runtime, ART uses
> this everywhere for all functions, and it's not clear that adding
> attribute always_inline everywhere is an "optimization." Research
> folks at Google are playing with finding better inlining heuristics
> via machine learning, which is quite exciting.

I want to get rid of always_inline.

My commit 9012d011660ea5cf2a623e1de207a2bc0ca6936d
is the first step towards the goal.
All architectures support CONFIG_OPTIMIZE_INLINING,
and it is stable up to now, I think.

Next, set CONFIG_OPTIMIZE_INLINING=y forcibly,
and lastly, remove CONFIG_OPTIMIZE_INLINING and always_inline
entirely.



> I introduced gnu_inline; it's like the one semantically different
> thing from C89 to C99 IIRC. I introduced it because a few places in
> the kernel were redefining KBUILD_CFLAGS, dropping `-std=gnu89`. It
> seems now that there's only a few places left that do that, and
> they're all under arch/x86/ (see:
> https://github.com/ClangBuiltLinux/linux/issues/618#issuecomment-525712390).
> Note that it's a little tricky to undo this; someone just reported
> yesterday that I broke kexec, and we're working on cleaning that up,
> but I think doing that then adding a check to not redefine
> KBUILD_CFLAGS (cc Joe) to scripts/checkpatch.pl would doable. If we
> fixed that, than we could use `-fgnu_inline` (or w/e the spelling is)
> command line flag instead of the compiler attribute.

Probably we can get rid of this too.

> Masahiro is playing around with the maybe_unused part. Seems to be a
> difference in GCC and Clang warning on unused static inline functions.
> I think this can be solved with correct usage of #ifdef guards for the
> appropriate CONFIG_'s, rather than __maybe_unused.

Yes.

> notrace's definition is pretty complicated, I have no idea what any of
> those attributes do...
>
> But maybe all of that is moot, if we just use __inline__. Looking
> more at your patchset after typing this all out, seems like that will
> work.
>
> >
> > The two x86 changes cause smaller code gen differences than I'd
> > expect, but I think we do want the asm_inline thing available sooner
> > or later, so this is just to get the ball rolling.
>
> Is it a net win for all arch's? Or just x86? `differences` being an
> improvement or a regression?
>
>
> >
> > Rasmus Villemoes (5):
> > treewide: replace __inline__ by inline
> > compiler_types.h: don't #define __inline__
> > compiler-gcc.h: add asm_inline definition
> > x86: alternative.h: use asm_inline for all alternative variants
> > x86: bug.h: use asm_inline in _BUG_FLAGS definitions
> >
> > arch/alpha/include/asm/atomic.h | 12 +++---
> > arch/alpha/include/asm/bitops.h | 6 +--
> > arch/alpha/include/asm/dma.h | 22 +++++-----
> > arch/alpha/include/asm/floppy.h | 2 +-
> > arch/alpha/include/asm/irq.h | 2 +-
> > arch/alpha/include/asm/local.h | 4 +-
> > arch/alpha/include/asm/smp.h | 2 +-
> > .../arm/mach-iop32x/include/mach/uncompress.h | 2 +-
> > .../arm/mach-iop33x/include/mach/uncompress.h | 2 +-
> > .../arm/mach-ixp4xx/include/mach/uncompress.h | 2 +-
> > arch/ia64/hp/common/sba_iommu.c | 2 +-
> > arch/ia64/hp/sim/simeth.c | 2 +-
> > arch/ia64/include/asm/atomic.h | 8 ++--
> > arch/ia64/include/asm/bitops.h | 34 ++++++++--------
> > arch/ia64/include/asm/delay.h | 14 +++----
> > arch/ia64/include/asm/irq.h | 2 +-
> > arch/ia64/include/asm/page.h | 2 +-
> > arch/ia64/include/asm/sn/leds.h | 2 +-
> > arch/ia64/include/asm/uaccess.h | 4 +-
> > arch/ia64/oprofile/backtrace.c | 4 +-
> > arch/m68k/include/asm/blinken.h | 2 +-
> > arch/m68k/include/asm/checksum.h | 2 +-
> > arch/m68k/include/asm/dma.h | 32 +++++++--------
> > arch/m68k/include/asm/floppy.h | 8 ++--
> > arch/m68k/include/asm/nettel.h | 8 ++--
> > arch/m68k/mac/iop.c | 14 +++----
> > arch/mips/include/asm/atomic.h | 16 ++++----
> > arch/mips/include/asm/checksum.h | 2 +-
> > arch/mips/include/asm/dma.h | 20 +++++-----
> > arch/mips/include/asm/jazz.h | 2 +-
> > arch/mips/include/asm/local.h | 4 +-
> > arch/mips/include/asm/string.h | 8 ++--
> > arch/mips/kernel/binfmt_elfn32.c | 2 +-
> > arch/nds32/include/asm/swab.h | 4 +-
> > arch/parisc/include/asm/atomic.h | 20 +++++-----
> > arch/parisc/include/asm/bitops.h | 18 ++++-----
> > arch/parisc/include/asm/checksum.h | 4 +-
> > arch/parisc/include/asm/compat.h | 2 +-
> > arch/parisc/include/asm/delay.h | 2 +-
> > arch/parisc/include/asm/dma.h | 20 +++++-----
> > arch/parisc/include/asm/ide.h | 8 ++--
> > arch/parisc/include/asm/irq.h | 2 +-
> > arch/parisc/include/asm/spinlock.h | 12 +++---
> > arch/powerpc/include/asm/atomic.h | 40 +++++++++----------
> > arch/powerpc/include/asm/bitops.h | 28 ++++++-------
> > arch/powerpc/include/asm/dma.h | 20 +++++-----
> > arch/powerpc/include/asm/edac.h | 2 +-
> > arch/powerpc/include/asm/irq.h | 2 +-
> > arch/powerpc/include/asm/local.h | 14 +++----
> > arch/sh/include/asm/pgtable_64.h | 2 +-
> > arch/sh/include/asm/processor_32.h | 4 +-
> > arch/sh/include/cpu-sh3/cpu/dac.h | 6 +--
> > arch/x86/include/asm/alternative.h | 14 +++----
> > arch/x86/include/asm/bug.h | 4 +-
> > arch/x86/um/asm/checksum.h | 4 +-
> > arch/x86/um/asm/checksum_32.h | 4 +-
> > arch/xtensa/include/asm/checksum.h | 14 +++----
> > arch/xtensa/include/asm/cmpxchg.h | 4 +-
> > arch/xtensa/include/asm/irq.h | 2 +-
> > block/partitions/amiga.c | 2 +-
> > drivers/atm/he.c | 6 +--
> > drivers/atm/idt77252.c | 6 +--
> > drivers/gpu/drm/mga/mga_drv.h | 2 +-
> > drivers/gpu/drm/mga/mga_state.c | 14 +++----
> > drivers/gpu/drm/r128/r128_drv.h | 2 +-
> > drivers/gpu/drm/r128/r128_state.c | 14 +++----
> > drivers/gpu/drm/via/via_irq.c | 2 +-
> > drivers/gpu/drm/via/via_verifier.c | 30 +++++++-------
> > drivers/media/pci/ivtv/ivtv-ioctl.c | 2 +-
> > drivers/net/ethernet/sun/sungem.c | 8 ++--
> > drivers/net/ethernet/sun/sunhme.c | 6 +--
> > drivers/net/hamradio/baycom_ser_fdx.c | 2 +-
> > drivers/net/wan/lapbether.c | 2 +-
> > drivers/net/wan/n2.c | 4 +-
> > drivers/parisc/led.c | 4 +-
> > drivers/parisc/sba_iommu.c | 2 +-
> > drivers/parport/parport_gsc.h | 4 +-
> > drivers/scsi/lpfc/lpfc_scsi.c | 2 +-
> > drivers/scsi/pcmcia/sym53c500_cs.c | 4 +-
> > drivers/scsi/qla2xxx/qla_inline.h | 2 +-
> > drivers/scsi/qla2xxx/qla_os.c | 2 +-
> > drivers/tty/amiserial.c | 2 +-
> > drivers/tty/serial/ip22zilog.c | 2 +-
> > drivers/tty/serial/sunsab.c | 4 +-
> > drivers/tty/serial/sunzilog.c | 2 +-
> > drivers/video/fbdev/core/fbcon.c | 20 +++++-----
> > drivers/video/fbdev/ffb.c | 2 +-
> > drivers/video/fbdev/intelfb/intelfbdrv.c | 8 ++--
> > drivers/video/fbdev/intelfb/intelfbhw.c | 2 +-
> > drivers/w1/masters/matrox_w1.c | 4 +-
> > fs/coda/coda_linux.h | 6 +--
> > fs/freevxfs/vxfs_inode.c | 2 +-
> > fs/nfsd/nfsfh.h | 4 +-
> > include/acpi/platform/acgcc.h | 2 +-
> > include/asm-generic/ide_iops.h | 8 ++--
> > include/linux/atalk.h | 4 +-
> > include/linux/compiler-gcc.h | 4 ++
> > include/linux/compiler_types.h | 5 ++-
> > include/linux/hdlc.h | 4 +-
> > include/linux/inetdevice.h | 6 +--
> > include/linux/parport.h | 4 +-
> > include/linux/parport_pc.h | 22 +++++-----
> > include/net/ax25.h | 2 +-
> > include/net/checksum.h | 2 +-
> > include/net/dn_nsp.h | 16 ++++----
> > include/net/ip.h | 2 +-
> > include/net/ip6_checksum.h | 2 +-
> > include/net/ipx.h | 10 ++---
> > include/net/llc_c_ev.h | 4 +-
> > include/net/llc_conn.h | 4 +-
> > include/net/llc_s_ev.h | 2 +-
> > include/net/netrom.h | 8 ++--
> > include/net/scm.h | 14 +++----
> > include/net/udplite.h | 2 +-
> > include/net/x25.h | 8 ++--
> > include/net/xfrm.h | 18 ++++-----
> > include/video/newport.h | 12 +++---
> > net/appletalk/atalk_proc.c | 4 +-
> > net/appletalk/ddp.c | 2 +-
> > net/core/neighbour.c | 2 +-
> > net/core/scm.c | 2 +-
> > net/decnet/dn_nsp_in.c | 2 +-
> > net/decnet/dn_nsp_out.c | 2 +-
> > net/decnet/dn_route.c | 2 +-
> > net/decnet/dn_table.c | 4 +-
> > net/ipv6/af_inet6.c | 2 +-
> > net/ipv6/icmp.c | 4 +-
> > net/ipv6/udp.c | 2 +-
> > net/lapb/lapb_iface.c | 4 +-
> > net/llc/llc_input.c | 2 +-
> > sound/sparc/amd7930.c | 6 +--
> > 131 files changed, 449 insertions(+), 442 deletions(-)
> >
> > --
> > 2.20.1
> >
>
>
> --
> Thanks,
> ~Nick Desaulniers



--
Best Regards

Masahiro Yamada

2019-08-30 07:47:29

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [RFC PATCH 0/5] make use of gcc 9's "asm inline()"

On 29/08/2019 18.05, Linus Torvalds wrote:
> On Thu, Aug 29, 2019 at 1:32 AM Rasmus Villemoes
> <[email protected]> wrote:
>>
>> But since we #define the identifier inline to attach some attributes,
>> we have to use the alternate spelling __inline__ of that
>> keyword. Unfortunately, we also currently #define that one (to
>> inline), so we first have to get rid of all (mis)uses of
>> __inline__. Hence the huge diffstat.
>
> Ugh. Not pretty, but I guess we're stuck with it.
>
> However, it worries me a bit that you excluide the UAPI headers where
> we still use "__inline__", and now the semantics of that will change
> for the kernel (for some odd gcc versions).

Yeah, as I wrote I was aware of that, but didn't have any good ideas, so
I was fishing. Doing

#ifdef __KERNEL__
#define __uapi_inline inline
#else
#define __uapi_inline __inline__
#endif

was one of the bad ideas...

> I suspect we should just bite the bullet and you should do it to the
> uapi headers too. We already use "inline" in a lot of them, so it's
> not the case that we're using __inline__ because of some namespace
> issue, as far as I can tell.
>
> One option might be to just use "__inline" for the asm_inline thing.
> We have way fewer of those. That would make the noise much less for
> this patch series.

Ah, interesting. I didn't know that was also a compiler provided alias.
It seems to be entirely undocumented -
https://gcc.gnu.org/onlinedocs/gcc/Alternate-Keywords.html just mentions
the __inline__ (and __asm__) spellings. But it's clear as day in the gcc
sources

{ "__inline", RID_INLINE, 0 },
{ "__inline__", RID_INLINE, 0 },

and has been that way since forever, AFAICT.

So yes, let's just start by getting rid of the __inline define and fix
the staging (+acpi,zstd) users, to allow asm_inline to progress. I'll
respin.

Rasmus

2019-08-30 23:16:40

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 0/6] make use of gcc 9's "asm inline()"

gcc 9 provides a way to override the otherwise crude heuristic that
gcc uses to estimate the size of the code represented by an asm()
statement. From the gcc docs

If you use 'asm inline' instead of just 'asm', then for inlining
purposes the size of the asm is taken as the minimum size, ignoring
how many instructions GCC thinks it is.

For compatibility with older compilers, we obviously want a

#if new enough
#define asm_inline asm inline
#else
#define asm_inline asm
#endif

But since we #define the identifier inline to attach some attributes,
we have to use an alternate spelling of that keyword. gcc provides
both __inline__ and __inline, and we currently #define both to inline,
so they all have the same semantics. We have to free up one of
__inline__ and __inline, and the latter is by far the easiest.

The two x86 changes cause smaller code gen differences than I'd
expect, but I think we do want the asm_inline thing available sooner
or later, so this is just to get the ball rolling.

Changes since v1: __inline instead of __inline__, making the diffstat
400 lines smaller. Probably no longer needs special handling (having
Linus run a script to generate patch 1), so if the x86 folks want 5/6
and 6/6, perhaps the whole thing can be routed that way.

Rasmus Villemoes (6):
staging: rtl8723bs: replace __inline by inline
lib/zstd/mem.h: replace __inline by inline
compiler_types.h: don't #define __inline
compiler-gcc.h: add asm_inline definition
x86: alternative.h: use asm_inline for all alternative variants
x86: bug.h: use asm_inline in _BUG_FLAGS definitions

arch/x86/include/asm/alternative.h | 14 +++++++-------
arch/x86/include/asm/bug.h | 4 ++--
drivers/staging/rtl8723bs/core/rtw_pwrctrl.c | 4 ++--
drivers/staging/rtl8723bs/core/rtw_wlan_util.c | 2 +-
drivers/staging/rtl8723bs/include/drv_types.h | 6 +++---
.../staging/rtl8723bs/include/osdep_service.h | 10 +++++-----
.../rtl8723bs/include/osdep_service_linux.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_mlme.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_recv.h | 16 ++++++++--------
drivers/staging/rtl8723bs/include/sta_info.h | 2 +-
drivers/staging/rtl8723bs/include/wifi.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/wlan_bssdef.h | 2 +-
include/linux/compiler-gcc.h | 4 ++++
include/linux/compiler_types.h | 15 ++++++++++++++-
lib/zstd/mem.h | 2 +-
15 files changed, 70 insertions(+), 53 deletions(-)

--
2.20.1

2019-08-30 23:16:48

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 3/6] compiler_types.h: don't #define __inline

The spellings __inline and __inline__ should be reserved for uses
where one really wants to refer to the inline keyword, regardless of
whether or not the spelling "inline" has been #defined to something
else. Due to use of __inline__ in uapi headers, we can't easily get
rid of the definition of __inline__. However, almost all users of
__inline has been converted to inline, so we can get rid of that
#define.

The exception is include/acpi/platform/acintel.h. However, that header
is only included when using the intel compiler (does anybody actually
build the kernel with that?), and the ACPI_INLINE macro is only used
in the definition of utterly trivial stub functions, where I doubt a
small change of semantics (lack of __gnu_inline) changes anything.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
include/linux/compiler_types.h | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 599c27b56c29..ee49be6d6088 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -150,8 +150,17 @@ struct ftrace_likely_data {
__maybe_unused notrace
#endif

+/*
+ * gcc provides both __inline__ and __inline as alternate spellings of
+ * the inline keyword, though the latter is undocumented. New kernel
+ * code should only use the inline spelling, but some existing code
+ * uses __inline__. Since we #define inline above, to ensure
+ * __inline__ has the same semantics, we need this #define.
+ *
+ * However, the spelling __inline is strictly reserved for referring
+ * to the bare keyword.
+ */
#define __inline__ inline
-#define __inline inline

/*
* Rather then using noinline to prevent stack consumption, use
--
2.20.1

2019-08-30 23:16:53

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

This adds an asm_inline macro which expands to "asm inline" [1] when gcc
is new enough (>= 9.1), and just asm for older gccs and other
compilers.

Using asm inline("foo") instead of asm("foo") overrules gcc's
heuristic estimate of the size of the code represented by the asm()
statement, and makes gcc use the minimum possible size instead. That
can in turn affect gcc's inlining decisions.

I wasn't sure whether to make this a function-like macro or not - this
way, it can be combined with volatile as

asm_inline volatile()

but perhaps we'd prefer to spell that

asm_inline_volatile()

anyway.

[1] Technically, asm __inline, since both inline and __inline__
are macros that attach various attributes, making gcc barf if one
literally does "asm inline()". However, the third spelling __inline is
available for referring to the bare keyword.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
include/linux/compiler-gcc.h | 4 ++++
include/linux/compiler_types.h | 4 ++++
2 files changed, 8 insertions(+)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index d7ee4c6bad48..544b87b41b58 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -172,3 +172,7 @@
#endif

#define __no_fgcse __attribute__((optimize("-fno-gcse")))
+
+#if GCC_VERSION >= 90100
+#define asm_inline asm __inline
+#endif
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index ee49be6d6088..ba8d81b716c7 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -198,6 +198,10 @@ struct ftrace_likely_data {
#define asm_volatile_goto(x...) asm goto(x)
#endif

+#ifndef asm_inline
+#define asm_inline asm
+#endif
+
#ifndef __no_fgcse
# define __no_fgcse
#endif
--
2.20.1

2019-08-30 23:17:08

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 6/6] x86: bug.h: use asm_inline in _BUG_FLAGS definitions

This helps preventing a BUG* or WARN* in some static inline from
preventing that (or one of its callers) being inlined, so should allow
gcc to make better informed inlining decisions.

For example, with gcc 9.2, tcp_fastopen_no_cookie() vanishes from
net/ipv4/tcp_fastopen.o. It does not itself have any BUG or WARN, but
it calls dst_metric() which has a WARN_ON_ONCE - and despite that
WARN_ON_ONCE vanishing since the condition is compile-time false,
dst_metric() is apparently sufficiently "large" that when it gets
inlined into tcp_fastopen_no_cookie(), the latter becomes too large
for inlining.

Overall, if one asks size(1), .text decreases a little and .data
increases by about the same amount (x86-64 defconfig)

$ size vmlinux.{before,after}
text data bss dec hex filename
19709726 5202600 1630280 26542606 195020e vmlinux.before
19709330 5203068 1630280 26542678 1950256 vmlinux.after

while bloat-o-meter says

add/remove: 10/28 grow/shrink: 103/51 up/down: 3669/-2854 (815)
...
Total: Before=14783683, After=14784498, chg +0.01%

Signed-off-by: Rasmus Villemoes <[email protected]>
---
arch/x86/include/asm/bug.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 6804d6642767..facba9bc30ca 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -32,7 +32,7 @@

#define _BUG_FLAGS(ins, flags) \
do { \
- asm volatile("1:\t" ins "\n" \
+ asm_inline volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
"\t" __BUG_REL(%c0) "\t# bug_entry::file\n" \
@@ -49,7 +49,7 @@ do { \

#define _BUG_FLAGS(ins, flags) \
do { \
- asm volatile("1:\t" ins "\n" \
+ asm_inline volatile("1:\t" ins "\n" \
".pushsection __bug_table,\"aw\"\n" \
"2:\t" __BUG_REL(1b) "\t# bug_entry::bug_addr\n" \
"\t.word %c0" "\t# bug_entry::flags\n" \
--
2.20.1

2019-08-30 23:17:15

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 5/6] x86: alternative.h: use asm_inline for all alternative variants

Most, if not all, uses of the alternative* family just provide one or
two instructions in .text, but the string literal can be quite large,
causing gcc to overestimate the size of the generated code. That in
turn affects its decisions about inlining of the function containing
the alternative() asm statement.

gcc >= 9.1 allows one to overrule the estimated size by using "asm
inline" instead of just "asm". So replace asm by the helper
asm_inline, which for older gccs just expands to asm.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
arch/x86/include/asm/alternative.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index 094fbc9c0b1c..13adca37c99a 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -201,10 +201,10 @@ static inline int alternatives_text_reserved(void *start, void *end)
* without volatile and memory clobber.
*/
#define alternative(oldinstr, newinstr, feature) \
- asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
+ asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")

#define alternative_2(oldinstr, newinstr1, feature1, newinstr2, feature2) \
- asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory")
+ asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2) ::: "memory")

/*
* Alternative inline assembly with input.
@@ -218,7 +218,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
* Leaving an unused argument 0 to keep API compatibility.
*/
#define alternative_input(oldinstr, newinstr, feature, input...) \
- asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
+ asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
: : "i" (0), ## input)

/*
@@ -231,18 +231,18 @@ static inline int alternatives_text_reserved(void *start, void *end)
*/
#define alternative_input_2(oldinstr, newinstr1, feature1, newinstr2, \
feature2, input...) \
- asm volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, \
+ asm_inline volatile(ALTERNATIVE_2(oldinstr, newinstr1, feature1, \
newinstr2, feature2) \
: : "i" (0), ## input)

/* Like alternative_input, but with a single output argument */
#define alternative_io(oldinstr, newinstr, feature, output, input...) \
- asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
+ asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature) \
: output : "i" (0), ## input)

/* Like alternative_io, but for replacing a direct call with another one. */
#define alternative_call(oldfunc, newfunc, feature, output, input...) \
- asm volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
+ asm_inline volatile (ALTERNATIVE("call %P[old]", "call %P[new]", feature) \
: output : [old] "i" (oldfunc), [new] "i" (newfunc), ## input)

/*
@@ -253,7 +253,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
*/
#define alternative_call_2(oldfunc, newfunc1, feature1, newfunc2, feature2, \
output, input...) \
- asm volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
+ asm_inline volatile (ALTERNATIVE_2("call %P[old]", "call %P[new1]", feature1,\
"call %P[new2]", feature2) \
: output, ASM_CALL_CONSTRAINT \
: [old] "i" (oldfunc), [new1] "i" (newfunc1), \
--
2.20.1

2019-08-30 23:17:24

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 1/6] staging: rtl8723bs: replace __inline by inline

Currently, __inline is #defined as inline in compiler_types.h, so this
should not change functionality. It is preparation for removing said
#define.

While at it, change some "inline static" to the customary "static
inline" order.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
drivers/staging/rtl8723bs/core/rtw_pwrctrl.c | 4 ++--
drivers/staging/rtl8723bs/core/rtw_wlan_util.c | 2 +-
drivers/staging/rtl8723bs/include/drv_types.h | 6 +++---
.../staging/rtl8723bs/include/osdep_service.h | 10 +++++-----
.../rtl8723bs/include/osdep_service_linux.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_mlme.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_recv.h | 16 ++++++++--------
drivers/staging/rtl8723bs/include/sta_info.h | 2 +-
drivers/staging/rtl8723bs/include/wifi.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/wlan_bssdef.h | 2 +-
10 files changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c b/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
index ae7fb7046c93..3750fbaeec4f 100644
--- a/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
+++ b/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
@@ -830,12 +830,12 @@ static void pwr_rpwm_timeout_handler(struct timer_list *t)
_set_workitem(&pwrpriv->rpwmtimeoutwi);
}

-static __inline void register_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
+static inline void register_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
{
pwrctrl->alives |= tag;
}

-static __inline void unregister_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
+static inline void unregister_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
{
pwrctrl->alives &= ~tag;
}
diff --git a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
index 76c50377f0fe..34e1ce1b0689 100644
--- a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
+++ b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
@@ -451,7 +451,7 @@ void set_channel_bwmode(struct adapter *padapter, unsigned char channel, unsigne
mutex_unlock(&(adapter_to_dvobj(padapter)->setch_mutex));
}

-__inline u8 *get_my_bssid(struct wlan_bssid_ex *pnetwork)
+inline u8 *get_my_bssid(struct wlan_bssid_ex *pnetwork)
{
return pnetwork->MacAddress;
}
diff --git a/drivers/staging/rtl8723bs/include/drv_types.h b/drivers/staging/rtl8723bs/include/drv_types.h
index 96346ce064aa..d3648f3b1de2 100644
--- a/drivers/staging/rtl8723bs/include/drv_types.h
+++ b/drivers/staging/rtl8723bs/include/drv_types.h
@@ -478,7 +478,7 @@ struct sdio_data intf_data;
#define dvobj_to_pwrctl(dvobj) (&(dvobj->pwrctl_priv))
#define pwrctl_to_dvobj(pwrctl) container_of(pwrctl, struct dvobj_priv, pwrctl_priv)

-__inline static struct device *dvobj_to_dev(struct dvobj_priv *dvobj)
+static inline struct device *dvobj_to_dev(struct dvobj_priv *dvobj)
{
/* todo: get interface type from dvobj and the return the dev accordingly */
#ifdef RTW_DVOBJ_CHIP_HW_TYPE
@@ -636,14 +636,14 @@ struct adapter {

/* define RTW_DISABLE_FUNC(padapter, func) (atomic_add(&adapter_to_dvobj(padapter)->disable_func, (func))) */
/* define RTW_ENABLE_FUNC(padapter, func) (atomic_sub(&adapter_to_dvobj(padapter)->disable_func, (func))) */
-__inline static void RTW_DISABLE_FUNC(struct adapter *padapter, int func_bit)
+static inline void RTW_DISABLE_FUNC(struct adapter *padapter, int func_bit)
{
int df = atomic_read(&adapter_to_dvobj(padapter)->disable_func);
df |= func_bit;
atomic_set(&adapter_to_dvobj(padapter)->disable_func, df);
}

-__inline static void RTW_ENABLE_FUNC(struct adapter *padapter, int func_bit)
+static inline void RTW_ENABLE_FUNC(struct adapter *padapter, int func_bit)
{
int df = atomic_read(&adapter_to_dvobj(padapter)->disable_func);
df &= ~(func_bit);
diff --git a/drivers/staging/rtl8723bs/include/osdep_service.h b/drivers/staging/rtl8723bs/include/osdep_service.h
index d2616af95ffa..81a9c19ecc6a 100644
--- a/drivers/staging/rtl8723bs/include/osdep_service.h
+++ b/drivers/staging/rtl8723bs/include/osdep_service.h
@@ -110,12 +110,12 @@ int _rtw_netif_rx(_nic_hdl ndev, struct sk_buff *skb);

extern void _rtw_init_queue(struct __queue *pqueue);

-static __inline void thread_enter(char *name)
+static inline void thread_enter(char *name)
{
allow_signal(SIGTERM);
}

-__inline static void flush_signals_thread(void)
+static inline void flush_signals_thread(void)
{
if (signal_pending (current))
{
@@ -125,7 +125,7 @@ __inline static void flush_signals_thread(void)

#define rtw_warn_on(condition) WARN_ON(condition)

-__inline static int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *parg4)
+static inline int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *parg4)
{
int ret = true;

@@ -136,7 +136,7 @@ __inline static int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *p
#define _RND(sz, r) ((((sz)+((r)-1))/(r))*(r))
#define RND4(x) (((x >> 2) + (((x & 3) == 0) ? 0: 1)) << 2)

-__inline static u32 _RND4(u32 sz)
+static inline u32 _RND4(u32 sz)
{

u32 val;
@@ -147,7 +147,7 @@ __inline static u32 _RND4(u32 sz)

}

-__inline static u32 _RND8(u32 sz)
+static inline u32 _RND8(u32 sz)
{

u32 val;
diff --git a/drivers/staging/rtl8723bs/include/osdep_service_linux.h b/drivers/staging/rtl8723bs/include/osdep_service_linux.h
index 2f1b51e614fb..c582ede1ac12 100644
--- a/drivers/staging/rtl8723bs/include/osdep_service_linux.h
+++ b/drivers/staging/rtl8723bs/include/osdep_service_linux.h
@@ -64,12 +64,12 @@

typedef struct work_struct _workitem;

-__inline static struct list_head *get_next(struct list_head *list)
+static inline struct list_head *get_next(struct list_head *list)
{
return list->next;
}

-__inline static struct list_head *get_list_head(struct __queue *queue)
+static inline struct list_head *get_list_head(struct __queue *queue)
{
return (&(queue->queue));
}
@@ -78,28 +78,28 @@ __inline static struct list_head *get_list_head(struct __queue *queue)
#define LIST_CONTAINOR(ptr, type, member) \
container_of(ptr, type, member)

-__inline static void _set_timer(_timer *ptimer, u32 delay_time)
+static inline void _set_timer(_timer *ptimer, u32 delay_time)
{
mod_timer(ptimer , (jiffies+(delay_time*HZ/1000)));
}

-__inline static void _cancel_timer(_timer *ptimer, u8 *bcancelled)
+static inline void _cancel_timer(_timer *ptimer, u8 *bcancelled)
{
del_timer_sync(ptimer);
*bcancelled = true;/* true == 1; false == 0 */
}

-__inline static void _init_workitem(_workitem *pwork, void *pfunc, void *cntx)
+static inline void _init_workitem(_workitem *pwork, void *pfunc, void *cntx)
{
INIT_WORK(pwork, pfunc);
}

-__inline static void _set_workitem(_workitem *pwork)
+static inline void _set_workitem(_workitem *pwork)
{
schedule_work(pwork);
}

-__inline static void _cancel_workitem_sync(_workitem *pwork)
+static inline void _cancel_workitem_sync(_workitem *pwork)
{
cancel_work_sync(pwork);
}
diff --git a/drivers/staging/rtl8723bs/include/rtw_mlme.h b/drivers/staging/rtl8723bs/include/rtw_mlme.h
index d3c07d1c36e9..4282dfa70b79 100644
--- a/drivers/staging/rtl8723bs/include/rtw_mlme.h
+++ b/drivers/staging/rtl8723bs/include/rtw_mlme.h
@@ -498,13 +498,13 @@ extern sint rtw_select_and_join_from_scanned_queue(struct mlme_priv *pmlmepriv);
extern sint rtw_set_key(struct adapter *adapter, struct security_priv *psecuritypriv, sint keyid, u8 set_tx, bool enqueue);
extern sint rtw_set_auth(struct adapter *adapter, struct security_priv *psecuritypriv);

-__inline static u8 *get_bssid(struct mlme_priv *pmlmepriv)
+static inline u8 *get_bssid(struct mlme_priv *pmlmepriv)
{ /* if sta_mode:pmlmepriv->cur_network.network.MacAddress => bssid */
/* if adhoc_mode:pmlmepriv->cur_network.network.MacAddress => ibss mac address */
return pmlmepriv->cur_network.network.MacAddress;
}

-__inline static sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
+static inline sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
{
if (pmlmepriv->fw_state & state)
return true;
@@ -512,7 +512,7 @@ __inline static sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
return false;
}

-__inline static sint get_fwstate(struct mlme_priv *pmlmepriv)
+static inline sint get_fwstate(struct mlme_priv *pmlmepriv)
{
return pmlmepriv->fw_state;
}
@@ -524,7 +524,7 @@ __inline static sint get_fwstate(struct mlme_priv *pmlmepriv)
* ### NOTE:#### (!!!!)
* MUST TAKE CARE THAT BEFORE CALLING THIS FUNC, YOU SHOULD HAVE LOCKED pmlmepriv->lock
*/
-__inline static void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
+static inline void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
{
pmlmepriv->fw_state |= state;
/* FOR HW integration */
@@ -533,7 +533,7 @@ __inline static void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
}
}

-__inline static void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
+static inline void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
{
pmlmepriv->fw_state &= ~state;
/* FOR HW integration */
@@ -546,7 +546,7 @@ __inline static void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
* No Limit on the calling context,
* therefore set it to be the critical section...
*/
-__inline static void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
+static inline void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
{
spin_lock_bh(&pmlmepriv->lock);
if (check_fwstate(pmlmepriv, state) == true)
@@ -554,7 +554,7 @@ __inline static void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
spin_unlock_bh(&pmlmepriv->lock);
}

-__inline static void set_scanned_network_val(struct mlme_priv *pmlmepriv, sint val)
+static inline void set_scanned_network_val(struct mlme_priv *pmlmepriv, sint val)
{
spin_lock_bh(&pmlmepriv->lock);
pmlmepriv->num_of_scanned = val;
diff --git a/drivers/staging/rtl8723bs/include/rtw_recv.h b/drivers/staging/rtl8723bs/include/rtw_recv.h
index 5de946e66302..012d8f54814f 100644
--- a/drivers/staging/rtl8723bs/include/rtw_recv.h
+++ b/drivers/staging/rtl8723bs/include/rtw_recv.h
@@ -405,7 +405,7 @@ struct recv_buf *rtw_dequeue_recvbuf (struct __queue *queue);

void rtw_reordering_ctrl_timeout_handler(struct timer_list *t);

-__inline static u8 *get_rxmem(union recv_frame *precvframe)
+static inline u8 *get_rxmem(union recv_frame *precvframe)
{
/* always return rx_head... */
if (precvframe == NULL)
@@ -414,7 +414,7 @@ __inline static u8 *get_rxmem(union recv_frame *precvframe)
return precvframe->u.hdr.rx_head;
}

-__inline static u8 *get_recvframe_data(union recv_frame *precvframe)
+static inline u8 *get_recvframe_data(union recv_frame *precvframe)
{

/* alwasy return rx_data */
@@ -425,7 +425,7 @@ __inline static u8 *get_recvframe_data(union recv_frame *precvframe)

}

-__inline static u8 *recvframe_pull(union recv_frame *precvframe, sint sz)
+static inline u8 *recvframe_pull(union recv_frame *precvframe, sint sz)
{
/* rx_data += sz; move rx_data sz bytes hereafter */

@@ -450,7 +450,7 @@ __inline static u8 *recvframe_pull(union recv_frame *precvframe, sint sz)

}

-__inline static u8 *recvframe_put(union recv_frame *precvframe, sint sz)
+static inline u8 *recvframe_put(union recv_frame *precvframe, sint sz)
{
/* rx_tai += sz; move rx_tail sz bytes hereafter */

@@ -479,7 +479,7 @@ __inline static u8 *recvframe_put(union recv_frame *precvframe, sint sz)



-__inline static u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)
+static inline u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)
{
/* rmv data from rx_tail (by yitsen) */

@@ -503,7 +503,7 @@ __inline static u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)

}

-__inline static union recv_frame *rxmem_to_recvframe(u8 *rxmem)
+static inline union recv_frame *rxmem_to_recvframe(u8 *rxmem)
{
/* due to the design of 2048 bytes alignment of recv_frame, we can reference the union recv_frame */
/* from any given member of recv_frame. */
@@ -513,13 +513,13 @@ __inline static union recv_frame *rxmem_to_recvframe(u8 *rxmem)

}

-__inline static sint get_recvframe_len(union recv_frame *precvframe)
+static inline sint get_recvframe_len(union recv_frame *precvframe)
{
return precvframe->u.hdr.len;
}


-__inline static s32 translate_percentage_to_dbm(u32 SignalStrengthIndex)
+static inline s32 translate_percentage_to_dbm(u32 SignalStrengthIndex)
{
s32 SignalPower; /* in dBm. */

diff --git a/drivers/staging/rtl8723bs/include/sta_info.h b/drivers/staging/rtl8723bs/include/sta_info.h
index b9df42d0677e..3acce5630f8e 100644
--- a/drivers/staging/rtl8723bs/include/sta_info.h
+++ b/drivers/staging/rtl8723bs/include/sta_info.h
@@ -348,7 +348,7 @@ struct sta_priv {
};


-__inline static u32 wifi_mac_hash(u8 *mac)
+static inline u32 wifi_mac_hash(u8 *mac)
{
u32 x;

diff --git a/drivers/staging/rtl8723bs/include/wifi.h b/drivers/staging/rtl8723bs/include/wifi.h
index 8c50bbb20f3b..2faf83704ff0 100644
--- a/drivers/staging/rtl8723bs/include/wifi.h
+++ b/drivers/staging/rtl8723bs/include/wifi.h
@@ -347,7 +347,7 @@ enum WIFI_REG_DOMAIN {
(addr[4] == 0xff) && (addr[5] == 0xff)) ? true : false \
)

-__inline static int IS_MCAST(unsigned char *da)
+static inline int IS_MCAST(unsigned char *da)
{
if ((*da) & 0x01)
return true;
@@ -355,20 +355,20 @@ __inline static int IS_MCAST(unsigned char *da)
return false;
}

-__inline static unsigned char * get_ra(unsigned char *pframe)
+static inline unsigned char * get_ra(unsigned char *pframe)
{
unsigned char *ra;
ra = GetAddr1Ptr(pframe);
return ra;
}
-__inline static unsigned char * get_ta(unsigned char *pframe)
+static inline unsigned char * get_ta(unsigned char *pframe)
{
unsigned char *ta;
ta = GetAddr2Ptr(pframe);
return ta;
}

-__inline static unsigned char * get_da(unsigned char *pframe)
+static inline unsigned char * get_da(unsigned char *pframe)
{
unsigned char *da;
unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
@@ -392,7 +392,7 @@ __inline static unsigned char * get_da(unsigned char *pframe)
}


-__inline static unsigned char * get_sa(unsigned char *pframe)
+static inline unsigned char * get_sa(unsigned char *pframe)
{
unsigned char *sa;
unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
@@ -415,7 +415,7 @@ __inline static unsigned char * get_sa(unsigned char *pframe)
return sa;
}

-__inline static unsigned char * get_hdr_bssid(unsigned char *pframe)
+static inline unsigned char * get_hdr_bssid(unsigned char *pframe)
{
unsigned char *sa = NULL;
unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
@@ -439,7 +439,7 @@ __inline static unsigned char * get_hdr_bssid(unsigned char *pframe)
}


-__inline static int IsFrameTypeCtrl(unsigned char *pframe)
+static inline int IsFrameTypeCtrl(unsigned char *pframe)
{
if (WIFI_CTRL_TYPE == GetFrameType(pframe))
return true;
diff --git a/drivers/staging/rtl8723bs/include/wlan_bssdef.h b/drivers/staging/rtl8723bs/include/wlan_bssdef.h
index 88890b1c3c4c..723fc5b546ef 100644
--- a/drivers/staging/rtl8723bs/include/wlan_bssdef.h
+++ b/drivers/staging/rtl8723bs/include/wlan_bssdef.h
@@ -223,7 +223,7 @@ struct wlan_bssid_ex {
u8 IEs[MAX_IE_SZ]; /* timestamp, beacon interval, and capability information) */
} __packed;

-__inline static uint get_wlan_bssid_ex_sz(struct wlan_bssid_ex *bss)
+static inline uint get_wlan_bssid_ex_sz(struct wlan_bssid_ex *bss)
{
return (sizeof(struct wlan_bssid_ex) - MAX_IE_SZ + bss->IELength);
}
--
2.20.1

2019-08-30 23:19:04

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v2 2/6] lib/zstd/mem.h: replace __inline by inline

Currently, compiler_types.h #defines __inline as inline (and further
#defines inline to automatically attach some attributes), so this does
not change functionality. It serves as preparation for removing the
#define of __inline.

(Note that if ZSTD_STATIC is expanded somewhere where compiler_types.h
has not yet been processed, both __inline and inline both refer to the
compiler keyword, so again this does not change anything.)

Signed-off-by: Rasmus Villemoes <[email protected]>
---
lib/zstd/mem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/zstd/mem.h b/lib/zstd/mem.h
index 3a0f34c8706c..739837a59ad6 100644
--- a/lib/zstd/mem.h
+++ b/lib/zstd/mem.h
@@ -27,7 +27,7 @@
/*-****************************************
* Compiler specifics
******************************************/
-#define ZSTD_STATIC static __inline __attribute__((unused))
+#define ZSTD_STATIC static inline __attribute__((unused))

/*-**************************************************************
* Basic Types
--
2.20.1

2019-09-04 23:57:37

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 1/6] staging: rtl8723bs: replace __inline by inline

On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
<[email protected]> wrote:
>
> Currently, __inline is #defined as inline in compiler_types.h, so this
> should not change functionality. It is preparation for removing said
> #define.
>
> While at it, change some "inline static" to the customary "static
> inline" order.
>
> Signed-off-by: Rasmus Villemoes <[email protected]>

Reviewed-by: Nick Desaulniers <[email protected]>

> ---
> drivers/staging/rtl8723bs/core/rtw_pwrctrl.c | 4 ++--
> drivers/staging/rtl8723bs/core/rtw_wlan_util.c | 2 +-
> drivers/staging/rtl8723bs/include/drv_types.h | 6 +++---
> .../staging/rtl8723bs/include/osdep_service.h | 10 +++++-----
> .../rtl8723bs/include/osdep_service_linux.h | 14 +++++++-------
> drivers/staging/rtl8723bs/include/rtw_mlme.h | 14 +++++++-------
> drivers/staging/rtl8723bs/include/rtw_recv.h | 16 ++++++++--------
> drivers/staging/rtl8723bs/include/sta_info.h | 2 +-
> drivers/staging/rtl8723bs/include/wifi.h | 14 +++++++-------
> drivers/staging/rtl8723bs/include/wlan_bssdef.h | 2 +-
> 10 files changed, 42 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c b/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
> index ae7fb7046c93..3750fbaeec4f 100644
> --- a/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
> +++ b/drivers/staging/rtl8723bs/core/rtw_pwrctrl.c
> @@ -830,12 +830,12 @@ static void pwr_rpwm_timeout_handler(struct timer_list *t)
> _set_workitem(&pwrpriv->rpwmtimeoutwi);
> }
>
> -static __inline void register_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
> +static inline void register_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
> {
> pwrctrl->alives |= tag;
> }
>
> -static __inline void unregister_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
> +static inline void unregister_task_alive(struct pwrctrl_priv *pwrctrl, u32 tag)
> {
> pwrctrl->alives &= ~tag;
> }
> diff --git a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
> index 76c50377f0fe..34e1ce1b0689 100644
> --- a/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
> +++ b/drivers/staging/rtl8723bs/core/rtw_wlan_util.c
> @@ -451,7 +451,7 @@ void set_channel_bwmode(struct adapter *padapter, unsigned char channel, unsigne
> mutex_unlock(&(adapter_to_dvobj(padapter)->setch_mutex));
> }
>
> -__inline u8 *get_my_bssid(struct wlan_bssid_ex *pnetwork)
> +inline u8 *get_my_bssid(struct wlan_bssid_ex *pnetwork)
> {
> return pnetwork->MacAddress;
> }
> diff --git a/drivers/staging/rtl8723bs/include/drv_types.h b/drivers/staging/rtl8723bs/include/drv_types.h
> index 96346ce064aa..d3648f3b1de2 100644
> --- a/drivers/staging/rtl8723bs/include/drv_types.h
> +++ b/drivers/staging/rtl8723bs/include/drv_types.h
> @@ -478,7 +478,7 @@ struct sdio_data intf_data;
> #define dvobj_to_pwrctl(dvobj) (&(dvobj->pwrctl_priv))
> #define pwrctl_to_dvobj(pwrctl) container_of(pwrctl, struct dvobj_priv, pwrctl_priv)
>
> -__inline static struct device *dvobj_to_dev(struct dvobj_priv *dvobj)
> +static inline struct device *dvobj_to_dev(struct dvobj_priv *dvobj)
> {
> /* todo: get interface type from dvobj and the return the dev accordingly */
> #ifdef RTW_DVOBJ_CHIP_HW_TYPE
> @@ -636,14 +636,14 @@ struct adapter {
>
> /* define RTW_DISABLE_FUNC(padapter, func) (atomic_add(&adapter_to_dvobj(padapter)->disable_func, (func))) */
> /* define RTW_ENABLE_FUNC(padapter, func) (atomic_sub(&adapter_to_dvobj(padapter)->disable_func, (func))) */
> -__inline static void RTW_DISABLE_FUNC(struct adapter *padapter, int func_bit)
> +static inline void RTW_DISABLE_FUNC(struct adapter *padapter, int func_bit)
> {
> int df = atomic_read(&adapter_to_dvobj(padapter)->disable_func);
> df |= func_bit;
> atomic_set(&adapter_to_dvobj(padapter)->disable_func, df);
> }
>
> -__inline static void RTW_ENABLE_FUNC(struct adapter *padapter, int func_bit)
> +static inline void RTW_ENABLE_FUNC(struct adapter *padapter, int func_bit)
> {
> int df = atomic_read(&adapter_to_dvobj(padapter)->disable_func);
> df &= ~(func_bit);
> diff --git a/drivers/staging/rtl8723bs/include/osdep_service.h b/drivers/staging/rtl8723bs/include/osdep_service.h
> index d2616af95ffa..81a9c19ecc6a 100644
> --- a/drivers/staging/rtl8723bs/include/osdep_service.h
> +++ b/drivers/staging/rtl8723bs/include/osdep_service.h
> @@ -110,12 +110,12 @@ int _rtw_netif_rx(_nic_hdl ndev, struct sk_buff *skb);
>
> extern void _rtw_init_queue(struct __queue *pqueue);
>
> -static __inline void thread_enter(char *name)
> +static inline void thread_enter(char *name)
> {
> allow_signal(SIGTERM);
> }
>
> -__inline static void flush_signals_thread(void)
> +static inline void flush_signals_thread(void)
> {
> if (signal_pending (current))
> {
> @@ -125,7 +125,7 @@ __inline static void flush_signals_thread(void)
>
> #define rtw_warn_on(condition) WARN_ON(condition)
>
> -__inline static int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *parg4)
> +static inline int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *parg4)
> {
> int ret = true;
>
> @@ -136,7 +136,7 @@ __inline static int rtw_bug_check(void *parg1, void *parg2, void *parg3, void *p
> #define _RND(sz, r) ((((sz)+((r)-1))/(r))*(r))
> #define RND4(x) (((x >> 2) + (((x & 3) == 0) ? 0: 1)) << 2)
>
> -__inline static u32 _RND4(u32 sz)
> +static inline u32 _RND4(u32 sz)
> {
>
> u32 val;
> @@ -147,7 +147,7 @@ __inline static u32 _RND4(u32 sz)
>
> }
>
> -__inline static u32 _RND8(u32 sz)
> +static inline u32 _RND8(u32 sz)
> {
>
> u32 val;
> diff --git a/drivers/staging/rtl8723bs/include/osdep_service_linux.h b/drivers/staging/rtl8723bs/include/osdep_service_linux.h
> index 2f1b51e614fb..c582ede1ac12 100644
> --- a/drivers/staging/rtl8723bs/include/osdep_service_linux.h
> +++ b/drivers/staging/rtl8723bs/include/osdep_service_linux.h
> @@ -64,12 +64,12 @@
>
> typedef struct work_struct _workitem;
>
> -__inline static struct list_head *get_next(struct list_head *list)
> +static inline struct list_head *get_next(struct list_head *list)
> {
> return list->next;
> }
>
> -__inline static struct list_head *get_list_head(struct __queue *queue)
> +static inline struct list_head *get_list_head(struct __queue *queue)
> {
> return (&(queue->queue));
> }
> @@ -78,28 +78,28 @@ __inline static struct list_head *get_list_head(struct __queue *queue)
> #define LIST_CONTAINOR(ptr, type, member) \
> container_of(ptr, type, member)
>
> -__inline static void _set_timer(_timer *ptimer, u32 delay_time)
> +static inline void _set_timer(_timer *ptimer, u32 delay_time)
> {
> mod_timer(ptimer , (jiffies+(delay_time*HZ/1000)));
> }
>
> -__inline static void _cancel_timer(_timer *ptimer, u8 *bcancelled)
> +static inline void _cancel_timer(_timer *ptimer, u8 *bcancelled)
> {
> del_timer_sync(ptimer);
> *bcancelled = true;/* true == 1; false == 0 */
> }
>
> -__inline static void _init_workitem(_workitem *pwork, void *pfunc, void *cntx)
> +static inline void _init_workitem(_workitem *pwork, void *pfunc, void *cntx)
> {
> INIT_WORK(pwork, pfunc);
> }
>
> -__inline static void _set_workitem(_workitem *pwork)
> +static inline void _set_workitem(_workitem *pwork)
> {
> schedule_work(pwork);
> }
>
> -__inline static void _cancel_workitem_sync(_workitem *pwork)
> +static inline void _cancel_workitem_sync(_workitem *pwork)
> {
> cancel_work_sync(pwork);
> }
> diff --git a/drivers/staging/rtl8723bs/include/rtw_mlme.h b/drivers/staging/rtl8723bs/include/rtw_mlme.h
> index d3c07d1c36e9..4282dfa70b79 100644
> --- a/drivers/staging/rtl8723bs/include/rtw_mlme.h
> +++ b/drivers/staging/rtl8723bs/include/rtw_mlme.h
> @@ -498,13 +498,13 @@ extern sint rtw_select_and_join_from_scanned_queue(struct mlme_priv *pmlmepriv);
> extern sint rtw_set_key(struct adapter *adapter, struct security_priv *psecuritypriv, sint keyid, u8 set_tx, bool enqueue);
> extern sint rtw_set_auth(struct adapter *adapter, struct security_priv *psecuritypriv);
>
> -__inline static u8 *get_bssid(struct mlme_priv *pmlmepriv)
> +static inline u8 *get_bssid(struct mlme_priv *pmlmepriv)
> { /* if sta_mode:pmlmepriv->cur_network.network.MacAddress => bssid */
> /* if adhoc_mode:pmlmepriv->cur_network.network.MacAddress => ibss mac address */
> return pmlmepriv->cur_network.network.MacAddress;
> }
>
> -__inline static sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
> +static inline sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
> {
> if (pmlmepriv->fw_state & state)
> return true;
> @@ -512,7 +512,7 @@ __inline static sint check_fwstate(struct mlme_priv *pmlmepriv, sint state)
> return false;
> }
>
> -__inline static sint get_fwstate(struct mlme_priv *pmlmepriv)
> +static inline sint get_fwstate(struct mlme_priv *pmlmepriv)
> {
> return pmlmepriv->fw_state;
> }
> @@ -524,7 +524,7 @@ __inline static sint get_fwstate(struct mlme_priv *pmlmepriv)
> * ### NOTE:#### (!!!!)
> * MUST TAKE CARE THAT BEFORE CALLING THIS FUNC, YOU SHOULD HAVE LOCKED pmlmepriv->lock
> */
> -__inline static void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
> +static inline void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
> {
> pmlmepriv->fw_state |= state;
> /* FOR HW integration */
> @@ -533,7 +533,7 @@ __inline static void set_fwstate(struct mlme_priv *pmlmepriv, sint state)
> }
> }
>
> -__inline static void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
> +static inline void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
> {
> pmlmepriv->fw_state &= ~state;
> /* FOR HW integration */
> @@ -546,7 +546,7 @@ __inline static void _clr_fwstate_(struct mlme_priv *pmlmepriv, sint state)
> * No Limit on the calling context,
> * therefore set it to be the critical section...
> */
> -__inline static void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
> +static inline void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
> {
> spin_lock_bh(&pmlmepriv->lock);
> if (check_fwstate(pmlmepriv, state) == true)
> @@ -554,7 +554,7 @@ __inline static void clr_fwstate(struct mlme_priv *pmlmepriv, sint state)
> spin_unlock_bh(&pmlmepriv->lock);
> }
>
> -__inline static void set_scanned_network_val(struct mlme_priv *pmlmepriv, sint val)
> +static inline void set_scanned_network_val(struct mlme_priv *pmlmepriv, sint val)
> {
> spin_lock_bh(&pmlmepriv->lock);
> pmlmepriv->num_of_scanned = val;
> diff --git a/drivers/staging/rtl8723bs/include/rtw_recv.h b/drivers/staging/rtl8723bs/include/rtw_recv.h
> index 5de946e66302..012d8f54814f 100644
> --- a/drivers/staging/rtl8723bs/include/rtw_recv.h
> +++ b/drivers/staging/rtl8723bs/include/rtw_recv.h
> @@ -405,7 +405,7 @@ struct recv_buf *rtw_dequeue_recvbuf (struct __queue *queue);
>
> void rtw_reordering_ctrl_timeout_handler(struct timer_list *t);
>
> -__inline static u8 *get_rxmem(union recv_frame *precvframe)
> +static inline u8 *get_rxmem(union recv_frame *precvframe)
> {
> /* always return rx_head... */
> if (precvframe == NULL)
> @@ -414,7 +414,7 @@ __inline static u8 *get_rxmem(union recv_frame *precvframe)
> return precvframe->u.hdr.rx_head;
> }
>
> -__inline static u8 *get_recvframe_data(union recv_frame *precvframe)
> +static inline u8 *get_recvframe_data(union recv_frame *precvframe)
> {
>
> /* alwasy return rx_data */
> @@ -425,7 +425,7 @@ __inline static u8 *get_recvframe_data(union recv_frame *precvframe)
>
> }
>
> -__inline static u8 *recvframe_pull(union recv_frame *precvframe, sint sz)
> +static inline u8 *recvframe_pull(union recv_frame *precvframe, sint sz)
> {
> /* rx_data += sz; move rx_data sz bytes hereafter */
>
> @@ -450,7 +450,7 @@ __inline static u8 *recvframe_pull(union recv_frame *precvframe, sint sz)
>
> }
>
> -__inline static u8 *recvframe_put(union recv_frame *precvframe, sint sz)
> +static inline u8 *recvframe_put(union recv_frame *precvframe, sint sz)
> {
> /* rx_tai += sz; move rx_tail sz bytes hereafter */
>
> @@ -479,7 +479,7 @@ __inline static u8 *recvframe_put(union recv_frame *precvframe, sint sz)
>
>
>
> -__inline static u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)
> +static inline u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)
> {
> /* rmv data from rx_tail (by yitsen) */
>
> @@ -503,7 +503,7 @@ __inline static u8 *recvframe_pull_tail(union recv_frame *precvframe, sint sz)
>
> }
>
> -__inline static union recv_frame *rxmem_to_recvframe(u8 *rxmem)
> +static inline union recv_frame *rxmem_to_recvframe(u8 *rxmem)
> {
> /* due to the design of 2048 bytes alignment of recv_frame, we can reference the union recv_frame */
> /* from any given member of recv_frame. */
> @@ -513,13 +513,13 @@ __inline static union recv_frame *rxmem_to_recvframe(u8 *rxmem)
>
> }
>
> -__inline static sint get_recvframe_len(union recv_frame *precvframe)
> +static inline sint get_recvframe_len(union recv_frame *precvframe)
> {
> return precvframe->u.hdr.len;
> }
>
>
> -__inline static s32 translate_percentage_to_dbm(u32 SignalStrengthIndex)
> +static inline s32 translate_percentage_to_dbm(u32 SignalStrengthIndex)
> {
> s32 SignalPower; /* in dBm. */
>
> diff --git a/drivers/staging/rtl8723bs/include/sta_info.h b/drivers/staging/rtl8723bs/include/sta_info.h
> index b9df42d0677e..3acce5630f8e 100644
> --- a/drivers/staging/rtl8723bs/include/sta_info.h
> +++ b/drivers/staging/rtl8723bs/include/sta_info.h
> @@ -348,7 +348,7 @@ struct sta_priv {
> };
>
>
> -__inline static u32 wifi_mac_hash(u8 *mac)
> +static inline u32 wifi_mac_hash(u8 *mac)
> {
> u32 x;
>
> diff --git a/drivers/staging/rtl8723bs/include/wifi.h b/drivers/staging/rtl8723bs/include/wifi.h
> index 8c50bbb20f3b..2faf83704ff0 100644
> --- a/drivers/staging/rtl8723bs/include/wifi.h
> +++ b/drivers/staging/rtl8723bs/include/wifi.h
> @@ -347,7 +347,7 @@ enum WIFI_REG_DOMAIN {
> (addr[4] == 0xff) && (addr[5] == 0xff)) ? true : false \
> )
>
> -__inline static int IS_MCAST(unsigned char *da)
> +static inline int IS_MCAST(unsigned char *da)
> {
> if ((*da) & 0x01)
> return true;
> @@ -355,20 +355,20 @@ __inline static int IS_MCAST(unsigned char *da)
> return false;
> }
>
> -__inline static unsigned char * get_ra(unsigned char *pframe)
> +static inline unsigned char * get_ra(unsigned char *pframe)
> {
> unsigned char *ra;
> ra = GetAddr1Ptr(pframe);
> return ra;
> }
> -__inline static unsigned char * get_ta(unsigned char *pframe)
> +static inline unsigned char * get_ta(unsigned char *pframe)
> {
> unsigned char *ta;
> ta = GetAddr2Ptr(pframe);
> return ta;
> }
>
> -__inline static unsigned char * get_da(unsigned char *pframe)
> +static inline unsigned char * get_da(unsigned char *pframe)
> {
> unsigned char *da;
> unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
> @@ -392,7 +392,7 @@ __inline static unsigned char * get_da(unsigned char *pframe)
> }
>
>
> -__inline static unsigned char * get_sa(unsigned char *pframe)
> +static inline unsigned char * get_sa(unsigned char *pframe)
> {
> unsigned char *sa;
> unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
> @@ -415,7 +415,7 @@ __inline static unsigned char * get_sa(unsigned char *pframe)
> return sa;
> }
>
> -__inline static unsigned char * get_hdr_bssid(unsigned char *pframe)
> +static inline unsigned char * get_hdr_bssid(unsigned char *pframe)
> {
> unsigned char *sa = NULL;
> unsigned int to_fr_ds = (GetToDs(pframe) << 1) | GetFrDs(pframe);
> @@ -439,7 +439,7 @@ __inline static unsigned char * get_hdr_bssid(unsigned char *pframe)
> }
>
>
> -__inline static int IsFrameTypeCtrl(unsigned char *pframe)
> +static inline int IsFrameTypeCtrl(unsigned char *pframe)
> {
> if (WIFI_CTRL_TYPE == GetFrameType(pframe))
> return true;
> diff --git a/drivers/staging/rtl8723bs/include/wlan_bssdef.h b/drivers/staging/rtl8723bs/include/wlan_bssdef.h
> index 88890b1c3c4c..723fc5b546ef 100644
> --- a/drivers/staging/rtl8723bs/include/wlan_bssdef.h
> +++ b/drivers/staging/rtl8723bs/include/wlan_bssdef.h
> @@ -223,7 +223,7 @@ struct wlan_bssid_ex {
> u8 IEs[MAX_IE_SZ]; /* timestamp, beacon interval, and capability information) */
> } __packed;
>
> -__inline static uint get_wlan_bssid_ex_sz(struct wlan_bssid_ex *bss)
> +static inline uint get_wlan_bssid_ex_sz(struct wlan_bssid_ex *bss)
> {
> return (sizeof(struct wlan_bssid_ex) - MAX_IE_SZ + bss->IELength);
> }
> --
> 2.20.1
>


--
Thanks,
~Nick Desaulniers

2019-09-05 00:02:11

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] lib/zstd/mem.h: replace __inline by inline

On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
<[email protected]> wrote:
>
> Currently, compiler_types.h #defines __inline as inline (and further
> #defines inline to automatically attach some attributes), so this does
> not change functionality. It serves as preparation for removing the
> #define of __inline.
>
> (Note that if ZSTD_STATIC is expanded somewhere where compiler_types.h
> has not yet been processed, both __inline and inline both refer to the
> compiler keyword, so again this does not change anything.)
>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> lib/zstd/mem.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/zstd/mem.h b/lib/zstd/mem.h
> index 3a0f34c8706c..739837a59ad6 100644
> --- a/lib/zstd/mem.h
> +++ b/lib/zstd/mem.h
> @@ -27,7 +27,7 @@
> /*-****************************************
> * Compiler specifics
> ******************************************/
> -#define ZSTD_STATIC static __inline __attribute__((unused))
> +#define ZSTD_STATIC static inline __attribute__((unused))

While you're here, would you mind replacing `__attribute__((unused))`
with `__unused`? I would consider "naked attributes" (haven't been
feature tested in include/linux/compiler_attributes.h and are verbose)
to be an antipattern.

>
> /*-**************************************************************
> * Basic Types
> --
> 2.20.1
>


--
Thanks,
~Nick Desaulniers

2019-09-05 00:09:38

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] lib/zstd/mem.h: replace __inline by inline

On Thu, Sep 5, 2019 at 2:00 AM Nick Desaulniers <[email protected]> wrote:
>
> While you're here, would you mind replacing `__attribute__((unused))`
> with `__unused`? I would consider "naked attributes" (haven't been
> feature tested in include/linux/compiler_attributes.h and are verbose)
> to be an antipattern.

+1 We should aim to avoid them entirely where possible.

We have __always_unused and __maybe_unused, please choose whatever
fits best (both map to "unused", we don't have __unused).

Cheers,
Miguel

2019-09-05 00:15:17

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 3/6] compiler_types.h: don't #define __inline

On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
<[email protected]> wrote:
>
> The spellings __inline and __inline__ should be reserved for uses
> where one really wants to refer to the inline keyword, regardless of
> whether or not the spelling "inline" has been #defined to something
> else. Due to use of __inline__ in uapi headers, we can't easily get
> rid of the definition of __inline__. However, almost all users of
> __inline has been converted to inline, so we can get rid of that
> #define.

Besides patch 1 and 2 of this series, I also see:
Documentation/trace/tracepoint-analysis.rst
318: : extern __inline void
__attribute__((__gnu_inline__, __always_inline__, _

scripts/kernel-doc
1574: $prototype =~ s/^__inline +//;

>
> The exception is include/acpi/platform/acintel.h. However, that header
> is only included when using the intel compiler (does anybody actually
> build the kernel with that?), and the ACPI_INLINE macro is only used

In my effort to make the kernel slightly more compiler-portable, I
have not yet found anyone building with ICC. I would love to be
proven wrong. Let me go ask some of my Intel friends.

> in the definition of utterly trivial stub functions, where I doubt a

See:
include/acpi/platform/acenv.h
146 #elif defined(__INTEL_COMPILER)
147 #include <acpi/platform/acintel.h>

> small change of semantics (lack of __gnu_inline) changes anything.

include/acpi/platform/acintel.h
25:#define ACPI_INLINE __inline
include/acpi/platform/acgcc.h
29:#define ACPI_INLINE __inline__

lol wut

I mean, you just would have to change that one line in
include/acpi/platform/acintel.h, right? I'd sign off on this patch
with such a patch added to the series.

>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> include/linux/compiler_types.h | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 599c27b56c29..ee49be6d6088 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -150,8 +150,17 @@ struct ftrace_likely_data {
> __maybe_unused notrace
> #endif
>
> +/*
> + * gcc provides both __inline__ and __inline as alternate spellings of
> + * the inline keyword, though the latter is undocumented. New kernel
> + * code should only use the inline spelling, but some existing code
> + * uses __inline__. Since we #define inline above, to ensure
> + * __inline__ has the same semantics, we need this #define.
> + *
> + * However, the spelling __inline is strictly reserved for referring
> + * to the bare keyword.
> + */
> #define __inline__ inline
> -#define __inline inline
>
> /*
> * Rather then using noinline to prevent stack consumption, use
> --
> 2.20.1
>


--
Thanks,
~Nick Desaulniers

2019-09-05 00:21:48

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
<[email protected]> wrote:
>
> This adds an asm_inline macro which expands to "asm inline" [1] when gcc
> is new enough (>= 9.1), and just asm for older gccs and other
> compilers.
>
> Using asm inline("foo") instead of asm("foo") overrules gcc's
> heuristic estimate of the size of the code represented by the asm()
> statement, and makes gcc use the minimum possible size instead. That
> can in turn affect gcc's inlining decisions.
>
> I wasn't sure whether to make this a function-like macro or not - this
> way, it can be combined with volatile as
>
> asm_inline volatile()
>
> but perhaps we'd prefer to spell that
>
> asm_inline_volatile()
>
> anyway.
>
> [1] Technically, asm __inline, since both inline and __inline__
> are macros that attach various attributes, making gcc barf if one
> literally does "asm inline()". However, the third spelling __inline is
> available for referring to the bare keyword.
>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> include/linux/compiler-gcc.h | 4 ++++
> include/linux/compiler_types.h | 4 ++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index d7ee4c6bad48..544b87b41b58 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -172,3 +172,7 @@
> #endif
>
> #define __no_fgcse __attribute__((optimize("-fno-gcse")))
> +
> +#if GCC_VERSION >= 90100

Is it too late to ask for a feature test macro? Maybe one already
exists? I was not able to find documentation or a bug on `asm
inline`. I'm quite curious how you even found or heard of this
feature. To the source we must go...

> +#define asm_inline asm __inline
> +#endif
> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index ee49be6d6088..ba8d81b716c7 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -198,6 +198,10 @@ struct ftrace_likely_data {
> #define asm_volatile_goto(x...) asm goto(x)
> #endif
>
> +#ifndef asm_inline
> +#define asm_inline asm
> +#endif
> +
> #ifndef __no_fgcse
> # define __no_fgcse
> #endif
> --
> 2.20.1
>


--
Thanks,
~Nick Desaulniers

2019-09-05 05:54:20

by Nadav Amit

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

> On Sep 4, 2019, at 5:18 PM, Nick Desaulniers <[email protected]> wrote:
>
> On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
> <[email protected]> wrote:
>> This adds an asm_inline macro which expands to "asm inline" [1] when gcc
>> is new enough (>= 9.1), and just asm for older gccs and other
>> compilers.
>>
>> Using asm inline("foo") instead of asm("foo") overrules gcc's
>> heuristic estimate of the size of the code represented by the asm()
>> statement, and makes gcc use the minimum possible size instead. That
>> can in turn affect gcc's inlining decisions.
>>
>> I wasn't sure whether to make this a function-like macro or not - this
>> way, it can be combined with volatile as
>>
>> asm_inline volatile()
>>
>> but perhaps we'd prefer to spell that
>>
>> asm_inline_volatile()
>>
>> anyway.
>>
>> [1] Technically, asm __inline, since both inline and __inline__
>> are macros that attach various attributes, making gcc barf if one
>> literally does "asm inline()". However, the third spelling __inline is
>> available for referring to the bare keyword.
>>
>> Signed-off-by: Rasmus Villemoes <[email protected]>
>> ---
>> include/linux/compiler-gcc.h | 4 ++++
>> include/linux/compiler_types.h | 4 ++++
>> 2 files changed, 8 insertions(+)
>>
>> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
>> index d7ee4c6bad48..544b87b41b58 100644
>> --- a/include/linux/compiler-gcc.h
>> +++ b/include/linux/compiler-gcc.h
>> @@ -172,3 +172,7 @@
>> #endif
>>
>> #define __no_fgcse __attribute__((optimize("-fno-gcse")))
>> +
>> +#if GCC_VERSION >= 90100
>
> Is it too late to ask for a feature test macro? Maybe one already
> exists? I was not able to find documentation or a bug on `asm
> inline`. I'm quite curious how you even found or heard of this
> feature. To the source we must go...

When I had some free time I wrote a detailed blog post about this issue:
https://nadav.amit.zone/linux/2018/10/10/newline.html

Which later Borislav took to gcc people:
https://lore.kernel.org/lkml/[email protected]/

2019-09-05 10:21:07

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v2 2/6] lib/zstd/mem.h: replace __inline by inline

On 05/09/2019 02.07, Miguel Ojeda wrote:
> On Thu, Sep 5, 2019 at 2:00 AM Nick Desaulniers <[email protected]> wrote:
>>
>> While you're here, would you mind replacing `__attribute__((unused))`
>> with `__unused`? I would consider "naked attributes" (haven't been
>> feature tested in include/linux/compiler_attributes.h and are verbose)
>> to be an antipattern.
>
> +1 We should aim to avoid them entirely where possible.
>
> We have __always_unused and __maybe_unused, please choose whatever
> fits best (both map to "unused", we don't have __unused).

Well, I agree in principle, but was trying to keep this minimal. FTR, if
anything, I think the __attribute__((unused)) should simply be removed
since it's implied by our (re)definition of inline/__inline/__inline__.

Rasmus


2019-09-05 11:59:39

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v2 3/6] compiler_types.h: don't #define __inline

On 05/09/2019 02.13, Nick Desaulniers wrote:
> On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
> <[email protected]> wrote:
>>
>
> Besides patch 1 and 2 of this series, I also see:
> Documentation/trace/tracepoint-analysis.rst
> 318: : extern __inline void
> __attribute__((__gnu_inline__, __always_inline__, _
>

That seems to be copy-pasted output from some command, so I won't edit that.

> scripts/kernel-doc
> 1574: $prototype =~ s/^__inline +//;

That one can indeed go once all uses of __inline in the source code is gone.

>> The exception is include/acpi/platform/acintel.h. However, that header
>> is only included when using the intel compiler (does anybody actually
>> build the kernel with that?), and the ACPI_INLINE macro is only used
>
> In my effort to make the kernel slightly more compiler-portable, I
> have not yet found anyone building with ICC. I would love to be
> proven wrong. Let me go ask some of my Intel friends.

Yeah, the whole compiler-attributes.h only deal with gcc and clang, and
a lot of the attributes are unconditionally defined, so I assume either
ICC implements them all, or perhaps simply ignores (unknown) attributes?
It would be lovely if we could just throw out all the icc stuff.

>> in the definition of utterly trivial stub functions, where I doubt a
>
> See:
> include/acpi/platform/acenv.h
> 146 #elif defined(__INTEL_COMPILER)
> 147 #include <acpi/platform/acintel.h>
>
>> small change of semantics (lack of __gnu_inline) changes anything.
>
> include/acpi/platform/acintel.h
> 25:#define ACPI_INLINE __inline
> include/acpi/platform/acgcc.h
> 29:#define ACPI_INLINE __inline__
>
> lol wut
>
> I mean, you just would have to change that one line in
> include/acpi/platform/acintel.h, right? I'd sign off on this patch
> with such a patch added to the series.

I'm not sure what you mean. Just switch the above __inline to inline?
It's annoying not being able to test it.

Rasmus

2019-09-05 13:02:16

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On 05/09/2019 02.18, Nick Desaulniers wrote:
> On Fri, Aug 30, 2019 at 4:15 PM Rasmus Villemoes
> <[email protected]> wrote:
>>
>> This adds an asm_inline macro which expands to "asm inline" [1] when gcc
>> is new enough (>= 9.1), and just asm for older gccs and other
>> compilers.
>>
>> Signed-off-by: Rasmus Villemoes <[email protected]>
>> ---
>> include/linux/compiler-gcc.h | 4 ++++
>> include/linux/compiler_types.h | 4 ++++
>> 2 files changed, 8 insertions(+)
>>
>> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
>> index d7ee4c6bad48..544b87b41b58 100644
>> --- a/include/linux/compiler-gcc.h
>> +++ b/include/linux/compiler-gcc.h
>> @@ -172,3 +172,7 @@
>> #endif
>>
>> #define __no_fgcse __attribute__((optimize("-fno-gcse")))
>> +
>> +#if GCC_VERSION >= 90100
>
> Is it too late to ask for a feature test macro? Maybe one already
> exists?

No, not as far as I know. Perhaps something like below, though that
won't affect the already released gcc 9.1 and 9.2, of course.

gcc maintainers, WDYT? Can we add a feature test macro for asm inline()?
For context, I'm trying to add an asm_inline macro to the kernel source
that will fall back to asm when "asm inline" is not supported - see
https://lore.kernel.org/lkml/[email protected]/
for the whole thread.

From: Rasmus Villemoes <[email protected]>
Subject: [PATCH] add feature test macro for "asm inline"

Allow users to check availability of "asm inline()" via a feature test
macro. If and when clang implements support for "asm inline()", it's
easier for users if they can just test __HAVE_ASM_INLINE rather than
juggling different version checks for different compilers.

Changelog:

gcc/c-family/

* c-cppbuiltin.c (c_cpp_builtins): Add pre-define for
__HAVE_ASM_INLINE.

gcc/

* doc/cpp.texi: Document predefine __HAVE_ASM_INLINE.
---
gcc/c-family/c-cppbuiltin.c | 3 +++
gcc/doc/cpp.texi | 5 +++++
2 files changed, 8 insertions(+)

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index fc68bc4d0c4..163f3058741 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1383,6 +1383,9 @@ c_cpp_builtins (cpp_reader *pfile)
if (targetm.have_speculation_safe_value (false))
cpp_define (pfile, "__HAVE_SPECULATION_SAFE_VALUE");

+ /* Show the availability of "asm inline()". */
+ cpp_define (pfile, "__HAVE_ASM_INLINE");
+
#ifdef DWARF2_UNWIND_INFO
if (dwarf2out_do_cfi_asm ())
cpp_define (pfile, "__GCC_HAVE_DWARF2_CFI_ASM");
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index e271f5180d8..98f6d625857 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2386,6 +2386,11 @@ and swap operations on operands 1, 2, 4, 8 or 16
bytes in length, respectively.
This macro is defined with the value 1 to show that this version of GCC
supports @code{__builtin_speculation_safe_value}.

+@item __HAVE_ASM_INLINE
+This macro is defined with the value 1 to show that this version of GCC
+supports @code{asm inline()}. @xref{Size of an asm,,, gcc, Using
+the GNU Compiler Collection (GCC)}.
+
@item __GCC_HAVE_DWARF2_CFI_ASM
This macro is defined when the compiler is emitting DWARF CFI directives
to the assembler. When this is defined, it is possible to emit those same

2019-09-05 14:12:16

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

Hi Rasmus,

On Thu, Sep 05, 2019 at 01:07:11PM +0200, Rasmus Villemoes wrote:
> On 05/09/2019 02.18, Nick Desaulniers wrote:
> > Is it too late to ask for a feature test macro? Maybe one already
> > exists?
>
> No, not as far as I know.

[ That's not what a feature test macro is; a feature test macro allows the
user to select some optional behaviour. Things like _GNU_SOURCE. ]

> Perhaps something like below, though that
> won't affect the already released gcc 9.1 and 9.2, of course.

That is one reason to not want such a predefined macro. Another reason
is that you usually need to compile some test programs *anyway*, to see if
some bug is present for example, or to see if the exact implementation of
the feature is beneficial (or harmful) to your program in some way.

> gcc maintainers, WDYT? Can we add a feature test macro for asm inline()?

The only comparable existing predefined macro is __PRAGMA_REDEFINE_EXTNAME
it seems (no, I have no idea either). Everything else is required by some
standard (a "standard standard" or a "vendor standard", I'm lumping
everything together here), or shows whether some target has some feature,
or how many bits there are in certain types, that kind of thing.

Why would GCC want to have macros for all features it has? That would be
quite a few new ones every release. And what about bug fixes, are bug
fixes features as well?

I think you need to solve your configuration problems in your
configuration system.


Segher

2019-09-05 16:21:21

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On 05/09/2019 15.45, Segher Boessenkool wrote:
> Hi Rasmus,
>
> On Thu, Sep 05, 2019 at 01:07:11PM +0200, Rasmus Villemoes wrote:
>> On 05/09/2019 02.18, Nick Desaulniers wrote:
>>> Is it too late to ask for a feature test macro? Maybe one already
>>> exists?
>>
>> No, not as far as I know.
>
> [ That's not what a feature test macro is; a feature test macro allows the
> user to select some optional behaviour. Things like _GNU_SOURCE. ]
>
>> Perhaps something like below, though that
>> won't affect the already released gcc 9.1 and 9.2, of course.
>
> That is one reason to not want such a predefined macro. Another reason
> is that you usually need to compile some test programs *anyway*, to see if
> some bug is present for example, or to see if the exact implementation of
> the feature is beneficial (or harmful) to your program in some way.

OK, I think I'll just use a version check for now, and then switch to a
Kconfig test if and when clang grows support.

>> gcc maintainers, WDYT? Can we add a feature test macro for asm inline()?
>
>
> Why would GCC want to have macros for all features it has?

Well, gcc has implemented __has_attribute() which is similar - one could
detect support by compiling a trivial test program. Or the same could be
said for many of the predefined macros that are conditionally defined,
e.g. __HAVE_SPECULATION_SAFE_VALUE.

But I was just throwing the question into the air, I won't pursue this
further.

Thanks,
Rasmus

2019-09-05 19:32:14

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Thu, Sep 05, 2019 at 04:23:11PM +0200, Rasmus Villemoes wrote:
> On 05/09/2019 15.45, Segher Boessenkool wrote:
> > On Thu, Sep 05, 2019 at 01:07:11PM +0200, Rasmus Villemoes wrote:
> >> Perhaps something like below, though that
> >> won't affect the already released gcc 9.1 and 9.2, of course.
> >
> > That is one reason to not want such a predefined macro. Another reason
> > is that you usually need to compile some test programs *anyway*, to see if
> > some bug is present for example, or to see if the exact implementation of
> > the feature is beneficial (or harmful) to your program in some way.
>
> OK, I think I'll just use a version check for now, and then switch to a
> Kconfig test if and when clang grows support.
>
> >> gcc maintainers, WDYT? Can we add a feature test macro for asm inline()?
> >
> > Why would GCC want to have macros for all features it has?
>
> Well, gcc has implemented __has_attribute() which is similar - one could
> detect support by compiling a trivial test program.

It is not a macro, it doesn't spill over the place, and it is for
detecting things that are already fixed strings, much easier to do :-)

> Or the same could be
> said for many of the predefined macros that are conditionally defined,
> e.g. __HAVE_SPECULATION_SAFE_VALUE.

That one happened because of the Great Security Scare of 2017/2018, it's
not a good precedent. And, how it is set is target-specific, it can
depend on CPU model selected, target code generation options, or whatnot.

> But I was just throwing the question into the air, I won't pursue this
> further.

Maybe GCC should have a has_feature thing, it might fit in well there.
As preprocessor macros, not so much, IMO.


Segher

2019-09-05 20:50:56

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Thu, Sep 5, 2019 at 3:45 PM Segher Boessenkool
<[email protected]> wrote:
>
> [ That's not what a feature test macro is; a feature test macro allows the
> user to select some optional behaviour. Things like _GNU_SOURCE. ]

Yes and no. GNU libc defines feature test macros like you say, but
C++'s feature macros are like Rasmus/Nick are saying. I think libc's
definition is weird, I would call those "feature selection macros"
instead, because the user is selecting between some features (whether
to enable or not, for instance), rather than testing for the features.

> Why would GCC want to have macros for all features it has? That would be
> quite a few new ones every release.

Maybe GCC wouldn't, but its users, they surely would. For anything
that 1) is a new language feature, 2) breaks backwards-compatibility
with previous (or other compilers) and 3) is expected to be used by
end users, yes, it would be very useful to have.

For the same reasons C++ is adding feature test macros all over the
place nowadays and it is considered good practice (see SD-6: SG10
Feature Test Recommendations).

Cheers,
Miguel

2019-09-05 20:57:44

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Thu, Sep 5, 2019 at 5:52 PM Miguel Ojeda
<[email protected]> wrote:
>
> Yes and no. GNU libc defines feature test macros like you say, but
> C++'s feature macros are like Rasmus/Nick are saying. I think libc's
> definition is weird, I would call those "feature selection macros"
> instead, because the user is selecting between some features (whether
> to enable or not, for instance), rather than testing for the features.

By the way, this is not to criticize libc, I imagine they employed that
nomenclature since that is what some standards used, but still, the
naming is not great from the users' perspective vs. the header
writer's perspective, IMO.

Cheers,
Miguel

2019-09-06 15:25:59

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Thu, Sep 05, 2019 at 05:52:44PM +0200, Miguel Ojeda wrote:
> On Thu, Sep 5, 2019 at 3:45 PM Segher Boessenkool
> <[email protected]> wrote:
> >
> > [ That's not what a feature test macro is; a feature test macro allows the
> > user to select some optional behaviour. Things like _GNU_SOURCE. ]
>
> Yes and no. GNU libc defines feature test macros like you say, but
> C++'s feature macros are like Rasmus/Nick are saying. I think libc's

I can't find anything with "feature" and "macros" in the C++ standard,
it's "predefined macros" there I guess? In C, it is also "predefined
macros" in general, and there is "conditional feature macros".

> definition is weird, I would call those "feature selection macros"
> instead, because the user is selecting between some features (whether
> to enable or not, for instance), rather than testing for the features.

Sure. But the name is traditional, many decades old, it predates glibc.
Using an established name to mean pretty much the opposite of what it
normally does is a bit confusing, never mind if that usage makes much
sense ;-)


Segher

2019-09-07 03:16:00

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 2:23 PM Segher Boessenkool
<[email protected]> wrote:
>
> I can't find anything with "feature" and "macros" in the C++ standard,
> it's "predefined macros" there I guess? In C, it is also "predefined
> macros" in general, and there is "conditional feature macros".

They are introduced in C++20, but they have been added for a lot of
older features in both the language (see [cpp.predefined]p1, around 50
of them) and the library (see [support.limits.general]p3, ~100):

http://eel.is/c++draft/cpp.predefined#tab:cpp.predefined.ft
http://eel.is/c++draft/support.limits#tab:support.ft

> Sure. But the name is traditional, many decades old, it predates glibc.
> Using an established name to mean pretty much the opposite of what it
> normally does is a bit confusing, never mind if that usage makes much
> sense ;-)

Agreed on principle :-) However, I wouldn't say it is the opposite. I
would say they are the same, but from different perspectives: one says
"I want to test if the user enabled the feature", the other says "I
want to test if the vendor implemented the feature". Which is fine,
but for users the meaning is inverted as you say: in the first case
they want to say "I want to enable this feature in this library" --
they don't want to "test" anything. And since most people will be
users, not vendors writing standard libraries, I think the user
perspective would have been better.

Cheers,
Miguel

2019-09-07 09:23:44

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 05:13:54PM +0200, Miguel Ojeda wrote:
> On Fri, Sep 6, 2019 at 2:23 PM Segher Boessenkool
> <[email protected]> wrote:
> > I can't find anything with "feature" and "macros" in the C++ standard,
> > it's "predefined macros" there I guess? In C, it is also "predefined
> > macros" in general, and there is "conditional feature macros".
>
> They are introduced in C++20,

(Which isn't the C++ standard yet, okay).

> but they have been added for a lot of
> older features in both the language (see [cpp.predefined]p1, around 50
> of them) and the library (see [support.limits.general]p3, ~100):
>
> http://eel.is/c++draft/cpp.predefined#tab:cpp.predefined.ft
> http://eel.is/c++draft/support.limits#tab:support.ft

And they spell it "feature-test" there. Lovely :-/

> > Sure. But the name is traditional, many decades old, it predates glibc.
> > Using an established name to mean pretty much the opposite of what it
> > normally does is a bit confusing, never mind if that usage makes much
> > sense ;-)
>
> Agreed on principle :-) However, I wouldn't say it is the opposite. I
> would say they are the same, but from different perspectives: one says
> "I want to test if the user enabled the feature", the other says "I
> want to test if the vendor implemented the feature".

No, that is not what it does. A user defines such a macro, and that
makes the library change behaviour.

As the GNU C Library manual explains:

This system exists to allow the library to conform to multiple
standards. Although the different standards are often described as
supersets of each other, they are usually incompatible because larger
standards require functions with names that smaller ones reserve to the
user program. This is not mere pedantry -- it has been a problem in
practice. For instance, some non-GNU programs define functions named
'getline' that have nothing to do with this library's 'getline'. They
would not be compilable if all features were enabled indiscriminately.

https://www.gnu.org/software/libc/manual/html_node/Feature-Test-Macros.html


Segher

2019-09-07 09:24:00

by Jakub Jelinek

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 11:30:28AM -0500, Segher Boessenkool wrote:
> On Fri, Sep 06, 2019 at 05:13:54PM +0200, Miguel Ojeda wrote:
> > On Fri, Sep 6, 2019 at 2:23 PM Segher Boessenkool
> > <[email protected]> wrote:
> > > I can't find anything with "feature" and "macros" in the C++ standard,
> > > it's "predefined macros" there I guess? In C, it is also "predefined
> > > macros" in general, and there is "conditional feature macros".
> >
> > They are introduced in C++20,
>
> (Which isn't the C++ standard yet, okay).

Well, they have been required by SD-6 before being added to C++20, so we
have tons of the predefined macros for C++ already starting with gcc 4.9 or
so, but it is something required by the standard so we have to do that.
Most of them depend also on compiler options, so can't be easily replaced
with a simple __GNUC__ version check.

What I'd like to add is that each predefined macro isn't without cost,
while adding one predefined macro might not be measurable (though, for
some predefined macros (the floating point values) it was very measurable
and we had to resort to lazy evaluation of the macros), adding hundreds of
predefined macros is measurable, affects the speed of empty compiler run,
adds size of -g3 produced debug info, increases size of -E -dD output etc.

Jakub

2019-09-07 09:25:28

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 6:30 PM Segher Boessenkool
<[email protected]> wrote:
>
> (Which isn't the C++ standard yet, okay).

At this stage, it pretty much is. It is basically bug fixing at this point.

> No, that is not what it does. A user defines such a macro, and that
> makes the library change behaviour.

That is what I have said:

"I want to test if the user enabled the feature"

means the *library* tests if the user enabled the feature before
including the library. But the user does not want to test anything.

Cheers,
Miguel

2019-09-07 15:58:57

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 9:39 AM Jakub Jelinek <[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 11:30:28AM -0500, Segher Boessenkool wrote:
> > On Fri, Sep 06, 2019 at 05:13:54PM +0200, Miguel Ojeda wrote:
> > > On Fri, Sep 6, 2019 at 2:23 PM Segher Boessenkool
> > > <[email protected]> wrote:
> > > > I can't find anything with "feature" and "macros" in the C++ standard,
> > > > it's "predefined macros" there I guess? In C, it is also "predefined
> > > > macros" in general, and there is "conditional feature macros".
> > >
> > > They are introduced in C++20,
> >
> > (Which isn't the C++ standard yet, okay).
>
> Well, they have been required by SD-6 before being added to C++20, so we
> have tons of the predefined macros for C++ already starting with gcc 4.9 or
> so, but it is something required by the standard so we have to do that.
> Most of them depend also on compiler options, so can't be easily replaced
> with a simple __GNUC__ version check.
>
> What I'd like to add is that each predefined macro isn't without cost,
> while adding one predefined macro might not be measurable (though, for
> some predefined macros (the floating point values) it was very measurable
> and we had to resort to lazy evaluation of the macros), adding hundreds of
> predefined macros is measurable, affects the speed of empty compiler run,
> adds size of -g3 produced debug info, increases size of -E -dD output etc.
>
> Jakub

Here's the case that I think is perfect:
https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/

Specifically the feature test preprocessor define __GCC_ASM_FLAG_OUTPUTS__.

See exactly how we handle it in the kernel:
- https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/asm.h#L112-L118
- https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/rmwcc.h#L14-L30

Feature detection of the feature makes it trivial to detect when the
feature is supported, rather than brittle compiler version checks.
Had it been a GCC version check, it wouldn't work for clang out of the
box when clang added support for __GCC_ASM_FLAG_OUTPUTS__. But since
we had the helpful __GCC_ASM_FLAG_OUTPUTS__, and wisely based our use
of the feature on that preprocessor define, the code ***just worked***
for compilers that didn't support the feature ***and*** compilers when
they did support the feature ***without changing any of the source
code*** being compiled.

All I'm asking for is that when GCC adds a new GNU C extension (such
as `asm inline`), define a new preprocessor symbol (like has already
been done w/ __GCC_ASM_FLAG_OUTPUTS__), so that we don't have to use
version checking (or reimplementing autoconf) and use feature
detection instead. This simplifies use of this feature even between
codebases supporting multiple versions of GCC.

(Also, I'm guessing the cost of another preprocessor define is near
zero compared to parsing comments for -Wimplicit-fallthrough)
--
Thanks,
~Nick Desaulniers

2019-09-08 05:32:17

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 11:14:08AM -0700, Nick Desaulniers wrote:
> Here's the case that I think is perfect:
> https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
>
> Specifically the feature test preprocessor define __GCC_ASM_FLAG_OUTPUTS__.
>
> See exactly how we handle it in the kernel:
> - https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/asm.h#L112-L118
> - https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/rmwcc.h#L14-L30
>
> Feature detection of the feature makes it trivial to detect when the
> feature is supported, rather than brittle compiler version checks.
> Had it been a GCC version check, it wouldn't work for clang out of the
> box when clang added support for __GCC_ASM_FLAG_OUTPUTS__. But since
> we had the helpful __GCC_ASM_FLAG_OUTPUTS__, and wisely based our use
> of the feature on that preprocessor define, the code ***just worked***
> for compilers that didn't support the feature ***and*** compilers when
> they did support the feature ***without changing any of the source
> code*** being compiled.

And if instead you tested whether the actual feature you need works as
you need it to, it would even work fine if there was a bug we fixed that
breaks things for the kernel. Without needing a new compiler.

Or as another example, if we added support for some other flags. (x86
has only a few flags; many other archs have many more, and in some cases
newer hardware actually has more flags than older).

With the "macro" scheme we would need to add new macros in all these
cases. And since those are target-specific macros, that quickly expands
beyond reasonable bounds.

If you want to know if you can do X in some environment, just try to do X.


Segher

2019-09-08 06:34:17

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 3:03 PM Segher Boessenkool
<[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 11:14:08AM -0700, Nick Desaulniers wrote:
> > Here's the case that I think is perfect:
> > https://developers.redhat.com/blog/2016/02/25/new-asm-flags-feature-for-x86-in-gcc-6/
> >
> > Specifically the feature test preprocessor define __GCC_ASM_FLAG_OUTPUTS__.
> >
> > See exactly how we handle it in the kernel:
> > - https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/asm.h#L112-L118
> > - https://github.com/ClangBuiltLinux/linux/blob/0445971000375859008414f87e7c72fa0d809cf8/arch/x86/include/asm/rmwcc.h#L14-L30
> >
> > Feature detection of the feature makes it trivial to detect when the
> > feature is supported, rather than brittle compiler version checks.
> > Had it been a GCC version check, it wouldn't work for clang out of the
> > box when clang added support for __GCC_ASM_FLAG_OUTPUTS__. But since
> > we had the helpful __GCC_ASM_FLAG_OUTPUTS__, and wisely based our use
> > of the feature on that preprocessor define, the code ***just worked***
> > for compilers that didn't support the feature ***and*** compilers when
> > they did support the feature ***without changing any of the source
> > code*** being compiled.
>
> And if instead you tested whether the actual feature you need works as
> you need it to, it would even work fine if there was a bug we fixed that
> breaks things for the kernel. Without needing a new compiler.

That assumes a feature is broken out of the gate and is putting the
cart before the horse. If a feature is available, it should work. If
you later find it to be unsatisfactory, sure go out of your way to add
ugly compiler-specific version checks or upgrade your minimally
supported toolchain; until then feature detection is much cleaner (see
again __GCC_ASM_FLAG_OUTPUTS__).

>
> Or as another example, if we added support for some other flags. (x86
> has only a few flags; many other archs have many more, and in some cases
> newer hardware actually has more flags than older).

I think compiler flags are orthogonal to GNU C extensions we're discussing here.

>
> With the "macro" scheme we would need to add new macros in all these
> cases. And since those are target-specific macros, that quickly expands
> beyond reasonable bounds.

I don't think so. Can you show me an example codebase that proves me wrong?

>
> If you want to know if you can do X in some environment, just try to do X.

That's a very autoconf centric viewpoint. Why doesn't the kernel take
that approach for __GCC_ASM_FLAG_OUTPUTS__? Why not repeatedly invoke
$CC to find if every compiler __attribute__ is supported? Do you
think it's faster for the C preprocessor to check for a few #ifdefs,
or to repeatedly invoke $CC at build or compile time to detect new
features?
--
Thanks,
~Nick Desaulniers

2019-09-08 08:10:32

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 03:35:02PM -0700, Nick Desaulniers wrote:
> On Fri, Sep 6, 2019 at 3:03 PM Segher Boessenkool
> <[email protected]> wrote:
> > And if instead you tested whether the actual feature you need works as
> > you need it to, it would even work fine if there was a bug we fixed that
> > breaks things for the kernel. Without needing a new compiler.
>
> That assumes a feature is broken out of the gate and is putting the
> cart before the horse. If a feature is available, it should work.

GCC currently has 91696 bug reports.

> > Or as another example, if we added support for some other flags. (x86
> > has only a few flags; many other archs have many more, and in some cases
> > newer hardware actually has more flags than older).
>
> I think compiler flags are orthogonal to GNU C extensions we're discussing here.

No, I am talking exactly about what you brought up. The flags output
for inline assembler, using the =@ syntax. If I had implemented that
for Power when it first came up, I would by now have had to add support
for another flag (the CA32 flag). Oh, and I would not have implemented
support for OV or SO at all originally, but perhaps they are useful, so
let's add that as well. And there is OV32 now as well.

> > With the "macro" scheme we would need to add new macros in all these
> > cases. And since those are target-specific macros, that quickly expands
> > beyond reasonable bounds.
>
> I don't think so. Can you show me an example codebase that proves me wrong?

No, of course not, because we aren't silly enough to implement something
like that.

> > If you want to know if you can do X in some environment, just try to do X.
>
> That's a very autoconf centric viewpoint. Why doesn't the kernel take
> that approach for __GCC_ASM_FLAG_OUTPUTS__?

Ask them, not me.


Segher

2019-09-08 10:14:44

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 3:56 PM Segher Boessenkool
<[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 03:35:02PM -0700, Nick Desaulniers wrote:
> > On Fri, Sep 6, 2019 at 3:03 PM Segher Boessenkool
> > <[email protected]> wrote:
> > > And if instead you tested whether the actual feature you need works as
> > > you need it to, it would even work fine if there was a bug we fixed that
> > > breaks things for the kernel. Without needing a new compiler.
> >
> > That assumes a feature is broken out of the gate and is putting the
> > cart before the horse. If a feature is available, it should work.
>
> GCC currently has 91696 bug reports.

Fair.

>
> > > Or as another example, if we added support for some other flags. (x86
> > > has only a few flags; many other archs have many more, and in some cases
> > > newer hardware actually has more flags than older).
> >
> > I think compiler flags are orthogonal to GNU C extensions we're discussing here.
>
> No, I am talking exactly about what you brought up. The flags output
> for inline assembler, using the =@ syntax. If I had implemented that
> for Power when it first came up, I would by now have had to add support
> for another flag (the CA32 flag). Oh, and I would not have implemented
> support for OV or SO at all originally, but perhaps they are useful, so
> let's add that as well. And there is OV32 now as well.

Oh, I misunderstood. I see your point. Define the symbol as a number
for what level of output flags you support (ie. the __cplusplus
macro).

>
> > > With the "macro" scheme we would need to add new macros in all these
> > > cases. And since those are target-specific macros, that quickly expands
> > > beyond reasonable bounds.
> >
> > I don't think so. Can you show me an example codebase that proves me wrong?
>
> No, of course not, because we aren't silly enough to implement something
> like that.

I still don't think feature detection is worse than version detection
(until you find your feature broken in a way that forces the use of
version detection).

Just to prove my point about version checks being brittle, it looks
like Rasmus' version check isn't even right. GCC supported `asm
inline` back in the 8.3 release, not 9.1 as in this patch:
https://godbolt.org/z/1woitS . So users of gcc-8.2 and gcc-8.3
wouldn't be able to take advantage of `asm inline` even though their
compiler supported it.

Or was it "broken" until 9.1? Lord knows, as `asm inline` wasn't in
any release notes or bug reports I can find:
8: https://gcc.gnu.org/gcc-8/changes.html
9: https://gcc.gnu.org/gcc-9/changes.html
Bug tracker query:
https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=SUSPENDED&bug_status=WAITING&bug_status=REOPENED&cf_known_to_fail_type=allwords&cf_known_to_work_type=allwords&query_format=advanced&short_desc=asm%20inline&short_desc_type=anywordssubstr

Ah, here it is:
https://github.com/gcc-mirror/gcc/commit/6de46ad5326fc5e6b730a2feb8c62d09c1561f92
Which looks like the qualifier was added to this page:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

I like many of the GNU C extensions, and I want to help support them
in Clang so that they can be used in more places, but the current
process (and questions I have when I implement some of them) leaves me
with the sense that there's probably room for improvement on some of
these things before they go out the door.

Segher, next time there's discussion about new C extensions for the
kernel, can you please include me in the discussions?
--
Thanks,
~Nick Desaulniers

2019-09-08 10:18:01

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 04:42:58PM -0700, Nick Desaulniers via gcc-patches wrote:
> On Fri, Sep 6, 2019 at 3:56 PM Segher Boessenkool
> <[email protected]> wrote:
> Oh, I misunderstood. I see your point. Define the symbol as a number
> for what level of output flags you support (ie. the __cplusplus
> macro).

That works if history is linear. I guess with enough effort we can make
that work in most cases (for backports it is a problem, if you want to
support a new feature -- or bugfix! -- you need to support all previous
ones as well... Distros are going to hate us, too ;-) )

> > > I don't think so. Can you show me an example codebase that proves me wrong?
> >
> > No, of course not, because we aren't silly enough to implement something
> > like that.
>
> I still don't think feature detection is worse than version detection
> (until you find your feature broken in a way that forces the use of
> version detection).

*You* bring up version detection. I don't. You want some halfway thing,
with some macros that say what version some part of your compiler is.

I say to just detect the feature you want, and if it actually works the
way you want it, etc.

> Just to prove my point about version checks being brittle, it looks
> like Rasmus' version check isn't even right. GCC supported `asm
> inline` back in the 8.3 release, not 9.1 as in this patch:

Yes, I backported it so that it is available in 7.5, 8.3, and 9.1, so
that more users will have this available sooner. (7.5 has not been
released yet, but asm inline has been supported in GCC 7 since Jan 2
this year).

> Or was it "broken" until 9.1? Lord knows, as `asm inline` wasn't in
> any release notes or bug reports I can find:

https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01143.html

It never was accepted, and I dropped the ball.

> Ah, here it is:
> https://github.com/gcc-mirror/gcc/commit/6de46ad5326fc5e6b730a2feb8c62d09c1561f92
> Which looks like the qualifier was added to this page:
> https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html

Sure, it is part of the documentation just fine. And it works just fine
too, it is a *very* simple feature. By design.

> I like many of the GNU C extensions, and I want to help support them
> in Clang so that they can be used in more places, but the current
> process (and questions I have when I implement some of them) leaves me
> with the sense that there's probably room for improvement on some of
> these things before they go out the door.
>
> Segher, next time there's discussion about new C extensions for the
> kernel, can you please include me in the discussions?

You can lurk on gcc-patches@ and/or gcc@? But I'll try to remember, sure.
Not that I am involved in all such discussions myself, mind.


Segher

2019-09-08 10:19:50

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 6, 2019 at 5:14 PM Segher Boessenkool
<[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 04:42:58PM -0700, Nick Desaulniers via gcc-patches wrote:
> > Just to prove my point about version checks being brittle, it looks
> > like Rasmus' version check isn't even right. GCC supported `asm
> > inline` back in the 8.3 release, not 9.1 as in this patch:
>
> Yes, I backported it so that it is available in 7.5, 8.3, and 9.1, so
> that more users will have this available sooner. (7.5 has not been
> released yet, but asm inline has been supported in GCC 7 since Jan 2
> this year).

Ah, ok that makes sense.

How would you even write a version check for that?

Which looks better?

#if __GNU_MAJOR__ > 9 || __GNU_MAJOR__ == 8 && __GNU_MINOR__ >= 3 ||
__GNU_MAJOR__ == 7 && __GNU_MINOR__ >= 5 || __CLANG_MAJOR__ == 42
// make use of `asm inline`
#endif

or

#ifdef __CC_HAS_ASM_INLINE__
// make use of `asm inline`
#endif

>
> > Or was it "broken" until 9.1? Lord knows, as `asm inline` wasn't in
> > any release notes or bug reports I can find:
>
> https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01143.html
>
> It never was accepted, and I dropped the ball.

Ah, ok, that's fine, so documentation was at least written. Tracking
when and where patches land (or don't) is difficult when patch files
are emailed around. I try to keep track of when and where our kernel
patches land, but I frequently drop the ball there.

> > Segher, next time there's discussion about new C extensions for the
> > kernel, can you please include me in the discussions?
>
> You can lurk on gcc-patches@ and/or gcc@?

Please "interrupt" me when you're aware of such discussions, rather
than me "polling" a mailing list. (I will buy you a tasty beverage of
your preference). I'm already subscribed to more mailing lists than I
have time to read.

> But I'll try to remember, sure.
> Not that I am involved in all such discussions myself, mind.

But you _did_ implement `asm inline`. ;)
--
Thanks,
~Nick Desaulniers

2019-09-08 12:50:37

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Fri, Sep 06, 2019 at 06:04:54PM -0700, Nick Desaulniers wrote:
> On Fri, Sep 6, 2019 at 5:14 PM Segher Boessenkool
> <[email protected]> wrote:
> > On Fri, Sep 06, 2019 at 04:42:58PM -0700, Nick Desaulniers via gcc-patches wrote:
> > > Just to prove my point about version checks being brittle, it looks
> > > like Rasmus' version check isn't even right. GCC supported `asm
> > > inline` back in the 8.3 release, not 9.1 as in this patch:
> >
> > Yes, I backported it so that it is available in 7.5, 8.3, and 9.1, so
> > that more users will have this available sooner. (7.5 has not been
> > released yet, but asm inline has been supported in GCC 7 since Jan 2
> > this year).
>
> Ah, ok that makes sense.
>
> How would you even write a version check for that?

I wouldn't. Please stop using that straw man. I'm not saying version
checks are good, or useful for most things. I am saying they are not.

Predefined compiler symbols to do version checking (of a feature) is
just a lesser instance of the same problem though. (And it causes its
own more or less obvious problems as well).

> > > Or was it "broken" until 9.1? Lord knows, as `asm inline` wasn't in
> > > any release notes or bug reports I can find:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01143.html
> >
> > It never was accepted, and I dropped the ball.
>
> Ah, ok, that's fine, so documentation was at least written. Tracking
> when and where patches land (or don't) is difficult when patch files
> are emailed around. I try to keep track of when and where our kernel
> patches land, but I frequently drop the ball there.

I keep track of most things just fine... But the release notes are part
of our web content, which is in a separate CVS repository (still nicer
than SVN :-) ), and since I don't use it very often it falls outside of
all my normal procedures.

> your preference). I'm already subscribed to more mailing lists than I
> have time to read.
>
> > But I'll try to remember, sure.
> > Not that I am involved in all such discussions myself, mind.
>
> But you _did_ implement `asm inline`. ;)

That started as just

+ /* If this asm is asm inline, count anything as minimum size. */
+ if (gimple_asm_inline_p (as_a <gasm *> (stmt)))
+ count = MIN (1, count);

(in estimate_num_insns) but then things ballooned. Like such things do.


Segher

2019-09-09 17:10:19

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Sat, Sep 7, 2019 at 3:11 PM Segher Boessenkool
<[email protected]> wrote:
>
> I wouldn't. Please stop using that straw man. I'm not saying version
> checks are good, or useful for most things. I am saying they are not.
>
> Predefined compiler symbols to do version checking (of a feature) is
> just a lesser instance of the same problem though. (And it causes its
> own more or less obvious problems as well).

That is fair enough, but what are you suggesting we use, then?

Because "testing to do X to know if you can do X" cannot work with
source code alone and implies each project has to redo this work in
its build system for each compiler, plus each project ends up with
different names for these macros. The C++20 approach of having
standardized macros for major features is way better (whether we have
one macro per feature or a __has_feature(X) macro). Note this does not
mean we need to take this to the extreme and have a feature-test macro
for *every* feature, bugfix or behavior change as a macro.

Cheers,
Miguel

2019-09-12 22:34:31

by Nick Desaulniers

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Sat, Sep 7, 2019 at 6:11 AM Segher Boessenkool
<[email protected]> wrote:
>
> On Fri, Sep 06, 2019 at 06:04:54PM -0700, Nick Desaulniers wrote:
> > On Fri, Sep 6, 2019 at 5:14 PM Segher Boessenkool
> > <[email protected]> wrote:
> > > On Fri, Sep 06, 2019 at 04:42:58PM -0700, Nick Desaulniers via gcc-patches wrote:
> > > > Just to prove my point about version checks being brittle, it looks
> > > > like Rasmus' version check isn't even right. GCC supported `asm
> > > > inline` back in the 8.3 release, not 9.1 as in this patch:
> > >
> > > Yes, I backported it so that it is available in 7.5, 8.3, and 9.1, so
> > > that more users will have this available sooner. (7.5 has not been
> > > released yet, but asm inline has been supported in GCC 7 since Jan 2
> > > this year).
> >
> > Ah, ok that makes sense.
> >
> > How would you even write a version check for that?
>
> I wouldn't. Please stop using that straw man. I'm not saying version
> checks are good, or useful for most things. I am saying they are not.

Then please help Rasmus with a suggestion on how best to detect and
safely make use of the feature you implemented. As is, the patch in
question is using version checks.

>
> Predefined compiler symbols to do version checking (of a feature) is
> just a lesser instance of the same problem though. (And it causes its
> own more or less obvious problems as well).
>
> > > > Or was it "broken" until 9.1? Lord knows, as `asm inline` wasn't in
> > > > any release notes or bug reports I can find:
> > >
> > > https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01143.html
> > >
> > > It never was accepted, and I dropped the ball.
> >
> > Ah, ok, that's fine, so documentation was at least written. Tracking
> > when and where patches land (or don't) is difficult when patch files
> > are emailed around. I try to keep track of when and where our kernel
> > patches land, but I frequently drop the ball there.
>
> I keep track of most things just fine... But the release notes are part
> of our web content, which is in a separate CVS repository (still nicer
> than SVN :-) ), and since I don't use it very often it falls outside of
> all my normal procedures.
>
> > your preference). I'm already subscribed to more mailing lists than I
> > have time to read.
> >
> > > But I'll try to remember, sure.
> > > Not that I am involved in all such discussions myself, mind.
> >
> > But you _did_ implement `asm inline`. ;)
>
> That started as just
>
> + /* If this asm is asm inline, count anything as minimum size. */
> + if (gimple_asm_inline_p (as_a <gasm *> (stmt)))
> + count = MIN (1, count);
>
> (in estimate_num_insns) but then things ballooned. Like such things do.

So I'm not convinced this GNU C extension solves the problem it's
described to be used for. I agree that current implementations in
multiple compilers is imprecise, and leads to developer headaches. I
think `asm inline` will help in cases where vanilla `asm`
overestimates the size of inline assembly, but I also think it will be
just as bad as vanilla `asm` in cases where the size is
underestimated.

I have seen instances where instruction selection fails to select the
appropriate way to branch when inline asm size is misjudged, resulting
in un-encodeable jumps (as in the branch target is too far to be
encoded in the number of bits of a single jump/branch instruction).
And the use of .pushsection/.popsection assembler directives and
__attribute__((section())) attributes complicates the accounting
further, as you can then place code from the inline assembly in a
different section than the function itself (so that inline assembly
doesn't affect the function's size, and the implications on inlining
the function). That would cause vanilla `asm` to overestimate size.
(I suspect variable length encoded instruction sets also suffer from
misaccounting).

Short of invoking the assembler itself, and then matching the byte
size of generated code section that matches the function's section,
can you accurately describe the size of inline assembly. .macro and
.rept assembler directives really complicate estimates and can cause
vanilla `asm` to underestimate size.

I agree that current implementations in multiple compilers is
imprecise, and leads to developer headaches. Rather than give
developers the ability to choose between 2 different heuristics that
are both imprecise, why not make the existing heuristic (ie. vanilla
`asm`) more precise in its measure?
--
Thanks,
~Nick Desaulniers

2019-09-12 22:40:46

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On 12/09/2019 23.54, Nick Desaulniers wrote:
> On Sat, Sep 7, 2019 at 6:11 AM Segher Boessenkool
> <[email protected]> wrote:
>>
>> On Fri, Sep 06, 2019 at 06:04:54PM -0700, Nick Desaulniers wrote:
>>
>>> How would you even write a version check for that?
>>
>> I wouldn't. Please stop using that straw man. I'm not saying version
>> checks are good, or useful for most things. I am saying they are not.
>
> Then please help Rasmus with a suggestion on how best to detect and
> safely make use of the feature you implemented. As is, the patch in
> question is using version checks.

I was just about to send out an updated version. I'm just going to do
the check in Kconfig - I didn't realize how easy it had become to do
that kind of thing until Masahiro pointed me at his RFC patch from December.

Rasmus

2019-09-12 22:52:59

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v3 0/6] make use of gcc 9's "asm inline()"

gcc 9+ (and gcc 8.3, 7.5) provides a way to override the otherwise
crude heuristic that gcc uses to estimate the size of the code
represented by an asm() statement. From the gcc docs

If you use 'asm inline' instead of just 'asm', then for inlining
purposes the size of the asm is taken as the minimum size, ignoring
how many instructions GCC thinks it is.

For compatibility with older compilers, we obviously want a

#if [understands asm inline]
#define asm_inline asm inline
#else
#define asm_inline asm
#endif

But since we #define the identifier inline to attach some attributes,
we have to use an alternate spelling of that keyword. gcc provides
both __inline__ and __inline, and we currently #define both to inline,
so they all have the same semantics. We have to free up one of
__inline__ and __inline, and the latter is by far the easiest.

The two x86 changes cause smaller code gen differences than I'd
expect, but I think we do want the asm_inline thing available sooner
or later, so this is just to get the ball rolling.

Changes since v1: __inline instead of __inline__, making the diffstat
400 lines smaller.

Changes since v2: Check support of "asm inline" in Kconfig rather than
based on gcc version, since the feature was backported to gcc 7.x and
gcc 8.x. That also automatically enables it if and when Clang grows
support, though that compiler apparently does not have the same
problems with overestimating sizes of asm()s that gcc has.

Patch 1 has already been picked up by Greg in staging-next, it's
included here for completeness. I don't know how to route the rest, or
if they should simply wait for 5.5 given how close we are to the merge
window for 5.4.

Rasmus Villemoes (6):
staging: rtl8723bs: replace __inline by inline
lib/zstd/mem.h: replace __inline by inline
compiler_types.h: don't #define __inline
compiler-types.h: add asm_inline definition
x86: alternative.h: use asm_inline for all alternative variants
x86: bug.h: use asm_inline in _BUG_FLAGS definitions

arch/x86/include/asm/alternative.h | 14 +++++++-------
arch/x86/include/asm/bug.h | 4 ++--
drivers/staging/rtl8723bs/core/rtw_pwrctrl.c | 4 ++--
drivers/staging/rtl8723bs/core/rtw_wlan_util.c | 2 +-
drivers/staging/rtl8723bs/include/drv_types.h | 6 +++---
.../staging/rtl8723bs/include/osdep_service.h | 10 +++++-----
.../rtl8723bs/include/osdep_service_linux.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_mlme.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/rtw_recv.h | 16 ++++++++--------
drivers/staging/rtl8723bs/include/sta_info.h | 2 +-
drivers/staging/rtl8723bs/include/wifi.h | 14 +++++++-------
drivers/staging/rtl8723bs/include/wlan_bssdef.h | 2 +-
include/linux/compiler_types.h | 17 ++++++++++++++++-
init/Kconfig | 3 +++
lib/zstd/mem.h | 2 +-
15 files changed, 71 insertions(+), 53 deletions(-)

--
2.20.1

2019-09-12 22:55:01

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] make use of gcc 9's "asm inline()"

On Fri, Sep 13, 2019 at 12:19 AM Rasmus Villemoes
<[email protected]> wrote:
>
> Patch 1 has already been picked up by Greg in staging-next, it's
> included here for completeness. I don't know how to route the rest, or
> if they should simply wait for 5.5 given how close we are to the merge
> window for 5.4.

If you want I can pick this up in compiler-attributes and submit it as
a whole if we get Acks from rtl8723bs/x86/...maintainers.

Cheers,
Miguel

2019-09-13 01:17:43

by Rasmus Villemoes

[permalink] [raw]
Subject: [PATCH v3 2/6] lib/zstd/mem.h: replace __inline by inline

Currently, compiler_types.h #defines __inline as inline (and further
#defines inline to automatically attach some attributes), so this does
not change functionality. It serves as preparation for removing the
#define of __inline.

While at it, also remove the __attribute__((unused)) - it's already
included in the definition of the inline macro, and "open-coded"
__attribute__(()) should be avoided.

Since commit a95b37e20db9 (kbuild: get <linux/compiler_types.h> out of
<linux/kconfig.h>), compiler_types.h is automatically included by all
kernel C code - i.e., the definition of inline including the unused
attribute is guaranteed to be in effect whenever ZSTD_STATIC is
expanded.

Signed-off-by: Rasmus Villemoes <[email protected]>
---
lib/zstd/mem.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/zstd/mem.h b/lib/zstd/mem.h
index 3a0f34c8706c..93d7a2c377fe 100644
--- a/lib/zstd/mem.h
+++ b/lib/zstd/mem.h
@@ -27,7 +27,7 @@
/*-****************************************
* Compiler specifics
******************************************/
-#define ZSTD_STATIC static __inline __attribute__((unused))
+#define ZSTD_STATIC static inline

/*-**************************************************************
* Basic Types
--
2.20.1

2019-09-13 06:13:37

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] make use of gcc 9's "asm inline()"

On 13/09/2019 00.30, Miguel Ojeda wrote:
> On Fri, Sep 13, 2019 at 12:19 AM Rasmus Villemoes
> <[email protected]> wrote:
>>
>> Patch 1 has already been picked up by Greg in staging-next, it's
>> included here for completeness. I don't know how to route the rest, or
>> if they should simply wait for 5.5 given how close we are to the merge
>> window for 5.4.
>
> If you want I can pick this up in compiler-attributes and submit it as
> a whole if we get Acks from rtl8723bs/x86/...maintainers.

Ingo has now acked the x86 parts, and Greg has already picked up the
rtl8723bs patch, which is at least an implicit ack. I'm just unsure of
how and if it will work if you also pick up that one - but, if you
don't, your tree would be (somewhat) dependent on Greg's staging-next :(

Rasmus

2019-09-13 16:39:42

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] make use of gcc 9's "asm inline()"

On Fri, Sep 13, 2019 at 08:11:24AM +0200, Rasmus Villemoes wrote:
> On 13/09/2019 00.30, Miguel Ojeda wrote:
> > On Fri, Sep 13, 2019 at 12:19 AM Rasmus Villemoes
> > <[email protected]> wrote:
> >>
> >> Patch 1 has already been picked up by Greg in staging-next, it's
> >> included here for completeness. I don't know how to route the rest, or
> >> if they should simply wait for 5.5 given how close we are to the merge
> >> window for 5.4.
> >
> > If you want I can pick this up in compiler-attributes and submit it as
> > a whole if we get Acks from rtl8723bs/x86/...maintainers.
>
> Ingo has now acked the x86 parts, and Greg has already picked up the
> rtl8723bs patch, which is at least an implicit ack. I'm just unsure of
> how and if it will work if you also pick up that one - but, if you
> don't, your tree would be (somewhat) dependent on Greg's staging-next :(

Feel free to also take that patch through any tree, it's "obviously
correct" :)

greg k-h

2019-09-15 19:52:50

by Miguel Ojeda

[permalink] [raw]
Subject: Re: [PATCH v3 0/6] make use of gcc 9's "asm inline()"

On Fri, Sep 13, 2019 at 5:21 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> Feel free to also take that patch through any tree, it's "obviously
> correct" :)

OK :) Picked all 6 in compiler-attributes:

https://github.com/ojeda/linux/commits/compiler-attributes

I added Ingo's Acks and fixed a minor typo (has -> have) in the 3rd
patch commit message. It will be in tomorrow's -next.

Cheers,
Miguel

2019-09-20 16:24:55

by Segher Boessenkool

[permalink] [raw]
Subject: Re: [PATCH v2 4/6] compiler-gcc.h: add asm_inline definition

On Thu, Sep 12, 2019 at 02:54:50PM -0700, Nick Desaulniers wrote:
> I have seen instances where instruction selection fails to select the
> appropriate way to branch when inline asm size is misjudged, resulting
> in un-encodeable jumps (as in the branch target is too far to be
> encoded in the number of bits of a single jump/branch instruction).

"asm inline" *only* influences the estimated size *for inlining
purposes*. Not for anything else. Everything else still uses
conservative estimates.


Segher