Date: Tue, 7 Mar 2017 09:27:50 +0100
From: Ingo Molnar <mingo@kernel.org>
To: hpa@zytor.com
Cc: Thomas Gleixner <tglx@linutronix.de>, Jiri Slaby <jslaby@suse.cz>,
        mingo@redhat.com, x86@kernel.org, jpoimboe@redhat.com,
        linux-kernel@vger.kernel.org,
        Boris Ostrovsky <boris.ostrovsky@oracle.com>,
        Juergen Gross <jgross@suse.com>, xen-devel@lists.xenproject.org,
        "Rafael J. Wysocki" <rjw@rjwysocki.net>,
        Len Brown <len.brown@intel.com>, Pavel Machek <pavel@ucw.cz>,
        linux-pm@vger.kernel.org,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Andrew Morton <akpm@linux-foundation.org>,
        Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH 01/10] x86: assembly, ENTRY for fn, GLOBAL for data
Message-ID: <20170307082750.GA1695@gmail.com>
References: <20170217104757.28588-1-jslaby@suse.cz>
 <20170301093855.GA27152@gmail.com>
 <alpine.DEB.2.20.1703011108220.4005@nanos>
 <20170301102754.GA13374@gmail.com>
 <39064F86-5BE2-417F-8A28-2B4CB5177D7D@zytor.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <39064F86-5BE2-417F-8A28-2B4CB5177D7D@zytor.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 5978
Lines: 155


* hpa@zytor.com <hpa@zytor.com> wrote:

> On March 1, 2017 2:27:54 AM PST, Ingo Molnar <mingo@kernel.org> wrote:

> >> > So how about using macro names that actually show the purpose, instead of 
> >> > importing all the crappy, historic, essentially randomly chosen debug 
> >> > symbol macro names from the binutils and older kernels?
> >> > 
> >> > Something sane, like:
> >> > 
> >> > 	SYM__FUNCTION_START
> >> 
> >> Sane would be:
> >> 
> >>      	SYM_FUNCTION_START
> >> 
> >> The double underscore is just not giving any value.
> >
> > So the double underscore (at least in my view) has two advantages:
> >
> > 1) it helps separate the prefix from the postfix.
> >
> > I.e. it's a 'symbols' namespace, and a 'function start', not the 'start' of a 
> > 'symbol function'.
> >
> > 2) It also helps easy greppability.
> >
> > Try this in latest -tip:
> >
> >    git grep e820__
> >
> > To see all the E820 API calls - with no false positives!
> >
> > 'git grep e820_' on the other hand is a lot less reliable...
> 
> IMO these little "namespace tricks" especially for small common macros like we 
> are talking about here make the code very frustrating to read, and even more to 
> write.  No one would design a programming language that way, and things like PROC 
> are really just substitutes for proper language features (and could even be as 
> assembly rather than cpp macros.)

This is a totally different thing from language keywords, which need to be short 
and concise for obvious reasons.

Keywords of languages get nested and are used all the time, and everyone needs to 
know them and they need to stay out of the way. The symbol start/end macros we are 
talking about here are _MUCH_ less common, and they are only ever used in a single 
nesting level:

        SYM__FUNC_START(some_kernel_asm_function)
        ...
        SYM__FUNC_END(some_kernel_asm_function)

Most kernel developers writing new assembly code rarely know these constructs by 
heart: they just look them up and carbon copy existing practices. And guess what, 
the 'looking them up' gets harder if the macro naming scheme is an idiosyncratic 
leftover from long ago.

Kernel developers _reading_ assembly code will know the exact purpose of the 
macros even less, especially if they are named in an ambiguous, illogical fashion.
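
To make the START/END pairing concrete, here is a rough sketch of what such a 
pair could expand to. This is only an illustration: the directive strings are an 
assumption about a typical GNU as expansion, not the kernel's actual 
definitions, and the helper function names are made up for the example. The 
expansions are written as C string literals so that they can be inspected:

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical expansions, kept as string literals so they can be printed
 * and compared. The directives are an assumption for illustration only.
 */
#define SYM__FUNC_START(name)			\
	".globl " #name "\n"			\
	".type " #name ", @function\n"		\
	#name ":\n"

#define SYM__FUNC_END(name)			\
	".size " #name ", .-" #name "\n"

/* Print what one START/END pair would emit around a function body. */
void sym_print_example(FILE *out)
{
	fputs(SYM__FUNC_START(some_kernel_asm_function), out);
	fputs("\t...\n", out);
	fputs(SYM__FUNC_END(some_kernel_asm_function), out);
}

/* Return 1 if an expansion contains the given text, 0 otherwise. */
int sym_expansion_names(const char *expansion, const char *text)
{
	return strstr(expansion, text) != NULL;
}
```

Calling sym_print_example(stdout) prints the assumed directives around a 
placeholder body. Both halves take the symbol name, so a reader can pair them 
up by eye.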

Furthermore, your suggestion of:

> PROC..ENDPROC, LOCALPROC..ENDPROC and DATA..ENDDATA.  Clear, unambiguous and 
> balanced.

is neither clear, nor unambiguous, nor balanced! I mean, it is the _exact_ 
opposite:

 - 'PROC' is actually ambiguous in the kernel source code context, as it clashes 
   with common abbreviations of 'procfs' and 'process'.

   It's also an unnecessary abbreviation of a word ('procedure') that is not 
   actually used a _single time_ in the C ISO/IEC 9899:TC2 standard - in all half 
   thousand+ pages of it. (!) Why the hell does this have to be used in the 
   kernel?

 - It's visually and semantically imbalanced, because some terms have an 'END' 
   prefix, but there's no matching 'START' or 'BEGIN' prefix for their 
   counterparts. This makes it easy to commit various symbol definition 
   termination errors, like:

	PROC(some_kernel_asm_function)
	...

   Here it's not obvious that it needs an ENDPROC. While if it's written as:

        SYM__FUNC_START(some_kernel_asm_function)
        ...

   ... it's pretty obvious at first sight that an '_END' is missing!

 - What you suggest also has senselessly missing underscores, which makes it 
   _more_ cluttered and less clear. We clearly don't have addtowaitqueue() and
   removefromwaitqueue() function names in the kernel, right? Why should we have
   'ENDPROC' and 'ENDDATA' macro names?

 - Hierarchical naming schemes generally tend to put the more generic category
   name first, not last. So it's:

	mutex_init()
	mutex_lock()
	mutex_unlock()
	mutex_trylock()

   It's _NOT_ the other way around:

	init_mutex()
	lock_mutex()
	unlock_mutex()
	trylock_mutex()

   The prefix naming scheme is easier to read both visually/typographically 
   (because it aligns vertically in a natural fashion so it's easier to pattern 
   match), and also semantically: because when reading it it's easy to skip the 
   line once your brain reads the generic category of 'mutex'.

   But with 'ENDPROC' my brain has to needlessly perform the following steps:

	- disambiguate the 'END' and the 'PROC'

	- fill in the missing underscore

	- and finally when arriving at the generic term 'PROC', discard it as
	  uninteresting

 - Short names have good use in programming languages, because everyone who uses
   that language knows what they are and they become a visual substitute for the
   language element.

   But assembly macros are _NOT_ a new language in this sense, they are actually
   more similar to library function names, where brevity is actually
   counterproductive and harmful, because short names are ambiguous and pollute
   the generic namespace. If you look at C library API function name best
   practices you'll see that the best ones are all hierarchically named and
   categorized, with the more generic category put first. They are unambiguously
   balanced even if that makes the names longer, they are clear, and they use
   underscores.
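
The point about balanced names can also be checked mechanically. Below is a 
minimal sketch of such a check, assuming the SYM__FUNC_START/SYM__FUNC_END 
spelling used in this mail and a single nesting level. The function names and 
the one-macro-per-line input format are hypothetical, this is not an existing 
kernel tool:

```c
#include <string.h>

#define SYM_NAME_MAX 64

/* Parse "<macro>(<name>)" after leading whitespace; return 1 on a match. */
static int parse_macro(const char *line, const char *macro, char *name)
{
	size_t len = strlen(macro);
	const char *end;

	line += strspn(line, " \t");
	if (strncmp(line, macro, len) != 0 || line[len] != '(')
		return 0;
	line += len + 1;
	end = strchr(line, ')');
	if (!end || end == line || (size_t)(end - line) >= SYM_NAME_MAX)
		return 0;
	memcpy(name, line, (size_t)(end - line));
	name[end - line] = '\0';
	return 1;
}

/*
 * Count START macros that are never closed, plus END macros that do not
 * close the currently open symbol. Assumes a single nesting level.
 */
int sym_count_unbalanced(const char *const *lines, size_t nr_lines)
{
	char open[SYM_NAME_MAX] = "";
	char name[SYM_NAME_MAX];
	int errors = 0;
	size_t i;

	for (i = 0; i < nr_lines; i++) {
		if (parse_macro(lines[i], "SYM__FUNC_START", name)) {
			if (open[0])
				errors++;	/* previous symbol never ended */
			strcpy(open, name);
		} else if (parse_macro(lines[i], "SYM__FUNC_END", name)) {
			if (strcmp(open, name) != 0)
				errors++;	/* END without matching START */
			open[0] = '\0';
		}
	}
	if (open[0])
		errors++;			/* input ended inside a symbol */
	return errors;
}
```

With a START/END pair the rule is trivial to state and to implement. With 
PROC/ENDPROC the same check works, but nothing in the opening name hints that a 
closing counterpart is expected.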

For all these reasons the naming scheme you suggest is one of the worst we could 
come up with! I mean, if I had to _intentionally_ engineer something as harmful as 
possible to readability and maintainability this would be pretty close to it...

I'm upset, because even a single minute of reflection should have told you all 
this. I mean, IMHO it's not even a close argument: your suggested naming scheme is 
bleeding from half a dozen mortal wounds...

I can be convinced to drop the double underscores (I seem to be in the minority 
regarding them), and I can be convinced that 'FUNC' is shorter and still easy to 
understand instead of 'FUNCTION', but other than that please stop the naming 
madness!

Thanks,

	Ingo