Date: Wed, 19 Oct 2016 10:18:43 +0200 (CEST)
From: Richard Biener <rguenther@suse.de>
To: "Luis R. Rodriguez" <mcgrof@kernel.org>
cc: Vegard Nossum <vegard.nossum@oracle.com>,
        Peter Zijlstra <peterz@infradead.org>, Jiri Slaby <jslaby@suse.cz>,
        linux-kernel@vger.kernel.org,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Linus Torvalds <torvalds@linux-foundation.org>, stable@vger.kernel.org,
        Ming Lei <ming.lei@canonical.com>,
        Steven Rostedt <srostedt@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
        Josh Poimboeuf <jpoimboe@redhat.com>,
        Cesar Eduardo Barros <cesarb@cesarb.eti.br>,
        Michael Matz <matz@suse.de>, David Miller <davem@davemloft.net>,
        Guenter Roeck <linux@roeck-us.net>,
        Fengguang Wu <fengguang.wu@intel.com>, Borislav Petkov <bp@alien8.de>,
        Boris Ostrovsky <boris.ostrovsky@oracle.com>,
        Juergen Gross <jgross@suse.com>, Kees Cook <keescook@chromium.org>,
        Arnaldo Carvalho de Melo <acme@kernel.org>,
        Ingo Molnar <mingo@kernel.org>, Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [PATCH 01/12] extarray: define helpers for arrays defined in
 linker scripts
In-Reply-To: <20161018211803.GV8651@wotan.suse.de>
Message-ID: <alpine.LSU.2.11.1610190949310.2258@t29.fhfr.qr>
References: <20161016151616.31451-1-vegard.nossum@oracle.com> <20161016151616.31451-2-vegard.nossum@oracle.com> <20161017083315.GA29322@worktop.vlan200.pylonone.local> <186f8242-3f8d-31cd-a8e8-9743bbc1c1fd@suse.cz> <20161017090930.GT3142@twins.programming.kicks-ass.net>
 <55e00c01-2da8-8d06-1d05-9ebf775736ec@oracle.com> <20161017114517.GQ3117@twins.programming.kicks-ass.net> <55b3cbe0-f8fc-6505-411d-5f050d3414cc@oracle.com> <20161018211803.GV8651@wotan.suse.de>
User-Agent: Alpine 2.11 (LSU 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 15065
Lines: 352

On Tue, 18 Oct 2016, Luis R. Rodriguez wrote:

> On Tue, Oct 18, 2016 at 10:08:44AM +0200, Vegard Nossum wrote:
> >
> 
> Vegard, thanks for bringing this to attention!
> 
> Including this hunk for those that were originally not CC'd
> on the original patch.
> 
> > diff --git a/include/linux/extarray.h b/include/linux/extarray.h
> > new file mode 100644
> > index 0000000..1185abc 
> > --- /dev/null
> > +++ b/include/linux/extarray.h
> > @@ -0,0 +1,65 @@
> > +#ifndef LINUX_EXTARRAY_H
> > +#define LINUX_EXTARRAY_H
> > +
> > +#include <linux/compiler.h>
> > +
> > +/*
> > + * A common pattern in the kernel
> 
> This is quite an understatement, as you noted on the cover letter, I've been
> reviewing these uses, in particular where such uses are used also for the
> linker script for quite some time now. I've devised a generic API for these
> uses then to help with making it easier to add new entries, making easier
> to avoid typos, and giving us some new features from this effort. This the
> linker table and section range APIs [0] [1]. Given my review of all this code in
> the kernel I'd say this use is not just common, its a well established practice
> by now, across *all* architectures.
> 
> In fact I would not be surprised if this usage and practice has extended far
> beyond the kernel by now, and custom firmwares / linker scripts also use this
> to mimic the practice on the kernel to take advantage of this old hack to stuff
> special code / data structures into special ELF sections. As such, I would not
> think this is just an issue for Linux.
> 
> Upon a quick cursory review I can confirm such use is prevalent in Xen as well,
> for instance the Xen Security Module framework uses it since eons ago [2]. Its
> used elsewhere on the Xen linker script too though... so XSM is one small
> example at a *quick* glance.
> 
> [0] https://lkml.kernel.org/r/1471642875-5957-1-git-send-email-mcgrof@kernel.org
> [1] https://lkml.kernel.org/r/1471642454-5679-1-git-send-email-mcgrof@kernel.org
> [2] https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=d046f361dc937d8fc179cc2da168f571726cb5a0
> 
> > +   is to put certain objects in a specific
> > + * named section and then create variables in the linker script pointing
> > + * to the start and the end of this section. These variables are declared
> > + * as extern arrays to allow C code to iterate over the list of objects
> 
> Right, sometimes its just a char pointer to represent code, other times it
> custom data structure.
> 
> > + *
> > + * In C, comparing pointers to objects in two different arrays is undefined
> > + * GCC version 7.0 and newer (commit 73447cc5d17) will aggressively optimize
> > + * out such comparisons if it can prove that the two pointers point to
> > + * different arrays (which is the case when the arrays are declared as two
> > + * separate variables). This breaks the typical code used to iterate over
> > + * such arrays.
> 
> Eek, thanks, I checked commit 73447cc5d17 on gcc [2] however its commit log
> is pretty vague to me and does not seem to justify why exactly this optimization
> is done, nor what effort was put in to vet for it, as such I cannot agree or
> disagree with it. Logically though, to me these are just pointers, as such,
> its not clear to me why gcc can take such optimization to make such assumptions.
> 
> Since I cannot infer any more from this commit, and given how prevalent I
> expect this use to be (clearly even outside of Linux) I am inclined to consider
> this a gcc bug, which requires at least an opt-in optimization rather than this
> being a default. Had anyone reported this ??
> 
> Richard ?

The commit implements a long-standing failure to optimize trivial pointer
comparisons that arise for example from libstdc++.  PR65686 contains
a simple C example:

mytype f(struct S *e)
{
  mytype x;
  if(&x != e->pu)
    __builtin_memcpy(&x, e->pu, sizeof(unsigned));
  return x;
}

where GCC before the commit could not optimize the &x != e->pu test
as trivial false.

The commit uses points-to analysis results to simplify pointer equality
tests (which is in itself a very good way to expose bugs in points-to
analysis -- I've fixed a bunch of them thanks to this ;)).

> [2] https://github.com/gcc-mirror/gcc/commit/73447cc5d17
> 
> > + *
> > + * One way to get around this limitation is to force GCC to lose any array
> > + * information about the pointers before we compare them. We can use e.g.
> > + * OPTIMIZER_HIDE_VAR() for this.
> 
> As far as Linux is concerned though your patch set addresses covering a few
> cases, it does not cover all, so while it might help boot x86 on some type of
> system, by no means would I expect it suffice to boot all Linux systems.
> Additionally,  while  it may  *fix*  boot on x86, are we certain the use of
> OPTIMIZER_HIDE_VAR() may not *break* on certain platforms ? I would ask such
> type of intrusive changes which affect the linker script to go well beyond
> just 0-day for testing -- Guenter Roeck was kind enough to let me test my
> series for linker table / section ranges on his infrastructure which covers
> much architectures and toolchains not addressed by 0-day. I'd expect such type
> of testing for these types of changes, which can affect many architectures.
> 
> Additionally you asked for this to series to be considered a stable
> patch, if this just fixes x86 boot, but breaks other architecture it would
> be a considerable regression. I'd prefer we first determine if we want this
> change reverted on gcc, or if it at least can be disabled by default.
> I really do expect shit to hit the fan elsewhere, so this work around
> doesn't same like the next best step at this point.
> 
> > + *
> > + * This file defines a few helpers to deal with declaring and accessing
> > + * such linker-script-defined arrays.
> > + */
> > +
> > +
> > +#define DECLARE_EXTARRAY(type, name)                                   \
> > +       extern type __start_##name[];                                   \
> > +       extern type __stop_##name[];                                    \
> > +
> > +#define _ext_start(name, tmp) \
> > +       ({                                                              \
> > +               typeof(*__start_##name) *tmp = __start_##name;          \
> > +               OPTIMIZER_HIDE_VAR(tmp);                                \
> > +               tmp;                                                    \
> > +       })
> > +
> > +#define ext_start(name) _ext_start(name, __UNIQUE_ID(ext_start_))
> > +
> > +#define _ext_end(name, tmp)                                            \
> > +       ({                                                              \
> > +               typeof(*__stop_##name) *tmp = __stop_##name;            \
> > +               OPTIMIZER_HIDE_VAR(tmp);                                \
> > +               tmp;                                                    \
> > +       })
> > +
> > +#define ext_end(name) _ext_end(name, __UNIQUE_ID(ext_end_))
> > +
> > +#define _ext_size(name, tmp1, tmp2)                                    \
> > +       ({                                                              \
> > +               typeof(*__start_##name) *tmp1 = __start_##name;         \
> > +               typeof(*__stop_##name) *tmp2 = __stop_##name;           \
> > +               OPTIMIZER_HIDE_VAR(tmp1);                               \
> > +               OPTIMIZER_HIDE_VAR(tmp2);                               \
> > +               tmp2 - tmp1;                                            \
> > +       })
> > +
> > +#define ext_size(name) \
> > +       _ext_size(name, __UNIQUE_ID(ext_size1_), __UNIQUE_ID(ext_size2_))
> > +
> > +#define ext_for_each(var, name) \
> > +       for (var = ext_start(name); var != ext_end(name); var++)
> > +
> > +#endif
> 
> FWIW, with linker able we'd do this type of "fix" in one place later,
> if we wanted it, provided all uses are ported, of course.
> 
> > On 10/17/2016 01:45 PM, Peter Zijlstra wrote:
> > > On Mon, Oct 17, 2016 at 01:27:08PM +0200, Vegard Nossum wrote:
> > > > On 10/17/2016 11:09 AM, Peter Zijlstra wrote:
> > > > > On Mon, Oct 17, 2016 at 11:01:13AM +0200, Jiri Slaby wrote:
> > > > > > On the top of that, it's incorrect C according to the standard.
> > > > > 
> > > > > According to the standard non of the kernel has any chance in hell of
> > > > > working, so don't pretend you care about that :-)
> > > > 
> > > > I think that's a bit of a false dilemma. It's obviously true that kernel
> > > > code does not conform to the standards, but that doesn't mean it's not
> > > > something we should strive towards or care about in general. It helps
> > > > static analysis tools, compiler diversity, etc.
> > > 
> > > Sure, but this, two separately allocated objects their address should
> > > not be compared and therefore... stuff is explicitly relied upon by the
> > > kernel in many places.
> > > 
> > > We have workarounds in various places, and this patch adds yet another
> > > instance of it.
> > > 
> > > The workaround is simply confusing the compiler enough to have it not do
> > > the 'optimization'. But we very much still rely on this 'undefined'
> > > behaviour.
> > > 
> > > I think it makes more sense to explicitly allow it than to obfuscate our
> > > code and run the risk a future compiler will see through our tricks.
> > 
> > Actually, I think we're all a bit wrong.
> > 
> > It's not comparing the pointers that's undefined behavior, that was my
> > bad trying to oversimplify the issue.
> > 
> > Of course, comparing arbitrary (valid) pointers with each other is not
> > undefined behavior. The undefined behavior is technically doing iter++
> > past the end of the array,
> 
> What defines the end of the array?
> 
> > that is "creating" a pointer that points outside an array.
> 
> I mean, its just a pointer.
> 
> This is the sort of information that would be useful for the commit log.
> 
> > What gcc does wrong is that it sees us iterating over one array and the
> > optimizer concludes that the iterator can never point to the second
> > array.
> 
> Which second array? Why does it make this huge assumption ?
> 
> > I'd argue there's no real undefined behaviour happening here.
> > Thus, the code _is_ correct and valid C as it stands, it just doesn't do
> > what you would expect intuitively.
> 
> People have relied on its functionality for years, if this is going
> to change it would be I think a good idea to at least have a strong
> justification rather than having us trying to interpret the original
> goal of the gcc patch.

The original goal of the gcc patch wasn't to break the kernel.  The
goal was to implement an optimization other compilers do since a long
time.

> > However, from the linker script's point of view there is no difference
> > between one big array and two consecutive arrays of the same type, and
> > the fact that the compiler doesn't know the memory layout -- although
> > we do.
> 
> In Linux we don't mix/match pointer types, we typically use two char *
> pointers for start/end of code, or a data structure pointers for start/
> end of list, that's it. Its not clear to me why gcc believes it is correct
> to assume start != end.
> 
> > When we call OPTIMIZER_HIDE_VAR() we're not really "confusing" the
> > compiler, it's more like calling a function defined in a different file
> > (therefore the compiler can't see into it) that returns a pointer which
> > we _know_ is valid because it still points to (the end of) the array.
> 
> Its a hack to work around the optimization, if we are to do this I think
> its important we all first understand *why* the original commit was done,
> without this - it would seem the current optimization will just go breaking
> quite a bit of projects. Your changeset would also require much more work
> (or we port things to the linker table / section ranges API, and just do the
> fix with those wrappers, as it would mean 1 collateral evolution rather than 2
> -- one for this fix and then one for the linker table API).
> 
> > If we obtain a pointer somehow (be it from a hardware register or from
> > the stack of a userspace process), the compiler must assume that it's a
> > valid pointer, right? The only reason it didn't do that for the arrays
> > was because we had declared the two arrays as separate variables so it
> > "knew" it was the case that the iterator initialised to point to one
> > array could never point to the second array.
> 
> But why is this invalid C all of a sudden with optimizations ?
> 
> foo.h:
> 
> extern char *start_foo;
> extern char *end_foo;
> 
> foo.c:
> 
> #include "foo.h"
> 
> char *str = "hello";                                                            
> 
> char *start_foo;
> char *end_foo;
> 
> int main(void)
> {
> 	start_foo = str;                                                        
> 	end_foo = str;                                                          
> 
> 	return !(start_foo == end_foo);
> }

Why should this be invalid?

Let's look at what is special about the kernel usage.  Looking
at GCC bug 77964 it is declaring

extern struct builtin_fw __start_builtin_fw[];
extern struct builtin_fw __end_builtin_fw[];

which are extern zero-sized arrays.  I suppose they are nowhere
actually defined but these symbols are created by the linker script only.

I can think of adding workarounds to GCC to try catching this
special case which would be treating a pointer to a extern
object of size zero (so you can't do this in C++ ...) as a
pointer to possibly any other global variable (given actual
data layout may make the pointer value equal to it).

Index: gcc/tree-ssa-structalias.c
===================================================================
--- gcc/tree-ssa-structalias.c  (revision 241327)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -2944,6 +2944,17 @@ get_constraint_for_ssa_var (tree t, vec<
          DECL_PT_UID (t) = DECL_UID (node->decl);
          t = node->decl;
        }
+
+      /* If this is an external zero-sized object consider it to
+        point to NONLOCAL as well.  */
+      if (DECL_EXTERNAL (t)
+         && (! DECL_SIZE (t) || integer_zerop (DECL_SIZE (t))))
+       {
+         cexpr.var = nonlocal_id;
+         cexpr.type = SCALAR;
+         cexpr.offset = 0;
+         results->safe_push (cexpr);
+       }
     }
 
   vi = get_vi_for_tree (t);

Note that any issues exposed by the pointer equality simplification
are possible issues in previous GCC with regard to alias analysis.

I notice we try to honor symbol interposition when directly comparing
__start_builtin_fw != __end_builtin_fw for example.  But we certainly
do not "honor" interposition for alias analysis for, say

extern int a[1];
extern int b[1];

where a store to a[0] is not considered aliasing b[0].

Richard.

> > Anyway, IANALL.
> 
> IGINYA - I guess I'm not young anymore, what's IANALL mean? :)
> 
>   Luis
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)