On 18.11.2017 06:14, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean <[email protected]> wrote:
>> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean <[email protected]> wrote:
>>>>
>>>> I am still getting the crash at d9e12200852d, I figured I would
>>>> double-check the "good" and "bad" kernels before starting a full bisect.
>>>
>>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>>
>> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.
>
> That's strange. With d9e12200852d the shuffle_seed variables won't
> ever actually get used. (i.e. I wouldn't expect the seed to change any
> behavior.)
>
> Can you confirm with something like this:
>
>
> for (i = 0; i < 4; i++) {
> seed[i] = shuffle_seed[i];
>
>
> You should see no reports of "Shuffling struct ..."
>
> And if it reports nothing, and you're on d9e12200852d, can you confirm
> that switching to a "good" seed fixes it? (If it _does_, then I
> suspect a build artifact being left behind or something odd like
> that.)
>
>>> Kees removed even the baseline "randomize pure function pointer
>>> structures", so at that commit, nothing should be randomized.
>>>
>>> But maybe the plugin code itself ends up confusing gcc somehow?
>>>
>>> Even when it doesn't actually do that "relayout_struct()" on the
>>> structure, it always does those TYPE_ATTRIBUTES() games.
>
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.
>
I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin
enabled.
This function is located in a fs/nfsd/nfs4xdr.c file.
The fault happened at "xdr_encode_hyper(p, exp->ex_path.mnt->mnt_sb->s_maxbytes)"
line, namely when accessing s_maxbytes.
exp->ex_path is of type struct path that has been annotated with
__randomize_layout.
It seems to me that this annotation isn't really taken into consideration
when compiling nfs4xdr.c.
This most likely results in dereferencing a value of exp->ex_path.dentry
instead of exp->ex_path.mnt. Then some member of struct dentry is
dereferenced as struct super_block to access its s_maxbytes member which
results in an oops if it happens to be an invalid pointer (which it was
in my case).
How to reproduce the problem statically (tested on current Linus's tree
and on 4.15.4, with gcc 7.3.0):
1) Enable RANDSTRUCT plugin,
2) Use a RANDSTRUCT seed that results in shuffling of struct path,
Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106".
3) make fs/nfsd/nfs4xdr.s and save the result,
4) Insert "#include <linux/compiler_types.h>" at the top of
fs/nfsd/nfs4xdr.c as the very first include directive.
5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3.
One can see that offsets used to access various members of struct path are
different, and also that the original file from step 3 contains an object
named "__randomize_layout".
This is caused by a fact that the current version of nfs4xdr.c includes
linux/fs_struct.h as the very first included header which then includes
linux/path.h as the very first included header, which then defines
struct path, but without including any files on its own.
This results in __randomize_layout tag at the end of struct path
definition being treated as a variable name (since linux/compiler-gcc.h
that defines it as a type attribute has not been included yet).
It looks like to me that every header file that defines a randomized
struct also has to include linux/compiler_types.h or some other file
that ultimately results in that file inclusion in order to make
the RANDSTRUCT plugin work correctly.
Maciej
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
<[email protected]> wrote:
>
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".
Whee.
Thanks for root-causing this issue, and this syntax of ours is clearly
*much* too fragile.
We actually have similar issues with some of our other attributes,
where out nice "helpful" attribute shorthand can end up being just
silently interpreted as a variable name if they aren't defined in
time.
For most of our other attributes, it just doesn't matter all that much
if some user doesn't happen to see the attribute. For
__randomize_layout, it's obviously very fatal, and silently just
generates crazy code.
I'm not entirely sure what the right solution is, because it's
obviously much too easy to miss some #include by mistake. It's easy to
say "you should always include the proper header", but if a failure to
do so doesn't end up with any warnings or errors, but just silent bad
code generation, it's much too fragile.
I wonder if we could change the syntax of that "__randomize_layout"
thing. Some of our related helper macros (ie
randomized_struct_fields_start/end) don't have the same problem,
because if you don't have the define for them, the compiler will
complain about bad syntax.
And other attribute specifiers we encourage people to put in other
parts of the type, like __user etc, so they don't have that same
parsing issue.
I guess one _extreme_ fix for this would be to put
extern struct nostruct __randomize_layout;
in our include/linux/kconfig.h, which I think we end up always
including first thanks to having it on the command line.
Because if you do that, you actually get an error:
CC [M] fs/nfsd/nfs4xdr.o
In file included from ./include/linux/fs_struct.h:5:0,
from fs/nfsd/nfs4xdr.c:36:
./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’
} __randomize_layout;
^~~~~~~~~~~~~~~~~~
In file included from <command-line>:0:0:
././include/linux/kconfig.h:8:28: note: previous declaration of
‘__randomize_layout’ was here
extern struct nostruct __randomize_layout;
^~~~~~~~~~~~~~~~~~
make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
and we would have figured this out immediately.
Broken example patch appended, in case somebody wants to play with
something like this or comes up with a better model entirely..
Linus
---
diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
index fec5076eda91..537dacb83380 100644
--- a/include/linux/kconfig.h
+++ b/include/linux/kconfig.h
@@ -4,6 +4,10 @@
#include <generated/autoconf.h>
+#ifndef __ASSEMBLY__
+ extern struct nostruct __randomize_layout;
+#endif
+
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
<[email protected]> wrote:
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".
>
> This is caused by a fact that the current version of nfs4xdr.c includes
> linux/fs_struct.h as the very first included header which then includes
> linux/path.h as the very first included header, which then defines
> struct path, but without including any files on its own.
>
> This results in __randomize_layout tag at the end of struct path
> definition being treated as a variable name (since linux/compiler-gcc.h
> that defines it as a type attribute has not been included yet).
Oh, well done! That would explain the code offset I was seeing when
the plugin on, but no-op, since the variable would still exist.
I'll play with Linus's suggestion and see what we get.
Thanks!
-Kees
--
Kees Cook
Pixel Security
On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook <[email protected]> wrote:
>
> I'll play with Linus's suggestion and see what we get.
It may be just as well to just include <linux/compiler_types.h> from
<linux/kconfig.h> and be done with it.
If you look at that hacky script I documented in commit 23c35f48f5fb
("pinctrl: remove include file from <linux/device.h>") and run it in a
fully built kernel tree, you'll see that that header is included from
pretty much every single file anyway. At least for me, for an
allmodconfig build, the top headers are
23322 arch/x86/include/uapi/asm/types.h
23322 include/asm-generic/int-ll64.h
23322 include/linux/types.h
23322 include/uapi/asm-generic/int-ll64.h
23322 include/uapi/asm-generic/types.h
23322 include/uapi/linux/types.h
23323 arch/x86/include/uapi/asm/bitsperlong.h
23323 include/asm-generic/bitsperlong.h
23323 include/uapi/asm-generic/bitsperlong.h
23326 include/linux/stringify.h
23390 include/linux/compiler_types.h
and considering that I have 25949 object files in that tree, it really
means that just about every compile ended up including that
<linux/compiler_types.h> file anyway (yeah, the "orc_types.h" header
ends up being mentioned twice for most files, so it looks even more
hot, but that's not real data).
I do hate including unnecessary stuff because it makes builds slower,
but kernel header files probably don't get much more core than
<linux/compiler_types.h>.
Linus
On Wed, Feb 21, 2018 at 2:47 PM, Linus Torvalds
<[email protected]> wrote:
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.
Looking at other attributes we use on structs, we may have similar
risks for these:
__packed
____cacheline_aligned
____cacheline_aligned_in_smp
____cacheline_internodealigned_in_smp
But they just haven't been used in places that we could trip over it
as badly, AFAICT.
> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.
We could do that for all the above, but I wonder if the real problem
is our convention of using "regular" names for these kinds of
attributes instead of parameterized names. If we always used something
like:
#define __struct(x) __attribute__(x)
We'd avoid it, but we'd uglify our struct attributes:
struct thing { ... } __struct(randomize_layout);
though trying this now creates other problems. Hmmm.
(Regardless, let me send the nfs fix separately...)
-Kees
>
> Because if you do that, you actually get an error:
>
> CC [M] fs/nfsd/nfs4xdr.o
> In file included from ./include/linux/fs_struct.h:5:0,
> from fs/nfsd/nfs4xdr.c:36:
> ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’
> } __randomize_layout;
> ^~~~~~~~~~~~~~~~~~
> In file included from <command-line>:0:0:
> ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
> extern struct nostruct __randomize_layout;
> ^~~~~~~~~~~~~~~~~~
> make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
> Linus
>
> ---
>
> diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
> index fec5076eda91..537dacb83380 100644
> --- a/include/linux/kconfig.h
> +++ b/include/linux/kconfig.h
> @@ -4,6 +4,10 @@
>
> #include <generated/autoconf.h>
>
> +#ifndef __ASSEMBLY__
> + extern struct nostruct __randomize_layout;
> +#endif
> +
> #define __ARG_PLACEHOLDER_1 0,
> #define __take_second_arg(__ignored, val, ...) val
--
Kees Cook
Pixel Security
On Wed, Feb 21, 2018 at 3:24 PM, Linus Torvalds
<[email protected]> wrote:
> On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook <[email protected]> wrote:
>>
>> I'll play with Linus's suggestion and see what we get.
>
> It may be just as well to just include <linux/compiler_types.h> from
> <linux/kconfig.h> and be done with it.
Hah, yeah, that would certainly solve it too. :)
> I do hate including unnecessary stuff because it makes builds slower,
> but kernel header files probably don't get much more core than
> <linux/compiler_types.h>.
It also has the benefit of not letting it "go wrong" in the first
place. (And the separate fix for nfs isn't needed...)
Do you want me to send the patch for this, or do you already have it
prepared? The body-fields I had prepared for the nfs were:
Reported-by: Patrick McLean <[email protected]>
Reported-by: Maciej S. Szmigiero <[email protected]>
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")
-Kees
--
Kees Cook
Pixel Security
On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook <[email protected]> wrote:
>
> Do you want me to send the patch for this, or do you already have it
> prepared?
I'd rather get something explicitly tested. I tried my earlier patch
with "make allmodconfig" (and a fix to nfsd to make it compile), but
now I'm back to testing hjl's gas updates so it would be better to get
a tested commit with a good commit message.
> The body-fields I had prepared for the nfs were:
>
> Reported-by: Patrick McLean <[email protected]>
> Reported-by: Maciej S. Szmigiero <[email protected]>
Oh, I think Maciej needs to get more than a "Reported-by:". This was a
really subtle thing that we didn't figure out in the original thread,
so give him a gold star in the form of "Root-caused-by:" or something.
*Fixing* this ends up being a one-liner or so. Finding the cause was
the painful part.
Linus
On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
<[email protected]> wrote:
> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook <[email protected]> wrote:
>>
>> Do you want me to send the patch for this, or do you already have it
>> prepared?
>
> I'd rather get something explicitly tested. I tried my earlier patch
> with "make allmodconfig" (and a fix to nfsd to make it compile), but
> now I'm back to testing hjl's gas updates so it would be better to get
> a tested commit with a good commit message.
>
>> The body-fields I had prepared for the nfs were:
>>
>> Reported-by: Patrick McLean <[email protected]>
>> Reported-by: Maciej S. Szmigiero <[email protected]>
>
> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
> really subtle thing that we didn't figure out in the original thread,
> so give him a gold star in the form of "Root-caused-by:" or something.
Oops, I just sent this out. I will adjust a re-send. I couldn't find a
documented field name for this...
> *Fixing* this ends up being a one-liner or so. Finding the cause was
> the painful part.
Yes indeed!
-Kees
--
Kees Cook
Pixel Security
On Wed, Feb 21, 2018 at 4:23 PM, Kees Cook <[email protected]> wrote:
> On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
> <[email protected]> wrote:
>> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook <[email protected]> wrote:
>>>
>>> Do you want me to send the patch for this, or do you already have it
>>> prepared?
>>
>> I'd rather get something explicitly tested. I tried my earlier patch
>> with "make allmodconfig" (and a fix to nfsd to make it compile), but
>> now I'm back to testing hjl's gas updates so it would be better to get
>> a tested commit with a good commit message.
>>
>>> The body-fields I had prepared for the nfs were:
>>>
>>> Reported-by: Patrick McLean <[email protected]>
>>> Reported-by: Maciej S. Szmigiero <[email protected]>
>>
>> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
>> really subtle thing that we didn't figure out in the original thread,
>> so give him a gold star in the form of "Root-caused-by:" or something.
>
> Oops, I just sent this out. I will adjust a re-send. I couldn't find a
> documented field name for this...
With the "root-cause" hint, I see we have used:
2 Root-cause-analysis-by:
2 Root-caused-by:
1 Root-cause-found-by:
I'll go with your "Root-caused-by" to tip the scale. :)
-Kees
--
Kees Cook
Pixel Security
Hi Linus,
2018-02-22 7:47 GMT+09:00 Linus Torvalds <[email protected]>:
> On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
> <[email protected]> wrote:
>>
>> One can see that offsets used to access various members of struct path are
>> different, and also that the original file from step 3 contains an object
>> named "__randomize_layout".
>
> Whee.
>
> Thanks for root-causing this issue, and this syntax of ours is clearly
> *much* too fragile.
>
> We actually have similar issues with some of our other attributes,
> where out nice "helpful" attribute shorthand can end up being just
> silently interpreted as a variable name if they aren't defined in
> time.
>
> For most of our other attributes, it just doesn't matter all that much
> if some user doesn't happen to see the attribute. For
> __randomize_layout, it's obviously very fatal, and silently just
> generates crazy code.
>
> I'm not entirely sure what the right solution is, because it's
> obviously much too easy to miss some #include by mistake. It's easy to
> say "you should always include the proper header", but if a failure to
> do so doesn't end up with any warnings or errors, but just silent bad
> code generation, it's much too fragile.
>
> I wonder if we could change the syntax of that "__randomize_layout"
> thing. Some of our related helper macros (ie
> randomized_struct_fields_start/end) don't have the same problem,
> because if you don't have the define for them, the compiler will
> complain about bad syntax.
>
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.
>
> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.
>
> Because if you do that, you actually get an error:
>
> CC [M] fs/nfsd/nfs4xdr.o
> In file included from ./include/linux/fs_struct.h:5:0,
> from fs/nfsd/nfs4xdr.c:36:
> ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’
> } __randomize_layout;
> ^~~~~~~~~~~~~~~~~~
> In file included from <command-line>:0:0:
> ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
> extern struct nostruct __randomize_layout;
> ^~~~~~~~~~~~~~~~~~
> make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
> Linus
>
Sorry for chiming in late.
I noticed this thread today,
honestly, the commit made me upset.
Can I suggest another way to make it less fragile?
__attribute((...)) can be placed after 'struct'.
So, we can write:
struct __randomize_layout path {
struct vfsmount *mnt;
struct dentry *dentry;
};
instead of
struct path {
struct vfsmount *mnt;
struct dentry *dentry;
} __randomize_layout;
If we force the former notation,
the undefined __randomize_layout results in a build error
instead of silent broken code generation.
It is true somebody can still place
__randomize_layout after the closing brace,
but can we check this by coccicheck or checkpatch.pl?
(we can describe it in coding style documentation, of course)
IMHO, we should not (ab)use include/linux/kconfig.h
to bring in misc things.
--
Best Regards
Masahiro Yamada
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
<[email protected]> wrote:
> Sorry for chiming in late.
>
> I noticed this thread today,
> honestly, the commit made me upset.
>
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.
>
>
> So, we can write:
>
>
> struct __randomize_layout path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> };
>
>
> instead of
>
>
> struct path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> } __randomize_layout;
Ugh. I had tried this after the struct _name_, not after "struct"
itself. This does fix it, though it remains fragile, as you mention.
> If we force the former notation,
> the undefined __randomize_layout results in a build error
> instead of silent broken code generation.
>
>
> It is true somebody can still place
> __randomize_layout after the closing brace,
> but can we check this by coccicheck or checkpatch.pl?
> (we can describe it in coding style documentation, of course)
>
>
> IMHO, we should not (ab)use include/linux/kconfig.h
> to bring in misc things.
I'm happy to send a patch that reverts the other changes and relocates
all the markings...
Linus, how would you like this to go?
-Kees
--
Kees Cook
Pixel Security
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
<[email protected]> wrote:
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.
That avoids the actual bug, but it wouldn't have helped _find_ the
problem in the first place.
If somebody ever does the same thing, they'd hit the same issue. And
it's not just __randomize_struct, it's any of our other type markers.
We can say "don't do that", but if there is no automated checking,
it's still ripe to cause problems just because somebody didn't notice.
So I'd rather have something that causes a build failure when
something goes wrong, rather than silently accepting syntax that
wasn't intended.
Linus