2020-05-21 17:34:03

by Stephen Rothwell

Subject: linux-next: build failure after merge of the tip tree

Hi all,

After merging the tip tree, all my linux-next builds took significantly
longer and used much more memory. In some cases, builds would seg fault
due to running out of memory :-(

I have eventually bisected it to commit

cdd28ad2d811 ("READ_ONCE: Use data_race() to avoid KCSAN instrumentation")

For my (e.g.) x86_64 allmodconfig builds (cross compiled on PowerPC le,
-j80) the elapsed time went from around 9 minutes to over 17 minutes
and the maximum resident size (as reported by /usr/bin/time) from around
500M to around 2G (I saw lots of cc1 processes over 2G in size).
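
For anyone wanting to reproduce this kind of measurement without
/usr/bin/time, here is a minimal sketch in Python. It is not what was
used for the numbers above; the helper name measure() is made up for
illustration, and it assumes Linux, where ru_maxrss is reported in
kilobytes.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd to completion; return (elapsed_seconds, max_rss_kib, exit_code)."""
    start = time.monotonic()
    proc = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    # For RUSAGE_CHILDREN, ru_maxrss is the peak resident set size of the
    # largest waited-for descendant (KiB on Linux). It covers every child
    # this process has waited for so far, not only the command just run.
    max_rss_kib = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return elapsed, max_rss_kib, proc.returncode


if __name__ == "__main__":
    # Hypothetical usage: python3 measure.py make -j80 <your build target>
    # With no arguments, fall back to a trivial command.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    elapsed, rss, rc = measure(command)
    print(f"elapsed {elapsed:.1f}s, max RSS {rss / 1024:.0f} MiB, exit {rc}")
```

Because the figure is the maximum over single processes rather than a
sum, it corresponds to the size of the largest individual cc1, not to
the total memory used by a -j80 build.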

For tomorrow's linux-next (well, later today :-() I will revert that
commit (and its child) when I merge the tip tree.

--
Cheers,
Stephen Rothwell



2020-05-21 17:37:15

by Will Deacon

Subject: Re: linux-next: build failure after merge of the tip tree

Hi Stephen,

[+Marco and Boris]

On Fri, May 22, 2020 at 03:31:19AM +1000, Stephen Rothwell wrote:
> After merging the tip tree, all my linux-next builds took significantly
> longer and used much more memory. In some cases, builds would seg fault
> due to running out of memory :-(
>
> I have eventually bisected it to commit
>
> cdd28ad2d811 ("READ_ONCE: Use data_race() to avoid KCSAN instrumentation")
>
> For my (e.g.) x86_64 allmodconfig builds (cross compiled on PowerPC le,
> -j80) the elapsed time went from around 9 minutes to over 17 minutes
> and the maximum resident size (as reported by /usr/bin/time) from around
> 500M to around 2G (I saw lots of cc1 processes over 2G in size).
>
> For tomorrow's linux-next (well, later today :-() I will revert that
> commit (and its child) when I merge the tip tree.

Sorry about that, seems we can't avoid running into compiler problems with
this lot. The good news is that there's a series to fix this here:

https://lore.kernel.org/r/[email protected]

so hopefully this will be fixed in -tip soon (but I agree that reverting the
thing in -next in the meantime makes sense).

Will

2020-05-22 07:21:39

by Stephen Rothwell

Subject: Re: linux-next: build failure after merge of the tip tree

Hi Will,

On Thu, 21 May 2020 18:35:22 +0100 Will Deacon <[email protected]> wrote:
>
> [+Marco and Boris]
>
> On Fri, May 22, 2020 at 03:31:19AM +1000, Stephen Rothwell wrote:
> > After merging the tip tree, all my linux-next builds took significantly
> > longer and used much more memory. In some cases, builds would seg fault
> > due to running out of memory :-(
> >
> > I have eventually bisected it to commit
> >
> > cdd28ad2d811 ("READ_ONCE: Use data_race() to avoid KCSAN instrumentation")
> >
> > For my (e.g.) x86_64 allmodconfig builds (cross compiled on PowerPC le,
> > -j80) the elapsed time went from around 9 minutes to over 17 minutes
> > and the maximum resident size (as reported by /usr/bin/time) from around
> > 500M to around 2G (I saw lots of cc1 processes over 2G in size).
> >
> > For tomorrow's linux-next (well, later today :-() I will revert that
> > commit (and its child) when I merge the tip tree.
>
> Sorry about that, seems we can't avoid running into compiler problems with
> this lot. The good news is that there's a series to fix this here:
>
> https://lore.kernel.org/r/[email protected]
>
> so hopefully this will be fixed in -tip soon (but I agree that reverting the
> thing in -next in the meantime makes sense).

Unfortunately, the revert didn't work, so instead I have used the tip
tree from next-20200518 for today (hopefully this will all be sorted
out by Monday).

--
Cheers,
Stephen Rothwell



2020-05-22 07:51:42

by Stephen Rothwell

Subject: Re: linux-next: build failure after merge of the tip tree

Hi all,

On Fri, 22 May 2020 17:17:08 +1000 Stephen Rothwell <[email protected]> wrote:
>
> On Thu, 21 May 2020 18:35:22 +0100 Will Deacon <[email protected]> wrote:
> >
> > [+Marco and Boris]
> >
> > On Fri, May 22, 2020 at 03:31:19AM +1000, Stephen Rothwell wrote:
> > > After merging the tip tree, all my linux-next builds took significantly
> > > longer and used much more memory. In some cases, builds would seg fault
> > > due to running out of memory :-(
> > >
> > > I have eventually bisected it to commit
> > >
> > > cdd28ad2d811 ("READ_ONCE: Use data_race() to avoid KCSAN instrumentation")
> > >
> > > For my (e.g.) x86_64 allmodconfig builds (cross compiled on PowerPC le,
> > > -j80) the elapsed time went from around 9 minutes to over 17 minutes
> > > and the maximum resident size (as reported by /usr/bin/time) from around
> > > 500M to around 2G (I saw lots of cc1 processes over 2G in size).
> > >
> > > For tomorrow's linux-next (well, later today :-() I will revert that
> > > commit (and its child) when I merge the tip tree.
> >
> > Sorry about that, seems we can't avoid running into compiler problems with
> > this lot. The good news is that there's a series to fix this here:
> >
> > https://lore.kernel.org/r/[email protected]
> >
> > so hopefully this will be fixed in -tip soon (but I agree that reverting the
> > thing in -next in the meantime makes sense).
>
> Unfortunately, the revert didn't work, so instead I have used the tip
> tree from next-20200518 for today (hopefully this will all be sorted
> out by Monday).

And the rcu tree has merged part of the tip tree that contains the
offending commits, so I have used the version of the rcu tree from
next-20200519 for today.
--
Cheers,
Stephen Rothwell



2020-05-23 00:14:52

by Paul E. McKenney

Subject: Re: linux-next: build failure after merge of the tip tree

On Fri, May 22, 2020 at 05:49:44PM +1000, Stephen Rothwell wrote:
> Hi all,
>
> On Fri, 22 May 2020 17:17:08 +1000 Stephen Rothwell <[email protected]> wrote:
> >
> > On Thu, 21 May 2020 18:35:22 +0100 Will Deacon <[email protected]> wrote:
> > >
> > > [+Marco and Boris]
> > >
> > > On Fri, May 22, 2020 at 03:31:19AM +1000, Stephen Rothwell wrote:
> > > > After merging the tip tree, all my linux-next builds took significantly
> > > > longer and used much more memory. In some cases, builds would seg fault
> > > > due to running out of memory :-(
> > > >
> > > > I have eventually bisected it to commit
> > > >
> > > > cdd28ad2d811 ("READ_ONCE: Use data_race() to avoid KCSAN instrumentation")
> > > >
> > > > For my (e.g.) x86_64 allmodconfig builds (cross compiled on PowerPC le,
> > > > -j80) the elapsed time went from around 9 minutes to over 17 minutes
> > > > and the maximum resident size (as reported by /usr/bin/time) from around
> > > > 500M to around 2G (I saw lots of cc1 processes over 2G in size).
> > > >
> > > > For tomorrow's linux-next (well, later today :-() I will revert that
> > > > commit (and its child) when I merge the tip tree.
> > >
> > > Sorry about that, seems we can't avoid running into compiler problems with
> > > this lot. The good news is that there's a series to fix this here:
> > >
> > > https://lore.kernel.org/r/[email protected]
> > >
> > > so hopefully this will be fixed in -tip soon (but I agree that reverting the
> > > thing in -next in the meantime makes sense).
> >
> > Unfortunately, the revert didn't work, so instead I have used the tip
> > tree from next-20200518 for today (hopefully this will all be sorted
> > out by Monday).
>
> And the rcu tree has merged part of the tip tree that contains the
> offending commits, so I have used the version of the rcu tree from
> next-20200519 for today.

Please accept my apologies for my part of this problem! I don't see
the slowdowns on my normal test system (possibly due to gcc 4.8.5),
but I do see them on my laptop.

Marco, Thomas, is there any better setup I can provide Stephen? Or
is the next-20200519 -rcu tree the best we have right now?

Thanx, Paul

2020-05-23 06:54:36

by Borislav Petkov

Subject: Re: linux-next: build failure after merge of the tip tree

On Fri, May 22, 2020 at 05:12:23PM -0700, Paul E. McKenney wrote:
> Marco, Thomas, is there any better setup I can provide Stephen? Or
> is the next-20200519 -rcu tree the best we have right now?

I've queued the fixes yesterday into tip:locking/kcsan and tglx said
something about you having to rebase anyway. I guess you can find him on
IRC at some point later. :)

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-05-23 09:56:58

by Thomas Gleixner

Subject: Re: linux-next: build failure after merge of the tip tree

Borislav Petkov <[email protected]> writes:

> On Fri, May 22, 2020 at 05:12:23PM -0700, Paul E. McKenney wrote:
>> Marco, Thomas, is there any better setup I can provide Stephen? Or
>> is the next-20200519 -rcu tree the best we have right now?
>
> I've queued the fixes yesterday into tip:locking/kcsan and tglx said
> something about you having to rebase anyway. I guess you can find him on
> IRC at some point later. :)

locking/kcsan is not the problem (it just has more fixes on top)

core/rcu is the one which diverged and caused the merge conflict with
PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
rebase on the current version (which is not going to change again).

Thanks,

tglx

2020-05-23 15:08:32

by Paul E. McKenney

Subject: Re: linux-next: build failure after merge of the tip tree

On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
> Borislav Petkov <[email protected]> writes:
>
> > On Fri, May 22, 2020 at 05:12:23PM -0700, Paul E. McKenney wrote:
> >> Marco, Thomas, is there any better setup I can provide Stephen? Or
> >> is the next-20200519 -rcu tree the best we have right now?
> >
> > I've queued the fixes yesterday into tip:locking/kcsan and tglx said
> > something about you having to rebase anyway. I guess you can find him on
> > IRC at some point later. :)
>
> locking/kcsan is not the problem (it just has more fixes on top)
>
> core/rcu is the one which diverged and caused the merge conflict with
> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
> rebase on the current version (which is not going to change again).

So there will be another noinstr-rcu-* tag, and I will rebase on top
of that, correct? If so, fair enough!

Thanx, Paul

2020-05-23 19:08:21

by Thomas Gleixner

Subject: Re: linux-next: build failure after merge of the tip tree

"Paul E. McKenney" <[email protected]> writes:
> On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
>> core/rcu is the one which diverged and caused the merge conflict with
>> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
>> rebase on the current version (which is not going to change again).
>
> So there will be another noinstr-rcu-* tag, and I will rebase on top
> of that, correct? If so, fair enough!

Here you go: noinstr-rcu-220-05-23

I wanted this to be 2020 and not 220 but I noticed after pushing it
out. I guess it still does the job :)

Thanks,

tglx

2020-05-23 21:29:58

by Paul E. McKenney

Subject: Re: linux-next: build failure after merge of the tip tree

On Sat, May 23, 2020 at 09:05:24PM +0200, Thomas Gleixner wrote:
> "Paul E. McKenney" <[email protected]> writes:
> > On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
> >> core/rcu is the one which diverged and caused the merge conflict with
> >> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
> >> rebase on the current version (which is not going to change again).
> >
> > So there will be another noinstr-rcu-* tag, and I will rebase on top
> > of that, correct? If so, fair enough!
>
> Here you go: noinstr-rcu-220-05-23
>
> I wanted this to be 2020 and not 220 but I noticed after pushing it
> out. I guess it still does the job :)

Now -that- is what I call an old-school tag name!!! ;-)

I remerged, rebased, and pushed to -rcu branch "dev".

If it survives testing, I will reset -rcu branch "rcu/next" as well.

Thanx, Paul

2020-05-25 00:51:39

by Paul E. McKenney

Subject: Re: linux-next: build failure after merge of the tip tree

On Sat, May 23, 2020 at 02:23:45PM -0700, Paul E. McKenney wrote:
> On Sat, May 23, 2020 at 09:05:24PM +0200, Thomas Gleixner wrote:
> > "Paul E. McKenney" <[email protected]> writes:
> > > On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
> > >> core/rcu is the one which diverged and caused the merge conflict with
> > >> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
> > >> rebase on the current version (which is not going to change again).
> > >
> > > So there will be another noinstr-rcu-* tag, and I will rebase on top
> > > of that, correct? If so, fair enough!
> >
> > Here you go: noinstr-rcu-220-05-23
> >
> > I wanted this to be 2020 and not 220 but I noticed after pushing it
> > out. I guess it still does the job :)
>
> Now -that- is what I call an old-school tag name!!! ;-)
>
> I remerged, rebased, and pushed to -rcu branch "dev".
>
> If it survives testing, I will reset -rcu branch "rcu/next" as well.

And passed! The compile times are back to their old selves on my
laptop as well.

Thank you for setting this up, Thomas!!!

Thanx, Paul

2020-05-25 16:22:46

by Marco Elver

Subject: Re: linux-next: build failure after merge of the tip tree

On Mon, 25 May 2020 at 02:37, Paul E. McKenney <[email protected]> wrote:
>
> On Sat, May 23, 2020 at 02:23:45PM -0700, Paul E. McKenney wrote:
> > On Sat, May 23, 2020 at 09:05:24PM +0200, Thomas Gleixner wrote:
> > > "Paul E. McKenney" <[email protected]> writes:
> > > > On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
> > > >> core/rcu is the one which diverged and caused the merge conflict with
> > > >> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
> > > >> rebase on the current version (which is not going to change again).
> > > >
> > > > So there will be another noinstr-rcu-* tag, and I will rebase on top
> > > > of that, correct? If so, fair enough!
> > >
> > > Here you go: noinstr-rcu-220-05-23
> > >
> > > I wanted this to be 2020 and not 220 but I noticed after pushing it
> > > out. I guess it still does the job :)
> >
> > Now -that- is what I call an old-school tag name!!! ;-)
> >
> > I remerged, rebased, and pushed to -rcu branch "dev".
> >
> > If it survives testing, I will reset -rcu branch "rcu/next" as well.
>
> And passed! The compile times are back to their old selves on my
> laptop as well.
>
> Thank you for setting this up, Thomas!!!

I just noticed that -rcu and -tip both still have their own version of
"ubsan, kcsan: Don't combine sanitizer with kcov on clang". For there
to not be any conflicts in -next, "ubsan, kcsan: Don't combine
sanitizer with kcov on clang" could be dropped from -rcu.

Thanks,
-- Marco

2020-05-25 22:16:03

by Paul E. McKenney

Subject: Re: linux-next: build failure after merge of the tip tree

On Mon, May 25, 2020 at 10:20:29AM +0200, Marco Elver wrote:
> On Mon, 25 May 2020 at 02:37, Paul E. McKenney <[email protected]> wrote:
> >
> > On Sat, May 23, 2020 at 02:23:45PM -0700, Paul E. McKenney wrote:
> > > On Sat, May 23, 2020 at 09:05:24PM +0200, Thomas Gleixner wrote:
> > > > "Paul E. McKenney" <[email protected]> writes:
> > > > > On Sat, May 23, 2020 at 11:54:26AM +0200, Thomas Gleixner wrote:
> > > > >> core/rcu is the one which diverged and caused the merge conflict with
> > > > >> PPC to happen twice. So Paul needs to remove the stale core/rcu bits and
> > > > >> rebase on the current version (which is not going to change again).
> > > > >
> > > > > So there will be another noinstr-rcu-* tag, and I will rebase on top
> > > > > of that, correct? If so, fair enough!
> > > >
> > > > Here you go: noinstr-rcu-220-05-23
> > > >
> > > > I wanted this to be 2020 and not 220 but I noticed after pushing it
> > > > out. I guess it still does the job :)
> > >
> > > Now -that- is what I call an old-school tag name!!! ;-)
> > >
> > > I remerged, rebased, and pushed to -rcu branch "dev".
> > >
> > > If it survives testing, I will reset -rcu branch "rcu/next" as well.
> >
> > And passed! The compile times are back to their old selves on my
> > laptop as well.
> >
> > Thank you for setting this up, Thomas!!!
>
> I just noticed that -rcu and -tip both still have their own version of
> "ubsan, kcsan: Don't combine sanitizer with kcov on clang". For there
> to not be any conflicts in -next, "ubsan, kcsan: Don't combine
> sanitizer with kcov on clang" could be dropped from -rcu.

Thank you for spotting this! Yes, if it is already in -tip, I should
drop it. If this causes trouble for clang users working with -rcu, I
can always pull in the exact commit used in -tip.

Anyway, -rcu branch "dev" no longer contains this commit.

Thanx, Paul