2023-06-27 22:26:59

by Dinh Nguyen

[permalink] [raw]
Subject: [PATCH] Revert "nios2: Convert __pte_free_tlb() to use ptdescs"

This reverts commit 6ebe94baa2b9ddf3ccbb7f94df6ab26234532734.

The patch "nios2: Convert __pte_free_tlb() to use ptdescs" was supposed
to go together with a patchset that Vishal Moola had planned taking it
through the mm tree. By just having this patch, all NIOS2 builds are
broken.

Signed-off-by: Dinh Nguyen <[email protected]>
---
arch/nios2/include/asm/pgalloc.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h
index ce6bb8e74271..ecd1657bb2ce 100644
--- a/arch/nios2/include/asm/pgalloc.h
+++ b/arch/nios2/include/asm/pgalloc.h
@@ -28,10 +28,10 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,

extern pgd_t *pgd_alloc(struct mm_struct *mm);

-#define __pte_free_tlb(tlb, pte, addr) \
- do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+#define __pte_free_tlb(tlb, pte, addr) \
+ do { \
+ pgtable_pte_page_dtor(pte); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)

#endif /* _ASM_NIOS2_PGALLOC_H */
--
2.25.1



2023-06-27 23:10:58

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] Revert "nios2: Convert __pte_free_tlb() to use ptdescs"

On Tue, 27 Jun 2023 at 15:14, Dinh Nguyen <[email protected]> wrote:
>
> This reverts commit 6ebe94baa2b9ddf3ccbb7f94df6ab26234532734.
>
> The patch "nios2: Convert __pte_free_tlb() to use ptdescs" was supposed
> to go together with a patchset that Vishal Moola had planned taking it
> through the mm tree. By just having this patch, all NIOS2 builds are
> broken.

This is now at least the third time just this merge window where some
base tree was broken, and people thought that linux-next is some kind
of testing ground for it all.

NO!

Linux-next is indeed for testing, and for finding situations where
there are interactions between different trees.

But linux-next is *not* a replacement for "this tree has to work on
its own". THAT testing needs to be done independently, and *before* a
tree hits linux-next.

It is *NOT* ok to say "this will work in combination with that other
tree". EVERY SINGLE TREE needs to work on its own, because otherwise
you cannot bisect the end result sanely.

We apparently had the NIOS2 tree being broken. And the RCU tree was
broken. And the KUnit tree was broken.

In all those cases, the base tree did not compile properly on its own,
and linux-next "magically fixed" it by either having Stephen Rothwell
literally fix the build breakage by hand, or by having some other tree
hide the problem.

This is very much not ok.

I'm not sure why it happened so much this release, but this needs to
stop. People need to realize that you can't just throw shit at the
wall and see if it sticks. You need to test your own trees *first*,
and *independently* of other peoples trees.

Then, if you have done basic testing, you can then have it in
linux-next and that hopefully then finds any issues with bad
interactions with other trees, and maybe also ends up getting more
coverage testing on odd architectures and with odd configurations.

But linux-next must not in *any* way be a replacement for doing basic
testing on your own tree first.

Linus

2023-06-27 23:45:38

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] Revert "nios2: Convert __pte_free_tlb() to use ptdescs"

On 6/27/23 15:35, Linus Torvalds wrote:
> On Tue, 27 Jun 2023 at 15:14, Dinh Nguyen <[email protected]> wrote:
>>
>> This reverts commit 6ebe94baa2b9ddf3ccbb7f94df6ab26234532734.
>>
>> The patch "nios2: Convert __pte_free_tlb() to use ptdescs" was supposed
>> to go together with a patchset that Vishal Moola had planned taking it
>> through the mm tree. By just having this patch, all NIOS2 builds are
>> broken.
>
> This is now at least the third time just this merge window where some
> base tree was broken, and people thought that linux-next is some kind
> of testing ground for it all.
>
> NO!
>
> Linux-next is indeed for testing, and for finding situations where
> there are interactions between different trees.
>
> But linux-next is *not* a replacement for "this tree has to work on
> its own". THAT testing needs to be done independently, and *before* a
> tree hits linux-next.
>
> It is *NOT* ok to say "this will work in combination with that other
> tree". EVERY SINGLE TREE needs to work on its own, because otherwise
> you cannot bisect the end result sanely.
>
> We apparently had the NIOS2 tree being broken. And the RCU tree was
> broken. And the KUnit tree was broken.
>

Actually, this one is broken in linux-next as well because it was pulled
into it, but the context patches needed to make it work (compile) are not
there. It is also broken in next/pending-fixes for the same reason.

Only this happened so quick that by the time I noticed and reported
and argued that, no, I did not try to apply this patch on its own,
the pull request into mainline was already sent and applied.

Problem with linux-next is that it is so badly broken that it would take
a full-time position to track down all its failures. Then there are those
last-minute patches added in the week (or days) before the commit window
opens which break it again. This is one example, but there is at least
one more in linux-next (and pending-fixes); see
https://kerneltests.org/builders/next-sh-pending-fixes/builds/822/steps/buildcommand/logs/stdio

Guenter

> In all those cases, the base tree did not compile properly on its own,
> and linux-next "magically fixed" it by either having Stephen Rothwell
> literally fix the build breakage by hand, or by having some other tree
> hide the problem.
>
> This is very much not ok.
>
> I'm not sure why it happened so much this release, but this needs to
> stop. People need to realize that you can't just throw shit at the
> wall and see if it sticks. You need to test your own trees *first*,
> and *independently* of other peoples trees.
>
> Then, if you have done basic testing, you can then have it in
> linux-next and that hopefully then finds any issues with bad
> interactions with other trees, and maybe also ends up getting more
> coverage testing on odd architectures and with odd configurations.
>
> But linux-next must not in *any* way be a replacement for doing basic
> testing on your own tree first.
>
> Linus


2023-06-28 00:04:42

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] Revert "nios2: Convert __pte_free_tlb() to use ptdescs"

On Tue, Jun 27, 2023 at 03:35:45PM -0700, Linus Torvalds wrote:
> On Tue, 27 Jun 2023 at 15:14, Dinh Nguyen <[email protected]> wrote:
> >
> > This reverts commit 6ebe94baa2b9ddf3ccbb7f94df6ab26234532734.
> >
> > The patch "nios2: Convert __pte_free_tlb() to use ptdescs" was supposed
> > to go together with a patchset that Vishal Moola had planned taking it
> > through the mm tree. By just having this patch, all NIOS2 builds are
> > broken.
>
> This is now at least the third time just this merge window where some
> base tree was broken, and people thought that linux-next is some kind
> of testing ground for it all.
>
> NO!
>
> Linux-next is indeed for testing, and for finding situations where
> there are interactions between different trees.
>
> But linux-next is *not* a replacement for "this tree has to work on
> its own". THAT testing needs to be done independently, and *before* a
> tree hits linux-next.
>
> It is *NOT* ok to say "this will work in combination with that other
> tree". EVERY SINGLE TREE needs to work on its own, because otherwise
> you cannot bisect the end result sanely.
>
> We apparently had the NIOS2 tree being broken. And the RCU tree was
> broken. And the KUnit tree was broken.
>
> In all those cases, the base tree did not compile properly on its own,
> and linux-next "magically fixed" it by either having Stephen Rothwell
> literally fix the build breakage by hand, or by having some other tree
> hide the problem.
>
> This is very much not ok.
>
> I'm not sure why it happened so much this release, but this needs to
> stop. People need to realize that you can't just throw shit at the
> wall and see if it sticks. You need to test your own trees *first*,
> and *independently* of other peoples trees.
>
> Then, if you have done basic testing, you can then have it in
> linux-next and that hopefully then finds any issues with bad
> interactions with other trees, and maybe also ends up getting more
> coverage testing on odd architectures and with odd configurations.
>
> But linux-next must not in *any* way be a replacement for doing basic
> testing on your own tree first.

On the off-chance that it helps someone else avoid my stupid mistakes,
here is exactly how I messed this up so badly:

1. This API-name-change series went well, except for the usual
lagging changes. This *should* not be a problem, as you
simply leave the old API in however long it takes for the
change to get in.

2. At some point -next was a single-argument kfree_rcu()-free zone.
So I queued the offending commit on the -rcu tree's rcu/next
branch, followed by a revert for my own testing. The idea was
to make new uses fail in -next testing.

So far, so good.

3. I noticed that -next was now free of kfree_rcu() calls.

At this point, I made three stupid mistakes:

a. I failed to wait for mainline itself to be free of the
single-argument kfree_rcu(), thus pulling the offending
single-argument kfree_rcu() removal commit into my pull
request a merge window too soon. This is of course
especially stupid since I tend to send you the RCU pull
request early.

b. I failed to identify exactly which -next commit eliminated
single-argument kfree_rcu(). Had I done so, I would
have seen that this was in fact Stephen's rcu/next
merge commit, which was never going to go to mainline.

c. Worst yet, out of force of habit, I left the revert
from #2 above in my testing, thus failing to see the
-rcu failure due to that remaining single-argument
kfree_rcu() call.

So a combination of three stupid mistakes on my part made the RCU
happen.

As you say, testing *exactly* the commit heading up the pull request
merged with your master branch would have spotted this, and I will
of course make sure that I do this in the future.

And again, please accept my apologies for this mess.

Thanx, Paul

2023-06-28 13:27:50

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] Revert "nios2: Convert __pte_free_tlb() to use ptdescs"

On 6/27/23 16:23, Guenter Roeck wrote:
> On 6/27/23 15:35, Linus Torvalds wrote:
>> On Tue, 27 Jun 2023 at 15:14, Dinh Nguyen <[email protected]> wrote:
>>>
>>> This reverts commit 6ebe94baa2b9ddf3ccbb7f94df6ab26234532734.
>>>
>>> The patch "nios2: Convert __pte_free_tlb() to use ptdescs" was supposed
>>> to go together with a patchset that Vishal Moola had planned taking it
>>> through the mm tree. By just having this patch, all NIOS2 builds are
>>> broken.
>>
>> This is now at least the third time just this merge window where some
>> base tree was broken, and people thought that linux-next is some kind
>> of testing ground for it all.
>>
>> NO!
>>
>> Linux-next is indeed for testing, and for finding situations where
>> there are interactions between different trees.
>>
>> But linux-next is *not* a replacement for "this tree has to work on
>> its own". THAT testing needs to be done independently, and *before* a
>> tree hits linux-next.
>>
>> It is *NOT* ok to say "this will work in combination with that other
>> tree". EVERY SINGLE TREE needs to work on its own, because otherwise
>> you cannot bisect the end result sanely.
>>
>> We apparently had the NIOS2 tree being broken. And the RCU tree was
>> broken. And the KUnit tree was broken.
>>
>
> Actually, this one is broken in linux-next as well because it was pulled
> into it, but the context patches needed to make it work (compile) are not
> there. It is also broken in next/pending-fixes for the same reason.
>
> Only this happened so quick that by the time I noticed and reported
> and argued that, no, I did not try to apply this patch on its own,
> the pull request into mainline was already sent and applied.
>
> Problem with linux-next is that it is so badly broken that it would take
> a full-time position to track down all its failures. Then there are those
> last-minute patches added in the week (or days) before the commit window
> opens which break it again. This is one example, but there is at least
> one more in linux-next (and pending-fixes); see
> https://kerneltests.org/builders/next-sh-pending-fixes/builds/822/steps/buildcommand/logs/stdio
>

And now the broken (never compiled) patch made it into mainline
and breaks the sh:dreamcast_defconfig build there.

Yes, it does happen a lot that builds are temporarily broken in mainline
because patch series are split up among maintainers and submitted to mainline
without regard of buildability. I have learned to live with that and don't
normally report it because I know (ok, hope) it is going to be fixed
by the end of the commit window.

I personally find patch series - typically doing some cleanup - which are
not even build tested on the affected architectures much more annoying.

Guenter