2023-04-26 17:08:50

by Sudip Mukherjee

[permalink] [raw]
Subject: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Hi All,

The latest mainline kernel branch fails to build xtensa, mips, arm allmodconfig
with the error:

drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_inbuf':
drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:33:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
33 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_outbuf':
drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:51:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
51 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:66:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
66 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

git bisect pointed to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

I will be happy to test any patch or provide any extra log if needed.


--
Regards
Sudip


2023-04-26 17:23:35

by Sudip Mukherjee

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On Wed, 26 Apr 2023 at 18:07, Sudip Mukherjee (Codethink)
<[email protected]> wrote:
>
> Hi All,
>
> The latest mainline kernel branch fails to build xtensa, mips, arm allmodconfig
> with the error:
>
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_inbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:33:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 33 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_outbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:51:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 51 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:66:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 66 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> git bisect pointed to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Also, riscv allmodconfig with the error:

drivers/media/platform/nxp/imx8-isi/imx8-isi-core.c:328:12: error:
'mxc_isi_pm_resume' defined but not used [-Werror=unused-function]
328 | static int mxc_isi_pm_resume(struct device *dev)
| ^~~~~~~~~~~~~~~~~
drivers/media/platform/nxp/imx8-isi/imx8-isi-core.c:314:12: error:
'mxc_isi_pm_suspend' defined but not used [-Werror=unused-function]
314 | static int mxc_isi_pm_suspend(struct device *dev)
| ^~~~~~~~~~~~~~~~~~


--
Regards
Sudip

2023-04-26 17:28:31

by Laurent Pinchart

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Hi Sudip,

On Wed, Apr 26, 2023 at 06:07:34PM +0100, Sudip Mukherjee (Codethink) wrote:
> Hi All,
>
> The latest mainline kernel branch fails to build xtensa, mips, arm allmodconfig
> with the error:
>
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_inbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:33:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 33 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_outbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:51:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 51 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:66:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 66 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> git bisect pointed to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")
>
> I will be happy to test any patch or provide any extra log if needed.

This issue has been fixed already, the fixes are available in the master
branch of git://linuxtv.org/media_stage.git:

e49b1e1958b4 media: nxp: ignore unused suspend operations
9829ed20b019 media: nxp: imx8-isi: fix buiding on 32-bit

Mauro, can you please include those two fixes in your next pull request
for v6.4 ?

--
Regards,

Laurent Pinchart

2023-05-08 11:07:51

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 26.04.23 19:07, Sudip Mukherjee (Codethink) wrote:
>
> The latest mainline kernel branch fails to build xtensa, mips, arm allmodconfig
> with the error:
>
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_inbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:33:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 33 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c: In function 'mxc_isi_channel_set_outbuf':
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:51:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 51 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
> drivers/media/platform/nxp/imx8-isi/imx8-isi-hw.c:66:5: error: "CONFIG_ARCH_DMA_ADDR_T_64BIT" is not defined, evaluates to 0 [-Werror=undef]
> 66 | #if CONFIG_ARCH_DMA_ADDR_T_64BIT
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> git bisect pointed to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Thanks for the report. The fixes (see the mail from Laurent) apparently
are still not mainlined (or am I missing something?), so let me add this
report to the tracking to ensure this is not forgotten:

#regzbot ^introduced cf21f328fcaf
#regzbot fix: media: nxp: ignore unused suspend operations
#regzbot ignore-activity

(I simply picked the latter of the two fixes, that should do the trick)

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

2023-05-08 16:38:04

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
(Thorsten Leemhuis) <[email protected]> wrote:
>
> Thanks for the report. The fixes (see the mail from Laurent) apparently
> are still not mainlined (or am I missing something?), so let me add this
> report to the tracking to ensure this is not forgotten:

Gaah. I was intending to apply the patch directly before rc1, but then
I forgot about this issue.

Mauro: I'm currently really *really* fed up with the media tree. This
exact same thing happened last merge window, where the media tree
caused pointless build errors, and it took way too long to get the
fixes the proper ways.

If something doesn't even build, it should damn well be fixed ASAP.

Last release it was imx290.c and PM support being disabled, and I had
to apply the fix manually because it continued to not come in the
proper way.

See commit 7b50567bdcad ("media: i2c: imx290: fix conditional function
defintions").

But also see commit b928db940448 ("media: i2c: imx290: fix conditional
function definitions"), which you *did* commit, but note this on that
commit:

AuthorDate: Tue Feb 7 17:13
CommitDate: Sat Mar 18 08:44

so it took you a MONTH AND A HALF to react to a build failure.

And see this:

git name-rev b928db940448
b928db940448 tags/v6.4-rc1~161^2~458

ie that build fix that you finally committed came in *AFTER* the 6.3
release, even though the bug it fixes was introduced in the 6.3 merge
window:

git name-rev 02852c01f654
02852c01f654 tags/v6.3-rc1~72^2~2^2~193

and now we're in the *EXACT*SAME* situation, with me applying a build
fix directly, because you couldn't get it fixed in a timely manner.

End result: you and the media tree is on my shit-list, and I will not
take any pull requests from you that aren't just fixes.

Not just this release, but the next one.

Because I'm completely and utterly fed up with you ignoring
fundamental "it doesn't even build" issues.

If you can't be bothered fix the build issues that get introduced
during the merge window, then I'm not going to merge new stuff from
you.

It's *that* simple.

Linus

2023-05-10 08:13:08

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Hi Linus,

Em Mon, 8 May 2023 09:27:28 -0700
Linus Torvalds <[email protected]> escreveu:

> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
> (Thorsten Leemhuis) <[email protected]> wrote:
> >
> > Thanks for the report. The fixes (see the mail from Laurent) apparently
> > are still not mainlined (or am I missing something?), so let me add this
> > report to the tracking to ensure this is not forgotten:
>
> Gaah. I was intending to apply the patch directly before rc1, but then
> I forgot about this issue.
>
> Mauro: I'm currently really *really* fed up with the media tree. This
> exact same thing happened last merge window, where the media tree
> caused pointless build errors, and it took way too long to get the
> fixes the proper ways.
>
> If something doesn't even build, it should damn well be fixed ASAP.
>
> Last release it was imx290.c and PM support being disabled, and I had
> to apply the fix manually because it continued to not come in the
> proper way.
>
> See commit 7b50567bdcad ("media: i2c: imx290: fix conditional function
> defintions").
>
> But also see commit b928db940448 ("media: i2c: imx290: fix conditional
> function definitions"), which you *did* commit, but note this on that
> commit:
>
> AuthorDate: Tue Feb 7 17:13
> CommitDate: Sat Mar 18 08:44
>
> so it took you a MONTH AND A HALF to react to a build failure.
>
> And see this:
>
> git name-rev b928db940448
> b928db940448 tags/v6.4-rc1~161^2~458
>
> ie that build fix that you finally committed came in *AFTER* the 6.3
> release, even though the bug it fixes was introduced in the 6.3 merge
> window:
>
> git name-rev 02852c01f654
> 02852c01f654 tags/v6.3-rc1~72^2~2^2~193
>
> and now we're in the *EXACT*SAME* situation, with me applying a build
> fix directly, because you couldn't get it fixed in a timely manner.

Sorry for the mess. I'll work to improve the process to avoid this
to happen again.

FYI, in order to reduce build issues, we have a Jenkins instance
doing builds with gcc and CLANG at the media stage tree, before we even merge
them at the main media development tree. They run with allyesconfig for
x86_64 arch, with W=1:

https://builder.linuxtv.org/job/media_stage_clang/
https://builder.linuxtv.org/job/media_stage_gcc/

And another CI job testing bisect breakages as I receive pull requests,
applying patch per patch and using both allyesconfig and allmodconfig,
also on x86_64 arch with W=1:

https://builder.linuxtv.org/job/patchwork/

The rule is to not merge stuff on media tree if any of those jobs
fail. I also fast-forward merging patches whose subject states that
the build has failed.

In order to help with that, on normal situation, I usually take one week
to merge stuff from media_stage into media_tree, doing rebases at
media_stage if needed to avoid git bisect build breakages at media_tree
(which is from where I send my update PRs to you).

Unfortunately, currently we don't have resources to do multiple randconfig
on Jenkins, as the build machines on the server are very slow. Yet, I'll
add CONFIG_PM disabled to the test set, as it seems to be a recurrent source
of troubles those days. I'll also try to identify a couple of other
randconfigs that would help to catch earlier problems like that.
If some other problematic Kconfig variables comes to your mind, please
feel free to suggest them for us to add to the CI automation.

-

In the specific case of this fixup patch, I didn't identify it as a build
issue, so it followed the usual workflow. We have a huge number of patches
for media, and it usually takes some time to handle all of them. This one
just followed the normal flow, as it didn't break Jenkins builds nor the
subject mentioned anything about build breakage.

Regards,
Mauro

2023-05-10 09:22:04

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 10.05.23 10:05, Mauro Carvalho Chehab wrote:
> Em Mon, 8 May 2023 09:27:28 -0700
> Linus Torvalds <[email protected]> escreveu:
>> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
>> (Thorsten Leemhuis) <[email protected]> wrote:
>>>
>>> Thanks for the report. The fixes (see the mail from Laurent) apparently
>>> are still not mainlined (or am I missing something?), so let me add this
>>> report to the tracking to ensure this is not forgotten:
>>
>> Gaah. I was intending to apply the patch directly before rc1, but then
>> I forgot about this issue.
>>
>> Mauro: I'm currently really *really* fed up with the media tree. This
>> exact same thing happened last merge window, where the media tree
>> caused pointless build errors, and it took way too long to get the
>> fixes the proper ways.
> [...]
>
> In the specific case of this fixup patch, I didn't identify it as a build
> issue, so it followed the usual workflow. We have a huge number of patches
> for media, and it usually takes some time to handle all of them. This one
> just followed the normal flow, as it didn't break Jenkins builds nor the
> subject mentioned anything about build breakage.

Makes me wonder again if we should start adding

CC: [email protected]

to any patches that fix regressions, that way maintainers and reviewers
would have something to filter for -- and I would become aware of all
regression fixes in the work, too.

Ciao, Thorsten

P.S.: BTW, let me tell regzbot that Linus merged the fix for the build
failure.

#regzbot fix: ba0ad6ed89f

FWIW, the one for the gcc warnings[1] Laurent mentioned elsewhere in
this thread is not merged yet afaics.

[1] https://lore.kernel.org/all/[email protected]/

2023-05-10 13:24:43

by Hans Verkuil

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
> Hi Linus,
>
> Em Mon, 8 May 2023 09:27:28 -0700
> Linus Torvalds <[email protected]> escreveu:
>
>> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
>> (Thorsten Leemhuis) <[email protected]> wrote:
>>>
>>> Thanks for the report. The fixes (see the mail from Laurent) apparently
>>> are still not mainlined (or am I missing something?), so let me add this
>>> report to the tracking to ensure this is not forgotten:
>>
>> Gaah. I was intending to apply the patch directly before rc1, but then
>> I forgot about this issue.
>>
>> Mauro: I'm currently really *really* fed up with the media tree. This
>> exact same thing happened last merge window, where the media tree
>> caused pointless build errors, and it took way too long to get the
>> fixes the proper ways.
>>
>> If something doesn't even build, it should damn well be fixed ASAP.
>>
>> Last release it was imx290.c and PM support being disabled, and I had
>> to apply the fix manually because it continued to not come in the
>> proper way.
>>
>> See commit 7b50567bdcad ("media: i2c: imx290: fix conditional function
>> defintions").
>>
>> But also see commit b928db940448 ("media: i2c: imx290: fix conditional
>> function definitions"), which you *did* commit, but note this on that
>> commit:
>>
>> AuthorDate: Tue Feb 7 17:13
>> CommitDate: Sat Mar 18 08:44
>>
>> so it took you a MONTH AND A HALF to react to a build failure.
>>
>> And see this:
>>
>> git name-rev b928db940448
>> b928db940448 tags/v6.4-rc1~161^2~458
>>
>> ie that build fix that you finally committed came in *AFTER* the 6.3
>> release, even though the bug it fixes was introduced in the 6.3 merge
>> window:
>>
>> git name-rev 02852c01f654
>> 02852c01f654 tags/v6.3-rc1~72^2~2^2~193
>>
>> and now we're in the *EXACT*SAME* situation, with me applying a build
>> fix directly, because you couldn't get it fixed in a timely manner.
>
> Sorry for the mess. I'll work to improve the process to avoid this
> to happen again.
>
> FYI, in order to reduce build issues, we have a Jenkins instance
> doing builds with gcc and CLANG at the media stage tree, before we even merge
> them at the main media development tree. They run with allyesconfig for
> x86_64 arch, with W=1:
>
> https://builder.linuxtv.org/job/media_stage_clang/
> https://builder.linuxtv.org/job/media_stage_gcc/
>
> And another CI job testing bisect breakages as I receive pull requests,
> applying patch per patch and using both allyesconfig and allmodconfig,
> also on x86_64 arch with W=1:
>
> https://builder.linuxtv.org/job/patchwork/
>
> The rule is to not merge stuff on media tree if any of those jobs
> fail. I also fast-forward merging patches whose subject states that
> the build has failed.
>
> In order to help with that, on normal situation, I usually take one week
> to merge stuff from media_stage into media_tree, doing rebases at
> media_stage if needed to avoid git bisect build breakages at media_tree
> (which is from where I send my update PRs to you).
>
> Unfortunately, currently we don't have resources to do multiple randconfig
> on Jenkins, as the build machines on the server are very slow. Yet, I'll
> add CONFIG_PM disabled to the test set, as it seems to be a recurrent source
> of troubles those days. I'll also try to identify a couple of other
> randconfigs that would help to catch earlier problems like that.
> If some other problematic Kconfig variables comes to your mind, please
> feel free to suggest them for us to add to the CI automation.
>
> -
>
> In the specific case of this fixup patch, I didn't identify it as a build
> issue, so it followed the usual workflow. We have a huge number of patches
> for media, and it usually takes some time to handle all of them. This one
> just followed the normal flow, as it didn't break Jenkins builds nor the
> subject mentioned anything about build breakage.

In the end it was my fault: I pushed the fix to our staging tree thinking
that there was enough time for it to be included in the PR for 6.4.
But I was wrong, the window for that closed a week earlier (which Mauro
even documented!). So Mauro never knew that this patch had to be included
in the PR to you. The right procedure would have been for me to tell Mauro
about this patch. Hopefully this will be the first and also last time that
I make that mistake.

We do have a major problem with too many incoming patches and not enough
maintainers & time. Some of it can be improved with better procedures and
testing, but that won't help the often slow code review times. It will be a
big topic during the upcoming media mini summit in Prague.

Regards,

Hans

2023-05-11 06:54:41

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Em Wed, 10 May 2023 11:02:57 +0200
"Linux regression tracking (Thorsten Leemhuis)" <[email protected]> escreveu:

> On 10.05.23 10:05, Mauro Carvalho Chehab wrote:
> > Em Mon, 8 May 2023 09:27:28 -0700
> > Linus Torvalds <[email protected]> escreveu:
> >> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
> >> (Thorsten Leemhuis) <[email protected]> wrote:
> >>>
> >>> Thanks for the report. The fixes (see the mail from Laurent) apparently
> >>> are still not mainlined (or am I missing something?), so let me add this
> >>> report to the tracking to ensure this is not forgotten:
> >>
> >> Gaah. I was intending to apply the patch directly before rc1, but then
> >> I forgot about this issue.
> >>
> >> Mauro: I'm currently really *really* fed up with the media tree. This
> >> exact same thing happened last merge window, where the media tree
> >> caused pointless build errors, and it took way too long to get the
> >> fixes the proper ways.
> > [...]
> >
> > In the specific case of this fixup patch, I didn't identify it as a build
> > issue, so it followed the usual workflow. We have a huge number of patches
> > for media, and it usually takes some time to handle all of them. This one
> > just followed the normal flow, as it didn't break Jenkins builds nor the
> > subject mentioned anything about build breakage.
>
> Makes me wonder again if we should start adding
>
> CC: [email protected]
>
> to any patches that fix regressions, that way maintainers and reviewers
> would have something to filter for -- and I would become aware of all
> regression fixes in the work, too.

Having some way that could be parsed by e-mail filters would be
nice.

>
> Ciao, Thorsten
>
> P.S.: BTW, let me tell regzbot that Linus merged the fix for the build
> failure.
>
> #regzbot fix: ba0ad6ed89f
>
> FWIW, the one for the gcc warnings[1] Laurent mentioned elsewhere in
> this thread is not merged yet afaics.
>
> [1] https://lore.kernel.org/all/[email protected]/

Just sent a pull request.

Btw, I did some changes at linux-media Jenkins instance to help early
track some extra build issues. They're all against
https://git.linuxtv.org/media_stage.git/, which is the tree where we place
media patches that are ready. We move them later, after a couple of days
to https://git.linuxtv.org/media_tree.git/. So, if something bad happens,
we have a chance to fix before setting them into a stone. With such
changes, we now have:

1. https://builder.linuxtv.org/job/patchwork/

This is a pre-merge test. It tests patch per patch the PRs with patch
sets ready to be merged, with W=1, allyesconfig/almodconfig[1] on x86_64.
Builds drivers/media and drivers/staging/media.
This is there already for a long time;

2. https://builder.linuxtv.org/job/media_stage_clang/

Checks build with clang-12 on x86_64 with W=1. Builds drivers/media
and drivers/staging/media with allyesconfig[1].

It was building with WERROR disabled, as some core macros were
producing errors at the time I created it (and for a while).
It was modified to enable WERROR as well.

3. https://builder.linuxtv.org/job/media_stage_gcc-pipeline/

It replaces another job that was just doing builds for x86_64
with W=1. Builds drivers/media and drivers/staging/media with
different configurations[1]:
x86_64: allyesconfig, allmodconfig, almodconfig with PM disabled;
arm32: allyesconfig
arm64: allyesconfig

4. https://builder.linuxtv.org/job/linux-media/

Does full builds with different configurations[1]:
x86_64: allyesconfig, allmodconfig, almodconfig with PM disabled;
arm32: allyesconfig
arm64: allyesconfig
docs: htmldocs and pdfdocs

I hope this will help avoiding future build regressions from our side.
Feel free to suggest a couple of other configs that we might add to
jobs (3) and (4).

I'm still adjusting the pipeline for (4), but currently, it is failing
on an issue that seems unrelated with the media subsystem with gcc 10.2.1:

AR drivers/built-in.a
AR built-in.a
AR vmlinux.a
LD vmlinux.o
vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x2d8: call to vmread_error_trampoline() leaves .noinstr.text section
vmlinux.o: warning: objtool: lkdtm_UNSET_SMEP+0xe1: relocation to !ENDBR: native_write_cr4+0x40

Is this a known regression? The media-stage tree is on the top of
Kernel 6.4-rc1.

Regards,
Mauro

-

[1] On all builds, the jobs disable some symbols that should not affect
media subsystem, to speedup the builds:

scripts/config -d MODULE_SIG -d KEYS -d IMA -d CONFIG_DEBUG_INFO -d SYSTEM_TRUSTED_KEYRING -d MODVERSIONS

2023-05-11 07:14:48

by Laurent Pinchart

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On Thu, May 11, 2023 at 07:46:06AM +0100, Mauro Carvalho Chehab wrote:
> Em Wed, 10 May 2023 11:02:57 +0200 "Linux regression tracking (Thorsten Leemhuis)" escreveu:
> > On 10.05.23 10:05, Mauro Carvalho Chehab wrote:
> > > Em Mon, 8 May 2023 09:27:28 -0700
> > > Linus Torvalds <[email protected]> escreveu:
> > >> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
> > >> (Thorsten Leemhuis) <[email protected]> wrote:
> > >>>
> > >>> Thanks for the report. The fixes (see the mail from Laurent) apparently
> > >>> are still not mainlined (or am I missing something?), so let me add this
> > >>> report to the tracking to ensure this is not forgotten:
> > >>
> > >> Gaah. I was intending to apply the patch directly before rc1, but then
> > >> I forgot about this issue.
> > >>
> > >> Mauro: I'm currently really *really* fed up with the media tree. This
> > >> exact same thing happened last merge window, where the media tree
> > >> caused pointless build errors, and it took way too long to get the
> > >> fixes the proper ways.
> > > [...]
> > >
> > > In the specific case of this fixup patch, I didn't identify it as a build
> > > issue, so it followed the usual workflow. We have a huge number of patches
> > > for media, and it usually takes some time to handle all of them. This one
> > > just followed the normal flow, as it didn't break Jenkins builds nor the
> > > subject mentioned anything about build breakage.
> >
> > Makes me wonder again if we should start adding
> >
> > CC: [email protected]
> >
> > to any patches that fix regressions, that way maintainers and reviewers
> > would have something to filter for -- and I would become aware of all
> > regression fixes in the work, too.
>
> Having some way that could be parsed by e-mail filters would be
> nice.

The presence of a Fixes: tag is already a strong indication that the
patch should be prioritized. Looking at the last 10 kernel releases,
here's the number of commits with a Fixes: tag in drivers/media/:

v5.14 - v5.15: 19
v5.15 - v5.16: 36
v5.16 - v5.17: 50
v5.17 - v5.18: 49
v5.18 - v5.19: 25
v5.19 - v6.0: 44
v6.0 - v6.1: 35
v6.1 - v6.2: 72
v6.2 - v6.3: 39
v6.3 - v6.4-rc1: 39

Some are likely not regressions and wouldn't need to be treated with the
highest priority, but keeping an eye on patches with a Fixes: tag seems
doable.

There's also the issue of regression fixes missing a Fixes: tag, but I
doubt those would get CC: [email protected], so from a mail
filtering point of view, that wouldn't help.

> > Ciao, Thorsten
> >
> > P.S.: BTW, let me tell regzbot that Linus merged the fix for the build
> > failure.
> >
> > #regzbot fix: ba0ad6ed89f
> >
> > FWIW, the one for the gcc warnings[1] Laurent mentioned elsewhere in
> > this thread is not merged yet afaics.
> >
> > [1] https://lore.kernel.org/all/[email protected]/
>
> Just sent a pull request.
>
> Btw, I did some changes at linux-media Jenkins instance to help early
> track some extra build issues. They're all against
> https://git.linuxtv.org/media_stage.git/, which is the tree where we place
> media patches that are ready. We move them later, after a couple of days
> to https://git.linuxtv.org/media_tree.git/. So, if something bad happens,
> we have a chance to fix before setting them into a stone. With such
> changes, we now have:
>
> 1. https://builder.linuxtv.org/job/patchwork/
>
> This is a pre-merge test. It tests patch per patch the PRs with patch
> sets ready to be merged, with W=1, allyesconfig/almodconfig[1] on x86_64.
> Builds drivers/media and drivers/staging/media.
> This is there already for a long time;
>
> 2. https://builder.linuxtv.org/job/media_stage_clang/
>
> Checks build with clang-12 on x86_64 with W=1. Builds drivers/media
> and drivers/staging/media with allyesconfig[1].
>
> It was building with WERROR disabled, as some core macros were
> producing errors at the time I created it (and for a while).
> It was modified to enable WERROR as well.
>
> 3. https://builder.linuxtv.org/job/media_stage_gcc-pipeline/
>
> It replaces another job that was just doing builds for x86_64
> with W=1. Builds drivers/media and drivers/staging/media with
> different configurations[1]:
> x86_64: allyesconfig, allmodconfig, almodconfig with PM disabled;
> arm32: allyesconfig
> arm64: allyesconfig
>
> 4. https://builder.linuxtv.org/job/linux-media/
>
> Does full builds with different configurations[1]:
> x86_64: allyesconfig, allmodconfig, almodconfig with PM disabled;
> arm32: allyesconfig
> arm64: allyesconfig
> docs: htmldocs and pdfdocs
>
> I hope this will help avoiding future build regressions from our side.
> Feel free to suggest a couple of other configs that we might add to
> jobs (3) and (4).
>
> I'm still adjusting the pipeline for (4), but currently, it is failing
> on an issue that seems unrelated with the media subsystem with gcc 10.2.1:
>
> AR drivers/built-in.a
> AR built-in.a
> AR vmlinux.a
> LD vmlinux.o
> vmlinux.o: warning: objtool: vmx_vcpu_enter_exit+0x2d8: call to vmread_error_trampoline() leaves .noinstr.text section
> vmlinux.o: warning: objtool: lkdtm_UNSET_SMEP+0xe1: relocation to !ENDBR: native_write_cr4+0x40
>
> Is this a known regression? The media-stage tree is on the top of
> Kernel 6.4-rc1.
>
> Regards,
> Mauro
>
> -
>
> [1] On all builds, the jobs disable some symbols that should not affect
> media subsystem, to speedup the builds:
>
> scripts/config -d MODULE_SIG -d KEYS -d IMA -d CONFIG_DEBUG_INFO -d SYSTEM_TRUSTED_KEYRING -d MODVERSIONS

--
Regards,

Laurent Pinchart

2023-05-14 11:47:52

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
> And another CI job testing bisect breakages as I receive pull requests,
> applying patch per patch and using both allyesconfig and allmodconfig,
> also on x86_64 arch with W=1:
>
> https://builder.linuxtv.org/job/patchwork/
>
> The rule is to not merge stuff on media tree if any of those jobs
> fail. I also fast-forward merging patches whose subject states that
> the build has failed.
>
> In order to help with that, on normal situation, I usually take one week
> to merge stuff from media_stage into media_tree, doing rebases at
> media_stage if needed to avoid git bisect build breakages at media_tree
> (which is from where I send my update PRs to you).
>
> Unfortunately, currently we don't have resources to do multiple randconfig

Hi Mauro,

Is you media staging tree included in LKP (kernel test robot)? You would
get huge build coverage after every push to your staging repo.

Except of maintainer trees, it is also useful to add there development
trees. That way for 'free' I can get early warnings about all my patches.

Best regards,
Krzysztof


2023-05-15 07:59:51

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Hi Krzysztof,

On Sun, May 14, 2023 at 1:01 PM Krzysztof Kozlowski <[email protected]> wrote:
> On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
> > And another CI job testing bisect breakages as I receive pull requests,
> > applying patch per patch and using both allyesconfig and allmodconfig,
> > also on x86_64 arch with W=1:
> >
> > https://builder.linuxtv.org/job/patchwork/
> >
> > The rule is to not merge stuff on media tree if any of those jobs
> > fail. I also fast-forward merging patches whose subject states that
> > the build has failed.
> >
> > In order to help with that, on normal situation, I usually take one week
> > to merge stuff from media_stage into media_tree, doing rebases at
> > media_stage if needed to avoid git bisect build breakages at media_tree
> > (which is from where I send my update PRs to you).
> >
> > Unfortunately, currently we don't have resources to do multiple randconfig
>
> Is you media staging tree included in LKP (kernel test robot)? You would
> get huge build coverage after every push to your staging repo.

As (multiple[*[) fixes for the build issues were submitted before the
opening of the merge window, there must have been some build coverage,
with even some people acting upon it...

[*] General note, not limited to media: please apply build fixes and
regression fixes ASAP, to avoid multiple people running into the
same issues, and spending time on bisecting, investigating,
fixing, ...
Thanks a lot!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2023-05-15 09:27:56

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Em Mon, 15 May 2023 09:46:41 +0200
Geert Uytterhoeven <[email protected]> escreveu:

> Hi Krzysztof,
>
> On Sun, May 14, 2023 at 1:01 PM Krzysztof Kozlowski <[email protected]> wrote:
> > On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
> > > And another CI job testing bisect breakages as I receive pull requests,
> > > applying patch per patch and using both allyesconfig and allmodconfig,
> > > also on x86_64 arch with W=1:
> > >
> > > https://builder.linuxtv.org/job/patchwork/
> > >
> > > The rule is to not merge stuff on media tree if any of those jobs
> > > fail. I also fast-forward merging patches whose subject states that
> > > the build has failed.
> > >
> > > In order to help with that, on normal situation, I usually take one week
> > > to merge stuff from media_stage into media_tree, doing rebases at
> > > media_stage if needed to avoid git bisect build breakages at media_tree
> > > (which is from where I send my update PRs to you).
> > >
> > > Unfortunately, currently we don't have resources to do multiple randconfig
> >
> > Is you media staging tree included in LKP (kernel test robot)? You would
> > get huge build coverage after every push to your staging repo.

No idea, as I don't know where LKP settings are stored, nor what frequency
it is doing builds from git://linuxtv.org/media_stage.git, if any. Do you know
where we can check such configuration?

In the end, patches there will end going to linux-next, so at least some
sort of coverage is there, but I'm not sure if LKP will always reply to
linux-media if the patch causing build regressions is there.

While being helpful, one problem with LKP is that it is hard to filter out
reports per git tree. The only way to check if the report is applicable to
media trees seems to be looking inside the e-mail's body.

> As (multiple[*[) fixes for the build issues were submitted before the
> opening of the merge window, there must have been some build coverage,
> with even some people acting upon it...
>
> [*] General note, not limited to media: please apply build fixes and
> regression fixes ASAP, to avoid multiple people running into the
> same issues, and spending time on bisecting, investigating,
> fixing, ...
> Thanks a lot!

Fully agreed. That's why we opted to have a CI instance focused on media:
we should be able to program it to give feedback with the kernel build
parameters we use (Like W=1) and set it up in a way that it we can
customize to our needs (like testing for the architectures where media
drivers are known to be used in practice), and setting up an e-mail subject
that can be easily be filtered by our e-mail filters.

Regards,
Mauro

2023-05-15 10:29:41

by Krzysztof Kozlowski

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 15/05/2023 11:25, Mauro Carvalho Chehab wrote:
> Em Mon, 15 May 2023 09:46:41 +0200
> Geert Uytterhoeven <[email protected]> escreveu:
>
>> Hi Krzysztof,
>>
>> On Sun, May 14, 2023 at 1:01 PM Krzysztof Kozlowski <[email protected]> wrote:
>>> On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
>>>> And another CI job testing bisect breakages as I receive pull requests,
>>>> applying patch per patch and using both allyesconfig and allmodconfig,
>>>> also on x86_64 arch with W=1:
>>>>
>>>> https://builder.linuxtv.org/job/patchwork/
>>>>
>>>> The rule is to not merge stuff on media tree if any of those jobs
>>>> fail. I also fast-forward merging patches whose subject states that
>>>> the build has failed.
>>>>
>>>> In order to help with that, on normal situation, I usually take one week
>>>> to merge stuff from media_stage into media_tree, doing rebases at
>>>> media_stage if needed to avoid git bisect build breakages at media_tree
>>>> (which is from where I send my update PRs to you).
>>>>
>>>> Unfortunately, currently we don't have resources to do multiple randconfig
>>>
>>> Is you media staging tree included in LKP (kernel test robot)? You would
>>> get huge build coverage after every push to your staging repo.
>
> No idea, as I don't know where LKP settings are stored, nor what frequency
> it is doing builds from git://linuxtv.org/media_stage.git, if any. Do you know
> where we can check such configuration?

AFAIR, it checks all kernel.org repos, but anything outside should be
manually added. See:
https://github.com/intel/lkp-tests/pull/271

I actually proposed a talk around this for LPC, so it seems it might be
useful.

>
> In the end, patches there will end going to linux-next, so at least some
> sort of coverage is there, but I'm not sure if LKP will always reply to
> linux-media if the patch causing build regressions is there.
>
> While being helpful, one problem with LKP is that it is hard to filter out
> reports per git tree. The only way to check if the report is applicable to
> media trees seems to be looking inside the e-mail's body.

If you setup specific repo entry in LKP, you will get a quite specific
email, e.g. with subject:

[krzk:for-next] BUILD SUCCESS 5b248db78d1aa679009efa4763794890572e63fb
[krzk-github:n/of-device-id-of-match-ptr-rebase] BUILD SUCCESS ce0964f4

(where krzk/krzk-github are the names of repos in LKP)


Best regards,
Krzysztof


2023-05-15 15:11:38

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On 15.05.23 09:46, Geert Uytterhoeven wrote:
> On Sun, May 14, 2023 at 1:01 PM Krzysztof Kozlowski <[email protected]> wrote:
>> On 10/05/2023 10:05, Mauro Carvalho Chehab wrote:
>>> And another CI job testing bisect breakages as I receive pull requests,
>>> applying patch per patch and using both allyesconfig and allmodconfig,
>>> also on x86_64 arch with W=1:
>>>
>>> https://builder.linuxtv.org/job/patchwork/
>>>
>>> The rule is to not merge stuff on media tree if any of those jobs
>>> fail. I also fast-forward merging patches whose subject states that
>>> the build has failed.
>>>
>>> In order to help with that, on normal situation, I usually take one week
>>> to merge stuff from media_stage into media_tree, doing rebases at
>>> media_stage if needed to avoid git bisect build breakages at media_tree
>>> (which is from where I send my update PRs to you).
>>>
>>> Unfortunately, currently we don't have resources to do multiple randconfig
>>
>> Is you media staging tree included in LKP (kernel test robot)? You would
>> get huge build coverage after every push to your staging repo.
>
> As (multiple[*[) fixes for the build issues were submitted before the
> opening of the merge window, there must have been some build coverage,
> with even some people acting upon it...
>
> [*] General note, not limited to media: please apply build fixes and
> regression fixes ASAP, to avoid multiple people running into the
> same issues, and spending time on bisecting, investigating,
> fixing, ...
> Thanks a lot!

FWIW, I proposed a rewritten section of
Documentation/process/handling-regressions.rst that is closely related
to this. The new text says:

```
+ * Do not consider regressions from the current cycle as something that
can wait
+ till the end of the cycle, as the issue might discourage or prevent
users and
+ CI systems from testing mainline now or generally.
[...]
+ * Aim to mainline a fix within two or three days, if the issue is
severe or
+ bothering many users -- either in general or in prevalent conditions
like a
+ particular hardware environment, distribution, or stable/longterm
series.```

For details and context see
https://lore.kernel.org/linux-doc/6971680941a5b7b9cb0c2839c75b5cc4ddb2d162.1684139586.git.linux@leemhuis.info/

Let me known if you think I should be more explicit; I could simply add
a "this includes, but is not limited to fixes for build errors" to the
second segment mentioned above. But well, that yet again would make the
text longer. :-(

Ciao, Thorsten

2023-05-18 20:14:41

by Nicolas Dufresne

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

Hi,

Le lundi 08 mai 2023 à 09:27 -0700, Linus Torvalds a écrit :
> End result: you and the media tree is on my shit-list, and I will not
> take any pull requests from you that aren't just fixes.
>
> Not just this release, but the next one.

I'm expected to be flamed for getting in the way, but whatever. To me this
decision lacks any kind of consideration toward who will be affected. This will
hit those that makes the new features and are working hard to convince their
customers to go mainline first. None of Mauro or Hans will truly be affected by
this. Everyone affected by this decision were completely unaware of the problem
in the first place and could not help it.

Punishment and shame is not something I encourage or think is nice in general.
Its the reflection of a strong frustration and the spontaneous need to hit
somewhere it hurts. It brings no benefit to anyone and likely convince the
skeptical to go away. In the end, it is the leaf contributors (the one that have
the least authority on the project) that takes the hit, having to tell their
customers they'll only get their stuff mainline in 6.7 even though they might
have done everything right to get things on time for the next window. This kind
of methods is purely negative for everyone, there is no win of any sort.

regards,
Nicolas

2023-05-18 20:50:07

by Linus Torvalds

[permalink] [raw]
Subject: Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

On Thu, May 18, 2023 at 12:53 PM Nicolas Dufresne <[email protected]> wrote:
>
> I'm expected to be flamed for getting in the way, but whatever. To me this
> decision lacks any kind of consideration toward who will be affected. This will
> hit those that makes the new features and are working hard to convince their
> customers to go mainline first.

I think the solution may be for those affected people to help Mauro & co.

Clearly the media maintenance doesn't have enough time. I'm not going
to pull from a tree where I know that it then may take six *weeks* and
one whole release for simple bugs to be fixed.

That is literally what happened. And if it had been once, that would
be one thing. But when the same thing starts happening again the very
next release, it's no longer a one-off. It's a pattern.

> Punishment and shame is not something I encourage or think is nice in general.

This is NOT about punishment.,

It's very simple: if I cannot trust the tree to be maintained, I'm not
going to pull it.

That's not punishment, that is simply about kernel maintenance.

If you want to help fix the media maintenance issue, then by all means
help. But as things are now, if I cannot rely on the media tree
getting even simple build fixes in a timely manner, then I'm not
pulling it.

Please realize: to misquote Shakespeare, I have two options: to pull
or not to pull. And in order to pull a tree, I need to know that I can
expect any problems from that pull to be fixed.

Would you expect me to pull known-buggy trees? I sure hope you don't
expect that. Not pulling buggy trees isn't "punishment". It's the only
sane thing to do.

And the exact same thing is true when a tree isn't maintained
properly. Bugs happen. That's inevitable. And sometimes bugs can be
hard to find, or hard to fix. But when the maintainer has been sent a
fix, and that fix doesn't get handled for SIX WEEKS, then that tree is
buggy.

Something is very rotten in the state of media. It needs to get fixed.
Until it is fixed, I don't want to take random new code.

The fix *may* be as simple as more testing, and better automation. But
really, the thing that annoyed me enormously was that these bugs were
all found by automation and testing. And still they were left to rot.

Linus