2021-02-04 06:04:57

by Jari Ruusu

Subject: Kernel version numbers after 4.9.255 and 4.4.255

Greg,
I hope that your linux kernel release scripts are
implemented in a way that understands that PATCHLEVEL= and
SUBLEVEL= numbers in top-level linux Makefile are encoded
as 8-bit numbers for LINUX_VERSION_CODE and
KERNEL_VERSION() macros, and must stay in range 0...255.
These 8-bit limits are hardcoded in both kernel source and
userspace ABI.

After 4.9.255 and 4.4.255, your scripts should be
incrementing a number in EXTRAVERSION= in top-level
linux Makefile.

--
Jari Ruusu  4096R/8132F189 12D6 4C3A DCDA 0AA4 27BD  ACDF F073 3C80 8132 F189
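
A minimal sketch of the encoding described above, assuming the classic (X * 65536 + Y * 256 + Z) layout that comes up later in the thread. It is an illustration only, not a copy of any particular kernel header, and the helper name version_code is hypothetical.

```c
#include <stdint.h>

/*
 * Classic layout: 8 bits each for PATCHLEVEL and SUBLEVEL.
 * Nothing masks the inputs, so a SUBLEVEL above 255 carries into
 * the PATCHLEVEL byte.
 */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper, for illustration only. */
static inline uint32_t version_code(uint32_t version, uint32_t patchlevel,
                                    uint32_t sublevel)
{
    return KERNEL_VERSION(version, patchlevel, sublevel);
}
```

With this layout, version_code(4, 4, 255) is 0x0404ff, while version_code(4, 4, 256) evaluates to 0x040500, the same value as version_code(4, 5, 0).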


2021-02-04 06:22:27

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
> Greg,
> I hope that your linux kernel release scripts are
> implemented in a way that understands that PATCHLEVEL= and
> SUBLEVEL= numbers in top-level linux Makefile are encoded
> as 8-bit numbers for LINUX_VERSION_CODE and
> KERNEL_VERSION() macros, and must stay in range 0...255.
> These 8-bit limits are hardcoded in both kernel source and
> userspace ABI.
>
> After 4.9.255 and 4.4.255, your scripts should be
> incrementing a number in EXTRAVERSION= in top-level
> linux Makefile.

Should already be fixed in linux-next, right?

thanks,

greg k-h

2021-02-04 08:56:27

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Thu, Feb 04, 2021 at 08:26:04AM +0100, Jiri Slaby wrote:
> On 04. 02. 21, 7:20, Greg Kroah-Hartman wrote:
> > On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
> > > Greg,
> > > I hope that your linux kernel release scripts are
> > > implemented in a way that understands that PATCHLEVEL= and
> > > SUBLEVEL= numbers in top-level linux Makefile are encoded
> > > as 8-bit numbers for LINUX_VERSION_CODE and
> > > KERNEL_VERSION() macros, and must stay in range 0...255.
> > > These 8-bit limits are hardcoded in both kernel source and
> > > userspace ABI.
> > >
> > > After 4.9.255 and 4.4.255, your scripts should be
> > > incrementing a number in EXTRAVERSION= in top-level
> > > linux Makefile.
> >
> > Should already be fixed in linux-next, right?
>
> I assume you mean:
> commit 537896fabed11f8d9788886d1aacdb977213c7b3
> Author: Sasha Levin <[email protected]>
> Date: Mon Jan 18 14:54:53 2021 -0500
>
> kbuild: give the SUBLEVEL more room in KERNEL_VERSION
>
> That would IMO break userspace as definition of kernel version has changed.
> And that one is UAPI/ABI (see include/generated/uapi/linux/version.h) as
> Jari writes. For example will glibc still work:
> http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/configure.ac;h=13abda0a51484c5951ffc6d718aa36b72f3a9429;hb=HEAD#l14
>
> ? Or gcc 10 (11 will have this differently):
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf.c;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l165
>
> and
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf-helpers.h;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l53

Ugh, I thought this was an internal representation, not an external one
:(

> It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> assumptions all around the world. So this doesn't look like a good idea.

Ok, so what happens if we "wrap"? What will break with that? At first
glance, I can't see anything as we keep the padding the same, and our
build scripts seem to pick the number up from the Makefile and treat it
like a string.

It's only the crazy out-of-tree kernel stuff that wants to do minor
version checks that might go boom. And frankly, I'm not all that
concerned if they have problems :)

So, let's leave it alone and just see what happens!

greg k-h

2021-02-04 11:06:38

by Jiri Slaby

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On 04. 02. 21, 9:51, Greg Kroah-Hartman wrote:
>> It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
>> assumptions all around the world. So this doesn't look like a good idea.
>
> Ok, so what happens if we "wrap"? What will break with that? At first
> glance, I can't see anything as we keep the padding the same, and our
> build scripts seem to pick the number up from the Makefile and treat it
> like a string.
>
> It's only the crazy out-of-tree kernel stuff that wants to do minor
> version checks that might go boom. And frankly, I'm not all that
> concerned if they have problems :)

Agreed. But currently, sublevel won't "wrap", it will "overflow" to
patchlevel. And that might be a problem. So we might need to update the
header generation using e.g. "sublevel & 0xff" (wrap around) or
"sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).

In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.

thanks,
--
js
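
The two options named in the message above can be sketched side by side. This is only an illustration under the assumption of the 8-bit layout discussed in the thread; the function names are hypothetical and do not come from any posted patch.

```c
#include <stdint.h>

/* Option 1: wrap around, keeping only the low 8 bits of the sublevel. */
static inline uint32_t version_code_wrap(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << 16) + (b << 8) + (c & 0xff);
}

/* Option 2: stay monotonic by sticking at 255 once the sublevel exceeds it. */
static inline uint32_t version_code_clamp(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << 16) + (b << 8) + (c > 255 ? 255 : c);
}
```

With wrapping, 4.4.256 encodes as 0x040400 and so compares lower than 4.4.255. With clamping, every sublevel from 255 upward encodes as 0x0404ff, so the code never goes backwards but also stops distinguishing later releases.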

2021-02-04 16:54:06

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Thu, Feb 04, 2021 at 04:28:19PM +0000, David Laight wrote:
> From: Jiri Slaby
> > Sent: 04 February 2021 11:01
> >
> > On 04. 02. 21, 9:51, Greg Kroah-Hartman wrote:
> > >> It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> > >> assumptions all around the world. So this doesn't look like a good idea.
> > >
> > > Ok, so what happens if we "wrap"? What will break with that? At first
> > > glance, I can't see anything as we keep the padding the same, and our
> > > build scripts seem to pick the number up from the Makefile and treat it
> > > like a string.
> > >
> > > It's only the crazy out-of-tree kernel stuff that wants to do minor
> > > version checks that might go boom. And frankly, I'm not all that
> > > concerned if they have problems :)
> >
> > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > patchlevel. And that might be a problem. So we might need to update the
> > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
> >
> > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
>
> A full wrap might catch checks for less than (say) 4.4.2 which
> might be present to avoid very early versions.

Who does that?

> So sticking at 255 or wrapping onto (say) 128 to 255 might be better.

Better how?

> I'm actually intrigued about how often you expect people to update
> systems running these LTS kernels.

Whenever they can, and should.

> At a release every week it takes 5 years to run out of sublevels.
> No one is going to reboot a server anywhere near that often.

Why not?

Usually kernels this old are stuck in legacy embedded systems, like last
year's new phone models :)

thanks,

greg k-h

2021-02-04 20:32:47

by Christoph Biedl

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

David Laight wrote...

> A full wrap might catch checks for less than (say) 4.4.2 which
> might be present to avoid very early versions.
> So sticking at 255 or wrapping onto (say) 128 to 255 might be better.

Hitting such version checks still might happen, though.

Also, any wrapping introduces a real risk package managers will see
version numbers running backwards and therefore will refrain from
installing an actually newer version.

For scripts/package/builddeb (I don't use that, though), you could work
around by setting an epoch, i.e. (untested)

-$sourcename ($packageversion) $distribution; urgency=low
+$sourcename (1:$packageversion) $distribution; urgency=low

but every packaging mechanism in-tree and outside should adopt such a
change, if even possible. Which is why this feels bad.

Possibly I am missing something: What's the reason to not use
EXTRAVERSION as back in the old 2.6.x.y days, so change to 4.4.255.1 and
so on? Well, unless there are still installations who treat 4.4.255 as
2.6.64.255.

Christoph

2021-02-04 23:28:37

by Jiri Slaby

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On 04. 02. 21, 7:20, Greg Kroah-Hartman wrote:
> On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
>> Greg,
>> I hope that your linux kernel release scripts are
>> implemented in a way that understands that PATCHLEVEL= and
>> SUBLEVEL= numbers in top-level linux Makefile are encoded
>> as 8-bit numbers for LINUX_VERSION_CODE and
>> KERNEL_VERSION() macros, and must stay in range 0...255.
>> These 8-bit limits are hardcoded in both kernel source and
>> userspace ABI.
>>
>> After 4.9.255 and 4.4.255, your scripts should be
>> incrementing a number in EXTRAVERSION= in top-level
>> linux Makefile.
>
> Should already be fixed in linux-next, right?

I assume you mean:
commit 537896fabed11f8d9788886d1aacdb977213c7b3
Author: Sasha Levin <[email protected]>
Date: Mon Jan 18 14:54:53 2021 -0500

kbuild: give the SUBLEVEL more room in KERNEL_VERSION

That would IMO break userspace as definition of kernel version has
changed. And that one is UAPI/ABI (see
include/generated/uapi/linux/version.h) as Jari writes. For example will
glibc still work:
http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/configure.ac;h=13abda0a51484c5951ffc6d718aa36b72f3a9429;hb=HEAD#l14

? Or gcc 10 (11 will have this differently):
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf.c;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l165

and

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf-helpers.h;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l53

It might work somewhere, but there are a lot of (X * 65536 + Y * 256 +
Z) assumptions all around the world. So this doesn't look like a good idea.

thanks,
--
js
suse labs

2021-02-05 00:59:54

by David Laight

Subject: RE: Kernel version numbers after 4.9.255 and 4.4.255

From: Jiri Slaby
> Sent: 04 February 2021 11:01
>
> On 04. 02. 21, 9:51, Greg Kroah-Hartman wrote:
> >> It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> >> assumptions all around the world. So this doesn't look like a good idea.
> >
> > Ok, so what happens if we "wrap"? What will break with that? At first
> > glance, I can't see anything as we keep the padding the same, and our
> > build scripts seem to pick the number up from the Makefile and treat it
> > like a string.
> >
> > It's only the crazy out-of-tree kernel stuff that wants to do minor
> > version checks that might go boom. And frankly, I'm not all that
> > concerned if they have problems :)
>
> Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> patchlevel. And that might be a problem. So we might need to update the
> header generation using e.g. "sublevel & 0xff" (wrap around) or
> "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
>
> In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.

A full wrap might catch checks for less than (say) 4.4.2 which
might be present to avoid very early versions.
So sticking at 255 or wrapping onto (say) 128 to 255 might be better.

I'm actually intrigued about how often you expect people to update
systems running these LTS kernels.
At a release every week it takes 5 years to run out of sublevels.
No one is going to reboot a server anywhere near that often.

David
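
One possible reading of "wrapping onto (say) 128 to 255" is sketched below. The exact mapping is an assumption, since the message does not spell it out, and the function name is hypothetical.

```c
/*
 * Sublevels 0..255 are kept as they are. Larger sublevels cycle through
 * 128..255, so they never collide with "very early" sublevels such as 2.
 */
static inline unsigned int sublevel_wrap_upper_half(unsigned int sublevel)
{
    if (sublevel <= 255)
        return sublevel;
    return 128 + (sublevel - 256) % 128;
}
```

Under this mapping a check such as "sublevel < 2" can never trigger for a late release, but the encoded value still runs backwards at every wrap (for example, 255 is followed by 128).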

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2021-02-05 06:54:23

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Thu, Feb 04, 2021 at 09:19:33PM +0100, Christoph Biedl wrote:
> David Laight wrote...
>
> > A full wrap might catch checks for less than (say) 4.4.2 which
> > might be present to avoid very early versions.
> > So sticking at 255 or wrapping onto (say) 128 to 255 might be better.
>
> Hitting such version checks still might happen, though.

By who? For what?

> Also, any wrapping introduces a real risk package managers will see
> version numbers running backwards and therefore will refrain from
> installing an actually newer version.

package managers do not take the version from this macro, do they? If
they do, please show me which ones.

thanks,

greg k-h

2021-02-05 09:12:45

by Pavel Machek

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Thu 2021-02-04 09:51:03, Greg Kroah-Hartman wrote:
> On Thu, Feb 04, 2021 at 08:26:04AM +0100, Jiri Slaby wrote:
> > On 04. 02. 21, 7:20, Greg Kroah-Hartman wrote:
> > > On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
> > > > Greg,
> > > > I hope that your linux kernel release scripts are
> > > > implemented in a way that understands that PATCHLEVEL= and
> > > > SUBLEVEL= numbers in top-level linux Makefile are encoded
> > > > as 8-bit numbers for LINUX_VERSION_CODE and
> > > > KERNEL_VERSION() macros, and must stay in range 0...255.
> > > > These 8-bit limits are hardcoded in both kernel source and
> > > > userspace ABI.
> > > >
> > > > After 4.9.255 and 4.4.255, your scripts should be
> > > > incrementing a number in EXTRAVERSION= in top-level
> > > > linux Makefile.
> > >
> > > Should already be fixed in linux-next, right?
> >
> > I assume you mean:
> > commit 537896fabed11f8d9788886d1aacdb977213c7b3
> > Author: Sasha Levin <[email protected]>
> > Date: Mon Jan 18 14:54:53 2021 -0500
> >
> > kbuild: give the SUBLEVEL more room in KERNEL_VERSION
> >
> > That would IMO break userspace as definition of kernel version has changed.
> > And that one is UAPI/ABI (see include/generated/uapi/linux/version.h) as
> > Jari writes. For example will glibc still work:
> > http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/configure.ac;h=13abda0a51484c5951ffc6d718aa36b72f3a9429;hb=HEAD#l14
> >
> > ? Or gcc 10 (11 will have this differently):
> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf.c;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l165
> >
> > and
> >
> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf-helpers.h;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l53
>
> Ugh, I thought this was an internal representation, not an external one
> :(
>
> > It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> > assumptions all around the world. So this doesn't look like a good idea.
>
> Ok, so what happens if we "wrap"? What will break with that? At first
> glance, I can't see anything as we keep the padding the same, and our
> build scripts seem to pick the number up from the Makefile and treat it
> like a string.
>
> It's only the crazy out-of-tree kernel stuff that wants to do minor
> version checks that might go boom. And frankly, I'm not all that
> concerned if they have problems :)
>
> So, let's leave it alone and just see what happens!

Yeah, stable is a great place to do the experiments. Not that this is
the first time :-(.
Pavel
--
http://www.livejournal.com/~pavelmachek



2021-02-05 09:39:21

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Fri, Feb 05, 2021 at 10:06:59AM +0100, Pavel Machek wrote:
> On Thu 2021-02-04 09:51:03, Greg Kroah-Hartman wrote:
> > On Thu, Feb 04, 2021 at 08:26:04AM +0100, Jiri Slaby wrote:
> > > On 04. 02. 21, 7:20, Greg Kroah-Hartman wrote:
> > > > On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
> > > > > Greg,
> > > > > I hope that your linux kernel release scripts are
> > > > > implemented in a way that understands that PATCHLEVEL= and
> > > > > SUBLEVEL= numbers in top-level linux Makefile are encoded
> > > > > as 8-bit numbers for LINUX_VERSION_CODE and
> > > > > KERNEL_VERSION() macros, and must stay in range 0...255.
> > > > > These 8-bit limits are hardcoded in both kernel source and
> > > > > userspace ABI.
> > > > >
> > > > > After 4.9.255 and 4.4.255, your scripts should be
> > > > > incrementing a number in EXTRAVERSION= in top-level
> > > > > linux Makefile.
> > > >
> > > > Should already be fixed in linux-next, right?
> > >
> > > I assume you mean:
> > > commit 537896fabed11f8d9788886d1aacdb977213c7b3
> > > Author: Sasha Levin <[email protected]>
> > > Date: Mon Jan 18 14:54:53 2021 -0500
> > >
> > > kbuild: give the SUBLEVEL more room in KERNEL_VERSION
> > >
> > > That would IMO break userspace as definition of kernel version has changed.
> > > And that one is UAPI/ABI (see include/generated/uapi/linux/version.h) as
> > > Jari writes. For example will glibc still work:
> > > http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/configure.ac;h=13abda0a51484c5951ffc6d718aa36b72f3a9429;hb=HEAD#l14
> > >
> > > ? Or gcc 10 (11 will have this differently):
> > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf.c;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l165
> > >
> > > and
> > >
> > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/bpf/bpf-helpers.h;hb=ee5c3db6c5b2c3332912fb4c9cfa2864569ebd9a#l53
> >
> > Ugh, I thought this was an internal representation, not an external one
> > :(
> >
> > > It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> > > assumptions all around the world. So this doesn't look like a good idea.
> >
> > Ok, so what happens if we "wrap"? What will break with that? At first
> > glance, I can't see anything as we keep the padding the same, and our
> > build scripts seem to pick the number up from the Makefile and treat it
> > like a string.
> >
> > It's only the crazy out-of-tree kernel stuff that wants to do minor
> > version checks that might go boom. And frankly, I'm not all that
> > concerned if they have problems :)
> >
> > So, let's leave it alone and just see what happens!
>
> Yeah, stable is a great place to do the experiments. Not that this is
> the first time :-(.

How else can we "test this out"?

Should I do an "empty" release of 4.4.256 and see if anyone complains?

Any other ideas are gladly welcome...

thanks,

greg k-h

2021-02-05 17:53:16

by Tony Battersby

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On 2/4/21 6:00 AM, Jiri Slaby wrote:
> Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> patchlevel. And that might be a problem. So we might need to update the
> header generation using e.g. "sublevel & 0xff" (wrap around) or
> "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
>
> In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.

My preference would be to be monotonic and get stuck at 255 to avoid
breaking out-of-tree modules.  If needed, add another macro that
increases the number of bits that can be used to check for sublevels >
255, while keeping the old macros for compatibility reasons.  Since
sublevels > 255 have never existed before, any such checks must be
newly-added, so they can be required to use the new macros.

I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
range of kernels from 3.18 (gasp!) up to the latest when bisecting,
benchmarking, or debugging problems.  And I use a number of out-of-tree
modules that rely on the KERNEL_VERSION to make everything work.  Some
out-of-tree modules like an updated igb network driver might be needed
to make it possible to test the old kernel on particular hardware.

In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
locally to make out-of-tree modules work.  Or else just not test kernels
with sublevel > 255.

Tony Battersby
Cybernetics
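
The proposal above (clamp the old macro, add a wider one for new checks) could look roughly like the sketch below. KERNEL_VERSION_WIDE, its name, and its bit layout are assumptions made for illustration; no such macro is named in the thread.

```c
#include <stdint.h>

/* Existing-style macro, clamped so that old comparisons stay monotonic. */
#define KERNEL_VERSION(a, b, c) \
    (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/*
 * Hypothetical new macro (name and layout are assumptions): 16 bits of
 * sublevel, for newly written checks that must tell 4.4.255 from 4.4.256.
 */
#define KERNEL_VERSION_WIDE(a, b, c) \
    (((uint32_t)(a) << 24) + ((uint32_t)(b) << 16) + (uint32_t)(c))
```

Existing out-of-tree code keeps compiling against KERNEL_VERSION unchanged, and only code that cares about sublevels above 255 would have to opt in to the wider form.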


2021-02-05 18:48:39

by Pavel Machek

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

Hi!

> > > Ugh, I thought this was an internal representation, not an external one
> > > :(
> > >
> > > > It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> > > > assumptions all around the world. So this doesn't look like a good idea.
> > >
> > > Ok, so what happens if we "wrap"? What will break with that? At first
> > > glance, I can't see anything as we keep the padding the same, and our
> > > build scripts seem to pick the number up from the Makefile and treat it
> > > like a string.
> > >
> > > It's only the crazy out-of-tree kernel stuff that wants to do minor
> > > version checks that might go boom. And frankly, I'm not all that
> > > concerned if they have problems :)
> > >
> > > So, let's leave it alone and just see what happens!
> >
> > Yeah, stable is a great place to do the experiments. Not that this is
> > the first time :-(.
>
> How else can we "test this out"?
>
> Should I do an "empty" release of 4.4.256 and see if anyone complains?

It seems that would be a bad idea, as it would cause problems when stuff
is compiled on 4.4.256, not simply when it is run.

Sasha's patch seems like one option that could work.

Even safer option is to switch to 4.4.255-st1, 4.4.255-st2 ... scheme.

Best regards,
Pavel
--
http://www.livejournal.com/~pavelmachek



2021-02-05 18:53:03

by Mauro Carvalho Chehab

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Fri, 5 Feb 2021 12:31:05 -0500
Tony Battersby <[email protected]> wrote:

> On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > patchlevel. And that might be a problem. So we might need to update the
> > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
> >
> > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
>
> My preference would be to be monotonic and get stuck at 255 to avoid
> breaking out-of-tree modules.  If needed, add another macro that
> increases the number of bits that can be used to check for sublevels >
> 255, while keeping the old macros for compatibility reasons.  Since
> sublevels > 255 have never existed before, any such checks must be
> newly-added, so they can be required to use the new macros.
>
> I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
> range of kernels from 3.18 (gasp!) up to the latest when bisecting,
> benchmarking, or debugging problems.  And I use a number of out-of-tree
> modules that rely on the KERNEL_VERSION to make everything work.  Some
> out-of-tree modules like an updated igb network driver might be needed
> to make it possible to test the old kernel on particular hardware.
>
> In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
> locally to make out-of-tree modules work.  Or else just not test kernels
> with sublevel > 255.

Overflowing LINUX_VERSION_CODE breaks media applications. Several media
APIs have an ioctl that returns the Kernel version:

drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;

Those can be used by applications in order to enable some features that
are available only after certain Kernel versions.

This is somewhat deprecated, in favor of the usage of some other
capability fields, but for instance, the v4l2-compliance userspace tool
has two such checks:

utils/v4l2-compliance/v4l2-compliance.cpp
640: fail_on_test((vcap.version >> 16) < 3);
641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards

As far as I remember, all such checks are against major.minor. So,
something like:

sublevel = (sublevel > 0xff) ? 0xff : sublevel;

inside KERNEL_VERSION macro should fix such regression at -stable.

Thanks,
Mauro
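
To make the effect on such userspace checks concrete, here is a small sketch of how an application might split a reported version code back into its fields. The struct and function names are hypothetical; the layout is the 8-bit one assumed throughout the thread.

```c
#include <stdint.h>

struct kver {
    unsigned int major, minor, sublevel;
};

/* Split a version code into fields, the way a userspace check might. */
static inline struct kver decode_version(uint32_t code)
{
    struct kver v = { code >> 16, (code >> 8) & 0xff, code & 0xff };
    return v;
}

/* Encoding without any guard: a sublevel above 255 spills into minor. */
static inline uint32_t encode_unclamped(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << 16) + (b << 8) + c;
}

/* Encoding with the clamp suggested above. */
static inline uint32_t encode_clamped(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << 16) + (b << 8) + (c > 0xff ? 0xff : c);
}
```

Decoding encode_unclamped(4, 4, 256) yields 4.5.0, so a major.minor comparison sees the wrong minor version. Decoding encode_clamped(4, 4, 256) yields 4.4.255, which keeps major.minor checks correct.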

2021-02-06 07:22:33

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Fri, Feb 05, 2021 at 07:11:05PM +0100, Mauro Carvalho Chehab wrote:
> On Fri, 5 Feb 2021 12:31:05 -0500
> Tony Battersby <[email protected]> wrote:
>
> > On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > > patchlevel. And that might be a problem. So we might need to update the
> > > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > > "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
> > >
> > > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
> >
> > My preference would be to be monotonic and get stuck at 255 to avoid
> > > breaking out-of-tree modules.  If needed, add another macro that
> > > increases the number of bits that can be used to check for sublevels >
> > > 255, while keeping the old macros for compatibility reasons.  Since
> > > sublevels > 255 have never existed before, any such checks must be
> > > newly-added, so they can be required to use the new macros.
> > >
> > > I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
> > > range of kernels from 3.18 (gasp!) up to the latest when bisecting,
> > > benchmarking, or debugging problems.  And I use a number of out-of-tree
> > > modules that rely on the KERNEL_VERSION to make everything work.  Some
> > > out-of-tree modules like an updated igb network driver might be needed
> > > to make it possible to test the old kernel on particular hardware.
> > >
> > > In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
> > > locally to make out-of-tree modules work.  Or else just not test kernels
> > with sublevel > 255.
>
> Overflowing LINUX_VERSION_CODE breaks media applications. Several media
> APIs have an ioctl that returns the Kernel version:
>
> drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
> drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
> drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
> drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;

This always struck me as odd, because why can't they just use the
uname(2) syscall instead?

> Those can be used by applications in order to enable some features that
> are available only after certain Kernel versions.
>
> This is somewhat deprecated, in favor of the usage of some other
> capability fields, but for instance, the v4l2-compliance userspace tool
> has two such checks:
>
> utils/v4l2-compliance/v4l2-compliance.cpp
> 640: fail_on_test((vcap.version >> 16) < 3);
> 641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
>
> As far as I remember, all such checks are against major.minor. So,
> something like:
>
> sublevel = (sublevel > 0xff) ? 0xff : sublevel;
>
> inside KERNEL_VERSION macro should fix such regression at -stable.

I think if we clamp KERNEL_VERSION at 255 we should be fine for anyone
checking this type of thing. Sasha has posted patches to do this.

thanks,

greg k-h
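
For comparison, the uname(2) route mentioned above could look roughly like this. It is a sketch only: the struct and function names are hypothetical, and it assumes the release string starts with three dot-separated numbers, which the syscall itself does not guarantee.

```c
#include <stdio.h>
#include <sys/utsname.h>

struct krelease {
    int ok;                              /* 1 if three numbers were parsed */
    unsigned int major, minor, sublevel;
};

/* Parse the leading "major.minor.sublevel" of a release string. */
static inline struct krelease parse_release(const char *release)
{
    struct krelease r = { 0, 0, 0, 0 };

    if (sscanf(release, "%u.%u.%u", &r.major, &r.minor, &r.sublevel) == 3)
        r.ok = 1;
    return r;
}

/* Ask the running kernel via uname(2); ok stays 0 on any failure. */
static inline struct krelease running_kernel_release(void)
{
    struct utsname u;
    struct krelease none = { 0, 0, 0, 0 };

    if (uname(&u) != 0)
        return none;
    return parse_release(u.release);
}
```

Because the release is a string, a sublevel of 256 parses as 256 rather than being squeezed into 8 bits, and any suffix after the third number is simply ignored by this parser.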

2021-02-06 07:23:52

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Fri, Feb 05, 2021 at 12:31:05PM -0500, Tony Battersby wrote:
> On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > patchlevel. And that might be a problem. So we might need to update the
> > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
> >
> > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
>
> My preference would be to be monotonic and get stuck at 255 to avoid
> breaking out-of-tree modules.

I really do not care about out-of-tree modules sorry, as there's nothing
we can do about them. And internal kernel apis are always changing,
even in stable/lts releases, so changing this type of thing for them
should not be a big deal as maintainers of this type of code always have
to do that.

thanks,

greg k-h

2021-02-06 07:25:35

by Greg Kroah-Hartman

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Fri, Feb 05, 2021 at 07:44:12PM +0100, Pavel Machek wrote:
> Hi!
>
> > > > Ugh, I thought this was an internal representation, not an external one
> > > > :(
> > > >
> > > > > It might work somewhere, but there are a lot of (X * 65536 + Y * 256 + Z)
> > > > > assumptions all around the world. So this doesn't look like a good idea.
> > > >
> > > > Ok, so what happens if we "wrap"? What will break with that? At first
> > > > glance, I can't see anything as we keep the padding the same, and our
> > > > build scripts seem to pick the number up from the Makefile and treat it
> > > > like a string.
> > > >
> > > > It's only the crazy out-of-tree kernel stuff that wants to do minor
> > > > version checks that might go boom. And frankly, I'm not all that
> > > > concerned if they have problems :)
> > > >
> > > > So, let's leave it alone and just see what happens!
> > >
> > > Yeah, stable is a great place to do the experiments. Not that this is
> > > the first time :-(.
> >
> > How else can we "test this out"?
> >
> > Should I do an "empty" release of 4.4.256 and see if anyone complains?
>
> It seems that would be a bad idea, as it would cause problems when stuff
> is compiled on 4.4.256, not simply when it is run.
>
> Sasha's patch seems like one option that could work.
>
> Even safer option is to switch to 4.4.255-st1, 4.4.255-st2 ... scheme.

Using EXTRAVERSION would work, but it is effectively the same thing, as
nothing exports that to userspace through the LINUX_VERSION_CODE macro.

So clamping the version like Sasha's patches seems to be the best
solution.

thanks,

greg k-h

2021-02-06 09:25:51

by Mauro Carvalho Chehab

Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Sat, 6 Feb 2021 08:20:45 +0100
Greg Kroah-Hartman <[email protected]> wrote:

> On Fri, Feb 05, 2021 at 07:11:05PM +0100, Mauro Carvalho Chehab wrote:
> > On Fri, 5 Feb 2021 12:31:05 -0500
> > Tony Battersby <[email protected]> wrote:
> >
> > > On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > > > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > > > patchlevel. And that might be a problem. So we might need to update the
> > > > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > > > "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
> > > >
> > > > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
> > >
> > > My preference would be to be monotonic and get stuck at 255 to avoid
> > > breaking out-of-tree modules.  If needed, add another macro that
> > > increases the number of bits that can be used to check for sublevels >
> > > 255, while keeping the old macros for compatibility reasons.  Since
> > > sublevels > 255 have never existed before, any such checks must be
> > > newly-added, so they can be required to use the new macros.
> > >
> > > I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
> > > range of kernels from 3.18 (gasp!) up to the latest when bisecting,
> > > benchmarking, or debugging problems.  And I use a number of out-of-tree
> > > modules that rely on the KERNEL_VERSION to make everything work.  Some
> > > out-of-tree modules like an updated igb network driver might be needed
> > > to make it possible to test the old kernel on particular hardware.
> > >
> > > In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
> > > locally to make out-of-tree modules work.  Or else just not test kernels
> > > with sublevel > 255.
> >
> > Overflowing LINUX_VERSION_CODE breaks media applications. Several media
> > APIs have an ioctl that returns the Kernel version:
> >
> > drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
> > drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
> > drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
> > drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;
>
> This always struck me as odd, because why can't they just use the
> uname(2) syscall instead?

I agree that this is odd on upstream Kernels.

On backported ones, this should be filled with the version of the V4L2 core.

We maintain a tree that allows running older Kernels with the latest V4L2
media drivers and subsystem. In that tree, there's a patch that replaces
the LINUX_VERSION_CODE macro with V4L2_VERSION:

https://git.linuxtv.org/media_build.git/tree/backports/api_version.patch

The build logic there picks up the version of the V4L2 core being
built. So, right now, it is filled with:

#define V4L2_VERSION 330496 /* 0x050b00 */

In other words, even if you run the backported driver on, let's say, Kernel
4.8, those calls will report that the driver's version is from Kernel
5.11.
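
To make the encoding concrete, here is a minimal sketch that unpacks such a
value. It is not taken from media_build; the struct and function names are
made up for illustration, and it assumes the usual
(major << 16) + (minor << 8) + sublevel layout:

```c
#include <stdio.h>

/* Illustrative only: the three fields packed into a version code. */
struct kver {
	unsigned int major;
	unsigned int minor;
	unsigned int sublevel;
};

/* Split a packed code, assuming 8 bits each for minor and sublevel. */
static struct kver kver_decode(unsigned int code)
{
	struct kver v = {
		.major = code >> 16,
		.minor = (code >> 8) & 0xff,
		.sublevel = code & 0xff,
	};
	return v;
}

/* Print a packed code as "major.minor.sublevel". */
static void kver_print(unsigned int code)
{
	struct kver v = kver_decode(code);

	printf("%u.%u.%u\n", v.major, v.minor, v.sublevel);
}
```

With this layout, 330496 (0x050b00) decodes to major 5, minor 11 (0x0b),
sublevel 0.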

-

For a bit of history behind those: they came together with the
V4L version 2 API, developed during Kernel 2.5.x and merged in Kernel
2.6.0.

When that API was originally introduced, this field was meant to
contain the driver's version. The problem is that people used to change
the drivers (even with major rewrites) without changing their version.

We ended up standardizing it everywhere: the field is filled in by the media
core, instead of at the driver level, using the Kernel version.

This way, developers don't need to worry about keeping this
updated as the subsystem evolves.

Over time, we also improved the V4L2 API so that applications can
directly detect the core/driver functionality without needing
to rely on such fields. So, I guess recent versions of most open source
applications don't use it nowadays.

> > Those can be used by applications in order to enable some features that
> > are available only after certain Kernel versions.
> >
> > This is somewhat deprecated, in favor of the usage of some other
> > capability fields, but for instance, the v4l2-compliance userspace tool
> > have two such checks:
> >
> > utils/v4l2-compliance/v4l2-compliance.cpp
> > 640: fail_on_test((vcap.version >> 16) < 3);
> > 641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
> >
> > As far as I remember, all such checks are against major.minor. So,
> > something like:
> >
> > sublevel = (sublevel > 0xff) ? 0xff : sublevel;
> >
> > inside KERNEL_VERSION macro should fix such regression at -stable.
>
> I think if we clamp KERNEL_VERSION at 255 we should be fine for anyone
> checking this type of thing. Sasha has posted patches to do this.

Yes, this should be enough.

As far as I remember, when opensource apps use the version from the API,
since Kernel 3.0, they always check only for major.minor.

So, the only problem with those APIs is due to overflows. Setting
sublevel to any value between 0-255 should work, from V4L2 API
standpoint.
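
As a rough illustration of the overflow and of the clamping discussed above
(a sketch only: the two macro names are made up here, and this is not the
actual patch that was posted):

```c
/* Unclamped form: a sublevel above 255 carries into the minor field. */
#define KERNEL_VERSION_UNCLAMPED(a, b, c) \
	(((a) << 16) + ((b) << 8) + (c))

/* Clamped form: sublevel saturates at 255, so the code stays monotonic. */
#define KERNEL_VERSION_CLAMPED(a, b, c) \
	(((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))
```

With the unclamped form, 4.4.256 produces the same code as 4.5.0 (0x040500).
With the clamped form it stays at 0x0404ff, so a major.minor check such as
"version >> 8" still sees 4.4.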

Thanks,
Mauro

2021-02-06 09:31:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Sat, Feb 06, 2021 at 10:24:02AM +0100, Mauro Carvalho Chehab wrote:
> On Sat, 6 Feb 2021 08:20:45 +0100
> Greg Kroah-Hartman <[email protected]> wrote:
>
> > On Fri, Feb 05, 2021 at 07:11:05PM +0100, Mauro Carvalho Chehab wrote:
> > > On Fri, 5 Feb 2021 12:31:05 -0500
> > > Tony Battersby <[email protected]> wrote:
> > >
> > > > On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > > > > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > > > > patchlevel. And that might be a problem. So we might need to update the
> > > > > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > > > > "sublevel > 255 : 255 : sublevel" (be monotonic and get stuck at 255).
> > > > >
> > > > > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
> > > >
> > > > My preference would be to be monotonic and get stuck at 255 to avoid
> > > > > breaking out-of-tree modules.  If needed, add another macro that
> > > > > increases the number of bits that can be used to check for sublevels >
> > > > > 255, while keeping the old macros for compatibility reasons.  Since
> > > > sublevels > 255 have never existed before, any such checks must be
> > > > newly-added, so they can be required to use the new macros.
> > > >
> > > > I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
> > > > range of kernels from 3.18 (gasp!) up to the latest when bisecting,
> > > > > benchmarking, or debugging problems.  And I use a number of out-of-tree
> > > > > modules that rely on the KERNEL_VERSION to make everything work.  Some
> > > > out-of-tree modules like an updated igb network driver might be needed
> > > > to make it possible to test the old kernel on particular hardware.
> > > >
> > > > In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
> > > > > locally to make out-of-tree modules work.  Or else just not test kernels
> > > > with sublevel > 255.
> > >
> > > Overflowing LINUX_VERSION_CODE breaks media applications. Several media
> > > APIs have an ioctl that returns the Kernel version:
> > >
> > > drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
> > > drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
> > > drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
> > > drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;
> >
> > This always struck me as odd, because why can't they just use the
> > uname(2) syscall instead?
>
> I agree that this is odd on upstream Kernels.
>
> On backported ones, this should be filled with the version of the V4L2 core.
>
> We maintain a tree that allows running older Kernels with the latest V4L2
> media drivers and subsystem. On such tree, there's a patch that replaces
> LINUX_VERSION_CODE macro to V4L2_VERSION:
>
> https://git.linuxtv.org/media_build.git/tree/backports/api_version.patch
>
> There's a logic here which gets the version of the V4L2 used at the
> build. So, right now, it is filled with:
>
> #define V4L2_VERSION 330496 /* 0x050b00 */
>
> In other words, even if you run the backported driver on, let's say, Kernel
> 4.8, those calls will tell that the driver's version is from Kernel
> 5.11.

That too, is crazy and insane :)

> Providing a little of history behind those, this came together with the
> V4L version 2 API developed during Kernel 2.5.x and merged at Kernel
> 2.6.0.
>
> When such API was originally introduced, this field was meant to
> contain the driver's version. The problem is that people used to change
> the drivers (even with major rewrites) without changing its version.
>
> We ended by standardizing it everywhere, filling those at the media core,
> instead of doing it at driver's level - and using the Kernel version.
>
> This way, developers won't need to be concerned of keeping this
> updated as the subsystem evolves.
>
> With time, we also improved the V4L2 API in a way that applications can
> be able to directly detect the core/driver functionalities without needing
> to rely on such fields. So, I guess recent versions of most open source
> applications nowadays don't use it.

Yes, driver "version" means nothing, so functionality is the correct way
to handle this.

Any chance you all can just drop the kernel version stuff and just
report a static number that never goes up to allow people to use the
correct api for new stuff? Pick a "modern" number, like 5.10 and leave
it there for forever.

> > > Those can be used by applications in order to enable some features that
> > > are available only after certain Kernel versions.
> > >
> > > This is somewhat deprecated, in favor of the usage of some other
> > > capability fields, but for instance, the v4l2-compliance userspace tool
> > > have two such checks:
> > >
> > > utils/v4l2-compliance/v4l2-compliance.cpp
> > > 640: fail_on_test((vcap.version >> 16) < 3);
> > > 641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
> > >
> > > As far as I remember, all such checks are against major.minor. So,
> > > something like:
> > >
> > > sublevel = (sublevel > 0xff) ? 0xff : sublevel;
> > >
> > > inside KERNEL_VERSION macro should fix such regression at -stable.
> >
> > I think if we clamp KERNEL_VERSION at 255 we should be fine for anyone
> > checking this type of thing. Sasha has posted patches to do this.
>
> Yes, this should be enough.
>
> As far as I remember, when opensource apps use the version from the API,
> since Kernel 3.0, they always check only for major.minor.
>
> So, the only problem with those APIs is due to overflows. Setting
> sublevel to any value between 0-255 should work, from V4L2 API
> standpoint.

Great, thanks for checking.

greg k-h

2021-02-06 09:51:14

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Sat, 6 Feb 2021 10:29:10 +0100
Greg Kroah-Hartman <[email protected]> wrote:

> On Sat, Feb 06, 2021 at 10:24:02AM +0100, Mauro Carvalho Chehab wrote:
> > On Sat, 6 Feb 2021 08:20:45 +0100
> > Greg Kroah-Hartman <[email protected]> wrote:
> >
> > > On Fri, Feb 05, 2021 at 07:11:05PM +0100, Mauro Carvalho Chehab wrote:
> > > > On Fri, 5 Feb 2021 12:31:05 -0500
> > > > Tony Battersby <[email protected]> wrote:
> > > >
> > > > > On 2/4/21 6:00 AM, Jiri Slaby wrote:
> > > > > > Agreed. But currently, sublevel won't "wrap", it will "overflow" to
> > > > > > patchlevel. And that might be a problem. So we might need to update the
> > > > > > header generation using e.g. "sublevel & 0xff" (wrap around) or
> > > > > > "sublevel > 255 : 255 : sublevel" (be monotonic and get stuck at 255).
> > > > > >
> > > > > > In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
> > > > >
> > > > > My preference would be to be monotonic and get stuck at 255 to avoid
> > > > > breaking out-of-tree modules.  If needed, add another macro that
> > > > > increases the number of bits that can be used to check for sublevels >
> > > > > 255, while keeping the old macros for compatibility reasons.  Since
> > > > > sublevels > 255 have never existed before, any such checks must be
> > > > > newly-added, so they can be required to use the new macros.
> > > > >
> > > > > I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
> > > > > range of kernels from 3.18 (gasp!) up to the latest when bisecting,
> > > > > benchmarking, or debugging problems.  And I use a number of out-of-tree
> > > > > modules that rely on the KERNEL_VERSION to make everything work.  Some
> > > > > out-of-tree modules like an updated igb network driver might be needed
> > > > > to make it possible to test the old kernel on particular hardware.
> > > > >
> > > > > In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
> > > > > locally to make out-of-tree modules work.  Or else just not test kernels
> > > > > with sublevel > 255.
> > > >
> > > > Overflowing LINUX_VERSION_CODE breaks media applications. Several media
> > > > APIs have an ioctl that returns the Kernel version:
> > > >
> > > > drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
> > > > drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
> > > > drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
> > > > drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;
> > >
> > > This always struck me as odd, because why can't they just use the
> > > uname(2) syscall instead?
> >
> > I agree that this is odd on upstream Kernels.
> >
> > On backported ones, this should be filled with the version of the V4L2 core.
> >
> > We maintain a tree that allows running older Kernels with the latest V4L2
> > media drivers and subsystem. On such tree, there's a patch that replaces
> > LINUX_VERSION_CODE macro to V4L2_VERSION:
> >
> > https://git.linuxtv.org/media_build.git/tree/backports/api_version.patch
> >
> > There's a logic here which gets the version of the V4L2 used at the
> > build. So, right now, it is filled with:
> >
> > #define V4L2_VERSION 330496 /* 0x050b00 */
> >
> > In other words, even if you run the backported driver on, let's say, Kernel
> > 4.8, those calls will tell that the driver's version is from Kernel
> > 5.11.
>
> That too, is crazy and insane :)
>
> > Providing a little of history behind those, this came together with the
> > V4L version 2 API developed during Kernel 2.5.x and merged at Kernel
> > 2.6.0.
> >
> > When such API was originally introduced, this field was meant to
> > contain the driver's version. The problem is that people used to change
> > the drivers (even with major rewrites) without changing its version.
> >
> > We ended by standardizing it everywhere, filling those at the media core,
> > instead of doing it at driver's level - and using the Kernel version.
> >
> > This way, developers won't need to be concerned of keeping this
> > updated as the subsystem evolves.
> >
> > With time, we also improved the V4L2 API in a way that applications can
> > be able to directly detect the core/driver functionalities without needing
> > to rely on such fields. So, I guess recent versions of most open source
> > applications nowadays don't use it.
>
> Yes, driver "version" means nothing, so functionality is the correct way
> to handle this.
>
> Any chance you all can just drop the kernel version stuff and just
> report a static number that never goes up to allow people to use the
> correct api for new stuff? Pick a "modern" number, like 5.10 and leave
> it there for forever.

Good question. I like the idea of keeping it fixed, marking those fields
as DEPRECATED in the uAPI documentation.

However, at least the v4l2-compliance tool (used for V4L2
development) currently requires it:

	if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
		node->might_support_cache_hints = true;

Not sure if uname would work there, or if we would need to use some
Kconfig symbol to only return the real version on debug Kernels.
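
For reference, a userspace check based on uname(2) could look roughly like
the sketch below. The helper names are hypothetical and this is not code
from v4l2-compliance. Note that it reports the running Kernel, not the
version of a backported V4L2 core, so the two approaches are not equivalent
on backported trees:

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Parse "major.minor" from a release string such as "5.9.0-rc1".
 * Returns 0 on success, -1 if the string does not start that way. */
static int parse_release(const char *release,
			 unsigned int *major, unsigned int *minor)
{
	if (sscanf(release, "%u.%u", major, minor) != 2)
		return -1;
	return 0;
}

/* Pack the running Kernel's major.minor like a version code with
 * sublevel 0. Returns 0 if the release string cannot be read. */
static unsigned int running_major_minor_code(void)
{
	struct utsname u;
	unsigned int major, minor;

	if (uname(&u) != 0 || parse_release(u.release, &major, &minor) != 0)
		return 0;
	return (major << 16) + (minor << 8);
}
```

A check like "running_major_minor_code() >= 0x050900" would then stand in
for the vcap.version comparison.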

Hans,

What do you think?


>
> > > > Those can be used by applications in order to enable some features that
> > > > are available only after certain Kernel versions.
> > > >
> > > > This is somewhat deprecated, in favor of the usage of some other
> > > > capability fields, but for instance, the v4l2-compliance userspace tool
> > > > have two such checks:
> > > >
> > > > utils/v4l2-compliance/v4l2-compliance.cpp
> > > > 640: fail_on_test((vcap.version >> 16) < 3);
> > > > 641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
> > > >
> > > > As far as I remember, all such checks are against major.minor. So,
> > > > something like:
> > > >
> > > > sublevel = (sublevel > 0xff) ? 0xff : sublevel;
> > > >
> > > > inside KERNEL_VERSION macro should fix such regression at -stable.
> > >
> > > I think if we clamp KERNEL_VERSION at 255 we should be fine for anyone
> > > checking this type of thing. Sasha has posted patches to do this.
> >
> > Yes, this should be enough.
> >
> > As far as I remember, when opensource apps use the version from the API,
> > since Kernel 3.0, they always check only for major.minor.
> >
> > So, the only problem with those APIs is due to overflows. Setting
> > sublevel to any value between 0-255 should work, from V4L2 API
> > standpoint.
>
> Great, thanks for checking.
>
> greg k-h



Thanks,
Mauro

2021-02-06 10:21:36

by Hans Verkuil

[permalink] [raw]
Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On 06/02/2021 10:48, Mauro Carvalho Chehab wrote:
> On Sat, 6 Feb 2021 10:29:10 +0100
> Greg Kroah-Hartman <[email protected]> wrote:
>
>> On Sat, Feb 06, 2021 at 10:24:02AM +0100, Mauro Carvalho Chehab wrote:
>>> On Sat, 6 Feb 2021 08:20:45 +0100
>>> Greg Kroah-Hartman <[email protected]> wrote:
>>>
>>>> On Fri, Feb 05, 2021 at 07:11:05PM +0100, Mauro Carvalho Chehab wrote:
>>>>> On Fri, 5 Feb 2021 12:31:05 -0500
>>>>> Tony Battersby <[email protected]> wrote:
>>>>>
>>>>>> On 2/4/21 6:00 AM, Jiri Slaby wrote:
>>>>>>> Agreed. But currently, sublevel won't "wrap", it will "overflow" to
>>>>>>> patchlevel. And that might be a problem. So we might need to update the
>>>>>>> header generation using e.g. "sublevel & 0xff" (wrap around) or
>>>>>>> "sublevel > 255 ? 255 : sublevel" (be monotonic and get stuck at 255).
>>>>>>>
>>>>>>> In both LINUX_VERSION_CODE generation and KERNEL_VERSION proper.
>>>>>>
>>>>>> My preference would be to be monotonic and get stuck at 255 to avoid
>>>>>> breaking out-of-tree modules.  If needed, add another macro that
>>>>>> increases the number of bits that can be used to check for sublevels >
>>>>>> 255, while keeping the old macros for compatibility reasons.  Since
>>>>>> sublevels > 255 have never existed before, any such checks must be
>>>>>> newly-added, so they can be required to use the new macros.
>>>>>>
>>>>>> I do not run the 4.4/4.9 kernels usually, but I do sometimes test a wide
>>>>>> range of kernels from 3.18 (gasp!) up to the latest when bisecting,
>>>>>> benchmarking, or debugging problems.  And I use a number of out-of-tree
>>>>>> modules that rely on the KERNEL_VERSION to make everything work.  Some
>>>>>> out-of-tree modules like an updated igb network driver might be needed
>>>>>> to make it possible to test the old kernel on particular hardware.
>>>>>>
>>>>>> In the worst case, I can patch LINUX_VERSION_CODE and KERNEL_VERSION
>>>>>> locally to make out-of-tree modules work.  Or else just not test kernels
>>>>>> with sublevel > 255.
>>>>>
>>>>> Overflowing LINUX_VERSION_CODE breaks media applications. Several media
>>>>> APIs have an ioctl that returns the Kernel version:
>>>>>
>>>>> drivers/media/cec/core/cec-api.c: caps.version = LINUX_VERSION_CODE;
>>>>> drivers/media/mc/mc-device.c: info->media_version = LINUX_VERSION_CODE;
>>>>> drivers/media/v4l2-core/v4l2-ioctl.c: cap->version = LINUX_VERSION_CODE;
>>>>> drivers/media/v4l2-core/v4l2-subdev.c: cap->version = LINUX_VERSION_CODE;
>>>>
>>>> This always struck me as odd, because why can't they just use the
>>>> uname(2) syscall instead?
>>>
>>> I agree that this is odd on upstream Kernels.
>>>
>>> On backported ones, this should be filled with the version of the V4L2 core.
>>>
>>> We maintain a tree that allows running older Kernels with the latest V4L2
>>> media drivers and subsystem. On such tree, there's a patch that replaces
>>> LINUX_VERSION_CODE macro to V4L2_VERSION:
>>>
>>> https://git.linuxtv.org/media_build.git/tree/backports/api_version.patch
>>>
>>> There's a logic here which gets the version of the V4L2 used at the
>>> build. So, right now, it is filled with:
>>>
>>> #define V4L2_VERSION 330496 /* 0x050b00 */
>>>
>>> In other words, even if you run the backported driver on, let's say, Kernel
>>> 4.8, those calls will tell that the driver's version is from Kernel
>>> 5.11.
>>
>> That too, is crazy and insane :)
>>
>>> Providing a little of history behind those, this came together with the
>>> V4L version 2 API developed during Kernel 2.5.x and merged at Kernel
>>> 2.6.0.
>>>
>>> When such API was originally introduced, this field was meant to
>>> contain the driver's version. The problem is that people used to change
>>> the drivers (even with major rewrites) without changing its version.
>>>
>>> We ended by standardizing it everywhere, filling those at the media core,
>>> instead of doing it at driver's level - and using the Kernel version.
>>>
>>> This way, developers won't need to be concerned of keeping this
>>> updated as the subsystem evolves.
>>>
>>> With time, we also improved the V4L2 API in a way that applications can
>>> be able to directly detect the core/driver functionalities without needing
>>> to rely on such fields. So, I guess recent versions of most open source
>>> applications nowadays don't use it.
>>
>> Yes, driver "version" means nothing, so functionality is the correct way
>> to handle this.
>>
>> Any chance you all can just drop the kernel version stuff and just
>> report a static number that never goes up to allow people to use the
>> correct api for new stuff? Pick a "modern" number, like 5.10 and leave
>> it there for forever.
>
> Good question. I like the idea of keeping it fixed, marking those fields
> as DEPRECATED at the uAPI documentation.
>
> However, at least the v4l2-compliance tool (used for V4L2
> development) currently requires it:
>
> if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
> node->might_support_cache_hints = true;
>
> Not sure if uname would work there, or if we would need, to use some
> Kconfig symbol to only return the real version on debug Kernels.
>
> Hans,
>
> What do you think?

It could be replaced by uname, but if we fix the version number to something
>= 5.9 (which we will no doubt do), then there is no need to change anything here.

But I was wondering if it wouldn't make sense to create a variant of
LINUX_VERSION_CODE that ignores the sublevel and just always leaves that
at 0. In practice, media API changes only happen at new kernel releases and
not in the stable series (there might be rare exceptions to that, but I'm
not aware of any).

And while we are using capability flags a lot more these days to ensure
userspace can discover what is and what is not available, we never did a full
analysis of that and I feel a bit uncomfortable about fixing the version
number.

I see more usages of LINUX_VERSION_CODE in the kernel that look like they do
something similar to what the media subsystem does, and that probably also
do not need the SUBLEVEL.

A LINUX_MAJOR_MINOR_CODE define (or whatever you want to call it) would solve
this problem for us.
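
A minimal sketch of what such a define could look like. Both macro names
below are placeholders, nothing like this exists in the tree today:

```c
/* Hypothetical: build the code from major and minor only, so the
 * sublevel field is always 0 and can never overflow. */
#define MAJOR_MINOR_FROM_PARTS(a, b) (((a) << 16) + ((b) << 8))

/* Hypothetical: derive it from an existing packed code by masking off
 * the low 8 bits. This only helps if the code was not already
 * overflowed by a sublevel above 255. */
#define MAJOR_MINOR_FROM_CODE(code) ((code) & ~0xffu)
```

Building from the parts is the safer of the two, since masking an
already-overflowed code (4.4.256 encoded as 0x040500) would still yield 4.5.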

Regards,

Hans

>
>
>>
>>>>> Those can be used by applications in order to enable some features that
>>>>> are available only after certain Kernel versions.
>>>>>
>>>>> This is somewhat deprecated, in favor of the usage of some other
>>>>> capability fields, but for instance, the v4l2-compliance userspace tool
>>>>> have two such checks:
>>>>>
>>>>> utils/v4l2-compliance/v4l2-compliance.cpp
>>>>> 640: fail_on_test((vcap.version >> 16) < 3);
>>>>> 641: if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
>>>>>
>>>>> As far as I remember, all such checks are against major.minor. So,
>>>>> something like:
>>>>>
>>>>> sublevel = (sublevel > 0xff) ? 0xff : sublevel;
>>>>>
>>>>> inside KERNEL_VERSION macro should fix such regression at -stable.
>>>>
>>>> I think if we clamp KERNEL_VERSION at 255 we should be fine for anyone
>>>> checking this type of thing. Sasha has posted patches to do this.
>>>
>>> Yes, this should be enough.
>>>
>>> As far as I remember, when opensource apps use the version from the API,
>>> since Kernel 3.0, they always check only for major.minor.
>>>
>>> So, the only problem with those APIs is due to overflows. Setting
>>> sublevel to any value between 0-255 should work, from V4L2 API
>>> standpoint.
>>
>> Great, thanks for checking.
>>
>> greg k-h
>
>
>
> Thanks,
> Mauro
>

2021-02-06 11:30:00

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Kernel version numbers after 4.9.255 and 4.4.255

On Sat, 6 Feb 2021 11:18:15 +0100
Hans Verkuil <[email protected]> wrote:

> >> Yes, driver "version" means nothing, so functionality is the correct way
> >> to handle this.
> >>
> >> Any chance you all can just drop the kernel version stuff and just
> >> report a static number that never goes up to allow people to use the
> >> correct api for new stuff? Pick a "modern" number, like 5.10 and leave
> >> it there for forever.
> >
> > Good question. I like the idea of keeping it fixed, marking those fields
> > as DEPRECATED at the uAPI documentation.
> >
> > However, at least the v4l2-compliance tool (used for V4L2
> > development) currently requires it:
> >
> > if (vcap.version >= 0x050900) // Present from 5.9.0 onwards
> > node->might_support_cache_hints = true;
> >
> > Not sure if uname would work there, or if we would need, to use some
> > Kconfig symbol to only return the real version on debug Kernels.
> >
> > Hans,
> >
> > What do you think?
>
> It could be replaced by uname, but if we fix the version number to something
> >= 5.9 (which we will no doubt do), then there is no need to change anything here.

Sure, but needing to check for such a recent Kernel version probably
means that we should have an extra capability somewhere for the
feature that is enabled only if Kernel >= 5.9.

> But I was wondering if it wouldn't make sense to create a variant of
> LINUX_VERSION_CODE that ignored the sublevel and just always leaves that
> at 0. In practice, media API changes only happen at new kernel releases and
> not in the stable series (there might be rare exceptions to that, but I'm
> not aware of that).

I guess there were one or two exceptions: uAPI regressions that showed up
in a new version and were fixed at stable sublevel 1 or 2.

> And while we are using capability flags a lot more these days to ensure
> userspace can discover what is and what is not available, we never did a full
> analysis of that and I feel a bit uncomfortable about fixing the version
> number.

We don't need a full analysis for past features. If the version gets
fixed at, let's say, 6.0.0, then caps.version >= 0x060000 means that
everything supported up to the present date will be there.

We'll just need to take extra care to ensure that every new
feature added upstream has a way for userspace to check if
it is present.

> I see more usages of LINUX_VERSION_CODE in the kernel that look like they do
> something similar to what the media subsystem does, and that probably also
> do not need the SUBLEVEL.

Yeah, other subsystems seem to use it as well.

> A LINUX_MAJOR_MINOR_CODE define (or whatever you want to call it) would solve
> this problem for us.

There are ways to minimize this problem on future stable Kernels.

My main concern is whether we should keep letting applications rely
on caps.version. By keeping

cap->version = LINUX_VERSION_CODE;

(or any variant of that), applications may simply rely on it,
instead of properly implementing functionality probing code.

To be clear: my main concern here is not about media development
tools, like v4l2-compliance. It is about real applications that
could end up breaking on backports that don't properly
propagate cap->version.
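
To illustrate the difference between the two styles of check, here is a
sketch. Everything in it is hypothetical: the struct, the flag name and its
bit value are placeholders, not the real V4L2 definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit standing in for a real feature flag. */
#define EXAMPLE_CAP_CACHE_HINTS (1u << 6)

/* Hypothetical subset of what a capability query returns. */
struct example_caps {
	uint32_t version;      /* packed Kernel-style version code */
	uint32_t capabilities; /* feature bits */
};

/* Fragile: wrong whenever a backport does not update the version. */
static bool has_cache_hints_by_version(const struct example_caps *caps)
{
	return caps->version >= 0x050900;
}

/* Robust: depends only on the advertised feature bit. */
static bool has_cache_hints_by_flag(const struct example_caps *caps)
{
	return (caps->capabilities & EXAMPLE_CAP_CACHE_HINTS) != 0;
}
```

On a backport that reports version 4.8 but does carry the feature, only the
flag-based check gives the right answer.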

Thanks,
Mauro