2024-03-11 08:11:42

by Michal Hocko

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Wed 06-03-24 06:45:54, Greg KH wrote:
> Description
> ===========
>
> In the Linux kernel, the following vulnerability has been resolved:
>
> sysctl: Fix out of bounds access for empty sysctl registers
>
> When registering tables to the sysctl subsystem there is a check to see
> if header is a permanently empty directory (used for mounts). This check
> evaluates the first element of the ctl_table. This results in an out of
> bounds evaluation when registering empty directories.
>
> The function register_sysctl_mount_point now passes a ctl_table of size
> 1 instead of size 0. It now relies solely on the type to identify
> a permanently empty register.
>
> Make sure that the ctl_table has at least one element before testing for
> permanent emptiness.

While this makes the code more robust and more future proof I do not think
this is fixing any real issue not to mention anything with security
implications. AFAIU there is no actual code that can generate empty
sysctl directories unless the kernel is heavily misconfigured.

Luis, Joel, what do you think?

--
Michal Hocko
SUSE Labs


2024-03-11 21:58:00

by Luis Chamberlain

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Mon, Mar 11, 2024 at 09:11:27AM +0100, Michal Hocko wrote:
> On Wed 06-03-24 06:45:54, Greg KH wrote:
> > Description
> > ===========
> >
> > In the Linux kernel, the following vulnerability has been resolved:
> >
> > sysctl: Fix out of bounds access for empty sysctl registers
> >
> > When registering tables to the sysctl subsystem there is a check to see
> > if header is a permanently empty directory (used for mounts). This check
> > evaluates the first element of the ctl_table. This results in an out of
> > bounds evaluation when registering empty directories.
> >
> > The function register_sysctl_mount_point now passes a ctl_table of size
> > 1 instead of size 0. It now relies solely on the type to identify
> > a permanently empty register.
> >
> > Make sure that the ctl_table has at least one element before testing for
> > permanent emptiness.
>
> While this makes the code more robust and more future proof I do not think
> this is fixing any real issue not to mention anything with security
> implications. AFAIU there is no actual code that can generate empty
> sysctl directories unless the kernel is heavily misconfigured.
>
> Luis, Joel, what do you think?

As per review with Joel:

The out-of-bounds issue cannot be triggered in older kernels unless
you had external modules with empty sysctls. That is because although
support for empty sysctls was added on v6.6 none of the sysctls that
were trimmed for the superfluous entry over the different kernel
releases so far has had the chance to be empty.

The 0-day reported crash was for a future out of tree patch which was
never merged yet:

https://github.com/Joelgranados/linux/commit/423f75083acd947804c8d5c31ad1e1b5fcb3c020

Let's examine this commit to see why it triggers a crash to understand
how the out of bounds issue can be triggered.

Looking for suspects of sysctls which may end up empty in that patch we
have a couple. kernel/sched/fair.c sched_fair_sysctls can end up empty when
you don't have CONFIG_CFS_BANDWIDTH or CONFIG_NUMA_BALANCING. But the kernel
config for the 0-day test was:
https://download.01.org/0day-ci/archive/20231120/[email protected]/config-6.6.0-rc2-00030-g423f75083acd

It has CONFIG_CFS_BANDWIDTH=y but CONFIG_NUMA_BALANCING is not set so
this was not the culprit, but that configuration could have been a
cause if it was possible.

In the same future-not-upstream commit kernel/sched/core.c sched_core_sysctls
can be empty if you don't have the following:

CONFIG_SCHEDSTATS --> not set on 0-day kernel
CONFIG_UCLAMP_TASK --> not set on 0-day kernel
CONFIG_NUMA_BALANCING --> not set on 0-day kernel

So that was the cause, and an example real valid config which can trigger
a crash. But that patch was never upstream.

Now, we can look for existing removal of sysctl sentinels with:

git log -p --grep "superfluous sentinel element"

And of these, at first glance, only locks_sysctls seems like it *could*
possibly end up in a situation where random config would create an empty
sysctl, but exanding on the code we see that's not possible because
leases-enable sysctl entry is always built if you have sysctls enabled:

static struct ctl_table llocks_sysctlsocks_sysctls[] = {
{
.procname = "leases-enable",
.data = &leases_enable,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
},
#ifdef CONFIG_MMU
{
.procname = "lease-break-time",
.data = &lease_break_time,
.maxlen = sizeof(int),
.mode = 0644,
.proc_handler = proc_dointvec,
},
#endif /* CONFIG_MMU */
};

The out of bounds fix commit should have just had the tag:

Fixes 9edbfe92a0a13 ("sysctl: Add size to register_sysctl")

Backporting this is fine, but wouldn't fix an issue unless an external
module had empty sysctls. And exploiting this is not possible unless
you purposely build an external module which could end up with empty
sysctls.

HTH,

Luis

2024-03-12 09:20:13

by Lee Jones

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Mon, 11 Mar 2024, Luis Chamberlain wrote:

> On Mon, Mar 11, 2024 at 09:11:27AM +0100, Michal Hocko wrote:
> > On Wed 06-03-24 06:45:54, Greg KH wrote:
> > > Description
> > > ===========
> > >
> > > In the Linux kernel, the following vulnerability has been resolved:
> > >
> > > sysctl: Fix out of bounds access for empty sysctl registers
> > >
> > > When registering tables to the sysctl subsystem there is a check to see
> > > if header is a permanently empty directory (used for mounts). This check
> > > evaluates the first element of the ctl_table. This results in an out of
> > > bounds evaluation when registering empty directories.
> > >
> > > The function register_sysctl_mount_point now passes a ctl_table of size
> > > 1 instead of size 0. It now relies solely on the type to identify
> > > a permanently empty register.
> > >
> > > Make sure that the ctl_table has at least one element before testing for
> > > permanent emptiness.
> >
> > While this makes the code more robust and more future proof I do not think
> > this is fixing any real issue not to mention anything with security
> > implications. AFAIU there is no actual code that can generate empty
> > sysctl directories unless the kernel is heavily misconfigured.
> >
> > Luis, Joel, what do you think?
>
> As per review with Joel:
>
> The out-of-bounds issue cannot be triggered in older kernels unless
> you had external modules with empty sysctls. That is because although
> support for empty sysctls was added on v6.6 none of the sysctls that
> were trimmed for the superfluous entry over the different kernel
> releases so far has had the chance to be empty.
>
> The 0-day reported crash was for a future out of tree patch which was
> never merged yet:
>
> https://github.com/Joelgranados/linux/commit/423f75083acd947804c8d5c31ad1e1b5fcb3c020
>
> Let's examine this commit to see why it triggers a crash to understand
> how the out of bounds issue can be triggered.
>
> Looking for suspects of sysctls which may end up empty in that patch we
> have a couple. kernel/sched/fair.c sched_fair_sysctls can end up empty when
> you don't have CONFIG_CFS_BANDWIDTH or CONFIG_NUMA_BALANCING. But the kernel
> config for the 0-day test was:
> https://download.01.org/0day-ci/archive/20231120/[email protected]/config-6.6.0-rc2-00030-g423f75083acd
>
> It has CONFIG_CFS_BANDWIDTH=y but CONFIG_NUMA_BALANCING is not set so
> this was not the culprit, but that configuration could have been a
> cause if it was possible.
>
> In the same future-not-upstream commit kernel/sched/core.c sched_core_sysctls
> can be empty if you don't have the following:
>
> CONFIG_SCHEDSTATS --> not set on 0-day kernel
> CONFIG_UCLAMP_TASK --> not set on 0-day kernel
> CONFIG_NUMA_BALANCING --> not set on 0-day kernel
>
> So that was the cause, and an example real valid config which can trigger
> a crash. But that patch was never upstream.
>
> Now, we can look for existing removal of sysctl sentinels with:
>
> git log -p --grep "superfluous sentinel element"
>
> And of these, at first glance, only locks_sysctls seems like it *could*
> possibly end up in a situation where random config would create an empty
> sysctl, but exanding on the code we see that's not possible because
> leases-enable sysctl entry is always built if you have sysctls enabled:
>
> static struct ctl_table llocks_sysctlsocks_sysctls[] = {
> {
> .procname = "leases-enable",
> .data = &leases_enable,
> .maxlen = sizeof(int),
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> #ifdef CONFIG_MMU
> {
> .procname = "lease-break-time",
> .data = &lease_break_time,
> .maxlen = sizeof(int),
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> #endif /* CONFIG_MMU */
> };
>
> The out of bounds fix commit should have just had the tag:
>
> Fixes 9edbfe92a0a13 ("sysctl: Add size to register_sysctl")
>
> Backporting this is fine, but wouldn't fix an issue unless an external
> module had empty sysctls. And exploiting this is not possible unless
> you purposely build an external module which could end up with empty
> sysctls.

Thanks for the amazing explanation Luis.

If I'm reading this correctly, an issue does exist, but an attacker
would have to lay some foundations before it could be triggered. Sounds
like loading of a malicious or naive module would be enough.

We know from conducting postmortems on previous exploits that successful
attacks often consist of leveraging a chain of smaller, seemingly
implausible or innocuous looking bugs rather than in isolation.

With that in mind, it is still my belief that this could be used by an
attacker in such a chain. Unless I have this totally wrong or any of
the maintainers have strong feelings to the contrary, I would like to
keep the CVE number associated with this fix.

--
Lee Jones [李琼斯]

2024-03-12 09:48:40

by Michal Hocko

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Tue 12-03-24 09:17:30, Lee Jones wrote:
[...]
> > Backporting this is fine, but wouldn't fix an issue unless an external
> > module had empty sysctls. And exploiting this is not possible unless
> > you purposely build an external module which could end up with empty
> > sysctls.

Thanks for the clarification Luis!

> Thanks for the amazing explanation Luis.
>
> If I'm reading this correctly, an issue does exist, but an attacker
> would have to lay some foundations before it could be triggered. Sounds
> like loading of a malicious or naive module would be enough.

If the bar is set as high as a kernel module to create and empty sysctl
directory then I think it is safe to say that the security aspect is
mostly moot. There are much simpler ways to attack the system if you are
able to load a kernel module.

> We know from conducting postmortems on previous exploits that successful
> attacks often consist of leveraging a chain of smaller, seemingly
> implausible or innocuous looking bugs rather than in isolation.
>
> With that in mind, it is still my belief that this could be used by an
> attacker in such a chain. Unless I have this totally wrong or any of
> the maintainers have strong feelings to the contrary, I would like to
> keep the CVE number associated with this fix.

No, no real strong feelings but I have to say that I find a CVE more
than a stretch. Kernel modules could do much more harm than just abuse
this particular bug.
--
Michal Hocko
SUSE Labs

2024-03-12 11:20:34

by Joel Granados

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Mon, Mar 11, 2024 at 02:57:50PM -0700, Luis Chamberlain wrote:
> On Mon, Mar 11, 2024 at 09:11:27AM +0100, Michal Hocko wrote:
> > On Wed 06-03-24 06:45:54, Greg KH wrote:
> > > Description
> > > ===========
> > >
> > > In the Linux kernel, the following vulnerability has been resolved:
> > >
> > > sysctl: Fix out of bounds access for empty sysctl registers
> > >
> > > When registering tables to the sysctl subsystem there is a check to see
> > > if header is a permanently empty directory (used for mounts). This check
> > > evaluates the first element of the ctl_table. This results in an out of
> > > bounds evaluation when registering empty directories.
> > >
> > > The function register_sysctl_mount_point now passes a ctl_table of size
> > > 1 instead of size 0. It now relies solely on the type to identify
> > > a permanently empty register.
> > >
> > > Make sure that the ctl_table has at least one element before testing for
> > > permanent emptiness.
> >
> > While this makes the code more robust and more future proof I do not think
> > this is fixing any real issue not to mention anything with security
> > implications. AFAIU there is no actual code that can generate empty
> > sysctl directories unless the kernel is heavily misconfigured.
> >
> > Luis, Joel, what do you think?
>
> As per review with Joel:
>
> The out-of-bounds issue cannot be triggered in older kernels unless
> you had external modules with empty sysctls. That is because although
> support for empty sysctls was added on v6.6 none of the sysctls that
> were trimmed for the superfluous entry over the different kernel
> releases so far has had the chance to be empty.
>
> The 0-day reported crash was for a future out of tree patch which was
> never merged yet:
>
> https://github.com/Joelgranados/linux/commit/423f75083acd947804c8d5c31ad1e1b5fcb3c020
>
> Let's examine this commit to see why it triggers a crash to understand
> how the out of bounds issue can be triggered.
>
> Looking for suspects of sysctls which may end up empty in that patch we
> have a couple. kernel/sched/fair.c sched_fair_sysctls can end up empty when
> you don't have CONFIG_CFS_BANDWIDTH or CONFIG_NUMA_BALANCING. But the kernel
> config for the 0-day test was:
> https://download.01.org/0day-ci/archive/20231120/[email protected]/config-6.6.0-rc2-00030-g423f75083acd
>
> It has CONFIG_CFS_BANDWIDTH=y but CONFIG_NUMA_BALANCING is not set so
> this was not the culprit, but that configuration could have been a
> cause if it was possible.
>
> In the same future-not-upstream commit kernel/sched/core.c sched_core_sysctls
> can be empty if you don't have the following:
>
> CONFIG_SCHEDSTATS --> not set on 0-day kernel
> CONFIG_UCLAMP_TASK --> not set on 0-day kernel
> CONFIG_NUMA_BALANCING --> not set on 0-day kernel
>
> So that was the cause, and an example real valid config which can trigger
> a crash. But that patch was never upstream.
>
> Now, we can look for existing removal of sysctl sentinels with:
>
> git log -p --grep "superfluous sentinel element"
>
> And of these, at first glance, only locks_sysctls seems like it *could*
> possibly end up in a situation where random config would create an empty
> sysctl, but exanding on the code we see that's not possible because
> leases-enable sysctl entry is always built if you have sysctls enabled:
>
> static struct ctl_table llocks_sysctlsocks_sysctls[] = {
> {
> .procname = "leases-enable",
> .data = &leases_enable,
> .maxlen = sizeof(int),
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> #ifdef CONFIG_MMU
> {
> .procname = "lease-break-time",
> .data = &lease_break_time,
> .maxlen = sizeof(int),
> .mode = 0644,
> .proc_handler = proc_dointvec,
> },
> #endif /* CONFIG_MMU */
> };
>
> The out of bounds fix commit should have just had the tag:
>
> Fixes 9edbfe92a0a13 ("sysctl: Add size to register_sysctl")
>
> Backporting this is fine, but wouldn't fix an issue unless an external
It is not only fine, I think it is necessary to avoid some other future
backported patch actually introducing an empty sysctl.

> module had empty sysctls. And exploiting this is not possible unless
> you purposely build an external module which could end up with empty
> sysctls.
This is exactly my conclusion.
Very nice summary Luis. Thx for putting it together

>
> HTH,
>
> Luis


--

Joel Granados


Attachments:
(No filename) (5.31 kB)
signature.asc (673.00 B)
Download all attachments

2024-03-12 18:21:49

by Luis Chamberlain

[permalink] [raw]
Subject: Re: CVE-2023-52596: sysctl: Fix out of bounds access for empty sysctl registers

On Tue, Mar 12, 2024 at 10:45:28AM +0100, Michal Hocko wrote:
> On Tue 12-03-24 09:17:30, Lee Jones wrote:
> [...]
> > > Backporting this is fine, but wouldn't fix an issue unless an external
> > > module had empty sysctls. And exploiting this is not possible unless
> > > you purposely build an external module which could end up with empty
> > > sysctls.
>
> Thanks for the clarification Luis!
>
> > Thanks for the amazing explanation Luis.
> >
> > If I'm reading this correctly, an issue does exist, but an attacker
> > would have to lay some foundations before it could be triggered. Sounds
> > like loading of a malicious or naive module would be enough.
>
> If the bar is set as high as a kernel module to create and empty sysctl
> directory then I think it is safe to say that the security aspect is
> mostly moot. There are much simpler ways to attack the system if you are
> able to load a kernel module.

Indeed, a simple BUG_ON(1) on external modules cannot possible be a
source of a CVE. And so this becomes BUG_ON(when_sysctl_empty()) where
when_sysctl_empty() is hypotethical and I think the source of this
question for CVE. Today's that not at boot time or dynamically with any
linux kernel sources released, and so its only possible if:

a) As Joel indicated if you backported an empty sysctl array
(which would be unless you carried all the infrastructure to support it).

b) an external module has an empty sysctl

HTH.

Luis