On Thu 22-02-24 17:21:58, Greg KH wrote:
> Description
> ===========
>
> In the Linux kernel, the following vulnerability has been resolved:
>
> powerpc/pseries/memhp: Fix access beyond end of drmem array
>
> dlpar_memory_remove_by_index() may access beyond the bounds of the
> drmem lmb array when the LMB lookup fails to match an entry with the
> given DRC index. When the search fails, the cursor is left pointing to
> &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> last valid entry in the array. The debug message at the end of the
> function then dereferences this pointer:
>
> pr_debug("Failed to hot-remove memory at %llx\n",
> lmb->base_addr);
While this is a reasonable fix and the stable material it is really
unclear to me why it has gained a CVE. Memory hotplug is a privileged
operation. Could you clarify please?
--
Michal Hocko
SUSE Labs
On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> On Thu 22-02-24 17:21:58, Greg KH wrote:
> > Description
> > ===========
> >
> > In the Linux kernel, the following vulnerability has been resolved:
> >
> > powerpc/pseries/memhp: Fix access beyond end of drmem array
> >
> > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > drmem lmb array when the LMB lookup fails to match an entry with the
> > given DRC index. When the search fails, the cursor is left pointing to
> > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > last valid entry in the array. The debug message at the end of the
> > function then dereferences this pointer:
> >
> > pr_debug("Failed to hot-remove memory at %llx\n",
> > lmb->base_addr);
>
> While this is a reasonable fix and the stable material it is really
> unclear to me why it has gained a CVE. Memory hotplug is a privileged
> operation. Could you clarify please?
As you know, history has shown us that accessing out of your allocated
memory can cause problems, and we can not assume use-cases, as we don't
know how everyone uses our codebase, so marking places where we fix
out-of-bound memory accesses is resolving a weakness in the codebase,
hence a CVE assignment.
If your systems are not vulnerable to this specific issue, wonderful, no
need to take it, but why wouldn't you want to take a fix that resolves a
known weakness?
thanks,
greg k-h
On Mon 26-02-24 16:06:51, Greg KH wrote:
> On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> > On Thu 22-02-24 17:21:58, Greg KH wrote:
> > > Description
> > > ===========
> > >
> > > In the Linux kernel, the following vulnerability has been resolved:
> > >
> > > powerpc/pseries/memhp: Fix access beyond end of drmem array
> > >
> > > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > > drmem lmb array when the LMB lookup fails to match an entry with the
> > > given DRC index. When the search fails, the cursor is left pointing to
> > > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > > last valid entry in the array. The debug message at the end of the
> > > function then dereferences this pointer:
> > >
> > > pr_debug("Failed to hot-remove memory at %llx\n",
> > > lmb->base_addr);
> >
> > While this is a reasonable fix and the stable material it is really
> > unclear to me why it has gained a CVE. Memory hotplug is a privileged
> > operation. Could you clarify please?
>
> As you know, history has shown us that accessing out of your allocated
> memory can cause problems, and we can not assume use-cases, as we don't
> know how everyone uses our codebase, so marking places where we fix
> out-of-bound memory accesses is resolving a weakness in the codebase,
> hence a CVE assignment.
Does that mean that any potentially incorrect input provided by an admin is
considered CVE now? I guess we would need to ban interfaces like
/dev/mem and many others.
> If your systems are not vulnerable to this specific issue, wonderful, no
> need to take it, but why wouldn't you want to take a fix that resolves a
> known weakness?
I have explicitly said, the fix is reasonable. I just do not see a point
to mark it as CVE. I fail to see any thread model where this would
matter as it would require untrusted privileged actor to trigger it
AFAICS. I am happy to be proven wrong here.
--
Michal Hocko
SUSE Labs
On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> On Mon 26-02-24 16:06:51, Greg KH wrote:
> > On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> > > On Thu 22-02-24 17:21:58, Greg KH wrote:
> > > > Description
> > > > ===========
> > > >
> > > > In the Linux kernel, the following vulnerability has been resolved:
> > > >
> > > > powerpc/pseries/memhp: Fix access beyond end of drmem array
> > > >
> > > > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > > > drmem lmb array when the LMB lookup fails to match an entry with the
> > > > given DRC index. When the search fails, the cursor is left pointing to
> > > > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > > > last valid entry in the array. The debug message at the end of the
> > > > function then dereferences this pointer:
> > > >
> > > > pr_debug("Failed to hot-remove memory at %llx\n",
> > > > lmb->base_addr);
> > >
> > > While this is a reasonable fix and the stable material it is really
> > > unclear to me why it has gained a CVE. Memory hotplug is a privileged
> > > operation. Could you clarify please?
> >
> > As you know, history has shown us that accessing out of your allocated
> > memory can cause problems, and we can not assume use-cases, as we don't
> > know how everyone uses our codebase, so marking places where we fix
> > out-of-bound memory accesses is resolving a weakness in the codebase,
> > hence a CVE assignment.
>
> Does that mean that any potentially incorrect input provided by an admin is
> considered CVE now? I guess we would need to ban interfaces like
> /dev/mem and many others.
If you have your system set up to prevent admins from accessing /dev/mem
(isn't there a config option for that), and you can access it, then yes,
that would be a CVE-worthy issue.
But if you configure your system to allow an admin access to /dev/mem
then you wanted that :)
> > If your systems are not vulnerable to this specific issue, wonderful, no
> > need to take it, but why wouldn't you want to take a fix that resolves a
> > known weakness?
>
> I have explicitly said, the fix is reasonable. I just do not see a point
> to mark it as CVE. I fail to see any thread model where this would
> matter as it would require untrusted privileged actor to trigger it
> AFAICS. I am happy to be proven wrong here.
We can not determine threat models when filing CVEs as we do not know
what your threat model is. All we can do is determine if this resolves
a weakness in the system. A use-after-free is a weakness and this
resolves that issue.
It is up to others to "grade" the CVEs if they want to. There are loads
of other orginizations that do that type of thing, taking into
consideration specific threat models by which they wish to enforce. If
your orginization thinks this is not relevent to your threat model at
all, wonderful, you can ignore it :)
thanks,
greg k-h
On Mon 26-02-24 17:12:19, Greg KH wrote:
> On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> > On Mon 26-02-24 16:06:51, Greg KH wrote:
> > > On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> > > > On Thu 22-02-24 17:21:58, Greg KH wrote:
> > > > > Description
> > > > > ===========
> > > > >
> > > > > In the Linux kernel, the following vulnerability has been resolved:
> > > > >
> > > > > powerpc/pseries/memhp: Fix access beyond end of drmem array
> > > > >
> > > > > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > > > > drmem lmb array when the LMB lookup fails to match an entry with the
> > > > > given DRC index. When the search fails, the cursor is left pointing to
> > > > > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > > > > last valid entry in the array. The debug message at the end of the
> > > > > function then dereferences this pointer:
> > > > >
> > > > > pr_debug("Failed to hot-remove memory at %llx\n",
> > > > > lmb->base_addr);
> > > >
> > > > While this is a reasonable fix and the stable material it is really
> > > > unclear to me why it has gained a CVE. Memory hotplug is a privileged
> > > > operation. Could you clarify please?
> > >
> > > As you know, history has shown us that accessing out of your allocated
> > > memory can cause problems, and we can not assume use-cases, as we don't
> > > know how everyone uses our codebase, so marking places where we fix
> > > out-of-bound memory accesses is resolving a weakness in the codebase,
> > > hence a CVE assignment.
> >
> > Does that mean that any potentially incorrect input provided by an admin is
> > considered CVE now? I guess we would need to ban interfaces like
> > /dev/mem and many others.
>
> If you have your system set up to prevent admins from accessing /dev/mem
> (isn't there a config option for that), and you can access it, then yes,
> that would be a CVE-worthy issue.
>
> But if you configure your system to allow an admin access to /dev/mem
> then you wanted that :)
OK, I thought we are having a serious discussion here. Now I am not
really sure about that.
> > > If your systems are not vulnerable to this specific issue, wonderful, no
> > > need to take it, but why wouldn't you want to take a fix that resolves a
> > > known weakness?
> >
> > I have explicitly said, the fix is reasonable. I just do not see a point
> > to mark it as CVE. I fail to see any thread model where this would
> > matter as it would require untrusted privileged actor to trigger it
> > AFAICS. I am happy to be proven wrong here.
>
> We can not determine threat models when filing CVEs as we do not know
> what your threat model is. All we can do is determine if this resolves
> a weakness in the system. A use-after-free is a weakness and this
> resolves that issue.
Sure, you cannot make any assumptions and nobody questions that. We do
have a certain common sense though. And if somebody comes up with a
usecase that we haven't anticipated then we can surely assign a CVE.
I though that level of control is exactly the reason to own the process
by the community.
All that being said I dispute the issue fixed here has any more security
relevance than allowing untrusted user to control memory hotplug which
could easily result in DoS of the system.
--
Michal Hocko
SUSE Labs
On Mon, Feb 26, 2024 at 05:36:57PM +0100, Michal Hocko wrote:
> On Mon 26-02-24 17:12:19, Greg KH wrote:
> > On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> > > On Mon 26-02-24 16:06:51, Greg KH wrote:
> > > > On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> > > > > On Thu 22-02-24 17:21:58, Greg KH wrote:
> > > > > > Description
> > > > > > ===========
> > > > > >
> > > > > > In the Linux kernel, the following vulnerability has been resolved:
> > > > > >
> > > > > > powerpc/pseries/memhp: Fix access beyond end of drmem array
> > > > > >
> > > > > > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > > > > > drmem lmb array when the LMB lookup fails to match an entry with the
> > > > > > given DRC index. When the search fails, the cursor is left pointing to
> > > > > > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > > > > > last valid entry in the array. The debug message at the end of the
> > > > > > function then dereferences this pointer:
> > > > > >
> > > > > > pr_debug("Failed to hot-remove memory at %llx\n",
> > > > > > lmb->base_addr);
> > > > >
> > > > > While this is a reasonable fix and the stable material it is really
> > > > > unclear to me why it has gained a CVE. Memory hotplug is a privileged
> > > > > operation. Could you clarify please?
> > > >
> > > > As you know, history has shown us that accessing out of your allocated
> > > > memory can cause problems, and we can not assume use-cases, as we don't
> > > > know how everyone uses our codebase, so marking places where we fix
> > > > out-of-bound memory accesses is resolving a weakness in the codebase,
> > > > hence a CVE assignment.
> > >
> > > Does that mean that any potentially incorrect input provided by an admin is
> > > considered CVE now? I guess we would need to ban interfaces like
> > > /dev/mem and many others.
> >
> > If you have your system set up to prevent admins from accessing /dev/mem
> > (isn't there a config option for that), and you can access it, then yes,
> > that would be a CVE-worthy issue.
> >
> > But if you configure your system to allow an admin access to /dev/mem
> > then you wanted that :)
>
> OK, I thought we are having a serious discussion here. Now I am not
> really sure about that.
I am serious, sorry if that came across as not.
> > > > If your systems are not vulnerable to this specific issue, wonderful, no
> > > > need to take it, but why wouldn't you want to take a fix that resolves a
> > > > known weakness?
> > >
> > > I have explicitly said, the fix is reasonable. I just do not see a point
> > > to mark it as CVE. I fail to see any thread model where this would
> > > matter as it would require untrusted privileged actor to trigger it
> > > AFAICS. I am happy to be proven wrong here.
> >
> > We can not determine threat models when filing CVEs as we do not know
> > what your threat model is. All we can do is determine if this resolves
> > a weakness in the system. A use-after-free is a weakness and this
> > resolves that issue.
>
> Sure, you cannot make any assumptions and nobody questions that. We do
> have a certain common sense though. And if somebody comes up with a
> usecase that we haven't anticipated then we can surely assign a CVE.
> I though that level of control is exactly the reason to own the process
> by the community.
>
> All that being said I dispute the issue fixed here has any more security
> relevance than allowing untrusted user to control memory hotplug which
> could easily result in DoS of the system.
Ok, I traced this call back, and here is the callpath, starting with a
write to the the 'dlpar' sysfs file (conviently NOT documented in
Documentation/ABI, and it looks like it violates the "one value per
file" rule...)
dlpar_store()
handle_dlpar_errorlog()
dlpar_memory()
dlpar_memory_remove_by_index()
Yes, the kernel by default sets 'dlpar' to 0644, BUT that means that
root in a container can cause this use-after-free to happen, or if the
permissions are changed by userspace, or if you are in "lockdown mode",
or if you want to attempt the crazy "confidential computing" model, or
if you have a system which root is possible for some things by normal
users (there are lots of different security models out there...)
Yes, I will argue that making the sysfs file writable by userspace is
out of our control, but what is in our control is the fact that there is
a out-of-bounds write that is fixed here, and we don't want those to be
able to be triggered by anyone as that is a weakness in our codebase.
That is what has caused the CVE to be created here, not the fact that
root can remove memory as that's the normal expected operation to have
happen here.
However if the maintainer of the code here disputes this, we are more
than willing to mark this invalid and reject the CVE.
thanks,
greg k-h
[Let me add Michael as PPC maintainer - the thread starts with
http://lkml.kernel.org/r/2024022257-CVE-2023-52451-7bdb@gregkh]
On Tue 27-02-24 06:14:45, Greg KH wrote:
> On Mon, Feb 26, 2024 at 05:36:57PM +0100, Michal Hocko wrote:
[...]
> > All that being said I dispute the issue fixed here has any more security
> > relevance than allowing untrusted user to control memory hotplug which
> > could easily result in DoS of the system.
>
> Ok, I traced this call back, and here is the callpath, starting with a
> write to the the 'dlpar' sysfs file (conviently NOT documented in
> Documentation/ABI, and it looks like it violates the "one value per
> file" rule...)
> dlpar_store()
> handle_dlpar_errorlog()
> dlpar_memory()
> dlpar_memory_remove_by_index()
>
> Yes, the kernel by default sets 'dlpar' to 0644, BUT that means that
> root in a container can cause this use-after-free to happen, or if the
> permissions are changed by userspace, or if you are in "lockdown mode",
> or if you want to attempt the crazy "confidential computing" model, or
> if you have a system which root is possible for some things by normal
> users (there are lots of different security models out there...)
This is all nice but please do realize that if you allow access to to
memory hotremove to any untrusted entity (be it a root in container or
by changing access permissions) then the machine is in a serious
resource management control trouble already and that is a security
threat already.
> Yes, I will argue that making the sysfs file writable by userspace is
> out of our control, but what is in our control is the fact that there is
> a out-of-bounds write that is fixed here, and we don't want those to be
> able to be triggered by anyone as that is a weakness in our codebase.
Yes, and that is why the fix is good and nobody disputes that. What I am
actually trying to drill down to is whether this is an actual security
threat worth assigning a CVE or it is just yet-anothing-pointless-CVE we
were so used to with the old process.
> That is what has caused the CVE to be created here, not the fact that
> root can remove memory as that's the normal expected operation to have
> happen here.
>
> However if the maintainer of the code here disputes this, we are more
> than willing to mark this invalid and reject the CVE.
Michael, do you see any real security risk being addressed by
bd68ffce69f6 ("powerpc/pseries/memhp: Fix access beyond end of drmem
array").
Thanks!
--
Michal Hocko
SUSE Labs
On Tue, 27 Feb 2024, Greg Kroah-Hartman wrote:
> Ok, I traced this call back, and here is the callpath, starting with a
> write to the the 'dlpar' sysfs file (conviently NOT documented in
> Documentation/ABI, and it looks like it violates the "one value per
> file" rule...)
> dlpar_store()
> handle_dlpar_errorlog()
> dlpar_memory()
> dlpar_memory_remove_by_index()
>
> Yes, the kernel by default sets 'dlpar' to 0644, BUT that means that
> root in a container can cause this use-after-free to happen, or if the
> permissions are changed by userspace, or if you are in "lockdown mode",
> or if you want to attempt the crazy "confidential computing" model, or
> if you have a system which root is possible for some things by normal
> users (there are lots of different security models out there...)
Whatever the security threat model, whoever can offline memory by writing
to this file can kill the machine already. So there is no *additional*
security issue fixed by this that'd justify a CVE.
--
Jiri Kosina
SUSE Labs
On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> On Mon 26-02-24 16:06:51, Greg KH wrote:
> > On Mon, Feb 26, 2024 at 03:52:11PM +0100, Michal Hocko wrote:
> > > On Thu 22-02-24 17:21:58, Greg KH wrote:
> > > > Description
> > > > ===========
> > > >
> > > > In the Linux kernel, the following vulnerability has been resolved:
> > > >
> > > > powerpc/pseries/memhp: Fix access beyond end of drmem array
> > > >
> > > > dlpar_memory_remove_by_index() may access beyond the bounds of the
> > > > drmem lmb array when the LMB lookup fails to match an entry with the
> > > > given DRC index. When the search fails, the cursor is left pointing to
> > > > &drmem_info->lmbs[drmem_info->n_lmbs], which is one element past the
> > > > last valid entry in the array. The debug message at the end of the
> > > > function then dereferences this pointer:
> > > >
> > > > pr_debug("Failed to hot-remove memory at %llx\n",
> > > > lmb->base_addr);
> > >
> > > While this is a reasonable fix and the stable material it is really
> > > unclear to me why it has gained a CVE. Memory hotplug is a privileged
> > > operation. Could you clarify please?
> >
> > As you know, history has shown us that accessing out of your allocated
> > memory can cause problems, and we can not assume use-cases, as we don't
> > know how everyone uses our codebase, so marking places where we fix
> > out-of-bound memory accesses is resolving a weakness in the codebase,
> > hence a CVE assignment.
>
> Does that mean that any potentially incorrect input provided by an admin is
> considered CVE now?
Yes. Have you seen what USER_NS does? There isn't a way to know how
deployments are using Linux, and this is clearly a "weakness" as defined
by CVE. It is better to be over zealous than miss things.
> I guess we would need to ban interfaces like /dev/mem and many others.
Yes. Absolutely. :) Have you seen CONFIG_STRICT_DEVMEM,
CONFIG_IO_STRICT_DEVMEM, etc? Many deployments keep a bright line
between root and kernel. There is a whole subsystem (lockdown) for
working to enforce this.
> > If your systems are not vulnerable to this specific issue, wonderful, no
> > need to take it, but why wouldn't you want to take a fix that resolves a
> > known weakness?
>
> I have explicitly said, the fix is reasonable. I just do not see a point
> to mark it as CVE. I fail to see any thread model where this would
> matter as it would require untrusted privileged actor to trigger it
> AFAICS. I am happy to be proven wrong here.
Given that weaknesses are commonly chained together for exploits, it's
just not possible to say something can never be used in an attack.
Exploits are about slowly gaining more and more leverage on a system,
expanding the scope of control a little more with each step.
It's not possible to make these distinctions without deep reachability
analysis and an omniscient view of all Linux deployments.
There's no harm in marking fixes for weaknesses as CVEs, so why the
push back?
-Kees
--
Kees Cook
On Tue 27-02-24 10:35:40, Kees Cook wrote:
> On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
[...]
> > Does that mean that any potentially incorrect input provided by an admin is
> > considered CVE now?
>
> Yes. Have you seen what USER_NS does? There isn't a way to know how
> deployments are using Linux, and this is clearly a "weakness" as defined
> by CVE. It is better to be over zealous than miss things.
If we are over zealous to the point when almost any fix is marked CVE
then the special marking simply stops making any sense IMHO.
> > I guess we would need to ban interfaces like /dev/mem and many others.
>
> Yes. Absolutely. :) Have you seen CONFIG_STRICT_DEVMEM,
> CONFIG_IO_STRICT_DEVMEM, etc? Many deployments keep a bright line
> between root and kernel. There is a whole subsystem (lockdown) for
> working to enforce this.
Are you confusing hardening with security relevant fixes here? It makes
a lot of sense to reduce the attack space by sacrificing functionality
for some usecases but in general a large part of the kernel is built
around a "root can do anything" philosophy. Whether we like it or not.
And that means that we do not even pretend to protect dubious
configurations by root/CAP_SYSADMIN which could effectivelly DoS the
system (just consider hotplug/hotremove as an example - try to run your
workload when most CPUs or memory is offlined). Some operations are
simply not suited for untrusted entity.
[...]
> There's no harm in marking fixes for weaknesses as CVEs, so why the
> push back?
Because assigning CVEs nilly willy was the main downside of the prior
process and I was hoping that the new one, in hands of kernel people,
could be better and we could be getting more relevant CVEs.
--
Michal Hocko
SUSE Labs
On Wed, Feb 28, 2024 at 01:04:14PM +0100, Michal Hocko wrote:
> On Tue 27-02-24 10:35:40, Kees Cook wrote:
> > On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> [...]
> > > Does that mean that any potentially incorrect input provided by an admin is
> > > considered CVE now?
> >
> > Yes. Have you seen what USER_NS does? There isn't a way to know how
> > deployments are using Linux, and this is clearly a "weakness" as defined
> > by CVE. It is better to be over zealous than miss things.
>
> If we are over zealous to the point when almost any fix is marked CVE
> then the special marking simply stops making any sense IMHO.
Perhaps, but the volume of fixes is high, and I think it's better to
over estimate than under estimate -- the work needed to actually
evaluate all these changes is huge: it's better to take everything from
-stable. This has been a long standing problem with communicating this
to engineering management in many organizations. They have pointed to
the relatively small number of CVEs and said, "just backport those
fixes", and trying to explain that it's is totally insufficient falls on
deaf ears.
> Are you confusing hardening with security relevant fixes here? It makes
> a lot of sense to reduce the attack space by sacrificing functionality
> for some usecases but in general a large part of the kernel is built
> around a "root can do anything" philosophy. Whether we like it or not.
> And that means that we do not even pretend to protect dubious
> configurations by root/CAP_SYSADMIN which could effectivelly DoS the
> system (just consider hotplug/hotremove as an example - try to run your
> workload when most CPUs or memory is offlined). Some operations are
> simply not suited for untrusted entity.
Sure, I understand, but it's not possible to prove the negative. Perhaps
this particular fix will never be used in an attack, but I don't think
it's good to underestimate the creativity of attackers chaining
unexpected states together.
> [...]
>
> > There's no harm in marking fixes for weaknesses as CVEs, so why the
> > push back?
>
> Because assigning CVEs nilly willy was the main downside of the prior
> process and I was hoping that the new one, in hands of kernel people,
> could be better and we could be getting more relevant CVEs.
The old assignment pattern was both radically below actual fixes
landing, and also regularly not applicable at all (CVEs for out-of-tree
downstream patches against Linux). On both counts, I think the new
process is better, even if it will produce some "very low" severity
CVEs. :)
-Kees
--
Kees Cook
On Wed 28-02-24 09:12:15, Kees Cook wrote:
> On Wed, Feb 28, 2024 at 01:04:14PM +0100, Michal Hocko wrote:
> > On Tue 27-02-24 10:35:40, Kees Cook wrote:
> > > On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> > [...]
> > > > Does that mean that any potentially incorrect input provided by an admin is
> > > > considered CVE now?
> > >
> > > Yes. Have you seen what USER_NS does? There isn't a way to know how
> > > deployments are using Linux, and this is clearly a "weakness" as defined
> > > by CVE. It is better to be over zealous than miss things.
> >
> > If we are over zealous to the point when almost any fix is marked CVE
> > then the special marking simply stops making any sense IMHO.
>
> Perhaps, but the volume of fixes is high, and I think it's better to
> over estimate than under estimate -- the work needed to actually
> evaluate all these changes is huge: it's better to take everything from
> -stable.
This is simply not feasible for many downstream kernels and reasons have
been discussed many times.
> This has been a long standing problem with communicating this
> to engineering management in many organizations. They have pointed to
> the relatively small number of CVEs and said, "just backport those
> fixes", and trying to explain that it's is totally insufficient falls on
> deaf ears.
I think it is fair to say/expect that every downstream is responsibile
for the kernel they are distributing and that applies to vulnerabilities
affecting those kernels. Forcing fixes by slapping CVE over them sounds
just very dubious to me.
--
Michal Hocko
SUSE Labs
On Thu, Feb 29, 2024 at 09:22:51AM +0100, Michal Hocko wrote:
> On Wed 28-02-24 09:12:15, Kees Cook wrote:
> > On Wed, Feb 28, 2024 at 01:04:14PM +0100, Michal Hocko wrote:
> > > On Tue 27-02-24 10:35:40, Kees Cook wrote:
> > > > On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> > > [...]
> > > > > Does that mean that any potentially incorrect input provided by an admin is
> > > > > considered CVE now?
> > > >
> > > > Yes. Have you seen what USER_NS does? There isn't a way to know how
> > > > deployments are using Linux, and this is clearly a "weakness" as defined
> > > > by CVE. It is better to be over zealous than miss things.
> > >
> > > If we are over zealous to the point when almost any fix is marked CVE
> > > then the special marking simply stops making any sense IMHO.
> >
> > Perhaps, but the volume of fixes is high, and I think it's better to
> > over estimate than under estimate -- the work needed to actually
> > evaluate all these changes is huge: it's better to take everything from
> > -stable.
>
> This is simply not feasible for many downstream kernels and reasons have
> been discussed many times.
How does taking 10 patches differ from taking 200 patches? Your
testing/infrastructure issues should be the same, right?
And really, if you have the crazy requirement of "All CVEs must be
applied" then perhaps that needs to be revisited? You are dictating
your business's processes to an external entity, is that wise? And if
it is wise, why are you not able to handle it today? Making it simpler
for you to consume these entries seems to be a better decision. With
the changes we have made, you should be able to automatically determine
if you are affected or not, which is something that you did NOT have
before, so this might just be able to be automated, right?
> > This has been a long standing problem with communicating this
> > to engineering management in many organizations. They have pointed to
> > the relatively small number of CVEs and said, "just backport those
> > fixes", and trying to explain that it's is totally insufficient falls on
> > deaf ears.
>
> I think it is fair to say/expect that every downstream is responsibile
> for the kernel they are distributing and that applies to vulnerabilities
> affecting those kernels. Forcing fixes by slapping CVE over them sounds
> just very dubious to me.
Not having CVEs assigned for things that can cause issues is dubious,
which is why we are doing this. We can not determine use cases of our
code base, that is up to you as a distro to figure out, all we can do is
our best to call out "This might be something you want to take!" which
is what we are doing here.
Remember, we KNOW that people have been scraping our commit logs for
these types of things and using them in attacks for years. At least
this way we are giving you a machine-parsable feed that gives you the
opportunity to make your kernels more secure than they previously were.
Why is that not a good thing?
thanks,
greg k-h
On Thu 29-02-24 09:35:28, Greg KH wrote:
[...]
> How does taking 10 patches differ from taking 200 patches? Your
> testing/infrastructure issues should be the same, right?
You would be right if there was a test of every/most of the applied
patches. Which is not the case. The testing is merely about nothing has
broken in an obvious way. I hope we can agree that more patches that are
taken the larger the risk is.
I would have a much higher confidence in stable patches if they were
properly reviewed in the context of the tree they are applied to. This
is not done for the vast majority of them though. I simply do not trust
"it applies so it is OK(ish)" approach.
> And really, if you have the crazy requirement of "All CVEs must be
> applied" then perhaps that needs to be revisited? You are dictating
> your business's processes to an external entity, is that wise?
I do not think we (Suse) have any policy like that. We do evaluate CVEs
and other fixes and make risk/benefit based decisions. This is a
demanding process to say the least but it has allowed for very stable
code base.
> And if
> it is wise, why are you not able to handle it today? Making it simpler
> for you to consume these entries seems to be a better decision. With
> the changes we have made, you should be able to automatically determine
> if you are affected or not, which is something that you did NOT have
> before, so this might just be able to be automated, right?
Not really. The most of those fixes already do have stable or Fixes tags
so automation doesn't really require CVE number. If anybody is
interested to follow fixes in general then all the proper metadata are
already there.
> > > This has been a long standing problem with communicating this
> > > to engineering management in many organizations. They have pointed to
> > > the relatively small number of CVEs and said, "just backport those
> > > fixes", and trying to explain that it's is totally insufficient falls on
> > > deaf ears.
> >
> > I think it is fair to say/expect that every downstream is responsibile
> > for the kernel they are distributing and that applies to vulnerabilities
> > affecting those kernels. Forcing fixes by slapping CVE over them sounds
> > just very dubious to me.
>
> Not having CVEs assigned for things that can cause issues is dubious,
> which is why we are doing this. We can not determine use cases of our
> code base, that is up to you as a distro to figure out, all we can do is
> our best to call out "This might be something you want to take!" which
> is what we are doing here.
You are missing my point I am afraid. Downstreams do not need CVE numbers
to fish for useful fixes. Downstreams, however, would appreciate the call
out serious security problems to treat them with higher priority because
believe me it makes a huge difference to address an exploitable problem
promptly rather than one that is affecting a misconfigured system which
is by definition vulnerable. And while I am sympathetic with your "we
cannot assume usecases" statement, I believe we can still tell a
difference that some configurations simply make no sense from the
security perspective.
> Remember, we KNOW that people have been scraping our commit logs for
> these types of things and using them in attacks for years. At least
> this way we are giving you a machine-parsable feed that gives you the
> opportunity to make your kernels more secure than they previously were.
They are going to do that whether you stick CVE on top or not.
> Why is that not a good thing?
Because the more dubious or outright bogus CVEs there are the less
meaningful they will become.
Look, it seems there is a huge gap in our understanding of what the CVE
is good for. I was really hopeful that the noise/signal ratio would be
better with the new model but that doesn't seem to be the case so far. I
was also hoped that the dispute process would be improved but judging
from the discussion so far I have lost that hope as well.
Please asked yourself again who is going to benefit from this. Stable
users certainly not, because they are supposed to use the latest stable
kernels so all the fixes (no matter of CVE), downstreams based on stable
kernels will not benefit either because of the same reason. Downstreams
which are not based on stable trees will not benefit either because of
the low signal/noise ratio and will need to find a better way to find
important and safe fixes. Probably the only winner will be those fancy
CVE scanners which will have more to scan.
--
Michal Hocko
SUSE Labs
On Thu 2024-02-29 09:35:28, Greg Kroah-Hartman wrote:
> On Thu, Feb 29, 2024 at 09:22:51AM +0100, Michal Hocko wrote:
> > On Wed 28-02-24 09:12:15, Kees Cook wrote:
> > > On Wed, Feb 28, 2024 at 01:04:14PM +0100, Michal Hocko wrote:
> > > > On Tue 27-02-24 10:35:40, Kees Cook wrote:
> > > > > On Mon, Feb 26, 2024 at 04:25:09PM +0100, Michal Hocko wrote:
> > > > [...]
> > > > > > Does that mean that any potentially incorrect input provided by an admin is
> > > > > > considered CVE now?
> > > > >
> > > > > Yes. Have you seen what USER_NS does? There isn't a way to know how
> > > > > deployments are using Linux, and this is clearly a "weakness" as defined
> > > > > by CVE. It is better to be over zealous than miss things.
> > > >
> > > > If we are over zealous to the point when almost any fix is marked CVE
> > > > then the special marking simply stops making any sense IMHO.
> > >
> > > Perhaps, but the volume of fixes is high, and I think it's better to
> > > over estimate than under estimate -- the work needed to actually
> > > evaluate all these changes is huge: it's better to take everything from
> > > -stable.
> >
> > This is simply not feasible for many downstream kernels and reasons have
> > been discussed many times.
>
> How does taking 10 patches differ from taking 200 patches? Your
> testing/infrastructure issues should be the same, right?
It is more work to review 200 patches than to review 10. As you would
know if you actually reviewed -stable patches or at least AUTOSEL
ones.
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.
Hi!
> > Does that mean that any potentially incorrect input provided by an admin is
> > considered CVE now?
>
> Yes. Have you seen what USER_NS does? There isn't a way to know how
> deployments are using Linux, and this is clearly a "weakness" as defined
> by CVE. It is better to be over zealous than miss things.
Is it?
What is happening now is DoS on anyone who tries to use CVE
database... and on l-k users.
How do I get CVE number for that?
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.
On Thu, Feb 29, 2024 at 10:41:58AM +0100, Michal Hocko wrote:
> > > > This has been a long standing problem with communicating this
> > > > to engineering management in many organizations. They have pointed to
> > > > the relatively small number of CVEs and said, "just backport those
> > > > fixes", and trying to explain that it's is totally insufficient falls on
> > > > deaf ears.
> > >
> > > I think it is fair to say/expect that every downstream is responsibile
> > > for the kernel they are distributing and that applies to vulnerabilities
> > > affecting those kernels. Forcing fixes by slapping CVE over them sounds
> > > just very dubious to me.
> >
> > Not having CVEs assigned for things that can cause issues is dubious,
> > which is why we are doing this. We can not determine use cases of our
> > code base, that is up to you as a distro to figure out, all we can do is
> > our best to call out "This might be something you want to take!" which
> > is what we are doing here.
>
> You are missing my point I am afraid. Downstreams do not need CVE numbers
> to fish for useful fixes. Downstreams, however, would appreciate the call
> out serious security problems to treat them with higher priority because
> believe me it makes a huge difference to address an exploitable problem
> promptly rather than one that is affecting a misconfigured system which
> is by definition vulnerable. And while I am sympathetic with your "we
> cannot assume usecases" statement, I believe we can still tell a
> difference that some configurations simply make no sense from the
> security perspective.
The reason we became a CNA is NOT because we want to only call out "high
priority" issues, because obviously, we can not do that as we do not
know anyone's use case and we can not claim any specific user is higher
priority than another.
The reason we DID become a CNA was because there were entities out there
filing CVEs against invalid stuff AND causing problems for everyone,
including the kernel developers themselves, and making it impossible to
ever get anything revoked no matter how wrong they were.
As part of the requirement to be a CNA, we have to announce everything
that we think is a potential vulnerability, severity not be judged at
all as that is NOT part of the CVE project or program. It is up to
others to judge the severity, not us, nor the CVE group themselves,
that's not what the ID is for.
So, because of this, we are now assigning CVE ids to everything that we
review that we think is a vulnerability AND everything that people
submit to us to have a CVE assigned to that they deem is a
vulnerability. Many groups/people have already asked us for IDs and we
have only turned one down so far (the submitter agreed that it wasn't an
issue at all).
We also are required to go back through the GSD entries and create CVE
entries for them as well, which is why you are seeing these "older"
entries be created. We have many thousands to go through of them, so
that will take us a while to catch up (next few months.)
Again, none of this has anything to do with "severity", it only is an
identifier that says "this fixes a vulnerability".
> > Why is that not a good thing?
>
> Because the more dubious or outright bogus CVEs there are the less
> meaningful they will become.
That's not our problem or issue. All we care about is properly tagging
things we see as fixes for vulnerabilities. Other groups can judge the
"meaning of what a CVE is or not".
> Look, it seems there is a huge gap in our understanding of what the CVE
> is good for. I was really hopeful that the noise/signal ratio would be
> better with the new model but that doesn't seem to be the case so far. I
> was also hoped that the dispute process would be improved but judging
> from the discussion so far I have lost that hope as well.
It's better than the old process where you could NOT even dispute a CVE
and get it rejected at all, don't you think? We've already rejected a
few, probably more than has ever been rejected for the kernel in the
past few years.
We are working this out as we go, if you can come up with ways to make
this easier for everyone involved, we are more than willing to work on
it. But note, it's only been a week! Give us a chance please.
And the signal/noise all depends on your use case. Again, we can not
judge that.
> Please asked yourself again who is going to benefit from this. Stable
> users certainly not, because they are supposed to use the latest stable
> kernels so all the fixes (no matter of CVE), downstreams based on stable
> kernels will not benefit either because of the same reason. Downstreams
> which are not based on stable trees will not benefit either because of
> the low signal/noise ratio and will need to find a better way to find
> important and safe fixes. Probably the only winner will be those fancy
> CVE scanners which will have more to scan.
The overall community is going to benifit because more actual fixes will
be picked up by the distros that today are not picking them up. It's
pretty trivial to get root on most of the "enterprise" kernels today,
and on many phones, because people are NOT picking the stable releases.
I do a presentation every year at a major phone company where I show how
easy it is to break their device just because they keep insisting that
they "know" what commits to take from the stable kernels. They have
finally changed their minds and will be taking stable updates, and I
hope the enterprise kernels will as well.
But again, that's independent of the CVE process. We have taken it over
because what was there was broken and abused by many bad actors. And
again, we are now required to report as many issues as we notice, which
is what we are doing. We will get some wrong over time, but overall, I
only see complaints about 2-3 when we have already issued:
Number of assigned CVE ids, by year:
2019: 2
2020: 13
2021: 147
2022: 1
2023: 51
2024: 26
Total: 240
240 as of right now, and haven't really started the process of issuing
ids for the current stable tree (i.e. 6.7.x).
thanks,
greg k-h
On February 29, 2024 6:18:36 AM PST, Greg Kroah-Hartman <[email protected]> wrote:
>As part of the requirement to be a CNA, we have to announce everything
>that we think is a potential vulnerability, severity not be judged at
> [...]
>Again, none of this has anything to do with "severity", it only is an
>identifier that says "this fixes a vulnerability".
The language here can perhaps be improved for better understanding by folks since "CVE" and "vulnerability" can mean different things to different people. I would say "this fixes a weakness".
CVEs are for anything deemed a "weakness"[1]. It doesn't need to rise to the level of what many people would consider a "vulnerability". (Modern attacks traditionally chain many weaknesses together to form an exploit, some of which look harmless when examined in isolation.)
I find it helps to keep in mind the "CIA" acronym of what makes up a security weakness: "negative impact to Confidentiality, Integrity, or Availability". (Not to be confused with the US Gov intelligence org with the name acronym, ironically.)
-Kees
[1] https://nvd.nist.gov/vuln
--
Kees Cook
On Thu, 29 Feb 2024, Greg Kroah-Hartman wrote:
> It's pretty trivial to get root on most of the "enterprise" kernels
Wow, that's a very strong statement you are making here, and I'd now
really like to ask you to back that up with some real data.
(or at least first clarify what exactly you mean when you are saying
"enterprise kernels").
--
Jiri Kosina
SUSE Labs
On Thu, Feb 29, 2024 at 04:09:24PM +0100, Jiri Kosina wrote:
>On Thu, 29 Feb 2024, Greg Kroah-Hartman wrote:
>
>> It's pretty trivial to get root on most of the "enterprise" kernels
>
>Wow, that's a very strong statement you are making here, and I'd now
>really like to ask you to back that up with some real data.
Is something like https://www.suse.com/security/cve/CVE-2023-52447.html
a good example?
--
Thanks,
Sasha
On Thu, 29 Feb 2024, Sasha Levin wrote:
> >> It's pretty trivial to get root on most of the "enterprise" kernels
> >
> >Wow, that's a very strong statement you are making here, and I'd now
> >really like to ask you to back that up with some real data.
>
> Is something like https://www.suse.com/security/cve/CVE-2023-52447.html
> a good example?
- this fix is on our list/queue to be integrated into one of our kernel
branches, and was even beore it just got CVE assigned, as it references
a commit in Fixes: that we have present in one of our branches, but
hasn't been processed yet, mainly because we don't allow unprivileged
BPF
- you pointed to a fix for UAF in BPF, which definitely is a good fix to
have, I don't even dispute that CVE is justified in this particular
case. What I haven't yet seen though how this connects to in my view
rather serious 'trivial to get root' statement
Thanks,
--
Jiri Kosina
SUSE Labs
On Thu, 29 Feb 2024, Jiri Kosina wrote:
> - you pointed to a fix for UAF in BPF, which definitely is a good fix to
> have, I don't even dispute that CVE is justified in this particular
> case. What I haven't yet seen though how this connects to in my view
> rather serious 'trivial to get root' statement
To elaborate on this a little bit more -- I completely agree that this fix
is completely in-line with what Kees is, in my view, quite nicely
describing at [1]. You pointed to a weakness (for which a fix *is* in our
queue), sure.
But I see a HUGE leap from "fixes a weakness" to bold, aggressive and in
my view exaggerating statements a-la "I am able to trivially pwn any
kernel which is not -stable".
[1] https://lore.kernel.org/all/[email protected]/
--
Jiri Kosina
SUSE Labs
On Thu 29-02-24 07:08:06, Kees Cook wrote:
>
>
> On February 29, 2024 6:18:36 AM PST, Greg Kroah-Hartman <[email protected]> wrote:
> >As part of the requirement to be a CNA, we have to announce everything
> >that we think is a potential vulnerability, severity not be judged at
> > [...]
> >Again, none of this has anything to do with "severity", it only is an
> >identifier that says "this fixes a vulnerability".
>
> The language here can perhaps be improved for better understanding by
> folks since "CVE" and "vulnerability" can mean different things to
> different people. I would say "this fixes a weakness".
>
> CVEs are for anything deemed a "weakness"[1]. It doesn't need to rise
> to the level of what many people would consider a "vulnerability".
> (Modern attacks traditionally chain many weaknesses together to form
> an exploit, some of which look harmless when examined in isolation.)
>
> I find it helps to keep in mind the "CIA" acronym of what makes up a
> security weakness: "negative impact to Confidentiality, Integrity, or
> Availability". (Not to be confused with the US Gov intelligence org
> with the name acronym, ironically.)
Yes, this makes a lot of sense to me and I believe we are on the same
page here. We have gone in several tangents here but let me just remind
that I was objecting/wondering how much sense does it make to assign CVE
to configurations which are inherently insecure. From a user POV, does
it make me safer to have that addressed when the fundamental
configuration hole (to allow untrusted user to access said resources)
still in place. See my point or do I still fail to explain myself? It is
not about assuming usecases and whatnot.
> -Kees
>
> [1] https://nvd.nist.gov/vuln
--
Michal Hocko
SUSE Labs
On Thu, Feb 29, 2024 at 06:11:40PM +0100, Jiri Kosina wrote:
>On Thu, 29 Feb 2024, Sasha Levin wrote:
>
>> >> It's pretty trivial to get root on most of the "enterprise" kernels
>> >
>> >Wow, that's a very strong statement you are making here, and I'd now
>> >really like to ask you to back that up with some real data.
>>
>> Is something like https://www.suse.com/security/cve/CVE-2023-52447.html
>> a good example?
>
>- this fix is on our list/queue to be integrated into one of our kernel
> branches, and was even beore it just got CVE assigned, as it references
> a commit in Fixes: that we have present in one of our branches, but
> hasn't been processed yet, mainly because we don't allow unprivileged
> BPF
This comment touches on two points raised in this thread:
Greg's point that instead of taking all the fixes, they end up in queues
waiting to be processed, which means that the trees en up being
vulnerable during that time.
Kees's point that exploitation is rarely a single issue coming in to
play, but is usually a long chain of different exploits coming together
to achieve a goal.
>- you pointed to a fix for UAF in BPF, which definitely is a good fix to
> have, I don't even dispute that CVE is justified in this particular
> case. What I haven't yet seen though how this connects to in my view
> rather serious 'trivial to get root' statement
Yes, the patch reads like a fix for a UAF.
--
Thanks,
Sasha
On Thu, Feb 29, 2024 at 06:36:08PM +0100, Jiri Kosina wrote:
> On Thu, 29 Feb 2024, Jiri Kosina wrote:
>
> > - you pointed to a fix for UAF in BPF, which definitely is a good fix to
> > have, I don't even dispute that CVE is justified in this particular
> > case. What I haven't yet seen though how this connects to in my view
> > rather serious 'trivial to get root' statement
>
> To elaborate on this a little bit more -- I completely agree that this fix
> is completely in-line with what Kees is, in my view, quite nicely
> describing at [1]. You pointed to a weakness (for which a fix *is* in our
> queue), sure.
>
> But I see a HUGE leap from "fixes a weakness" to bold, aggressive and in
> my view exaggerating statements a-la "I am able to trivially pwn any
> kernel which is not -stable".
{sigh}
I do not mean _any_ enterprise kernel, I said "most", some I did not
find any problems with at all that I could tell. This was true the last
time I did this exercise about 9 or so months ago for a presentation,
and hey, it might have changed since then, I sure hope so for everyone's
sake.
Sorry, I will NOT say what distros I did, or did not, find vunerable.
That's not my place to say here in public for obvious reasons, but I'm
more than willing to discuss it over drinks in-person anytime.
thanks,
greg k-h
Michal Hocko <[email protected]> writes:
> [Let me add Michael as PPC maintainer - the thread starts with
> http://lkml.kernel.org/r/2024022257-CVE-2023-52451-7bdb@gregkh]
Sorry just saw this ...
> On Tue 27-02-24 06:14:45, Greg KH wrote:
>> On Mon, Feb 26, 2024 at 05:36:57PM +0100, Michal Hocko wrote:
> [...]
>> > All that being said I dispute the issue fixed here has any more security
>> > relevance than allowing untrusted user to control memory hotplug which
>> > could easily result in DoS of the system.
>>
>> Ok, I traced this call back, and here is the callpath, starting with a
>> write to the the 'dlpar' sysfs file (conviently NOT documented in
>> Documentation/ABI, and it looks like it violates the "one value per
>> file" rule...)
>> dlpar_store()
>> handle_dlpar_errorlog()
>> dlpar_memory()
>> dlpar_memory_remove_by_index()
>>
>> Yes, the kernel by default sets 'dlpar' to 0644, BUT that means that
>> root in a container can cause this use-after-free to happen, or if the
>> permissions are changed by userspace, or if you are in "lockdown mode",
>> or if you want to attempt the crazy "confidential computing" model, or
>> if you have a system which root is possible for some things by normal
>> users (there are lots of different security models out there...)
>
> This is all nice but please do realize that if you allow access to to
> memory hotremove to any untrusted entity (be it a root in container or
> by changing access permissions) then the machine is in a serious
> resource management control trouble already and that is a security
> threat already.
The standard container runtimes, eg. podman/docker, will mount /sys as
read-only by default. So at least in typical situations root in a
container will not be able to trigger this issue.
>> Yes, I will argue that making the sysfs file writable by userspace is
>> out of our control, but what is in our control is the fact that there is
>> a out-of-bounds write that is fixed here, and we don't want those to be
>> able to be triggered by anyone as that is a weakness in our codebase.
>
> Yes, and that is why the fix is good and nobody disputes that. What I am
> actually trying to drill down to is whether this is an actual security
> threat worth assigning a CVE or it is just yet-anothing-pointless-CVE we
> were so used to with the old process.
>
>> That is what has caused the CVE to be created here, not the fact that
>> root can remove memory as that's the normal expected operation to have
>> happen here.
>>
>> However if the maintainer of the code here disputes this, we are more
>> than willing to mark this invalid and reject the CVE.
>
> Michael, do you see any real security risk being addressed by
> bd68ffce69f6 ("powerpc/pseries/memhp: Fix access beyond end of drmem
> array").
No I don't think this bug on its own is a "real" security risk.
It allows root to print 8 bytes of kernel memory it's not supposed to,
but there's no way for the attacker to control what 8 bytes. So I doubt
that's actually useful for constructing an exploit.
But as Kees explained elsewhere, a CVE can be for any weakness, not just
actual vulnerabilties.
So under that definition I'm quite happy for this to be given a CVE.
cheers