2024-03-08 12:50:44

by Michal Hocko

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Wed 06-03-24 06:46:11, Greg KH wrote:
[...]
> Possible unsafe locking scenario:
>
>       CPU0                    CPU1
>       ----                    ----
>  lock(&svms->lock);
>                               lock(&mm->mmap_lock);
>                               lock(&svms->lock);
>  lock((work_completion)(&svm_bo->eviction_work));
>
> I believe this cannot really lead to a deadlock in practice, because
> svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> refcount is non-0. That means it's impossible that svm_range_bo_release
> is running concurrently. However, there is no good way to annotate this.

OK, so is this even a bug (not to mention a security/weakness)?
--
Michal Hocko
SUSE Labs


2024-03-14 11:09:48

by Lee Jones

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Fri, 08 Mar 2024, Michal Hocko wrote:

> On Wed 06-03-24 06:46:11, Greg KH wrote:
> [...]
> > Possible unsafe locking scenario:
> >
> >       CPU0                    CPU1
> >       ----                    ----
> >  lock(&svms->lock);
> >                               lock(&mm->mmap_lock);
> >                               lock(&svms->lock);
> >  lock((work_completion)(&svm_bo->eviction_work));
> >
> > I believe this cannot really lead to a deadlock in practice, because
> > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > refcount is non-0. That means it's impossible that svm_range_bo_release
> > is running concurrently. However, there is no good way to annotate this.
>
> OK, so is this even a bug (not to mention a security/weakness)?

Looks like the patch fixes a warning which can crash some kernels. So
the CVE appears to be fixing that, rather than the impossible deadlock.

--
Lee Jones [李琼斯]

2024-03-20 15:33:33

by Michal Hocko

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Thu 14-03-24 11:09:38, Lee Jones wrote:
> On Fri, 08 Mar 2024, Michal Hocko wrote:
>
> > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > [...]
> > > Possible unsafe locking scenario:
> > >
> > >       CPU0                    CPU1
> > >       ----                    ----
> > >  lock(&svms->lock);
> > >                               lock(&mm->mmap_lock);
> > >                               lock(&svms->lock);
> > >  lock((work_completion)(&svm_bo->eviction_work));
> > >
> > > I believe this cannot really lead to a deadlock in practice, because
> > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > is running concurrently. However, there is no good way to annotate this.
> >
> > OK, so is this even a bug (not to mention a security/weakness)?
>
> Looks like the patch fixes a warning which can crash some kernels. So
> the CVE appears to be fixing that, rather than the impossible deadlock.

Are you talking about lockdep warning or anything else?
--
Michal Hocko
SUSE Labs

2024-03-20 15:47:49

by Lee Jones

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Wed, 20 Mar 2024, Michal Hocko wrote:

> On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > On Fri, 08 Mar 2024, Michal Hocko wrote:
> >
> > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > [...]
> > > > Possible unsafe locking scenario:
> > > >
> > > >       CPU0                    CPU1
> > > >       ----                    ----
> > > >  lock(&svms->lock);
> > > >                               lock(&mm->mmap_lock);
> > > >                               lock(&svms->lock);
> > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > >
> > > > I believe this cannot really lead to a deadlock in practice, because
> > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > is running concurrently. However, there is no good way to annotate this.
> > >
> > > OK, so is this even a bug (not to mention a security/weakness)?
> >
> > Looks like the patch fixes a warning which can crash some kernels. So
> > the CVE appears to be fixing that, rather than the impossible deadlock.
>
> Are you talking about lockdep warning or anything else?

Anything that triggers a BUG() or a WARN() (as per the splat in the
commit message). Many in-field kernels are configured to panic on
BUG()s and WARN()s, so triggering them is presently considered a local
DoS and attracts CVE status.

--
Lee Jones [李琼斯]

2024-03-20 16:51:37

by Lee Jones

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Wed, 20 Mar 2024, Lee Jones wrote:

> On Wed, 20 Mar 2024, Michal Hocko wrote:
>
> > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > >
> > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > [...]
> > > > > Possible unsafe locking scenario:
> > > > >
> > > > >       CPU0                    CPU1
> > > > >       ----                    ----
> > > > >  lock(&svms->lock);
> > > > >                               lock(&mm->mmap_lock);
> > > > >                               lock(&svms->lock);
> > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > >
> > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > is running concurrently. However, there is no good way to annotate this.
> > > >
> > > > OK, so is this even a bug (not to mention a security/weakness)?
> > >
> > > Looks like the patch fixes a warning which can crash some kernels. So
> > > the CVE appears to be fixing that, rather than the impossible deadlock.
> >
> > Are you talking about lockdep warning or anything else?
>
> Anything that triggers a BUG() or a WARN() (as per the splat in the
> commit message). Many in-field kernels are configured to panic on
> BUG()s and WARN()s, thus triggering them are presently considered local
> DoS and attract CVE status.

We have discussed this internally and agree with your thinking.

The splat in the circular lockdep detection code appears to be generated
using some stacked pr_warn() calls, rather than a WARN().

Thus, CVE-2024-26628 has now been rejected.

https://lore.kernel.org/all/[email protected]/

Thank you for your input, Michal.

--
Lee Jones [李琼斯]

2024-03-20 17:13:03

by Michal Hocko

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Wed 20-03-24 16:51:27, Lee Jones wrote:
> On Wed, 20 Mar 2024, Lee Jones wrote:
>
> > On Wed, 20 Mar 2024, Michal Hocko wrote:
> >
> > > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > > >
> > > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > > [...]
> > > > > > Possible unsafe locking scenario:
> > > > > >
> > > > > >       CPU0                    CPU1
> > > > > >       ----                    ----
> > > > > >  lock(&svms->lock);
> > > > > >                               lock(&mm->mmap_lock);
> > > > > >                               lock(&svms->lock);
> > > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > > >
> > > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > > is running concurrently. However, there is no good way to annotate this.
> > > > >
> > > > > OK, so is this even a bug (not to mention a security/weakness)?
> > > >
> > > > Looks like the patch fixes a warning which can crash some kernels. So
> > > > the CVE appears to be fixing that, rather than the impossible deadlock.
> > >
> > > Are you talking about lockdep warning or anything else?
> >
> > Anything that triggers a BUG() or a WARN() (as per the splat in the
> > commit message). Many in-field kernels are configured to panic on
> > BUG()s and WARN()s, thus triggering them are presently considered local
> > DoS and attract CVE status.

Yes, I do agree that a WARN() should be treated the same as a BUG() if
triggerable by a user (for the reasons you have mentioned). Lockdep is a
different thing, as you follow up below.

> We have discussed this internally and agree with your thinking.
>
> The splat in the circular lockdep detection code appears to be generated
> using some stacked pr_warn() calls, rather than a WARN().
>
> Thus, CVE-2024-26628 has now been rejected.
>
> https://lore.kernel.org/all/[email protected]/
>
> Thank you for your input Michal.

Thanks!
--
Michal Hocko
SUSE Labs

2024-06-13 09:32:58

by Pavel Machek

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Wed 2024-03-20 15:47:34, Lee Jones wrote:
> On Wed, 20 Mar 2024, Michal Hocko wrote:
>
> > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > >
> > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > [...]
> > > > > Possible unsafe locking scenario:
> > > > >
> > > > >       CPU0                    CPU1
> > > > >       ----                    ----
> > > > >  lock(&svms->lock);
> > > > >                               lock(&mm->mmap_lock);
> > > > >                               lock(&svms->lock);
> > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > >
> > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > is running concurrently. However, there is no good way to annotate this.
> > > >
> > > > OK, so is this even a bug (not to mention a security/weakness)?
> > >
> > > Looks like the patch fixes a warning which can crash some kernels. So
> > > the CVE appears to be fixing that, rather than the impossible deadlock.
> >
> > Are you talking about lockdep warning or anything else?
>
> Anything that triggers a BUG() or a WARN() (as per the splat in the
> commit message). Many in-field kernels are configured to panic on
> BUG()s and WARN()s, thus triggering them are presently considered local
> DoS and attract CVE status.

So... because it is possible to configure a machine to reboot on a
warning, now every warning is a security issue?

Lockdep is for debugging; if someone uses it in production with panic
on reboot, they are getting exactly what they are asking for.

Not a security problem.
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.



2024-06-13 10:17:19

by Greg Kroah-Hartman

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Thu, Jun 13, 2024 at 11:32:41AM +0200, Pavel Machek wrote:
> On Wed 2024-03-20 15:47:34, Lee Jones wrote:
> > On Wed, 20 Mar 2024, Michal Hocko wrote:
> >
> > > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > > >
> > > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > > [...]
> > > > > > Possible unsafe locking scenario:
> > > > > >
> > > > > >       CPU0                    CPU1
> > > > > >       ----                    ----
> > > > > >  lock(&svms->lock);
> > > > > >                               lock(&mm->mmap_lock);
> > > > > >                               lock(&svms->lock);
> > > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > > >
> > > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > > is running concurrently. However, there is no good way to annotate this.
> > > > >
> > > > > OK, so is this even a bug (not to mention a security/weakness)?
> > > >
> > > > Looks like the patch fixes a warning which can crash some kernels. So
> > > > the CVE appears to be fixing that, rather than the impossible deadlock.
> > >
> > > Are you talking about lockdep warning or anything else?
> >
> > Anything that triggers a BUG() or a WARN() (as per the splat in the
> > commit message). Many in-field kernels are configured to panic on
> > BUG()s and WARN()s, thus triggering them are presently considered local
> > DoS and attract CVE status.
>
> So... because it is possible to configure machine to reboot on
> warning, now every warning is a security issue?
>
> Lockdep is for debugging, if someone uses it in production with panic
> on reboot, they are getting exactly what they are asking for.
>
> Not a security problem.

And we agree, I don't know what you are arguing about here, please stop.

greg k-h

2024-06-13 10:45:25

by Pavel Machek

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Thu 2024-06-13 12:16:50, Greg Kroah-Hartman wrote:
> On Thu, Jun 13, 2024 at 11:32:41AM +0200, Pavel Machek wrote:
> > On Wed 2024-03-20 15:47:34, Lee Jones wrote:
> > > On Wed, 20 Mar 2024, Michal Hocko wrote:
> > >
> > > > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > > > >
> > > > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > > > [...]
> > > > > > > Possible unsafe locking scenario:
> > > > > > >
> > > > > > >       CPU0                    CPU1
> > > > > > >       ----                    ----
> > > > > > >  lock(&svms->lock);
> > > > > > >                               lock(&mm->mmap_lock);
> > > > > > >                               lock(&svms->lock);
> > > > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > > > >
> > > > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > > > is running concurrently. However, there is no good way to annotate this.
> > > > > >
> > > > > > OK, so is this even a bug (not to mention a security/weakness)?
> > > > >
> > > > > Looks like the patch fixes a warning which can crash some kernels. So
> > > > > the CVE appears to be fixing that, rather than the impossible deadlock.
> > > >
> > > > Are you talking about lockdep warning or anything else?
> > >
> > > Anything that triggers a BUG() or a WARN() (as per the splat in the
> > > commit message). Many in-field kernels are configured to panic on
> > > BUG()s and WARN()s, thus triggering them are presently considered local
> > > DoS and attract CVE status.
> >
> > So... because it is possible to configure machine to reboot on
> > warning, now every warning is a security issue?
> >
> > Lockdep is for debugging, if someone uses it in production with panic
> > on reboot, they are getting exactly what they are asking for.
> >
> > Not a security problem.
>
> And we agree, I don't know what you are arguing about here, please stop.

So you agree that a WARN triggering randomly is not a security problem?

The following communication did not say so.

"The splat in the circular lockdep detection code appears to be generated
using some stacked pr_warn() calls, rather than a WARN()."
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.



2024-06-13 10:49:16

by Greg Kroah-Hartman

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Thu, Jun 13, 2024 at 12:40:35PM +0200, Pavel Machek wrote:
> On Thu 2024-06-13 12:16:50, Greg Kroah-Hartman wrote:
> > On Thu, Jun 13, 2024 at 11:32:41AM +0200, Pavel Machek wrote:
> > > On Wed 2024-03-20 15:47:34, Lee Jones wrote:
> > > > On Wed, 20 Mar 2024, Michal Hocko wrote:
> > > >
> > > > > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > > > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > > > > >
> > > > > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > > > > [...]
> > > > > > > > Possible unsafe locking scenario:
> > > > > > > >
> > > > > > > >       CPU0                    CPU1
> > > > > > > >       ----                    ----
> > > > > > > >  lock(&svms->lock);
> > > > > > > >                               lock(&mm->mmap_lock);
> > > > > > > >                               lock(&svms->lock);
> > > > > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > > > > >
> > > > > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > > > > is running concurrently. However, there is no good way to annotate this.
> > > > > > >
> > > > > > > OK, so is this even a bug (not to mention a security/weakness)?
> > > > > >
> > > > > > Looks like the patch fixes a warning which can crash some kernels. So
> > > > > > the CVE appears to be fixing that, rather than the impossible deadlock.
> > > > >
> > > > > Are you talking about lockdep warning or anything else?
> > > >
> > > > Anything that triggers a BUG() or a WARN() (as per the splat in the
> > > > commit message). Many in-field kernels are configured to panic on
> > > > BUG()s and WARN()s, thus triggering them are presently considered local
> > > > DoS and attract CVE status.
> > >
> > > So... because it is possible to configure machine to reboot on
> > > warning, now every warning is a security issue?
> > >
> > > Lockdep is for debugging, if someone uses it in production with panic
> > > on reboot, they are getting exactly what they are asking for.
> > >
> > > Not a security problem.
> >
> > And we agree, I don't know what you are arguing about here, please stop.
>
> So you agree that WARN triggering randomly is not a security problem?
>
> Following communication did not say so.
>
> "The splat in the circular lockdep detection code appears to be generated
> using some stacked pr_warn() calls, rather than a WARN()."

*plonk*

2024-06-13 11:45:12

by Lee Jones

Subject: Re: CVE-2024-26628: drm/amdkfd: Fix lock dependency warning

On Thu, 13 Jun 2024, Pavel Machek wrote:

> On Thu 2024-06-13 12:16:50, Greg Kroah-Hartman wrote:
> > On Thu, Jun 13, 2024 at 11:32:41AM +0200, Pavel Machek wrote:
> > > On Wed 2024-03-20 15:47:34, Lee Jones wrote:
> > > > On Wed, 20 Mar 2024, Michal Hocko wrote:
> > > >
> > > > > On Thu 14-03-24 11:09:38, Lee Jones wrote:
> > > > > > On Fri, 08 Mar 2024, Michal Hocko wrote:
> > > > > >
> > > > > > > On Wed 06-03-24 06:46:11, Greg KH wrote:
> > > > > > > [...]
> > > > > > > > Possible unsafe locking scenario:
> > > > > > > >
> > > > > > > >       CPU0                    CPU1
> > > > > > > >       ----                    ----
> > > > > > > >  lock(&svms->lock);
> > > > > > > >                               lock(&mm->mmap_lock);
> > > > > > > >                               lock(&svms->lock);
> > > > > > > >  lock((work_completion)(&svm_bo->eviction_work));
> > > > > > > >
> > > > > > > > I believe this cannot really lead to a deadlock in practice, because
> > > > > > > > svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
> > > > > > > > refcount is non-0. That means it's impossible that svm_range_bo_release
> > > > > > > > is running concurrently. However, there is no good way to annotate this.
> > > > > > >
> > > > > > > OK, so is this even a bug (not to mention a security/weakness)?
> > > > > >
> > > > > > Looks like the patch fixes a warning which can crash some kernels. So
> > > > > > the CVE appears to be fixing that, rather than the impossible deadlock.
> > > > >
> > > > > Are you talking about lockdep warning or anything else?
> > > >
> > > > Anything that triggers a BUG() or a WARN() (as per the splat in the
> > > > commit message). Many in-field kernels are configured to panic on
> > > > BUG()s and WARN()s, thus triggering them are presently considered local
> > > > DoS and attract CVE status.
> > >
> > > So... because it is possible to configure machine to reboot on
> > > warning, now every warning is a security issue?
> > >
> > > Lockdep is for debugging, if someone uses it in production with panic
> > > on reboot, they are getting exactly what they are asking for.
> > >
> > > Not a security problem.
> >
> > And we agree, I don't know what you are arguing about here, please stop.
>
> So you agree that WARN triggering randomly is not a security problem?
>
> Following communication did not say so.
>
> "The splat in the circular lockdep detection code appears to be generated
> using some stacked pr_warn() calls, rather than a WARN()."

We agree that the lockdep detection is a debugging feature AND that even
though the splat looks like a WARN(), it does not behave like one.
Therefore it does not constitute a security issue.

However, yes, we believe that if an attacker can trip a WARN() and
reboot a victim's machine on demand then this is equivalent to a local
DoS attack and merits CVE status.

--
Lee Jones [李琼斯]
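
The distinction the thread settles on can be sketched in userspace C. This
is a simplified model under stated assumptions, not kernel code: the names
model_panic(), model_pr_warn(), model_warn() and run_scenario() are
hypothetical, and the only behaviour modelled is that a WARN()-style report
consults a panic-on-warn setting while a pr_warn()-style message is only
printed.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's panic_on_warn setting. */
static bool panic_on_warn;
/* Set when the model "panics"; a real kernel would halt or reboot. */
static bool panicked;

static void model_panic(const char *reason)
{
    panicked = true;
    fprintf(stderr, "model panic: %s\n", reason);
}

/* pr_warn()-style message: printed at warning level, nothing more. */
static void model_pr_warn(const char *msg)
{
    fprintf(stderr, "warning: %s\n", msg);
}

/* WARN()-style report: printed, then escalated if panic_on_warn is set. */
static void model_warn(const char *msg)
{
    fprintf(stderr, "WARNING: %s\n", msg);
    if (panic_on_warn)
        model_panic("panic_on_warn set");
}

/* Returns true if the chosen reporting path brought the model down. */
bool run_scenario(bool panic_on_warn_setting, bool use_warn)
{
    panic_on_warn = panic_on_warn_setting;
    panicked = false;
    if (use_warn)
        model_warn("unexpected state");
    else
        model_pr_warn("possible circular locking dependency detected");
    return panicked;
}
```

In this model only run_scenario(true, true) ends in a panic; the
pr_warn()-style path never does, whatever the setting. That mirrors the
reasoning given above for rejecting the CVE while still treating a
user-triggerable WARN() as a local DoS.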