Linus,
please pull the latest core/urgent branch from:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git core-urgent-2022-05-08
up to: 2667ed10d9f0: mm: Fix PASID use-after-free issue
A single bugfix for the PASID management code, which freed the PASID too
early. The PASID needs to be tied to the mm lifetime, not to the address
space lifetime.
Thanks,
tglx
------------------>
Fenghua Yu (1):
mm: Fix PASID use-after-free issue
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 9796897560ab..35a3beff140b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -792,6 +792,7 @@ void __mmdrop(struct mm_struct *mm)
mmu_notifier_subscriptions_destroy(mm);
check_mm(mm);
put_user_ns(mm->user_ns);
+ mm_pasid_drop(mm);
free_mm(mm);
}
EXPORT_SYMBOL_GPL(__mmdrop);
@@ -1190,7 +1191,6 @@ static inline void __mmput(struct mm_struct *mm)
}
if (mm->binfmt)
module_put(mm->binfmt->module);
- mm_pasid_drop(mm);
mmdrop(mm);
}
On Sun, May 8, 2022 at 5:05 AM Thomas Gleixner <[email protected]> wrote:
>
> A single bugfix for the PASID management code, which freed the PASID too
> early. The PASID needs to be tied to the mm lifetime, not to the address
> space lifetime.
So I have to once more complain about the -tip tree "Link:" usage.
Again, the commit has a link to the patch *submission*, which is
almost entirely useless. There's no link to the actual problem the
patch fixes.
It does have a "Fixes:" link, and it has an "explanation", but the
explanation doesn't actually make much sense to me.
The *sensible* thing for an iommu driver to do is to just do a
mmget/mmput, and then the whole "drop PASIX on last mmput() (ie in
__mmput)" makes lots of sense.
The commit claims that's not what the iommu drivers do, but it's
really hard to tell *what* it is that does a mmgrab. It just says
"the IOMMU driver holds a reference on the mm itself, not the address space"
but then I tried to see where such references are held, and I couldn't find it.
*Must* users do mmget/mmput, and the commit even says that's the
logical thing to do. Apparently something that the iommu code relies
on does the mmgrab/mmdrop thing, but I really would have liked to know
what that is.
Yes, "mmgab" is the right thing to do if all you want is the "struct
mm_struct", and don't actually care about the page tables themselves,
and don't want to have the refcount keep page tables alive. But you'd
think the iommu code really does want the page tables.
So it would have been really nice to have an explanation for what it
was that did the mmgrab, especially since the commit itself makes it
clear that the logical thing to do seems to be just mmget/put. No?
I _really_ wish the -tip tree had more links to the actual problem
reports and explanations, rather than links to the patch submission.
One has the actual issue. The other just has the information that is
already in the commit, and the "Link:" adds almost no actual value
(yes, yes, you can see late "Acked-by" etc, but that really isn't
worth it).
I've pulled this, because I definitely believe the issue is real. I
just wish I could see _what_ the issue was.
Put another way: I can see that
Reported-by: Zhangfei Gao <[email protected]>
in the commit, but I don't have a clue what the actual report was, and
there really isn't enough information in the commit itself, except for
a fairly handwavy "Device drivers might, for instance, still need to
flush operations.."
I don't want to know what device drivers _might_ do. I would want to
have an actual pointer to what they do and where.
I suspect it's related to mmu_notifiers or something, but I really
would have liked a more exact "this is where things go wrong".
I also *suspect* that this is all about that _loong_ thread (picking
one email almost at random) here:
https://lore.kernel.org/all/[email protected]/
but if so, the confusion I see in that thread is even more reason to
have more background for what is going on.
Anyway, this is merged in my tree, but I'm having trouble following
the logic, which is why I'm writing this long email to complain about
lack of context.
Linus
On Sun, May 8, 2022 at 11:09 AM Linus Torvalds
<[email protected]> wrote:
>
> Looks like it is
>
> ->(*sva_bind)()
> -> intel_svm_bind_mm()
> -> mmu_notifier_register(&svm->notifier, mm)
>
> and yes, the mmu notifiers annoyingly end up doing an mmgrab [..]
Side note: quite independently of this mmgrab issue, I think the code
in question is *very* suspect and horrendously fragile.
In particular, the code ends up being called through things like this:
handle = iommu_sva_bind_device(uacce->parent, current->mm, NULL);
and then that Intel svm.c code does this:
svm->pasid = mm->pasid;
svm->mm = mm;
svm->flags = flags;
and saves off that mm pointer in the 'svm' structure.
AND IT NEVER TAKES ANY REFERENCE TO IT AT ALL!
It then does
mm = svm->mm;
later at some unspecified time, and the 'mm' might long since have died.
In other words, the code works almost by accident - the only user of
the 'mm' pointer seems to be that mmu_notifier_register() thing, so it
basically treats the 'struct mm_struct' as something as a random
cookie.
And yes, the mmu notifiers do then take that mmgrab reference to the
mm, so it all works.
But it sure looks horrendously ugly. Saving off a 'struct mm_struct'
pointer with having basically an accidental reference to it is WRONG.
In fact, it will save off that pointer whether it then actually does
the mmu_notifier thing on it, because the code actually does
[...]
svm->mm = mm;
svm->flags = flags;
INIT_LIST_HEAD_RCU(&svm->devs);
if (!(flags & SVM_FLAG_SUPERVISOR_MODE)) {
svm->notifier.ops = &intel_mmuops;
ret = mmu_notifier_register(&svm->notifier, mm);
[...]
so that mmu_notifier_register() call is conditional. On the freeing
path, it then uses that "svm->notifier.ops" pointer as a "did we
register this thing or not" flag, so again - it all technically
*works*, but this is all horrendously ugly and wrong on so many
levels, keeping pointers around with very dubious reference counting
indeed.
Linus
On Sun, May 8, 2022 at 11:00 AM Linus Torvalds
<[email protected]> wrote:
>
> I suspect it's related to mmu_notifiers or something, but I really
> would have liked a more exact "this is where things go wrong".
Looks like it is
->(*sva_bind)()
-> intel_svm_bind_mm()
-> mmu_notifier_register(&svm->notifier, mm)
and yes, the mmu notifiers annoyingly end up doing an mmgrab, because
they actually don't want to affect the state of mm teardown, they just
want to be notified about it.
Annoying, but it seems to explain the issue if I followed this right.
Unlike the commit message.
Linus
Hi,
On Sun, May 08, 2022 at 11:36:23AM -0700, Linus Torvalds wrote:
> And yes, the mmu notifiers do then take that mmgrab reference to the
> mm, so it all works.
>
> But it sure looks horrendously ugly. Saving off a 'struct mm_struct'
> pointer with having basically an accidental reference to it is WRONG.
Yes it is fragile. We're currently reworking iommu_sva_bind_device() to
factor more code into the IOMMU core and to add explicit mmgrab/mmdrop, so
once this lands we won't depend on mmu_notifier_register() anymore to hold
the mm reference.
https://lore.kernel.org/linux-iommu/[email protected]/
Thanks,
Jean
On 08.05.22 20:00, Linus Torvalds wrote:
> On Sun, May 8, 2022 at 5:05 AM Thomas Gleixner <[email protected]> wrote:
>>
>> A single bugfix for the PASID management code, which freed the PASID too
>> early. The PASID needs to be tied to the mm lifetime, not to the address
>> space lifetime.
>
> So I have to once more complain about the -tip tree "Link:" usage.
Many thx for reminding people about the tag. FWIW, that's a problem in
a lot or subsystems and makes my regression tracking efforts hard, as my
tracking bot relies on the 'Link:' tag. If it's missing I thus have to
manually search if patches were posted or committed to fix a regression,
which makes the tracking hard and annoying. :-/
> Again, the commit has a link to the patch *submission*, which is
> almost entirely useless. There's no link to the actual problem the
> patch fixes.
It seems quite a few developers are under the impressions that "Link:"
is just for the patch submission and not to be used for other things.
That's why some people invented other tags. I see "BugLink" quite often
these days, but there are also other tags in use (some drm people used
"References:" for a while).
Do we care? Should we try to clean this up while making things a bit
more straight forward at the same time by making it more obvious what a
link is actually about? I once suggested we use something like
* "Submitted:" or "Posted:" for the patch submission
* "Reported:" or "BugLink:" for any reports that lead to the
That would leave "Link:" ambiguous and usable for anything else (and b4
likely could be fixed easily to set a different tag; but sure, there
would be a transition period).
But there was not much feedback on the idea. Do you think I should pick
up this again? Or is this something I should bring up during this years
kernel summit?
> [...]
> Put another way: I can see that
> Reported-by: Zhangfei Gao <[email protected]>
With a "Reported:" tag like mentioned above we could stop using
"Reported-by:" if we wanted to reduce the overhead (or make it
optional). But OTOH I guess it's a bad idea, as having this in there
will motivate some people to submit reports. And is good for stats reg.
syzbot and 0-day (but I guess those could be generated from proper
links, too).
BTW: Documentation/process/5.Posting.rst states '''Be careful in the
addition of tags to your patches: only Cc: is appropriate for addition
without the explicit permission of the person named.''' Is that actually
true? A lot of people seem to set "Reported-by:" without getting
explicit permission. If that is fine I'd prepare a patch to fix the docs.
Ciao, Thorsten
Borislav Petkov <[email protected]> writes:
> On Tue, May 10, 2022 at 01:27:54PM +0200, Thorsten Leemhuis wrote:
>> Many thx for reminding people about the tag. FWIW, that's a problem in
>> a lot or subsystems and makes my regression tracking efforts hard, as my
>> tracking bot relies on the 'Link:' tag. If it's missing I thus have to
>> manually search if patches were posted or committed to fix a regression,
>> which makes the tracking hard and annoying. :-/
>
> Here's my experience with the Link thing:
>
> So it is trivial to take the Message-ID and turn it into a link tag and
> our automation does that.
>
> - Now, it is not a problem when that link tag points to a patch which is
> part of the thread which contains the initial bug report - you just go
> up-thread.
>
> - If the link tag points to a patch which is version N and it is the
> version which passed all review and gets committed, it is a bit harder
> to find the previous versions and find the whole discussion how it all
> arrived at version N. You can search by the Subject, ofc, which, if it
> hasn't been changed, will give you the previous threads. And so on ...
>
> - The problem is when the discussion happened somewhere and the patch
> got submitted separately. I can't think of a good way to automate
> that so we have to pay attention and fix the link tag by hand and add
> the relevant one. And I try to do that when I'm especially awake when
> applying the patch.
That doesn't scale though, it puts more work on maintainers, who already
don't have enough time.
The *submitter* should be putting all the relevant info in the patch,
including any links to other discussions, previous versions etc. etc.
Then the maintainer can automatically add the "Link:" tag pointing to
the submission, and everything is there in the archive.
One advantage of linking back to the original submission is that if the
patch doesn't have all the relevant info, anyone can post replies adding
context or linking to other places, even after the patch has been
committed.
cheers
On Tue, May 10, 2022 at 01:27:54PM +0200, Thorsten Leemhuis wrote:
> Many thx for reminding people about the tag. FWIW, that's a problem in
> a lot or subsystems and makes my regression tracking efforts hard, as my
> tracking bot relies on the 'Link:' tag. If it's missing I thus have to
> manually search if patches were posted or committed to fix a regression,
> which makes the tracking hard and annoying. :-/
Here's my experience with the Link thing:
So it is trivial to take the Message-ID and turn it into a link tag and
our automation does that.
- Now, it is not a problem when that link tag points to a patch which is
part of the thread which contains the initial bug report - you just go
up-thread.
- If the link tag points to a patch which is version N and it is the
version which passed all review and gets committed, it is a bit harder
to find the previous versions and find the whole discussion how it all
arrived at version N. You can search by the Subject, ofc, which, if it
hasn't been changed, will give you the previous threads. And so on ...
- The problem is when the discussion happened somewhere and the patch
got submitted separately. I can't think of a good way to automate
that so we have to pay attention and fix the link tag by hand and add
the relevant one. And I try to do that when I'm especially awake when
applying the patch.
So I think we should simply pay attention to making sure the link tags
point to the relevant discussion. And we even document that:
"If related discussions or any other background information behind the
change can be found on the web, add 'Link:' tags pointing to it."
so we better follow through with it.
:-)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Wed, May 11, 2022 at 8:50 AM Theodore Ts'o <[email protected]> wrote:
>
> I would argue that it should be the patch submitter's responsibility
> to explicitly add a URL to the problem report.
I agree in the perfect case.
But in practice, we have a lot more patch submitters than we have
maintainers, and not all "leaf developers" necessarily know how to do
everything.
So the maintainer should probably expect to fix things up. Not always,
but also not a "the developer should have done this, so I won't do it"
This isn't so different from the fact that not everybody writes
English proficiently - people do hopefully end up fixing things up as
they get passed onwards.
Linus
On Wed, May 11, 2022 at 10:37:35AM +0200, Borislav Petkov wrote:
> - The problem is when the discussion happened somewhere and the patch
> got submitted separately. I can't think of a good way to automate
> that so we have to pay attention and fix the link tag by hand and add
> the relevant one. And I try to do that when I'm especially awake when
> applying the patch.
I would argue that it should be the patch submitter's responsibility
to explicitly add a URL to the problem report. In some cases this
might be a pointer to a bug tracking system; in other cases it might
be a URL to lore.kernel.org.
- Ted
On Wed, May 11, 2022 at 08:55:34AM -0700, Linus Torvalds wrote:
> On Wed, May 11, 2022 at 8:50 AM Theodore Ts'o <[email protected]> wrote:
> >
> > I would argue that it should be the patch submitter's responsibility
> > to explicitly add a URL to the problem report.
>
> I agree in the perfect case.
>
> But in practice, we have a lot more patch submitters than we have
> maintainers, and not all "leaf developers" necessarily know how to do
> everything.
>
> So the maintainer should probably expect to fix things up. Not always,
> but also not a "the developer should have done this, so I won't do it"
>
> This isn't so different from the fact that not everybody writes
> English proficiently - people do hopefully end up fixing things up as
> they get passed onwards.
And, in addition, what happens most often in my experience is I
constantly get to point submitters to our process documentation -
submitting-patches especially - as not a small number of them are not
aware of different aspects of the whole patch dance: tags, SOB chains,
etc. And the Link tag is no exception here.
So yeah, the maintainer is kinda the last one to make sure the patch
looks somewhat sane and documents the whole issue so that years from
now, it can still be understood what we were fixing there.
And I guess important part of that documentation is setting the proper
Link so...
And as said, yeah, we won't be perfect and catch all cases but at least
we can pay attention.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On 11.05.22 21:35, Borislav Petkov wrote:
> On Wed, May 11, 2022 at 08:55:34AM -0700, Linus Torvalds wrote:
>> On Wed, May 11, 2022 at 8:50 AM Theodore Ts'o <[email protected]> wrote:
>>>
>>> I would argue that it should be the patch submitter's responsibility
>>> to explicitly add a URL to the problem report.
>>
>> I agree in the perfect case.
>>
>> But in practice, we have a lot more patch submitters than we have
>> maintainers, and not all "leaf developers" necessarily know how to do
>> everything.
>>
>> So the maintainer should probably expect to fix things up. Not always,
>> but also not a "the developer should have done this, so I won't do it"
>>
>> This isn't so different from the fact that not everybody writes
>> English proficiently - people do hopefully end up fixing things up as
>> they get passed onwards.
>
> And, in addition, what happens most often in my experience is I
> constantly get to point submitters to our process documentation -
> submitting-patches especially - as not a small number of them are not
> aware of different aspects of the whole patch dance: tags, SOB chains,
> etc. And the Link tag is no exception here.
Which leads to the question: can we (and do we want to) teach
scripts/checkpatch.pl to point out when a Link: tag is missing and
likely appropriate? If a "Reported-by:" is present there should be a
"Link:" as well, unless the issue was reported privately, via IRC or
something like that. A "Fixes:" tag is also a strong indicator that a
link might be appropriate, but not as good.
Ciao, Thorsten
On Wed, May 11, 2022 at 08:55:34AM -0700, Linus Torvalds wrote:
> On Wed, May 11, 2022 at 8:50 AM Theodore Ts'o <[email protected]> wrote:
> >
> > I would argue that it should be the patch submitter's responsibility
> > to explicitly add a URL to the problem report.
>
> I agree in the perfect case.
>
> But in practice, we have a lot more patch submitters than we have
> maintainers, and not all "leaf developers" necessarily know how to do
> everything.
>
> So the maintainer should probably expect to fix things up. Not always,
> but also not a "the developer should have done this, so I won't do it"
Sure... but *because* maintainers are significantly outnumbered by the
patch submitters, what we should document is that it is the
developer's responsibility to provide the link, just as it is the
developer's responsibility to at least *try* to write a clear commit
description.
> This isn't so different from the fact that not everybody writes
> English proficiently - people do hopefully end up fixing things up as
> they get passed onwards.
Maintainers can be the backstop, sure, but if we are trying to make
Maintainers scale well, developers should at least feel obligated to
**try** to lighten the load on maintainers to the limits of their
ability.
And I have a bit more sympathy over someone who doesn't speak English
as their first language, over someone who can't be bothered to look up
a bug report link or a Syzkaller ID.
I guess we could always NACK such a patch until they provide that
information, but if it's for a critical bug fix, that's not
necessarily a good alternative either. I guess that's why the
Maintainers get paid the big bucks....
- Ted
On Thu, May 12, 2022 at 10:43:07AM +0200, Thorsten Leemhuis wrote:
> Which leads to the question: can we (and do we want to) teach
> scripts/checkpatch.pl to point out when a Link: tag is missing and
> likely appropriate? If a "Reported-by:" is present there should be a
> "Link:" as well, unless the issue was reported privately, via IRC or
> something like that. A "Fixes:" tag is also a strong indicator that a
> link might be appropriate, but not as good.
All good ideas, sure.
At least pointing it out as a hint - not necessarily as a warning -
would be a good idea. And say, "hey, and while you're adding a Link
tag, pls make sure it points to the mail which has the most relevant
discussion on the matter your patch is fixing."
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette