> From: Baolu Lu <[email protected]>
> Sent: Monday, September 11, 2023 8:46 PM
>
> On 2023/9/11 14:57, Tian, Kevin wrote:
> >> From: Baolu Lu <[email protected]>
> >> Sent: Tuesday, September 5, 2023 1:24 PM
> >>
> >> Hi Kevin,
> >>
> >> I am trying to address this issue in below patch. Does it looks sane to
> >> you?
> >>
> >> iommu: Consolidate per-device fault data management
> >>
> >> The per-device fault data is a data structure that is used to store
> >> information about faults that occur on a device. This data is allocated
> >> when IOPF is enabled on the device and freed when IOPF is disabled. The
> >> data is used in the paths of iopf reporting, handling, responding, and
> >> draining.
> >>
> >> The fault data is protected by two locks:
> >>
> >> - dev->iommu->lock: This lock is used to protect the allocation and
> >> freeing of the fault data.
> >> - dev->iommu->fault_parameter->lock: This lock is used to protect the
> >> fault data itself.
> >>
> >> Improve the iopf code to enforce this lock mechanism and add a
> reference
> >> counter in the fault data to avoid use-after-free issue.
> >>
> >
> > Can you elaborate the use-after-free issue and why a new user count
> > is required?
>
> I was concerned that when iommufd uses iopf, page fault report/response
> may occur simultaneously with enable/disable PRI.
>
> Currently, this is not an issue as the enable/disable PRI is in its own
> path. In the future, we may discard this interface and enable PRI when
> attaching the first PRI-capable domain, and disable it when detaching
> the last PRI-capable domain.
Then let's not do it now until there is a real need after you have a
thorough design for iommufd.
>
> >
> > btw a Fix tag is required given this mislocking issue has been there for
> > quite some time...
>
> I don't see any real issue fixed by this change. It's only a lock
> refactoring after the code refactoring and preparing it for iommufd use.
> Perhaps I missed anything?
>
mislocking already exists today for the partial list:
- iommu_queue_iopf() uses dev->iommu->lock;
- iopf_queue_discard_partial() uses queue->lock;
On 9/13/23 10:34 AM, Tian, Kevin wrote:
>> From: Baolu Lu <[email protected]>
>> Sent: Monday, September 11, 2023 8:46 PM
>>
>> On 2023/9/11 14:57, Tian, Kevin wrote:
>>>> From: Baolu Lu <[email protected]>
>>>> Sent: Tuesday, September 5, 2023 1:24 PM
>>>>
>>>> Hi Kevin,
>>>>
>>>> I am trying to address this issue in below patch. Does it looks sane to
>>>> you?
>>>>
>>>> iommu: Consolidate per-device fault data management
>>>>
>>>> The per-device fault data is a data structure that is used to store
>>>> information about faults that occur on a device. This data is allocated
>>>> when IOPF is enabled on the device and freed when IOPF is disabled. The
>>>> data is used in the paths of iopf reporting, handling, responding, and
>>>> draining.
>>>>
>>>> The fault data is protected by two locks:
>>>>
>>>> - dev->iommu->lock: This lock is used to protect the allocation and
>>>> freeing of the fault data.
>>>> - dev->iommu->fault_parameter->lock: This lock is used to protect the
>>>> fault data itself.
>>>>
>>>> Improve the iopf code to enforce this lock mechanism and add a
>> reference
>>>> counter in the fault data to avoid use-after-free issue.
>>>>
>>>
>>> Can you elaborate the use-after-free issue and why a new user count
>>> is required?
>>
>> I was concerned that when iommufd uses iopf, page fault report/response
>> may occur simultaneously with enable/disable PRI.
>>
>> Currently, this is not an issue as the enable/disable PRI is in its own
>> path. In the future, we may discard this interface and enable PRI when
>> attaching the first PRI-capable domain, and disable it when detaching
>> the last PRI-capable domain.
>
> Then let's not do it now until there is a real need after you have a
> thorough design for iommufd.
Okay, fair enough.
>
>>
>>>
>>> btw a Fix tag is required given this mislocking issue has been there for
>>> quite some time...
>>
>> I don't see any real issue fixed by this change. It's only a lock
>> refactoring after the code refactoring and preparing it for iommufd use.
>> Perhaps I missed anything?
>>
>
> mislocking already exists today for the partial list:
>
> - iommu_queue_iopf() uses dev->iommu->lock;
> - iopf_queue_discard_partial() uses queue->lock;
So, if it's worth it, let me try splitting a fix patch.
Best regards,
baolu