2014-10-24 19:41:18

by Jesse Barnes

[permalink] [raw]
Subject: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
simply, rather than doing tricks with page refs and get_user_pages().

Signed-off-by: Jesse Barnes <[email protected]>
---
mm/memory.c | 1 +
mm/mmap.c | 2 ++
2 files changed, 3 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 1cc6bfb..969ff0c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,

return ret;
}
+EXPORT_SYMBOL_GPL(handle_mm_fault);

#ifndef __PAGETABLE_PUD_FOLDED
/*
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f85520..2ee7971 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
}
#endif

+EXPORT_SYMBOL_GPL(find_extend_vma);
+
/*
* Ok - we have the memory areas we should free on the vma list,
* so release them, and do the vma updates.
--
1.9.1


2014-10-24 19:41:17

by Jesse Barnes

[permalink] [raw]
Subject: [PATCH 2/2] iommu/amd: use handle_mm_fault directly

This could be useful for debug in the future if we want to track
major/minor faults more closely, and also avoids the put_page trick we
used with gup.

In order to do this, we also track the task struct in the PASID state
structure. This lets us update the appropriate task stats after the
fault has been handled, and may aid with debug in the future as well.

Signed-off-by: Jesse Barnes <[email protected]>
---
drivers/iommu/amd_iommu_v2.c | 93 +++++++++++++++++++++++++++++---------------
1 file changed, 62 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 90d734b..b23481b 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -47,6 +47,7 @@ struct pasid_state {
atomic_t count; /* Reference count */
unsigned mmu_notifier_count; /* Counting nested mmu_notifier
calls */
+ struct task_struct *task; /* task_struct for accounting */
struct mm_struct *mm; /* mm_struct for the faults */
struct mmu_notifier mn; /* mmu_notifier handle */
struct pri_queue pri[PRI_QUEUE_SIZE]; /* PRI tag states */
@@ -513,45 +514,74 @@ static void finish_pri_tag(struct device_state *dev_state,
spin_unlock_irqrestore(&pasid_state->lock, flags);
}

+static void handle_fault_error(struct fault *fault)
+{
+ int status;
+
+ if (!fault->dev_state->inv_ppr_cb) {
+ set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+ return;
+ }
+
+ status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
+ fault->pasid,
+ fault->address,
+ fault->flags);
+ switch (status) {
+ case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
+ set_pri_tag_status(fault->state, fault->tag, PPR_SUCCESS);
+ break;
+ case AMD_IOMMU_INV_PRI_RSP_INVALID:
+ set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+ break;
+ case AMD_IOMMU_INV_PRI_RSP_FAIL:
+ set_pri_tag_status(fault->state, fault->tag, PPR_FAILURE);
+ break;
+ default:
+ BUG();
+ }
+}
+
static void do_fault(struct work_struct *work)
{
struct fault *fault = container_of(work, struct fault, work);
- int npages, write;
- struct page *page;
+ struct mm_struct *mm;
+ struct vm_area_struct *vma;
+ struct task_struct *task;
+ u64 address;
+ int ret, write;

write = !!(fault->flags & PPR_FAULT_WRITE);

- down_read(&fault->state->mm->mmap_sem);
- npages = get_user_pages(NULL, fault->state->mm,
- fault->address, 1, write, 0, &page, NULL);
- up_read(&fault->state->mm->mmap_sem);
-
- if (npages == 1) {
- put_page(page);
- } else if (fault->dev_state->inv_ppr_cb) {
- int status;
-
- status = fault->dev_state->inv_ppr_cb(fault->dev_state->pdev,
- fault->pasid,
- fault->address,
- fault->flags);
- switch (status) {
- case AMD_IOMMU_INV_PRI_RSP_SUCCESS:
- set_pri_tag_status(fault->state, fault->tag, PPR_SUCCESS);
- break;
- case AMD_IOMMU_INV_PRI_RSP_INVALID:
- set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
- break;
- case AMD_IOMMU_INV_PRI_RSP_FAIL:
- set_pri_tag_status(fault->state, fault->tag, PPR_FAILURE);
- break;
- default:
- BUG();
- }
- } else {
- set_pri_tag_status(fault->state, fault->tag, PPR_INVALID);
+ task = fault->state->task;
+ mm = fault->state->mm;
+ address = fault->address;
+
+ down_read(&mm->mmap_sem);
+ vma = find_extend_vma(mm, address);
+ if (!vma || address < vma->vm_start) {
+ /* failed to get a vma in the right range */
+ up_read(&mm->mmap_sem);
+ handle_fault_error(fault);
+ goto out;
}

+ ret = handle_mm_fault(mm, vma, address, write);
+ if (ret & VM_FAULT_ERROR) {
+ /* failed to service fault */
+ up_read(&mm->mmap_sem);
+ handle_fault_error(fault);
+ goto out;
+ }
+
+ if (ret & VM_FAULT_MAJOR)
+ task->maj_flt++;
+ else
+ task->min_flt++;
+
+ up_read(&mm->mmap_sem);
+
+out:
finish_pri_tag(fault->dev_state, fault->state, fault->tag);

put_pasid_state(fault->state);
@@ -663,6 +693,7 @@ int amd_iommu_bind_pasid(struct pci_dev *pdev, int pasid,
spin_lock_init(&pasid_state->lock);

mm = get_task_mm(task);
+ pasid_state->task = task;
pasid_state->mm = mm;
pasid_state->device_state = dev_state;
pasid_state->pasid = pasid;
--
1.9.1

2014-10-27 15:13:38

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Hi Oded,

can you please test these patches with the KFD driver and make sure
nothing breaks for you? I really like this improvement and it would be
great to send it upstream for v3.19.

Thanks,

Joerg

On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> simply, rather than doing tricks with page refs and get_user_pages().
>
> Signed-off-by: Jesse Barnes <[email protected]>
> ---
> mm/memory.c | 1 +
> mm/mmap.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 1cc6bfb..969ff0c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>
> return ret;
> }
> +EXPORT_SYMBOL_GPL(handle_mm_fault);
>
> #ifndef __PAGETABLE_PUD_FOLDED
> /*
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 7f85520..2ee7971 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
> }
> #endif
>
> +EXPORT_SYMBOL_GPL(find_extend_vma);
> +
> /*
> * Ok - we have the memory areas we should free on the vma list,
> * so release them, and do the vma updates.
> --
> 1.9.1

2014-10-27 15:15:56

by Oded Gabbay

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Sure, no problem

Oded

On 10/27/2014 05:13 PM, Joerg Roedel wrote:

Hi Oded,

can you please test these patches with the KFD driver and make sure
nothing breaks for you? I really like this improvement and it would be
great to send it upstream for v3.19.

Thanks,

Joerg

On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:

> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> simply, rather than doing tricks with page refs and get_user_pages().
>
> Signed-off-by: Jesse Barnes <[email protected]>
> ---
> mm/memory.c | 1 +
> mm/mmap.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 1cc6bfb..969ff0c 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>
> return ret;
> }
> +EXPORT_SYMBOL_GPL(handle_mm_fault);
>
> #ifndef __PAGETABLE_PUD_FOLDED
> /*
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 7f85520..2ee7971 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
> }
> #endif
>
> +EXPORT_SYMBOL_GPL(find_extend_vma);
> +
> /*
> * Ok - we have the memory areas we should free on the vma list,
> * so release them, and do the vma updates.
> --
> 1.9.1

2014-10-27 15:35:52

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Thanks, I have no way of testing this, but I'm hopeful. :)

Jesse

On Mon, 27 Oct 2014 17:15:45 +0200
Oded Gabbay <[email protected]> wrote:

> Sure, no problem
>
> Oded
>
> On 10/27/2014 05:13 PM, Joerg Roedel wrote:
>
> Hi Oded,
>
> can you please test these patches with the KFD driver and make sure
> nothing breaks for you? I really like this improvement and it would be
> great to send it upstream for v3.19.
>
> Thanks,
>
> Joerg
>
> On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
>
> > This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> > simply, rather than doing tricks with page refs and get_user_pages().
> >
> > Signed-off-by: Jesse Barnes <[email protected]>
> > ---
> > mm/memory.c | 1 +
> > mm/mmap.c | 2 ++
> > 2 files changed, 3 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 1cc6bfb..969ff0c 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >
> > return ret;
> > }
> > +EXPORT_SYMBOL_GPL(handle_mm_fault);
> >
> > #ifndef __PAGETABLE_PUD_FOLDED
> > /*
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 7f85520..2ee7971 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
> > }
> > #endif
> >
> > +EXPORT_SYMBOL_GPL(find_extend_vma);
> > +
> > /*
> > * Ok - we have the memory areas we should free on the vma list,
> > * so release them, and do the vma updates.
> > --
> > 1.9.1
>
>


--
Jesse Barnes, Intel Open Source Technology Center

2014-10-27 15:37:31

by Oded Gabbay

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Buy Kaveri ;)

I'm sure Intel will be happy to contribute some $$$ to AMD ;)

Oded

On 10/27/2014 05:35 PM, Jesse Barnes wrote:

Thanks, I have no way of testing this, but I'm hopeful. :)

Jesse

On Mon, 27 Oct 2014 17:15:45 +0200
Oded Gabbay <[email protected]> wrote:

> Sure, no problem
>
> Oded
>
> On 10/27/2014 05:13 PM, Joerg Roedel wrote:
>
> Hi Oded,
>
> can you please test these patches with the KFD driver and make sure
> nothing breaks for you? I really like this improvement and it would be
> great to send it upstream for v3.19.
>
> Thanks,
>
> Joerg
>
> On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
>
>> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
>> simply, rather than doing tricks with page refs and get_user_pages().
>>
>> Signed-off-by: Jesse Barnes <[email protected]>
>> ---
>> mm/memory.c | 1 +
>> mm/mmap.c | 2 ++
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 1cc6bfb..969ff0c 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>>
>> return ret;
>> }
>> +EXPORT_SYMBOL_GPL(handle_mm_fault);
>>
>> #ifndef __PAGETABLE_PUD_FOLDED
>> /*
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 7f85520..2ee7971 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
>> }
>> #endif
>>
>> +EXPORT_SYMBOL_GPL(find_extend_vma);
>> +
>> /*
>> * Ok - we have the memory areas we should free on the vma list,
>> * so release them, and do the vma updates.
>> --
>> 1.9.1

2014-10-29 09:33:55

by Oded Gabbay

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Hi Joerg and Jesse,

I tested our amdkfd driver with your patches applied (kernel 3.17.1).
I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP

All tests passed and I didn't see any kernel error messages.

So:

Tested-by: Oded Gabbay <[email protected]>

Oded

On 10/27/2014 05:35 PM, Jesse Barnes wrote:

Thanks, I have no way of testing this, but I'm hopeful. :)

Jesse

On Mon, 27 Oct 2014 17:15:45 +0200
Oded Gabbay <[email protected]> wrote:

> Sure, no problem
>
> Oded
>
> On 10/27/2014 05:13 PM, Joerg Roedel wrote:
>
> Hi Oded,
>
> can you please test these patches with the KFD driver and make sure
> nothing breaks for you? I really like this improvement and it would be
> great to send it upstream for v3.19.
>
> Thanks,
>
> Joerg
>
> On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
>
>> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
>> simply, rather than doing tricks with page refs and get_user_pages().
>>
>> Signed-off-by: Jesse Barnes <[email protected]>
>> ---
>> mm/memory.c | 1 +
>> mm/mmap.c | 2 ++
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 1cc6bfb..969ff0c 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>>
>> return ret;
>> }
>> +EXPORT_SYMBOL_GPL(handle_mm_fault);
>>
>> #ifndef __PAGETABLE_PUD_FOLDED
>> /*
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index 7f85520..2ee7971 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
>> }
>> #endif
>>
>> +EXPORT_SYMBOL_GPL(find_extend_vma);
>> +
>> /*
>> * Ok - we have the memory areas we should free on the vma list,
>> * so release them, and do the vma updates.
>> --
>> 1.9.1

2014-10-29 14:38:03

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Cool, thanks a lot, Oded. I guess my compiler did a good job making
sure there were no bugs. :)

Jesse

On Wed, 29 Oct 2014 11:33:38 +0200
Oded Gabbay <[email protected]> wrote:

> Hi Joerg and Jesse,
>
> I tested our amdkfd driver with your patches applied (kernel 3.17.1).
> I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
>
> All tests passed and I didn't see any kernel error messages.
>
> So:
>
> Tested-by: Oded Gabbay <[email protected]>
>
> Oded
>
> On 10/27/2014 05:35 PM, Jesse Barnes wrote:
>
> Thanks, I have no way of testing this, but I'm hopeful. :)
>
> Jesse
>
> On Mon, 27 Oct 2014 17:15:45 +0200
> Oded Gabbay <[email protected]> wrote:
>
> > Sure, no problem
> >
> > Oded
> >
> > On 10/27/2014 05:13 PM, Joerg Roedel wrote:
> >
> > Hi Oded,
> >
> > can you please test these patches with the KFD driver and make sure
> > nothing breaks for you? I really like this improvement and it would be
> > great to send it upstream for v3.19.
> >
> > Thanks,
> >
> > Joerg
> >
> > On Fri, Oct 24, 2014 at 12:34:30PM -0700, Jesse Barnes wrote:
> >
> >> This lets drivers like the AMD IOMMUv2 driver handle faults a bit more
> >> simply, rather than doing tricks with page refs and get_user_pages().
> >>
> >> Signed-off-by: Jesse Barnes <[email protected]>
> >> ---
> >> mm/memory.c | 1 +
> >> mm/mmap.c | 2 ++
> >> 2 files changed, 3 insertions(+)
> >>
> >> diff --git a/mm/memory.c b/mm/memory.c
> >> index 1cc6bfb..969ff0c 100644
> >> --- a/mm/memory.c
> >> +++ b/mm/memory.c
> >> @@ -3378,6 +3378,7 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
> >>
> >> return ret;
> >> }
> >> +EXPORT_SYMBOL_GPL(handle_mm_fault);
> >>
> >> #ifndef __PAGETABLE_PUD_FOLDED
> >> /*
> >> diff --git a/mm/mmap.c b/mm/mmap.c
> >> index 7f85520..2ee7971 100644
> >> --- a/mm/mmap.c
> >> +++ b/mm/mmap.c
> >> @@ -2359,6 +2359,8 @@ find_extend_vma(struct mm_struct *mm, unsigned long addr)
> >> }
> >> #endif
> >>
> >> +EXPORT_SYMBOL_GPL(find_extend_vma);
> >> +
> >> /*
> >> * Ok - we have the memory areas we should free on the vma list,
> >> * so release them, and do the vma updates.
> >> --
> >> 1.9.1
>
>
>


--
Jesse Barnes, Intel Open Source Technology Center

2014-11-05 12:03:57

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

Hi Oded, Jesse,

On Wed, Oct 29, 2014 at 11:33:38AM +0200, Oded Gabbay wrote:
> I tested our amdkfd driver with your patches applied (kernel 3.17.1).
> I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
>
> All tests passed and I didn't see any kernel error messages.
>
> So:
>
> Tested-by: Oded Gabbay <[email protected]>

Thanks for testing Oded. Jesse, the patch looks good to me, except the
task accounting for the page-faults. I'd like to get rid of using
task_struct in the IOMMUv2 driver entirely if possible. Also it is not
really the CPU task causing the faults, but some non-CPU process.

So can you please remove that code and resend the patches with Oded's
Tested-by and Andrew Morton on Cc? I think these patches should go
through the -mm tree.

Thanks,

Joerg

2014-11-05 21:51:00

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

On Wed, 5 Nov 2014 13:03:51 +0100
Joerg Roedel <[email protected]> wrote:

> Hi Oded, Jesse,
>
> On Wed, Oct 29, 2014 at 11:33:38AM +0200, Oded Gabbay wrote:
> > I tested our amdkfd driver with your patches applied (kernel 3.17.1).
> > I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
> >
> > All tests passed and I didn't see any kernel error messages.
> >
> > So:
> >
> > Tested-by: Oded Gabbay <[email protected]>
>
> Thanks for testing Oded. Jesse, the patch looks good to me, except the
> task accounting for the page-faults. I'd like to get rid of using
> task_struct in the IOMMUv2 driver entirely if possible. Also it is not
> really the CPU task causing the faults, but some non-CPU process.

Hm, but the CPU task initiates the activity on the GPU, so we should
account for it somewhere, right? I guess I had been thinking of the
"task" as spanning the CPUs and GPUs and other devices in the system,
rather than just representing the CPU activity.

> So can you please remove that code and resend the patches with Oded's
> Tested-by and Andrew Morton on Cc? I think these patches should go
> through the -mm tree.

Sure, thanks.

--
Jesse Barnes, Intel Open Source Technology Center

2014-11-06 08:51:16

by Oded Gabbay

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use



On 11/05/2014 11:51 PM, Jesse Barnes wrote:
> On Wed, 5 Nov 2014 13:03:51 +0100
> Joerg Roedel <[email protected]> wrote:
>
>> Hi Oded, Jesse,
>>
>> On Wed, Oct 29, 2014 at 11:33:38AM +0200, Oded Gabbay wrote:
>>> I tested our amdkfd driver with your patches applied (kernel 3.17.1).
>>> I run OpenCL tests, Aparapi/Sumatra (Java) and OpenMP
>>>
>>> All tests passed and I didn't see any kernel error messages.
>>>
>>> So:
>>>
>>> Tested-by: Oded Gabbay <[email protected]>
>>
>> Thanks for testing Oded. Jesse, the patch looks good to me, except the
>> task accounting for the page-faults. I'd like to get rid of using
>> task_struct in the IOMMUv2 driver entirely if possible. Also it is not
>> really the CPU task causing the faults, but some non-CPU process.
>
> Hm, but the CPU task initiates the activity on the GPU, so we should
> account for it somewhere, right? I guess I had been thinking of the
> "task" as spanning the CPUs and GPUs and other devices in the system,
> rather than just representing the CPU activity.

Joerg, sorry for the dumb question but what do you mean by "task accounting for
page-faults"? Where is that code in IOMMUv2 driver now ?

>
>> So can you please remove that code and resend the patches with Oded's
>> Tested-by and Andrew Morton on Cc? I think these patches should go
>> through the -mm tree.
>
> Sure, thanks.
>

2014-11-06 13:01:27

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

On Wed, Nov 05, 2014 at 01:51:09PM -0800, Jesse Barnes wrote:
> On Wed, 5 Nov 2014 13:03:51 +0100
> Joerg Roedel <[email protected]> wrote:
> > Thanks for testing Oded. Jesse, the patch looks good to me, except the
> > task accounting for the page-faults. I'd like to get rid of using
> > task_struct in the IOMMUv2 driver entirely if possible. Also it is not
> > really the CPU task causing the faults, but some non-CPU process.
>
> Hm, but the CPU task initiates the activity on the GPU, so we should
> account for it somewhere, right? I guess I had been thinking of the
> "task" as spanning the CPUs and GPUs and other devices in the system,
> rather than just representing the CPU activity.

One problem is that the task that called amd_iommu_bind_pasid() isn't
necessarily the same task (thread) that queues the jobs to the device.
The thread that called amd_iommu_bind_pasid() might even exit while
other threads still use the mappings.

Besides that, from an abstract point of view, what is running on the
device (GPU) is a logically seperate 'thread' of the process which we
should account for seperatly. If we want accounting for these off-CPU
threads we should probably introduce some concept of a non-CPU task to
the kernel and do the accounting there?


Joerg

2014-11-06 13:03:13

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

On Thu, Nov 06, 2014 at 10:51:02AM +0200, Oded Gabbay wrote:
> Joerg, sorry for the dumb question but what do you mean by "task
> accounting for page-faults"? Where is that code in IOMMUv2 driver
> now?

Linux counts major and minor page-faults for each running task. For the
IOMMUv2 driver this was done in the get_user_pages function, at least
until I changed the task parameter to NULL. Currently there is no
accounting for that in the IOMMUv2 driver.


Joerg

2014-11-12 22:13:52

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

On Thu, 6 Nov 2014 14:01:22 +0100
Joerg Roedel <[email protected]> wrote:

> On Wed, Nov 05, 2014 at 01:51:09PM -0800, Jesse Barnes wrote:
> > On Wed, 5 Nov 2014 13:03:51 +0100
> > Joerg Roedel <[email protected]> wrote:
> > > Thanks for testing Oded. Jesse, the patch looks good to me, except the
> > > task accounting for the page-faults. I'd like to get rid of using
> > > task_struct in the IOMMUv2 driver entirely if possible. Also it is not
> > > really the CPU task causing the faults, but some non-CPU process.
> >
> > Hm, but the CPU task initiates the activity on the GPU, so we should
> > account for it somewhere, right? I guess I had been thinking of the
> > "task" as spanning the CPUs and GPUs and other devices in the system,
> > rather than just representing the CPU activity.
>
> One problem is that the task that called amd_iommu_bind_pasid() isn't
> necessarily the same task (thread) that queues the jobs to the device.
> The thread that called amd_iommu_bind_pasid() might even exit while
> other threads still use the mappings.
>
> Besides that, from an abstract point of view, what is running on the
> device (GPU) is a logically seperate 'thread' of the process which we
> should account for seperatly. If we want accounting for these off-CPU
> threads we should probably introduce some concept of a non-CPU task to
> the kernel and do the accounting there?

Yeah those are good points; I hadn't been thinking of multi-threaded
stuff. Logically the GPU stuff really is a separate thread in that
sense, so monitoring faults separately makes sense.

I wonder if we need a new "device_task_struct" or
"coprocessor_task_struct" or something to wrap some simple stuff on
non-CPU devices. They could be sub-classed by drivers that needed
additional driver specific data.

--
Jesse Barnes, Intel Open Source Technology Center

2014-11-17 16:53:54

by Jörg Rödel

[permalink] [raw]
Subject: Re: [PATCH 1/2] mm: export find_extend_vma and handle_mm_fault for driver use

On Wed, Nov 12, 2014 at 02:07:41PM -0800, Jesse Barnes wrote:
> I wonder if we need a new "device_task_struct" or
> "coprocessor_task_struct" or something to wrap some simple stuff on
> non-CPU devices. They could be sub-classed by drivers that needed
> additional driver specific data.

Yes, I think something like a device_task_struct may be needed at some
point. It could be used in some generic code for task scheduling and
management on the devices too. The KFD driver already implements a
scheduler and stuff, maybe some of this code can be generalized to be
useable by other drivers and a common userspace interface (at least to
some degree).


Joerg