2019-06-17 04:42:41

by Alastair D'Silva

[permalink] [raw]
Subject: [PATCH] ocxl: Allow contexts to be attached with a NULL mm

From: Alastair D'Silva <[email protected]>

If an OpenCAPI context is to be used directly by a kernel driver, there
may not be a suitable mm to use.

The patch makes the mm parameter to ocxl_context_attach optional.

Signed-off-by: Alastair D'Silva <[email protected]>
---
drivers/misc/ocxl/context.c | 9 ++++++---
drivers/misc/ocxl/link.c | 12 ++++++++----
2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
index bab9c9364184..994563a078eb 100644
--- a/drivers/misc/ocxl/context.c
+++ b/drivers/misc/ocxl/context.c
@@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
{
int rc;
+ unsigned long pidr = 0;

// Locks both status & tidr
mutex_lock(&ctx->status_mutex);
@@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
goto out;
}

- rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
- mm->context.id, ctx->tidr, amr, mm,
- xsl_fault_error, ctx);
+ if (mm)
+ pidr = mm->context.id;
+
+ rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
+ amr, mm, xsl_fault_error, ctx);
if (rc)
goto out;

diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index cce5b0d64505..43542f124807 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
pe->amr = cpu_to_be64(amr);
pe->software_state = cpu_to_be32(SPA_PE_VALID);

- mm_context_add_copro(mm);
+ if (mm)
+ mm_context_add_copro(mm);
/*
* Barrier is to make sure PE is visible in the SPA before it
* is used by the device. It also helps with the global TLBI
@@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
* have a reference on mm_users. Incrementing mm_count solves
* the problem.
*/
- mmgrab(mm);
+ if (mm)
+ mmgrab(mm);
trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
unlock:
mutex_unlock(&spa->spa_lock);
@@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
if (!pe_data) {
WARN(1, "Couldn't find pe data when removing PE\n");
} else {
- mm_context_remove_copro(pe_data->mm);
- mmdrop(pe_data->mm);
+ if (pe_data->mm) {
+ mm_context_remove_copro(pe_data->mm);
+ mmdrop(pe_data->mm);
+ }
kfree_rcu(pe_data, rcu);
}
unlock:
--
2.21.0


2019-06-18 01:51:58

by Andrew Donnellan

[permalink] [raw]
Subject: Re: [PATCH] ocxl: Allow contexts to be attached with a NULL mm

On 17/6/19 2:41 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva <[email protected]>
>
> If an OpenCAPI context is to be used directly by a kernel driver, there
> may not be a suitable mm to use.
>
> The patch makes the mm parameter to ocxl_context_attach optional.
>
> Signed-off-by: Alastair D'Silva <[email protected]>

The one issue I can see here is that using mm == NULL bypasses our
method of enabling/disabling global TLBIs in mm_context_add_copro().

Discussing this privately with Alastair and Fred - this should be fine,
but perhaps we should document that.

> ---
> drivers/misc/ocxl/context.c | 9 ++++++---
> drivers/misc/ocxl/link.c | 12 ++++++++----
> 2 files changed, 14 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
> index bab9c9364184..994563a078eb 100644
> --- a/drivers/misc/ocxl/context.c
> +++ b/drivers/misc/ocxl/context.c
> @@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr, u64 dsisr)
> int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
> {
> int rc;
> + unsigned long pidr = 0;
>
> // Locks both status & tidr
> mutex_lock(&ctx->status_mutex);
> @@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct mm_struct *mm)
> goto out;
> }
>
> - rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
> - mm->context.id, ctx->tidr, amr, mm,
> - xsl_fault_error, ctx);
> + if (mm)
> + pidr = mm->context.id;
> +
> + rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr, ctx->tidr,
> + amr, mm, xsl_fault_error, ctx);
> if (rc)
> goto out;
>
> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
> index cce5b0d64505..43542f124807 100644
> --- a/drivers/misc/ocxl/link.c
> +++ b/drivers/misc/ocxl/link.c
> @@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> pe->amr = cpu_to_be64(amr);
> pe->software_state = cpu_to_be32(SPA_PE_VALID);
>
> - mm_context_add_copro(mm);
> + if (mm)
> + mm_context_add_copro(mm);
> /*
> * Barrier is to make sure PE is visible in the SPA before it
> * is used by the device. It also helps with the global TLBI
> @@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 pidr, u32 tidr,
> * have a reference on mm_users. Incrementing mm_count solves
> * the problem.
> */
> - mmgrab(mm);
> + if (mm)
> + mmgrab(mm);
> trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr, tidr);
> unlock:
> mutex_unlock(&spa->spa_lock);
> @@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int pasid)
> if (!pe_data) {
> WARN(1, "Couldn't find pe data when removing PE\n");
> } else {
> - mm_context_remove_copro(pe_data->mm);
> - mmdrop(pe_data->mm);
> + if (pe_data->mm) {
> + mm_context_remove_copro(pe_data->mm);
> + mmdrop(pe_data->mm);
> + }
> kfree_rcu(pe_data, rcu);
> }
> unlock:
>

--
Andrew Donnellan OzLabs, ADL Canberra
[email protected] IBM Australia Limited

2019-06-19 10:36:49

by Frederic Barrat

[permalink] [raw]
Subject: Re: [PATCH] ocxl: Allow contexts to be attached with a NULL mm



Le 18/06/2019 à 03:50, Andrew Donnellan a écrit :
> On 17/6/19 2:41 pm, Alastair D'Silva wrote:
>> From: Alastair D'Silva <[email protected]>
>>
>> If an OpenCAPI context is to be used directly by a kernel driver, there
>> may not be a suitable mm to use.
>>
>> The patch makes the mm parameter to ocxl_context_attach optional.
>>
>> Signed-off-by: Alastair D'Silva <[email protected]>
>
> The one issue I can see here is that using mm == NULL bypasses our
> method of enabling/disabling global TLBIs in mm_context_add_copro().
>
> Discussing this privately with Alastair and Fred - this should be fine,
> but perhaps we should document that.


So indeed we should be fine. I confirmed with Nick that kernel space
invalidations are already global today.
Nick mentioned that we should still be fine tomorrow, but in the distant
future, we could imagine local usage of some part of the kernel space.
It will require some work, but it would be best to add a comment in one
of the kernel invalidation function (for example
radix__flush_tlb_kernel_range()) that if a kernel invalidation ever
becomes local, then clients of the nest MMU may need some work.

A few more comments below.


>> ---
>>   drivers/misc/ocxl/context.c |  9 ++++++---
>>   drivers/misc/ocxl/link.c    | 12 ++++++++----
>>   2 files changed, 14 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c
>> index bab9c9364184..994563a078eb 100644
>> --- a/drivers/misc/ocxl/context.c
>> +++ b/drivers/misc/ocxl/context.c
>> @@ -69,6 +69,7 @@ static void xsl_fault_error(void *data, u64 addr,
>> u64 dsisr)
>>   int ocxl_context_attach(struct ocxl_context *ctx, u64 amr, struct
>> mm_struct *mm)
>>   {
>>       int rc;
>> +    unsigned long pidr = 0;
>>       // Locks both status & tidr
>>       mutex_lock(&ctx->status_mutex);
>> @@ -77,9 +78,11 @@ int ocxl_context_attach(struct ocxl_context *ctx,
>> u64 amr, struct mm_struct *mm)
>>           goto out;
>>       }
>> -    rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid,
>> -            mm->context.id, ctx->tidr, amr, mm,
>> -            xsl_fault_error, ctx);
>> +    if (mm)
>> +        pidr = mm->context.id;
>> +
>> +    rc = ocxl_link_add_pe(ctx->afu->fn->link, ctx->pasid, pidr,
>> ctx->tidr,
>> +                  amr, mm, xsl_fault_error, ctx);
>>       if (rc)
>>           goto out;
>> diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
>> index cce5b0d64505..43542f124807 100644
>> --- a/drivers/misc/ocxl/link.c
>> +++ b/drivers/misc/ocxl/link.c
>> @@ -523,7 +523,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid,
>> u32 pidr, u32 tidr,
>>       pe->amr = cpu_to_be64(amr);
>>       pe->software_state = cpu_to_be32(SPA_PE_VALID);
>> -    mm_context_add_copro(mm);
>> +    if (mm)
>> +        mm_context_add_copro(mm);


Same as above, we should add a comment here in the driver code that a
kernel context is ok because invalidations are global.


We also need a new check in xsl_fault_handler(). A valid kernel address
shouldn't fault, but it's still possible for the FPGA to try accessing a
bogus kernel address. In which case, xsl_fault_handler() would be
entered, with a valid fault context. We'll find pe_data in the tree
based on the valid pe_handle, but pe_data->mm will be NULL. In that, we
can return early, acknowledging the interrupt with ADDRESS_ERROR value
(like we do if pe_data is not found in the tree).

Fred


>>       /*
>>        * Barrier is to make sure PE is visible in the SPA before it
>>        * is used by the device. It also helps with the global TLBI
>> @@ -546,7 +547,8 @@ int ocxl_link_add_pe(void *link_handle, int pasid,
>> u32 pidr, u32 tidr,
>>        * have a reference on mm_users. Incrementing mm_count solves
>>        * the problem.
>>        */
>> -    mmgrab(mm);
>> +    if (mm)
>> +        mmgrab(mm);
>>       trace_ocxl_context_add(current->pid, spa->spa_mem, pasid, pidr,
>> tidr);
>>   unlock:
>>       mutex_unlock(&spa->spa_lock);
>> @@ -652,8 +654,10 @@ int ocxl_link_remove_pe(void *link_handle, int
>> pasid)
>>       if (!pe_data) {
>>           WARN(1, "Couldn't find pe data when removing PE\n");
>>       } else {
>> -        mm_context_remove_copro(pe_data->mm);
>> -        mmdrop(pe_data->mm);
>> +        if (pe_data->mm) {
>> +            mm_context_remove_copro(pe_data->mm);
>> +            mmdrop(pe_data->mm);
>> +        }
>>           kfree_rcu(pe_data, rcu);
>>       }
>>   unlock:
>>
>