2014-12-08 08:18:46

by Ian Munsie

Subject: [PATCH 1/7] CXL: Change contexts_lock to a mutex to fix sleep while atomic bug

From: Ian Munsie <[email protected]>

We had a known sleep-while-atomic bug if a CXL device was forcefully
unbound while it was in use. This could occur as a result of EEH, or be
induced manually with something like:

echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind

The issue was that in this code path we iterated over each context and
forcefully detached it with the contexts_lock spin lock held; however,
the detach also needed to take the spa_mutex and call schedule().

This patch changes the contexts_lock to a mutex so that we are not in
atomic context while doing the detach, thereby avoiding the
sleep-while-atomic bug.

Also delete the related TODO comment, which suggested an alternative
solution that turned out not to be workable.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/context.c | 15 ++++++++-------
drivers/misc/cxl/cxl.h | 2 +-
drivers/misc/cxl/native.c | 7 -------
drivers/misc/cxl/pci.c | 2 +-
drivers/misc/cxl/sysfs.c | 10 +++++-----
5 files changed, 15 insertions(+), 21 deletions(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index cca4721..4aa31a3 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -82,12 +82,12 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master)
* Allocating IDR! We better make sure everything's setup that
* dereferences from it.
*/
+ mutex_lock(&afu->contexts_lock);
idr_preload(GFP_KERNEL);
- spin_lock(&afu->contexts_lock);
i = idr_alloc(&ctx->afu->contexts_idr, ctx, 0,
ctx->afu->num_procs, GFP_NOWAIT);
- spin_unlock(&afu->contexts_lock);
idr_preload_end();
+ mutex_unlock(&afu->contexts_lock);
if (i < 0)
return i;

@@ -168,21 +168,22 @@ void cxl_context_detach_all(struct cxl_afu *afu)
struct cxl_context *ctx;
int tmp;

- rcu_read_lock();
- idr_for_each_entry(&afu->contexts_idr, ctx, tmp)
+ mutex_lock(&afu->contexts_lock);
+ idr_for_each_entry(&afu->contexts_idr, ctx, tmp) {
/*
* Anything done in here needs to be setup before the IDR is
* created and torn down after the IDR removed
*/
__detach_context(ctx);
- rcu_read_unlock();
+ }
+ mutex_unlock(&afu->contexts_lock);
}

void cxl_context_free(struct cxl_context *ctx)
{
- spin_lock(&ctx->afu->contexts_lock);
+ mutex_lock(&ctx->afu->contexts_lock);
idr_remove(&ctx->afu->contexts_idr, ctx->pe);
- spin_unlock(&ctx->afu->contexts_lock);
+ mutex_unlock(&ctx->afu->contexts_lock);
synchronize_rcu();

free_page((u64)ctx->sstp);
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index b5b6bda..7c05239 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -351,7 +351,7 @@ struct cxl_afu {
struct device *chardev_s, *chardev_m, *chardev_d;
struct idr contexts_idr;
struct dentry *debugfs;
- spinlock_t contexts_lock;
+ struct mutex contexts_lock;
struct mutex spa_mutex;
spinlock_t afu_cntl_lock;

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 9a5a442..1001cf4 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -610,13 +610,6 @@ static inline int detach_process_native_dedicated(struct cxl_context *ctx)
return 0;
}

-/*
- * TODO: handle case when this is called inside a rcu_read_lock() which may
- * happen when we unbind the driver (ie. cxl_context_detach_all()) . Terminate
- * & remove use a mutex lock and schedule which will not good with lock held.
- * May need to write do_process_element_cmd() that handles outstanding page
- * faults synchronously.
- */
static inline int detach_process_native_afu_directed(struct cxl_context *ctx)
{
if (!ctx->pe_inserted)
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 10c98ab..0f2cc9f 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -502,7 +502,7 @@ static struct cxl_afu *cxl_alloc_afu(struct cxl *adapter, int slice)
afu->dev.release = cxl_release_afu;
afu->slice = slice;
idr_init(&afu->contexts_idr);
- spin_lock_init(&afu->contexts_lock);
+ mutex_init(&afu->contexts_lock);
spin_lock_init(&afu->afu_cntl_lock);
mutex_init(&afu->spa_mutex);

diff --git a/drivers/misc/cxl/sysfs.c b/drivers/misc/cxl/sysfs.c
index ce7ec06..461bdbd 100644
--- a/drivers/misc/cxl/sysfs.c
+++ b/drivers/misc/cxl/sysfs.c
@@ -121,7 +121,7 @@ static ssize_t reset_store_afu(struct device *device,
int rc;

/* Not safe to reset if it is currently in use */
- spin_lock(&afu->contexts_lock);
+ mutex_lock(&afu->contexts_lock);
if (!idr_is_empty(&afu->contexts_idr)) {
rc = -EBUSY;
goto err;
@@ -132,7 +132,7 @@ static ssize_t reset_store_afu(struct device *device,

rc = count;
err:
- spin_unlock(&afu->contexts_lock);
+ mutex_unlock(&afu->contexts_lock);
return rc;
}

@@ -247,7 +247,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr,
int rc = -EBUSY;

/* can't change this if we have a user */
- spin_lock(&afu->contexts_lock);
+ mutex_lock(&afu->contexts_lock);
if (!idr_is_empty(&afu->contexts_idr))
goto err;

@@ -271,7 +271,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr,
afu->current_mode = 0;
afu->num_procs = 0;

- spin_unlock(&afu->contexts_lock);
+ mutex_unlock(&afu->contexts_lock);

if ((rc = _cxl_afu_deactivate_mode(afu, old_mode)))
return rc;
@@ -280,7 +280,7 @@ static ssize_t mode_store(struct device *device, struct device_attribute *attr,

return count;
err:
- spin_unlock(&afu->contexts_lock);
+ mutex_unlock(&afu->contexts_lock);
return rc;
}

--
2.1.3
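
For reference, a minimal sketch of the bug class being fixed here
(illustrative, not the exact driver code): sections entered via
spin_lock() or rcu_read_lock() are atomic, so nothing inside them may
sleep, whereas a mutex holder is free to block.

    /* Broken: the critical section is atomic, but __detach_context()
     * takes a mutex and may call schedule() -- a sleep-while-atomic
     * BUG. */
    rcu_read_lock();
    idr_for_each_entry(&afu->contexts_idr, ctx, tmp)
        __detach_context(ctx);          /* may sleep: invalid here */
    rcu_read_unlock();

    /* Fixed: mutex holders may sleep, so the detach can block safely. */
    mutex_lock(&afu->contexts_lock);
    idr_for_each_entry(&afu->contexts_idr, ctx, tmp)
        __detach_context(ctx);
    mutex_unlock(&afu->contexts_lock);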


2014-12-08 08:18:47

by Ian Munsie

Subject: [PATCH 2/7] CXL: Add timeout to process element commands

From: Ian Munsie <[email protected]>

In the event that something goes wrong in the hardware and it is unable
to complete a process element command, we would end up polling forever,
effectively making the associated process unkillable.

This patch adds a timeout to the process element command code path, so
that we will give up if the hardware does not respond in a reasonable
time.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/native.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index 1001cf4..f2b37b4 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -277,6 +277,7 @@ static int do_process_element_cmd(struct cxl_context *ctx,
u64 cmd, u64 pe_state)
{
u64 state;
+ unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);

WARN_ON(!ctx->afu->enabled);

@@ -286,6 +287,10 @@ static int do_process_element_cmd(struct cxl_context *ctx,
smp_mb();
cxl_p1n_write(ctx->afu, CXL_PSL_LLCMD_An, cmd | ctx->pe);
while (1) {
+ if (time_after_eq(jiffies, timeout)) {
+ dev_warn(&ctx->afu->dev, "WARNING: Process Element Command timed out!\n");
+ return -EBUSY;
+ }
state = be64_to_cpup(ctx->afu->sw_command_status);
if (state == ~0ULL) {
pr_err("cxl: Error adding process element to AFU\n");
--
2.1.3
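
The timeout uses the standard jiffies idiom. A self-contained sketch,
assuming CXL_TIMEOUT is the command timeout in seconds and poll_done()
is a hypothetical stand-in for reading sw_command_status:

    unsigned long timeout = jiffies + (HZ * CXL_TIMEOUT);

    while (!poll_done()) {
        /* time_after_eq() compares jiffies correctly even across
         * counter wraparound, unlike a plain >= comparison. */
        if (time_after_eq(jiffies, timeout))
            return -EBUSY;      /* hardware never responded */
        cpu_relax();
    }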

2014-12-08 08:18:52

by Ian Munsie

Subject: [PATCH 6/7] CXL: Disable SPAP register when freeing SPA

From: Ian Munsie <[email protected]>

When we deactivate the AFU directed mode we free the scheduled process
area, but did not clear the register in the hardware that has a pointer
to it.

This should be fine, since we will have already cleared out every
context and we won't do anything that would cause the hardware to
access it until after we have allocated a new one; but just to be safe,
this patch clears the register when we free the page.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/native.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index f2b37b4..0f24fa5 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -185,6 +185,7 @@ static int alloc_spa(struct cxl_afu *afu)

static void release_spa(struct cxl_afu *afu)
{
+ cxl_p1n_write(afu, CXL_PSL_SPAP_An, 0);
free_pages((unsigned long) afu->spa, afu->spa_order);
}

--
2.1.3
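
The ordering is the point here: quiesce the hardware pointer first,
then free the memory it referenced. The resulting function, with the
rationale spelled out (a sketch matching the hunk above):

    static void release_spa(struct cxl_afu *afu)
    {
        /* Clear SPAP so the PSL no longer points at the scheduled
         * process area; only then free the backing pages, so a late
         * hardware fetch can never touch freed memory. */
        cxl_p1n_write(afu, CXL_PSL_SPAP_An, 0);
        free_pages((unsigned long) afu->spa, afu->spa_order);
    }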

2014-12-08 08:19:30

by Ian Munsie

Subject: [PATCH 4/7] CXL: Early return from cxl_handle_fault for a shut down context

From: Ian Munsie <[email protected]>

If a context is being detached and we get a translation fault for it,
there is little point getting its mm and handling the fault, so just
respond with an address error and return early.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/fault.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
index c99e896..3e8c06a 100644
--- a/drivers/misc/cxl/fault.c
+++ b/drivers/misc/cxl/fault.c
@@ -176,6 +176,12 @@ void cxl_handle_fault(struct work_struct *fault_work)
return;
}

+ /* Early return if the context is being / has been detached */
+ if (ctx->status == CLOSED) {
+ cxl_ack_ae(ctx);
+ return;
+ }
+
pr_devel("CXL BOTTOM HALF handling fault for afu pe: %i. "
"DSISR: %#llx DAR: %#llx\n", ctx->pe, dsisr, dar);

--
2.1.3

2014-12-08 08:19:42

by Ian Munsie

Subject: [PATCH 3/7] CXL: Fix leaking interrupts if attach process fails

From: Ian Munsie <[email protected]>

In this particular error path we have already allocated the AFU
interrupts, but have not yet set the status to STARTED. The detach
context code will only attempt to release the interrupts if the context
is in state STARTED, so in this case the interrupts would remain
allocated.

This patch releases the AFU interrupts immediately if the attach call
fails, preventing the leak.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 378b099..2e067a5 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -181,8 +181,10 @@ static long afu_ioctl_start_work(struct cxl_context *ctx,
ctx->pid = get_pid(get_task_pid(current, PIDTYPE_PID));

if ((rc = cxl_attach_process(ctx, false, work.work_element_descriptor,
- amr)))
+ amr))) {
+ afu_release_irqs(ctx);
goto out;
+ }

ctx->status = STARTED;
rc = 0;
--
2.1.3
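
The invariant at work, sketched with context around the hunk above (the
IRQ allocation call before this point is assumed, not shown in the
diff): anything acquired before ctx->status becomes STARTED must be
unwound explicitly on failure, because the detach path only releases
resources for contexts in state STARTED.

    if ((rc = cxl_attach_process(ctx, false,
                                 work.work_element_descriptor, amr))) {
        afu_release_irqs(ctx);  /* detach won't: status != STARTED yet */
        goto out;
    }

    ctx->status = STARTED;      /* from here, detach owns IRQ cleanup */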

2014-12-08 08:19:50

by Ian Munsie

Subject: [PATCH 7/7] CXL: Unmap MMIO regions when detaching a context

From: Ian Munsie <[email protected]>

If we need to force detach a context (e.g. due to EEH or simply force
unbinding the driver) we should prevent the userspace contexts from
being able to access the Problem State Area MMIO region further, which
they may have mapped with mmap().

This patch unmaps any mapped MMIO regions when detaching a userspace
context.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/context.c | 11 ++++++++++-
drivers/misc/cxl/cxl.h | 7 ++++++-
drivers/misc/cxl/file.c | 6 +++++-
3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c
index 4aa31a3..51fd6b5 100644
--- a/drivers/misc/cxl/context.c
+++ b/drivers/misc/cxl/context.c
@@ -34,7 +34,8 @@ struct cxl_context *cxl_context_alloc(void)
/*
* Initialises a CXL context.
*/
-int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master)
+int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
+ struct address_space *mapping)
{
int i;

@@ -42,6 +43,8 @@ int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master)
ctx->afu = afu;
ctx->master = master;
ctx->pid = NULL; /* Set in start work ioctl */
+ mutex_init(&ctx->mapping_lock);
+ ctx->mapping = mapping;

/*
* Allocate the segment table before we put it in the IDR so that we
@@ -147,6 +150,12 @@ static void __detach_context(struct cxl_context *ctx)
afu_release_irqs(ctx);
flush_work(&ctx->fault_work); /* Only needed for dedicated process */
wake_up_all(&ctx->wq);
+
+ /* Release Problem State Area mapping */
+ mutex_lock(&ctx->mapping_lock);
+ if (ctx->mapping)
+ unmap_mapping_range(ctx->mapping, 0, 0, 1);
+ mutex_unlock(&ctx->mapping_lock);
}

/*
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index c1f8aa6..0df0438 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -405,6 +405,10 @@ struct cxl_context {
phys_addr_t psn_phys;
u64 psn_size;

+ /* Used to unmap any mmaps when force detaching */
+ struct address_space *mapping;
+ struct mutex mapping_lock;
+
spinlock_t sste_lock; /* Protects segment table entries */
struct cxl_sste *sstp;
u64 sstp0, sstp1;
@@ -606,7 +610,8 @@ int cxl_alloc_sst(struct cxl_context *ctx);
void init_cxl_native(void);

struct cxl_context *cxl_context_alloc(void);
-int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master);
+int cxl_context_init(struct cxl_context *ctx, struct cxl_afu *afu, bool master,
+ struct address_space *mapping);
void cxl_context_free(struct cxl_context *ctx);
int cxl_context_iomap(struct cxl_context *ctx, struct vm_area_struct *vma);

diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 2e067a5..b09be44 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -77,7 +77,7 @@ static int __afu_open(struct inode *inode, struct file *file, bool master)
goto err_put_afu;
}

- if ((rc = cxl_context_init(ctx, afu, master)))
+ if ((rc = cxl_context_init(ctx, afu, master, inode->i_mapping)))
goto err_put_afu;

pr_devel("afu_open pe: %i\n", ctx->pe);
@@ -113,6 +113,10 @@ static int afu_release(struct inode *inode, struct file *file)
__func__, ctx->pe);
cxl_context_detach(ctx);

+ mutex_lock(&ctx->mapping_lock);
+ ctx->mapping = NULL;
+ mutex_unlock(&ctx->mapping_lock);
+
put_device(&ctx->afu->dev);

/*
--
2.1.3
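
A note on the core call: unmap_mapping_range() with holelen == 0 means
"to the end of the file", and even_cows == 1 also drops private COW
copies, so this tears down every userspace mapping backed by the
address_space, not just those of the context being detached. Since all
contexts opened through the same device node share the inode's
i_mapping, this is presumably the over-unmapping issue raised in the
follow-up below.

    mutex_lock(&ctx->mapping_lock);
    if (ctx->mapping)
        /* offset 0, holelen 0 (=> to EOF), even_cows = 1 */
        unmap_mapping_range(ctx->mapping, 0, 0, 1);
    mutex_unlock(&ctx->mapping_lock);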

2014-12-08 08:20:25

by Ian Munsie

Subject: [PATCH 5/7] CXL: Disable AFU debug flag

From: Ian Munsie <[email protected]>

Upon inspection of the implementation-specific registers, it was
discovered that the high bit of the implementation-specific RXCTL
register was set, enabling the DEADB00F debug feature.

The debug feature causes MMIO reads to a disabled AFU to respond with
0xDEADB00F instead of all Fs. In general this should not be visible, as
the kernel only allows MMIO access to enabled AFUs, but there may be
circumstances where an AFU becomes disabled while it is in use. One
such case would be an AFU designed to be used only in dedicated process
mode and to disable itself after it has completed its work (though even
then the effect of this debug flag would be limited, as the userspace
application must have completed any required MMIO accesses before the
AFU disables itself, with or without the flag).

This patch removes the debug flag and replaces the magic value
programmed into this register with a preprocessor define so it is
clearer what the rest of this initialisation does.

Signed-off-by: Ian Munsie <[email protected]>
---
drivers/misc/cxl/cxl.h | 7 +++++++
drivers/misc/cxl/pci.c | 2 +-
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 7c05239..c1f8aa6 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -287,6 +287,13 @@ static const cxl_p2n_reg_t CXL_PSL_WED_An = {0x0A0};
#define CXL_PE_SOFTWARE_STATE_S (1ul << (31 - 30)) /* Suspend */
#define CXL_PE_SOFTWARE_STATE_T (1ul << (31 - 31)) /* Terminate */

+/****** CXL_PSL_RXCTL_An (Implementation Specific) **************************
+ * Controls AFU Hang Pulse, which sets the timeout for the AFU to respond to
+ * the PSL for any response (except MMIO). Timeouts will occur between 1x to 2x
+ * of the hang pulse frequency.
+ */
+#define CXL_PSL_RXCTL_AFUHP_4S 0x7000000000000000ULL
+
/* SPA->sw_command_status */
#define CXL_SPA_SW_CMD_MASK 0xffff000000000000ULL
#define CXL_SPA_SW_CMD_TERMINATE 0x0001000000000000ULL
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index 0f2cc9f..2ccd0a9 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -348,7 +348,7 @@ static int init_implementation_afu_regs(struct cxl_afu *afu)
cxl_p1n_write(afu, CXL_PSL_COALLOC_A, 0xFF000000FEFEFEFEULL);
/* for debugging with trace arrays */
cxl_p1n_write(afu, CXL_PSL_SLICE_TRACE, 0x0000FFFF00000000ULL);
- cxl_p1n_write(afu, CXL_PSL_RXCTL_A, 0xF000000000000000ULL);
+ cxl_p1n_write(afu, CXL_PSL_RXCTL_A, CXL_PSL_RXCTL_AFUHP_4S);

return 0;
}
--
2.1.3
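
To make the bit change concrete (a sketch; CXL_PSL_RXCTL_DEADB00F is a
hypothetical name for the debug bit, which the driver does not define):

    #define CXL_PSL_RXCTL_DEADB00F 0x8000000000000000ULL /* hypothetical */
    #define CXL_PSL_RXCTL_AFUHP_4S 0x7000000000000000ULL

    /* Old value: debug bit ORed into the hang-pulse field:
     *   0xF000000000000000ULL == CXL_PSL_RXCTL_DEADB00F
     *                          | CXL_PSL_RXCTL_AFUHP_4S
     * New value: hang-pulse field only. */
    cxl_p1n_write(afu, CXL_PSL_RXCTL_A, CXL_PSL_RXCTL_AFUHP_4S);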

2014-12-15 03:32:30

by Ian Munsie

Subject: Re: [PATCH 7/7] CXL: Unmap MMIO regions when detaching a context

Excerpts from Ian Munsie's message of 2014-12-08 19:18:01 +1100:
> From: Ian Munsie <[email protected]>
>
> If we need to force detach a context (e.g. due to EEH or simply force
> unbinding the driver) we should prevent the userspace contexts from
> being able to access the Problem State Area MMIO region further, which
> they may have mapped with mmap().
>
> This patch unmaps any mapped MMIO regions when detaching a userspace
> context.

Might want to drop this one for now - Philippe found that it sometimes
causes an issue when multiple contexts are using the afu. Seems that
unmap_mapping_range() unmapped a bit more than I expected.

Cheers,
-Ian

2014-12-15 23:32:09

by Michael Ellerman

Subject: Re: [PATCH 7/7] CXL: Unmap MMIO regions when detaching a context

On Mon, 2014-12-15 at 14:32 +1100, Ian Munsie wrote:
> Excerpts from Ian Munsie's message of 2014-12-08 19:18:01 +1100:
> > From: Ian Munsie <[email protected]>
> >
> > If we need to force detach a context (e.g. due to EEH or simply force
> > unbinding the driver) we should prevent the userspace contexts from
> > being able to access the Problem State Area MMIO region further, which
> > they may have mapped with mmap().
> >
> > This patch unmaps any mapped MMIO regions when detaching a userspace
> > context.
>
> Might want to drop this one for now - Philippe found that it sometimes
> causes an issue when multiple contexts are using the afu. Seems that
> unmap_mapping_range() unmapped a bit more than I expected.

Sorry, it's already in next.

https://git.kernel.org/cgit/linux/kernel/git/mpe/linux.git/commit/?h=next&id=b123429e6a9e8d03aacf888d23262835f0081448

Send me a revert or a fixup.

cheers