2017-03-09 16:45:00

by Andrew Banman

Subject: [PATCHv3 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates

The following patch series adds the functionality needed to make the BAU
on UV4 operational, chiefly by implementing the correct message completion
logic. Also included is a bug fix that adds a field to the INTD payload,
which is needed to verify the source of each message.

As of this patch set, the BAU operates without errors and performance tests
show TLB shootdowns take up to 42% less time with the BAU enabled.

The series was built and tested on the latest -tip kernel. This version
differs from v2 only in corrected typos and in refreshed patch context to
resolve build conflicts.

The patches are summarized as follows:

(1) Populate a message payload field to verify messages at the destination.
Without this verification, the destination agent triggers a HUB error,
resulting in an NMI. (A small illustration of the new field is sketched
below, after the patch list.)

[PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants
[PATCH 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier

This bug fix is included at the start of the series to avoid conflicts
in a code path shared by the rest of the series.
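
For orientation while reading point (1): the new field is a 24-bit
"descriptor qualifier" that must hold 0x534749 (see patch 2 below). As a
minimal standalone illustration, and purely as an observation of mine
rather than anything stated in the patches, those three bytes decode to
ASCII "SGI":

#include <stdio.h>

int main(void)
{
	unsigned int qualifier = 0x534749;	/* value required by the UV4 hub */

	/* Print the three bytes as characters: 0x53 'S', 0x47 'G', 0x49 'I'. */
	printf("0x%06x -> %c%c%c\n", qualifier,
	       (qualifier >> 16) & 0xff, (qualifier >> 8) & 0xff,
	       qualifier & 0xff);
	return 0;
}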

(2) Make the wait_completion routine part of the bau_operations interface,
and add a uv4_wait_completion routine to employ new completion logic. (A
standalone sketch of the resulting dispatch follows the patch list below.)

The message completion logic for previous generations relies on software-
defined timeouts that are not implemented on UV4. Without these patches,
the BAU driver on UV4 erroneously identifies a UV2-WAR timeout during
normal operation.

[PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration
[PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control
[PATCH 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations
[PATCH 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status
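
As a reading aid for point (2) only, here is a minimal standalone model of
the resulting dispatch, with stand-in types and trivial bodies in place of
the real kernel structures and routines (the actual changes are in patches
3 through 6 below):

#include <stddef.h>
#include <stdio.h>

struct bau_desc;				/* stand-in forward declaration */
struct bau_control { int uvhub_version; };	/* stand-in */

struct bau_operations {
	int (*wait_completion)(struct bau_desc *, struct bau_control *, long try);
};

static int uv1_wait_completion(struct bau_desc *d, struct bau_control *b, long try)
{
	return 0;	/* real routine uses the software-defined timeout */
}

static int uv4_wait_completion(struct bau_desc *d, struct bau_control *b, long try)
{
	return 0;	/* real routine spins on the hardware status only */
}

static struct bau_operations ops;	/* chosen once at init, read-only afterwards */

int main(void)
{
	int is_uv4 = 1;	/* stand-in for is_uv4_hub() */

	/* Init time: pick the hub-specific routine once ... */
	ops.wait_completion = is_uv4 ? uv4_wait_completion : uv1_wait_completion;

	/* ... hot path: indirect call, no per-message switch on the hub version. */
	printf("completion status: %d\n", ops.wait_completion(NULL, NULL, 1));
	return 0;
}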


Please see the commit messages for details on the motivation and content of
each patch.

Thank you,

Andrew Banman
HPE, Linux Kernel Engineer
<[email protected]>



2017-03-09 16:42:27

by Andrew Banman

Subject: [PATCH 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status

UV4 does not employ a software timeout as previous generations do, so a new
wait_completion routine without this logic is required. Certain completion
statuses require the AUX status bit in addition to ERROR and BUSY.

Add the read_status routine to construct the full completion status. Use
read_status in the uv4_wait_completion routine to handle all possible
completion statuses.
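
For intuition only, a standalone model of how the read_status routine in
the diff below assembles the composite status, using stand-in register
values in place of the real MMR reads and assuming the [Error][Busy][Aux]
bit ordering stated in its comment:

#include <stdint.h>
#include <stdio.h>

#define MODEL_ACT_STATUS_MASK	0x3UL	/* 2-bit ERROR/BUSY slot per descriptor */

/* status01 models ACTIVATION_STATUS_0/1, status2 models ACTIVATION_STATUS_2. */
static uint64_t model_read_status(uint64_t status01, int index,
				  uint64_t status2, int desc)
{
	uint64_t stat;

	stat  = ((status01 >> index) & MODEL_ACT_STATUS_MASK) << 1;	/* [Error][Busy] */
	stat |= (status2 >> desc) & 0x1;				/* [Aux] */

	return stat;	/* composite [Error][Busy][Aux] */
}

int main(void)
{
	/* Descriptor 3: BUSY bit set in its 2-bit slot, AUX bit set in STATUS_2. */
	uint64_t stat = model_read_status(1ULL << (3 * 2), 3 * 2, 1ULL << 3, 3);

	printf("composite status: 0x%llx\n", (unsigned long long)stat);	/* 0x3 */
	return 0;
}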

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/platform/uv/tlb_uv.c | 58 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 2a826dd..42e65fe 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -687,6 +687,62 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
}

/*
+ * Returns the status of current BAU message for cpu desc as a bit field
+ * [Error][Busy][Aux]
+ */
+static u64 read_status(u64 status_mmr, int index, int desc)
+{
+ u64 stat;
+
+ stat = ((read_lmmr(status_mmr) >> index) & UV_ACT_STATUS_MASK) << 1;
+ stat |= (read_lmmr(UVH_LB_BAU_SB_ACTIVATION_STATUS_2) >> desc) & 0x1;
+
+ return stat;
+}
+
+static int uv4_wait_completion(struct bau_desc *bau_desc,
+ struct bau_control *bcp, long try)
+{
+ struct ptc_stats *stat = bcp->statp;
+ u64 descriptor_stat;
+ u64 mmr = bcp->status_mmr;
+ int index = bcp->status_index;
+ int desc = bcp->uvhub_cpu;
+
+ descriptor_stat = read_status(mmr, index, desc);
+
+ /* spin on the status MMR, waiting for it to go idle */
+ while (descriptor_stat != UV2H_DESC_IDLE) {
+ switch (descriptor_stat) {
+ case UV2H_DESC_SOURCE_TIMEOUT:
+ stat->s_stimeout++;
+ return FLUSH_GIVEUP;
+
+ case UV2H_DESC_DEST_TIMEOUT:
+ stat->s_dtimeout++;
+ bcp->conseccompletes = 0;
+ return FLUSH_RETRY_TIMEOUT;
+
+ case UV2H_DESC_DEST_STRONG_NACK:
+ stat->s_plugged++;
+ bcp->conseccompletes = 0;
+ return FLUSH_RETRY_PLUGGED;
+
+ case UV2H_DESC_DEST_PUT_ERR:
+ bcp->conseccompletes = 0;
+ return FLUSH_GIVEUP;
+
+ default:
+ /* descriptor_stat is still BUSY */
+ cpu_relax();
+ }
+ descriptor_stat = read_status(mmr, index, desc);
+ }
+ bcp->conseccompletes++;
+ return FLUSH_COMPLETE;
+}
+
+/*
* Our retries are blocked by all destination sw ack resources being
* in use, and a timeout is pending. In that case hardware immediately
* returns the ERROR that looks like a destination timeout.
@@ -2157,7 +2213,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
.write_g_sw_ack = write_gmmr_proc_sw_ack,
.write_payload_first = write_mmr_proc_payload_first,
.write_payload_last = write_mmr_proc_payload_last,
- .wait_completion = uv2_3_wait_completion,
+ .wait_completion = uv4_wait_completion,
};

/*
--
1.8.2.1

2017-03-09 16:42:34

by Andrew Banman

Subject: [PATCH 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations

Remove the present wait_completion routine and add a function pointer by
the same name to the bau_operations struct. Rather than switching on the
UV hub version during message processing, set the architecture-specific
uv*_wait_completion during initialization.

The uv123_bau_ops struct must be split into uv1 and uv2_3 versions to
accommodate the corresponding wait_completion routines.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/include/asm/uv/uv_bau.h | 2 ++
arch/x86/platform/uv/tlb_uv.c | 31 ++++++++++++++++++-------------
2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index f1c23b3..10e65cd 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -671,6 +671,8 @@ struct bau_operations {
void (*write_g_sw_ack)(int pnode, unsigned long mmr);
void (*write_payload_first)(int pnode, unsigned long mmr);
void (*write_payload_last)(int pnode, unsigned long mmr);
+ int (*wait_completion)(struct bau_desc*,
+ struct bau_control*, long try);
};

static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 13a7055..2a826dd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -686,14 +686,6 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
return FLUSH_COMPLETE;
}

-static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
-{
- if (bcp->uvhub_version == UV_BAU_V1)
- return uv1_wait_completion(bau_desc, bcp, try);
- else
- return uv2_3_wait_completion(bau_desc, bcp, try);
-}
-
/*
* Our retries are blocked by all destination sw ack resources being
* in use, and a timeout is pending. In that case hardware immediately
@@ -922,7 +914,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
write_mmr_activation(index);

try++;
- completion_stat = wait_completion(bau_desc, bcp, try);
+ completion_stat = ops.wait_completion(bau_desc, bcp, try);

handle_cmplt(completion_stat, bau_desc, bcp, hmaster, stat);

@@ -2135,7 +2127,18 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
return 1;
}

-static const struct bau_operations uv123_bau_ops __initconst = {
+static const struct bau_operations uv1_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_offset,
+ .read_l_sw_ack = read_mmr_sw_ack,
+ .read_g_sw_ack = read_gmmr_sw_ack,
+ .write_l_sw_ack = write_mmr_sw_ack,
+ .write_g_sw_ack = write_gmmr_sw_ack,
+ .write_payload_first = write_mmr_payload_first,
+ .write_payload_last = write_mmr_payload_last,
+ .wait_completion = uv1_wait_completion,
+};
+
+static const struct bau_operations uv2_3_bau_ops __initconst = {
.bau_gpa_to_offset = uv_gpa_to_offset,
.read_l_sw_ack = read_mmr_sw_ack,
.read_g_sw_ack = read_gmmr_sw_ack,
@@ -2143,6 +2146,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
.write_g_sw_ack = write_gmmr_sw_ack,
.write_payload_first = write_mmr_payload_first,
.write_payload_last = write_mmr_payload_last,
+ .wait_completion = uv2_3_wait_completion,
};

static const struct bau_operations uv4_bau_ops __initconst = {
@@ -2153,6 +2157,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
.write_g_sw_ack = write_gmmr_proc_sw_ack,
.write_payload_first = write_mmr_proc_payload_first,
.write_payload_last = write_mmr_proc_payload_last,
+ .wait_completion = uv2_3_wait_completion,
};

/*
@@ -2174,11 +2179,11 @@ static int __init uv_bau_init(void)
if (is_uv4_hub())
ops = uv4_bau_ops;
else if (is_uv3_hub())
- ops = uv123_bau_ops;
+ ops = uv2_3_bau_ops;
else if (is_uv2_hub())
- ops = uv123_bau_ops;
+ ops = uv2_3_bau_ops;
else if (is_uv1_hub())
- ops = uv123_bau_ops;
+ ops = uv1_bau_ops;

for_each_possible_cpu(cur_cpu) {
mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);
--
1.8.2.1

2017-03-09 16:42:25

by Andrew Banman

Subject: [PATCH 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier

On UV4, the destination agent verifies each message by checking the
descriptor qualifier field of the message payload. Messages without this
field set to 0x534749 will cause a hub error to assert. Split
bau_message_payload into uv1_2_3 and uv4 versions to account for the
different payload formats.

Enforce the size of each field by using the appropriate u** integer type.
Replace extraneous comments with kernel-doc comments.
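
As a standalone sanity sketch (not part of the patch), here are the two
payload layouts described above modeled with fixed-width C types; the
expectation that both remain 16 bytes is an assumption inferred from the
previous layout, not something stated in the patch:

#include <stdint.h>

struct model_uv1_2_3_payload {
	uint64_t address;
	uint16_t sending_cpu;
	uint16_t acknowledge_count;
	uint32_t reserved1;
};

struct model_uv4_payload {
	uint64_t address;
	uint16_t sending_cpu;
	uint16_t acknowledge_count;
	uint32_t reserved1:8;
	uint32_t qualifier:24;	/* must hold BAU_DESC_QUALIFIER (0x534749) */
};

/* Both payloads should keep the original 16-byte footprint (assumption). */
_Static_assert(sizeof(struct model_uv1_2_3_payload) == 16, "uv1_2_3 payload size");
_Static_assert(sizeof(struct model_uv4_payload) == 16, "uv4 payload size");

int main(void)
{
	return 0;
}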

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/include/asm/uv/uv_bau.h | 40 ++++++++++++++++++++++++++++------------
arch/x86/platform/uv/tlb_uv.c | 27 +++++++++++++++++++--------
2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 768093f..1ed0574 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,8 @@
#define MSG_REGULAR 1
#define MSG_RETRY 2

+#define BAU_DESC_QUALIFIER 0x534749
+
enum uv_bau_version {
UV_BAU_V1 = 1,
UV_BAU_V2,
@@ -229,20 +231,31 @@ struct bau_local_cpumask {
* the s/w ack bit vector ]
*/

-/*
- * The payload is software-defined for INTD transactions
+/**
+ * struct uv1_2_3_bau_msg_payload - defines payload for INTD transactions
+ * @address: signifies a page or all TLB's of the cpu
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
*/
-struct bau_msg_payload {
- unsigned long address; /* signifies a page or all
- TLB's of the cpu */
- /* 64 bits */
- unsigned short sending_cpu; /* filled in by sender */
- /* 16 bits */
- unsigned short acknowledge_count; /* filled in by destination */
- /* 16 bits */
- unsigned int reserved1:32; /* not usable */
+struct uv1_2_3_bau_msg_payload {
+ u64 address;
+ u16 sending_cpu;
+ u16 acknowledge_count;
+ u32 reserved1;
};

+/**
+ * struct uv4_bau_msg_payload - defines payload for INTD transactions
+ * @address: signifies a page or all TLB's of the cpu
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
+ * @qualifier: set by source to verify origin of INTD broadcast
+ */
+struct uv4_bau_msg_payload {
+ u64 address;
+ u16 sending_cpu;
+ u16 acknowledge_count;
+ u32 reserved1:8;
+ u32 qualifier:24;
+};

/*
* UV1 Message header: 16 bytes (128 bits) (bytes 0x30-0x3f of descriptor)
@@ -418,7 +431,10 @@ struct bau_desc {
struct uv2_3_bau_msg_header uv2_3_hdr;
} header;

- struct bau_msg_payload payload;
+ union bau_payload_header {
+ struct uv1_2_3_bau_msg_payload uv1_2_3;
+ struct uv4_bau_msg_payload uv4;
+ } payload;
};
/* UV1:
* -payload-- ---------header------
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f4f5aa6..70721c4 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1114,15 +1114,12 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
unsigned long end,
unsigned int cpu)
{
- int locals = 0;
- int remotes = 0;
- int hubs = 0;
+ int locals = 0, remotes = 0, hubs = 0;
struct bau_desc *bau_desc;
struct cpumask *flush_mask;
struct ptc_stats *stat;
struct bau_control *bcp;
- unsigned long descriptor_status;
- unsigned long status;
+ unsigned long descriptor_status, status, address;

bcp = &per_cpu(bau_control, cpu);

@@ -1171,10 +1168,24 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
record_send_statistics(stat, locals, hubs, remotes, bau_desc);

if (!end || (end - start) <= PAGE_SIZE)
- bau_desc->payload.address = start;
+ address = start;
else
- bau_desc->payload.address = TLB_FLUSH_ALL;
- bau_desc->payload.sending_cpu = cpu;
+ address = TLB_FLUSH_ALL;
+
+ switch (bcp->uvhub_version) {
+ case UV_BAU_V1:
+ case UV_BAU_V2:
+ case UV_BAU_V3:
+ bau_desc->payload.uv1_2_3.address = address;
+ bau_desc->payload.uv1_2_3.sending_cpu = cpu;
+ break;
+ case UV_BAU_V4:
+ bau_desc->payload.uv4.address = address;
+ bau_desc->payload.uv4.sending_cpu = cpu;
+ bau_desc->payload.uv4.qualifier = BAU_DESC_QUALIFIER;
+ break;
+ }
+
/*
* uv_flush_send_and_wait returns 0 if all cpu's were messaged,
* or 1 if it gave up and the original cpumask should be returned.
--
1.8.2.1

2017-03-09 16:42:52

by Andrew Banman

Subject: [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control

The location of the ERROR and BUSY status bits depends on the descriptor
index, i.e. the CPU, of the message. Since this index does not change,
there is no need to calculate the mmr and index location during message
processing. The less work we do in the hot path the better.

Add status_mmr and status_index fields to bau_control and compute their
values during initialization. Add kerneldoc descriptions for the new
fields. Update uv*_wait_completion to use these fields rather than
receiving the information as parameters.
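
For intuition, a standalone model of the init-time computation described
above, using stand-in constants in place of UV_CPUS_PER_AS and
UV_ACT_STATUS_SIZE (assumed to be 32 descriptors per status register and
2 bits per descriptor, per the comment removed in the diff below):

#include <stdio.h>

#define MODEL_CPUS_PER_AS	32	/* descriptors covered by each status register */
#define MODEL_ACT_STATUS_SIZE	2	/* status bits per descriptor */

struct model_bau_control {
	int uvhub_cpu;
	int status_reg;		/* 0 -> ACTIVATION_STATUS_0, 1 -> ACTIVATION_STATUS_1 */
	int status_index;	/* shift of this CPU's 2-bit slot within the register */
};

/* Computed once at init (scan_sock in the patch) instead of per message. */
static void model_set_status_location(struct model_bau_control *bcp)
{
	if (bcp->uvhub_cpu < MODEL_CPUS_PER_AS) {
		bcp->status_reg   = 0;
		bcp->status_index = bcp->uvhub_cpu * MODEL_ACT_STATUS_SIZE;
	} else {
		bcp->status_reg   = 1;
		bcp->status_index = (bcp->uvhub_cpu - MODEL_CPUS_PER_AS)
					* MODEL_ACT_STATUS_SIZE;
	}
}

int main(void)
{
	struct model_bau_control bcp = { .uvhub_cpu = 40 };

	model_set_status_location(&bcp);
	printf("hub cpu %d -> STATUS_%d, shift %d\n",
	       bcp.uvhub_cpu, bcp.status_reg, bcp.status_index);	/* STATUS_1, 16 */
	return 0;
}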

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/include/asm/uv/uv_bau.h | 10 +++++++--
arch/x86/platform/uv/tlb_uv.c | 46 +++++++++++++++++++---------------------
2 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 59ae8a7..f1c23b3 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -600,8 +600,12 @@ struct uvhub_desc {
struct socket_desc socket[2];
};

-/*
- * one per-cpu; to locate the software tables
+/**
+ * struct bau_control
+ * @status_mmr: location of status mmr, determined by uvhub_cpu
+ * @status_index: index of ERR|BUSY bits in status mmr, determined by uvhub_cpu
+ *
+ * Per-cpu control struct containing CPU topology information and BAU tuneables.
*/
struct bau_control {
struct bau_desc *descriptor_base;
@@ -619,6 +623,8 @@ struct bau_control {
int timeout_tries;
int ipi_attempts;
int conseccompletes;
+ u64 status_mmr;
+ int status_index;
bool nobau;
short baudisabled;
short cpu;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index e6994fd..13a7055 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -527,11 +527,12 @@ static unsigned long uv1_read_status(unsigned long mmr_offset, int right_shift)
* return COMPLETE, RETRY(PLUGGED or TIMEOUT) or GIVEUP
*/
static int uv1_wait_completion(struct bau_desc *bau_desc,
- unsigned long mmr_offset, int right_shift,
struct bau_control *bcp, long try)
{
unsigned long descriptor_status;
cycles_t ttm;
+ u64 mmr_offset = bcp->status_mmr;
+ int right_shift = bcp->status_index;
struct ptc_stats *stat = bcp->statp;

descriptor_status = uv1_read_status(mmr_offset, right_shift);
@@ -619,11 +620,12 @@ int handle_uv2_busy(struct bau_control *bcp)
}

static int uv2_3_wait_completion(struct bau_desc *bau_desc,
- unsigned long mmr_offset, int right_shift,
struct bau_control *bcp, long try)
{
unsigned long descriptor_stat;
cycles_t ttm;
+ u64 mmr_offset = bcp->status_mmr;
+ int right_shift = bcp->status_index;
int desc = bcp->uvhub_cpu;
long busy_reps = 0;
struct ptc_stats *stat = bcp->statp;
@@ -684,29 +686,12 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
return FLUSH_COMPLETE;
}

-/*
- * There are 2 status registers; each and array[32] of 2 bits. Set up for
- * which register to read and position in that register based on cpu in
- * current hub.
- */
static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
{
- int right_shift;
- unsigned long mmr_offset;
- int desc = bcp->uvhub_cpu;
-
- if (desc < UV_CPUS_PER_AS) {
- mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
- right_shift = desc * UV_ACT_STATUS_SIZE;
- } else {
- mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
- right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
- }
-
if (bcp->uvhub_version == UV_BAU_V1)
- return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+ return uv1_wait_completion(bau_desc, bcp, try);
else
- return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+ return uv2_3_wait_completion(bau_desc, bcp, try);
}

/*
@@ -2024,8 +2009,7 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
struct bau_control **smasterp,
struct bau_control **hmasterp)
{
- int i;
- int cpu;
+ int i, cpu, uvhub_cpu;
struct bau_control *bcp;

for (i = 0; i < sdp->num_cpus; i++) {
@@ -2054,7 +2038,21 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
return 1;
}
bcp->uvhub_master = *hmasterp;
- bcp->uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+ uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+ bcp->uvhub_cpu = uvhub_cpu;
+
+ /*
+ * The ERROR and BUSY status registers are located pairwise over
+ * the STATUS_0 and STATUS_1 mmrs; each an array[32] of 2 bits.
+ */
+ if (uvhub_cpu < UV_CPUS_PER_AS) {
+ bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
+ bcp->status_index = uvhub_cpu * UV_ACT_STATUS_SIZE;
+ } else {
+ bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
+ bcp->status_index = (uvhub_cpu - UV_CPUS_PER_AS)
+ * UV_ACT_STATUS_SIZE;
+ }

if (bcp->uvhub_cpu >= MAX_CPUS_PER_UVHUB) {
pr_emerg("%d cpus per uvhub invalid\n",
--
1.8.2.1

2017-03-09 16:42:50

by Andrew Banman

Subject: [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances

Move the bau_operations declaration after bau struct declarations so the
bau structs can be referenced when adding new functions to
bau_operations. That way we avoid forward declarations of the bau
structs.

Likewise, move uv*_bau_ops structs down to avoid forward declarations of
new functions defined in the same file. Declare these structs __initconst
since they are only used during initialization. Similarly, declare the
bau_operations ops instance __ro_after_init as it is read-only after
initialization.

This is a preparatory patch for adding wait_completion to bau_operations.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Mike Travis <[email protected]>
---
arch/x86/include/asm/uv/uv_bau.h | 22 ++++++++++----------
arch/x86/platform/uv/tlb_uv.c | 43 ++++++++++++++++++++--------------------
2 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 1ed0574..59ae8a7 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -405,17 +405,6 @@ struct uv2_3_bau_msg_header {
/* bits 127:120 */
};

-/* Abstracted BAU functions */
-struct bau_operations {
- unsigned long (*read_l_sw_ack)(void);
- unsigned long (*read_g_sw_ack)(int pnode);
- unsigned long (*bau_gpa_to_offset)(unsigned long vaddr);
- void (*write_l_sw_ack)(unsigned long mmr);
- void (*write_g_sw_ack)(int pnode, unsigned long mmr);
- void (*write_payload_first)(int pnode, unsigned long mmr);
- void (*write_payload_last)(int pnode, unsigned long mmr);
-};
-
/*
* The activation descriptor:
* The format of the message to send, plus all accompanying control
@@ -667,6 +656,17 @@ struct bau_control {
struct hub_and_pnode *thp;
};

+/* Abstracted BAU functions */
+struct bau_operations {
+ unsigned long (*read_l_sw_ack)(void);
+ unsigned long (*read_g_sw_ack)(int pnode);
+ unsigned long (*bau_gpa_to_offset)(unsigned long vaddr);
+ void (*write_l_sw_ack)(unsigned long mmr);
+ void (*write_g_sw_ack)(int pnode, unsigned long mmr);
+ void (*write_payload_first)(int pnode, unsigned long mmr);
+ void (*write_payload_last)(int pnode, unsigned long mmr);
+};
+
static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
{
write_gmmr(pnode, UVH_BAU_DATA_BROADCAST, mmr_image);
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 70721c4..e6994fd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -23,28 +23,7 @@
#include <asm/irq_vectors.h>
#include <asm/timer.h>

-static struct bau_operations ops;
-
-static struct bau_operations uv123_bau_ops = {
- .bau_gpa_to_offset = uv_gpa_to_offset,
- .read_l_sw_ack = read_mmr_sw_ack,
- .read_g_sw_ack = read_gmmr_sw_ack,
- .write_l_sw_ack = write_mmr_sw_ack,
- .write_g_sw_ack = write_gmmr_sw_ack,
- .write_payload_first = write_mmr_payload_first,
- .write_payload_last = write_mmr_payload_last,
-};
-
-static struct bau_operations uv4_bau_ops = {
- .bau_gpa_to_offset = uv_gpa_to_soc_phys_ram,
- .read_l_sw_ack = read_mmr_proc_sw_ack,
- .read_g_sw_ack = read_gmmr_proc_sw_ack,
- .write_l_sw_ack = write_mmr_proc_sw_ack,
- .write_g_sw_ack = write_gmmr_proc_sw_ack,
- .write_payload_first = write_mmr_proc_payload_first,
- .write_payload_last = write_mmr_proc_payload_last,
-};
-
+static struct bau_operations ops __ro_after_init;

/* timeouts in nanoseconds (indexed by UVH_AGING_PRESCALE_SEL urgency7 30:28) */
static int timeout_base_ns[] = {
@@ -2158,6 +2137,26 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
return 1;
}

+static const struct bau_operations uv123_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_offset,
+ .read_l_sw_ack = read_mmr_sw_ack,
+ .read_g_sw_ack = read_gmmr_sw_ack,
+ .write_l_sw_ack = write_mmr_sw_ack,
+ .write_g_sw_ack = write_gmmr_sw_ack,
+ .write_payload_first = write_mmr_payload_first,
+ .write_payload_last = write_mmr_payload_last,
+};
+
+static const struct bau_operations uv4_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_soc_phys_ram,
+ .read_l_sw_ack = read_mmr_proc_sw_ack,
+ .read_g_sw_ack = read_gmmr_proc_sw_ack,
+ .write_l_sw_ack = write_mmr_proc_sw_ack,
+ .write_g_sw_ack = write_gmmr_proc_sw_ack,
+ .write_payload_first = write_mmr_proc_payload_first,
+ .write_payload_last = write_mmr_proc_payload_last,
+};
+
/*
* Initialization of BAU-related structures
*/
--
1.8.2.1

2017-03-09 16:43:18

by Andrew Banman

Subject: [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants

Define enumerated constants for each UV hub version and replace magic
numbers with the appropriate constant.

Signed-off-by: Andrew Banman <[email protected]>
---
arch/x86/include/asm/uv/uv_bau.h | 7 +++++++
arch/x86/platform/uv/tlb_uv.c | 16 ++++++++--------
2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 57ab86d..768093f 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,13 @@
#define MSG_REGULAR 1
#define MSG_RETRY 2

+enum uv_bau_version {
+ UV_BAU_V1 = 1,
+ UV_BAU_V2,
+ UV_BAU_V3,
+ UV_BAU_V4,
+};
+
/*
* Distribution: 32 bytes (256 bits) (bytes 0-0x1f of descriptor)
* If the 'multilevel' flag in the header portion of the descriptor
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f25982c..f4f5aa6 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -724,7 +724,7 @@ static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, l
right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
}

- if (bcp->uvhub_version == 1)
+ if (bcp->uvhub_version == UV_BAU_V1)
return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
else
return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
@@ -918,7 +918,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
struct uv1_bau_msg_header *uv1_hdr = NULL;
struct uv2_3_bau_msg_header *uv2_3_hdr = NULL;

- if (bcp->uvhub_version == 1) {
+ if (bcp->uvhub_version == UV_BAU_V1) {
uv1 = 1;
uv1_throttle(hmaster, stat);
}
@@ -1296,7 +1296,7 @@ void uv_bau_message_interrupt(struct pt_regs *regs)

msgdesc.msg_slot = msg - msgdesc.queue_first;
msgdesc.msg = msg;
- if (bcp->uvhub_version == 2)
+ if (bcp->uvhub_version == UV_BAU_V2)
process_uv2_message(&msgdesc, bcp);
else
/* no error workaround for uv1 or uv3 */
@@ -1838,7 +1838,7 @@ static void pq_init(int node, int pnode)
* and the payload queue tail must be maintained by the kernel.
*/
bcp = &per_cpu(bau_control, smp_processor_id());
- if (bcp->uvhub_version <= 3) {
+ if (bcp->uvhub_version <= UV_BAU_V3) {
tail = first;
gnode = uv_gpa_to_gnode(uv_gpa(pqp));
first = (gnode << UV_PAYLOADQ_GNODE_SHIFT) | tail;
@@ -2052,13 +2052,13 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
bcp->socket_master = *smasterp;
bcp->uvhub = bdp->uvhub;
if (is_uv1_hub())
- bcp->uvhub_version = 1;
+ bcp->uvhub_version = UV_BAU_V1;
else if (is_uv2_hub())
- bcp->uvhub_version = 2;
+ bcp->uvhub_version = UV_BAU_V2;
else if (is_uv3_hub())
- bcp->uvhub_version = 3;
+ bcp->uvhub_version = UV_BAU_V3;
else if (is_uv4_hub())
- bcp->uvhub_version = 4;
+ bcp->uvhub_version = UV_BAU_V4;
else {
pr_emerg("uvhub version not 1, 2, 3, or 4\n");
return 1;
--
1.8.2.1

Subject: [tip:x86/platform] x86/platform/uv/BAU: Add uv_bau_version enumerated constants

Commit-ID: 491bd88cdb256cdabd25362b923d94ab80cf72c9
Gitweb: http://git.kernel.org/tip/491bd88cdb256cdabd25362b923d94ab80cf72c9
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:09 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:28 +0100

x86/platform/uv/BAU: Add uv_bau_version enumerated constants

Define enumerated constants for each UV hub version and replace magic
numbers with the appropriate constant.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/uv/uv_bau.h | 7 +++++++
arch/x86/platform/uv/tlb_uv.c | 16 ++++++++--------
2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 57ab86d..768093f 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,13 @@
#define MSG_REGULAR 1
#define MSG_RETRY 2

+enum uv_bau_version {
+ UV_BAU_V1 = 1,
+ UV_BAU_V2,
+ UV_BAU_V3,
+ UV_BAU_V4,
+};
+
/*
* Distribution: 32 bytes (256 bits) (bytes 0-0x1f of descriptor)
* If the 'multilevel' flag in the header portion of the descriptor
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f25982c..f4f5aa6 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -724,7 +724,7 @@ static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, l
right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
}

- if (bcp->uvhub_version == 1)
+ if (bcp->uvhub_version == UV_BAU_V1)
return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
else
return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
@@ -918,7 +918,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
struct uv1_bau_msg_header *uv1_hdr = NULL;
struct uv2_3_bau_msg_header *uv2_3_hdr = NULL;

- if (bcp->uvhub_version == 1) {
+ if (bcp->uvhub_version == UV_BAU_V1) {
uv1 = 1;
uv1_throttle(hmaster, stat);
}
@@ -1296,7 +1296,7 @@ void uv_bau_message_interrupt(struct pt_regs *regs)

msgdesc.msg_slot = msg - msgdesc.queue_first;
msgdesc.msg = msg;
- if (bcp->uvhub_version == 2)
+ if (bcp->uvhub_version == UV_BAU_V2)
process_uv2_message(&msgdesc, bcp);
else
/* no error workaround for uv1 or uv3 */
@@ -1838,7 +1838,7 @@ static void pq_init(int node, int pnode)
* and the payload queue tail must be maintained by the kernel.
*/
bcp = &per_cpu(bau_control, smp_processor_id());
- if (bcp->uvhub_version <= 3) {
+ if (bcp->uvhub_version <= UV_BAU_V3) {
tail = first;
gnode = uv_gpa_to_gnode(uv_gpa(pqp));
first = (gnode << UV_PAYLOADQ_GNODE_SHIFT) | tail;
@@ -2052,13 +2052,13 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
bcp->socket_master = *smasterp;
bcp->uvhub = bdp->uvhub;
if (is_uv1_hub())
- bcp->uvhub_version = 1;
+ bcp->uvhub_version = UV_BAU_V1;
else if (is_uv2_hub())
- bcp->uvhub_version = 2;
+ bcp->uvhub_version = UV_BAU_V2;
else if (is_uv3_hub())
- bcp->uvhub_version = 3;
+ bcp->uvhub_version = UV_BAU_V3;
else if (is_uv4_hub())
- bcp->uvhub_version = 4;
+ bcp->uvhub_version = UV_BAU_V4;
else {
pr_emerg("uvhub version not 1, 2, 3, or 4\n");
return 1;

Subject: [tip:x86/platform] x86/platform/uv/BAU: Add payload descriptor qualifier

Commit-ID: e9be36443cecda1be20b2cc3b891676ff2af9dff
Gitweb: http://git.kernel.org/tip/e9be36443cecda1be20b2cc3b891676ff2af9dff
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:10 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:28 +0100

x86/platform/uv/BAU: Add payload descriptor qualifier

On UV4, the destination agent verifies each message by checking the
descriptor qualifier field of the message payload. Messages without this
field set to 0x534749 will cause a hub error to assert. Split
bau_message_payload into uv1_2_3 and uv4 versions to account for the
different payload formats.

Enforce the size of each field by using the appropriate u** integer type.
Replace extraneous comments with KernelDoc comment.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/uv/uv_bau.h | 41 ++++++++++++++++++++++++++++------------
arch/x86/platform/uv/tlb_uv.c | 27 ++++++++++++++++++--------
2 files changed, 48 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 768093f..dcd63ed 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,8 @@
#define MSG_REGULAR 1
#define MSG_RETRY 2

+#define BAU_DESC_QUALIFIER 0x534749
+
enum uv_bau_version {
UV_BAU_V1 = 1,
UV_BAU_V2,
@@ -229,20 +231,32 @@ struct bau_local_cpumask {
* the s/w ack bit vector ]
*/

-/*
- * The payload is software-defined for INTD transactions
+/**
+ * struct uv1_2_3_bau_msg_payload - defines payload for INTD transactions
+ * @address: Signifies a page or all TLB's of the cpu
+ * @sending_cpu: CPU from which the message originates
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
*/
-struct bau_msg_payload {
- unsigned long address; /* signifies a page or all
- TLB's of the cpu */
- /* 64 bits */
- unsigned short sending_cpu; /* filled in by sender */
- /* 16 bits */
- unsigned short acknowledge_count; /* filled in by destination */
- /* 16 bits */
- unsigned int reserved1:32; /* not usable */
+struct uv1_2_3_bau_msg_payload {
+ u64 address;
+ u16 sending_cpu;
+ u16 acknowledge_count;
};

+/**
+ * struct uv4_bau_msg_payload - defines payload for INTD transactions
+ * @address: Signifies a page or all TLB's of the cpu
+ * @sending_cpu: CPU from which the message originates
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
+ * @qualifier: Set by source to verify origin of INTD broadcast
+ */
+struct uv4_bau_msg_payload {
+ u64 address;
+ u16 sending_cpu;
+ u16 acknowledge_count;
+ u32 reserved:8;
+ u32 qualifier:24;
+};

/*
* UV1 Message header: 16 bytes (128 bits) (bytes 0x30-0x3f of descriptor)
@@ -418,7 +432,10 @@ struct bau_desc {
struct uv2_3_bau_msg_header uv2_3_hdr;
} header;

- struct bau_msg_payload payload;
+ union bau_payload_header {
+ struct uv1_2_3_bau_msg_payload uv1_2_3;
+ struct uv4_bau_msg_payload uv4;
+ } payload;
};
/* UV1:
* -payload-- ---------header------
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f4f5aa6..70721c4 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1114,15 +1114,12 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
unsigned long end,
unsigned int cpu)
{
- int locals = 0;
- int remotes = 0;
- int hubs = 0;
+ int locals = 0, remotes = 0, hubs = 0;
struct bau_desc *bau_desc;
struct cpumask *flush_mask;
struct ptc_stats *stat;
struct bau_control *bcp;
- unsigned long descriptor_status;
- unsigned long status;
+ unsigned long descriptor_status, status, address;

bcp = &per_cpu(bau_control, cpu);

@@ -1171,10 +1168,24 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
record_send_statistics(stat, locals, hubs, remotes, bau_desc);

if (!end || (end - start) <= PAGE_SIZE)
- bau_desc->payload.address = start;
+ address = start;
else
- bau_desc->payload.address = TLB_FLUSH_ALL;
- bau_desc->payload.sending_cpu = cpu;
+ address = TLB_FLUSH_ALL;
+
+ switch (bcp->uvhub_version) {
+ case UV_BAU_V1:
+ case UV_BAU_V2:
+ case UV_BAU_V3:
+ bau_desc->payload.uv1_2_3.address = address;
+ bau_desc->payload.uv1_2_3.sending_cpu = cpu;
+ break;
+ case UV_BAU_V4:
+ bau_desc->payload.uv4.address = address;
+ bau_desc->payload.uv4.sending_cpu = cpu;
+ bau_desc->payload.uv4.qualifier = BAU_DESC_QUALIFIER;
+ break;
+ }
+
/*
* uv_flush_send_and_wait returns 0 if all cpu's were messaged,
* or 1 if it gave up and the original cpumask should be returned.

Subject: [tip:x86/platform] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances

Commit-ID: 8e3b21b6dbf0318d5b3a598572acc23f07189c40
Gitweb: http://git.kernel.org/tip/8e3b21b6dbf0318d5b3a598572acc23f07189c40
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:11 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:28 +0100

x86/platform/uv/BAU: Cleanup bau_operations declaration and instances

Move the bau_operations declaration after bau struct declarations so the
bau structs can be referenced when adding new functions to
bau_operations. That way we avoid forward declarations of the bau
structs.

Likewise, move uv*_bau_ops structs down to avoid forward declarations of
new functions defined in the same file. Declare these structs __initconst
since they are only used during initialization. Similarly, declare the
bau_operations ops instance __ro_after_init as it is read-only after
initialization.

This is a preparatory patch for adding wait_completion to bau_operations.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/uv/uv_bau.h | 22 ++++++++++----------
arch/x86/platform/uv/tlb_uv.c | 43 ++++++++++++++++++++--------------------
2 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index dcd63ed..695b873 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -406,17 +406,6 @@ struct uv2_3_bau_msg_header {
/* bits 127:120 */
};

-/* Abstracted BAU functions */
-struct bau_operations {
- unsigned long (*read_l_sw_ack)(void);
- unsigned long (*read_g_sw_ack)(int pnode);
- unsigned long (*bau_gpa_to_offset)(unsigned long vaddr);
- void (*write_l_sw_ack)(unsigned long mmr);
- void (*write_g_sw_ack)(int pnode, unsigned long mmr);
- void (*write_payload_first)(int pnode, unsigned long mmr);
- void (*write_payload_last)(int pnode, unsigned long mmr);
-};
-
/*
* The activation descriptor:
* The format of the message to send, plus all accompanying control
@@ -668,6 +657,17 @@ struct bau_control {
struct hub_and_pnode *thp;
};

+/* Abstracted BAU functions */
+struct bau_operations {
+ unsigned long (*read_l_sw_ack)(void);
+ unsigned long (*read_g_sw_ack)(int pnode);
+ unsigned long (*bau_gpa_to_offset)(unsigned long vaddr);
+ void (*write_l_sw_ack)(unsigned long mmr);
+ void (*write_g_sw_ack)(int pnode, unsigned long mmr);
+ void (*write_payload_first)(int pnode, unsigned long mmr);
+ void (*write_payload_last)(int pnode, unsigned long mmr);
+};
+
static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
{
write_gmmr(pnode, UVH_BAU_DATA_BROADCAST, mmr_image);
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 70721c4..e6994fd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -23,28 +23,7 @@
#include <asm/irq_vectors.h>
#include <asm/timer.h>

-static struct bau_operations ops;
-
-static struct bau_operations uv123_bau_ops = {
- .bau_gpa_to_offset = uv_gpa_to_offset,
- .read_l_sw_ack = read_mmr_sw_ack,
- .read_g_sw_ack = read_gmmr_sw_ack,
- .write_l_sw_ack = write_mmr_sw_ack,
- .write_g_sw_ack = write_gmmr_sw_ack,
- .write_payload_first = write_mmr_payload_first,
- .write_payload_last = write_mmr_payload_last,
-};
-
-static struct bau_operations uv4_bau_ops = {
- .bau_gpa_to_offset = uv_gpa_to_soc_phys_ram,
- .read_l_sw_ack = read_mmr_proc_sw_ack,
- .read_g_sw_ack = read_gmmr_proc_sw_ack,
- .write_l_sw_ack = write_mmr_proc_sw_ack,
- .write_g_sw_ack = write_gmmr_proc_sw_ack,
- .write_payload_first = write_mmr_proc_payload_first,
- .write_payload_last = write_mmr_proc_payload_last,
-};
-
+static struct bau_operations ops __ro_after_init;

/* timeouts in nanoseconds (indexed by UVH_AGING_PRESCALE_SEL urgency7 30:28) */
static int timeout_base_ns[] = {
@@ -2158,6 +2137,26 @@ fail:
return 1;
}

+static const struct bau_operations uv123_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_offset,
+ .read_l_sw_ack = read_mmr_sw_ack,
+ .read_g_sw_ack = read_gmmr_sw_ack,
+ .write_l_sw_ack = write_mmr_sw_ack,
+ .write_g_sw_ack = write_gmmr_sw_ack,
+ .write_payload_first = write_mmr_payload_first,
+ .write_payload_last = write_mmr_payload_last,
+};
+
+static const struct bau_operations uv4_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_soc_phys_ram,
+ .read_l_sw_ack = read_mmr_proc_sw_ack,
+ .read_g_sw_ack = read_gmmr_proc_sw_ack,
+ .write_l_sw_ack = write_mmr_proc_sw_ack,
+ .write_g_sw_ack = write_gmmr_proc_sw_ack,
+ .write_payload_first = write_mmr_proc_payload_first,
+ .write_payload_last = write_mmr_proc_payload_last,
+};
+
/*
* Initialization of BAU-related structures
*/

Subject: [tip:x86/platform] x86/platform/uv/BAU: Add status mmr location fields to bau_control

Commit-ID: dfeb28f068ff9cc4f714c7d1edaf61597ea1768b
Gitweb: http://git.kernel.org/tip/dfeb28f068ff9cc4f714c7d1edaf61597ea1768b
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:12 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:29 +0100

x86/platform/uv/BAU: Add status mmr location fields to bau_control

The location of the ERROR and BUSY status bits depends on the descriptor
index, i.e. the CPU, of the message. Since this index does not change,
there is no need to calculate the mmr and index location during message
processing. The less work we do in the hot path the better.

Add status_mmr and status_index fields to bau_control and compute their
values during initialization. Add kerneldoc descriptions for the new
fields. Update uv*_wait_completion to use these fields rather than
receiving the information as parameters.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/uv/uv_bau.h | 10 +++++++--
arch/x86/platform/uv/tlb_uv.c | 46 +++++++++++++++++++---------------------
2 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 695b873..0ec7631 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -601,8 +601,12 @@ struct uvhub_desc {
struct socket_desc socket[2];
};

-/*
- * one per-cpu; to locate the software tables
+/**
+ * struct bau_control
+ * @status_mmr: location of status mmr, determined by uvhub_cpu
+ * @status_index: index of ERR|BUSY bits in status mmr, determined by uvhub_cpu
+ *
+ * Per-cpu control struct containing CPU topology information and BAU tuneables.
*/
struct bau_control {
struct bau_desc *descriptor_base;
@@ -620,6 +624,8 @@ struct bau_control {
int timeout_tries;
int ipi_attempts;
int conseccompletes;
+ u64 status_mmr;
+ int status_index;
bool nobau;
short baudisabled;
short cpu;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index e6994fd..13a7055 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -527,11 +527,12 @@ static unsigned long uv1_read_status(unsigned long mmr_offset, int right_shift)
* return COMPLETE, RETRY(PLUGGED or TIMEOUT) or GIVEUP
*/
static int uv1_wait_completion(struct bau_desc *bau_desc,
- unsigned long mmr_offset, int right_shift,
struct bau_control *bcp, long try)
{
unsigned long descriptor_status;
cycles_t ttm;
+ u64 mmr_offset = bcp->status_mmr;
+ int right_shift = bcp->status_index;
struct ptc_stats *stat = bcp->statp;

descriptor_status = uv1_read_status(mmr_offset, right_shift);
@@ -619,11 +620,12 @@ int handle_uv2_busy(struct bau_control *bcp)
}

static int uv2_3_wait_completion(struct bau_desc *bau_desc,
- unsigned long mmr_offset, int right_shift,
struct bau_control *bcp, long try)
{
unsigned long descriptor_stat;
cycles_t ttm;
+ u64 mmr_offset = bcp->status_mmr;
+ int right_shift = bcp->status_index;
int desc = bcp->uvhub_cpu;
long busy_reps = 0;
struct ptc_stats *stat = bcp->statp;
@@ -684,29 +686,12 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
return FLUSH_COMPLETE;
}

-/*
- * There are 2 status registers; each and array[32] of 2 bits. Set up for
- * which register to read and position in that register based on cpu in
- * current hub.
- */
static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
{
- int right_shift;
- unsigned long mmr_offset;
- int desc = bcp->uvhub_cpu;
-
- if (desc < UV_CPUS_PER_AS) {
- mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
- right_shift = desc * UV_ACT_STATUS_SIZE;
- } else {
- mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
- right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
- }
-
if (bcp->uvhub_version == UV_BAU_V1)
- return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+ return uv1_wait_completion(bau_desc, bcp, try);
else
- return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+ return uv2_3_wait_completion(bau_desc, bcp, try);
}

/*
@@ -2024,8 +2009,7 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
struct bau_control **smasterp,
struct bau_control **hmasterp)
{
- int i;
- int cpu;
+ int i, cpu, uvhub_cpu;
struct bau_control *bcp;

for (i = 0; i < sdp->num_cpus; i++) {
@@ -2054,7 +2038,21 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
return 1;
}
bcp->uvhub_master = *hmasterp;
- bcp->uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+ uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+ bcp->uvhub_cpu = uvhub_cpu;
+
+ /*
+ * The ERROR and BUSY status registers are located pairwise over
+ * the STATUS_0 and STATUS_1 mmrs; each an array[32] of 2 bits.
+ */
+ if (uvhub_cpu < UV_CPUS_PER_AS) {
+ bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
+ bcp->status_index = uvhub_cpu * UV_ACT_STATUS_SIZE;
+ } else {
+ bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
+ bcp->status_index = (uvhub_cpu - UV_CPUS_PER_AS)
+ * UV_ACT_STATUS_SIZE;
+ }

if (bcp->uvhub_cpu >= MAX_CPUS_PER_UVHUB) {
pr_emerg("%d cpus per uvhub invalid\n",

Subject: [tip:x86/platform] x86/platform/uv/BAU: Add wait_completion to bau_operations

Commit-ID: 2620bbbf1f4f187952fb35861f4473860c432728
Gitweb: http://git.kernel.org/tip/2620bbbf1f4f187952fb35861f4473860c432728
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:13 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:29 +0100

x86/platform/uv/BAU: Add wait_completion to bau_operations

Remove the present wait_completion routine and add a function pointer by
the same name to the bau_operations struct. Rather than switching on the
UV hub version during message processing, set the architecture-specific
uv*_wait_completion during initialization.

The uv123_bau_ops struct must be split into uv1 and uv2_3 versions to
accommodate the corresponding wait_completion routines.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/include/asm/uv/uv_bau.h | 2 ++
arch/x86/platform/uv/tlb_uv.c | 31 ++++++++++++++++++-------------
2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 0ec7631..7cac798 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -672,6 +672,8 @@ struct bau_operations {
void (*write_g_sw_ack)(int pnode, unsigned long mmr);
void (*write_payload_first)(int pnode, unsigned long mmr);
void (*write_payload_last)(int pnode, unsigned long mmr);
+ int (*wait_completion)(struct bau_desc*,
+ struct bau_control*, long try);
};

static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 13a7055..2a826dd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -686,14 +686,6 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
return FLUSH_COMPLETE;
}

-static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
-{
- if (bcp->uvhub_version == UV_BAU_V1)
- return uv1_wait_completion(bau_desc, bcp, try);
- else
- return uv2_3_wait_completion(bau_desc, bcp, try);
-}
-
/*
* Our retries are blocked by all destination sw ack resources being
* in use, and a timeout is pending. In that case hardware immediately
@@ -922,7 +914,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
write_mmr_activation(index);

try++;
- completion_stat = wait_completion(bau_desc, bcp, try);
+ completion_stat = ops.wait_completion(bau_desc, bcp, try);

handle_cmplt(completion_stat, bau_desc, bcp, hmaster, stat);

@@ -2135,7 +2127,18 @@ fail:
return 1;
}

-static const struct bau_operations uv123_bau_ops __initconst = {
+static const struct bau_operations uv1_bau_ops __initconst = {
+ .bau_gpa_to_offset = uv_gpa_to_offset,
+ .read_l_sw_ack = read_mmr_sw_ack,
+ .read_g_sw_ack = read_gmmr_sw_ack,
+ .write_l_sw_ack = write_mmr_sw_ack,
+ .write_g_sw_ack = write_gmmr_sw_ack,
+ .write_payload_first = write_mmr_payload_first,
+ .write_payload_last = write_mmr_payload_last,
+ .wait_completion = uv1_wait_completion,
+};
+
+static const struct bau_operations uv2_3_bau_ops __initconst = {
.bau_gpa_to_offset = uv_gpa_to_offset,
.read_l_sw_ack = read_mmr_sw_ack,
.read_g_sw_ack = read_gmmr_sw_ack,
@@ -2143,6 +2146,7 @@ static const struct bau_operations uv123_bau_ops __initconst = {
.write_g_sw_ack = write_gmmr_sw_ack,
.write_payload_first = write_mmr_payload_first,
.write_payload_last = write_mmr_payload_last,
+ .wait_completion = uv2_3_wait_completion,
};

static const struct bau_operations uv4_bau_ops __initconst = {
@@ -2153,6 +2157,7 @@ static const struct bau_operations uv4_bau_ops __initconst = {
.write_g_sw_ack = write_gmmr_proc_sw_ack,
.write_payload_first = write_mmr_proc_payload_first,
.write_payload_last = write_mmr_proc_payload_last,
+ .wait_completion = uv2_3_wait_completion,
};

/*
@@ -2174,11 +2179,11 @@ static int __init uv_bau_init(void)
if (is_uv4_hub())
ops = uv4_bau_ops;
else if (is_uv3_hub())
- ops = uv123_bau_ops;
+ ops = uv2_3_bau_ops;
else if (is_uv2_hub())
- ops = uv123_bau_ops;
+ ops = uv2_3_bau_ops;
else if (is_uv1_hub())
- ops = uv123_bau_ops;
+ ops = uv1_bau_ops;

for_each_possible_cpu(cur_cpu) {
mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);

Subject: [tip:x86/platform] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status

Commit-ID: 2f2a033fb5819c393d65da9b6233e095f3690f15
Gitweb: http://git.kernel.org/tip/2f2a033fb5819c393d65da9b6233e095f3690f15
Author: Andrew Banman <[email protected]>
AuthorDate: Thu, 9 Mar 2017 10:42:14 -0600
Committer: Thomas Gleixner <[email protected]>
CommitDate: Mon, 13 Mar 2017 14:26:29 +0100

x86/platform/uv/BAU: Implement uv4_wait_completion with read_status

UV4 does not employ a software-timeout as in previous generations so a new
wait_completion routine without this logic is required. Certain completion
statuses require the AUX status bit in addition to ERROR and BUSY.

Add the read_status routine to construct the full completion status. Use
read_status in the uv4_wait_completion routine to handle all possible
completion statuses.

Signed-off-by: Andrew Banman <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Mike Travis <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>

---
arch/x86/platform/uv/tlb_uv.c | 58 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 2a826dd..42e65fe 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -687,6 +687,62 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
}

/*
+ * Returns the status of current BAU message for cpu desc as a bit field
+ * [Error][Busy][Aux]
+ */
+static u64 read_status(u64 status_mmr, int index, int desc)
+{
+ u64 stat;
+
+ stat = ((read_lmmr(status_mmr) >> index) & UV_ACT_STATUS_MASK) << 1;
+ stat |= (read_lmmr(UVH_LB_BAU_SB_ACTIVATION_STATUS_2) >> desc) & 0x1;
+
+ return stat;
+}
+
+static int uv4_wait_completion(struct bau_desc *bau_desc,
+ struct bau_control *bcp, long try)
+{
+ struct ptc_stats *stat = bcp->statp;
+ u64 descriptor_stat;
+ u64 mmr = bcp->status_mmr;
+ int index = bcp->status_index;
+ int desc = bcp->uvhub_cpu;
+
+ descriptor_stat = read_status(mmr, index, desc);
+
+ /* spin on the status MMR, waiting for it to go idle */
+ while (descriptor_stat != UV2H_DESC_IDLE) {
+ switch (descriptor_stat) {
+ case UV2H_DESC_SOURCE_TIMEOUT:
+ stat->s_stimeout++;
+ return FLUSH_GIVEUP;
+
+ case UV2H_DESC_DEST_TIMEOUT:
+ stat->s_dtimeout++;
+ bcp->conseccompletes = 0;
+ return FLUSH_RETRY_TIMEOUT;
+
+ case UV2H_DESC_DEST_STRONG_NACK:
+ stat->s_plugged++;
+ bcp->conseccompletes = 0;
+ return FLUSH_RETRY_PLUGGED;
+
+ case UV2H_DESC_DEST_PUT_ERR:
+ bcp->conseccompletes = 0;
+ return FLUSH_GIVEUP;
+
+ default:
+ /* descriptor_stat is still BUSY */
+ cpu_relax();
+ }
+ descriptor_stat = read_status(mmr, index, desc);
+ }
+ bcp->conseccompletes++;
+ return FLUSH_COMPLETE;
+}
+
+/*
* Our retries are blocked by all destination sw ack resources being
* in use, and a timeout is pending. In that case hardware immediately
* returns the ERROR that looks like a destination timeout.
@@ -2157,7 +2213,7 @@ static const struct bau_operations uv4_bau_ops __initconst = {
.write_g_sw_ack = write_gmmr_proc_sw_ack,
.write_payload_first = write_mmr_proc_payload_first,
.write_payload_last = write_mmr_proc_payload_last,
- .wait_completion = uv2_3_wait_completion,
+ .wait_completion = uv4_wait_completion,
};

/*