2017-08-08 23:07:15

by Sukadev Bhattiprolu

Subject: [PATCH v6 00/17] powerpc/vas: Enable VAS

POWER9 introduces a hardware subsystem referred to as the Virtual
Accelerator Switchboard (VAS). VAS allows kernel subsystems and user
space processes to directly access the Nest Accelerator (NX) engines
which implement compression and encryption algorithms in the hardware.

NX has been in Power processors since Power7+, but access to the NX
engines was through the 'icswx' instruction which is only available
to the kernel/hypervisor. Starting with POWER9, access to the NX
engines is provided to both kernel and user space processes through
VAS.

The switchboard (i.e. VAS) multiplexes accesses between "receivers" and
"senders", where the "receivers" are typically the NX engines and the
"senders" are the kernel subsystems and user processes that wish to
access the receivers (NX engines). Once a sender is "connected" to
a receiver through the switchboard, the sender can submit compression/
encryption requests to the hardware using the new (PowerISA 3.0)
"copy" and "paste" instructions.

Senders can also send "empty" messages to the receiver. If the receiver
is executing a WAIT instruction, this empty message causes the receiver
to resume from the next instruction (i.e. it acts as a "wake up" message).
This usage of VAS is referred to as "Fast thread-wakeup".

Provides:

This patch set:
- configures the VAS subsystems in the hardware

- provides kernel interfaces to drivers like NX-842 and
NX-FTW (new) to open receive and send/receive windows
and to submit copy/paste (i.e. compression) requests to
the NX engines.

- implements an NX-FTW driver for the fast thread-wakeup
mechanism. It provides the /dev/crypto/nx-ftw device node,
and ioctls to allow users to use the FTW mechanism in VAS.

Follow-on patch set(s) will allow user space processes to submit
requests to the NX-GZIP engine (and possibly other engines).

Requires:

This patch set needs the corresponding VAS/NX skiboot patches, which
have been merged into the skiboot tree, i.e. skiboot must include:

commit 3b3c596 (NX: Add P9 NX support for 842 compression engine)

Testing:

In-kernel compression requests were tested on DD1 POWER9 hardware
using the following NX-842 patch set from Haren Myneni:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-July/160620.html

The ability to set up user space send/receive windows for FTW was
tested on DD1 hardware. The actual copy/paste of the empty messages
is not yet supported in hardware and that functionality was tested
on DD2 simics software.

Git Tree:

https://github.com/sukadev/linux/

Branch: vas-kern-v6

Thanks to input from Ben Herrenschmidt, Michael Neuling, Michael Ellerman,
Robert Blackmore and Haren Myneni.

Changelog[v6]
- Add support for user space send/receive FTW windows
- Add a new, NX-FTW driver which provides the FTW user interface

Changelog[v5]
- [Ben Herrenschmidt] Make VAS a platform device in the device tree
and use the core platform functions to parse the VAS properties.
Map the VAS MMIO regions as non-cacheable and paste regions as
cacheable. Use CONFIG_PPC_VAS rather than CONFIG_VAS; don't assume
VAS ids are sequential.
- Copy the FIFO address as is into LFIFO_BAR (don't shift it).

Changelog[v4]
Comments from Michael Neuling:
- Move VAS code from drivers/misc/vas to arch/powerpc/platforms/powernv
since VAS only provides interfaces to other drivers like NX-842.
- Drop vas-internal.h and use vas.h in separate dirs for VAS
internal, kernel API and user API
- Rather than create 6 separate device tree properties for windows
and window contexts, combine them into 6 "reg" properties.
- Drop vas_window_reset() since windows are reset/cleared before
being assigned to kernel/users.
- Use ilog2() and radix_enabled() helpers

Changelog[v3]
- Rebase to v4.11-rc1
- Add interfaces to initialize send/receive window attributes to
defaults that drivers can use (see arch/powerpc/include/asm/vas.h)
- Modify interface vas_paste() to return 0 or error code
- Fix a bug in setting Translation Control Mode (0b11 not 0x11)
- Enable send-window-credit checking
- Reorg code in vas_win_close()
- Minor reorgs and tweaks to register field settings to make it
easier to add support for user space windows.
- Skip writing to read-only registers
- Start window indexing from 0 rather than 1

Changelog[v2]
- Use vas-id, HVWC, UWC and paste address entries from device tree
rather than defining/computing them in kernel and reorg code.

Sukadev Bhattiprolu (17):
powerpc/vas: Define macros, register fields and structures
Move GET_FIELD/SET_FIELD to vas.h
powerpc/vas: Define vas_init() and vas_exit()
powerpc/vas: Define helpers to access MMIO regions
powerpc/vas: Define helpers to init window context
powerpc/vas: Define helpers to alloc/free windows
powerpc/vas: Define vas_win_paste_addr()
powerpc/vas: Define vas_win_id()
powerpc/vas: Define vas_rx_win_open() interface
powerpc/vas: Define vas_rx_win_open() interface
powerpc/vas: Define vas_win_close() interface
powerpc/vas: Define vas_tx_win_open()
powerpc/vas: Define copy/paste interfaces
powerpc: Add support for setting SPRN_TIDR
powerpc/vas: Define window open ioctls API
powerpc/vas: Implement a simple FTW driver
VAS: Document FTW API/usage

.../devicetree/bindings/powerpc/ibm,vas.txt | 24 +
Documentation/powerpc/ftw-api.txt | 373 ++++++
MAINTAINERS | 20 +
arch/powerpc/include/asm/processor.h | 4 +
arch/powerpc/include/asm/vas.h | 156 +++
arch/powerpc/include/uapi/asm/vas.h | 63 +
arch/powerpc/kernel/process.c | 74 ++
arch/powerpc/platforms/powernv/Kconfig | 30 +
arch/powerpc/platforms/powernv/Makefile | 2 +
arch/powerpc/platforms/powernv/copy-paste.h | 74 ++
arch/powerpc/platforms/powernv/nx-ftw.c | 486 ++++++++
arch/powerpc/platforms/powernv/vas-window.c | 1233 ++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.c | 183 +++
arch/powerpc/platforms/powernv/vas.h | 500 ++++++++
drivers/crypto/nx/nx-842-powernv.c | 7 +-
drivers/crypto/nx/nx-842.h | 5 -
16 files changed, 3226 insertions(+), 8 deletions(-)
create mode 100644 Documentation/devicetree/bindings/powerpc/ibm,vas.txt
create mode 100644 Documentation/powerpc/ftw-api.txt
create mode 100644 arch/powerpc/include/asm/vas.h
create mode 100644 arch/powerpc/include/uapi/asm/vas.h
create mode 100644 arch/powerpc/platforms/powernv/copy-paste.h
create mode 100644 arch/powerpc/platforms/powernv/nx-ftw.c
create mode 100644 arch/powerpc/platforms/powernv/vas-window.c
create mode 100644 arch/powerpc/platforms/powernv/vas.c
create mode 100644 arch/powerpc/platforms/powernv/vas.h

--
2.7.4


2017-08-08 23:07:20

by Sukadev Bhattiprolu

Subject: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

Define macros for the VAS hardware registers and bit-fields as well
as a couple of data structures needed by the VAS driver.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v6]
- Add some fields for FTW windows

Changelog[v4]
- [Michael Neuling] Move VAS code to arch/powerpc; Reorg vas.h and
vas-internal.h to kernel and uapi versions; rather than creating
separate properties for window context/address entries in device
tree, combine them into "reg" properties; drop ->hwirq and irq_port
fields from vas_window as they are only needed with user space
windows.
- Drop the error check for CONFIG_PPC_4K_PAGES. Instead in a
follow-on patch add a "depends on CONFIG_PPC_64K_PAGES".

Changelog[v3]
- Rename winctx->pid to winctx->pidr to reflect that it's a value
from the PID register (SPRN_PID), not the linux process id.
- Make it easier to split header into kernel/user parts
- To keep user interface simple, use macros rather than enum for
the threshold-control modes.
- Add a pid field to struct vas_window - needed for user space
send windows.

Changelog[v2]
- Add an overview of VAS in vas-internal.h
- Get window context parameters from device tree and drop
unnecessary macros.
---
arch/powerpc/include/asm/vas.h | 35 ++++
arch/powerpc/include/uapi/asm/vas.h | 25 +++
arch/powerpc/platforms/powernv/vas.h | 382 +++++++++++++++++++++++++++++++++++
3 files changed, 442 insertions(+)
create mode 100644 arch/powerpc/include/asm/vas.h
create mode 100644 arch/powerpc/include/uapi/asm/vas.h
create mode 100644 arch/powerpc/platforms/powernv/vas.h
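
For illustration only (not part of this patch): the register fields in
vas.h are described with PPC_BIT()/PPC_BITMASK(), which use IBM bit
numbering, where bit 0 is the most significant bit of the 64-bit
register. The sketch below is a user-space approximation. The two macros
mirror the kernel definitions, but get_field(), set_field() and
example_winctl() are hypothetical helpers written for this note, and
__builtin_ctzll() assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the MSB of a 64-bit word. */
#define PPC_BIT(bit)		(1ULL << (63 - (bit)))
#define PPC_BITMASK(bs, be)	((PPC_BIT(bs) - PPC_BIT(be)) | PPC_BIT(bs))

/* Field definitions copied from the patch. */
#define VAS_TX_WCRED			PPC_BITMASK(4, 15)
#define VAS_WINCTL_OPEN			PPC_BIT(0)
#define VAS_WINCTL_THRESH_CTL		PPC_BITMASK(8, 9)
#define VAS_THRESH_FIFO_GT_QTR_FULL	2

/* Hypothetical helper: position of the lowest set bit of a mask. */
static inline unsigned int mask_shift(uint64_t mask)
{
	assert(mask != 0);
	return (unsigned int)__builtin_ctzll(mask);
}

/* Hypothetical helper: extract the field selected by mask from reg. */
static inline uint64_t get_field(uint64_t mask, uint64_t reg)
{
	return (reg & mask) >> mask_shift(mask);
}

/* Hypothetical helper: return reg with the field under mask set to val. */
static inline uint64_t set_field(uint64_t mask, uint64_t reg, uint64_t val)
{
	return (reg & ~mask) | ((val << mask_shift(mask)) & mask);
}

/* Build a window control value: window open, threshold mode "> 1/4 full". */
uint64_t example_winctl(void)
{
	uint64_t val = 0;

	val = set_field(VAS_WINCTL_OPEN, val, 1);
	val = set_field(VAS_WINCTL_THRESH_CTL, val,
			VAS_THRESH_FIFO_GT_QTR_FULL);
	return val;
}
```

With these definitions VAS_TX_WCRED covers 12 bits, which agrees with
the "4K-1" send-credit limit noted in the header comment.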

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
new file mode 100644
index 0000000..2c8558a
--- /dev/null
+++ b/arch/powerpc/include/asm/vas.h
@@ -0,0 +1,35 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _MISC_VAS_H
+#define _MISC_VAS_H
+
+#include <uapi/asm/vas.h>
+
+/*
+ * Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
+ * (Local FIFO Size Register) of the VAS workbook.
+ */
+#define VAS_RX_FIFO_SIZE_MIN (1 << 10) /* 1KB */
+#define VAS_RX_FIFO_SIZE_MAX (8 << 20) /* 8MB */
+
+/*
+ * Co-processor Engine type.
+ */
+enum vas_cop_type {
+ VAS_COP_TYPE_FAULT,
+ VAS_COP_TYPE_842,
+ VAS_COP_TYPE_842_HIPRI,
+ VAS_COP_TYPE_GZIP,
+ VAS_COP_TYPE_GZIP_HIPRI,
+ VAS_COP_TYPE_FTW,
+ VAS_COP_TYPE_MAX,
+};
+
+#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/include/uapi/asm/vas.h b/arch/powerpc/include/uapi/asm/vas.h
new file mode 100644
index 0000000..ddfe046
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/vas.h
@@ -0,0 +1,25 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _UAPI_MISC_VAS_H
+#define _UAPI_MISC_VAS_H
+
+/*
+ * Threshold Control Mode: Have paste operation fail if the number of
+ * requests in receive FIFO exceeds a threshold.
+ *
+ * NOTE: No special error code yet if paste is rejected because of these
+ * limits. So users can't distinguish between this and other errors.
+ */
+#define VAS_THRESH_DISABLED 0
+#define VAS_THRESH_FIFO_GT_HALF_FULL 1
+#define VAS_THRESH_FIFO_GT_QTR_FULL 2
+#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
+
+#endif /* _UAPI_MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
new file mode 100644
index 0000000..312a378
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -0,0 +1,382 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _VAS_H
+#define _VAS_H
+#include <linux/atomic.h>
+#include <linux/idr.h>
+#include <asm/vas.h>
+
+/*
+ * Overview of Virtual Accelerator Switchboard (VAS).
+ *
+ * VAS is a hardware "switchboard" that allows senders and receivers to
+ * exchange messages with _minimal_ kernel involvement. The receivers are
+ * typically NX coprocessor engines that perform compression or encryption
+ * in hardware, but receivers can also be other software threads.
+ *
+ * Senders are user/kernel threads that submit compression/encryption or
+ * other requests to the receivers. Senders must format their messages as
+ * Coprocessor Request Blocks (CRB)s and submit them using the "copy" and
+ * "paste" instructions which were introduced in Power9.
+ *
+ * A Power node can have (up to?) 8 Power chips. There is one instance of
+ * VAS in each Power9 chip. Each instance of VAS has 64K windows or ports.
+ * Senders and receivers must each connect to a separate window before they
+ * can exchange messages through the switchboard.
+ *
+ * Each window is described by two types of window contexts:
+ *
+ * Hypervisor Window Context (HVWC) of size VAS_HVWC_SIZE bytes
+ *
+ * OS/User Window Context (UWC) of size VAS_UWC_SIZE bytes.
+ *
+ * A window context can be viewed as a set of 64-bit registers. The settings
+ * in these registers configure/control/determine the behavior of the VAS
+ * hardware when messages are sent/received through the window. The registers
+ * in the HVWC are configured by the kernel while the registers in the UWC can
+ * be configured by the kernel or by the user space application that is using
+ * the window.
+ *
+ * The HVWCs for all windows on a specific instance of VAS are in a contiguous
+ * range of hardware addresses or Base address region (BAR) referred to as the
+ * HVWC BAR for the instance. Similarly the UWCs for all windows on an instance
+ * are referred to as the UWC BAR for the instance.
+ *
+ * The two BARs for each instance are defined in the Power9 MMIO Ranges spreadsheet
+ * and available to the kernel in the VAS node's "reg" property in the device
+ * tree:
+ *
+ * /proc/device-tree/vasm@.../reg
+ *
+ * (see vas_probe() for details on the reg property).
+ *
+ * The kernel maps the HVWC and UWC BAR regions into the kernel address
+ * space (hvwc_map and uwc_map). The kernel can then access the window
+ * contexts of a specific window using:
+ *
+ * hvwc = hvwc_map + winid * VAS_HVWC_SIZE.
+ * uwc = uwc_map + winid * VAS_UWC_SIZE.
+ *
+ * where winid is the window index (0..64K-1).
+ *
+ * As mentioned, a window context is used to "configure" a window. Besides
+ * this configuration address, each _send_ window also has a unique hardware
+ * "paste" address that is used to submit requests/CRBs (see vas_paste_crb()).
+ *
+ * The hardware paste address for a window is computed using the "paste
+ * base address" and "paste win id shift" reg properties in the VAS device
+ * tree node using:
+ *
+ * paste_addr = paste_base + ((winid << paste_win_id_shift))
+ *
+ * (again, see vas_probe() for ->paste_base_addr and ->paste_win_id_shift).
+ *
+ * The kernel maps this hardware address into the sender's address space
+ * after which they can use the 'paste' instruction (new in Power9) to
+ * send a message (submit a request aka CRB) to the coprocessor.
+ *
+ * NOTE: In the initial version, senders can only be in-kernel drivers/threads.
+ * Support for user space threads will be added in follow-on patches.
+ *
+ * TODO: Do we need to map the UWC into user address space so they can return
+ * credits? It's N/A for NX but may be needed for other receive windows.
+ *
+ */
+
+#define VAS_WINDOWS_PER_CHIP (64 << 10)
+
+/*
+ * Hypervisor and OS/User Window Context sizes
+ */
+#define VAS_HVWC_SIZE 512
+#define VAS_UWC_SIZE PAGE_SIZE
+
+/*
+ * Initial per-process credits.
+ * Max send window credits: 4K-1 (12-bits in VAS_TX_WCRED)
+ * Max receive window credits: 64K-1 (16 bits in VAS_LRX_WCRED)
+ *
+ * TODO: Needs tuning for per-process credits
+ */
+#define VAS_WCREDS_MIN 16
+#define VAS_WCREDS_MAX ((64 << 10) - 1)
+#define VAS_WCREDS_DEFAULT (1 << 10)
+
+/*
+ * VAS Window Context Register Offsets and bitmasks.
+ * See Section 3.1.4 of VAS Work book
+ */
+#define VAS_LPID_OFFSET 0x010
+#define VAS_LPID PPC_BITMASK(0, 11)
+
+#define VAS_PID_OFFSET 0x018
+#define VAS_PID_ID PPC_BITMASK(0, 19)
+
+#define VAS_XLATE_MSR_OFFSET 0x020
+#define VAS_XLATE_MSR_DR PPC_BIT(0)
+#define VAS_XLATE_MSR_TA PPC_BIT(1)
+#define VAS_XLATE_MSR_PR PPC_BIT(2)
+#define VAS_XLATE_MSR_US PPC_BIT(3)
+#define VAS_XLATE_MSR_HV PPC_BIT(4)
+#define VAS_XLATE_MSR_SF PPC_BIT(5)
+
+#define VAS_XLATE_LPCR_OFFSET 0x028
+#define VAS_XLATE_LPCR_PAGE_SIZE PPC_BITMASK(0, 2)
+#define VAS_XLATE_LPCR_ISL PPC_BIT(3)
+#define VAS_XLATE_LPCR_TC PPC_BIT(4)
+#define VAS_XLATE_LPCR_SC PPC_BIT(5)
+
+#define VAS_XLATE_CTL_OFFSET 0x030
+#define VAS_XLATE_MODE PPC_BITMASK(0, 1)
+
+#define VAS_AMR_OFFSET 0x040
+#define VAS_AMR PPC_BITMASK(0, 63)
+
+#define VAS_SEIDR_OFFSET 0x048
+#define VAS_SEIDR PPC_BITMASK(0, 63)
+
+#define VAS_FAULT_TX_WIN_OFFSET 0x050
+#define VAS_FAULT_TX_WIN PPC_BITMASK(48, 63)
+
+#define VAS_OSU_INTR_SRC_RA_OFFSET 0x060
+#define VAS_OSU_INTR_SRC_RA PPC_BITMASK(8, 63)
+
+#define VAS_HV_INTR_SRC_RA_OFFSET 0x070
+#define VAS_HV_INTR_SRC_RA PPC_BITMASK(8, 63)
+
+#define VAS_PSWID_OFFSET 0x078
+#define VAS_PSWID_EA_HANDLE PPC_BITMASK(0, 31)
+
+#define VAS_SPARE1_OFFSET 0x080
+#define VAS_SPARE2_OFFSET 0x088
+#define VAS_SPARE3_OFFSET 0x090
+#define VAS_SPARE4_OFFSET 0x130
+#define VAS_SPARE5_OFFSET 0x160
+#define VAS_SPARE6_OFFSET 0x188
+
+#define VAS_LFIFO_BAR_OFFSET 0x0A0
+#define VAS_LFIFO_BAR PPC_BITMASK(8, 53)
+#define VAS_PAGE_MIGRATION_SELECT PPC_BITMASK(54, 56)
+
+#define VAS_LDATA_STAMP_CTL_OFFSET 0x0A8
+#define VAS_LDATA_STAMP PPC_BITMASK(0, 1)
+#define VAS_XTRA_WRITE PPC_BIT(2)
+
+#define VAS_LDMA_CACHE_CTL_OFFSET 0x0B0
+#define VAS_LDMA_TYPE PPC_BITMASK(0, 1)
+#define VAS_LDMA_FIFO_DISABLE PPC_BIT(2)
+
+#define VAS_LRFIFO_PUSH_OFFSET 0x0B8
+#define VAS_LRFIFO_PUSH PPC_BITMASK(0, 15)
+
+#define VAS_CURR_MSG_COUNT_OFFSET 0x0C0
+#define VAS_CURR_MSG_COUNT PPC_BITMASK(0, 7)
+
+#define VAS_LNOTIFY_AFTER_COUNT_OFFSET 0x0C8
+#define VAS_LNOTIFY_AFTER_COUNT PPC_BITMASK(0, 7)
+
+#define VAS_LRX_WCRED_OFFSET 0x0E0
+#define VAS_LRX_WCRED PPC_BITMASK(0, 15)
+
+#define VAS_LRX_WCRED_ADDER_OFFSET 0x190
+#define VAS_LRX_WCRED_ADDER PPC_BITMASK(0, 15)
+
+#define VAS_TX_WCRED_OFFSET 0x0F0
+#define VAS_TX_WCRED PPC_BITMASK(4, 15)
+
+#define VAS_TX_WCRED_ADDER_OFFSET 0x1A0
+#define VAS_TX_WCRED_ADDER PPC_BITMASK(4, 15)
+
+#define VAS_LFIFO_SIZE_OFFSET 0x100
+#define VAS_LFIFO_SIZE PPC_BITMASK(0, 3)
+
+#define VAS_WINCTL_OFFSET 0x108
+#define VAS_WINCTL_OPEN PPC_BIT(0)
+#define VAS_WINCTL_REJ_NO_CREDIT PPC_BIT(1)
+#define VAS_WINCTL_PIN PPC_BIT(2)
+#define VAS_WINCTL_TX_WCRED_MODE PPC_BIT(3)
+#define VAS_WINCTL_RX_WCRED_MODE PPC_BIT(4)
+#define VAS_WINCTL_TX_WORD_MODE PPC_BIT(5)
+#define VAS_WINCTL_RX_WORD_MODE PPC_BIT(6)
+#define VAS_WINCTL_RSVD_TXBUF PPC_BIT(7)
+#define VAS_WINCTL_THRESH_CTL PPC_BITMASK(8, 9)
+#define VAS_WINCTL_FAULT_WIN PPC_BIT(10)
+#define VAS_WINCTL_NX_WIN PPC_BIT(11)
+
+#define VAS_WIN_STATUS_OFFSET 0x110
+#define VAS_WIN_BUSY PPC_BIT(1)
+
+#define VAS_WIN_CTX_CACHING_CTL_OFFSET 0x118
+#define VAS_CASTOUT_REQ PPC_BIT(0)
+#define VAS_PUSH_TO_MEM PPC_BIT(1)
+#define VAS_WIN_CACHE_STATUS PPC_BIT(4)
+
+#define VAS_TX_RSVD_BUF_COUNT_OFFSET 0x120
+#define VAS_RXVD_BUF_COUNT PPC_BITMASK(58, 63)
+
+#define VAS_LRFIFO_WIN_PTR_OFFSET 0x128
+#define VAS_LRX_WIN_ID PPC_BITMASK(0, 15)
+
+/*
+ * Local Notification Control Register controls what happens in _response_
+ * to a paste command and hence applies only to receive windows.
+ */
+#define VAS_LNOTIFY_CTL_OFFSET 0x138
+#define VAS_NOTIFY_DISABLE PPC_BIT(0)
+#define VAS_INTR_DISABLE PPC_BIT(1)
+#define VAS_NOTIFY_EARLY PPC_BIT(2)
+#define VAS_NOTIFY_OSU_INTR PPC_BIT(3)
+
+#define VAS_LNOTIFY_PID_OFFSET 0x140
+#define VAS_LNOTIFY_PID PPC_BITMASK(0, 19)
+
+#define VAS_LNOTIFY_LPID_OFFSET 0x148
+#define VAS_LNOTIFY_LPID PPC_BITMASK(0, 11)
+
+#define VAS_LNOTIFY_TID_OFFSET 0x150
+#define VAS_LNOTIFY_TID PPC_BITMASK(0, 15)
+
+#define VAS_LNOTIFY_SCOPE_OFFSET 0x158
+#define VAS_LNOTIFY_MIN_SCOPE PPC_BITMASK(0, 1)
+#define VAS_LNOTIFY_MAX_SCOPE PPC_BITMASK(2, 3)
+
+#define VAS_NX_UTIL_OFFSET 0x1B0
+#define VAS_NX_UTIL PPC_BITMASK(0, 63)
+
+/* SE: Side effects */
+#define VAS_NX_UTIL_SE_OFFSET 0x1B8
+#define VAS_NX_UTIL_SE PPC_BITMASK(0, 63)
+
+#define VAS_NX_UTIL_ADDER_OFFSET 0x180
+#define VAS_NX_UTIL_ADDER PPC_BITMASK(32, 63)
+
+/*
+ * Local Notify Scope Control Register. (Receive windows only).
+ */
+enum vas_notify_scope {
+ VAS_SCOPE_LOCAL,
+ VAS_SCOPE_GROUP,
+ VAS_SCOPE_VECTORED_GROUP,
+ VAS_SCOPE_UNUSED,
+};
+
+/*
+ * Local DMA Cache Control Register (Receive windows only).
+ */
+enum vas_dma_type {
+ VAS_DMA_TYPE_INJECT,
+ VAS_DMA_TYPE_WRITE,
+};
+
+/*
+ * Local Notify After Count Register. (Receive windows only).
+ * Not applicable to NX receive windows.
+ */
+enum vas_notify_after_count {
+ VAS_NOTIFY_AFTER_256 = 0,
+ VAS_NOTIFY_NONE,
+ VAS_NOTIFY_AFTER_2
+};
+
+/*
+ * One per instance of VAS. Each instance will have a separate set of
+ * receive windows, one per coprocessor type.
+ */
+struct vas_instance {
+ int vas_id;
+ struct ida ida;
+
+ u64 hvwc_bar_start;
+ u64 hvwc_bar_len;
+ u64 uwc_bar_start;
+ u64 uwc_bar_len;
+ u64 win_base_addr;
+ u64 win_id_shift;
+
+ struct mutex mutex;
+ struct vas_window *rxwin[VAS_COP_TYPE_MAX];
+ struct vas_window *windows[VAS_WINDOWS_PER_CHIP];
+};
+
+/*
+ * In-kernel state of a VAS window. One per window.
+ */
+struct vas_window {
+ /* Fields common to send and receive windows */
+ struct vas_instance *vinst;
+ int winid;
+ bool tx_win; /* True if send window */
+ bool nx_win; /* True if NX window */
+ bool user_win; /* True if user space window */
+ void *hvwc_map; /* HV window context */
+ void *uwc_map; /* OS/User window context */
+ pid_t pid; /* Linux process id of owner */
+
+ /* Fields applicable only to send windows */
+ void *paste_kaddr;
+ char *paste_addr_name;
+ struct vas_window *rxwin;
+
+ /* Fields applicable only to receive windows */
+ enum vas_cop_type cop;
+ atomic_t num_txwins;
+};
+
+/*
+ * Container for the hardware state of a window. One per-window.
+ *
+ * A VAS Window context is a 512-byte area in the hardware that contains
+ * a set of 64-bit registers. Individual bit-fields in these registers
+ * determine the configuration/operation of the hardware. struct vas_winctx
+ * is a container for the register fields in the window context.
+ */
+struct vas_winctx {
+ void *rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+ int rsvd_txbuf_count;
+
+ bool user_win;
+ bool nx_win;
+ bool fault_win;
+ bool rsvd_txbuf_enable;
+ bool pin_win;
+ bool rej_no_credit;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_word_mode;
+ bool rx_word_mode;
+ bool data_stamp;
+ bool xtra_write;
+ bool notify_disable;
+ bool intr_disable;
+ bool fifo_disable;
+ bool notify_early;
+ bool notify_os_intr_reg;
+
+ int lpid;
+ int pidr; /* value from SPRN_PID, not linux pid */
+ int lnotify_lpid;
+ int lnotify_pid;
+ int lnotify_tid;
+ uint32_t pswid;
+ int rx_win_id;
+ int fault_win_id;
+ int tc_mode;
+
+ uint64_t irq_port;
+
+ enum vas_dma_type dma_type;
+ enum vas_notify_scope min_scope;
+ enum vas_notify_scope max_scope;
+ enum vas_notify_after_count notify_after_count;
+};
+
+#endif /* _VAS_H */
--
2.7.4
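
For illustration only (not part of this patch): the overview comment in
vas.h gives the per-window address arithmetic for the two window
contexts and the paste address. A minimal user-space sketch of that
arithmetic follows. The function names are hypothetical, VAS_UWC_SIZE is
fixed at 64K on the assumption of 64K pages, and the addresses are plain
integers rather than mapped pointers.

```c
#include <assert.h>
#include <stdint.h>

#define VAS_WINDOWS_PER_CHIP	(64 << 10)
#define VAS_HVWC_SIZE		512
#define VAS_UWC_SIZE		(64 << 10)	/* PAGE_SIZE, assuming 64K pages */

/* hvwc = hvwc_map + winid * VAS_HVWC_SIZE */
uint64_t hvwc_addr(uint64_t hvwc_base, uint32_t winid)
{
	assert(winid < VAS_WINDOWS_PER_CHIP);
	return hvwc_base + (uint64_t)winid * VAS_HVWC_SIZE;
}

/* uwc = uwc_map + winid * VAS_UWC_SIZE */
uint64_t uwc_addr(uint64_t uwc_base, uint32_t winid)
{
	assert(winid < VAS_WINDOWS_PER_CHIP);
	return uwc_base + (uint64_t)winid * VAS_UWC_SIZE;
}

/* paste_addr = paste_base + (winid << paste_win_id_shift) */
uint64_t paste_addr(uint64_t paste_base, unsigned int shift, uint32_t winid)
{
	assert(winid < VAS_WINDOWS_PER_CHIP);
	assert(shift < 48);	/* keep the 16-bit window id inside 64 bits */
	return paste_base + ((uint64_t)winid << shift);
}
```

As a usage example with made-up inputs, window 3 with an HVWC base of
0x6019100000000 lands at 0x6019100000600, and with a paste base of
0x8000000000000 and a shift of 16 its paste address is 0x8000000030000.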

2017-08-08 23:07:27

by Sukadev Bhattiprolu

Subject: [PATCH v6 03/17] powerpc/vas: Define vas_init() and vas_exit()

Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.

VAS is currently only supported with 64K page size.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v5]:
- [Ben Herrenschmidt]: Create and use platform device tree nodes,
fix up the "reg" properties for the VAS DT node and use the
platform device helpers to parse the reg properties; Use linked
list of VAS instances (don't assume vasids are sequential);
Use CONFIG_PPC_VAS instead of CONFIG_VAS.

Changelog[v4]:
- [Michael Neuling] Fix some accidental deletions; fix help text
in Kconfig; change vas_initialized to a function; move from
drivers/misc to arch/powerpc/kernel
- Drop the vas_window_reset() interface. It is not needed as
window will be initialized before each use.
- Add a "depends on PPC_64K_PAGES"

Changelog[v3]:
- Zero vas_instances memory on allocation
- [Haren Myneni] Fix description in Kconfig

Changelog[v2]:
- Get HVWC, UWC and window address parameters from device tree.
---
.../devicetree/bindings/powerpc/ibm,vas.txt | 24 +++
MAINTAINERS | 18 ++
arch/powerpc/platforms/powernv/Kconfig | 14 ++
arch/powerpc/platforms/powernv/Makefile | 1 +
arch/powerpc/platforms/powernv/vas-window.c | 19 +++
arch/powerpc/platforms/powernv/vas.c | 183 +++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 10 +-
7 files changed, 267 insertions(+), 2 deletions(-)
create mode 100644 Documentation/devicetree/bindings/powerpc/ibm,vas.txt
create mode 100644 arch/powerpc/platforms/powernv/vas-window.c
create mode 100644 arch/powerpc/platforms/powernv/vas.c
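
For illustration only (not part of this patch): since v5 the instances
are kept on a list and looked up by "ibm,vas-id" rather than indexed by
id, because the ids need not be sequential. The sketch below shows the
same idea with a plain singly linked list in user space. The struct and
function names are hypothetical stand-ins, not the kernel's list_head
based code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vas_instance. */
struct vas_inst {
	int vas_id;
	struct vas_inst *next;
};

/* Push an instance on the front of the list and return the new head. */
struct vas_inst *vas_inst_add(struct vas_inst *head, struct vas_inst *inst)
{
	assert(inst != NULL);
	inst->next = head;
	return inst;
}

/* Walk the list and return the instance with a matching id, or NULL. */
struct vas_inst *vas_inst_find(struct vas_inst *head, int vasid)
{
	for (struct vas_inst *v = head; v != NULL; v = v->next) {
		if (v->vas_id == vasid)
			return v;
	}
	return NULL;
}
```

A lookup by id works for any set of ids, for example 1, 4 and 9, whereas
an array indexed by id would need to be sized for the largest id.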

diff --git a/Documentation/devicetree/bindings/powerpc/ibm,vas.txt b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
new file mode 100644
index 0000000..8468a3a
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
@@ -0,0 +1,24 @@
+* IBM Powerpc Virtual Accelerator Switchboard (VAS)
+
+VAS is a hardware mechanism that allows kernel subsystems and user processes
+to directly submit compression and other requests to Nest accelerators (NX)
+or other coprocessor functions.
+
+Required properties:
+- compatible : should be "ibm,vas" or "ibm,power9-vas"
+- ibm,vas-id : A unique identifier for each instance of VAS in the system
+- reg : Should contain 4 pairs of 64-bit fields specifying the Hypervisor
+ window context start and length, OS/User window context start and length,
+ "Paste address" start and length, "Paste window id" start bit and number
+ of bits
+- name : "vas"
+
+Example:
+
+ vas@6019100000000 {
+ compatible = "ibm,vas", "ibm,power9-vas";
+ reg = <0x6019100000000 0x2000000 0x6019000000000 0x100000000 0x8000000000000 0x100000000 0x20 0x10>;
+ name = "vas";
+ ibm,vas-id = <0x1>;
+ };
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 3c41902..edc58c9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6425,6 +6425,24 @@ F: drivers/crypto/nx/nx.*
F: drivers/crypto/nx/nx_csbcpb.h
F: drivers/crypto/nx/nx_debugfs.h

+IBM Power Virtual Accelerator Switchboard
+M: Sukadev Bhattiprolu
+L: [email protected]
+S: Supported
+F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/include/asm/vas.h
+F: arch/powerpc/include/uapi/asm/vas.h
+
+IBM Power 842 compression accelerator
+M: Haren Myneni <[email protected]>
+S: Supported
+F: drivers/crypto/nx/Makefile
+F: drivers/crypto/nx/Kconfig
+F: drivers/crypto/nx/nx-842*
+F: include/linux/sw842.h
+F: crypto/842.c
+F: lib/842/
+
IBM Power Linux RAID adapter
M: Brian King <[email protected]>
S: Supported
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 6a6f4ef..f565454 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -30,3 +30,17 @@ config OPAL_PRD
help
This enables the opal-prd driver, a facility to run processor
recovery diagnostics on OpenPower machines
+
+config PPC_VAS
+ bool "IBM Virtual Accelerator Switchboard (VAS)"
+ depends on PPC_POWERNV && PPC_64K_PAGES
+ default n
+ help
+ This enables support for IBM Virtual Accelerator Switchboard (VAS).
+
+ VAS allows accelerators in co-processors like NX-GZIP and NX-842
+ to be accessible to kernel subsystems and user processes.
+
+ VAS adapters are found in POWER9 based systems.
+
+ If unsure, say N.
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index b5d98cb..e4db292 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_PPC_SCOM) += opal-xscom.o
obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o
obj-$(CONFIG_TRACEPOINTS) += opal-tracepoints.o
obj-$(CONFIG_OPAL_PRD) += opal-prd.o
+obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
new file mode 100644
index 0000000..6156fbe
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -0,0 +1,19 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/types.h>
+#include <linux/mutex.h>
+
+#include "vas.h"
+
+/* stub for now */
+int vas_win_close(struct vas_window *window)
+{
+ return -1;
+}
diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
new file mode 100644
index 0000000..556156b
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -0,0 +1,183 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+#include <linux/of_platform.h>
+#include <linux/of_address.h>
+#include <linux/of.h>
+
+#include "vas.h"
+
+static bool init_done;
+LIST_HEAD(vas_instances);
+
+static int init_vas_instance(struct platform_device *pdev)
+{
+ int rc, vasid;
+ struct vas_instance *vinst;
+ struct device_node *dn = pdev->dev.of_node;
+ struct resource *res;
+
+ rc = of_property_read_u32(dn, "ibm,vas-id", &vasid);
+ if (rc) {
+ pr_err("VAS: No ibm,vas-id property for %s?\n", pdev->name);
+ return -ENODEV;
+ }
+
+ if (pdev->num_resources != 4) {
+ pr_err("VAS: Unexpected DT configuration for [%s, %d]\n",
+ pdev->name, vasid);
+ return -ENODEV;
+ }
+
+ vinst = kcalloc(1, sizeof(*vinst), GFP_KERNEL);
+ if (!vinst)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&vinst->node);
+ ida_init(&vinst->ida);
+ mutex_init(&vinst->mutex);
+ vinst->vas_id = vasid;
+ vinst->pdev = pdev;
+
+ res = &pdev->resource[0];
+ vinst->hvwc_bar_start = res->start;
+ vinst->hvwc_bar_len = res->end - res->start + 1;
+
+ res = &pdev->resource[1];
+ vinst->uwc_bar_start = res->start;
+ vinst->uwc_bar_len = res->end - res->start + 1;
+
+ res = &pdev->resource[2];
+ vinst->paste_base_addr = res->start;
+
+ res = &pdev->resource[3];
+ vinst->paste_win_id_shift = 63 - res->end;
+
+ pr_devel("VAS: Initialized instance [%s, %d], paste_base 0x%llx, "
+ "paste_win_id_shift 0x%llx\n", pdev->name, vasid,
+ vinst->paste_base_addr, vinst->paste_win_id_shift);
+
+ vinst->ready = true;
+ list_add(&vinst->node, &vas_instances);
+
+ dev_set_drvdata(&pdev->dev, vinst);
+
+ return 0;
+}
+
+/*
+ * Although this is read/used multiple times, it is written to only
+ * during initialization.
+ */
+struct vas_instance *find_vas_instance(int vasid)
+{
+ struct list_head *ent;
+ struct vas_instance *vinst;
+
+ list_for_each(ent, &vas_instances) {
+ vinst = list_entry(ent, struct vas_instance, node);
+ if (vinst->vas_id == vasid)
+ return vinst;
+ }
+
+ pr_devel("VAS: Instance %d not found\n", vasid);
+ return NULL;
+}
+
+bool vas_initialized(void)
+{
+ return init_done;
+}
+
+static int vas_probe(struct platform_device *pdev)
+{
+ if (!pdev || !pdev->dev.of_node)
+ return -ENODEV;
+
+ return init_vas_instance(pdev);
+}
+
+static void free_inst(struct vas_instance *vinst)
+{
+ list_del(&vinst->node);
+
+ kfree(vinst);
+}
+
+static int vas_remove(struct platform_device *pdev)
+{
+ struct vas_instance *vinst;
+
+ vinst = dev_get_drvdata(&pdev->dev);
+
+ pr_devel("VAS: Removed instance [%s, %d]\n", pdev->name,
+ vinst->vas_id);
+ free_inst(vinst);
+
+ return 0;
+}
+static const struct of_device_id powernv_vas_match[] = {
+ { .compatible = "ibm,vas",},
+ {},
+};
+
+static struct platform_driver vas_driver = {
+ .driver = {
+ .name = "vas",
+ .of_match_table = powernv_vas_match,
+ },
+ .probe = vas_probe,
+ .remove = vas_remove,
+};
+
+module_platform_driver(vas_driver);
+
+int vas_init(void)
+{
+ int found = 0;
+ struct device_node *dn;
+
+ for_each_compatible_node(dn, NULL, "ibm,vas") {
+ of_platform_device_create(dn, NULL, NULL);
+ found++;
+ }
+
+ if (!found)
+ return -ENODEV;
+
+ pr_devel("VAS: Found %d instances\n", found);
+ init_done = true;
+
+ return 0;
+}
+
+void vas_exit(void)
+{
+ struct list_head *ent;
+ struct vas_instance *vinst;
+
+ list_for_each(ent, &vas_instances) {
+ vinst = list_entry(ent, struct vas_instance, node);
+ of_platform_depopulate(&vinst->pdev->dev);
+ }
+
+ init_done = false;
+}
+
+module_init(vas_init);
+module_exit(vas_exit);
+MODULE_DESCRIPTION("Bare metal IBM Virtual Accelerator Switchboard");
+MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 312a378..150d7b1 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -291,14 +291,17 @@ enum vas_notify_after_count {
*/
struct vas_instance {
int vas_id;
+ bool ready;
struct ida ida;
+ struct list_head node;
+ struct platform_device *pdev;

u64 hvwc_bar_start;
u64 hvwc_bar_len;
u64 uwc_bar_start;
u64 uwc_bar_len;
- u64 win_base_addr;
- u64 win_id_shift;
+ u64 paste_base_addr;
+ u64 paste_win_id_shift;

struct mutex mutex;
struct vas_window *rxwin[VAS_COP_TYPE_MAX];
@@ -379,4 +382,7 @@ struct vas_winctx {
enum vas_notify_after_count notify_after_count;
};

+extern bool vas_initialized(void);
+extern struct vas_instance *find_vas_instance(int vasid);
+
#endif /* _VAS_H */
--
2.7.4

2017-08-08 23:07:30

by Sukadev Bhattiprolu

Subject: [PATCH v6 07/17] powerpc/vas: Define vas_win_paste_addr()

Define an interface that the NX drivers can use to find the physical
paste address of a send window. This interface is expected to be used
with the mmap() operation of the NX driver's device, i.e. the user space
process can use the driver's mmap() operation to map the send window's paste
address into its address space and then use copy and paste instructions
to submit the CRBs to the NX engine.

Note that kernel drivers will use vas_paste_crb() directly and don't need
this interface.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/asm/vas.h | 7 +++++++
arch/powerpc/platforms/powernv/vas-window.c | 10 ++++++++++
2 files changed, 17 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 2c8558a..2b35b95 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -12,6 +12,8 @@

#include <uapi/asm/vas.h>

+struct vas_window;
+
/*
* Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
* (Local FIFO Size Register) of the VAS workbook.
@@ -32,4 +34,9 @@ enum vas_cop_type {
VAS_COP_TYPE_MAX,
};

+/*
+ * Return the power bus paste address associated with @win so the caller
+ * can map that address into their address space.
+ */
+extern uint64_t vas_win_paste_addr(struct vas_window *win);
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 9c12919..3a4599f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -35,6 +35,16 @@ void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
}

+uint64_t vas_win_paste_addr(struct vas_window *win)
+{
+ uint64_t addr;
+
+ compute_paste_address(win, &addr, NULL);
+
+ return addr;
+}
+EXPORT_SYMBOL(vas_win_paste_addr);
+
static inline void get_hvwc_mmio_bar(struct vas_window *window,
uint64_t *start, int *len)
{
--
2.7.4
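
For illustration, the paste address returned by vas_win_paste_addr() can be
modelled in plain C. The sketch below assumes the address is formed as
paste_base_addr + (winid << paste_win_id_shift). The field names in
struct vas_instance suggest this, but the body of compute_paste_address() is
not shown in this hunk, so the formula, the helper name model_paste_addr()
and the sample values are assumptions rather than the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of a per-window paste address.
 * Assumption: each window owns a region of (1 << shift) bytes starting
 * at base, so window N starts at base + (N << shift).
 */
static inline uint64_t model_paste_addr(uint64_t base, unsigned int shift,
                                        unsigned int winid)
{
    assert(shift < 64);  /* shifting a 64-bit value by >= 64 is undefined */
    return base + ((uint64_t)winid << shift);
}
```

With a made-up base of 0x1000000 and a shift of 16, window 3 would land at
0x1030000 under this model.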

2017-08-08 23:07:35

by Sukadev Bhattiprolu

Subject: [PATCH v6 08/17] powerpc/vas: Define vas_win_id()

Define an interface to return a system-wide unique id for a given VAS
window.

vas_win_id() will be used in a follow-on patch to generate a unique
handle for a user space receive window. Applications can use this handle
to pair send and receive windows for fast thread-wakeup.

The hardware refers to this system-wide unique id as a Partition Send
Window ID which is expected to be used during fault handling. Hence the
"pswid" in the function names.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/asm/vas.h | 5 +++++
arch/powerpc/platforms/powernv/vas-window.c | 9 +++++++++
arch/powerpc/platforms/powernv/vas.h | 28 ++++++++++++++++++++++++++++
3 files changed, 42 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 2b35b95..30667db 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -35,6 +35,11 @@ enum vas_cop_type {
};

/*
+ * Return a system-wide unique id for the VAS window @win.
+ */
+extern uint32_t vas_win_id(struct vas_window *win);
+
+/*
* Return the power bus paste address associated with @win so the caller
* can map that address into their address space.
*/
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 3a4599f..42c1d4f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -575,3 +575,12 @@ int vas_win_close(struct vas_window *window)
{
return -1;
}
+
+/*
+ * Return a system-wide unique window id for the window @win.
+ */
+uint32_t vas_win_id(struct vas_window *win)
+{
+ return encode_pswid(win->vinst->vas_id, win->winid);
+}
+EXPORT_SYMBOL_GPL(vas_win_id);
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 7b2bcd0..3eadf90 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -440,4 +440,32 @@ static inline uint64_t read_hvwc_reg(struct vas_window *win,
return in_be64(win->hvwc_map+reg);
}

+/*
+ * Encode/decode the Partition Send Window ID (PSWID) for a window in
+ * a way that we can uniquely identify any window in the system. i.e.
+ * we should be able to locate the 'struct vas_window' given the PSWID.
+ *
+ * Bits Usage
+ * 0:7 VAS id (8 bits)
+ * 8:15 Unused, 0 (8 bits)
+ * 16:31 Window id (16 bits)
+ */
+static inline u32 encode_pswid(int vasid, int winid)
+{
+ u32 pswid = 0;
+
+ pswid |= vasid << (31 - 7);
+ pswid |= winid;
+
+ return pswid;
+}
+
+static inline void decode_pswid(u32 pswid, int *vasid, int *winid)
+{
+ if (vasid)
+ *vasid = pswid >> (31 - 7) & 0xFF;
+
+ if (winid)
+ *winid = pswid & 0xFFFF;
+}
#endif /* _VAS_H */
--
2.7.4
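
The PSWID layout described in the comment above can be exercised outside the
kernel. The following is a minimal userspace sketch of the same bit layout
(VAS id in IBM bits 0:7, window id in bits 16:31). The names pswid_encode()
and pswid_decode() are hypothetical stand-ins for the kernel helpers, and
the cast to uint32_t before shifting is a choice of this sketch so that VAS
ids of 128 and above do not shift into the sign bit of an int.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative userspace copy of the PSWID layout; not the kernel header. */
#define PSWID_VAS_SHIFT (31 - 7)   /* VAS id occupies IBM bits 0:7 */

static inline uint32_t pswid_encode(unsigned int vasid, unsigned int winid)
{
    assert(vasid <= 0xFF);     /* 8-bit field */
    assert(winid <= 0xFFFF);   /* 16-bit field */
    return ((uint32_t)vasid << PSWID_VAS_SHIFT) | (uint32_t)winid;
}

static inline void pswid_decode(uint32_t pswid, unsigned int *vasid,
                                unsigned int *winid)
{
    /* Either output pointer may be NULL, as in the kernel helper. */
    if (vasid)
        *vasid = (pswid >> PSWID_VAS_SHIFT) & 0xFF;
    if (winid)
        *winid = pswid & 0xFFFF;
}
```

Encoding and then decoding a (vasid, winid) pair should return the original
pair, which is the property that lets a handle be mapped back to a window.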

2017-08-08 23:07:40

by Sukadev Bhattiprolu

Subject: [PATCH v6 11/17] powerpc/vas: Define vas_win_close() interface

Define the vas_win_close() interface which should be used to close a
send or receive window.

While the hardware configurations required to open send and receive windows
differ, the configuration to close a window is the same for both. So we use
a single interface to close the window.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v4]:
- Drop the poll for credits return (we can set the required credit,
but cannot really find the available credit at a point in time)
- Export the symbol

Changelog[v3]:
- Fix order of parameters in GET_FIELD().
- Update references and sequence for closing/quiescing a window.
---
arch/powerpc/include/asm/vas.h | 7 ++
arch/powerpc/platforms/powernv/vas-window.c | 99 +++++++++++++++++++++++++++--
2 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index a3778d7..e1c5376 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -91,4 +91,11 @@ extern void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr,
extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
struct vas_rx_win_attr *attr);

+/*
+ * Close the send or receive window identified by @win. For receive windows
+ * return -EAGAIN if there are active send windows attached to this receive
+ * window.
+ */
+int vas_win_close(struct vas_window *win);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index dfa7e67..9704a3b 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -139,7 +139,7 @@ static void unmap_region(void *addr, uint64_t start, int len)
/*
* Unmap the paste address region for a window.
*/
-void unmap_paste_region(struct vas_window *window)
+static void unmap_paste_region(struct vas_window *window)
{
int len;
uint64_t busaddr_start;
@@ -535,7 +535,7 @@ int vas_assign_window_id(struct ida *ida)
return winid;
}

-void vas_window_free(struct vas_window *window)
+static void vas_window_free(struct vas_window *window)
{
int winid = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -609,6 +609,14 @@ static bool valid_permissions(struct vas_window *rxwin)
return rc;
}

+static void put_rx_win(struct vas_window *rxwin)
+{
+ /* Better not be a send window! */
+ WARN_ON_ONCE(rxwin->tx_win);
+
+ atomic_dec(&rxwin->num_txwins);
+}
+
/*
* Find the user space receive window given the @pswid.
*
@@ -710,7 +718,7 @@ static void set_vinst_win(struct vas_instance *vinst,
* Clear this window from the table(s) of windows for this VAS instance.
* See also function header of set_vinst_win().
*/
-void clear_vinst_win(struct vas_window *window)
+static void clear_vinst_win(struct vas_window *window)
{
int id = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -925,11 +933,92 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_rx_win_open);

-/* stub for now */
+static void poll_window_busy_state(struct vas_window *window)
+{
+ int busy;
+ uint64_t val;
+
+retry:
+ /*
+ * Poll Window Busy flag
+ */
+ val = read_hvwc_reg(window, VREG(WIN_STATUS));
+ busy = GET_FIELD(VAS_WIN_BUSY, val);
+ if (busy) {
+ val = 0;
+ schedule_timeout_uninterruptible(2000);
+ goto retry;
+ }
+}
+
+static void poll_window_castout(struct vas_window *window)
+{
+ int cached;
+ uint64_t val;
+
+ /* Cast window context out of the cache */
+retry:
+ val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
+ cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
+ if (cached) {
+ val = 0ULL;
+ val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
+ val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
+
+ schedule_timeout_uninterruptible(2000);
+ goto retry;
+ }
+}
+
+/*
+ * Close a window.
+ *
+ * See Section 1.12.1 of VAS workbook v1.05 for details on closing window:
+ * - Disable new paste operations (unmap paste address)
+ * - Poll for the "Window Busy" bit to be cleared
+ * - Clear the Open/Enable bit for the Window.
+ * - Poll for return of window Credits (implies FIFO empty for Rx win?)
+ * - Unpin and cast window context out of cache
+ *
+ * Besides the hardware, kernel has some bookkeeping of course.
+ */
int vas_win_close(struct vas_window *window)
{
- return -1;
+ uint64_t val;
+
+ if (!window)
+ return 0;
+
+ if (!window->tx_win && atomic_read(&window->num_txwins) != 0) {
+ pr_devel("VAS: Attempting to close an active Rx window!\n");
+ WARN_ON_ONCE(1);
+ return -EAGAIN;
+ }
+
+ unmap_paste_region(window);
+
+ clear_vinst_win(window);
+
+ poll_window_busy_state(window);
+
+ /* Unpin window from cache and close it */
+ val = read_hvwc_reg(window, VREG(WINCTL));
+ val = SET_FIELD(VAS_WINCTL_PIN, val, 0);
+ val = SET_FIELD(VAS_WINCTL_OPEN, val, 0);
+ write_hvwc_reg(window, VREG(WINCTL), val);
+
+ poll_window_castout(window);
+
+ /* if send window, drop reference to matching receive window */
+ if (window->tx_win)
+ put_rx_win(window->rxwin);
+
+ vas_window_free(window);
+
+ return 0;
}
+EXPORT_SYMBOL_GPL(vas_win_close);

/*
* Return a system-wide unique window id for the window @win.
--
2.7.4
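
The two polling loops in this patch share one shape: read a register, test a
field, sleep, retry. A minimal userspace model of that shape follows. It is a
sketch under stated assumptions: poll_until_clear(), fake_status_read() and
WIN_BUSY_BIT are hypothetical names, the register is simulated, and the retry
bound is an addition of the sketch (the loops in the patch retry without a
limit).

```c
#include <assert.h>
#include <stdint.h>

#define WIN_BUSY_BIT (1ULL << 0)   /* hypothetical bit position */

typedef uint64_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) reads as zero.
 * Returns the number of reads performed, or -1 if max_polls reads all
 * saw the bit set.
 */
static int poll_until_clear(read_reg_fn read_reg, void *ctx, uint64_t mask,
                            int max_polls)
{
    int polls;

    assert(max_polls > 0);
    for (polls = 1; polls <= max_polls; polls++) {
        if (!(read_reg(ctx) & mask))
            return polls;
        /* A kernel implementation would sleep here before retrying. */
    }
    return -1;
}

/* Simulated status register: reports busy for the first *ctx reads. */
static uint64_t fake_status_read(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return WIN_BUSY_BIT;
    }
    return 0;
}
```

Bounding the loop lets a caller report a stuck window instead of sleeping
forever; whether that is appropriate for the hardware is not something this
sketch settles.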

2017-08-08 23:07:51

by Sukadev Bhattiprolu

Subject: [PATCH v6 17/17] powerpc/vas: Document FTW API/usage

Document the usage of the VAS Fast thread-wakeup API.

Thanks for input/comments from Benjamin Herrenschmidt, Michael Neuling,
Michael Ellerman, Robert Blackmore, Ian Munsie, Haren Myneni, Paul Mackerras.

Cc: Ian Munsie <[email protected]>
Cc: Paul Mackerras <[email protected]>
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Documentation/powerpc/ftw-api.txt | 373 ++++++++++++++++++++++++++++++++++++++
1 file changed, 373 insertions(+)
create mode 100644 Documentation/powerpc/ftw-api.txt

diff --git a/Documentation/powerpc/ftw-api.txt b/Documentation/powerpc/ftw-api.txt
new file mode 100644
index 0000000..0b3f16f
--- /dev/null
+++ b/Documentation/powerpc/ftw-api.txt
@@ -0,0 +1,373 @@
+Virtual Accelerator Switchboard and Fast Thread-Wakeup API
+
+ The Power9 processor supports a hardware subsystem known as the Virtual
+ Accelerator Switchboard (VAS) which allows two entities in the Power9
+ system to efficiently exchange messages. Messages must be formatted as
+ Coprocessor Request Blocks (CRB) and be submitted using the COPY/PASTE
+ instructions (new in Power9).
+
+ Usage of VAS depends on the entities exchanging the messages and
+ currently two usages have been identified.
+
+ The first usage of VAS, referred to as VAS/NX, involves a software thread
+ submitting data compression requests to a co-processor (hardware/nest
+ accelerator), aka NX engine. The API for this usage is described in the
+ VAS/NX API document.
+
+ Alternatively, VAS can be used by two software threads to efficiently
+ exchange messages. Initially, this mechanism is intended to wake up a
+ waiting thread quickly - i.e. "fast thread wake-up (FTW)". This document
+ describes the user API for this VAS/FTW mechanism.
+
+ Application access to the FTW mechanism is provided through the NX-FTW
+ device node (/dev/crypto/nx-ftw) implemented by the VAS/FTW device
+ driver.
+
+ A software thread T1 that intends to wait for an event must first set up
+ a receive window, by opening the NX-FTW device and using the
+ VAS_RX_WIN_OPEN ioctl. Upon successful return from the VAS_RX_WIN_OPEN
+ ioctl, an rx_win_handle is returned.
+
+ A software thread T2 that intends to wake up T1 at some point, must first
+ set up a "send window" using the VAS_TX_WIN_OPEN ioctl and specify the
+ rx_win_handle obtained by T1. After a successful VAS_TX_WIN_OPEN ioctl the
+ send window of T2 is considered paired with the receive window of T1. The
+ thread T2 must then use mmap() to obtain a "paste address" for the send
+ window.
+
+ With this set up, thread T1 can wait for an event using the WAIT
+ instruction.
+
+ Thread T2 can wake up T1 by using the "COPY/PASTE" instructions and
+ submitting an empty/NULL CRB to the send window's paste address. The
+ wait/wake up process can be repeated as long as the threads have the
+ send/receive windows open.
+
+1. NX-FTW Device Node
+
+ There is one /dev/crypto/nx-ftw node in the system and it provides
+ access to the VAS/FTW functionality.
+
+ The only valid operations on the NX-FTW node are:
+
+ - open() the device for read and write.
+
+ - issue either the VAS_RX_WIN_OPEN or the VAS_TX_WIN_OPEN ioctl to set
+ up a receive or a send window (only one of them per open).
+
+ - if the open is associated with a send window (i.e. the VAS_TX_WIN_OPEN
+ ioctl was issued), mmap() the send window into the application's
+ virtual address space (i.e. get a 'paste_address' for the send
+ window).
+
+ - close the device node.
+
+ Other file operations on the NX-FTW node are undefined.
+
+ Note that the COPY and PASTE operations go directly to the hardware
+ and do not go through the NX-FTW device.
+
+ Although a system may have several instances of the VAS (typically,
+ one per P9 chip), there is just one NX-FTW device node in the
+ system.
+
+ When the NX-FTW device node is opened, the kernel assigns a suitable
+ instance of VAS to the process. Kernel will make a best-effort attempt
+ to assign an optimal instance of VAS for the process. In the initial
+ release, the kernel does not support migrating the VAS instance if the
+ process migrates from a processor on one chip to a processor on another
+ chip.
+
+ Applications may choose a specific instance of the VAS using the 'vas_id'
+ field in the VAS_TX_WIN_OPEN and VAS_RX_WIN_OPEN ioctls as detailed below.
+
+2. Open NX-FTW node
+
+ The device should be opened for read and write. No special privileges
+ are needed to open the device. The device may be opened multiple times.
+
+ Each open() of the NX-FTW device may be associated with either a send
+ window or receive window but not both.
+
+ See open(2) system call man pages for other details such as return
+ values, error codes and restrictions.
+
+3. Setup Receive window (VAS_RX_WIN_OPEN ioctl)
+
+ A thread that expects to wait for events and be woken up using COPY/PASTE
+ must first set up a receive window by issuing the VAS_RX_WIN_OPEN ioctl.
+
+ #include <asm/vas.h>
+
+ struct vas_rx_win_open_attr rxattr;
+
+ rc = ioctl(fd, VAS_RX_WIN_OPEN, &rxattr);
+
+ The attributes of rxattr are as follows:
+
+ struct vas_rx_win_open_attr {
+ int16_t version;
+ int16_t vas_id;
+ int32_t rx_win_handle; /* output field */
+ int64_t reserved[8];
+ };
+
+ The version field identifies the version of the API and must currently
+ be set to 1.
+
+ The vas_id field identifies a specific instance of the VAS that the
+ application wishes to access. See section on VAS ID below.
+
+ The reserved field must be set to all zeroes.
+
+ Upon successful return from the ioctl, the rx_win_handle field contains
+ an identifier for the VAS window associated with this "sleeping" thread.
+
+ This rx_win_handle field is used to "pair" this receive window with a
+ send window and must be specified when opening the corresponding send
+ window (see struct vas_tx_win_open_attr below).
+
+ Return value:
+
+ The VAS_RX_WIN_OPEN ioctl returns 0 on success. On error, it returns -1
+ and sets the errno variable to indicate the error.
+
+ Error codes:
+
+ EINVAL version is invalid
+
+ EINVAL vas_id is invalid
+
+ EINVAL reserved field is not set to zeroes
+
+ EINVAL fd is already associated with a send window
+
+
+3. Set up a Send window (VAS_TX_WIN_OPEN ioctl)
+
+ An application thread that expects to wake up a waiting thread using
+ copy/paste, must first set up a send window that is paired with the
+ receive window of the waiting thread. This is accomplished using the
+ VAS_TX_WIN_OPEN ioctl.
+
+ #include <asm/vas.h>
+
+ struct vas_tx_win_open_attr txattr;
+
+ rc = ioctl(fd, VAS_TX_WIN_OPEN, &txattr);
+
+ The attributes 'txattr' for the VAS_TX_WIN_OPEN ioctl are defined as
+ follows:
+
+ struct vas_tx_win_open_attr {
+ int32_t version;
+ int16_t vas_id;
+ uint32_t rx_win_handle;
+
+ int64_t reserved1;
+
+ int64_t flags;
+ int64_t reserved2;
+
+ int32_t tc_mode;
+ int32_t rsvd_txbuf;
+ int64_t reserved3[6];
+ };
+
+ The version field must currently be set to 1.
+
+ The vas_id field identifies a specific instance of the VAS that the
+ application wishes to access. See section on VAS ID below.
+
+ The rx_win_handle field must be set to the rx_win_handle returned by
+ a prior successful call to VAS_RX_WIN_OPEN ioctl (see above). This
+ field is used to pair this send window with a receive window. The
+ process must have sufficient permissions to communicate with the
+ process owning the receive window identified by rx_win_handle.
+
+ The tc_mode and rsvd_txbuf fields are currently unused and must be
+ set to 0.
+
+ The flags field specifies additional attributes of the window. The
+ only valid bit in the flags field for FTW windows is:
+
+ VAS_FLAGS_PIN_WINDOW if set, indicates that the window should be
+ pinned in cache. This flag is restricted
+ to privileged users. See Pinning windows
+ below.
+
+ All the other bits in the flags field must be set to 0.
+
+ The fields reserved1, reserved2 and reserved3 are for future extension
+ and must be set to 0.
+
+ Return value:
+
+ The VAS_TX_WIN_OPEN ioctl returns 0 on success. On error, it returns -1
+ and sets the errno variable to indicate the error.
+
+ Error conditions:
+
+ EINVAL version, vas_id or rx_win_handle fields are invalid
+
+ EINVAL fd does not refer to a valid VAS device.
+
+ EINVAL fd is already associated with a receive window
+
+ ENOSPC System has too many active windows (connections) open.
+
+ EINVAL For FTW windows, rsvd_txbuf is not 0.
+
+ EINVAL For FTW windows, tc_mode is not VAS_THRESH_DISABLED.
+
+ EPERM VAS_FLAGS_PIN_WINDOW is set in 'flags' field and process
+ is not privileged.
+
+ EPERM VAS_FLAGS_HIGH_PRI is set in 'flags' field and process
+ is not privileged.
+
+ EINVAL an invalid flag is set in the 'flags' field. (For FTW
+ windows, VAS_FLAGS_HIGH_PRI is also invalid).
+
+ EINVAL reserved fields are not set to 0.
+
+ See the ioctl(2) man page for more details, error codes and restrictions.
+
+4. mmap() NX-FTW device fd
+
+ The mmap() system call for an NX-FTW device fd returns a "paste address"
+ that the application can use to COPY/PASTE a CRB to the waiting thread.
+
+ paste_addr = mmap(NULL, size, prot, flags, fd, offset);
+
+ The mmap() operation is only valid on a file descriptor associated
+ with a send window.
+
+ The only restrictions on mmap() for an NX-FTW device fd are:
+
+ - size parameter should be one page size
+
+ - offset parameter should be 0ULL.
+
+ Refer to mmap(2) man page for additional details/restrictions.
+
+ In addition to the error conditions listed on the mmap(2) man page,
+ mmap() can also fail with one of following error codes:
+
+ EINVAL fd is not associated with an open send window (i.e mmap()
+ does not follow a successful call to the VAS_TX_WIN_OPEN
+ ioctl).
+
+ EINVAL offset field is not 0ULL.
+
+
+5. VAS ID
+
+ A system may have several instances of VAS in the hardware, typically
+ one per POWER9 chip. The choice of a specific instance of VAS can have
+ a significant impact on performance, especially if the application
+ migrates from one CPU to another. Applications can specify a vas_id
+ using the VAS_TX_WIN_OPEN and VAS_RX_WIN_OPEN ioctls and should be
+ prudent in choosing an instance of VAS.
+
+ The vas_id for each instance of VAS is listed as the device tree
+ property 'ibm,vas-id'. Determining the specific vas_id to use for
+ a specific application thread is beyond the scope of this API.
+
+ If the application has no preference, the vas_id field may be set to
+ -1 and the kernel will choose a suitable instance of the VAS engine.
+
+6. COPY/PASTE operations:
+
+ Applications should use the COPY and PASTE instructions defined in
+ the RFC to copy/paste the CRB. For VAS/FTW usage, the contents of
+ the CRB, if any, are ignored. The CRB can be NULL.
+
+7. Interrupt completion and signal handling
+
+ No VAS-specific signals will be generated to the application threads
+ with the VAS/FTW usage.
+
+
+8. Example/Proposed usage of the VAS/FTW API
+
+ In the following example we use two threads that use the VAS/FTW API.
+ Thread T1 uses the WAIT instruction to wait for an event. Thread T2
+ uses copy/paste instructions to wake up T1.
+
+ Common interfaces:
+
+ static bool paste_done;
+ uint32_t rx_win_handle;
+
+ #define WAIT .long (0x7C00003C)
+
+ static inline void do_wait(void)
+ {
+ __asm__ __volatile(stringify_in_c(WAIT)";");
+ }
+
+ /*
+ * Check if paste_done is true
+ */
+ static bool is_paste_done(void)
+ {
+ return __sync_bool_compare_and_swap(&paste_done, 1, 0);
+
+ }
+
+ /*
+ * Set paste_done to true
+ */
+ static inline void set_paste_done(void)
+ {
+ __sync_bool_compare_and_swap(&paste_done, 0, 1);
+ }
+
+ Thread T1:
+
+ struct vas_rx_win_open_attr rxattr;
+
+ fd = open("/dev/crypto/nx-ftw", O_RDWR);
+
+ memset(&rxattr, 0, sizeof(rxattr));
+ rxattr.version = 1;
+
+ rc = ioctl(fd, VAS_RX_WIN_OPEN, &rxattr);
+
+ rx_win_handle = rxattr.rx_win_handle;
+
+ /* Tell T2 that Rx window is ready to be paired */
+ pthread_cond_signal(&rx_win_ready);
+
+ /* Rx set up done */
+
+ /* later, wait for an event to occur */
+
+ while(!is_paste_done())
+ do_wait();
+
+ Thread T2:
+
+ struct vas_tx_win_open_attr txattr;
+
+ fd = open("/dev/crypto/nx-ftw", O_RDWR);
+
+ /* Wait for Rx window to be set up first */
+ pthread_cond_wait(&rx_win_ready);
+
+ memset(&txattr, 0, sizeof(txattr));
+ txattr.version = 1;
+ txattr.rx_win_handle = rx_win_handle;
+
+ rc = ioctl(fd, VAS_TX_WIN_OPEN, &txattr);
+
+ prot = PROT_READ|PROT_WRITE;
+ paste_addr = mmap(NULL, 4096, prot, MAP_SHARED, fd, 0ULL);
+
+ /* Tx setup done */
+
+ /* later ... */
+
+ set_paste_done(); /* ... event occurred */
+ write_null_crb(paste_addr); /* wake up T1 */
--
2.7.4
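
The rules the document gives for the VAS_RX_WIN_OPEN attributes (version
must be 1, reserved must be all zeroes, vas_id may be -1 for "no
preference") can be captured in a small helper. The sketch below is a
userspace model only: struct rx_win_open_attr_model is a local mirror of the
documented layout and not the uapi header, and rx_attr_init(),
rx_attr_check(), FTW_API_VERSION and VAS_ID_ANY are hypothetical names. The
check covers only the version and reserved-field conditions, since the
document does not say which vas_id values are valid.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the documented structure; not the uapi definition. */
struct rx_win_open_attr_model {
    int16_t version;
    int16_t vas_id;
    int32_t rx_win_handle;   /* output field */
    int64_t reserved[8];
};

#define FTW_API_VERSION 1     /* "must currently be set to 1" */
#define VAS_ID_ANY      (-1)  /* let the kernel pick a VAS instance */

/* Zero everything (including reserved[]), then set the required fields. */
static inline void rx_attr_init(struct rx_win_open_attr_model *attr)
{
    assert(attr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->version = FTW_API_VERSION;
    attr->vas_id = VAS_ID_ANY;
}

/* Model of the documented EINVAL conditions on version and reserved[]. */
static inline int rx_attr_check(const struct rx_win_open_attr_model *attr)
{
    size_t i;

    if (attr->version != FTW_API_VERSION)
        return -EINVAL;
    for (i = 0; i < sizeof(attr->reserved) / sizeof(attr->reserved[0]); i++)
        if (attr->reserved[i] != 0)
            return -EINVAL;
    return 0;
}
```

Zeroing the whole structure before filling it in is what keeps the reserved
fields valid if the structure grows in a later API version.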

2017-08-08 23:07:47

by Sukadev Bhattiprolu

Subject: [PATCH v6 13/17] powerpc/vas: Define copy/paste interfaces

Define interfaces (wrappers) to the 'copy' and 'paste' instructions
(which are new in PowerISA 3.0). These are intended to be used by
NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the
NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>

---
Changelog[v4]
- Export symbols
Changelog[v3]
- Map raw CR value from paste instruction into an error code.
---
MAINTAINERS | 1 +
arch/powerpc/include/asm/vas.h | 13 +++++
arch/powerpc/platforms/powernv/copy-paste.h | 74 +++++++++++++++++++++++++++++
arch/powerpc/platforms/powernv/vas-window.c | 52 ++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 15 ++++++
5 files changed, 155 insertions(+)
create mode 100644 arch/powerpc/platforms/powernv/copy-paste.h

diff --git a/MAINTAINERS b/MAINTAINERS
index edc58c9..c3f156c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6430,6 +6430,7 @@ M: Sukadev Bhattiprolu
L: [email protected]
S: Supported
F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/platforms/powernv/copy-paste.h
F: arch/powerpc/include/asm/vas.h
F: arch/powerpc/include/uapi/asm/vas.h

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 3fc6435..f9779c4 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -140,4 +140,17 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
*/
int vas_win_close(struct vas_window *win);

+/*
+ * Copy the co-processor request block (CRB) @crb into the local L2 cache.
+ * For now, @offset must be 0 and @first must be true.
+ */
+extern int vas_copy_crb(void *crb, int offset, bool first);
+
+/*
+ * Paste a previously copied CRB (see vas_copy_crb()) from the L2 cache to
+ * the hardware address associated with the window @win. For now, @off must
+ * be 0 and @last must be true. @re is expected/assumed to be true for NX windows.
+ */
+extern int vas_paste_crb(struct vas_window *win, int off, bool last, bool re);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
new file mode 100644
index 0000000..7783bb8
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/copy-paste.h
@@ -0,0 +1,74 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+/*
+ * Macros taken from tools/testing/selftests/powerpc/context_switch/cp_abort.c
+ */
+#define PASTE(RA, RB, L, RC) \
+ .long (0x7c00070c | (RA) << (31-15) | (RB) << (31-20) \
+ | (L) << (31-10) | (RC) << (31-31))
+
+#define COPY(RA, RB, L) \
+ .long (0x7c00060c | (RA) << (31-15) | (RB) << (31-20) \
+ | (L) << (31-10))
+
+#define CR0_FXM "0x80"
+#define CR0_SHIFT 28
+#define CR0_MASK 0xF
+/*
+ * Copy/paste instructions:
+ *
+ * copy RA,RB,L
+ * Copy contents of address (RA) + effective_address(RB)
+ * to internal copy-buffer.
+ *
+ * L == 1 indicates this is the first copy.
+ *
+ * L == 0 indicates it's a continuation of a prior first copy.
+ *
+ * paste RA,RB,L
+ * Paste contents of internal copy-buffer to the address
+ * (RA) + effective_address(RB)
+ *
+ * L == 0 indicates it's a continuation of a prior paste. i.e.
+ * don't wait for the completion or update status.
+ *
+ * L == 1 indicates this is the last paste in the group (i.e.
+ * wait for the group to complete and update status in CR0).
+ *
+ * For Power9, the L bit must be 'true' in both copy and paste.
+ */
+
+static inline int vas_copy(void *crb, int offset, int first)
+{
+ WARN_ON_ONCE(!first);
+
+ __asm__ __volatile(stringify_in_c(COPY(%0, %1, %2))";"
+ :
+ : "b" (offset), "b" (crb), "i" (1)
+ : "memory");
+
+ return 0;
+}
+
+static inline int vas_paste(void *paste_address, int offset, int last)
+{
+ unsigned long long cr;
+
+ WARN_ON_ONCE(!last);
+
+ cr = 0;
+ __asm__ __volatile(stringify_in_c(PASTE(%1, %2, 1, 1))";"
+ "mfocrf %0," CR0_FXM ";"
+ : "=r" (cr)
+ : "b" (paste_address), "b" (offset)
+ : "memory");
+
+ return cr;
+}
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 3e2655c..63367c7 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -16,6 +16,7 @@
#include <linux/cred.h>

#include "vas.h"
+#include "copy-paste.h"

/*
* Compute the paste address region for the window @window using the
@@ -1084,6 +1085,57 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_tx_win_open);

+int vas_copy_crb(void *crb, int offset, bool first)
+{
+ if (!vas_initialized())
+ return -1;
+
+ return vas_copy(crb, offset, first);
+}
+EXPORT_SYMBOL_GPL(vas_copy_crb);
+
+#define RMA_LSMP_REPORT_ENABLE PPC_BIT(53)
+int vas_paste_crb(struct vas_window *txwin, int offset, bool last, bool re)
+{
+ int rc;
+ uint64_t val;
+ void *addr;
+
+ if (!vas_initialized())
+ return -1;
+ /*
+ * Only NX windows are supported for now and hardware assumes
+ * report-enable flag is set for NX windows. Ensure software
+ * complies too.
+ */
+ WARN_ON_ONCE(!re);
+
+ addr = txwin->paste_kaddr;
+ if (re) {
+ /*
+ * Set the REPORT_ENABLE bit (equivalent to writing
+ * to 1K offset of the paste address)
+ */
+ val = SET_FIELD(RMA_LSMP_REPORT_ENABLE, 0ULL, 1);
+ addr += val;
+ }
+
+ /*
+ * Map the raw CR value from vas_paste() to an error code (there
+ * is just pass or fail for now though).
+ */
+ rc = vas_paste(addr, offset, last);
+ if (rc == 0x20000000)
+ rc = 0;
+ else
+ rc = -EINVAL;
+
+ print_fifo_msg_count(txwin);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(vas_paste_crb);
+
static void poll_window_busy_state(struct vas_window *window)
{
int busy;
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 61fd80f..4e3e5fe 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -482,4 +482,19 @@ static inline void decode_pswid(u32 pswid, int *vasid, int *winid)
if (winid)
*winid = pswid & 0xFFFF;
}
+
+#ifdef vas_debug
+
+static void print_fifo_msg_count(struct vas_window *txwin)
+{
+ uint64_t read_hvwc_reg(struct vas_window *w, char *n, uint64_t o);
+ pr_devel("Winid %d, Msg count %llu\n", txwin->winid,
+ (uint64_t)read_hvwc_reg(txwin, VREG(LRFIFO_PUSH)));
+}
+#else /* vas_debug */
+
+#define print_fifo_msg_count(window)
+
+#endif /* vas_debug */
+
#endif /* _VAS_H */
--
2.7.4
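
Two pieces of arithmetic in vas_paste_crb() are easy to check in isolation:
the report-enable offset and the mapping of the raw CR value to an error
code. The sketch below is a userspace model, not the kernel code.
IBM_BIT64() is a stand-in for PPC_BIT() under IBM bit numbering (bit 0 is
the most significant bit), and report_enable_addr(), paste_cr_to_errno() and
CR0_EQ are hypothetical names. One deliberate difference: the patch compares
the whole value against 0x20000000, while this model extracts the CR0 field
with CR0_SHIFT and CR0_MASK first.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit word. */
#define IBM_BIT64(n)            (1ULL << (63 - (n)))
#define RMA_LSMP_REPORT_ENABLE  IBM_BIT64(53)   /* 1 << 10, i.e. 1K */

#define CR0_SHIFT 28
#define CR0_MASK  0xFu
#define CR0_EQ    0x2u   /* assumption: EQ set in CR0 means the paste succeeded */

/* Add the report-enable offset to a paste address. */
static inline uint64_t report_enable_addr(uint64_t paste_addr)
{
    /* The bit must be clear beforehand, so adding it equals setting it. */
    assert((paste_addr & RMA_LSMP_REPORT_ENABLE) == 0);
    return paste_addr + RMA_LSMP_REPORT_ENABLE;
}

/* Map the raw CR value read after a paste to 0 or -EINVAL. */
static inline int paste_cr_to_errno(uint32_t cr)
{
    unsigned int cr0 = (cr >> CR0_SHIFT) & CR0_MASK;

    return (cr0 == CR0_EQ) ? 0 : -EINVAL;
}
```

The first helper shows why the comment in the patch speaks of a "1K offset":
IBM bit 53 of a 64-bit word is the bit with value 1024.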

2017-08-08 23:08:25

by Sukadev Bhattiprolu

Subject: [PATCH v6 16/17] powerpc/vas: Implement a simple FTW driver

The Fast Thread Wake-up (FTW) driver provides user space applications an
interface to the Core-to-Core functionality in POWER9. The driver provides
the device node/ioctl API to applications and uses the external interfaces
to the VAS driver to interact with the VAS hardware.

A follow-on patch provides a detailed description of the API for the driver.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
MAINTAINERS | 1 +
arch/powerpc/platforms/powernv/Kconfig | 16 ++
arch/powerpc/platforms/powernv/Makefile | 1 +
arch/powerpc/platforms/powernv/nx-ftw.c | 486 ++++++++++++++++++++++++++++++++
4 files changed, 504 insertions(+)
create mode 100644 arch/powerpc/platforms/powernv/nx-ftw.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c3f156c..a45c0c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6431,6 +6431,7 @@ L: [email protected]
S: Supported
F: arch/powerpc/platforms/powernv/vas*
F: arch/powerpc/platforms/powernv/copy-paste.h
+F: arch/powerpc/platforms/powernv/nx-ftw*
F: arch/powerpc/include/asm/vas.h
F: arch/powerpc/include/uapi/asm/vas.h

diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index f565454..67ea0ff 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -44,3 +44,19 @@ config PPC_VAS
VAS adapters are found in POWER9 based systems.

If unsure, say N.
+
+config PPC_FTW
+ bool "IBM Fast Thread-Wakeup (FTW)"
+ depends on PPC_VAS
+ default n
+ help
+ This enables support for IBM Fast Thread-Wakeup driver.
+
+ The FTW driver allows applications to utilize a low overhead
+ core-to-core wake up mechanism in the IBM Virtual Accelerator
+ Switchboard (VAS) to improve performance.
+
+ VAS adapters are found in POWER9 based systems and are required
+ for the FTW driver to be operational.
+
+ If unsure, say N.
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index e4db292..dc60046 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o
obj-$(CONFIG_TRACEPOINTS) += opal-tracepoints.o
obj-$(CONFIG_OPAL_PRD) += opal-prd.o
obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o
+obj-$(CONFIG_PPC_FTW) += nx-ftw.o
diff --git a/arch/powerpc/platforms/powernv/nx-ftw.c b/arch/powerpc/platforms/powernv/nx-ftw.c
new file mode 100644
index 0000000..a0b6388
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/nx-ftw.c
@@ -0,0 +1,486 @@
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <asm/cputable.h>
+#include <linux/device.h>
+#include <linux/debugfs.h>
+#include <linux/cdev.h>
+#include <linux/mutex.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/sched.h>
+#include <linux/uaccess.h>
+#include <linux/bootmem.h>
+#include <asm/opal-api.h>
+#include <asm/opal.h>
+#include <asm/page.h>
+#include <asm/vas.h>
+#include <asm/reg.h>
+
+/*
+ * NX-FTW is a device driver used to provide user space access to the
+ * Core-to-Core aka Fast Thread Wakeup (FTW) functionality provided by
+ * the Virtual Accelerator Switchboard (VAS) in POWER9 systems. See also
+ * arch/powerpc/platforms/powernv/vas*.
+ *
+ * The driver creates the device node /dev/crypto/nx-ftw that can be
+ * used as follows:
+ *
+ * fd = open("/dev/crypto/nx-ftw", O_RDWR);
+ * rc = ioctl(fd, VAS_RX_WIN_OPEN, &rxattr);
+ * rc = ioctl(fd, VAS_TX_WIN_OPEN, &txattr);
+ * paste_addr = mmap(NULL, PAGE_SIZE, prot, MAP_SHARED, fd, 0ULL);
+ * vas_copy(&crb, 0, 1);
+ * vas_paste(paste_addr, 0, 1);
+ *
+ * where "vas_copy" and "vas_paste" are defined in copy-paste.h.
+ */
+
+static char *nxftw_dev_name = "nx-ftw";
+static atomic_t nxftw_instid = ATOMIC_INIT(0);
+static dev_t nxftw_devt;
+static struct dentry *nxftw_debugfs;
+static struct class *nxftw_dbgfs_class;
+
+/*
+ * Wrapper object for the nx-ftw device node - there is just one
+ * instance of this node for the whole system.
+ */
+struct nxftw_dev {
+ struct cdev cdev;
+ struct device *device;
+ char *name;
+ atomic_t refcount;
+} nxftw_device;
+
+/*
+ * One instance per open of an nx-ftw device. Each nxftw_instance is
+ * associated with a VAS window, after the caller issues VAS_RX_WIN_OPEN
+ * or VAS_TX_WIN_OPEN ioctl.
+ */
+struct nxftw_instance {
+ int instance;
+ bool tx_win;
+ struct vas_window *window;
+};
+
+#define VAS_DEFAULT_VAS_ID 0
+#define POWERNV_LPID 0 /* TODO: For VM/KVM guests? */
+
+static char *nxftw_devnode(struct device *dev, umode_t *mode)
+{
+ return kasprintf(GFP_KERNEL, "crypto/%s", dev_name(dev));
+}
+
+static int nxftw_open(struct inode *inode, struct file *fp)
+{
+ int minor;
+ struct nxftw_instance *nxti;
+
+ minor = MINOR(inode->i_rdev);
+
+ nxti = kzalloc(sizeof(*nxti), GFP_KERNEL);
+ if (!nxti)
+ return -ENOMEM;
+
+ nxti->instance = atomic_inc_return(&nxftw_instid);
+ nxti->window = NULL;
+
+ fp->private_data = nxti;
+ return 0;
+}
+
+static int validate_txwin_user_attr(struct vas_tx_win_open_attr *uattr)
+{
+ int i;
+
+ if (uattr->version != 1)
+ return -EINVAL;
+
+ if (uattr->flags & ~VAS_FLAGS_HIGH_PRI)
+ return -EINVAL;
+
+ if (uattr->reserved1 || uattr->reserved2)
+ return -EINVAL;
+
+ for (i = 0; i < sizeof(uattr->reserved3) / sizeof(uint64_t); i++) {
+ if (uattr->reserved3[i])
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int validate_rxwin_user_attr(struct vas_rx_win_open_attr *uattr)
+{
+ int i;
+
+ if (uattr->version != 1)
+ return -EINVAL;
+
+ for (i = 0; i < sizeof(uattr->reserved) / sizeof(uint64_t); i++) {
+ if (uattr->reserved[i])
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+#ifdef vas_debug
+static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
+{
+ pr_err("NX-FTW: user %d, nx %d, fault %d, ntfy %d, intr %d early %d\n",
+ attr->user_win ? 1 : 0,
+ attr->nx_win ? 1 : 0,
+ attr->fault_win ? 1 : 0,
+ attr->notify_disable ? 1 : 0,
+ attr->intr_disable ? 1 : 0,
+ attr->notify_early ? 1 : 0);
+
+ pr_err("NX-FTW: rx_fifo %p, rx_fifo_size %d, max value 0x%x\n",
+ attr->rx_fifo, attr->rx_fifo_size,
+ VAS_RX_FIFO_SIZE_MAX);
+
+}
+#else
+static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
+{
+}
+#endif
+
+static int nxftw_ioc_open_rx_window(struct file *fp, unsigned long arg)
+{
+ int rc;
+ struct vas_rx_win_open_attr uattr;
+ struct vas_rx_win_attr rxattr;
+ struct nxftw_instance *nxti = fp->private_data;
+ struct vas_window *win;
+
+ rc = copy_from_user(&uattr, (void __user *)arg, sizeof(uattr));
+ if (rc) {
+ pr_devel("%s(): copy_from_user() returns %d\n", __func__, rc);
+ return -EFAULT;
+ }
+
+ rc = validate_rxwin_user_attr(&uattr);
+ if (rc)
+ return rc;
+
+ memset(&rxattr, 0, sizeof(rxattr));
+
+ rxattr.lnotify_lpid = POWERNV_LPID;
+
+ /*
+ * Only the caller can own the window for now. It is unclear whether
+ * process P1 ever needs to make P2 the owner of a window. If so, we need
+ * to find P2, make sure we have permissions, get a reference etc.
+ */
+ rxattr.lnotify_pid = mfspr(SPRN_PID);
+ rxattr.lnotify_tid = mfspr(SPRN_TIDR);
+ rxattr.rx_fifo = NULL;
+ rxattr.rx_fifo_size = 0;
+ rxattr.intr_disable = true;
+ rxattr.user_win = true;
+
+ dump_rx_win_attr(&rxattr);
+
+ /*
+ * TODO: Rather than the default vas id, choose an instance of VAS
+ * based on the chip the caller is running.
+ */
+ win = vas_rx_win_open(VAS_DEFAULT_VAS_ID, VAS_COP_TYPE_FTW, &rxattr);
+ if (IS_ERR(win)) {
+ pr_devel("%s() vas_rx_win_open() failed, %ld\n", __func__,
+ PTR_ERR(win));
+ return PTR_ERR(win);
+ }
+
+ nxti->window = win;
+ uattr.rx_win_handle = vas_win_id(win);
+
+ rc = copy_to_user((void __user *)arg, &uattr, sizeof(uattr));
+ if (rc) {
+ pr_devel("%s(): copy_to_user() failed, %d\n", __func__, rc);
+ return -EFAULT;
+ }
+
+ return 0;
+}
+
+static int nxftw_ioc_open_tx_window(struct file *fp, unsigned long arg)
+{
+ int rc;
+ enum vas_cop_type cop;
+ struct vas_window *win;
+ struct vas_tx_win_open_attr uattr;
+ struct vas_tx_win_attr txattr;
+ struct nxftw_instance *nxti = fp->private_data;
+
+ rc = copy_from_user(&uattr, (void __user *)arg, sizeof(uattr));
+ if (rc) {
+ pr_devel("%s(): copy_from_user() failed, %d\n", __func__, rc);
+ return -EFAULT;
+ }
+
+ cop = VAS_COP_TYPE_FTW;
+
+ rc = validate_txwin_user_attr(&uattr);
+ if (rc)
+ return rc;
+
+ pr_devel("Pid %d: Opening txwin, cop %d, PIDR %ld\n",
+ task_pid_nr(current), cop, mfspr(SPRN_PID));
+
+ vas_init_tx_win_attr(&txattr, cop);
+
+ txattr.lpid = POWERNV_LPID;
+ txattr.pidr = mfspr(SPRN_PID);
+ txattr.pid = task_pid_nr(current);
+ txattr.user_win = true;
+ txattr.pswid = uattr.rx_win_handle;
+
+ win = vas_tx_win_open(VAS_DEFAULT_VAS_ID, cop, &txattr);
+ if (IS_ERR(win)) {
+ pr_devel("%s() vas_tx_win_open() failed, %ld\n", __func__,
+ PTR_ERR(win));
+ return PTR_ERR(win);
+ }
+ nxti->window = win;
+ nxti->tx_win = true;
+
+ return 0;
+}
+
+static int nxftw_release(struct inode *inode, struct file *fp)
+{
+ struct nxftw_instance *nxti;
+
+ nxti = fp->private_data;
+
+ vas_win_close(nxti->window);
+ nxti->window = NULL;
+
+ kfree(nxti);
+ fp->private_data = NULL;
+ atomic_dec(&nxftw_instid);
+
+ return 0;
+}
+
+static ssize_t nxftw_write(struct file *fp, const char __user *buf,
+ size_t len, loff_t *offsetp)
+{
+ return -ENOTSUPP;
+}
+
+static ssize_t nxftw_read(struct file *fp, char __user *buf, size_t len,
+ loff_t *offsetp)
+{
+ return -ENOTSUPP;
+}
+
+static int nxftw_vma_fault(struct vm_fault *vmf)
+{
+ u64 offset;
+ unsigned long vaddr;
+ uint64_t pbaddr_start;
+ struct nxftw_instance *nxti;
+ struct vm_area_struct *vma = vmf->vma;
+
+ nxti = vma->vm_private_data;
+ offset = vmf->pgoff << PAGE_SHIFT;
+ vaddr = (unsigned long)vmf->address;
+
+ pbaddr_start = vas_win_paste_addr(nxti->window);
+
+ pr_devel("%s() instance %d, pbaddr 0x%llx, vaddr 0x%lx, "
+ "offset %llx, pgoff 0x%lx, vma-start 0x%zx, "
+ "size %zd\n", __func__, nxti->instance,
+ pbaddr_start, vaddr, offset, vmf->pgoff,
+ vma->vm_start, vma->vm_end-vma->vm_start);
+
+ vm_insert_pfn(vma, vaddr, (pbaddr_start + offset) >> PAGE_SHIFT);
+
+ return VM_FAULT_NOPAGE;
+}
+
+const struct vm_operations_struct nxftw_vm_ops = {
+ .fault = nxftw_vma_fault,
+};
+
+static int nxftw_mmap(struct file *fp, struct vm_area_struct *vma)
+{
+ struct nxftw_instance *nxti = fp->private_data;
+
+ if ((vma->vm_end - vma->vm_start) > PAGE_SIZE) {
+ pr_devel("%s(): size 0x%zx, PAGE_SIZE 0x%zx\n", __func__,
+ (vma->vm_end - vma->vm_start), PAGE_SIZE);
+ return -EINVAL;
+ }
+
+ /* Ensure instance has an open send window */
+ if (!nxti->window || !nxti->tx_win) {
+ pr_devel("%s(): No send window open?\n", __func__);
+ return -EINVAL;
+ }
+
+ /* flags, page_prot from cxl_mmap(), except we want cacheable */
+ vma->vm_flags |= VM_IO | VM_PFNMAP;
+ vma->vm_page_prot = pgprot_cached(vma->vm_page_prot);
+
+ vma->vm_ops = &nxftw_vm_ops;
+ vma->vm_private_data = nxti;
+
+ return 0;
+}
+
+static long nxftw_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
+{
+ struct nxftw_instance *nxti;
+
+ nxti = fp->private_data;
+
+ pr_devel("%s() cmd 0x%x, TX_WIN_OPEN 0x%lx\n", __func__, cmd,
+ VAS_TX_WIN_OPEN);
+ switch (cmd) {
+
+ case VAS_TX_WIN_OPEN:
+ return nxftw_ioc_open_tx_window(fp, arg);
+
+ case VAS_RX_WIN_OPEN:
+ return nxftw_ioc_open_rx_window(fp, arg);
+
+ default:
+ return -EINVAL;
+ }
+}
+
+const struct file_operations nxftw_fops = {
+ .owner = THIS_MODULE,
+ .open = nxftw_open,
+ .release = nxftw_release,
+ .read = nxftw_read,
+ .write = nxftw_write,
+ .mmap = nxftw_mmap,
+ .unlocked_ioctl = nxftw_ioctl,
+};
+
+
+int nxftw_file_init(void)
+{
+ int rc;
+ dev_t devno;
+
+ rc = alloc_chrdev_region(&nxftw_devt, 0, 1, "nx-ftw");
+ if (rc) {
+ pr_err("Unable to allocate nxftw major number: %i\n", rc);
+ return rc;
+ }
+
+ pr_devel("NX-FTW device allocated, dev [%i,%i]\n", MAJOR(nxftw_devt),
+ MINOR(nxftw_devt));
+
+ nxftw_dbgfs_class = class_create(THIS_MODULE, "nxftw");
+ if (IS_ERR(nxftw_dbgfs_class)) {
+ pr_err("Unable to create NX-FTW class\n");
+ rc = PTR_ERR(nxftw_dbgfs_class);
+ goto err;
+ }
+ nxftw_dbgfs_class->devnode = nxftw_devnode;
+
+ cdev_init(&nxftw_device.cdev, &nxftw_fops);
+
+ devno = MKDEV(MAJOR(nxftw_devt), 0);
+ rc = cdev_add(&nxftw_device.cdev, devno, 1);
+ if (rc) {
+ pr_err("NX-FTW: cdev_add() failed\n");
+ goto err;
+ }
+
+ nxftw_device.device = device_create(nxftw_dbgfs_class, NULL,
+ devno, NULL, nxftw_dev_name, MINOR(devno));
+ if (IS_ERR(nxftw_device.device)) {
+ pr_err("Unable to create nxftw-%d\n", MINOR(devno));
+ rc = PTR_ERR(nxftw_device.device);
+ goto err;
+ }
+
+ pr_devel("%s: Added dev [%d,%d]\n", __func__, MAJOR(devno),
+ MINOR(devno));
+ return 0;
+
+err:
+ unregister_chrdev_region(nxftw_devt, 1);
+ return rc;
+}
+
+void nxftw_file_exit(void)
+{
+ dev_t devno;
+
+ pr_devel("NX-FTW: %s entered\n", __func__);
+
+ cdev_del(&nxftw_device.cdev);
+ devno = MKDEV(MAJOR(nxftw_devt), MINOR(nxftw_devt));
+ device_destroy(nxftw_dbgfs_class, devno);
+
+ class_destroy(nxftw_dbgfs_class);
+ unregister_chrdev_region(nxftw_devt, 1);
+}
+
+
+/*
+ * Create a debugfs entry. Not sure what for yet, though
+ */
+int __init nxftw_debugfs_init(void)
+{
+ struct dentry *ent;
+
+ ent = debugfs_create_dir("nxftw", NULL);
+ if (IS_ERR(ent)) {
+ pr_devel("nxftw: %s(): error creating dbgfs dir\n", __func__);
+ return PTR_ERR(ent);
+ }
+ nxftw_debugfs = ent;
+
+ return 0;
+}
+
+void nxftw_debugfs_exit(void)
+{
+ debugfs_remove_recursive(nxftw_debugfs);
+}
+
+int __init nxftw_init(void)
+{
+ int rc;
+
+ rc = nxftw_file_init();
+ if (rc)
+ return rc;
+
+ rc = nxftw_debugfs_init();
+ if (rc)
+ goto free_file;
+
+ pr_info("NX-FTW Device initialized\n");
+
+ return 0;
+
+free_file:
+ nxftw_file_exit();
+ return rc;
+}
+
+void __exit nxftw_exit(void)
+{
+ pr_devel("NX-FTW Device exiting\n");
+ nxftw_debugfs_exit();
+ nxftw_file_exit();
+}
+
+module_init(nxftw_init);
+module_exit(nxftw_exit);
+
+MODULE_DESCRIPTION("IBM NX Fast Thread Wakeup Device");
+MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
+MODULE_LICENSE("GPL");
--
2.7.4
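
The driver above rejects any open request whose version, flags or reserved
fields it does not understand, which keeps those fields available for future
extensions. The following is a user-space sketch of that check, not the driver
code itself: the struct is a local copy of the layout from the ioctl patch
(15/17), and the const qualifier and size_t loop index are small liberties
taken for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define VAS_FLAGS_PIN_WINDOW	0x1
#define VAS_FLAGS_HIGH_PRI	0x2

/* Local copy of the uapi layout, for illustration only. */
struct vas_tx_win_open_attr {
	int16_t  version;
	int16_t  vas_id;
	uint32_t rx_win_handle;
	int64_t  reserved1;
	int64_t  flags;
	int64_t  reserved2;
	int32_t  tc_mode;
	int32_t  rsvd_txbuf;
	int64_t  reserved3[6];
};

/* Return 0 if the request is acceptable, -EINVAL otherwise. */
int validate_txwin_user_attr(const struct vas_tx_win_open_attr *uattr)
{
	size_t i;

	/* Only version 1 of the structure is understood. */
	if (uattr->version != 1)
		return -EINVAL;

	/* VAS_FLAGS_HIGH_PRI is the only flag accepted from user space. */
	if (uattr->flags & ~(int64_t)VAS_FLAGS_HIGH_PRI)
		return -EINVAL;

	/* Reserved fields must be zero so they can gain meaning later. */
	if (uattr->reserved1 || uattr->reserved2)
		return -EINVAL;

	for (i = 0; i < sizeof(uattr->reserved3) / sizeof(uattr->reserved3[0]); i++) {
		if (uattr->reserved3[i])
			return -EINVAL;
	}

	return 0;
}
```

Because the function returns a negative errno, its return type has to be int;
a bool return type would collapse -EINVAL to 1 and lose the error code.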

2017-08-08 23:08:42

by Sukadev Bhattiprolu

Subject: [PATCH v6 15/17] powerpc/vas: Define window open ioctls API

Define the VAS_TX_WIN_OPEN and VAS_RX_WIN_OPEN ioctl interface. Each user
of VAS, like the NX-FTW driver in a follow-on patch, should implement
these ioctls.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/uapi/asm/vas.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/vas.h b/arch/powerpc/include/uapi/asm/vas.h
index 21249f5..e9730fb 100644
--- a/arch/powerpc/include/uapi/asm/vas.h
+++ b/arch/powerpc/include/uapi/asm/vas.h
@@ -10,6 +10,8 @@
#ifndef _UAPI_MISC_VAS_H
#define _UAPI_MISC_VAS_H

+#include <asm/ioctl.h>
+
/*
* Threshold Control Mode: Have paste operation fail if the number of
* requests in receive FIFO exceeds a threshold.
@@ -22,6 +24,34 @@
#define VAS_THRESH_FIFO_GT_QTR_FULL 2
#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3

+#define VAS_FLAGS_PIN_WINDOW 0x1
+#define VAS_FLAGS_HIGH_PRI 0x2
+
+#define VAS_TX_WIN_OPEN _IOW('v', 1, struct vas_tx_win_open_attr)
+#define VAS_RX_WIN_OPEN _IOW('v', 2, struct vas_rx_win_open_attr)
+
+struct vas_tx_win_open_attr {
+ int16_t version;
+ int16_t vas_id;
+ uint32_t rx_win_handle;
+
+ int64_t reserved1;
+
+ int64_t flags;
+ int64_t reserved2;
+
+ int32_t tc_mode;
+ int32_t rsvd_txbuf;
+ int64_t reserved3[6];
+};
+
+struct vas_rx_win_open_attr {
+ int16_t version;
+ int16_t vas_id;
+ uint32_t rx_win_handle; /* output field */
+ int64_t reserved[8];
+};
+
/*
* Get/Set bit fields
*/
--
2.7.4
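
From user space, the receive-window ioctl defined above might be driven
roughly as follows. This is a sketch under stated assumptions: the struct and
ioctl number are local copies of the definitions in this patch, the helper
name ftw_open_rx_window() is hypothetical, and the mmap() and copy/paste steps
needed on the send side are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Local copies of the definitions added by the patch. */
struct vas_rx_win_open_attr {
	int16_t  version;
	int16_t  vas_id;
	uint32_t rx_win_handle;	/* output field */
	int64_t  reserved[8];
};

#define VAS_RX_WIN_OPEN	_IOW('v', 2, struct vas_rx_win_open_attr)

/*
 * Hypothetical helper: open the device node at @path and request a receive
 * window. On success, return the open file descriptor and store the window
 * handle in @handle. On failure, return a negative errno value.
 */
int ftw_open_rx_window(const char *path, uint32_t *handle)
{
	struct vas_rx_win_open_attr attr;
	int fd, err;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -errno;

	/* Reserved fields must be zero; version 1 is the only one defined. */
	memset(&attr, 0, sizeof(attr));
	attr.version = 1;

	if (ioctl(fd, VAS_RX_WIN_OPEN, &attr) < 0) {
		err = errno;
		close(fd);
		return -err;
	}

	*handle = attr.rx_win_handle;
	return fd;
}
```

The returned handle is what a sender would later place in the rx_win_handle
field of struct vas_tx_win_open_attr. The window stays open for as long as the
file descriptor does, so the caller keeps the descriptor and closes it when
the window is no longer needed.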

2017-08-08 23:07:44

by Sukadev Bhattiprolu

Subject: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

We need the SPRN_TIDR to be set for use with fast thread-wakeup
(core-to-core wakeup). Each thread in a process needs to have a
unique id within the process but as explained below, for now, we
assign globally unique thread ids to all threads in the system.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/asm/processor.h | 4 ++
arch/powerpc/kernel/process.c | 74 ++++++++++++++++++++++++++++++++++++
2 files changed, 78 insertions(+)

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index fab7ff8..bf6ba63 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -232,6 +232,10 @@ struct debug_reg {
struct thread_struct {
unsigned long ksp; /* Kernel stack pointer */

+#ifdef CONFIG_PPC_VAS
+ unsigned long tidr;
+#endif
+
#ifdef CONFIG_PPC64
unsigned long ksp_vsid;
#endif
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 9f3e2c9..6123859 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -1213,6 +1213,16 @@ struct task_struct *__switch_to(struct task_struct *prev,
hard_irq_disable();
}

+#ifdef CONFIG_PPC_VAS
+ mtspr(SPRN_TIDR, new->thread.tidr);
+#endif
+ /*
+ * We can't take a PMU exception inside _switch() since there is a
+ * window where the kernel stack SLB and the kernel stack are out
+ * of sync. Hard disable here.
+ */
+ hard_irq_disable();
+
/*
* Call restore_sprs() before calling _switch(). If we move it after
* _switch() then we miss out on calling it for new tasks. The reason
@@ -1449,9 +1459,70 @@ void flush_thread(void)
#endif /* CONFIG_HAVE_HW_BREAKPOINT */
}

+#ifdef CONFIG_PPC_VAS
+static DEFINE_SPINLOCK(vas_thread_id_lock);
+static DEFINE_IDA(vas_thread_ida);
+
+/*
+ * We need to assign a unique thread id to each thread in a process. This
+ * thread id is intended to be used with the Fast Thread-wakeup (aka Core-
+ * to-core wakeup) mechanism being implemented on top of Virtual Accelerator
+ * Switchboard (VAS).
+ *
+ * To get a unique thread-id per process we could simply use task_pid_nr()
+ * but the problem is that task_pid_nr() is not yet available for the thread
+ * when copy_thread() is called. Fixing that would require more intrusive
+ * changes to arch-neutral code in the copy_process() code path.
+ *
+ * Further, to assign unique thread ids within each process, we need an
+ * atomic field (or an IDR) in task_struct, which again intrudes into the
+ * arch-neutral code.
+ *
+ * So try to assign globally unique thread ids for now.
+ */
+static int assign_thread_id(void)
+{
+ int index;
+ int err;
+
+again:
+ if (!ida_pre_get(&vas_thread_ida, GFP_KERNEL))
+ return -ENOMEM;
+
+ spin_lock(&vas_thread_id_lock);
+ err = ida_get_new_above(&vas_thread_ida, 1, &index);
+ spin_unlock(&vas_thread_id_lock);
+
+ if (err == -EAGAIN)
+ goto again;
+ else if (err)
+ return err;
+
+ if (index > MAX_USER_CONTEXT) {
+ spin_lock(&vas_thread_id_lock);
+ ida_remove(&vas_thread_ida, index);
+ spin_unlock(&vas_thread_id_lock);
+ return -ENOMEM;
+ }
+
+ return index;
+}
+
+static void free_thread_id(int id)
+{
+ spin_lock(&vas_thread_id_lock);
+ ida_remove(&vas_thread_ida, id);
+ spin_unlock(&vas_thread_id_lock);
+}
+#endif /* CONFIG_PPC_VAS */
+
+
void
release_thread(struct task_struct *t)
{
+#ifdef CONFIG_PPC_VAS
+ free_thread_id(t->thread.tidr);
+#endif
}

/*
@@ -1587,6 +1658,9 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
#endif

setup_ksp_vsid(p, sp);
+#ifdef CONFIG_PPC_VAS
+ p->thread.tidr = assign_thread_id();
+#endif

#ifdef CONFIG_PPC64
if (cpu_has_feature(CPU_FTR_DSCR)) {
--
2.7.4
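
The allocation policy used above (hand out the lowest free id starting at 1,
fail once a cap is exceeded, recycle ids on release) can be modelled in user
space as below. This is a single-threaded sketch, not the kernel IDA API: the
fixed-size table and the tiny MAX_THREAD_ID cap are assumptions chosen to keep
the example short, and the spinlock that the patch takes around the IDA calls
is deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* ASSUMPTION: a small illustrative cap standing in for MAX_USER_CONTEXT. */
#define MAX_THREAD_ID	8

/* Slot 0 is never handed out, matching the "start above 0" policy. */
static unsigned char id_in_use[MAX_THREAD_ID + 1];

/* Return the lowest free id in [1, MAX_THREAD_ID], or -ENOMEM if none. */
int assign_thread_id(void)
{
	int id;

	for (id = 1; id <= MAX_THREAD_ID; id++) {
		if (!id_in_use[id]) {
			id_in_use[id] = 1;
			return id;
		}
	}

	return -ENOMEM;
}

/* Release @id; values outside the valid range are ignored. */
void free_thread_id(int id)
{
	if (id >= 1 && id <= MAX_THREAD_ID)
		id_in_use[id] = 0;
}
```

Since the allocator can return a negative errno, a caller that stores the
result in an unsigned field would presumably want to check for failure first;
the range check in free_thread_id() is one way to keep a stray error value
from being released as if it were an id.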

2017-08-08 23:09:33

by Sukadev Bhattiprolu

Subject: [PATCH v6 12/17] powerpc/vas: Define vas_tx_win_open()

Define an interface to open a VAS send window. This interface is
intended to be used by the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.

The receive window, identified by the [vasid, cop] parameters, must
already be open in VAS (i.e connected to an NX engine).

Signed-off-by: Sukadev Bhattiprolu <[email protected]>

---
Changelog[v6]:
- Add support for FTW windows

Changelog[v4]:
- [Ben Herrenschmidt] MMIO regions must be mapped non-cached and
paste regions must be mapped cached. Define/use map_paste_region().

Changelog [v3]:
- Distinguish between hardware PID (SPRN_PID) and Linux pid.
- Use macros rather than enum for threshold-control mode
- Set the pid of send window from attr (needed for user space
send windows).
- Ignore irq port setting for now. They are needed for user space
windows and will be added later
---
arch/powerpc/include/asm/vas.h | 42 ++++++++
arch/powerpc/platforms/powernv/vas-window.c | 157 +++++++++++++++++++++++++++-
2 files changed, 196 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index e1c5376..3fc6435 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -65,6 +65,29 @@ struct vas_rx_win_attr {
};

/*
+ * Window attributes specified by the in-kernel owner of a send window.
+ */
+struct vas_tx_win_attr {
+ enum vas_cop_type cop;
+ int wcreds_max;
+ int lpid;
+ int pidr; /* hardware PID (from SPRN_PID) */
+ int pid; /* linux process id */
+ int pswid;
+ int rsvd_txbuf_count;
+ int tc_mode;
+
+ bool user_win;
+ bool pin_win;
+ bool rej_no_credit;
+ bool rsvd_txbuf_enable;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_win_ord_mode;
+ bool rx_win_ord_mode;
+};
+
+/*
* Return a system-wide unique id for the VAS window @win.
*/
extern uint32_t vas_win_id(struct vas_window *win);
@@ -92,6 +115,25 @@ extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
struct vas_rx_win_attr *attr);

/*
+ * Helper to initialize send window attributes to defaults for an NX window.
+ */
+extern void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr,
+ enum vas_cop_type cop);
+
+/*
+ * Open a VAS send window for the instance of VAS identified by @vasid
+ * and the co-processor type @cop. Use @attr to initialize attributes
+ * of the window.
+ *
+ * Note: The instance of VAS must already have an open receive window for
+ * the coprocessor type @cop.
+ *
+ * Return a handle to the send window or ERR_PTR() on error.
+ */
+struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr);
+
+/*
* Close the send or receive window identified by @win. For receive windows
* return -EAGAIN if there are active send windows attached to this receive
* window.
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 9704a3b..3e2655c 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -72,7 +72,7 @@ static inline void get_uwc_mmio_bar(struct vas_window *window,
* space. Unlike MMIO regions (map_mmio_region() below), paste region must
* be mapped cache-able and is only applicable to send windows.
*/
-void *map_paste_region(struct vas_window *txwin)
+static void *map_paste_region(struct vas_window *txwin)
{
int rc, len;
void *map;
@@ -109,7 +109,6 @@ void *map_paste_region(struct vas_window *txwin)
return ERR_PTR(rc);
}

-
static void *map_mmio_region(char *name, uint64_t start, int len)
{
void *map;
@@ -657,7 +656,7 @@ struct vas_window *get_user_rxwin(struct vas_instance *vinst, uint32_t pswid)
*
* See also function header of set_vinst_win().
*/
-struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+static struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
enum vas_cop_type cop, uint32_t pswid)
{
struct vas_window *rxwin;
@@ -933,6 +932,158 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_rx_win_open);

+void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr, enum vas_cop_type cop)
+{
+ memset(txattr, 0, sizeof(*txattr));
+
+ if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI) {
+ txattr->rej_no_credit = false;
+ txattr->rx_wcred_mode = true;
+ txattr->tx_wcred_mode = true;
+ txattr->rx_win_ord_mode = true;
+ txattr->tx_win_ord_mode = true;
+ }
+}
+EXPORT_SYMBOL_GPL(vas_init_tx_win_attr);
+
+static void init_winctx_for_txwin(struct vas_window *txwin,
+ struct vas_tx_win_attr *txattr,
+ struct vas_winctx *winctx)
+{
+ /*
+ * We first zero all fields and only set non-zero ones. Following
+ * are some fields set to 0/false for the stated reason:
+ *
+ * ->notify_os_intr_reg In powerNV, send intrs to HV
+ * ->rsvd_txbuf_count Not supported yet.
+ * ->notify_disable False for NX windows
+ * ->xtra_write False for NX windows
+ * ->notify_early NA for NX windows
+ * ->lnotify_lpid NA for Tx windows
+ * ->lnotify_pid NA for Tx windows
+ * ->lnotify_tid NA for Tx windows
+ * ->tx_win_cred_mode Ignore for now for NX windows
+ * ->rx_win_cred_mode Ignore for now for NX windows
+ */
+ memset(winctx, 0, sizeof(struct vas_winctx));
+
+ winctx->wcreds_max = txattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
+
+ winctx->user_win = txattr->user_win;
+ winctx->nx_win = txwin->rxwin->nx_win;
+ winctx->pin_win = txattr->pin_win;
+
+ winctx->rx_wcred_mode = txattr->rx_wcred_mode;
+ winctx->tx_wcred_mode = txattr->tx_wcred_mode;
+ winctx->rx_word_mode = txattr->rx_win_ord_mode;
+ winctx->tx_word_mode = txattr->tx_win_ord_mode;
+
+ if (winctx->nx_win) {
+ winctx->data_stamp = true;
+ winctx->intr_disable = true;
+ }
+
+ winctx->lpid = txattr->lpid;
+ winctx->pidr = txattr->pidr;
+ winctx->rx_win_id = txwin->rxwin->winid;
+
+ winctx->dma_type = VAS_DMA_TYPE_INJECT;
+ winctx->tc_mode = txattr->tc_mode;
+ winctx->min_scope = VAS_SCOPE_LOCAL;
+ winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+
+ winctx->pswid = encode_pswid(txwin->vinst->vas_id, txwin->winid);
+}
+
+static bool tx_win_args_valid(enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr)
+{
+ if (attr->tc_mode != VAS_THRESH_DISABLED)
+ return false;
+
+ if (cop > VAS_COP_TYPE_MAX)
+ return false;
+
+ if (attr->user_win &&
+ (cop != VAS_COP_TYPE_FTW || attr->rsvd_txbuf_count))
+ return false;
+
+ return true;
+}
+
+struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr)
+{
+ int rc;
+ struct vas_instance *vinst;
+ struct vas_window *txwin;
+ struct vas_window *rxwin;
+ struct vas_winctx winctx;
+
+ if (!vas_initialized())
+ return ERR_PTR(-EAGAIN);
+
+ if (!tx_win_args_valid(cop, attr))
+ return ERR_PTR(-EINVAL);
+
+ vinst = find_vas_instance(vasid);
+ if (!vinst) {
+ pr_devel("VAS: vasid %d not found!\n", vasid);
+ return ERR_PTR(-EINVAL);
+ }
+
+ rxwin = get_vinst_rxwin(vinst, cop, attr->pswid);
+ if (IS_ERR(rxwin)) {
+ pr_devel("VAS: No RxWin for vasid %d, cop %d\n", vasid, cop);
+ return rxwin;
+ }
+
+ txwin = vas_window_alloc(vinst);
+ if (IS_ERR(txwin)) {
+ rc = PTR_ERR(txwin);
+ goto put_rxwin;
+ }
+
+ txwin->tx_win = 1;
+ txwin->rxwin = rxwin;
+ txwin->nx_win = txwin->rxwin->nx_win;
+ txwin->pid = attr->pid;
+ txwin->user_win = attr->user_win;
+
+ init_winctx_for_txwin(txwin, attr, &winctx);
+
+ init_winctx_regs(txwin, &winctx);
+
+ /*
+ * If its a kernel send window, map the window address into the
+ * kernel's address space. For user windows, user must issue an
+ * mmap() to map the window into their address space.
+ *
+ * NOTE: If kernel ever resubmits a user CRB after handling a page
+ * fault, we will need to map this into kernel as well.
+ */
+ if (!txwin->user_win) {
+ txwin->paste_kaddr = map_paste_region(txwin);
+ if (IS_ERR(txwin->paste_kaddr)) {
+ rc = PTR_ERR(txwin->paste_kaddr);
+ goto free_window;
+ }
+ }
+
+ set_vinst_win(vinst, txwin);
+
+ return txwin;
+
+free_window:
+ vas_window_free(txwin);
+
+put_rxwin:
+ put_rx_win(rxwin);
+ return ERR_PTR(rc);
+
+}
+EXPORT_SYMBOL_GPL(vas_tx_win_open);
+
static void poll_window_busy_state(struct vas_window *window)
{
int busy;
--
2.7.4
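
vas_tx_win_open() returns either a window handle or an error encoded in the
pointer itself, which callers test with IS_ERR() and unpack with PTR_ERR().
The sketch below models that convention in user space. The three helpers
follow the well-known kernel idiom of reserving the top 4095 addresses for
error codes; struct window and open_window() are hypothetical stand-ins, not
part of the VAS interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO	4095

/* Encode a negative errno value as a pointer. */
void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if @ptr falls in the top MAX_ERRNO addresses, i.e. encodes an error. */
int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical stand-in for a window object. */
struct window {
	int id;
};

static struct window the_window = { 42 };

/* Hypothetical stand-in for an open call: only instance 0 exists. */
struct window *open_window(int vasid)
{
	if (vasid != 0)
		return ERR_PTR(-EINVAL);

	return &the_window;
}
```

A caller checks IS_ERR() before touching the handle and propagates PTR_ERR()
otherwise, which is the pattern the FTW driver follows after calling
vas_tx_win_open() and vas_rx_win_open().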

2017-08-08 23:09:52

by Sukadev Bhattiprolu

Subject: [PATCH v6 10/17] powerpc/vas: Define vas_rx_win_open() interface

Define the vas_rx_win_open() interface. This interface is intended to be
used by the Nest Accelerator (NX) driver(s) to set up receive windows for
one or more NX engines (which implement compression/encryption algorithms
in the hardware).

Follow-on patches will provide an interface to close the window and to open
a send window that kernel subsystems can use to access the NX engines.

The interface to open a receive window is expected to be invoked for each
instance of VAS in the system.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---

Changelog[v6]:
- Add support for FTW windows

Changelog[v4]:
- Export the symbols

Changelog[v3]:
- Fault receive windows must enable interrupts and disable
notifications. NX Windows are opposite.
- Use macros rather than enum for threshold-control mode
- Ignore irq_ports for in-kernel windows. They are needed for
user space windows and will be added later
---
arch/powerpc/platforms/powernv/vas-window.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index ff64022..dfa7e67 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -717,7 +717,7 @@ void clear_vinst_win(struct vas_window *window)

mutex_lock(&vinst->mutex);

- if (!window->tx_win) {
+ if (!window->user_win && !window->tx_win) {
WARN_ON_ONCE(!vinst->rxwin[window->cop]);
vinst->rxwin[window->cop] = NULL;
}
--
2.7.4

2017-08-08 23:09:54

by Sukadev Bhattiprolu

Subject: [PATCH v6 09/17] powerpc/vas: Define vas_rx_win_open() interface

Define the vas_rx_win_open() interface. This interface is intended to be
used by the Nest Accelerator (NX) driver(s) to set up receive windows for
one or more NX engines (which implement compression/encryption algorithms
in the hardware).

Follow-on patches will provide an interface to close the window and to open
a send window that kernel subsystems can use to access the NX engines.

The interface to open a receive window is expected to be invoked for each
instance of VAS in the system.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---

Changelog[v6]:
- Add support for FTW windows

Changelog[v4]:
- Export the symbols

Changelog[v3]:
- Fault receive windows must enable interrupts and disable
notifications. NX Windows are opposite.
- Use macros rather than enum for threshold-control mode
- Ignore irq_ports for in-kernel windows. They are needed for
user space windows and will be added later
---
arch/powerpc/include/asm/vas.h | 47 ++++
arch/powerpc/platforms/powernv/vas-window.c | 357 +++++++++++++++++++++++++++-
arch/powerpc/platforms/powernv/vas.h | 14 ++
3 files changed, 417 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 30667db..a3778d7 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -35,6 +35,36 @@ enum vas_cop_type {
};

/*
+ * Receive window attributes specified by the (in-kernel) owner of window.
+ */
+struct vas_rx_win_attr {
+ void *rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+
+ bool pin_win;
+ bool rej_no_credit;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_win_ord_mode;
+ bool rx_win_ord_mode;
+ bool data_stamp;
+ bool nx_win;
+ bool fault_win;
+ bool user_win;
+ bool notify_disable;
+ bool intr_disable;
+ bool notify_early;
+
+ int lnotify_lpid;
+ int lnotify_pid;
+ int lnotify_tid;
+ uint32_t pswid;
+
+ int tc_mode;
+};
+
+/*
* Return a system-wide unique id for the VAS window @win.
*/
extern uint32_t vas_win_id(struct vas_window *win);
@@ -44,4 +74,21 @@ extern uint32_t vas_win_id(struct vas_window *win);
* can map that address into their address space.
*/
extern uint64_t vas_win_paste_addr(struct vas_window *win);
+
+/*
+ * Helper to initialize receive window attributes to defaults for an
+ * NX window.
+ */
+extern void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr,
+ enum vas_cop_type cop);
+
+/*
+ * Open a VAS receive window for the instance of VAS identified by @vasid.
+ * Use @attr to initialize the attributes of the window.
+ *
+ * Return a handle to the window or ERR_PTR() on error.
+ */
+extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_rx_win_attr *attr);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 42c1d4f..ff64022 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -12,6 +12,8 @@
#include <linux/slab.h>
#include <linux/io.h>
#include <linux/log2.h>
+#include <linux/rcupdate.h>
+#include <linux/cred.h>

#include "vas.h"

@@ -544,7 +546,7 @@ void vas_window_free(struct vas_window *window)
vas_release_window_id(&vinst->ida, winid);
}

-struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+static struct vas_window *vas_window_alloc(struct vas_instance *vinst)
{
int winid;
struct vas_window *window;
@@ -570,6 +572,359 @@ struct vas_window *vas_window_alloc(struct vas_instance *vinst)
return ERR_PTR(-ENOMEM);
}

+/*
+ * Check if current task has permissions to pair with the process that
+ * opened the receive window @rxwin. For now the check is based on
+ * kill_ok_by_cred() - i.e equivalent to current task being able to
+ * send a signal to owner of @rxwin.
+ */
+static bool valid_permissions(struct vas_window *rxwin)
+{
+ bool rc;
+ struct task_struct *wtask;
+ const struct cred *txcred, *rxcred;
+
+ rcu_read_lock();
+ wtask = find_task_by_vpid(rxwin->pid);
+
+ /*
+ * CHECK: Don't need to get_task_struct(wtask) since we hold
+ * RCU till we complete the uid checks? Since rxwin is
+ * open, the task has not exited.
+ */
+
+ txcred = current_cred();
+ rxcred = __task_cred(wtask);
+
+ rc = false;
+ if (uid_eq(txcred->euid, rxcred->suid) ||
+ uid_eq(txcred->euid, rxcred->uid) ||
+ uid_eq(txcred->uid, rxcred->suid) ||
+ uid_eq(txcred->uid, rxcred->uid) ||
+ capable(CAP_KILL))
+ rc = true;
+
+ rcu_read_unlock();
+
+ return rc;
+}
+
+/*
+ * Find the user space receive window given the @pswid.
+ *
+ * The pswid, aka rx_win_handle, comes from user space so we should
+ * validate it carefully.
+ * - We must have a valid vasid and it must belong to this instance.
+ * - The window must refer to an OPEN, FTW, RECEIVE window.
+ * - Calling process must have "kill" capabilities to the process
+ * that opened/owns the receive window.
+ * - Anything else?
+ *
+ * NOTE: We access ->windows[] table and assume that vinst->mutex is held.
+ */
+struct vas_window *get_user_rxwin(struct vas_instance *vinst, uint32_t pswid)
+{
+ int vasid, winid;
+ struct vas_window *rxwin;
+
+ decode_pswid(pswid, &vasid, &winid);
+
+ if (vinst->vas_id != vasid)
+ return ERR_PTR(-EINVAL);
+
+ rxwin = vinst->windows[winid];
+
+ if (!rxwin || rxwin->tx_win || rxwin->cop != VAS_COP_TYPE_FTW)
+ return ERR_PTR(-EINVAL);
+
+ if (!valid_permissions(rxwin))
+ return ERR_PTR(-EACCES);
+
+ return rxwin;
+}
+
+/*
+ * Get the VAS receive window associated with NX engine identified
+ * by @cop and if applicable, @pswid.
+ *
+ * See also function header of set_vinst_win().
+ */
+struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+ enum vas_cop_type cop, uint32_t pswid)
+{
+ struct vas_window *rxwin;
+
+ mutex_lock(&vinst->mutex);
+
+ if (cop == VAS_COP_TYPE_FTW)
+ rxwin = get_user_rxwin(vinst, pswid);
+ else
+ rxwin = vinst->rxwin[cop] ?: ERR_PTR(-EINVAL);
+
+ if (!IS_ERR(rxwin))
+ atomic_inc(&rxwin->num_txwins);
+
+ mutex_unlock(&vinst->mutex);
+
+ return rxwin;
+}
+
+/*
+ * We have two tables of windows in a VAS instance. The first one,
+ * ->windows[], contains all the windows in the instance and allows
+ * looking up a window by its id. It is used to look up send windows
+ * during fault handling and receive windows when pairing user space
+ * send/receive windows.
+ *
+ * The second table, ->rxwin[], contains receive windows that are
+ * associated with NX engines. This table has VAS_COP_TYPE_MAX
+ * entries and is used to look up a receive window by its
+ * coprocessor type.
+ *
+ * Here, we save @window in the ->windows[] table. If it is a receive
+ * window, we also save the window in the ->rxwin[] table.
+ */
+static void set_vinst_win(struct vas_instance *vinst,
+ struct vas_window *window)
+{
+ int id = window->winid;
+
+ mutex_lock(&vinst->mutex);
+
+ /*
+ * There should only be one receive window for a coprocessor type
+ * unless it's a user (FTW) window.
+ */
+ if (!window->user_win && !window->tx_win) {
+ WARN_ON_ONCE(vinst->rxwin[window->cop]);
+ vinst->rxwin[window->cop] = window;
+ }
+
+ WARN_ON_ONCE(vinst->windows[id] != NULL);
+ vinst->windows[id] = window;
+
+ mutex_unlock(&vinst->mutex);
+}
+
+/*
+ * Clear this window from the table(s) of windows for this VAS instance.
+ * See also function header of set_vinst_win().
+ */
+void clear_vinst_win(struct vas_window *window)
+{
+ int id = window->winid;
+ struct vas_instance *vinst = window->vinst;
+
+ mutex_lock(&vinst->mutex);
+
+ if (!window->user_win && !window->tx_win) {
+ WARN_ON_ONCE(!vinst->rxwin[window->cop]);
+ vinst->rxwin[window->cop] = NULL;
+ }
+
+ WARN_ON_ONCE(vinst->windows[id] != window);
+ vinst->windows[id] = NULL;
+
+ mutex_unlock(&vinst->mutex);
+}
+
+static void init_winctx_for_rxwin(struct vas_window *rxwin,
+ struct vas_rx_win_attr *rxattr,
+ struct vas_winctx *winctx)
+{
+ /*
+ * We first zero (memset()) all fields and only set non-zero fields.
+ * Following fields are 0/false but maybe deserve a comment:
+ *
+ * ->notify_os_intr_reg In powerNV, send intrs to HV
+ * ->notify_disable False for NX windows
+ * ->intr_disable False for Fault Windows
+ * ->xtra_write False for NX windows
+ * ->notify_early NA for NX windows
+ * ->rsvd_txbuf_count NA for Rx windows
+ * ->lpid, ->pid, ->tid NA for Rx windows
+ */
+
+ memset(winctx, 0, sizeof(struct vas_winctx));
+
+ winctx->rx_fifo = rxattr->rx_fifo;
+ winctx->rx_fifo_size = rxattr->rx_fifo_size;
+ winctx->wcreds_max = rxattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
+ winctx->pin_win = rxattr->pin_win;
+
+ winctx->nx_win = rxattr->nx_win;
+ winctx->fault_win = rxattr->fault_win;
+ winctx->rx_word_mode = rxattr->rx_win_ord_mode;
+ winctx->tx_word_mode = rxattr->tx_win_ord_mode;
+ winctx->rx_wcred_mode = rxattr->rx_wcred_mode;
+ winctx->tx_wcred_mode = rxattr->tx_wcred_mode;
+
+ if (winctx->nx_win) {
+ winctx->data_stamp = true;
+ winctx->intr_disable = true;
+ winctx->pin_win = true;
+
+ WARN_ON_ONCE(winctx->fault_win);
+ WARN_ON_ONCE(!winctx->rx_word_mode);
+ WARN_ON_ONCE(!winctx->tx_word_mode);
+ WARN_ON_ONCE(winctx->notify_after_count);
+ } else if (winctx->fault_win) {
+ winctx->notify_disable = true;
+ } else if (winctx->user_win) {
+ /*
+ * Section 1.8.1 Low Latency Core-Core Wake up of
+ * the VAS workbook:
+ *
+ * - disable credit checks ([tr]x_wcred_mode = false)
+ * - disable FIFO writes
+ * - enable ASB_Notify, disable interrupt
+ */
+ winctx->fifo_disable = true;
+ winctx->intr_disable = true;
+ winctx->rx_fifo = NULL;
+ }
+
+ winctx->lnotify_lpid = rxattr->lnotify_lpid;
+ winctx->lnotify_pid = rxattr->lnotify_pid;
+ winctx->lnotify_tid = rxattr->lnotify_tid;
+ winctx->pswid = rxattr->pswid;
+ winctx->dma_type = VAS_DMA_TYPE_INJECT;
+ winctx->tc_mode = rxattr->tc_mode;
+
+ winctx->min_scope = VAS_SCOPE_LOCAL;
+ winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+}
+
+static bool rx_win_args_valid(enum vas_cop_type cop,
+ struct vas_rx_win_attr *attr)
+{
+ dump_rx_win_attr(attr);
+
+ if (cop >= VAS_COP_TYPE_MAX)
+ return false;
+
+ if (cop != VAS_COP_TYPE_FTW &&
+ attr->rx_fifo_size < VAS_RX_FIFO_SIZE_MIN)
+ return false;
+
+ if (attr->rx_fifo_size > VAS_RX_FIFO_SIZE_MAX)
+ return false;
+
+ if (attr->nx_win) {
+ /* cannot be fault or user window if it is nx */
+ if (attr->fault_win || attr->user_win)
+ return false;
+ /*
+ * Section 3.1.4.32: NX Windows must not disable notification,
+ * and must not enable interrupts or early notification.
+ */
+ if (attr->notify_disable || !attr->intr_disable ||
+ attr->notify_early)
+ return false;
+ } else if (attr->fault_win) {
+ /* cannot be both fault and user window */
+ if (attr->user_win)
+ return false;
+
+ /*
+ * Section 3.1.4.32: Fault windows must disable notification
+ * but not interrupts.
+ */
+ if (!attr->notify_disable || attr->intr_disable)
+ return false;
+
+ } else if (attr->user_win) {
+ /*
+ * User receive windows are only for fast-thread-wakeup
+ * (FTW). They don't need a FIFO and must disable interrupts
+ */
+ if (attr->rx_fifo || attr->rx_fifo_size || !attr->intr_disable)
+ return false;
+ } else {
+ /* Rx window must be one of NX or Fault or User window. */
+ return false;
+ }
+
+ return true;
+}
+
+void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr, enum vas_cop_type cop)
+{
+ memset(rxattr, 0, sizeof(*rxattr));
+
+ if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI) {
+ rxattr->pin_win = true;
+ rxattr->nx_win = true;
+ rxattr->fault_win = false;
+ rxattr->intr_disable = true;
+ rxattr->rx_wcred_mode = true;
+ rxattr->tx_wcred_mode = true;
+ rxattr->rx_win_ord_mode = true;
+ rxattr->tx_win_ord_mode = true;
+ } else if (cop == VAS_COP_TYPE_FAULT) {
+ rxattr->pin_win = true;
+ rxattr->fault_win = true;
+ rxattr->notify_disable = true;
+ rxattr->rx_wcred_mode = true;
+ rxattr->tx_wcred_mode = true;
+ rxattr->rx_win_ord_mode = true;
+ rxattr->tx_win_ord_mode = true;
+ } else if (cop == VAS_COP_TYPE_FTW) {
+ rxattr->user_win = true;
+ rxattr->intr_disable = true;
+
+ /*
+ * As noted in the VAS Workbook we disable credit checks.
+ * If we enable credit checks in the future, we must also
+ * implement a mechanism to return the user credits or new
+ * paste operations will fail.
+ */
+ }
+}
+EXPORT_SYMBOL_GPL(vas_init_rx_win_attr);
+
+struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_rx_win_attr *rxattr)
+{
+ struct vas_instance *vinst;
+ struct vas_window *rxwin;
+ struct vas_winctx winctx;
+
+ if (!vas_initialized())
+ return ERR_PTR(-EAGAIN);
+
+ if (!rx_win_args_valid(cop, rxattr))
+ return ERR_PTR(-EINVAL);
+
+ vinst = find_vas_instance(vasid);
+ if (!vinst) {
+ pr_devel("VAS: vasid %d not found!\n", vasid);
+ return ERR_PTR(-EINVAL);
+ }
+ pr_devel("VAS: Found instance %d\n", vasid);
+
+ rxwin = vas_window_alloc(vinst);
+ if (IS_ERR(rxwin)) {
+ pr_devel("VAS: Unable to allocate memory for Rx window\n");
+ return rxwin;
+ }
+
+ rxwin->tx_win = false;
+ rxwin->nx_win = rxattr->nx_win;
+ rxwin->user_win = rxattr->user_win;
+ rxwin->cop = cop;
+ if (rxattr->user_win)
+ rxwin->pid = task_pid_vnr(current);
+
+ init_winctx_for_rxwin(rxwin, rxattr, &winctx);
+ init_winctx_regs(rxwin, &winctx);
+
+ set_vinst_win(vinst, rxwin);
+
+ return rxwin;
+}
+EXPORT_SYMBOL_GPL(vas_rx_win_open);
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 3eadf90..61fd80f 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -289,6 +289,9 @@ enum vas_notify_after_count {
/*
* One per instance of VAS. Each instance will have a separate set of
* receive windows, one per coprocessor type.
+ *
+ * See also function header of set_vinst_win() for details on ->windows[]
+ * and ->rxwin[] tables.
*/
struct vas_instance {
int vas_id;
@@ -397,6 +400,16 @@ extern struct vas_instance *find_vas_instance(int vasid);
#define VREG(r) VREG_SFX(r, _OFFSET)

#ifdef vas_debug
+static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
+{
+ pr_err("VAS: fault %d, notify %d, intr %d early %d\n",
+ attr->fault_win, attr->notify_disable,
+ attr->intr_disable, attr->notify_early);
+
+ pr_err("VAS: rx_fifo_size %d, max value %d\n",
+ attr->rx_fifo_size, VAS_RX_FIFO_SIZE_MAX);
+}
+
static inline void vas_log_write(struct vas_window *win, char *name,
void *regptr, uint64_t val)
{
@@ -409,6 +422,7 @@ static inline void vas_log_write(struct vas_window *win, char *name,
#else /* vas_debug */

#define vas_log_write(win, name, reg, val)
+#define dump_rx_win_attr(attr)

#endif /* vas_debug */

--
2.7.4
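
The window-type rules enforced by rx_win_args_valid() in the patch above can be modelled in user space. The following is a minimal sketch, not the kernel code: struct rx_attr_model and rx_attr_model_valid() are hypothetical names, the FIFO pointer and the FIFO size bounds are left out, and only the mutual-exclusion and notify/interrupt rules are mirrored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct vas_rx_win_attr. */
struct rx_attr_model {
	bool nx_win, fault_win, user_win;
	bool notify_disable, intr_disable, notify_early;
	size_t rx_fifo_size;	/* bytes; 0 means "no FIFO" */
};

/* Mirrors the window-type rules of rx_win_args_valid(); FIFO bounds omitted. */
static bool rx_attr_model_valid(const struct rx_attr_model *a)
{
	if (a->nx_win) {
		/* NX: not fault/user, notification on, interrupts and early notify off */
		if (a->fault_win || a->user_win)
			return false;
		return !a->notify_disable && a->intr_disable && !a->notify_early;
	}
	if (a->fault_win) {
		/* Fault: not user, notification off, interrupts on */
		if (a->user_win)
			return false;
		return a->notify_disable && !a->intr_disable;
	}
	if (a->user_win) {
		/* User (FTW): no FIFO, interrupts off */
		return a->rx_fifo_size == 0 && a->intr_disable;
	}
	return false;	/* must be one of NX, fault or user */
}
```

An attribute set with none of the three window types selected is rejected, which matches the final else branch in the patch.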

2017-08-08 23:07:32

by Sukadev Bhattiprolu

Subject: [PATCH v6 06/17] powerpc/vas: Define helpers to alloc/free windows

Define helpers to allocate/free VAS window objects. These will
be used in follow-on patches when opening/closing windows.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/platforms/powernv/vas-window.c | 70 +++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 3a50d6a..9c12919 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -490,6 +490,76 @@ int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
return 0;
}

+DEFINE_SPINLOCK(vas_ida_lock);
+
+void vas_release_window_id(struct ida *ida, int winid)
+{
+ spin_lock(&vas_ida_lock);
+ ida_remove(ida, winid);
+ spin_unlock(&vas_ida_lock);
+}
+
+int vas_assign_window_id(struct ida *ida)
+{
+ int rc, winid;
+
+ rc = ida_pre_get(ida, GFP_KERNEL);
+ if (!rc)
+ return -EAGAIN;
+
+ spin_lock(&vas_ida_lock);
+ rc = ida_get_new_above(ida, 0, &winid);
+ spin_unlock(&vas_ida_lock);
+
+ if (rc)
+ return rc;
+
+ if (winid >= VAS_WINDOWS_PER_CHIP) {
+ pr_err("VAS: Too many (%d) open windows\n", winid);
+ vas_release_window_id(ida, winid);
+ return -EAGAIN;
+ }
+
+ return winid;
+}
+
+void vas_window_free(struct vas_window *window)
+{
+ int winid = window->winid;
+ struct vas_instance *vinst = window->vinst;
+
+ unmap_winctx_mmio_bars(window);
+ kfree(window);
+
+ vas_release_window_id(&vinst->ida, winid);
+}
+
+struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+{
+ int winid;
+ struct vas_window *window;
+
+ winid = vas_assign_window_id(&vinst->ida);
+ if (winid < 0)
+ return ERR_PTR(winid);
+
+ window = kzalloc(sizeof(*window), GFP_KERNEL);
+ if (!window)
+ return ERR_PTR(-ENOMEM);
+
+ window->vinst = vinst;
+ window->winid = winid;
+
+ if (map_winctx_mmio_bars(window))
+ goto out_free;
+
+ return window;
+
+out_free:
+ kfree(window);
+ return ERR_PTR(-ENOMEM);
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
--
2.7.4
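
The id handling in vas_assign_window_id() and vas_release_window_id() can be illustrated with a small user-space model. This is a sketch under stated assumptions: it swaps the kernel IDA for a plain boolean array, MODEL_WINDOWS_PER_CHIP is a hypothetical and deliberately tiny limit, and valid ids are assumed to lie in the half-open range [0, limit).

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_WINDOWS_PER_CHIP 4	/* hypothetical; tiny so exhaustion is easy to see */

/* Stand-in for the per-instance IDA: one flag per window id. */
struct winid_pool {
	bool used[MODEL_WINDOWS_PER_CHIP];
};

/* Return the lowest free id in [0, MODEL_WINDOWS_PER_CHIP), or -EAGAIN if none. */
static int winid_assign(struct winid_pool *pool)
{
	for (int id = 0; id < MODEL_WINDOWS_PER_CHIP; id++) {
		if (!pool->used[id]) {
			pool->used[id] = true;
			return id;
		}
	}
	return -EAGAIN;
}

/* Give an id back to the pool; out-of-range ids are ignored. */
static void winid_release(struct winid_pool *pool, int id)
{
	if (id >= 0 && id < MODEL_WINDOWS_PER_CHIP)
		pool->used[id] = false;
}
```

A caller that obtains an id and then fails a later setup step would be expected to call winid_release() before returning, so that the id becomes available again.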

2017-08-08 23:10:48

by Sukadev Bhattiprolu

Subject: [PATCH v6 05/17] powerpc/vas: Define helpers to init window context

Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.
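
As one concrete case, init_winctx_regs() in this patch programs the LFIFO_SIZE field with the base-2 logarithm of the receive FIFO size in kilobytes. The sketch below reproduces that arithmetic in user space. model_ilog2() and fifo_size_field() are hypothetical names, and model_ilog2() is only a stand-in for the kernel's ilog2(); sizes below 1 KB are outside what the patch expects.

```c
#include <stdint.h>

/* Integer log2 of a non-zero value; user-space stand-in for the kernel's ilog2(). */
static unsigned int model_ilog2(uint32_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/* Encode a FIFO size in bytes the way init_winctx_regs() does: log2(size / 1 KB). */
static unsigned int fifo_size_field(uint32_t bytes)
{
	return model_ilog2(bytes / 1024);
}
```

For example, a 32 KB FIFO gives 32768 / 1024 = 32, and log2(32) = 5, so the field value is 5.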

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v6]
- Add support for FTW windows and drop the fault window id
code since it is not needed for FTW/kernel windows.
Changelog[v5]
- Fix: Copy the FIFO address into LFIFO_BAR register as is (don't
shift address into bits 8:53).

Changelog[v4]
- [Michael Neuling] Use ilog2(), radix_enabled() helpers;
drop warning when 32-bit app uses VAS (a follow-on patch
will check and return error). Set MSR_PR state to 0 for
kernel (rather than reading from MSR).

Changelog[v3]
- Have caller, rather than init_xlate_regs() reset window regs
so we don't reset any settings caller may already have set.
- Translation mode should be 0x3 (0b11) not 0x11.
- Skip initializing read-only registers NX_UTIL and NX_UTIL_SE
- Skip initializing adder registers from UWC - they are already
initialized from the HVWC.
- Check winctx->user_win when setting translation registers
---
arch/powerpc/platforms/powernv/vas-window.c | 305 ++++++++++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 55 +++++
2 files changed, 360 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index a3a705a..3a50d6a 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -11,6 +11,7 @@
#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/io.h>
+#include <linux/log2.h>

#include "vas.h"

@@ -185,6 +186,310 @@ int map_winctx_mmio_bars(struct vas_window *window)
return 0;
}

+/*
+ * Reset all valid registers in the HV and OS/User Window Contexts for
+ * the window identified by @window.
+ *
+ * NOTE: We cannot really use a for loop to reset window context. Not all
+ * offsets in a window context are valid registers and the valid
+ * registers are not sequential. And, we can only write to offsets
+ * with valid registers (or is that only in Simics?).
+ */
+void reset_window_regs(struct vas_window *window)
+{
+ write_hvwc_reg(window, VREG(LPID), 0ULL);
+ write_hvwc_reg(window, VREG(PID), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_MSR), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_LPCR), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(AMR), 0ULL);
+ write_hvwc_reg(window, VREG(SEIDR), 0ULL);
+ write_hvwc_reg(window, VREG(FAULT_TX_WIN), 0ULL);
+ write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
+ write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), 0ULL);
+ write_hvwc_reg(window, VREG(PSWID), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE1), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE2), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE3), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE4), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE5), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE6), 0ULL);
+ write_hvwc_reg(window, VREG(LFIFO_BAR), 0ULL);
+ write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
+ write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LRX_WCRED), 0ULL);
+ write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(LFIFO_SIZE), 0ULL);
+ write_hvwc_reg(window, VREG(WINCTL), 0ULL);
+ write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(TX_RSVD_BUF_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_PID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_LPID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_TID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), 0ULL);
+ write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
+
+ /* Skip read-only registers: NX_UTIL and NX_UTIL_SE */
+
+ /*
+ * The send and receive window credit adder registers are also
+ * accessible from HVWC and have been initialized above. We don't
+ * need to initialize from the OS/User Window Context, so skip
+ * following calls:
+ *
+ * write_uwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+ * write_uwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ */
+}
+
+/*
+ * Initialize window context registers related to Address Translation.
+ * These registers are common to send/receive windows although they
+ * differ for user/kernel windows. As we resolve the TODOs we may
+ * want to add fields to vas_winctx and move the initialization to
+ * init_winctx_regs().
+ */
+static void init_xlate_regs(struct vas_window *window, bool user_win)
+{
+ uint64_t lpcr, val;
+
+ /*
+ * MSR_TA, MSR_US are false for both kernel and user.
+ * MSR_DR and MSR_PR are false for kernel.
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_XLATE_MSR_HV, val, true);
+ val = SET_FIELD(VAS_XLATE_MSR_SF, val, true);
+ if (user_win) {
+ val = SET_FIELD(VAS_XLATE_MSR_DR, val, true);
+ val = SET_FIELD(VAS_XLATE_MSR_PR, val, true);
+ }
+ write_hvwc_reg(window, VREG(XLATE_MSR), val);
+
+ lpcr = mfspr(SPRN_LPCR);
+ val = 0ULL;
+ /*
+ * NOTE: From Section 5.7.6.1 Segment Lookaside Buffer of the
+ * Power ISA, v2.07, Page size encoding is 0 = 4KB, 5 = 64KB.
+ *
+ * NOTE: From Section 1.3.1, Address Translation Context of the
+ * Nest MMU Workbook, LPCR_SC should be 0 for Power9.
+ */
+ val = SET_FIELD(VAS_XLATE_LPCR_PAGE_SIZE, val, 5);
+ val = SET_FIELD(VAS_XLATE_LPCR_ISL, val, lpcr & LPCR_ISL);
+ val = SET_FIELD(VAS_XLATE_LPCR_TC, val, lpcr & LPCR_TC);
+ val = SET_FIELD(VAS_XLATE_LPCR_SC, val, 0);
+ write_hvwc_reg(window, VREG(XLATE_LPCR), val);
+
+ /*
+ * Section 1.3.1 (Address translation Context) of NMMU workbook.
+ * 0b00 Hashed Page Table mode
+ * 0b01 Reserved
+ * 0b10 Radix on HPT
+ * 0b11 Radix on Radix
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_XLATE_MODE, val, radix_enabled() ? 3 : 2);
+ write_hvwc_reg(window, VREG(XLATE_CTL), val);
+
+ /*
+ * TODO: Can we mfspr(AMR) even for user windows?
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_AMR, val, mfspr(SPRN_AMR));
+ write_hvwc_reg(window, VREG(AMR), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_SEIDR, val, 0);
+ write_hvwc_reg(window, VREG(SEIDR), val);
+}
+
+/*
+ * Initialize Reserved Send Buffer Count for the send window. It involves
+ * writing to the register, reading it back to confirm that the hardware
+ * has enough buffers to reserve. See section 1.3.1.2.1 of VAS workbook.
+ *
+ * Since we can only make a best-effort attempt to fulfill the request,
+ * we don't return any errors if we cannot.
+ *
+ * TODO: Reserved (aka dedicated) send buffers are not supported yet.
+ */
+static void init_rsvd_tx_buf_count(struct vas_window *txwin,
+ struct vas_winctx *winctx)
+{
+ write_hvwc_reg(txwin, VREG(TX_RSVD_BUF_COUNT), 0ULL);
+}
+
+/*
+ * init_winctx_regs()
+ * Initialize window context registers for a receive window.
+ * Except for caching control and marking window open, the registers
+ * are initialized in the order listed in Section 3.1.4 (Window Context
+ * Cache Register Details) of the VAS workbook although they don't need
+ * to be.
+ *
+ * Design note: For NX receive windows, NX allocates the FIFO buffer in OPAL
+ * (so that it can get a large contiguous area) and passes that buffer
+ * to kernel via device tree. We now write that buffer address to the
+ * FIFO BAR. Would it make sense to do this all in OPAL? i.e have OPAL
+ * write the per-chip RX FIFO addresses to the windows during boot-up
+ * as a one-time task? That could work for NX but what about other
+ * receivers? Let the receivers tell us the rx-fifo buffers for now.
+ */
+int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
+{
+ uint64_t val;
+ int fifo_size;
+
+ reset_window_regs(window);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LPID, val, winctx->lpid);
+ write_hvwc_reg(window, VREG(LPID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_PID_ID, val, winctx->pidr);
+ write_hvwc_reg(window, VREG(PID), val);
+
+ init_xlate_regs(window, winctx->user_win);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_FAULT_TX_WIN, val, 0);
+ write_hvwc_reg(window, VREG(FAULT_TX_WIN), val);
+
+ /* In PowerNV, interrupts go to HV. */
+ write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_HV_INTR_SRC_RA, val, winctx->irq_port);
+ write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_PSWID_EA_HANDLE, val, winctx->pswid);
+ write_hvwc_reg(window, VREG(PSWID), val);
+
+ write_hvwc_reg(window, VREG(SPARE1), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE2), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE3), 0ULL);
+
+ /*
+ * NOTE: VAS expects the FIFO address to be copied into the LFIFO_BAR
+ * register as is - do NOT shift the address into VAS_LFIFO_BAR
+ * bit fields! Ok to set the page migration select fields -
+ * VAS ignores the lower 10+ bits in the address anyway, because
+ * the minimum FIFO size is 1K?
+ *
+ * See also: Design note in function header.
+ */
+ val = __pa(winctx->rx_fifo);
+ val = SET_FIELD(VAS_PAGE_MIGRATION_SELECT, val, 0);
+ write_hvwc_reg(window, VREG(LFIFO_BAR), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LDATA_STAMP, val, winctx->data_stamp);
+ write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LDMA_TYPE, val, winctx->dma_type);
+ val = SET_FIELD(VAS_LDMA_FIFO_DISABLE, val, winctx->fifo_disable);
+ write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), val);
+
+ write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
+ write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LRX_WCRED, val, winctx->wcreds_max);
+ write_hvwc_reg(window, VREG(LRX_WCRED), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_TX_WCRED, val, winctx->wcreds_max);
+ write_hvwc_reg(window, VREG(TX_WCRED), val);
+
+ write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+
+ fifo_size = winctx->rx_fifo_size / 1024;
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LFIFO_SIZE, val, ilog2(fifo_size));
+ write_hvwc_reg(window, VREG(LFIFO_SIZE), val);
+
+ /* Update window control and caching control registers last so
+ * we mark the window open only after fully initializing it and
+ * pushing context to cache.
+ */
+
+ write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
+
+ init_rsvd_tx_buf_count(window, winctx);
+
+ /* for a send window, point to the matching receive window */
+ val = 0ULL;
+ val = SET_FIELD(VAS_LRX_WIN_ID, val, winctx->rx_win_id);
+ write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), val);
+
+ write_hvwc_reg(window, VREG(SPARE4), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_NOTIFY_DISABLE, val, winctx->notify_disable);
+ val = SET_FIELD(VAS_INTR_DISABLE, val, winctx->intr_disable);
+ val = SET_FIELD(VAS_NOTIFY_EARLY, val, winctx->notify_early);
+ val = SET_FIELD(VAS_NOTIFY_OSU_INTR, val, winctx->notify_os_intr_reg);
+ write_hvwc_reg(window, VREG(LNOTIFY_CTL), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_PID, val, winctx->lnotify_pid);
+ write_hvwc_reg(window, VREG(LNOTIFY_PID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_LPID, val, winctx->lnotify_lpid);
+ write_hvwc_reg(window, VREG(LNOTIFY_LPID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_TID, val, winctx->lnotify_tid);
+ write_hvwc_reg(window, VREG(LNOTIFY_TID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_MIN_SCOPE, val, winctx->min_scope);
+ val = SET_FIELD(VAS_LNOTIFY_MAX_SCOPE, val, winctx->max_scope);
+ write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), val);
+
+ /* Skip read-only registers NX_UTIL and NX_UTIL_SE */
+
+ write_hvwc_reg(window, VREG(SPARE5), 0ULL);
+ write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE6), 0ULL);
+
+ /* Finally, push window context to memory and... */
+ val = 0ULL;
+ val = SET_FIELD(VAS_PUSH_TO_MEM, val, 1);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
+
+ /* ... mark the window open for business */
+ val = 0ULL;
+ val = SET_FIELD(VAS_WINCTL_REJ_NO_CREDIT, val, winctx->rej_no_credit);
+ val = SET_FIELD(VAS_WINCTL_PIN, val, winctx->pin_win);
+ val = SET_FIELD(VAS_WINCTL_TX_WCRED_MODE, val, winctx->tx_wcred_mode);
+ val = SET_FIELD(VAS_WINCTL_RX_WCRED_MODE, val, winctx->rx_wcred_mode);
+ val = SET_FIELD(VAS_WINCTL_TX_WORD_MODE, val, winctx->tx_word_mode);
+ val = SET_FIELD(VAS_WINCTL_RX_WORD_MODE, val, winctx->rx_word_mode);
+ val = SET_FIELD(VAS_WINCTL_FAULT_WIN, val, winctx->fault_win);
+ val = SET_FIELD(VAS_WINCTL_NX_WIN, val, winctx->nx_win);
+ val = SET_FIELD(VAS_WINCTL_OPEN, val, 1);
+ write_hvwc_reg(window, VREG(WINCTL), val);
+
+ return 0;
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 150d7b1..7b2bcd0 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -12,6 +12,7 @@
#include <linux/atomic.h>
#include <linux/idr.h>
#include <asm/vas.h>
+#include <linux/io.h>

/*
* Overview of Virtual Accelerator Switchboard (VAS).
@@ -385,4 +386,58 @@ struct vas_winctx {
extern bool vas_initialized(void);
extern struct vas_instance *find_vas_instance(int vasid);

+/*
+ * VREG(x):
+ * Expand a register's short name (eg: LPID) into two parameters:
+ * - the register's short name in string form ("LPID"), and
+ * - the name of the macro (eg: VAS_LPID_OFFSET), defining the
+ * register's offset in the window context
+ */
+#define VREG_SFX(n, s) __stringify(n), VAS_##n##s
+#define VREG(r) VREG_SFX(r, _OFFSET)
+
+#ifdef vas_debug
+static inline void vas_log_write(struct vas_window *win, char *name,
+ void *regptr, uint64_t val)
+{
+ if (val)
+ pr_err("%swin #%d: %s reg %p, val 0x%016llx\n",
+ win->tx_win ? "Tx" : "Rx", win->winid, name,
+ regptr, val);
+}
+
+#else /* vas_debug */
+
+#define vas_log_write(win, name, reg, val)
+
+#endif /* vas_debug */
+
+static inline void write_uwc_reg(struct vas_window *win, char *name,
+ int32_t reg, uint64_t val)
+{
+ void *regptr;
+
+ regptr = win->uwc_map + reg;
+ vas_log_write(win, name, regptr, val);
+
+ out_be64(regptr, val);
+}
+
+static inline void write_hvwc_reg(struct vas_window *win, char *name,
+ int32_t reg, uint64_t val)
+{
+ void *regptr;
+
+ regptr = win->hvwc_map + reg;
+ vas_log_write(win, name, regptr, val);
+
+ out_be64(regptr, val);
+}
+
+static inline uint64_t read_hvwc_reg(struct vas_window *win,
+ char *name __maybe_unused, int32_t reg)
+{
+ return in_be64(win->hvwc_map+reg);
+}
+
#endif /* _VAS_H */
--
2.7.4
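
The VREG() macro added to vas.h above expands one short register name into two arguments: the name as a string and the matching offset macro. A minimal user-space sketch of that expansion follows. The offset values, MODEL_STR() (a stand-in for the kernel's __stringify()), struct model_win and the model_* helpers are all hypothetical, and the store is host-endian rather than the big-endian out_be64() used by the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical offsets, for illustration only. */
#define VAS_LPID_OFFSET 0x010
#define VAS_PID_OFFSET  0x018

#define MODEL_STR_1(x) #x
#define MODEL_STR(x)   MODEL_STR_1(x)	/* stand-in for the kernel's __stringify() */

/* VREG(LPID) expands to: "LPID", VAS_LPID_OFFSET */
#define VREG_SFX(n, s) MODEL_STR(n), VAS_##n##s
#define VREG(r)        VREG_SFX(r, _OFFSET)

/* Fake window context: a byte array instead of an ioremap()ed region. */
struct model_win {
	uint8_t ctx[0x100];
	const char *last_name;	/* name of the most recently written register */
};

/* Same parameter shape as write_hvwc_reg(): name and offset both come from VREG(). */
static void model_write_reg(struct model_win *win, const char *name,
			    int32_t reg, uint64_t val)
{
	win->last_name = name;
	memcpy(win->ctx + reg, &val, sizeof(val));	/* host-endian store */
}

/* Read back the 64-bit value stored at offset reg. */
static uint64_t model_read_reg(const struct model_win *win, int32_t reg)
{
	uint64_t val;

	memcpy(&val, win->ctx + reg, sizeof(val));
	return val;
}
```

A call such as model_write_reg(&win, VREG(LPID), 42) therefore passes four arguments, which is why the write helpers in the patch take both a name and an offset.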

2017-08-08 23:11:02

by Sukadev Bhattiprolu

Subject: [PATCH v6 04/17] powerpc/vas: Define helpers to access MMIO regions

Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog [v6]:
- Minor reorg to make setup/cleanup functions more symmetric

Changelog [v5]:
- [Ben Herrenschmidt]: Need cacheable mapping for paste regions
and non-cacheable mapping for the MMIO regions. So, just use
ioremap() for mapping the MMIO regions; use "winctx" instead
of "wc" to avoid collision with "write combine".

Changelog [v3]:
- Minor reorg/cleanup of map/unmap functions

Changelog [v2]:
- Get HVWC, UWC and paste addresses from window->vinst (i.e DT)
rather than kernel macros.
---
arch/powerpc/platforms/powernv/vas-window.c | 173 ++++++++++++++++++++++++++++
1 file changed, 173 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 6156fbe..a3a705a 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -9,9 +9,182 @@

#include <linux/types.h>
#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/io.h>

#include "vas.h"

+/*
+ * Compute the paste address region for the window @window using the
+ * ->paste_base_addr and ->paste_win_id_shift we got from device tree.
+ */
+void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
+{
+ uint64_t base, shift;
+ int winid;
+
+ base = window->vinst->paste_base_addr;
+ shift = window->vinst->paste_win_id_shift;
+ winid = window->winid;
+
+ *addr = base + ((uint64_t)winid << shift);
+ if (len)
+ *len = PAGE_SIZE;
+
+ pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
+}
+
+static inline void get_hvwc_mmio_bar(struct vas_window *window,
+ uint64_t *start, int *len)
+{
+ uint64_t pbaddr;
+
+ pbaddr = window->vinst->hvwc_bar_start;
+ *start = pbaddr + window->winid * VAS_HVWC_SIZE;
+ *len = VAS_HVWC_SIZE;
+}
+
+static inline void get_uwc_mmio_bar(struct vas_window *window,
+ uint64_t *start, int *len)
+{
+ uint64_t pbaddr;
+
+ pbaddr = window->vinst->uwc_bar_start;
+ *start = pbaddr + window->winid * VAS_UWC_SIZE;
+ *len = VAS_UWC_SIZE;
+}
+
+/*
+ * Map the paste bus address of the given send window into kernel address
+ * space. Unlike MMIO regions (map_mmio_region() below), the paste region
+ * must be mapped cacheable and applies only to send windows.
+ */
+void *map_paste_region(struct vas_window *txwin)
+{
+ int rc, len;
+ void *map;
+ char *name;
+ uint64_t start;
+
+ rc = -ENOMEM;
+ name = kasprintf(GFP_KERNEL, "window-v%d-w%d", txwin->vinst->vas_id,
+ txwin->winid);
+ if (!name)
+ return ERR_PTR(rc);
+
+ txwin->paste_addr_name = name;
+ compute_paste_address(txwin, &start, &len);
+
+ if (!request_mem_region(start, len, name)) {
+ pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+ __func__, start, len);
+ goto free_name;
+ }
+
+ map = ioremap_cache(start, len);
+ if (!map) {
+ pr_devel("%s(): ioremap_cache(0x%llx, %d) failed\n", __func__,
+ start, len);
+ goto free_name;
+ }
+
+ pr_devel("VAS: mapped paste addr 0x%llx to kaddr 0x%p\n", start, map);
+ return map;
+
+free_name:
+ kfree(name);
+ return ERR_PTR(rc);
+}
+
+
+static void *map_mmio_region(char *name, uint64_t start, int len)
+{
+ void *map;
+
+ if (!request_mem_region(start, len, name)) {
+ pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+ __func__, start, len);
+ return NULL;
+ }
+
+ map = ioremap(start, len);
+ if (!map) {
+ pr_devel("%s(): ioremap(0x%llx, %d) failed\n", __func__, start,
+ len);
+ return NULL;
+ }
+
+ return map;
+}
+
+static void unmap_region(void *addr, uint64_t start, int len)
+{
+ iounmap(addr);
+ release_mem_region((phys_addr_t)start, len);
+}
+
+/*
+ * Unmap the paste address region for a window.
+ */
+void unmap_paste_region(struct vas_window *window)
+{
+ int len;
+ uint64_t busaddr_start;
+
+ if (window->paste_kaddr) {
+ compute_paste_address(window, &busaddr_start, &len);
+ unmap_region(window->paste_kaddr, busaddr_start, len);
+ window->paste_kaddr = NULL;
+ kfree(window->paste_addr_name);
+ window->paste_addr_name = NULL;
+ }
+}
+
+/*
+ * Unmap the MMIO regions for a window.
+ */
+static void unmap_winctx_mmio_bars(struct vas_window *window)
+{
+ int len;
+ uint64_t busaddr_start;
+
+ if (window->hvwc_map) {
+ get_hvwc_mmio_bar(window, &busaddr_start, &len);
+ unmap_region(window->hvwc_map, busaddr_start, len);
+ window->hvwc_map = NULL;
+ }
+
+ if (window->uwc_map) {
+ get_uwc_mmio_bar(window, &busaddr_start, &len);
+ unmap_region(window->uwc_map, busaddr_start, len);
+ window->uwc_map = NULL;
+ }
+}
+
+/*
+ * Find the Hypervisor Window Context (HVWC) MMIO Base Address Region and the
+ * OS/User Window Context (UWC) MMIO Base Address Region for the given window.
+ * Map these bus addresses and save the mapped kernel addresses in @window.
+ */
+int map_winctx_mmio_bars(struct vas_window *window)
+{
+ int len;
+ uint64_t start;
+
+ get_hvwc_mmio_bar(window, &start, &len);
+ window->hvwc_map = map_mmio_region("HVWCM_Window", start, len);
+
+ get_uwc_mmio_bar(window, &start, &len);
+ window->uwc_map = map_mmio_region("UWCM_Window", start, len);
+
+ if (!window->hvwc_map || !window->uwc_map) {
+ unmap_winctx_mmio_bars(window);
+ return -1;
+ }
+
+ return 0;
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
--
2.7.4

2017-08-08 23:11:42

by Sukadev Bhattiprolu

Subject: [PATCH v6 02/17] powerpc/vas: Move GET_FIELD/SET_FIELD to vas.h

Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
users of VAS, including NX-842 can use those macros.

There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency switch the order of parameters in
SET_FIELD to match the order in skiboot.

Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Reviewed-by: Dan Streetman <[email protected]>
---

Changelog[v3]
- Fix order of parameters in nx-842 driver.
---
arch/powerpc/include/uapi/asm/vas.h | 8 ++++++++
drivers/crypto/nx/nx-842-powernv.c | 7 ++++---
drivers/crypto/nx/nx-842.h | 5 -----
3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/vas.h b/arch/powerpc/include/uapi/asm/vas.h
index ddfe046..21249f5 100644
--- a/arch/powerpc/include/uapi/asm/vas.h
+++ b/arch/powerpc/include/uapi/asm/vas.h
@@ -22,4 +22,12 @@
#define VAS_THRESH_FIFO_GT_QTR_FULL 2
#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3

+/*
+ * Get/Set bit fields
+ */
+#define GET_FIELD(m, v) (((v) & (m)) >> MASK_LSH(m))
+#define MASK_LSH(m) (__builtin_ffsl(m) - 1)
+#define SET_FIELD(m, v, val) \
+ (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_LSH(m)) & (m)))
+
#endif /* _UAPI_MISC_VAS_H */
diff --git a/drivers/crypto/nx/nx-842-powernv.c b/drivers/crypto/nx/nx-842-powernv.c
index 1710f80..3abb045 100644
--- a/drivers/crypto/nx/nx-842-powernv.c
+++ b/drivers/crypto/nx/nx-842-powernv.c
@@ -22,6 +22,7 @@

#include <asm/prom.h>
#include <asm/icswx.h>
+#include <asm/vas.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Dan Streetman <[email protected]>");
@@ -424,9 +425,9 @@ static int nx842_powernv_function(const unsigned char *in, unsigned int inlen,

/* set up CCW */
ccw = 0;
- ccw = SET_FIELD(ccw, CCW_CT, nx842_ct);
- ccw = SET_FIELD(ccw, CCW_CI_842, 0); /* use 0 for hw auto-selection */
- ccw = SET_FIELD(ccw, CCW_FC_842, fc);
+ ccw = SET_FIELD(CCW_CT, ccw, nx842_ct);
+ ccw = SET_FIELD(CCW_CI_842, ccw, 0); /* use 0 for hw auto-selection */
+ ccw = SET_FIELD(CCW_FC_842, ccw, fc);

/* set up CRB's CSB addr */
csb_addr = nx842_get_pa(csb) & CRB_CSB_ADDRESS;
diff --git a/drivers/crypto/nx/nx-842.h b/drivers/crypto/nx/nx-842.h
index a4eee3b..30929bd 100644
--- a/drivers/crypto/nx/nx-842.h
+++ b/drivers/crypto/nx/nx-842.h
@@ -100,11 +100,6 @@ static inline unsigned long nx842_get_pa(void *addr)
return page_to_phys(vmalloc_to_page(addr)) + offset_in_page(addr);
}

-/* Get/Set bit fields */
-#define MASK_LSH(m) (__builtin_ffsl(m) - 1)
-#define GET_FIELD(v, m) (((v) & (m)) >> MASK_LSH(m))
-#define SET_FIELD(v, m, val) (((v) & ~(m)) | (((val) << MASK_LSH(m)) & (m)))
-
/**
* This provides the driver's constraints. Different nx842 implementations
* may have varying requirements. The constraints are:
--
2.7.4

2017-08-14 06:05:45

by Nicholas Piggin

Subject: Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

On Mon, 14 Aug 2017 15:21:48 +1000
Michael Ellerman <[email protected]> wrote:

> Sukadev Bhattiprolu <[email protected]> writes:

> > arch/powerpc/include/asm/vas.h | 35 ++++
> > arch/powerpc/include/uapi/asm/vas.h | 25 +++
>
> I thought we weren't exposing VAS to userspace yet?
>
> If we are then we need to get things straight WRT copy/paste abort.

No we should not be. This might be just a leftover hunk that should
be moved to a future series.

At the moment (as far as I understand) it should be limited to
preempt-disabled, process context, kernel users which avoids any
concern for switch_to.

Thanks,
Nick

2017-08-14 06:53:23

by Michael Ellerman

Subject: Re: [PATCH v6 16/17] powerpc/vas: Implement a simple FTW driver

Hi Suka,

Some comments inline ...


Sukadev Bhattiprolu <[email protected]> writes:

> The Fast Thread Wake-up (FTW) driver provides user space applications an
> interface to the Core-to-Core functionality in POWER9. The driver provides
> the device node/ioctl API to applications and uses the external interfaces
> to the VAS driver to interact with the VAS hardware.
>
> A follow-on patch provides detailed description of the API for the driver.
>
> Signed-off-by: Sukadev Bhattiprolu <[email protected]>
> ---
> MAINTAINERS | 1 +
> arch/powerpc/platforms/powernv/Kconfig | 16 ++
> arch/powerpc/platforms/powernv/Makefile | 1 +
> arch/powerpc/platforms/powernv/nx-ftw.c | 486 ++++++++++++++++++++++++++++++++

AFAICS this has nothing to do with NX, so why is it called nx-ftw?

Also aren't we going to want to use this on pseries eventually? If so
should it go in arch/powerpc/sysdev ?

> diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
> index e4db292..dc60046 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -13,3 +13,4 @@ obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o
> obj-$(CONFIG_TRACEPOINTS) += opal-tracepoints.o
> obj-$(CONFIG_OPAL_PRD) += opal-prd.o
> obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o
> +obj-$(CONFIG_PPC_FTW) += nx-ftw.o
> diff --git a/arch/powerpc/platforms/powernv/nx-ftw.c b/arch/powerpc/platforms/powernv/nx-ftw.c
> new file mode 100644
> index 0000000..a0b6388
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/nx-ftw.c
> @@ -0,0 +1,486 @@

Missing license header.

> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/export.h>
> +#include <asm/cputable.h>
> +#include <linux/device.h>
> +#include <linux/debugfs.h>
> +#include <linux/cdev.h>
> +#include <linux/mutex.h>
> +#include <linux/fs.h>
> +#include <linux/mm.h>
> +#include <linux/slab.h>
> +#include <linux/sched.h>
> +#include <linux/uaccess.h>
> +#include <linux/bootmem.h>
> +#include <asm/opal-api.h>
> +#include <asm/opal.h>
> +#include <asm/page.h>
> +#include <asm/vas.h>
> +#include <asm/reg.h>

Please try and trim the list to what you need.

> +
> +/*
> + * NX-FTW is a device driver used to provide user space access to the
> + * Core-to-Core aka Fast Thread Wakeup (FTW) functionality provided by
> + * the Virtual Accelerator Subsystem (VAS) in POWER9 systems. See also
> + * arch/powerpc/platforms/powernv/vas*.
> + *
> + * The driver creates the device node /dev/crypto/nx-ftw that can be
> + * used as follows:
> + *
> + * fd = open("/dev/crypto/nx-ftw", O_RDWR);
> + * rc = ioctl(fd, VAS_RX_WIN_OPEN, &rxattr);
> + * rc = ioctl(fd, VAS_TX_WIN_OPEN, &txattr);
> + * paste_addr = mmap(NULL, PAGE_SIZE, prot, MAP_SHARED, fd, 0ULL).
> + * vas_copy(&crb, 0, 1);
> + * vas_paste(paste_addr, 0, 1);
> + *
> + * where "vas_copy" and "vas_paste" are defined in copy-paste.h.
> + */
> +
> +static char *nxftw_dev_name = "nx-ftw";
> +static atomic_t nxftw_instid = ATOMIC_INIT(0);
> +static dev_t nxftw_devt;
> +static struct dentry *nxftw_debugfs;
> +static struct class *nxftw_dbgfs_class;

The class doesn't go in debugfs, which is what "dbgfs" says to me.

> +/*
> + * Wrapper object for the nx-ftw device node - there is just one

Just "device".

"device node" is ambiguous vs device tree.

> + * instance of this node for the whole system.

So why not put the globals above in here also?

> + */
> +struct nxftw_dev {
> + struct cdev cdev;
> + struct device *device;
> + char *name;
> + atomic_t refcount;
> +} nxftw_device;
> +
> +/*
> + * One instance per open of a nx-ftw device. Each nxftw_instance is
> + * associated with a VAS window, after the caller issues VAS_RX_WIN_OPEN
> + * or VAS_TX_WIN_OPEN ioctl.
> + */
> +struct nxftw_instance {
> + int instance;
> + bool tx_win;
> + struct vas_window *window;
> +};
> +
> +#define VAS_DEFAULT_VAS_ID 0
> +#define POWERNV_LPID 0 /* TODO: For VM/KVM guests? */

mfspr(SPRN_LPID)

would seem to do the trick?

> +static char *nxftw_devnode(struct device *dev, umode_t *mode)
> +{
> + return kasprintf(GFP_KERNEL, "crypto/%s", dev_name(dev));

This isn't a crypto device?

> +}
> +
> +static int nxftw_open(struct inode *inode, struct file *fp)
> +{
> + int minor;
> + struct nxftw_instance *nxti;

instance would be a better name.

> + minor = MINOR(inode->i_rdev);

Not used?

> + nxti = kzalloc(sizeof(*nxti), GFP_KERNEL);
> + if (!nxti)
> + return -ENOMEM;
> +
> + nxti->instance = atomic_inc_return(&nxftw_instid);

And this would read better if the variable was "id". eg.

instance->id = atomic_inc_return(&next_instance_id);

> + nxti->window = NULL;
> +
> + fp->private_data = nxti;
> + return 0;
> +}
> +
> +static int validate_txwin_user_attr(struct vas_tx_win_open_attr *uattr)
> +{
> + int i;
> +
> + if (uattr->version != 1)
> + return -EINVAL;
> +
> + if (uattr->flags & ~VAS_FLAGS_HIGH_PRI)
> + return -EINVAL;
> +
> + if (uattr->reserved1 || uattr->reserved2)
> + return -EINVAL;
> +
> + for (i = 0; i < sizeof(uattr->reserved3) / sizeof(uint64_t); i++) {
> + if (uattr->reserved3[i])
> + return -EINVAL;
> + }

That struct is a mess and needs to be reworked.

> + return 0;
> +}
> +
> +static bool validate_rxwin_user_attr(struct vas_rx_win_open_attr *uattr)
> +{
> + int i;
> +
> + if (uattr->version != 1)
> + return -EINVAL;
> +
> + for (i = 0; i < sizeof(uattr->reserved) / sizeof(uint64_t); i++) {
> + if (uattr->reserved[i])
> + return -EINVAL;
> + }

Ditto.

> + return 0;
> +}
> +
> +#ifdef vas_debug

This is dead code, which makes it very easy for it to get out of sync
with the vas_rx_win_attr for example.

Better to just make these pr_debug() in the only caller, that way they
get type checked.

> +static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
> +{
> + pr_err("NX-FTW: user %d, nx %d, fault %d, ntfy %d, intr %d early %d\n",
> + attr->user_win ? 1 : 0,
> + attr->nx_win ? 1 : 0,
> + attr->fault_win ? 1 : 0,
> + attr->notify_disable ? 1 : 0,
> + attr->intr_disable ? 1 : 0,
> + attr->notify_early ? 1 : 0);
> +
> + pr_err("NX-FTW: rx_fifo %p, rx_fifo_size %d, max value 0x%x\n",
> + attr->rx_fifo, attr->rx_fifo_size,
> + VAS_RX_FIFO_SIZE_MAX);
> +
> +}
> +#else
> +static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
> +{
> +}
> +#endif
> +
> +static int nxftw_ioc_open_rx_window(struct file *fp, unsigned long arg)
> +{
> + int rc;
> + struct vas_rx_win_open_attr uattr;
> + struct vas_rx_win_attr rxattr;
> + struct nxftw_instance *nxti = fp->private_data;
> + struct vas_window *win;

struct vas_rx_win_open_attr uattr;
struct vas_rx_win_attr rxattr;
struct nxftw_instance *nxti;
struct vas_window *win;
int rc;

nxti = fp->private_data;

Ah much better :)

Aka. reverse-christmas-tree.

> +
> + rc = copy_from_user(&uattr, (void *)arg, sizeof(uattr));

Nicer would be:

void __user *uptr = (void *)arg;

rc = copy_from_user(&uattr, uptr, sizeof(uattr));

> + if (rc) {
> + pr_devel("%s(): copy_from_user() returns %d\n", __func__, rc);
> + return -EFAULT;
> + }
> +
> + rc = validate_rxwin_user_attr(&uattr);
> + if (rc)
> + return rc;
> +
> + memset(&rxattr, 0, sizeof(rxattr));
> +
> + rxattr.lnotify_lpid = POWERNV_LPID;
> +
> + /*
> + * Only caller can own the window for now. Not sure if there is need
> + * for process P1 to make P2 the owner of a window. If so, we need to
> + * find P2, make sure we have permissions, get a reference etc.
> + */
> + rxattr.lnotify_pid = mfspr(SPRN_PID);
> + rxattr.lnotify_tid = mfspr(SPRN_TIDR);
> + rxattr.rx_fifo = NULL;
> + rxattr.rx_fifo_size = 0;
> + rxattr.intr_disable = true;
> + rxattr.user_win = true;

vas_init_rx_win_attr() ?

> +
> + dump_rx_win_attr(&rxattr);
> +
> + /*
> + * TODO: Rather than the default vas id, choose an instance of VAS
> + * based on the chip the caller is running.
> + */

Seems like that will be a common pattern so maybe the vas core should
handle it for callers who want it.

> + win = vas_rx_win_open(VAS_DEFAULT_VAS_ID, VAS_COP_TYPE_FTW, &rxattr);
> + if (IS_ERR(win)) {
> + pr_devel("%s() vas_rx_win_open() failed, %ld\n", __func__,
> + PTR_ERR(win));
> + return PTR_ERR(win);
> + }
> +
> + nxti->window = win;
> + uattr.rx_win_handle = vas_win_id(win);
> +
> + rc = copy_to_user((void *)arg, &uattr, sizeof(uattr));
> + if (rc) {
> + pr_devel("%s(): copy_to_user() failed, %d\n", __func__, rc);
> + return -EFAULT;
> + }

You defined the ioctl as:

#define VAS_RX_WIN_OPEN _IOW('v', 2, struct vas_rx_win_open_attr)

But you're reading and writing from the user arg, so it should be _IOWR.
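
For reference, the difference between the two encodings can be checked from
user space. This is only a sketch, not code from the patch: the struct layout
is taken from the ftw-api.txt text in patch 17/17, and the EX_* names and the
ex_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Layout as given in ftw-api.txt (patch 17/17). */
struct vas_rx_win_open_attr {
	int16_t version;
	int16_t vas_id;
	int32_t rx_win_handle;	/* output field */
	int64_t reserved[8];
};

/* Hypothetical names: the encoding as posted and the suggested one. */
#define EX_RX_WIN_OPEN_POSTED	_IOW('v', 2, struct vas_rx_win_open_attr)
#define EX_RX_WIN_OPEN_FIXED	_IOWR('v', 2, struct vas_rx_win_open_attr)

/* Nonzero if the command says user space passes data in (copy_from_user). */
static int ex_ioc_user_writes(unsigned int cmd)
{
	return (_IOC_DIR(cmd) & _IOC_WRITE) != 0;
}

/* Nonzero if the command says the kernel passes data back (copy_to_user). */
static int ex_ioc_user_reads(unsigned int cmd)
{
	return (_IOC_DIR(cmd) & _IOC_READ) != 0;
}

/* Same type, number and size in both encodings; only the direction differs. */
static void ex_check_directions(void)
{
	assert(_IOC_DIR(EX_RX_WIN_OPEN_POSTED) == _IOC_WRITE);
	assert(_IOC_DIR(EX_RX_WIN_OPEN_FIXED) == (_IOC_READ | _IOC_WRITE));
	assert(_IOC_NR(EX_RX_WIN_OPEN_POSTED) == _IOC_NR(EX_RX_WIN_OPEN_FIXED));
	assert(_IOC_SIZE(EX_RX_WIN_OPEN_FIXED) ==
	       sizeof(struct vas_rx_win_open_attr));
}
```

The direction bits do not change what the driver's copy_to_user() does, but
they are part of the command number that user space sees, so correcting them
after release would change the ABI.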

> +
> + return 0;
> +}
> +
> +static int nxftw_ioc_open_tx_window(struct file *fp, unsigned long arg)
> +{
> + int rc;
> + enum vas_cop_type cop;
> + struct vas_window *win;
> + struct vas_tx_win_open_attr uattr;
> + struct vas_tx_win_attr txattr;

Those two struct names are quite confusing.

> + struct nxftw_instance *nxti = fp->private_data;
> +
> + rc = copy_from_user(&uattr, (void *)arg, sizeof(uattr));
> + if (rc) {
> + pr_devel("%s(): copy_from_user() failed, %d\n", __func__, rc);
> + return -EFAULT;
> + }

All you use is rx_win_handle, so why does this ioctl take the whole struct?

> + cop = VAS_COP_TYPE_FTW;
> +
> + rc = validate_txwin_user_attr(&uattr);
> + if (rc)
> + return rc;
> +
> + pr_devel("Pid %d: Opening txwin, cop %d, PIDR %ld\n",
> + task_pid_nr(current), cop, mfspr(SPRN_PID));
> +
> + vas_init_tx_win_attr(&txattr, cop);
> +
> + txattr.lpid = POWERNV_LPID;
> + txattr.pidr = mfspr(SPRN_PID);
> + txattr.pid = task_pid_nr(current);

Why is that in txattr?

The pid can be freed and given to another process so it's fishy to be
saving the pid without also holding a reference on the task.

> + txattr.user_win = true;

Has been done for us.

> + txattr.pswid = uattr.rx_win_handle;
> +
> + win = vas_tx_win_open(VAS_DEFAULT_VAS_ID, cop, &txattr);
> + if (IS_ERR(win)) {
> + pr_devel("%s() vas_tx_win_open() failed, %ld\n", __func__,
> + PTR_ERR(win));
> + return PTR_ERR(win);
> + }
> + nxti->window = win;
> + nxti->tx_win = true;

is_tx would be clearer IMHO.

> + return 0;
> +}
> +
> +static int nxftw_release(struct inode *inode, struct file *fp)
> +{
> + struct nxftw_instance *nxti;
> +
> + nxti = fp->private_data;
> +
> + vas_win_close(nxti->window);
> + nxti->window = NULL;
> +
> + kfree(nxti);
> + fp->private_data = NULL;

Flipping the order of those would be preferable though it's not actually
a bug.

> + atomic_dec(&nxftw_instid);
> +
> + return 0;
> +}
> +
> +static ssize_t nxftw_write(struct file *fp, const char __user *buf,
> + size_t len, loff_t *offsetp)
> +{
> + return -ENOTSUPP;
> +}
> +
> +static ssize_t nxftw_read(struct file *fp, char __user *buf, size_t len,
> + loff_t *offsetp)
> +{
> + return -ENOTSUPP;
> +}

Do you need those?

> +static int nxftw_vma_fault(struct vm_fault *vmf)
> +{
> + u64 offset;
> + unsigned long vaddr;
> + uint64_t pbaddr_start;
> + struct nxftw_instance *nxti;
> + struct vm_area_struct *vma = vmf->vma;
> +
> + nxti = vma->vm_private_data;
> + offset = vmf->pgoff << PAGE_SHIFT;
> + vaddr = (unsigned long)vmf->address;
> +
> + pbaddr_start = vas_win_paste_addr(nxti->window);
> +
> + pr_devel("%s() instance %d, pbaddr 0x%llx, vaddr 0x%lx,"
> + "offset %llx, pgoff 0x%lx, vma-start 0x%zx,"
> + "size %zd\n", __func__, nxti->instance,
> + pbaddr_start, vaddr, offset, vmf->pgoff,
> + vma->vm_start, vma->vm_end-vma->vm_start);
> +
> + vm_insert_pfn(vma, vaddr, (pbaddr_start + offset) >> PAGE_SHIFT);
> +
> + return VM_FAULT_NOPAGE;
> +}
> +
> +const struct vm_operations_struct nxftw_vm_ops = {
> + .fault = nxftw_vma_fault,
> +};

Is there some particular reason you need to implement those? You appear
to be just mapping a page into the address space. Can't you just use
remap_pfn_range() in your mmap routine?

> +static int nxftw_mmap(struct file *fp, struct vm_area_struct *vma)
> +{
> + struct nxftw_instance *nxti = fp->private_data;
> +
> + if ((vma->vm_end - vma->vm_start) > PAGE_SIZE) {
> + pr_devel("%s(): size 0x%zx, PAGE_SIZE 0x%zx\n", __func__,
> + (vma->vm_end - vma->vm_start), PAGE_SIZE);
> + return -EINVAL;
> + }
> +
> + /* Ensure instance has an open send window */
> + if (!nxti->window || !nxti->tx_win) {
> + pr_devel("%s(): No send window open?\n", __func__);
> + return -EINVAL;
> + }
> +
> + /* flags, page_prot from cxl_mmap(), except we want cachable */
> + vma->vm_flags |= VM_IO | VM_PFNMAP;
> + vma->vm_page_prot = pgprot_cached(vma->vm_page_prot);
> +
> + vma->vm_ops = &nxftw_vm_ops;
> + vma->vm_private_data = nxti;

ie. here.

See eg. opal-prd.c for an example.

> + return 0;
> +}
> +
> +static long nxftw_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
> +{
> + struct nxftw_instance *nxti;
> +
> + nxti = fp->private_data;

Not used.
> +
> + pr_devel("%s() cmd 0x%x, TX_WIN_OPEN 0x%lx\n", __func__, cmd,
> + VAS_TX_WIN_OPEN);

Can we drop that?

> + switch (cmd) {
> +
> + case VAS_TX_WIN_OPEN:
> + return nxftw_ioc_open_tx_window(fp, arg);
> +
> + case VAS_RX_WIN_OPEN:
> + return nxftw_ioc_open_rx_window(fp, arg);
> +
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +const struct file_operations nxftw_fops = {
> + .owner = THIS_MODULE,
> + .open = nxftw_open,
> + .release = nxftw_release,
> + .read = nxftw_read,
> + .write = nxftw_write,
> + .mmap = nxftw_mmap,
> + .unlocked_ioctl = nxftw_ioctl,
> +};
> +
> +
> +int nxftw_file_init(void)
> +{
> + int rc;
> + dev_t devno;
> +
> + rc = alloc_chrdev_region(&nxftw_devt, 1, 1, "nx-ftw");
> + if (rc) {
> + pr_err("Unable to allocate nxftw major number: %i\n", rc);
> + return rc;
> + }
> +
> + pr_devel("NX-FTW device allocated, dev [%i,%i]\n", MAJOR(nxftw_devt),
> + MINOR(nxftw_devt));
> +
> + nxftw_dbgfs_class = class_create(THIS_MODULE, "nxftw");
> + if (IS_ERR(nxftw_dbgfs_class)) {
> + pr_err("Unable to create NX-FTW class\n");
> + rc = PTR_ERR(nxftw_dbgfs_class);
> + goto err;
> + }
> + nxftw_dbgfs_class->devnode = nxftw_devnode;
> +
> + cdev_init(&nxftw_device.cdev, &nxftw_fops);
> +
> + devno = MKDEV(MAJOR(nxftw_devt), 0);
> + if (cdev_add(&nxftw_device.cdev, devno, 1)) {
> + pr_err("NX-FTW: cdev_add() failed\n");
> + goto err;
> + }
> +
> + nxftw_device.device = device_create(nxftw_dbgfs_class, NULL,
> + devno, NULL, nxftw_dev_name, MINOR(devno));
> + if (IS_ERR(nxftw_device.device)) {
> + pr_err("Unable to create nxftw-%d\n", MINOR(devno));
> + goto err;
> + }
> +
> + pr_devel("%s: Added dev [%d,%d]\n", __func__, MAJOR(devno),
> + MINOR(devno));
> + return 0;
> +
> +err:
> + unregister_chrdev_region(nxftw_devt, 1);
> + return rc;
> +}
> +
> +void nxftw_file_exit(void)
> +{
> + dev_t devno;
> +
> + pr_devel("NX-FTW: %s entered\n", __func__);
> +
> + cdev_del(&nxftw_device.cdev);
> + devno = MKDEV(MAJOR(nxftw_devt), MINOR(nxftw_devt));
> + device_destroy(nxftw_dbgfs_class, devno);
> +
> + class_destroy(nxftw_dbgfs_class);
> + unregister_chrdev_region(nxftw_devt, 1);
> +}
> +
> +
> +/*
> + * Create a debugfs entry. Not sure what for yet, though
> + */

Please just drop it.

> +int __init nxftw_debugfs_init(void)
> +{
> + struct dentry *ent;
> +
> + ent = debugfs_create_dir("nxftw", NULL);
> + if (IS_ERR(ent)) {
> + pr_devel("nxftw: %s(): error creating dbgfs dir\n", __func__);
> + return PTR_ERR(ent);
> + }
> + nxftw_debugfs = ent;
> +
> + return 0;
> +}
> +
> +void nxftw_debugfs_exit(void)
> +{
> + debugfs_remove_recursive(nxftw_debugfs);
> +}
> +
> +int __init nxftw_init(void)
> +{
> + int rc;
> +
> + rc = nxftw_file_init();
> + if (rc)
> + return rc;
> +
> + rc = nxftw_debugfs_init();
> + if (rc)
> + goto free_file;
> +
> + pr_err("NX-FTW Device initialized\n");

That's not an error.

> +
> + return 0;
> +
> +free_file:
> + nxftw_file_exit();
> + return rc;
> +}
> +
> +void __init nxftw_exit(void)
> +{
> + pr_devel("NX-FTW Device exiting\n");
> + nxftw_debugfs_exit();
> + nxftw_file_exit();
> +}
> +
> +module_init(nxftw_init);
> +module_exit(nxftw_exit);

This can't be a module, so you shouldn't be using these.

Or these:

> +MODULE_DESCRIPTION("IBM NX Fast Thread Wakeup Device");
> +MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
> +MODULE_LICENSE("GPL");

cheers

2017-08-14 07:02:46

by Michael Neuling

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

On Tue, 2017-08-08 at 16:06 -0700, Sukadev Bhattiprolu wrote:
> We need the SPRN_TIDR to be set for use with fast thread-wakeup
> (core-to-core wakeup).  Each thread in a process needs to have a
> unique id within the process but as explained below, for now, we
> assign globally unique thread ids to all threads in the system.
>
> Signed-off-by: Sukadev Bhattiprolu <[email protected]>
> ---
>  arch/powerpc/include/asm/processor.h |  4 ++
>  arch/powerpc/kernel/process.c        | 74 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 78 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index fab7ff8..bf6ba63 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -232,6 +232,10 @@ struct debug_reg {
>  struct thread_struct {
>   unsigned long ksp; /* Kernel stack pointer */
>  
> +#ifdef CONFIG_PPC_VAS

I'm tempted to have this always, or a new feature CONFIG_PPC_TID that PPC_VAS
depends on.

> + unsigned long tidr;

> +#endif
> +
>  #ifdef CONFIG_PPC64
>   unsigned long ksp_vsid;
>  #endif
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 9f3e2c9..6123859 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1213,6 +1213,16 @@ struct task_struct *__switch_to(struct task_struct *prev,
>   hard_irq_disable();
>   }
>  
> +#ifdef CONFIG_PPC_VAS
> + mtspr(SPRN_TIDR, new->thread.tidr);

How much does this hurt our context_switch benchmark in
tools/testing/selftests/powerpc/benchmarks/context_switch.c?

Also you need a CPU_FTR_ARCH_300 test here (and elsewhere).

> +#endif
> + /*
> +  * We can't take a PMU exception inside _switch() since there is a
> +  * window where the kernel stack SLB and the kernel stack are out
> +  * of sync. Hard disable here.
> +  */
> + hard_irq_disable();
> +

What is this?

>   /*
>    * Call restore_sprs() before calling _switch(). If we move it after
>    * _switch() then we miss out on calling it for new tasks. The reason
> @@ -1449,9 +1459,70 @@ void flush_thread(void)
>  #endif /* CONFIG_HAVE_HW_BREAKPOINT */
>  }
>  
> +#ifdef CONFIG_PPC_VAS
> +static DEFINE_SPINLOCK(vas_thread_id_lock);
> +static DEFINE_IDA(vas_thread_ida);

This IDA should be per process, not global.
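
As a rough illustration of per-process allocation, the following user-space
sketch keeps the id space inside a per-process structure, so two processes can
hand out the same id independently. It is a simulation only: struct ex_process
and the ex_* helpers are made up for this note, and kernel code would
presumably hang an IDA off some per-process object rather than use a bitmap.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define EX_MAX_TIDS 64	/* arbitrary cap for the sketch */

/* Hypothetical per-process state; stands in for a per-process IDA. */
struct ex_process {
	uint64_t used;	/* bit n set => thread id n is in use */
};

/*
 * Return the lowest free id in this process, or -ENOSPC when full.
 * Id 0 is left unused here to mirror the starting id of 1 in the
 * patch; whether that is required is an open question in the review.
 */
static int ex_assign_thread_id(struct ex_process *p)
{
	for (int id = 1; id < EX_MAX_TIDS; id++) {
		if (!(p->used & (UINT64_C(1) << id))) {
			p->used |= UINT64_C(1) << id;
			return id;
		}
	}
	return -ENOSPC;
}

/* Give an id back to the owning process so it can be reused. */
static void ex_free_thread_id(struct ex_process *p, int id)
{
	assert(id >= 1 && id < EX_MAX_TIDS);
	p->used &= ~(UINT64_C(1) << id);
}
```

With two zero-initialised struct ex_process objects, the first allocation in
each returns 1. A single global allocator cannot do that, and it is what
"unique within the process" in the changelog asks for.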

> +
> +/*
> + * We need to assign an unique thread id to each thread in a process. This
> + * thread id is intended to be used with the Fast Thread-wakeup (aka Core-
> + * to-core wakeup) mechanism being implemented on top of Virtual Accelerator
> + * Switchboard (VAS).
> + *
> + * To get a unique thread-id per process we could simply use task_pid_nr()
> + * but the problem is that task_pid_nr() is not yet available for the thread
> + * when copy_thread() is called. Fixing that would require changing more
> + * intrusive arch-neutral code in code path in copy_process()?.
> + *
> + * Further, to assign unique thread ids within each process, we need an
> + * atomic field (or an IDR) in task_struct, which again intrudes into the
> + * arch-neutral code.

Really?

> + * So try to assign globally unique thread ids for now.

Yuck!

> + */
> +static int assign_thread_id(void)
> +{
> + int index;
> + int err;
> +
> +again:
> + if (!ida_pre_get(&vas_thread_ida, GFP_KERNEL))
> + return -ENOMEM;
> +
> + spin_lock(&vas_thread_id_lock);
> + err = ida_get_new_above(&vas_thread_ida, 1, &index);

We can't use 0 or 1?

> + spin_unlock(&vas_thread_id_lock);
> +
> + if (err == -EAGAIN)
> + goto again;
> + else if (err)
> + return err;
> +
> + if (index > MAX_USER_CONTEXT) {
> + spin_lock(&vas_thread_id_lock);
> + ida_remove(&vas_thread_ida, index);
> + spin_unlock(&vas_thread_id_lock);
> + return -ENOMEM;
> + }
> +
> + return index;
> +}
> +
> +static void free_thread_id(int id)
> +{
> + spin_lock(&vas_thread_id_lock);
> + ida_remove(&vas_thread_ida, id);
> + spin_unlock(&vas_thread_id_lock);
> +}
> +#endif /* CONFIG_PPC_VAS */
> +
> +
>  void
>  release_thread(struct task_struct *t)
>  {
> +#ifdef CONFIG_PPC_VAS
> + free_thread_id(t->thread.tidr);
> +#endif

Can you restructure this to avoid the #ifdef ugliness?

>  }
>  
>  /*
> @@ -1587,6 +1658,9 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
>  #endif
>  
>   setup_ksp_vsid(p, sp);
> +#ifdef CONFIG_PPC_VAS
> + p->thread.tidr = assign_thread_id();
> +#endif

Same here...

>  
>  #ifdef CONFIG_PPC64 
>   if (cpu_has_feature(CPU_FTR_DSCR)) {

2017-08-14 07:26:09

by Michael Ellerman

Subject: Re: [PATCH v6 02/17] powerpc/vas: Move GET_FIELD/SET_FIELD to vas.h

Sukadev Bhattiprolu <[email protected]> writes:

> Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
> users of VAS, including NX-842 can use those macros.
>
> There is a lot of related code between the VAS/NX kernel drivers
> and skiboot. For consistency switch the order of parameters in
> SET_FIELD to match the order in skiboot.
>
> Signed-off-by: Sukadev Bhattiprolu <[email protected]>
> Reviewed-by: Dan Streetman <[email protected]>

> diff --git a/arch/powerpc/include/uapi/asm/vas.h b/arch/powerpc/include/uapi/asm/vas.h
> index ddfe046..21249f5 100644
> --- a/arch/powerpc/include/uapi/asm/vas.h
> +++ b/arch/powerpc/include/uapi/asm/vas.h
> @@ -22,4 +22,12 @@
> #define VAS_THRESH_FIFO_GT_QTR_FULL 2
> #define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
>
> +/*
> + * Get/Set bit fields
> + */
> +#define GET_FIELD(m, v) (((v) & (m)) >> MASK_LSH(m))
> +#define MASK_LSH(m) (__builtin_ffsl(m) - 1)
> +#define SET_FIELD(m, v, val) \
> + (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_LSH(m)) & (m)))

This has no business being in a uapi header for VAS.

Put it in asm/vas.h if you must.

Personally I really dislike these sort of macros because they completely
obscure what the final value should end up being, and it's the final
value you'll see when you're debugging it.

> + ccw = SET_FIELD(CCW_CT, ccw, nx842_ct);
> + ccw = SET_FIELD(CCW_CI_842, ccw, 0); /* use 0 for hw auto-selection */
> + ccw = SET_FIELD(CCW_FC_842, ccw, fc);

eg. that could also be written:

ccw = (nx842_ct << 16) | (fc & 7);
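
To make the comparison concrete, here is a user-space sketch that builds the
same word both ways. The macro bodies follow the hunk quoted above (spelled
with __typeof__ so they also build in strict ISO mode); the EX_* masks are
assumptions picked to agree with the shifts in the one-liner, not values
copied from the driver headers.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the macros in the patch: mask first, then value. */
#define MASK_LSH(m)		(__builtin_ffsl(m) - 1)
#define GET_FIELD(m, v)		(((v) & (m)) >> MASK_LSH(m))
#define SET_FIELD(m, v, val)	\
	(((v) & ~(m)) | ((((__typeof__(v))(val)) << MASK_LSH(m)) & (m)))

/* Illustrative masks only (assumed, not taken from the NX headers). */
#define EX_CCW_CT	0x00ff0000UL
#define EX_CCW_CI	0x00003ff8UL
#define EX_CCW_FC	0x00000007UL

/* Build the word field by field, as the driver hunk does. */
static uint32_t ex_ccw_with_macros(uint32_t ct, uint32_t fc)
{
	uint32_t ccw = 0;

	ccw = SET_FIELD(EX_CCW_CT, ccw, ct);
	ccw = SET_FIELD(EX_CCW_CI, ccw, 0);
	ccw = SET_FIELD(EX_CCW_FC, ccw, fc);
	return ccw;
}

/* Build the word with explicit shifts, as in the review comment. */
static uint32_t ex_ccw_open_coded(uint32_t ct, uint32_t fc)
{
	return (ct << 16) | (fc & 7);
}

static void ex_check_ccw(void)
{
	/* In-range values: both forms give the same, readable result. */
	assert(ex_ccw_with_macros(3, 5) == 0x00030005);
	assert(ex_ccw_open_coded(3, 5) == 0x00030005);
	assert(GET_FIELD(EX_CCW_CT, ex_ccw_with_macros(3, 5)) == 3);

	/* Out-of-range ct: the macro masks it, the open-coded form does not. */
	assert(ex_ccw_with_macros(0x1ff, 5) == 0x00ff0005);
	assert(ex_ccw_open_coded(0x1ff, 5) == 0x01ff0005);
}
```

Under these assumed masks the two forms agree whenever ct fits in its field.
The open-coded form shows the final bit positions directly, while the macro
form additionally truncates a value that is too wide.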

cheers

2017-08-14 10:43:34

by Michael Neuling

Subject: Re: [PATCH v6 17/17] powerpc/vas: Document FTW API/usage

On Tue, 2017-08-08 at 16:07 -0700, Sukadev Bhattiprolu wrote:
> Document the usage of the VAS Fast thread-wakeup API.
>
> Thanks for input/comments from Benjamin Herrenschmidt, Michael Neuling,
> Michael Ellerman, Robert Blackmore, Ian Munsie, Haren Myneni, Paul Mackerras.
>
> Cc:Ian Munsie <[email protected]>
> Cc:Paul Mackerras <[email protected]>
> Signed-off-by: Sukadev Bhattiprolu <[email protected]>
> ---
>  Documentation/powerpc/ftw-api.txt | 373 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 373 insertions(+)
>  create mode 100644 Documentation/powerpc/ftw-api.txt
>
> diff --git a/Documentation/powerpc/ftw-api.txt b/Documentation/powerpc/ftw-api.txt
> new file mode 100644
> index 0000000..0b3f16f
> --- /dev/null
> +++ b/Documentation/powerpc/ftw-api.txt
> @@ -0,0 +1,373 @@
> +Virtual Accelerator Switchboard and Fast Thread-Wakeup API
> +
> +    Power9 processor supports a hardware subsystem known as the Virtual
> +    Accelerator Switchboard (VAS) which allows two entities in the Power9
> +    system to efficiently exchange messages. Messages must be formatted as
> +    Coprocessor Request Blocks (CRB) and be submitted using the COPY/PASTE
> +    instructions (new in Power9).
> +
> +    Usage of VAS depends on the entities exchanging the messages and
> +    currently two usages have been identified.
> +
> +    First usage of VAS, referred to as VAS/NX involves a software thread
> +    submitting data compression requests to a co-processor (hardware/nest
> +    accelerator) aka NX engine. The API for this usage is described in the
> +    VAS/NX API document.
> +
> +    Alternatively, VAS can be used by two software threads to efficiently
> +    exchange messages. Initially, this mechanism is intended to wake up a
> +    waiting thread quickly - i.e "fast thread wake-up (FTW)". This document
> +    describes the user API for this VAS/FTW mechanism.
> +
> +    Application access to the FTW mechanism is provided through the NX-FTW
> +    device node (/dev/crypto/nx-ftw) implemented by the VAS/FTW device
> +    driver.

crypto?

> +
> +    A software thread T1 that intends to wait for an event must first setup
> +    a receive window, by opening the NX-FTW device and using the
> +    VAS_RX_WIN_OPEN ioctl. Upon successful return from the VAS_RX_WIN_OPEN
> +    ioctl, an rx_win_handle is returned.

I realise there is a window here as part of the hardware implementation, but the
users don't care about the window on the receive side. It's hidden from them.
It's just an rx handle IMHO.

The sender certainly has a window that users care about since they have to mmap
it.

> +
> +    A software thread T2 that intends to wake up T1 at some point, must first
> +    set up a "send window" using the VAS_TX_WIN_OPEN ioctl and specify the
> +    rx_win_handle obtained by T1. After a successful VAS_TX_WIN_OPEN ioctl the
> +    send window of T2 is considered paired with the receive window of T1. The
> +    thread T2 must then use mmap() to obtain a "paste address" for the send
> +    window.


> +    With this set up, thread T1 can wait for an event using the WAIT
> +    instruction.
> +
> +    Thread T2 can wake up T1 by using the "COPY/PASTE" instructions and
> +    submitting an empty/NULL CRB to the send window's paste address. The
> +    wait/wake up process can be repeated as long as the threads have the
> +    send/receive windows open.



> +1. NX-FTW Device Node
> +
> +    There is one /dev/crypto/nx-ftw node in the system and it provides
> +    access to the VAS/FTW functionality.


> +    The only valid operations on the NX-FTW node are:
> +
> +        - open() the device for read and write.
> +
> +        - issue either VAS_RX_WIN_OPEN or VAS_TX_WIN_OPEN ioctls to set up
> +          receive or send (only one of them per open).
> +
> +        - if the open is associated with send window (i.e VAS_TX_WIN_OPEN
> +          ioctl was issued) mmap() the send window into the application's
> +          virtual address space. (i.e get a 'paste_address' for the send
> +          window).
> +
> +        - close the device node.
> +
> +    Other file operations on the NX-FTW node are undefined.
> +
> +    Note that the COPY and PASTE operations go directly to the hardware
> +    and do not go through the NX-FTW device.

I don't understand this statement

> +
> +    Although a system may have several instances of the VAS in the system
> +    (typically, one per P9 chip) there is just one NX-FTW device node in
> +    the system.

> +    When the NX-FTW device node is opened, the kernel assigns a suitable
> +    instance of VAS to the process. The kernel will make a best-effort attempt
> +    to assign an optimal instance of VAS for the process. In the initial
> +    release, the kernel does not support migrating the VAS instance if the
> +    process migrates from a processor on one chip to a processor on another
> +    chip.

How is it "optimal"?

> +    Applications may choose a specific instance of the VAS using the 'vas_id'
> +    field in the VAS_TX_WIN_OPEN and VAS_RX_WIN_OPEN ioctls as detailed below.




> +2. Open NX-FTW node
> +
> +    The device should be opened for read and write. No special privileges
> +    are needed to open the device. The device may be opened multiple times.
> +
> +    Each open() of the NX-FTW device may be associated with either a send
> +    window or receive window but not both.
> +
> +    See open(2) system call man pages for other details such as return
> +    values, error codes and restrictions.
> +
> +3. Setup Receive window (VAS_RX_WIN_OPEN ioctl)
> +
> +    A thread that expects to wait for events and be woken up using COPY/PASTE
> +    must first set up a receive window by issuing the VAS_RX_WIN_OPEN ioctl.
> +
> +        #include <asm/vas.h>
> +
> +        struct vas_rx_win_open_attr rxattr;
> +
> +        rc = ioctl(fd, VAS_RX_WIN_OPEN, &rxattr);
> +
> +    The attributes of rxattr are as follows:
> +
> +        struct vas_rx_win_open_attr {
> +                int16_t       version;
> +                int16_t       vas_id;
> +                int32_t       rx_win_handle;    /* output field */
> +                int64_t       reserved[8];
> +        };
> +
> +    The version field identifies the version of the API and must currently
> +    be set to 1.
> +
> +    The vas_id field identifies a specific instance of the VAS that the
> +    application wishes to access. See section on VAS ID below.
> +
> +    The reserved field must be set to all zeroes.
> +
> +    Upon successful return from the ioctl, the rx_win_handle field contains
> +    an identifier for the VAS window associated with this "sleeping" thread.
> +
> +    This rx_win_handle field is used to "pair" this receive window with a
> +    send window and must be specified when opening the corresponding send
> +    window (see struct vas_tx_win_open_attr below).
> +
> +    Return value:
> +
> +    The VAS_RX_WIN_OPEN ioctl returns 0 on success. On error, it returns -1
> +    and sets the errno variable to indicate the error.
> +
> +    Error codes:
> +
> +        EINVAL      version is invalid
> +
> +        EINVAL      vas_id is invalid
> +
> +        EINVAL      reserved field is not set to zeroes
> +
> +        EINVAL      fd is already associated with a send window
> +
> +
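
A minimal sketch of how a caller might prepare rxattr as the text above describes, and of the documented EINVAL conditions. The struct is a local copy of the quoted layout; VAS_API_VERSION, VAS_ID_ANY, rx_attr_init() and rx_attr_check() are illustrative names, and treating any vas_id below -1 as invalid is an assumption.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the layout quoted above; the proposed header is <asm/vas.h>. */
struct vas_rx_win_open_attr {
	int16_t version;
	int16_t vas_id;
	int32_t rx_win_handle;	/* output field */
	int64_t reserved[8];
};

#define VAS_API_VERSION	1	/* illustrative name; the text only says "1" */
#define VAS_ID_ANY	(-1)	/* illustrative name for "no preference" */

/* Zero everything (including the reserved words), then fill the inputs. */
void rx_attr_init(struct vas_rx_win_open_attr *attr, int16_t vas_id)
{
	memset(attr, 0, sizeof(*attr));
	attr->version = VAS_API_VERSION;
	attr->vas_id = vas_id;
}

/* Mirror the documented EINVAL cases for the input fields. */
int rx_attr_check(const struct vas_rx_win_open_attr *attr)
{
	size_t i;

	if (attr->version != VAS_API_VERSION)
		return -EINVAL;
	if (attr->vas_id < VAS_ID_ANY)	/* assumption: only -1 and up are valid */
		return -EINVAL;
	for (i = 0; i < 8; i++)
		if (attr->reserved[i] != 0)
			return -EINVAL;
	return 0;
}
```

A caller would run rx_attr_init(&rxattr, VAS_ID_ANY) before the ioctl; memset() is what guarantees the "reserved must be zero" rule.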
> +3. Set up a Send window (VAS_TX_WIN_OPEN ioctl)
> +
> +    An application thread that expects to wake up a waiting thread using
> +    copy/paste, must first set up a send window that is paired with the
> +    receive window of the waiting thread. This is accomplished using the
> +    VAS_TX_WIN_OPEN ioctl.
> +
> +        #include <asm/vas.h>
> +
> +        struct vas_tx_win_open_attr txattr;
> +
> +        rc = ioctl(fd, VAS_TX_WIN_OPEN, &txattr);

So we talked about this offline before.... the fd here should not be from the
/dev device but should be the fd from rx_win_open ioctl.

As you have it here you pass the handle in as a parameter of ioctl. This means
all the permissions checks have to be done by you as to if these two windows can
be linked. If you use the fd from before, you can assume if the receiver has
given this fd to the sender, it has the right permissions.

I have some pseudo code at the end that shows this.
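
When the receiver and sender are separate processes, handing over a file descriptor is normally done with SCM_RIGHTS over a Unix domain socket. A minimal sketch of that mechanism follows; it contains nothing VAS-specific, and send_fd() and recv_fd() are illustrative helper names, not part of this patch set.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_cmsg {
	struct cmsghdr align;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Send 'fd' over the Unix socket 'sock'. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
	char byte = 'F';	/* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(). Returns the new fd, or -1. */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```

The point of the fd-based scheme is that the kernel only has to trust whoever holds the descriptor; the receiver decides who gets it by choosing whom to send it to.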

> +    The attributes 'txattr' for the VAS_TX_WIN_OPEN ioctl are defined as
> +    follows:
> +
> +        struct vas_tx_win_open_attr {
> +            int32_t       version;
> +            int16_t       vas_id;
> +            uint32_t      rx_win_handle;
> +
> +            int64_t       reserved1;
> +
> +            int64_t       flags;
> +            int64_t       reserved2;
> +
> +            int32_t       tc_mode;
> +            int32_t       rsvd_txbuf;
> +            int64_t       reserved3[6];
> +        };
> +
> +    The version field must currently be set to 1.
> +
> +    The vas_id field identifies a specific instance of the VAS that the
> +    application wishes to access. See section on VAS ID below.

Can this be different to the rx?

> +    The rx_win_handle field must be set to the rx_win_handle returned by
> +    a prior successful call to VAS_RX_WIN_OPEN ioctl (see above). This
> +    field is used to pair this send window with a receive window. The
> +    process must have sufficient permissions to communicate with the
> +    process owning the receive window identified by rx_win_handle.

As above, this should be part of the FD otherwise users could specify anything
here and paste to anyone.

> +    The tc_mode and rsvd_txbuf fields are currently unused and must be
> +    set to 0.
> +
> +    The flags field specifies additional attributes of the window. The
> +    only valid bit in the flags field for FTW windows is:
> +
> +        VAS_FLAGS_PIN_WINDOW    if set, indicates that the window should be
> +                                pinned in cache. This flag is restricted
> +                                to privileged users. See Pinning windows
> +                                below.
> +
> +    All the other bits in the flags field must be set to 0.
> +
> +    The fields reserved1, reserved2 and reserved3 are for future extension
> +    and must be set to 0.
> +
> +    Return value:
> +
> +    The VAS_TX_WIN_OPEN ioctl returns 0 on success. On error, it returns -1
> +    and sets the errno variable to indicate the error.
> +
> +    Error conditions:
> +
> +        EINVAL      version, vas_id or rx_win_handle fields are invalid
> +
> +        EINVAL      fd does not refer to a valid VAS device.
> +
> +        EINVAL      fd is already associated with a receive window
> +
> +        ENOSPC      System has too many active windows (connections) open.
> +
> +        EINVAL      For FTW windows, rsvd_txbuf is not 0.
> +
> +        EINVAL      For FTW windows, tc_mode is not VAS_THRESH_DISABLED.
> +
> +        EPERM       VAS_FLAGS_PIN_WINDOW is set in 'flags' field and process
> +                    is not privileged.
> +
> +        EPERM       VAS_FLAGS_HIGH_PRI is set in 'flags' field and process
> +                    is not privileged.
> +
> +        EINVAL      an invalid flag is set in the 'flags' field. (For FTW
> +                    windows, VAS_FLAGS_HIGH_PRI is also invalid).
> +
> +        EINVAL      reserved fields are not set to 0.
> +
> +    See the ioctl(2) man page for more details, error codes and restrictions.
> +
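
The error table above can be read as a validation routine. The sketch below is one possible ordering of those checks, not the patch's implementation: the bit value of VAS_FLAGS_PIN_WINDOW and VAS_THRESH_DISABLED being 0 are assumptions, tx_attr_check() is an illustrative name, and the vas_id and rx_win_handle lookups are left out because they need kernel state.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the layout quoted above. */
struct vas_tx_win_open_attr {
	int32_t  version;
	int16_t  vas_id;
	uint32_t rx_win_handle;
	int64_t  reserved1;
	int64_t  flags;
	int64_t  reserved2;
	int32_t  tc_mode;
	int32_t  rsvd_txbuf;
	int64_t  reserved3[6];
};

#define VAS_FLAGS_PIN_WINDOW	0x1	/* assumed bit value */
#define VAS_THRESH_DISABLED	0	/* assumed to be 0 ("must be set to 0") */

/* Returns 0, -EINVAL or -EPERM following the documented error conditions. */
int tx_attr_check(const struct vas_tx_win_open_attr *a, bool privileged)
{
	int i;

	if (a->version != 1)
		return -EINVAL;
	if (a->reserved1 != 0 || a->reserved2 != 0)
		return -EINVAL;
	for (i = 0; i < 6; i++)
		if (a->reserved3[i] != 0)
			return -EINVAL;
	if (a->rsvd_txbuf != 0 || a->tc_mode != VAS_THRESH_DISABLED)
		return -EINVAL;
	/* For FTW windows every bit other than PIN_WINDOW is invalid. */
	if (a->flags & ~(int64_t)VAS_FLAGS_PIN_WINDOW)
		return -EINVAL;
	if ((a->flags & VAS_FLAGS_PIN_WINDOW) && !privileged)
		return -EPERM;
	return 0;
}
```

Checking for unknown flag bits before the privilege test means an unprivileged caller passing garbage flags sees EINVAL rather than EPERM; the text does not say which should win, so that ordering is a choice made here.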
> +4. mmap() NX-FTW device fd
> +
> +    The mmap() system call for a NX-FTW device fd returns a "paste address"
> +    that the application can use to COPY/PASTE a CRB to the waiting thread.
> +
> +        paste_addr = mmap(NULL, size, prot, flags, fd, offset);
> +
> +    The mmap() operation is only valid on a file descriptor associated
> +    with a send window.
> +
> +    The only restrictions on mmap() for an NX-FTW device fd are:
> +
> +        - size parameter should be one page size
> +
> +        - offset parameter should be 0ULL.
> +
> +    Refer to mmap(2) man page for additional details/restrictions.
> +
> +    In addition to the error conditions listed on the mmap(2) man page,
> +    mmap() can also fail with one of following error codes:
> +
> +        EINVAL      fd is not associated with an open send window (i.e mmap()
> +                    does not follow a successful call to the VAS_TX_WIN_OPEN
> +                    ioctl).
> +
> +        EINVAL      offset field is not 0ULL.
> +
> +
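
A short sketch of the two mmap() restrictions above (one page, offset 0) and of a wrapper that applies them. The helper names are illustrative, and PROT_WRITE with MAP_SHARED is an assumption, since the text leaves prot and flags unspecified.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Non-zero if (size, offset) satisfy the documented restrictions. */
int ftw_mmap_args_ok(size_t size, off_t offset)
{
	long page = sysconf(_SC_PAGESIZE);

	return page > 0 && size == (size_t)page && offset == 0;
}

/*
 * Map the send window of 'fd' and return its paste address, or NULL on
 * failure (errno is set by mmap()). The fd is expected to come from an
 * open() followed by a successful VAS_TX_WIN_OPEN ioctl.
 */
void *ftw_map_paste_addr(int fd)
{
	long page = sysconf(_SC_PAGESIZE);
	void *addr;

	if (page <= 0)
		return NULL;
	/* PROT_WRITE | MAP_SHARED: assumed, not stated in the text above */
	addr = mmap(NULL, (size_t)page, PROT_WRITE, MAP_SHARED, fd, 0);
	return addr == MAP_FAILED ? NULL : addr;
}
```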
> +5. VAS ID
> +
> +    A system may have several instances of VAS in the hardware, typically
> +    one per POWER 9 chip. The choice of a specific instance of VAS can have
> +    significant impact on the performance, especially if the application
> +    migrates from one CPU to another. Applications can specify a vas_id
> +    using the VAS_TX_WIN_OPEN and VAS_RX_WIN_OPEN ioctls and should be
> +    prudent in choosing an instance of VAS.
> +
> +    The vas_id for each instance of VAS is listed as the device tree
> +    property 'ibm,vas-id'. Determining the specific vas_id to use for
> +    a specific application thread is beyond the scope of this API.
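
For reference, device tree property values are stored big-endian, so a user-space reader has to convert before using the value as a vas_id. A minimal sketch, assuming 'ibm,vas-id' is a single 32-bit cell; the location of the property file (somewhere under /proc/device-tree) is not given above, so the caller passes an already opened file. dt_cell_to_u32() and read_vas_id() are illustrative names.

```c
#include <stdint.h>
#include <stdio.h>

/* Convert one big-endian device tree cell to host byte order. */
uint32_t dt_cell_to_u32(const unsigned char cell[4])
{
	return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
	       ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/*
 * Read a single-cell property from an opened property file.
 * Returns 0 on success, -1 if fewer than four bytes are available.
 */
int read_vas_id(FILE *prop, uint32_t *vas_id)
{
	unsigned char cell[4];

	if (fread(cell, 1, sizeof(cell), prop) != sizeof(cell))
		return -1;
	*vas_id = dt_cell_to_u32(cell);
	return 0;
}
```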

I would lean towards having 1 device per vas/chip but I'll defer to mpe and benh
on the best option here.

Are you planning a libftw to do this?

> +
> +    If the application has no preference, the vas_id field may be set to
> +    -1 and the kernel will choose a suitable instance of the VAS engine.

+1

> +6. COPY/PASTE operations:
> +
> +    Applications should use the COPY and PASTE instructions defined in
> +    the RFC to copy/paste the CRB. For VAS/FTW usage, the contents of
> +    CRB if any, are ignored. CRB can be NULL.
> +
> +7. Interrupt completion and signal handling
> +
> +    No VAS-specific signals will be generated to the application threads
> +    with the VAS/FTW usage.

+1

> +
> +
> +8. Example/Proposed usage of the VAS/FTW API
> +
> +    In the following example we use two threads that use the VAS/FTW API.
> +    Thread T1 uses the WAIT instruction to wait for an event. Thread T2
> +    uses copy/paste instructions to wake up T1.

So here's how pseudo code for my idea would look with pthreads.  

I've also added some memory barriers. The ISA suggests that copy/paste has no
ordering associated with it, so you are going to need them I think. I'm not sure
of the flavour though.

---
bool done = false;
int rxfd;

static void receiver(void)
{
	do {
		asm("wait");
		smp_mb(); /* needed for wait -> memory */
	} while (!done); /* check for spurious wakeup */
	/* woken up! */
}

static void *sender(void *arg)
{
	void *paste_addr;

	/* mmap the rx file descriptor */
	paste_addr = mmap(NULL, getpagesize(), prot, MAP_SHARED, rxfd, 0);

	done = true;
	smp_mb(); /* needed for memory -> paste */
	write_crb(paste_addr);
	return NULL;
}

int main()
{
	pthread_t thread;
	int devfd;

	devfd = open("/dev/vas-ftw", O_RDWR);

	/* create a new rx file descriptor associated with this LPID/PID/TID */
	rxfd = ioctl(devfd, VAS_RX_CREATE);

	pthread_create(&thread, NULL, sender, NULL);

	/*
	 * Receiver must *not* be a new thread since the VAS_RX_CREATE
	 * ioctl is associated with this LPID/PID/TID.
	 */
	receiver();
}


2017-08-14 11:01:02

by Michael Ellerman

Subject: Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

Nicholas Piggin <[email protected]> writes:

> On Mon, 14 Aug 2017 15:21:48 +1000
> Michael Ellerman <[email protected]> wrote:
>
>> Sukadev Bhattiprolu <[email protected]> writes:
>
>> > arch/powerpc/include/asm/vas.h | 35 ++++
>> > arch/powerpc/include/uapi/asm/vas.h | 25 +++
>>
>> I thought we weren't exposing VAS to userspace yet?
>>
>> If we are then we need to get things straight WRT copy/paste abort.
>
> No we should not be. This might be just a leftover hunk that should
> be moved to a future series.
>
> At the moment (as far as I understand) it should be limited to
> preempt-disabled, process context, kernel users which avoids any
> concern for switch_to.

I think that comment applied to a previous version, see patch 16.

cheers

2017-08-14 11:16:57

by Michael Ellerman

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

Sukadev Bhattiprolu <[email protected]> writes:

> We need the SPRN_TIDR to be set for use with fast thread-wakeup
> (core-to-core wakeup). Each thread in a process needs to have a
> unique id within the process but as explained below, for now, we
> assign globally unique thread ids to all threads in the system.

Each thread in a process already has a unique id, ie. its pid (in the
init PID namespace), accessible in the kernel as task_pid_nr(task).

So if that's all we need, we don't need a new allocator, and we don't
need to store it in the thread_struct.

Also 99.99% of processes aren't going to care about the TIDR, so we
should avoid setting it in the common case. ie. it should start out zero
and only be initialised in the FTW code, or a helper that it calls.

> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 9f3e2c9..6123859 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1213,6 +1213,16 @@ struct task_struct *__switch_to(struct task_struct *prev,
> hard_irq_disable();
> }
>
> +#ifdef CONFIG_PPC_VAS
> + mtspr(SPRN_TIDR, new->thread.tidr);
> +#endif

That should be in restore_sprs().

It should also check that the TIDR is initialised, and only switch it
when necessary.
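
A user-space model of that suggestion, following the compare-before-write pattern restore_sprs() uses for other SPRs. This is only a sketch: struct thread_model stands in for thread_struct, mock_mtspr_tidr() stands in for mtspr(SPRN_TIDR, ...), and a TIDR of 0 is taken to mean "never assigned".

```c
#include <stdint.h>

/* Stand-in for the tidr field proposed for thread_struct. */
struct thread_model {
	uint64_t tidr;	/* 0 means the thread never asked for a TID */
};

static uint64_t hw_tidr;		/* models the SPR contents */
static unsigned int tidr_writes;	/* counts simulated mtspr calls */

/* Stand-in for mtspr(SPRN_TIDR, val). */
void mock_mtspr_tidr(uint64_t val)
{
	hw_tidr = val;
	tidr_writes++;
}

/*
 * Only touch the SPR when the outgoing and incoming values differ.
 * Two tasks that never set a TIDR (both 0) therefore cost nothing.
 */
void restore_tidr(const struct thread_model *prev,
		  const struct thread_model *next)
{
	if (prev->tidr != next->tidr)
		mock_mtspr_tidr(next->tidr);
}
```

Switching from an FTW task back to an ordinary one still writes 0, so a stale TID does not stay live in the register.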

> + /*
> + * We can't take a PMU exception inside _switch() since there is a
> + * window where the kernel stack SLB and the kernel stack are out
> + * of sync. Hard disable here.
> + */
> + hard_irq_disable();

We removed that in June in:

e4c0fc5f72bc ("powerpc/64s: Leave interrupts hard enabled in context switch for radix")

You've obviously picked it up somewhere along the line during a rebase,
please be more careful!

cheers

2017-08-14 13:30:44

by Benjamin Herrenschmidt

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

On Mon, 2017-08-14 at 17:02 +1000, Michael Neuling wrote:
> > +/*
> > + * We need to assign an unique thread id to each thread in a process. This
> > + * thread id is intended to be used with the Fast Thread-wakeup (aka Core-
> > + * to-core wakeup) mechanism being implemented on top of Virtual Accelerator
> > + * Switchboard (VAS).
> > + *
> > + * To get a unique thread-id per process we could simply use task_pid_nr()
> > + * but the problem is that task_pid_nr() is not yet available for the thread
> > + * when copy_thread() is called. Fixing that would require changing more
> > + * intrusive arch-neutral code in code path in copy_process()?.
> > + *
> > + * Further, to assign unique thread ids within each process, we need an
> > + * atomic field (or an IDR) in task_struct, which again intrudes into the
> > + * arch-neutral code.
>
> Really?
>
> > + * So try to assign globally unique thread ids for now.
>
> Yuck!

Also CAPI has size limits for the TIDR afaik

Ben.

2017-08-14 19:14:17

by Sukadev Bhattiprolu

Subject: Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

Nicholas Piggin [[email protected]] wrote:
> On Mon, 14 Aug 2017 15:21:48 +1000
> Michael Ellerman <[email protected]> wrote:
>
> > Sukadev Bhattiprolu <[email protected]> writes:
>
> > > arch/powerpc/include/asm/vas.h | 35 ++++
> > > arch/powerpc/include/uapi/asm/vas.h | 25 +++
> >
> > I thought we weren't exposing VAS to userspace yet?
> >
> > If we are then we need to get things straight WRT copy/paste abort.
>

> No we should not be. This might be just a leftover hunk that should
> be moved to a future series.

Yes, I should have posted patches 14..17 separately as an RFC that goes
on top of the VAS kernel patches 1..13.

>
> At the moment (as far as I understand) it should be limited to
> preempt-disabled, process context, kernel users which avoids any
> concern for switch_to.
>

In the FTW case, there is no data transfer from user space to the hardware.
i.e the copy/paste submit a NULL CRB and hardware will be configured (see
->fifo_disable setting in winctx) to ignore any data they specify in the CRB.

Would we be able to allow copy/paste from user space in that case?

Sukadev

2017-08-14 19:27:33

by Sukadev Bhattiprolu

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

Benjamin Herrenschmidt [[email protected]] wrote:
> On Mon, 2017-08-14 at 17:02 +1000, Michael Neuling wrote:
> > > +/*
> > > + * We need to assign an unique thread id to each thread in a process. This
> > > + * thread id is intended to be used with the Fast Thread-wakeup (aka Core-
> > > + * to-core wakeup) mechanism being implemented on top of Virtual Accelerator
> > > + * Switchboard (VAS).
> > > + *
> > > + * To get a unique thread-id per process we could simply use task_pid_nr()
> > > + * but the problem is that task_pid_nr() is not yet available for the thread
> > > + * when copy_thread() is called. Fixing that would require changing more
> > > + * intrusive arch-neutral code in code path in copy_process()?.
> > > + *
> > > + * Further, to assign unique thread ids within each process, we need an
> > > + * atomic field (or an IDR) in task_struct, which again intrudes into the
> > > + * arch-neutral code.
> >
> > Really?
> >
> > > + * So try to assign globally unique thread ids for now.
> >
> > Yuck!

I know :-) copy_process() has:

	retval = copy_thread_tls(clone_flags, stack_start, stack_size, p, tls);
	if (retval)
		goto bad_fork_cleanup_io;

	if (pid != &init_struct_pid) {
		pid = alloc_pid(p->nsproxy->pid_ns_for_children);
		if (IS_ERR(pid)) {

so copy_thread() is called before a pid_nr is assigned to the task.

But see also response to Michael Ellerman.

>
> Also CAPI has size limits for the TIDR afaik

Ok.

>
> Ben.

2017-08-14 20:03:19

by Sukadev Bhattiprolu

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > We need the SPRN_TIDR to be set for use with fast thread-wakeup
> > (core-to-core wakeup). Each thread in a process needs to have a
> > unique id within the process but as explained below, for now, we
> > assign globally unique thread ids to all threads in the system.
>
> Each thread in a process already has a unique id, ie. its pid (in the
> init PID namespace), accessible in the kernel as task_pid_nr(task).
>
> So if that's all we need, we don't need a new allocator, and we don't
> need to store it in the thread_struct.
>
> Also 99.99% of processes aren't going to care about the TIDR, so we
> should avoid setting it in the common case. ie. it should start out zero
> and only be initialised in the FTW code, or a helper that it calls.

Good point. So, should we just set it when the RX_WIN_OPEN ioctl is called
rather than at the time of clone()?

__switch_to() (restore_sprs()) could check for non-zero and save/restore
the value.

As Ben pointed out, we are going to have to limit the number of TIDs (to
be within the size limits), so we won't be able to use task_pid_nr()? But
if we assign the TIDs in the RX_WIN_OPEN ioctl, then only the FTW processes
will need the TIDR value.

Can we then assign new, globally-unique TID values for now and have the ioctl
fail with -EAGAIN if all TIDs are in use? We can extend to per-process TID
values, later?
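
To make the question concrete, a sketch of a bounded, globally unique TID allocator that reports exhaustion with -EAGAIN. MAX_TIDS is hypothetical (the hardware limit is not stated in this thread), tid_alloc() and tid_free() are illustrative names, and the model has no locking; a kernel version would need a lock or an IDA.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_TIDS 64	/* hypothetical bound standing in for the HW limit */

static uint64_t tid_bitmap;	/* bit n set => TID n + 1 is in use */

/* Returns a TID in 1..MAX_TIDS, or -EAGAIN when all are in use. */
int tid_alloc(void)
{
	int i;

	for (i = 0; i < MAX_TIDS; i++) {
		if (!(tid_bitmap & (UINT64_C(1) << i))) {
			tid_bitmap |= UINT64_C(1) << i;
			return i + 1;	/* 0 stays reserved for "unassigned" */
		}
	}
	return -EAGAIN;
}

/* Release a TID previously returned by tid_alloc(). */
void tid_free(int tid)
{
	if (tid >= 1 && tid <= MAX_TIDS)
		tid_bitmap &= ~(UINT64_C(1) << (tid - 1));
}
```

Keeping 0 out of the allocated range lets the context-switch path treat a zero TIDR as "nothing to restore".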

>
> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 9f3e2c9..6123859 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -1213,6 +1213,16 @@ struct task_struct *__switch_to(struct task_struct *prev,
> > hard_irq_disable();
> > }
> >
> > +#ifdef CONFIG_PPC_VAS
> > + mtspr(SPRN_TIDR, new->thread.tidr);
> > +#endif
>
> That should be in restore_sprs().

ok.
>
> It should also check that the TIDR is initialised, and only switch it
> when necessary.
>
> > + /*
> > + * We can't take a PMU exception inside _switch() since there is a
> > + * window where the kernel stack SLB and the kernel stack are out
> > + * of sync. Hard disable here.
> > + */
> > + hard_irq_disable();
>
> We removed that in June in:
>
> e4c0fc5f72bc ("powerpc/64s: Leave interrupts hard enabled in context switch for radix")
>
> You've obviously picked it up somewhere along the line during a rebase,
> please be more careful!

Yeah, That was stupid. I picked it up on a recent rebase. Will be careful.

>
> cheers

2017-08-14 22:23:35

by Benjamin Herrenschmidt

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

On Mon, 2017-08-14 at 21:16 +1000, Michael Ellerman wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > We need the SPRN_TIDR to be set for use with fast thread-wakeup
> > (core-to-core wakeup). Each thread in a process needs to have a
> > unique id within the process but as explained below, for now, we
> > assign globally unique thread ids to all threads in the system.
>
> Each thread in a process already has a unique id, ie. its pid (in the
> init PID namespace), accessible in the kernel as task_pid_nr(task).
>
> So if that's all we need, we don't need a new allocator, and we don't
> need to store it in the thread_struct.

We need an allocator, I think, due to size restriction on the HW TID.

> Also 99.99% of processes aren't going to care about the TIDR, so we
> should avoid setting it in the common case. ie. it should start out zero
> and only be initialised in the FTW code, or a helper that it calls.


> > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > index 9f3e2c9..6123859 100644
> > --- a/arch/powerpc/kernel/process.c
> > +++ b/arch/powerpc/kernel/process.c
> > @@ -1213,6 +1213,16 @@ struct task_struct *__switch_to(struct task_struct *prev,
> > hard_irq_disable();
> > }
> >
> > +#ifdef CONFIG_PPC_VAS
> > + mtspr(SPRN_TIDR, new->thread.tidr);
> > +#endif
>
> That should be in restore_sprs().
>
> It should also check that the TIDR is initialised, and only switch it
> when necessary.
>
> > + /*
> > + * We can't take a PMU exception inside _switch() since there is a
> > + * window where the kernel stack SLB and the kernel stack are out
> > + * of sync. Hard disable here.
> > + */
> > + hard_irq_disable();
>
> We removed that in June in:
>
> e4c0fc5f72bc ("powerpc/64s: Leave interrupts hard enabled in context switch for radix")
>
> You've obviously picked it up somewhere along the line during a rebase,
> please be more careful!
>
> cheers

2017-08-14 22:25:27

by Benjamin Herrenschmidt

Subject: Re: [PATCH v6 14/17] powerpc: Add support for setting SPRN_TIDR

On Mon, 2017-08-14 at 13:03 -0700, Sukadev Bhattiprolu wrote:
> As Ben pointed out, we are going to have to limit the number of TIDs (to
> be within the size limits), so we won't be able to use task_pid_nr()? But
> if we assign the TIDs in the RX_WIN_OPEN ioctl, then only the FTW processes
> will need the TIDR value.

But you'll have to assign it for all present and future threads of that
process which is somewhat hard to do without races.

> Can we then assign new, globally-unique TID values for now and have the ioctl
> fail with -EAGAIN if all TIDs are in use? We can extend to per-process TID
> values, later?

Why would you want to do that ?

Ben.

2017-08-16 12:07:45

by Michael Ellerman

Subject: Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

Sukadev Bhattiprolu <[email protected]> writes:

> Nicholas Piggin [[email protected]] wrote:
>> On Mon, 14 Aug 2017 15:21:48 +1000
>> Michael Ellerman <[email protected]> wrote:
>>
>> > Sukadev Bhattiprolu <[email protected]> writes:
>>
>> > > arch/powerpc/include/asm/vas.h | 35 ++++
>> > > arch/powerpc/include/uapi/asm/vas.h | 25 +++
>> >
>> > I thought we weren't exposing VAS to userspace yet?
>> >
>> > If we are then we need to get things straight WRT copy/paste abort.
...
>
> In the FTW case, there is no data transfer from user space to the hardware.
> i.e the copy/paste submit a NULL CRB and hardware will be configured (see
> ->fifo_disable setting in winctx) to ignore any data they specify in the CRB.

I thought the copy did copy a cacheline, but then the paste to the VAS
window just ignores the contents, and doesn't allow userspace to get the
content in any way?

Which means we have two thirds of a covert channel, ie. something can be
copied into the copy buffer by one process, and then a second process
can paste it, but because it can only paste to foreign memory, and the
only foreign memory it can get is a VAS FTW window, it can't actually
see the content of the copy buffer.

> Would we be able to allow copy/paste from user space in that case?

Yeah I think so, but it is all a bit fragile.

cheers

2017-08-16 23:07:51

by Sukadev Bhattiprolu

Subject: Re: [PATCH v6 01/17] powerpc/vas: Define macros, register fields and structures

Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > Nicholas Piggin [[email protected]] wrote:
> >> On Mon, 14 Aug 2017 15:21:48 +1000
> >> Michael Ellerman <[email protected]> wrote:
> >>
> >> > Sukadev Bhattiprolu <[email protected]> writes:
> >>
> >> > > arch/powerpc/include/asm/vas.h | 35 ++++
> >> > > arch/powerpc/include/uapi/asm/vas.h | 25 +++
> >> >
> >> > I thought we weren't exposing VAS to userspace yet?
> >> >
> >> > If we are then we need to get things straight WRT copy/paste abort.
> ...
> >
> > In the FTW case, there is no data transfer from user space to the hardware.

Sorry, that was focussed on the paste side.

> > i.e the copy/paste submit a NULL CRB and hardware will be configured (see
> > ->fifo_disable setting in winctx) to ignore any data they specify in the CRB.
>
> I thought the copy did copy a cacheline, but then the paste to the VAS
> window just ignores the contents, and doesn't allow userspace to get the
> content in any way?

Yes, you are right. The copy instruction does read the CRB into its copy-
buffer but for the FTW, VAS ignores the copy-buffer contents on paste.
So, the CRB may be zeroed, but must be a valid buffer.

>
> Which means we have two thirds of a covert channel, ie. something can be
> copied into the copy buffer by one process, and then a second process
> can paste it, but because it can only paste to foreign memory, and the
> only foreign memory it can get is a VAS FTW window, it can't actually
> see the content of the copy buffer.
>
> > Would we be able to allow copy/paste from user space in that case?
>
> Yeah I think so, but it is all a bit fragile.
>
> cheers