Power9 introduces a hardware subsystem referred to as the Virtual
Accelerator Switchboard (VAS). VAS allows kernel subsystems and user
space processes to directly access the Nest Accelerator (NX) engines
which implement compression and encryption algorithms in the hardware.
NX has been in Power processors since Power7+, but access to the NX
engines was through the 'icswx' instruction which is only available
to the kernel/hypervisor. Starting with Power9, access to the NX
engines is provided to both kernel and user space processes through
VAS.
The switchboard (i.e VAS) multiplexes accesses between "receivers" and
"senders", where the "receivers" are typically the NX engines and
"senders" are the kernel subsystems and user processors that wish to
access the receivers (NX engines). Once a sender is "connected" to
a receiver through the switchboard, the senders can submit compression/
encryption requests to the hardware using the new (PowerISA 3.0)
"copy" and "paste" instructions.
In the initial OPAL and PowerNV kernel patchsets, the "senders" can
only be kernel subsystems (eg NX-842 driver) and receivers can only
be the NX-842 engine. Follow-on patch sets will allow senders/receivers
to be user-space processes and receivers to be NX-GZIP engines.
Provides:
This kernel patch set configures the VAS subsystems and provides
kernel interfaces to drivers like NX-842 to open receive and send
windows in VAS and to submit compression requests to the NX engine.
Requires:
This patch set needs corresponding VAS/NX skiboot patches which
were merged into skiboot tree. i.e skiboot must include:
commit b503dcf ("vas: Set mmio enable bits in DD2")
Tests:
In-kernel compression requests were tested on DD1 and DD2 POWER9
hardware using compression self-test module and the following
NX-842 patch set from Haren Myneni:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-July/160620.html
Git Tree:
https://github.com/sukadev/linux/
Branch: vas-kern-v7
Thanks to input from Ben Herrenschmidt, Michael Neuling, Michael Ellerman
and Haren Myneni.
Changelog[v7]:
- Drop support for user space send/receive FTW windows (will be
posted separately) Simplifies the rx-win-open interface a bit.
- [Michael Ellerman] Move GET_FIELD/SET_FIELD macros from
uapi/asm/vas.h to asm/vas.h.
Changelog[v6]
- Add support for user space send/receive FTW windows
- Add a new, NX-FTW driver which provides the FTW user interface
Changelog[v5]
- [Ben Herrenschmidt] Make VAS a platform device in the device tree
and use the core platform functions to parse the VAS properties.
Map the VAS MMIO regions as non-cachable and paste regions as
cachable. Use CONFIG_PPC_VAS rather than CONFIG_VAS; Don't assume
VAS ids are sequential.
- Copy the FIFO address as is into LFIFO_BAR (don't shift it).
Changelog[v4]
Comments from Michael Neuling:
- Move VAS code from drivers/misc/vas to arch/powerpc/platforms/powernv
since VAS only provides interfaces to other drivers like NX-842.
- Drop vas-internal.h and use vas.h in separate dirs for VAS
internal, kernel API and user API
- Rather than create 6 separate device tree properties windows
and window context, combine them into 6 "reg" properties.
- Drop vas_window_reset() since windows are reset/cleared before
being assigned to kernel/users.
- Use ilog2() and radix_enabled() helpers
Changelog[v3]
- Rebase to v4.11-rc1
- Add interfaces to initialize send/receive window attributes to
defaults that drivers can use (see arch/powerpc/include/asm/vas.h)
- Modify interface vas_paste() to return 0 or error code
- Fix a bug in setting Translation Control Mode (0b11 not 0x11)
- Enable send-window-credit checking
- Reorg code in vas_win_close()
- Minor reorgs and tweaks to register field settings to make it
easier to add support for user space windows.
- Skip writing to read-only registers
- Start window indexing from 0 rather than 1
Changelog[v2]
- Use vas-id, HVWC, UWC and paste address, entries from device tree
rather than defining/computing them in kernel and reorg code.
Sukadev Bhattiprolu (12):
powerpc/vas: Define macros, register fields and structures
Move GET_FIELD/SET_FIELD to vas.h
powerpc/vas: Define vas_init() and vas_exit()
powerpc/vas: Define helpers to access MMIO regions
powerpc/vas: Define helpers to init window context
powerpc/vas: Define helpers to alloc/free windows
powerpc/vas: Define vas_win_paste_addr()
powerpc/vas: Define vas_win_id()
powerpc/vas: Define vas_rx_win_open() interface
powerpc/vas: Define vas_win_close() interface
powerpc/vas: Define vas_tx_win_open()
powerpc/vas: Define copy/paste interfaces
.../devicetree/bindings/powerpc/ibm,vas.txt | 24 +
MAINTAINERS | 9 +
arch/powerpc/include/asm/vas.h | 174 +++
arch/powerpc/platforms/powernv/Kconfig | 14 +
arch/powerpc/platforms/powernv/Makefile | 1 +
arch/powerpc/platforms/powernv/copy-paste.h | 74 ++
arch/powerpc/platforms/powernv/vas-window.c | 1189 ++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.c | 183 +++
arch/powerpc/platforms/powernv/vas.h | 500 ++++++++
drivers/crypto/nx/nx-842-powernv.c | 7 +-
drivers/crypto/nx/nx-842.h | 5 -
11 files changed, 2172 insertions(+), 8 deletions(-)
create mode 100644 Documentation/devicetree/bindings/powerpc/ibm,vas.txt
create mode 100644 arch/powerpc/include/asm/vas.h
create mode 100644 arch/powerpc/platforms/powernv/copy-paste.h
create mode 100644 arch/powerpc/platforms/powernv/vas-window.c
create mode 100644 arch/powerpc/platforms/powernv/vas.c
create mode 100644 arch/powerpc/platforms/powernv/vas.h
--
2.7.4
Define macros for the VAS hardware registers and bit-fields as well
as couple of data structures needed by the VAS driver.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v7]
- Move the threshold control macros from uapi/asm/vas.h to
asm/vas.h for now. When we actually have an user space need for
them, we can move them to uapi/asm/vas.h. With this change,
uapi/asm/vas.h is empty and can be dropped from this patch.
Changelog[v6]
- Add some fields for FTW windows
Changelog[v4]
- [Michael Neuling] Move VAS code to arch/powerpc; Reorg vas.h and
vas-internal.h to kernel and uapi versions; rather than creating
separate properties for window context/address entries in device
tree, combine them into "reg" properties; drop ->hwirq and irq_port
fields from vas_window as they are only needed with user space
windows.
- Drop the error check for CONFIG_PPC_4K_PAGES. Instead in a
follow-on patch add a "depends on CONFIG_PPC_64K_PAGES".
Changelog[v3]
- Rename winctx->pid to winctx->pidr to reflect that its a value
from the PID register (SPRN_PID), not the linux process id.
- Make it easier to split header into kernel/user parts
- To keep user interface simple, use macros rather than enum for
the threshold-control modes.
- Add a pid field to struct vas_window - needed for user space
send windows.
Changelog[v2]
- Add an overview of VAS in vas-internal.h
- Get window context parameters from device tree and drop
unnecessary macros.
---
arch/powerpc/include/asm/vas.h | 45 ++++
arch/powerpc/platforms/powernv/vas.h | 385 +++++++++++++++++++++++++++++++++++
2 files changed, 430 insertions(+)
create mode 100644 arch/powerpc/include/asm/vas.h
create mode 100644 arch/powerpc/platforms/powernv/vas.h
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
new file mode 100644
index 0000000..66a863c
--- /dev/null
+++ b/arch/powerpc/include/asm/vas.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _MISC_VAS_H
+#define _MISC_VAS_H
+
+/*
+ * Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
+ * (Local FIFO Size Register) of the VAS workbook.
+ */
+#define VAS_RX_FIFO_SIZE_MIN (1 << 10) /* 1KB */
+#define VAS_RX_FIFO_SIZE_MAX (8 << 20) /* 8MB */
+
+/*
+ * Threshold Control Mode: Have paste operation fail if the number of
+ * requests in receive FIFO exceeds a threshold.
+ *
+ * NOTE: No special error code yet if paste is rejected because of these
+ * limits. So users can't distinguish between this and other errors.
+ */
+#define VAS_THRESH_DISABLED 0
+#define VAS_THRESH_FIFO_GT_HALF_FULL 1
+#define VAS_THRESH_FIFO_GT_QTR_FULL 2
+#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
+
+/*
+ * Co-processor Engine type.
+ */
+enum vas_cop_type {
+ VAS_COP_TYPE_FAULT,
+ VAS_COP_TYPE_842,
+ VAS_COP_TYPE_842_HIPRI,
+ VAS_COP_TYPE_GZIP,
+ VAS_COP_TYPE_GZIP_HIPRI,
+ VAS_COP_TYPE_FTW,
+ VAS_COP_TYPE_MAX,
+};
+
+#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
new file mode 100644
index 0000000..c66aaf0
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -0,0 +1,385 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _VAS_H
+#define _VAS_H
+#include <linux/atomic.h>
+#include <linux/idr.h>
+#include <asm/vas.h>
+
+/*
+ * Overview of Virtual Accelerator Switchboard (VAS).
+ *
+ * VAS is a hardware "switchboard" that allows senders and receivers to
+ * exchange messages with _minimal_ kernel involvment. The receivers are
+ * typically NX coprocessor engines that perform compression or encryption
+ * in hardware, but receivers can also be other software threads.
+ *
+ * Senders are user/kernel threads that submit compression/encryption or
+ * other requests to the receivers. Senders must format their messages as
+ * Coprocessor Request Blocks (CRB)s and submit them using the "copy" and
+ * "paste" instructions which were introduced in Power9.
+ *
+ * A Power node can have (upto?) 8 Power chips. There is one instance of
+ * VAS in each Power9 chip. Each instance of VAS has 64K windows or ports,
+ * Senders and receivers must each connect to a separate window before they
+ * can exchange messages through the switchboard.
+ *
+ * Each window is described by two types of window contexts:
+ *
+ * Hypervisor Window Context (HVWC) of size VAS_HVWC_SIZE bytes
+ *
+ * OS/User Window Context (UWC) of size VAS_UWC_SIZE bytes.
+ *
+ * A window context can be viewed as a set of 64-bit registers. The settings
+ * in these registers configure/control/determine the behavior of the VAS
+ * hardware when messages are sent/received through the window. The registers
+ * in the HVWC are configured by the kernel while the registers in the UWC can
+ * be configured by the kernel or by the user space application that is using
+ * the window.
+ *
+ * The HVWCs for all windows on a specific instance of VAS are in a contiguous
+ * range of hardware addresses or Base address region (BAR) referred to as the
+ * HVWC BAR for the instance. Similarly the UWCs for all windows on an instance
+ * are referred to as the UWC BAR for the instance.
+ *
+ * The two BARs for each instance are defined Power9 MMIO Ranges spreadsheet
+ * and available to the kernel in the VAS node's "reg" property in the device
+ * tree:
+ *
+ * /proc/device-tree/vasm@.../reg
+ *
+ * (see vas_probe() for details on the reg property).
+ *
+ * The kernel maps the HVWC and UWC BAR regions into the kernel address
+ * space (hvwc_map and uwc_map). The kernel can then access the window
+ * contexts of a specific window using:
+ *
+ * hvwc = hvwc_map + winid * VAS_HVWC_SIZE.
+ * uwc = uwc_map + winid * VAS_UWC_SIZE.
+ *
+ * where winid is the window index (0..64K).
+ *
+ * As mentioned, a window context is used to "configure" a window. Besides
+ * this configuration address, each _send_ window also has a unique hardware
+ * "paste" address that is used to submit requests/CRBs (see vas_paste_crb()).
+ *
+ * The hardware paste address for a window is computed using the "paste
+ * base address" and "paste win id shift" reg properties in the VAS device
+ * tree node using:
+ *
+ * paste_addr = paste_base + ((winid << paste_win_id_shift))
+ *
+ * (again, see vas_probe() for ->paste_base_addr and ->paste_win_id_shift).
+ *
+ * The kernel maps this hardware address into the sender's address space
+ * after which they can use the 'paste' instruction (new in Power9) to
+ * send a message (submit a request aka CRB) to the coprocessor.
+ *
+ * NOTE: In the initial version, senders can only in-kernel drivers/threads.
+ * Support for user space threads will be added in follow-on patches.
+ *
+ * TODO: Do we need to map the UWC into user address space so they can return
+ * credits? Its NA for NX but may be needed for other receive windows.
+ *
+ */
+
+#define VAS_WINDOWS_PER_CHIP (64 << 10)
+
+/*
+ * Hypervisor and OS/USer Window Context sizes
+ */
+#define VAS_HVWC_SIZE 512
+#define VAS_UWC_SIZE PAGE_SIZE
+
+/*
+ * Initial per-process credits.
+ * Max send window credits: 4K-1 (12-bits in VAS_TX_WCRED)
+ * Max receive window credits: 64K-1 (16 bits in VAS_LRX_WCRED)
+ *
+ * TODO: Needs tuning for per-process credits
+ */
+#define VAS_WCREDS_MIN 16
+#define VAS_WCREDS_MAX ((64 << 10) - 1)
+#define VAS_WCREDS_DEFAULT (1 << 10)
+
+/*
+ * VAS Window Context Register Offsets and bitmasks.
+ * See Section 3.1.4 of VAS Work book
+ */
+#define VAS_LPID_OFFSET 0x010
+#define VAS_LPID PPC_BITMASK(0, 11)
+
+#define VAS_PID_OFFSET 0x018
+#define VAS_PID_ID PPC_BITMASK(0, 19)
+
+#define VAS_XLATE_MSR_OFFSET 0x020
+#define VAS_XLATE_MSR_DR PPC_BIT(0)
+#define VAS_XLATE_MSR_TA PPC_BIT(1)
+#define VAS_XLATE_MSR_PR PPC_BIT(2)
+#define VAS_XLATE_MSR_US PPC_BIT(3)
+#define VAS_XLATE_MSR_HV PPC_BIT(4)
+#define VAS_XLATE_MSR_SF PPC_BIT(5)
+
+#define VAS_XLATE_LPCR_OFFSET 0x028
+#define VAS_XLATE_LPCR_PAGE_SIZE PPC_BITMASK(0, 2)
+#define VAS_XLATE_LPCR_ISL PPC_BIT(3)
+#define VAS_XLATE_LPCR_TC PPC_BIT(4)
+#define VAS_XLATE_LPCR_SC PPC_BIT(5)
+
+#define VAS_XLATE_CTL_OFFSET 0x030
+#define VAS_XLATE_MODE PPC_BITMASK(0, 1)
+
+#define VAS_AMR_OFFSET 0x040
+#define VAS_AMR PPC_BITMASK(0, 63)
+
+#define VAS_SEIDR_OFFSET 0x048
+#define VAS_SEIDR PPC_BITMASK(0, 63)
+
+#define VAS_FAULT_TX_WIN_OFFSET 0x050
+#define VAS_FAULT_TX_WIN PPC_BITMASK(48, 63)
+
+#define VAS_OSU_INTR_SRC_RA_OFFSET 0x060
+#define VAS_OSU_INTR_SRC_RA PPC_BITMASK(8, 63)
+
+#define VAS_HV_INTR_SRC_RA_OFFSET 0x070
+#define VAS_HV_INTR_SRC_RA PPC_BITMASK(8, 63)
+
+#define VAS_PSWID_OFFSET 0x078
+#define VAS_PSWID_EA_HANDLE PPC_BITMASK(0, 31)
+
+#define VAS_SPARE1_OFFSET 0x080
+#define VAS_SPARE2_OFFSET 0x088
+#define VAS_SPARE3_OFFSET 0x090
+#define VAS_SPARE4_OFFSET 0x130
+#define VAS_SPARE5_OFFSET 0x160
+#define VAS_SPARE6_OFFSET 0x188
+
+#define VAS_LFIFO_BAR_OFFSET 0x0A0
+#define VAS_LFIFO_BAR PPC_BITMASK(8, 53)
+#define VAS_PAGE_MIGRATION_SELECT PPC_BITMASK(54, 56)
+
+#define VAS_LDATA_STAMP_CTL_OFFSET 0x0A8
+#define VAS_LDATA_STAMP PPC_BITMASK(0, 1)
+#define VAS_XTRA_WRITE PPC_BIT(2)
+
+#define VAS_LDMA_CACHE_CTL_OFFSET 0x0B0
+#define VAS_LDMA_TYPE PPC_BITMASK(0, 1)
+#define VAS_LDMA_FIFO_DISABLE PPC_BIT(2)
+
+#define VAS_LRFIFO_PUSH_OFFSET 0x0B8
+#define VAS_LRFIFO_PUSH PPC_BITMASK(0, 15)
+
+#define VAS_CURR_MSG_COUNT_OFFSET 0x0C0
+#define VAS_CURR_MSG_COUNT PPC_BITMASK(0, 7)
+
+#define VAS_LNOTIFY_AFTER_COUNT_OFFSET 0x0C8
+#define VAS_LNOTIFY_AFTER_COUNT PPC_BITMASK(0, 7)
+
+#define VAS_LRX_WCRED_OFFSET 0x0E0
+#define VAS_LRX_WCRED PPC_BITMASK(0, 15)
+
+#define VAS_LRX_WCRED_ADDER_OFFSET 0x190
+#define VAS_LRX_WCRED_ADDER PPC_BITMASK(0, 15)
+
+#define VAS_TX_WCRED_OFFSET 0x0F0
+#define VAS_TX_WCRED PPC_BITMASK(4, 15)
+
+#define VAS_TX_WCRED_ADDER_OFFSET 0x1A0
+#define VAS_TX_WCRED_ADDER PPC_BITMASK(4, 15)
+
+#define VAS_LFIFO_SIZE_OFFSET 0x100
+#define VAS_LFIFO_SIZE PPC_BITMASK(0, 3)
+
+#define VAS_WINCTL_OFFSET 0x108
+#define VAS_WINCTL_OPEN PPC_BIT(0)
+#define VAS_WINCTL_REJ_NO_CREDIT PPC_BIT(1)
+#define VAS_WINCTL_PIN PPC_BIT(2)
+#define VAS_WINCTL_TX_WCRED_MODE PPC_BIT(3)
+#define VAS_WINCTL_RX_WCRED_MODE PPC_BIT(4)
+#define VAS_WINCTL_TX_WORD_MODE PPC_BIT(5)
+#define VAS_WINCTL_RX_WORD_MODE PPC_BIT(6)
+#define VAS_WINCTL_RSVD_TXBUF PPC_BIT(7)
+#define VAS_WINCTL_THRESH_CTL PPC_BITMASK(8, 9)
+#define VAS_WINCTL_FAULT_WIN PPC_BIT(10)
+#define VAS_WINCTL_NX_WIN PPC_BIT(11)
+
+#define VAS_WIN_STATUS_OFFSET 0x110
+#define VAS_WIN_BUSY PPC_BIT(1)
+
+#define VAS_WIN_CTX_CACHING_CTL_OFFSET 0x118
+#define VAS_CASTOUT_REQ PPC_BIT(0)
+#define VAS_PUSH_TO_MEM PPC_BIT(1)
+#define VAS_WIN_CACHE_STATUS PPC_BIT(4)
+
+#define VAS_TX_RSVD_BUF_COUNT_OFFSET 0x120
+#define VAS_RXVD_BUF_COUNT PPC_BITMASK(58, 63)
+
+#define VAS_LRFIFO_WIN_PTR_OFFSET 0x128
+#define VAS_LRX_WIN_ID PPC_BITMASK(0, 15)
+
+/*
+ * Local Notification Control Register controls what happens in _response_
+ * to a paste command and hence applies only to receive windows.
+ */
+#define VAS_LNOTIFY_CTL_OFFSET 0x138
+#define VAS_NOTIFY_DISABLE PPC_BIT(0)
+#define VAS_INTR_DISABLE PPC_BIT(1)
+#define VAS_NOTIFY_EARLY PPC_BIT(2)
+#define VAS_NOTIFY_OSU_INTR PPC_BIT(3)
+
+#define VAS_LNOTIFY_PID_OFFSET 0x140
+#define VAS_LNOTIFY_PID PPC_BITMASK(0, 19)
+
+#define VAS_LNOTIFY_LPID_OFFSET 0x148
+#define VAS_LNOTIFY_LPID PPC_BITMASK(0, 11)
+
+#define VAS_LNOTIFY_TID_OFFSET 0x150
+#define VAS_LNOTIFY_TID PPC_BITMASK(0, 15)
+
+#define VAS_LNOTIFY_SCOPE_OFFSET 0x158
+#define VAS_LNOTIFY_MIN_SCOPE PPC_BITMASK(0, 1)
+#define VAS_LNOTIFY_MAX_SCOPE PPC_BITMASK(2, 3)
+
+#define VAS_NX_UTIL_OFFSET 0x1B0
+#define VAS_NX_UTIL PPC_BITMASK(0, 63)
+
+/* SE: Side effects */
+#define VAS_NX_UTIL_SE_OFFSET 0x1B8
+#define VAS_NX_UTIL_SE PPC_BITMASK(0, 63)
+
+#define VAS_NX_UTIL_ADDER_OFFSET 0x180
+#define VAS_NX_UTIL_ADDER PPC_BITMASK(32, 63)
+
+/*
+ * Local Notify Scope Control Register. (Receive windows only).
+ */
+enum vas_notify_scope {
+ VAS_SCOPE_LOCAL,
+ VAS_SCOPE_GROUP,
+ VAS_SCOPE_VECTORED_GROUP,
+ VAS_SCOPE_UNUSED,
+};
+
+/*
+ * Local DMA Cache Control Register (Receive windows only).
+ */
+enum vas_dma_type {
+ VAS_DMA_TYPE_INJECT,
+ VAS_DMA_TYPE_WRITE,
+};
+
+/*
+ * Local Notify Scope Control Register. (Receive windows only).
+ * Not applicable to NX receive windows.
+ */
+enum vas_notify_after_count {
+ VAS_NOTIFY_AFTER_256 = 0,
+ VAS_NOTIFY_NONE,
+ VAS_NOTIFY_AFTER_2
+};
+
+/*
+ * One per instance of VAS. Each instance will have a separate set of
+ * receive windows, one per coprocessor type.
+ */
+struct vas_instance {
+ int vas_id;
+ bool ready;
+ struct ida ida;
+ struct list_head node;
+ struct platform_device *pdev;
+
+ u64 hvwc_bar_start;
+ u64 hvwc_bar_len;
+ u64 uwc_bar_start;
+ u64 uwc_bar_len;
+ u64 paste_base_addr;
+ u64 paste_win_id_shift;
+
+ struct mutex mutex;
+ struct vas_window *rxwin[VAS_COP_TYPE_MAX];
+ struct vas_window *windows[VAS_WINDOWS_PER_CHIP];
+};
+
+/*
+ * In-kernel state a VAS window. One per window.
+ */
+struct vas_window {
+ /* Fields common to send and receive windows */
+ struct vas_instance *vinst;
+ int winid;
+ bool tx_win; /* True if send window */
+ bool nx_win; /* True if NX window */
+ bool user_win; /* True if user space window */
+ void *hvwc_map; /* HV window context */
+ void *uwc_map; /* OS/User window context */
+ pid_t pid; /* Linux process id of owner */
+
+ /* Fields applicable only to send windows */
+ void *paste_kaddr;
+ char *paste_addr_name;
+ struct vas_window *rxwin;
+
+ /* Feilds applicable only to receive windows */
+ enum vas_cop_type cop;
+ atomic_t num_txwins;
+};
+
+/*
+ * Container for the hardware state of a window. One per-window.
+ *
+ * A VAS Window context is a 512-byte area in the hardware that contains
+ * a set of 64-bit registers. Individual bit-fields in these registers
+ * determine the configuration/operation of the hardware. struct vas_winctx
+ * is a container for the register fields in the window context.
+ */
+struct vas_winctx {
+ void *rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+ int rsvd_txbuf_count;
+
+ bool user_win;
+ bool nx_win;
+ bool fault_win;
+ bool rsvd_txbuf_enable;
+ bool pin_win;
+ bool rej_no_credit;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_word_mode;
+ bool rx_word_mode;
+ bool data_stamp;
+ bool xtra_write;
+ bool notify_disable;
+ bool intr_disable;
+ bool fifo_disable;
+ bool notify_early;
+ bool notify_os_intr_reg;
+
+ int lpid;
+ int pidr; /* value from SPRN_PID, not linux pid */
+ int lnotify_lpid;
+ int lnotify_pid;
+ int lnotify_tid;
+ uint32_t pswid;
+ int rx_win_id;
+ int fault_win_id;
+ int tc_mode;
+
+ uint64_t irq_port;
+
+ enum vas_dma_type dma_type;
+ enum vas_notify_scope min_scope;
+ enum vas_notify_scope max_scope;
+ enum vas_notify_after_count notify_after_count;
+};
+
+#endif /* _VAS_H */
--
2.7.4
Define some helper functions to access the MMIO regions. We use these
in follow-on patches to read/write VAS hardware registers. They are
also used to later issue 'paste' instructions to submit requests to
the NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog [v6]:
- Minor reorg to make setup/cleanup functions more symmetric
Changelog [v5]:
- [Ben Herrenschmidt]: Need cachable mapping for paste regions
and non-cachable mapping for the MMIO regions. So, just use
ioremap() for mapping the MMIO regions; use "winctx" instead
of "wc" to avoid collision with "write combine".
Changelog [v3]:
- Minor reorg/cleanup of map/unmap functions
Changelog [v2]:
- Get HVWC, UWC and paste addresses from window->vinst (i.e DT)
rather than kernel macros.
---
arch/powerpc/platforms/powernv/vas-window.c | 173 ++++++++++++++++++++++++++++
1 file changed, 173 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 6156fbe..a3a705a 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -9,9 +9,182 @@
#include <linux/types.h>
#include <linux/mutex.h>
+#include <linux/slab.h>
+#include <linux/io.h>
#include "vas.h"
+/*
+ * Compute the paste address region for the window @window using the
+ * ->paste_base_addr and ->paste_win_id_shift we got from device tree.
+ */
+void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
+{
+ uint64_t base, shift;
+ int winid;
+
+ base = window->vinst->paste_base_addr;
+ shift = window->vinst->paste_win_id_shift;
+ winid = window->winid;
+
+ *addr = base + (winid << shift);
+ if (len)
+ *len = PAGE_SIZE;
+
+ pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
+}
+
+static inline void get_hvwc_mmio_bar(struct vas_window *window,
+ uint64_t *start, int *len)
+{
+ uint64_t pbaddr;
+
+ pbaddr = window->vinst->hvwc_bar_start;
+ *start = pbaddr + window->winid * VAS_HVWC_SIZE;
+ *len = VAS_HVWC_SIZE;
+}
+
+static inline void get_uwc_mmio_bar(struct vas_window *window,
+ uint64_t *start, int *len)
+{
+ uint64_t pbaddr;
+
+ pbaddr = window->vinst->uwc_bar_start;
+ *start = pbaddr + window->winid * VAS_UWC_SIZE;
+ *len = VAS_UWC_SIZE;
+}
+
+/*
+ * Map the paste bus address of the given send window into kernel address
+ * space. Unlike MMIO regions (map_mmio_region() below), paste region must
+ * be mapped cache-able and is only applicable to send windows.
+ */
+void *map_paste_region(struct vas_window *txwin)
+{
+ int rc, len;
+ void *map;
+ char *name;
+ uint64_t start;
+
+ rc = -ENOMEM;
+ name = kasprintf(GFP_KERNEL, "window-v%d-w%d", txwin->vinst->vas_id,
+ txwin->winid);
+ if (!name)
+ return ERR_PTR(rc);
+
+ txwin->paste_addr_name = name;
+ compute_paste_address(txwin, &start, &len);
+
+ if (!request_mem_region(start, len, name)) {
+ pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+ __func__, start, len);
+ goto free_name;
+ }
+
+ map = ioremap_cache(start, len);
+ if (!map) {
+ pr_devel("%s(): ioremap_cache(0x%llx, %d) failed\n", __func__,
+ start, len);
+ goto free_name;
+ }
+
+ pr_devel("VAS: mapped paste addr 0x%llx to kaddr 0x%p\n", start, map);
+ return map;
+
+free_name:
+ kfree(name);
+ return ERR_PTR(rc);
+}
+
+
+static void *map_mmio_region(char *name, uint64_t start, int len)
+{
+ void *map;
+
+ if (!request_mem_region(start, len, name)) {
+ pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
+ __func__, start, len);
+ return NULL;
+ }
+
+ map = ioremap(start, len);
+ if (!map) {
+ pr_devel("%s(): ioremap(0x%llx, %d) failed\n", __func__, start,
+ len);
+ return NULL;
+ }
+
+ return map;
+}
+
+static void unmap_region(void *addr, uint64_t start, int len)
+{
+ iounmap(addr);
+ release_mem_region((phys_addr_t)start, len);
+}
+
+/*
+ * Unmap the paste address region for a window.
+ */
+void unmap_paste_region(struct vas_window *window)
+{
+ int len;
+ uint64_t busaddr_start;
+
+ if (window->paste_kaddr) {
+ compute_paste_address(window, &busaddr_start, &len);
+ unmap_region(window->paste_kaddr, busaddr_start, len);
+ window->paste_kaddr = NULL;
+ kfree(window->paste_addr_name);
+ window->paste_addr_name = NULL;
+ }
+}
+
+/*
+ * Unmap the MMIO regions for a window.
+ */
+static void unmap_winctx_mmio_bars(struct vas_window *window)
+{
+ int len;
+ uint64_t busaddr_start;
+
+ if (window->hvwc_map) {
+ get_hvwc_mmio_bar(window, &busaddr_start, &len);
+ unmap_region(window->hvwc_map, busaddr_start, len);
+ window->hvwc_map = NULL;
+ }
+
+ if (window->uwc_map) {
+ get_uwc_mmio_bar(window, &busaddr_start, &len);
+ unmap_region(window->uwc_map, busaddr_start, len);
+ window->uwc_map = NULL;
+ }
+}
+
+/*
+ * Find the Hypervisor Window Context (HVWC) MMIO Base Address Region and the
+ * OS/User Window Context (UWC) MMIO Base Address Region for the given window.
+ * Map these bus addresses and save the mapped kernel addresses in @window.
+ */
+int map_winctx_mmio_bars(struct vas_window *window)
+{
+ int len;
+ uint64_t start;
+
+ get_hvwc_mmio_bar(window, &start, &len);
+ window->hvwc_map = map_mmio_region("HVWCM_Window", start, len);
+
+ get_uwc_mmio_bar(window, &start, &len);
+ window->uwc_map = map_mmio_region("UWCM_Window", start, len);
+
+ if (!window->hvwc_map || !window->uwc_map) {
+ unmap_winctx_mmio_bars(window);
+ return -1;
+ }
+
+ return 0;
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
--
2.7.4
Define helpers to allocate/free VAS window objects. These will
be used in follow-on patches when opening/closing windows.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/platforms/powernv/vas-window.c | 70 +++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 3a50d6a..9c12919 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -490,6 +490,76 @@ int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
return 0;
}
+DEFINE_SPINLOCK(vas_ida_lock);
+
+void vas_release_window_id(struct ida *ida, int winid)
+{
+ spin_lock(&vas_ida_lock);
+ ida_remove(ida, winid);
+ spin_unlock(&vas_ida_lock);
+}
+
+int vas_assign_window_id(struct ida *ida)
+{
+ int rc, winid;
+
+ rc = ida_pre_get(ida, GFP_KERNEL);
+ if (!rc)
+ return -EAGAIN;
+
+ spin_lock(&vas_ida_lock);
+ rc = ida_get_new_above(ida, 0, &winid);
+ spin_unlock(&vas_ida_lock);
+
+ if (rc)
+ return rc;
+
+ if (winid > VAS_WINDOWS_PER_CHIP) {
+ pr_err("VAS: Too many (%d) open windows\n", winid);
+ vas_release_window_id(ida, winid);
+ return -EAGAIN;
+ }
+
+ return winid;
+}
+
+void vas_window_free(struct vas_window *window)
+{
+ int winid = window->winid;
+ struct vas_instance *vinst = window->vinst;
+
+ unmap_winctx_mmio_bars(window);
+ kfree(window);
+
+ vas_release_window_id(&vinst->ida, winid);
+}
+
+struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+{
+ int winid;
+ struct vas_window *window;
+
+ winid = vas_assign_window_id(&vinst->ida);
+ if (winid < 0)
+ return ERR_PTR(winid);
+
+ window = kzalloc(sizeof(*window), GFP_KERNEL);
+ if (!window)
+ return ERR_PTR(-ENOMEM);
+
+ window->vinst = vinst;
+ window->winid = winid;
+
+ if (map_winctx_mmio_bars(window))
+ goto out_free;
+
+ return window;
+
+out_free:
+ kfree(window);
+ return ERR_PTR(-ENOMEM);
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
--
2.7.4
Define the vas_win_close() interface which should be used to close a
send or receive windows.
While the hardware configurations required to open send and receive windows
differ, the configuration to close a window is the same for both. So we use
a single interface to close the window.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v4]:
- Drop the poll for credits return (we can set the required credit,
but cannot really find the available credit at a point in time)
- Export the symbol
Changelog[v3]:
- Fix order of parameters in GET_FIELD().
- Update references and sequence for closing/quiescing a window.
---
arch/powerpc/include/asm/vas.h | 7 ++
arch/powerpc/platforms/powernv/vas-window.c | 99 +++++++++++++++++++++++++++--
2 files changed, 101 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 0701a08..3478ac8 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -109,4 +109,11 @@ extern void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr,
extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
struct vas_rx_win_attr *attr);
+/*
+ * Close the send or receive window identified by @win. For receive windows
+ * return -EAGAIN if there are active send windows attached to this receive
+ * window.
+ */
+int vas_win_close(struct vas_window *win);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 2dd4b63..24288dd 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -139,7 +139,7 @@ static void unmap_region(void *addr, uint64_t start, int len)
/*
* Unmap the paste address region for a window.
*/
-void unmap_paste_region(struct vas_window *window)
+static void unmap_paste_region(struct vas_window *window)
{
int len;
uint64_t busaddr_start;
@@ -535,7 +535,7 @@ int vas_assign_window_id(struct ida *ida)
return winid;
}
-void vas_window_free(struct vas_window *window)
+static void vas_window_free(struct vas_window *window)
{
int winid = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -572,6 +572,14 @@ static struct vas_window *vas_window_alloc(struct vas_instance *vinst)
return ERR_PTR(-ENOMEM);
}
+static void put_rx_win(struct vas_window *rxwin)
+{
+ /* Better not be a send window! */
+ WARN_ON_ONCE(rxwin->tx_win);
+
+ atomic_dec(&rxwin->num_txwins);
+}
+
/*
* Find the user space receive window given the @pswid.
* - We must have a valid vasid and it must belong to this instance.
@@ -664,7 +672,7 @@ static void set_vinst_win(struct vas_instance *vinst,
* Clear this window from the table(s) of windows for this VAS instance.
* See also function header of set_vinst_win().
*/
-void clear_vinst_win(struct vas_window *window)
+static void clear_vinst_win(struct vas_window *window)
{
int id = window->winid;
struct vas_instance *vinst = window->vinst;
@@ -879,11 +887,92 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_rx_win_open);
-/* stub for now */
+static void poll_window_busy_state(struct vas_window *window)
+{
+ int busy;
+ uint64_t val;
+
+retry:
+ /*
+ * Poll Window Busy flag
+ */
+ val = read_hvwc_reg(window, VREG(WIN_STATUS));
+ busy = GET_FIELD(VAS_WIN_BUSY, val);
+ if (busy) {
+ val = 0;
+ schedule_timeout(2000);
+ goto retry;
+ }
+}
+
+static void poll_window_castout(struct vas_window *window)
+{
+ int cached;
+ uint64_t val;
+
+ /* Cast window context out of the cache */
+retry:
+ val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
+ cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
+ if (cached) {
+ val = 0ULL;
+ val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
+ val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
+
+ schedule_timeout(2000);
+ goto retry;
+ }
+}
+
+/*
+ * Close a window.
+ *
+ * See Section 1.12.1 of VAS workbook v1.05 for details on closing window:
+ * - Disable new paste operations (unmap paste address)
+ * - Poll for the "Window Busy" bit to be cleared
+ * - Clear the Open/Enable bit for the Window.
+ * - Poll for return of window Credits (implies FIFO empty for Rx win?)
+ * - Unpin and cast window context out of cache
+ *
+ * Besides the hardware, kernel has some bookkeeping of course.
+ */
int vas_win_close(struct vas_window *window)
{
- return -1;
+ uint64_t val;
+
+ if (!window)
+ return 0;
+
+ if (!window->tx_win && atomic_read(&window->num_txwins) != 0) {
+ pr_devel("VAS: Attempting to close an active Rx window!\n");
+ WARN_ON_ONCE(1);
+ return -EAGAIN;
+ }
+
+ unmap_paste_region(window);
+
+ clear_vinst_win(window);
+
+ poll_window_busy_state(window);
+
+ /* Unpin window from cache and close it */
+ val = read_hvwc_reg(window, VREG(WINCTL));
+ val = SET_FIELD(VAS_WINCTL_PIN, val, 0);
+ val = SET_FIELD(VAS_WINCTL_OPEN, val, 0);
+ write_hvwc_reg(window, VREG(WINCTL), val);
+
+ poll_window_castout(window);
+
+ /* if send window, drop reference to matching receive window */
+ if (window->tx_win)
+ put_rx_win(window->rxwin);
+
+ vas_window_free(window);
+
+ return 0;
}
+EXPORT_SYMBOL_GPL(vas_win_close);
/*
* Return a system-wide unique window id for the window @win.
--
2.7.4
Define an interface that the NX drivers can use to find the physical
paste address of a send window. This interface is expected to be used
with the mmap() operation of the NX driver's device. i.e the user space
process can use driver's mmap() operation to map the send window's paste
address into their address space and then use copy and paste instructions
to submit the CRBs to the NX engine.
Note that kernel drivers will use vas_paste_crb() directly and don't need
this interface.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/asm/vas.h | 7 +++++++
arch/powerpc/platforms/powernv/vas-window.c | 10 ++++++++++
2 files changed, 17 insertions(+)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index e34d46a..2f1c168 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -10,6 +10,8 @@
#ifndef _MISC_VAS_H
#define _MISC_VAS_H
+struct vas_window;
+
/*
* Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
* (Local FIFO Size Register) of the VAS workbook.
@@ -50,4 +52,9 @@ enum vas_cop_type {
VAS_COP_TYPE_MAX,
};
+/*
+ * Return the power bus paste address associated with @win so the caller
+ * can map that address into their address space.
+ */
+extern uint64_t vas_win_paste_addr(struct vas_window *win);
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 9c12919..3a4599f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -35,6 +35,16 @@ void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
}
+uint64_t vas_win_paste_addr(struct vas_window *win)
+{
+ uint64_t addr;
+
+ compute_paste_address(win, &addr, NULL);
+
+ return addr;
+}
+EXPORT_SYMBOL(vas_win_paste_addr);
+
static inline void get_hvwc_mmio_bar(struct vas_window *window,
uint64_t *start, int *len)
{
--
2.7.4
Define an interface to return a system-wide unique id for a given VAS
window.
The vas_win_id() will be used in a follow-on patch to generate an unique
handle for a user space receive window. Applications can use this handle
to pair send and receive windows for fast thread-wakeup.
The hardware refers to this system-wide unique id as a Partition Send
Window ID which is expected to be used during fault handling. Hence the
"pswid" in the function names.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
arch/powerpc/include/asm/vas.h | 5 +++++
arch/powerpc/platforms/powernv/vas-window.c | 9 +++++++++
arch/powerpc/platforms/powernv/vas.h | 28 ++++++++++++++++++++++++++++
3 files changed, 42 insertions(+)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 2f1c168..6c126f3 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -53,6 +53,11 @@ enum vas_cop_type {
};
/*
+ * Return a system-wide unique id for the VAS window @win.
+ */
+extern uint32_t vas_win_id(struct vas_window *win);
+
+/*
* Return the power bus paste address associated with @win so the caller
* can map that address into their address space.
*/
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 3a4599f..42c1d4f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -575,3 +575,12 @@ int vas_win_close(struct vas_window *window)
{
return -1;
}
+
+/*
+ * Return a system-wide unique window id for the window @win.
+ */
+uint32_t vas_win_id(struct vas_window *win)
+{
+ return encode_pswid(win->vinst->vas_id, win->winid);
+}
+EXPORT_SYMBOL_GPL(vas_win_id);
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 7b2bcd0..3eadf90 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -440,4 +440,32 @@ static inline uint64_t read_hvwc_reg(struct vas_window *win,
return in_be64(win->hvwc_map+reg);
}
+/*
+ * Encode/decode the Partition Send Window ID (PSWID) for a window in
+ * a way that we can uniquely identify any window in the system. i.e.
+ * we should be able to locate the 'struct vas_window' given the PSWID.
+ *
+ * Bits Usage
+ * 0:7 VAS id (8 bits)
+ * 8:15 Unused, 0 (3 bits)
+ * 16:31 Window id (16 bits)
+ */
+static inline u32 encode_pswid(int vasid, int winid)
+{
+ u32 pswid = 0;
+
+ pswid |= vasid << (31 - 7);
+ pswid |= winid;
+
+ return pswid;
+}
+
+static inline void decode_pswid(u32 pswid, int *vasid, int *winid)
+{
+ if (vasid)
+ *vasid = pswid >> (31 - 7) & 0xFF;
+
+ if (winid)
+ *winid = pswid & 0xFFFF;
+}
#endif /* _VAS_H */
--
2.7.4
Define an interface to open a VAS send window. This interface is
intended to be used the Nest Accelerator (NX) driver(s) to open
a send window and use it to submit compression/encryption requests
to a VAS receive window.
The receive window, identified by the [vasid, cop] parameters, must
already be open in VAS (i.e connected to an NX engine).
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v7]:
- Initialize txwin->user_win field for FTW windows.
Changelog[v6]:
- Add support for FTW windows
Changelog[v4]:
- [Ben Herrenschmidt] MMIO regions must be mapped non-cached and
paste regions must be mapped cached. Define/use map_paste_region().
Changelog [v3]:
- Distinguish between hardware PID (SPRN_PID) and Linux pid.
- Use macros rather than enum for threshold-control mode
- Set the pid of send window from attr (needed for user space
send windows).
- Ignore irq port setting for now. They are needed for user space
windows and will be added later
---
arch/powerpc/include/asm/vas.h | 42 ++++++++
arch/powerpc/platforms/powernv/vas-window.c | 159 +++++++++++++++++++++++++++-
2 files changed, 198 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 3478ac8..ed1be4c 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -83,6 +83,29 @@ struct vas_rx_win_attr {
};
/*
+ * Window attributes specified by the in-kernel owner of a send window.
+ */
+struct vas_tx_win_attr {
+ enum vas_cop_type cop;
+ int wcreds_max;
+ int lpid;
+ int pidr; /* hardware PID (from SPRN_PID) */
+ int pid; /* linux process id */
+ int pswid;
+ int rsvd_txbuf_count;
+ int tc_mode;
+
+ bool user_win;
+ bool pin_win;
+ bool rej_no_credit;
+ bool rsvd_txbuf_enable;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_win_ord_mode;
+ bool rx_win_ord_mode;
+};
+
+/*
* Return a system-wide unique id for the VAS window @win.
*/
extern uint32_t vas_win_id(struct vas_window *win);
@@ -110,6 +133,25 @@ extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
struct vas_rx_win_attr *attr);
/*
+ * Helper to initialize send window attributes to defaults for an NX window.
+ */
+extern void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr,
+ enum vas_cop_type cop);
+
+/*
+ * Open a VAS send window for the instance of VAS identified by @vasid
+ * and the co-processor type @cop. Use @attr to initialize attributes
+ * of the window.
+ *
+ * Note: The instance of VAS must already have an open receive window for
+ * the coprocessor type @cop.
+ *
+ * Return a handle to the send window or ERR_PTR() on error.
+ */
+struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr);
+
+/*
* Close the send or receive window identified by @win. For receive windows
* return -EAGAIN if there are active send windows attached to this receive
* window.
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 24288dd..70762c3 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -72,7 +72,7 @@ static inline void get_uwc_mmio_bar(struct vas_window *window,
* space. Unlike MMIO regions (map_mmio_region() below), paste region must
* be mapped cache-able and is only applicable to send windows.
*/
-void *map_paste_region(struct vas_window *txwin)
+static void *map_paste_region(struct vas_window *txwin)
{
int rc, len;
void *map;
@@ -109,7 +109,6 @@ void *map_paste_region(struct vas_window *txwin)
return ERR_PTR(rc);
}
-
static void *map_mmio_region(char *name, uint64_t start, int len)
{
void *map;
@@ -611,7 +610,7 @@ struct vas_window *get_user_rxwin(struct vas_instance *vinst, uint32_t pswid)
*
* See also function header of set_vinst_win().
*/
-struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+static struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
enum vas_cop_type cop, uint32_t pswid)
{
struct vas_window *rxwin;
@@ -887,6 +886,160 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_rx_win_open);
+void vas_init_tx_win_attr(struct vas_tx_win_attr *txattr, enum vas_cop_type cop)
+{
+ memset(txattr, 0, sizeof(*txattr));
+
+ if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI) {
+ txattr->rej_no_credit = false;
+ txattr->rx_wcred_mode = true;
+ txattr->tx_wcred_mode = true;
+ txattr->rx_win_ord_mode = true;
+ txattr->tx_win_ord_mode = true;
+ } else if (cop == VAS_COP_TYPE_FTW) {
+ txattr->user_win = true;
+ }
+}
+EXPORT_SYMBOL_GPL(vas_init_tx_win_attr);
+
+static void init_winctx_for_txwin(struct vas_window *txwin,
+ struct vas_tx_win_attr *txattr,
+ struct vas_winctx *winctx)
+{
+ /*
+ * We first zero all fields and only set non-zero ones. Following
+ * are some fields set to 0/false for the stated reason:
+ *
+ * ->notify_os_intr_reg In powerNV, send intrs to HV
+ * ->rsvd_txbuf_count Not supported yet.
+ * ->notify_disable False for NX windows
+ * ->xtra_write False for NX windows
+ * ->notify_early NA for NX windows
+ * ->lnotify_lpid NA for Tx windows
+ * ->lnotify_pid NA for Tx windows
+ * ->lnotify_tid NA for Tx windows
+ * ->tx_win_cred_mode Ignore for now for NX windows
+ * ->rx_win_cred_mode Ignore for now for NX windows
+ */
+ memset(winctx, 0, sizeof(struct vas_winctx));
+
+ winctx->wcreds_max = txattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
+
+ winctx->user_win = txattr->user_win;
+ winctx->nx_win = txwin->rxwin->nx_win;
+ winctx->pin_win = txattr->pin_win;
+
+ winctx->rx_wcred_mode = txattr->rx_wcred_mode;
+ winctx->tx_wcred_mode = txattr->tx_wcred_mode;
+ winctx->rx_word_mode = txattr->rx_win_ord_mode;
+ winctx->tx_word_mode = txattr->tx_win_ord_mode;
+
+ if (winctx->nx_win) {
+ winctx->data_stamp = true;
+ winctx->intr_disable = true;
+ }
+
+ winctx->lpid = txattr->lpid;
+ winctx->pidr = txattr->pidr;
+ winctx->rx_win_id = txwin->rxwin->winid;
+
+ winctx->dma_type = VAS_DMA_TYPE_INJECT;
+ winctx->tc_mode = txattr->tc_mode;
+ winctx->min_scope = VAS_SCOPE_LOCAL;
+ winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+
+ winctx->pswid = encode_pswid(txwin->vinst->vas_id, txwin->winid);
+}
+
+static bool tx_win_args_valid(enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr)
+{
+ if (attr->tc_mode != VAS_THRESH_DISABLED)
+ return false;
+
+ if (cop > VAS_COP_TYPE_MAX)
+ return false;
+
+ if (attr->user_win &&
+ (cop != VAS_COP_TYPE_FTW || attr->rsvd_txbuf_count))
+ return false;
+
+ return true;
+}
+
+struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_tx_win_attr *attr)
+{
+ int rc;
+ struct vas_instance *vinst;
+ struct vas_window *txwin;
+ struct vas_window *rxwin;
+ struct vas_winctx winctx;
+
+ if (!vas_initialized())
+ return ERR_PTR(-EAGAIN);
+
+ if (!tx_win_args_valid(cop, attr))
+ return ERR_PTR(-EINVAL);
+
+ vinst = find_vas_instance(vasid);
+ if (!vinst) {
+ pr_devel("VAS: vasid %d not found!\n", vasid);
+ return ERR_PTR(-EINVAL);
+ }
+
+ rxwin = get_vinst_rxwin(vinst, cop, attr->pswid);
+ if (IS_ERR(rxwin)) {
+ pr_devel("VAS: No RxWin for vasid %d, cop %d\n", vasid, cop);
+ return rxwin;
+ }
+
+ txwin = vas_window_alloc(vinst);
+ if (IS_ERR(txwin)) {
+ rc = PTR_ERR(txwin);
+ goto put_rxwin;
+ }
+
+ txwin->tx_win = 1;
+ txwin->rxwin = rxwin;
+ txwin->nx_win = txwin->rxwin->nx_win;
+ txwin->pid = attr->pid;
+ txwin->user_win = attr->user_win;
+
+ init_winctx_for_txwin(txwin, attr, &winctx);
+
+ init_winctx_regs(txwin, &winctx);
+
+ /*
+ * If its a kernel send window, map the window address into the
+ * kernel's address space. For user windows, user must issue an
+ * mmap() to map the window into their address space.
+ *
+ * NOTE: If kernel ever resubmits a user CRB after handling a page
+ * fault, we will need to map this into kernel as well.
+ */
+ if (!txwin->user_win) {
+ txwin->paste_kaddr = map_paste_region(txwin);
+ if (IS_ERR(txwin->paste_kaddr)) {
+ rc = PTR_ERR(txwin->paste_kaddr);
+ goto free_window;
+ }
+ }
+
+ set_vinst_win(vinst, txwin);
+
+ return txwin;
+
+free_window:
+ vas_window_free(txwin);
+
+put_rxwin:
+ put_rx_win(rxwin);
+ return ERR_PTR(rc);
+
+}
+EXPORT_SYMBOL_GPL(vas_tx_win_open);
+
static void poll_window_busy_state(struct vas_window *window)
{
int busy;
--
2.7.4
Define interfaces (wrappers) to the 'copy' and 'paste' instructions
(which are new in PowerISA 3.0). These are intended to be used to
by NX driver(s) to submit Coprocessor Request Blocks (CRBs) to the
NX hardware engines.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v4]
- Export symbols
Changelog[v3]
- Map raw CR value from paste instruction into an error code.
---
MAINTAINERS | 1 +
arch/powerpc/include/asm/vas.h | 13 +++++
arch/powerpc/platforms/powernv/copy-paste.h | 74 +++++++++++++++++++++++++++++
arch/powerpc/platforms/powernv/vas-window.c | 52 ++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 15 ++++++
5 files changed, 155 insertions(+)
create mode 100644 arch/powerpc/platforms/powernv/copy-paste.h
diff --git a/MAINTAINERS b/MAINTAINERS
index abc235f..4ca94e8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6430,6 +6430,7 @@ M: Sukadev Bhattiprolu
L: [email protected]
S: Supported
F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/platforms/powernv/copy-paste.h
F: arch/powerpc/include/asm/vas.h
F: arch/powerpc/include/uapi/asm/vas.h
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index ed1be4c..29531df 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -158,4 +158,17 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
*/
int vas_win_close(struct vas_window *win);
+/*
+ * Copy the co-processor request block (CRB) @crb into the local L2 cache.
+ * For now, @offset must be 0 and @first must be true.
+ */
+extern int vas_copy_crb(void *crb, int offset, bool first);
+
+/*
+ * Paste a previously copied CRB (see vas_copy_crb()) from the L2 cache to
+ * the hardware address associated with the window @win. For now, @off must
+ * 0 and @last must be true. @re is expected/assumed to be true for NX windows.
+ */
+extern int vas_paste_crb(struct vas_window *win, int off, bool last, bool re);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
new file mode 100644
index 0000000..7783bb8
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/copy-paste.h
@@ -0,0 +1,74 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+/*
+ * Macros taken from tools/testing/selftests/powerpc/context_switch/cp_abort.c
+ */
+#define PASTE(RA, RB, L, RC) \
+ .long (0x7c00070c | (RA) << (31-15) | (RB) << (31-20) \
+ | (L) << (31-10) | (RC) << (31-31))
+
+#define COPY(RA, RB, L) \
+ .long (0x7c00060c | (RA) << (31-15) | (RB) << (31-20) \
+ | (L) << (31-10))
+
+#define CR0_FXM "0x80"
+#define CR0_SHIFT 28
+#define CR0_MASK 0xF
+/*
+ * Copy/paste instructions:
+ *
+ * copy RA,RB,L
+ * Copy contents of address (RA) + effective_address(RB)
+ * to internal copy-buffer.
+ *
+ * L == 1 indicates this is the first copy.
+ *
+ * L == 0 indicates its a continuation of a prior first copy.
+ *
+ * paste RA,RB,L
+ * Paste contents of internal copy-buffer to the address
+ * (RA) + effective_address(RB)
+ *
+ * L == 0 indicates its a continuation of a prior paste. i.e.
+ * don't wait for the completion or update status.
+ *
+ * L == 1 indicates this is the last paste in the group (i.e.
+ * wait for the group to complete and update status in CR0).
+ *
+ * For Power9, the L bit must be 'true' in both copy and paste.
+ */
+
+static inline int vas_copy(void *crb, int offset, int first)
+{
+ WARN_ON_ONCE(!first);
+
+ __asm__ __volatile(stringify_in_c(COPY(%0, %1, %2))";"
+ :
+ : "b" (offset), "b" (crb), "i" (1)
+ : "memory");
+
+ return 0;
+}
+
+static inline int vas_paste(void *paste_address, int offset, int last)
+{
+ unsigned long long cr;
+
+ WARN_ON_ONCE(!last);
+
+ cr = 0;
+ __asm__ __volatile(stringify_in_c(PASTE(%1, %2, 1, 1))";"
+ "mfocrf %0," CR0_FXM ";"
+ : "=r" (cr)
+ : "b" (paste_address), "b" (offset)
+ : "memory");
+
+ return cr;
+}
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 70762c3..73081b4 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -16,6 +16,7 @@
#include <linux/cred.h>
#include "vas.h"
+#include "copy-paste.h"
/*
* Compute the paste address region for the window @window using the
@@ -1040,6 +1041,57 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
}
EXPORT_SYMBOL_GPL(vas_tx_win_open);
+int vas_copy_crb(void *crb, int offset, bool first)
+{
+ if (!vas_initialized())
+ return -1;
+
+ return vas_copy(crb, offset, first);
+}
+EXPORT_SYMBOL_GPL(vas_copy_crb);
+
+#define RMA_LSMP_REPORT_ENABLE PPC_BIT(53)
+int vas_paste_crb(struct vas_window *txwin, int offset, bool last, bool re)
+{
+ int rc;
+ uint64_t val;
+ void *addr;
+
+ if (!vas_initialized())
+ return -1;
+ /*
+ * Only NX windows are supported for now and hardware assumes
+ * report-enable flag is set for NX windows. Ensure software
+ * complies too.
+ */
+ WARN_ON_ONCE(!re);
+
+ addr = txwin->paste_kaddr;
+ if (re) {
+ /*
+ * Set the REPORT_ENABLE bit (equivalent to writing
+ * to 1K offset of the paste address)
+ */
+ val = SET_FIELD(RMA_LSMP_REPORT_ENABLE, 0ULL, 1);
+ addr += val;
+ }
+
+ /*
+ * Map the raw CR value from vas_paste() to an error code (there
+ * is just pass or fail for now though).
+ */
+ rc = vas_paste(addr, offset, last);
+ if (rc == 0x20000000)
+ rc = 0;
+ else
+ rc = -EINVAL;
+
+ print_fifo_msg_count(txwin);
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(vas_paste_crb);
+
static void poll_window_busy_state(struct vas_window *window)
{
int busy;
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 61fd80f..4e3e5fe 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -482,4 +482,19 @@ static inline void decode_pswid(u32 pswid, int *vasid, int *winid)
if (winid)
*winid = pswid & 0xFFFF;
}
+
+#ifdef vas_debug
+
+static void print_fifo_msg_count(struct vas_window *txwin)
+{
+ uint64_t read_hvwc_reg(struct vas_window *w, char *n, uint64_t o);
+ pr_devel("Winid %d, Msg count %llu\n", txwin->winid,
+ (uint64_t)read_hvwc_reg(txwin, VREG(LRFIFO_PUSH)));
+}
+#else /* vas_debug */
+
+#define print_fifo_msg_count(window)
+
+#endif /* vas_debug */
+
#endif /* _VAS_H */
--
2.7.4
Define the vas_rx_win_open() interface. This interface is intended to be
used by the Nest Accelerator (NX) driver(s) to setup receive windows for
one or more NX engines (which implement compression/encryption algorithms
in the hardware).
Follow-on patches will provide an interface to close the window and to open
a send window that kernel subsystems can use to access the NX engines.
The interface to open a receive window is expected to be invoked for each
instance of VAS in the system.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v7]:
- vas_rx_win_open() is simplified because API for FTW windows
are simplified. We expect the driver to open an rxwin and txwin
one after the other and we don't get back an rx_win_handle from
user space so don't have to validate the pid permissions.
Changelog[v6]:
- Add support for FTW windows
Changelog[v4]:
- Export the symbols
Changelog[v3]:
- Fault receive windows must enable interrupts and disable
notifications. NX Windows are opposite.
- Use macros rather than enum for threshold-control mode
- Ignore irq_ports for in-kernel windows. They are needed for
user space windows and will be added later
---
arch/powerpc/include/asm/vas.h | 47 +++++
arch/powerpc/platforms/powernv/vas-window.c | 311 +++++++++++++++++++++++++++-
arch/powerpc/platforms/powernv/vas.h | 14 ++
3 files changed, 371 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 6c126f3..0701a08 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -53,6 +53,36 @@ enum vas_cop_type {
};
/*
+ * Receive window attributes specified by the (in-kernel) owner of window.
+ */
+struct vas_rx_win_attr {
+ void *rx_fifo;
+ int rx_fifo_size;
+ int wcreds_max;
+
+ bool pin_win;
+ bool rej_no_credit;
+ bool tx_wcred_mode;
+ bool rx_wcred_mode;
+ bool tx_win_ord_mode;
+ bool rx_win_ord_mode;
+ bool data_stamp;
+ bool nx_win;
+ bool fault_win;
+ bool user_win;
+ bool notify_disable;
+ bool intr_disable;
+ bool notify_early;
+
+ int lnotify_lpid;
+ int lnotify_pid;
+ int lnotify_tid;
+ uint32_t pswid;
+
+ int tc_mode;
+};
+
+/*
* Return a system-wide unique id for the VAS window @win.
*/
extern uint32_t vas_win_id(struct vas_window *win);
@@ -62,4 +92,21 @@ extern uint32_t vas_win_id(struct vas_window *win);
* can map that address into their address space.
*/
extern uint64_t vas_win_paste_addr(struct vas_window *win);
+
+/*
+ * Helper to initialize receive window attributes to defaults for an
+ * NX window.
+ */
+extern void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr,
+ enum vas_cop_type cop);
+
+/*
+ * Open a VAS receive window for the instance of VAS identified by @vasid
+ * Use @attr to initialize the attributes of the window.
+ *
+ * Return a handle to the window or ERR_PTR() on error.
+ */
+extern struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_rx_win_attr *attr);
+
#endif /* _MISC_VAS_H */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 42c1d4f..2dd4b63 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -12,6 +12,8 @@
#include <linux/slab.h>
#include <linux/io.h>
#include <linux/log2.h>
+#include <linux/rcupdate.h>
+#include <linux/cred.h>
#include "vas.h"
@@ -544,7 +546,7 @@ void vas_window_free(struct vas_window *window)
vas_release_window_id(&vinst->ida, winid);
}
-struct vas_window *vas_window_alloc(struct vas_instance *vinst)
+static struct vas_window *vas_window_alloc(struct vas_instance *vinst)
{
int winid;
struct vas_window *window;
@@ -570,6 +572,313 @@ struct vas_window *vas_window_alloc(struct vas_instance *vinst)
return ERR_PTR(-ENOMEM);
}
+/*
+ * Find the user space receive window given the @pswid.
+ * - We must have a valid vasid and it must belong to this instance.
+ * - The window must refer to an OPEN, FTW, RECEIVE window.
+ *
+ * NOTE: We access ->windows[] table and assume that vinst->mutex is held.
+ */
+struct vas_window *get_user_rxwin(struct vas_instance *vinst, uint32_t pswid)
+{
+ int vasid, winid;
+ struct vas_window *rxwin;
+
+ decode_pswid(pswid, &vasid, &winid);
+
+ if (vinst->vas_id != vasid)
+ return ERR_PTR(-EINVAL);
+
+ rxwin = vinst->windows[winid];
+
+ if (!rxwin || rxwin->tx_win || rxwin->cop != VAS_COP_TYPE_FTW)
+ return ERR_PTR(-EINVAL);
+
+ return rxwin;
+}
+
+/*
+ * Get the VAS receive window associated with NX engine identified
+ * by @cop and if applicable, @pswid.
+ *
+ * See also function header of set_vinst_win().
+ */
+struct vas_window *get_vinst_rxwin(struct vas_instance *vinst,
+ enum vas_cop_type cop, uint32_t pswid)
+{
+ struct vas_window *rxwin;
+
+ mutex_lock(&vinst->mutex);
+
+ if (cop == VAS_COP_TYPE_FTW)
+ rxwin = get_user_rxwin(vinst, pswid);
+ else
+ rxwin = vinst->rxwin[cop] ?: ERR_PTR(-EINVAL);
+
+ if (!IS_ERR(rxwin))
+ atomic_inc(&rxwin->num_txwins);
+
+ mutex_unlock(&vinst->mutex);
+
+ return rxwin;
+}
+
+/*
+ * We have two tables of windows in a VAS instance. The first one,
+ * ->windows[], contains all the windows in the instance and allows
+ * looking up a window by its id. It is used to look up send windows
+ * during fault handling and receive windows when pairing user space
+ * send/receive windows.
+ *
+ * The second table, ->rxwin[], contains receive windows that are
+ * associated with NX engines. This table has VAS_COP_TYPE_MAX
+ * entries and is used to look up a receive window by its
+ * coprocessor type.
+ *
+ * Here, we save @window in the ->windows[] table. If it is a receive
+ * window, we also save the window in the ->rxwin[] table.
+ */
+static void set_vinst_win(struct vas_instance *vinst,
+ struct vas_window *window)
+{
+ int id = window->winid;
+
+ mutex_lock(&vinst->mutex);
+
+ /*
+ * There should only be one receive window for a coprocessor type
+ * unless its a user (FTW) window.
+ */
+ if (!window->user_win && !window->tx_win) {
+ WARN_ON_ONCE(vinst->rxwin[window->cop]);
+ vinst->rxwin[window->cop] = window;
+ }
+
+ WARN_ON_ONCE(vinst->windows[id] != NULL);
+ vinst->windows[id] = window;
+
+ mutex_unlock(&vinst->mutex);
+}
+
+/*
+ * Clear this window from the table(s) of windows for this VAS instance.
+ * See also function header of set_vinst_win().
+ */
+void clear_vinst_win(struct vas_window *window)
+{
+ int id = window->winid;
+ struct vas_instance *vinst = window->vinst;
+
+ mutex_lock(&vinst->mutex);
+
+ if (!window->user_win && !window->tx_win) {
+ WARN_ON_ONCE(!vinst->rxwin[window->cop]);
+ vinst->rxwin[window->cop] = NULL;
+ }
+
+ WARN_ON_ONCE(vinst->windows[id] != window);
+ vinst->windows[id] = NULL;
+
+ mutex_unlock(&vinst->mutex);
+}
+
+static void init_winctx_for_rxwin(struct vas_window *rxwin,
+ struct vas_rx_win_attr *rxattr,
+ struct vas_winctx *winctx)
+{
+ /*
+ * We first zero (memset()) all fields and only set non-zero fields.
+ * Following fields are 0/false but maybe deserve a comment:
+ *
+ * ->notify_os_intr_reg In powerNV, send intrs to HV
+ * ->notify_disable False for NX windows
+ * ->intr_disable False for Fault Windows
+ * ->xtra_write False for NX windows
+ * ->notify_early NA for NX windows
+ * ->rsvd_txbuf_count NA for Rx windows
+ * ->lpid, ->pid, ->tid NA for Rx windows
+ */
+
+ memset(winctx, 0, sizeof(struct vas_winctx));
+
+ winctx->rx_fifo = rxattr->rx_fifo;
+ winctx->rx_fifo_size = rxattr->rx_fifo_size;
+ winctx->wcreds_max = rxattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
+ winctx->pin_win = rxattr->pin_win;
+
+ winctx->nx_win = rxattr->nx_win;
+ winctx->fault_win = rxattr->fault_win;
+ winctx->rx_word_mode = rxattr->rx_win_ord_mode;
+ winctx->tx_word_mode = rxattr->tx_win_ord_mode;
+ winctx->rx_wcred_mode = rxattr->rx_wcred_mode;
+ winctx->tx_wcred_mode = rxattr->tx_wcred_mode;
+
+ if (winctx->nx_win) {
+ winctx->data_stamp = true;
+ winctx->intr_disable = true;
+ winctx->pin_win = true;
+
+ WARN_ON_ONCE(winctx->fault_win);
+ WARN_ON_ONCE(!winctx->rx_word_mode);
+ WARN_ON_ONCE(!winctx->tx_word_mode);
+ WARN_ON_ONCE(winctx->notify_after_count);
+ } else if (winctx->fault_win) {
+ winctx->notify_disable = true;
+ } else if (winctx->user_win) {
+ /*
+ * Section 1.8.1 Low Latency Core-Core Wake up of
+ * the VAS workbook:
+ *
+ * - disable credit checks ([tr]x_wcred_mode = false)
+ * - disable FIFO writes
+ * - enable ASB_Notify, disable interrupt
+ */
+ winctx->fifo_disable = true;
+ winctx->intr_disable = true;
+ winctx->rx_fifo = NULL;
+ }
+
+ winctx->lnotify_lpid = rxattr->lnotify_lpid;
+ winctx->lnotify_pid = rxattr->lnotify_pid;
+ winctx->lnotify_tid = rxattr->lnotify_tid;
+ winctx->pswid = rxattr->pswid;
+ winctx->dma_type = VAS_DMA_TYPE_INJECT;
+ winctx->tc_mode = rxattr->tc_mode;
+
+ winctx->min_scope = VAS_SCOPE_LOCAL;
+ winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+}
+
+static bool rx_win_args_valid(enum vas_cop_type cop,
+ struct vas_rx_win_attr *attr)
+{
+ dump_rx_win_attr(attr);
+
+ if (cop >= VAS_COP_TYPE_MAX)
+ return false;
+
+ if (cop != VAS_COP_TYPE_FTW &&
+ attr->rx_fifo_size < VAS_RX_FIFO_SIZE_MIN)
+ return false;
+
+ if (attr->rx_fifo_size > VAS_RX_FIFO_SIZE_MAX)
+ return false;
+
+ if (attr->nx_win) {
+ /* cannot be fault or user window if it is nx */
+ if (attr->fault_win || attr->user_win)
+ return false;
+ /*
+ * Section 3.1.4.32: NX Windows must not disable notification,
+ * and must not enable interrupts or early notification.
+ */
+ if (attr->notify_disable || !attr->intr_disable ||
+ attr->notify_early)
+ return false;
+ } else if (attr->fault_win) {
+ /* cannot be both fault and user window */
+ if (attr->user_win)
+ return false;
+
+ /*
+ * Section 3.1.4.32: Fault windows must disable notification
+ * but not interrupts.
+ */
+ if (!attr->notify_disable || attr->intr_disable)
+ return false;
+
+ } else if (attr->user_win) {
+ /*
+ * User receive windows are only for fast-thread-wakeup
+ * (FTW). They don't need a FIFO and must disable interrupts
+ */
+ if (attr->rx_fifo || attr->rx_fifo_size || !attr->intr_disable)
+ return false;
+ } else {
+ /* Rx window must be one of NX or Fault or User window. */
+ return false;
+ }
+
+ return true;
+}
+
+void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr, enum vas_cop_type cop)
+{
+ memset(rxattr, 0, sizeof(*rxattr));
+
+ if (cop == VAS_COP_TYPE_842 || cop == VAS_COP_TYPE_842_HIPRI) {
+ rxattr->pin_win = true;
+ rxattr->nx_win = true;
+ rxattr->fault_win = false;
+ rxattr->intr_disable = true;
+ rxattr->rx_wcred_mode = true;
+ rxattr->tx_wcred_mode = true;
+ rxattr->rx_win_ord_mode = true;
+ rxattr->tx_win_ord_mode = true;
+ } else if (cop == VAS_COP_TYPE_FAULT) {
+ rxattr->pin_win = true;
+ rxattr->fault_win = true;
+ rxattr->notify_disable = true;
+ rxattr->rx_wcred_mode = true;
+ rxattr->tx_wcred_mode = true;
+ rxattr->rx_win_ord_mode = true;
+ rxattr->tx_win_ord_mode = true;
+ } else if (cop == VAS_COP_TYPE_FTW) {
+ rxattr->user_win = true;
+ rxattr->intr_disable = true;
+
+ /*
+ * As noted in the VAS Workbook we disable credit checks.
+ * If we enable credit checks in the future, we must also
+ * implement a mechanism to return the user credits or new
+ * paste operations will fail.
+ */
+ }
+}
+EXPORT_SYMBOL_GPL(vas_init_rx_win_attr);
+
+struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
+ struct vas_rx_win_attr *rxattr)
+{
+ struct vas_instance *vinst;
+ struct vas_window *rxwin;
+ struct vas_winctx winctx;
+
+ if (!vas_initialized())
+ return ERR_PTR(-EAGAIN);
+
+ if (!rx_win_args_valid(cop, rxattr))
+ return ERR_PTR(-EINVAL);
+
+ vinst = find_vas_instance(vasid);
+ if (!vinst) {
+ pr_devel("VAS: vasid %d not found!\n", vasid);
+ return ERR_PTR(-EINVAL);
+ }
+ pr_devel("VAS: Found instance %d\n", vasid);
+
+ rxwin = vas_window_alloc(vinst);
+ if (IS_ERR(rxwin)) {
+ pr_devel("VAS: Unable to allocate memory for Rx window\n");
+ return rxwin;
+ }
+
+ rxwin->tx_win = false;
+ rxwin->nx_win = rxattr->nx_win;
+ rxwin->user_win = rxattr->user_win;
+ rxwin->cop = cop;
+ if (rxattr->user_win)
+ rxwin->pid = task_pid_vnr(current);
+
+ init_winctx_for_rxwin(rxwin, rxattr, &winctx);
+ init_winctx_regs(rxwin, &winctx);
+
+ set_vinst_win(vinst, rxwin);
+
+ return rxwin;
+}
+EXPORT_SYMBOL_GPL(vas_rx_win_open);
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 3eadf90..61fd80f 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -289,6 +289,9 @@ enum vas_notify_after_count {
/*
* One per instance of VAS. Each instance will have a separate set of
* receive windows, one per coprocessor type.
+ *
+ * See also function header of set_vinst_win() for details on ->windows[]
+ * and ->rxwin[] tables.
*/
struct vas_instance {
int vas_id;
@@ -397,6 +400,16 @@ extern struct vas_instance *find_vas_instance(int vasid);
#define VREG(r) VREG_SFX(r, _OFFSET)
#ifdef vas_debug
+static inline void dump_rx_win_attr(struct vas_rx_win_attr *attr)
+{
+ pr_err("VAS: fault %d, notify %d, intr %d early %d\n",
+ attr->fault_win, attr->notify_disable,
+ attr->intr_disable, attr->notify_early);
+
+ pr_err("VAS: rx_fifo_size %d, max value %d\n",
+ attr->rx_fifo_size, VAS_RX_FIFO_SIZE_MAX);
+}
+
static inline void vas_log_write(struct vas_window *win, char *name,
void *regptr, uint64_t val)
{
@@ -409,6 +422,7 @@ static inline void vas_log_write(struct vas_window *win, char *name,
#else /* vas_debug */
#define vas_log_write(win, name, reg, val)
+#define dump_rx_win_attr(attr)
#endif /* vas_debug */
--
2.7.4
Define helpers to initialize window context registers of the VAS
hardware. These will be used in follow-on patches when opening/closing
VAS windows.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v6]
- Add support for FTW windows and drop the fault window id
code since it is not needed for FTW/kernel windows.
Changelog[v5]
- Fix: Copy the FIFO address into LFIFO_BAR register as is (don't
shift address into bits 8:53).
Changelog[v4]
- Michael Neuling] Use ilog2(), radix_enabled() helpers;
drop warning when 32-bit app uses VAS (a follow-on patch
will check and return error). Set MSR_PR state to 0 for
kernel (rather than reading from MSR).
Changelog[v3]
- Have caller, rather than init_xlate_regs() reset window regs
so we don't reset any settings caller may already have set.
- Translation mode should be 0x3 (0b11) not 0x11.
- Skip initilaizing read-only registers NX_UTIL and NX_UTIL_SE
- Skip initializing adder registers from UWC - they are already
initialized from the HVWC.
- Check winctx->user_win when setting translation registers
---
arch/powerpc/platforms/powernv/vas-window.c | 305 ++++++++++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 55 +++++
2 files changed, 360 insertions(+)
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index a3a705a..3a50d6a 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -11,6 +11,7 @@
#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/io.h>
+#include <linux/log2.h>
#include "vas.h"
@@ -185,6 +186,310 @@ int map_winctx_mmio_bars(struct vas_window *window)
return 0;
}
+/*
+ * Reset all valid registers in the HV and OS/User Window Contexts for
+ * the window identified by @window.
+ *
+ * NOTE: We cannot really use a for loop to reset window context. Not all
+ * offsets in a window context are valid registers and the valid
+ * registers are not sequential. And, we can only write to offsets
+ * with valid registers (or is that only in Simics?).
+ */
+void reset_window_regs(struct vas_window *window)
+{
+ write_hvwc_reg(window, VREG(LPID), 0ULL);
+ write_hvwc_reg(window, VREG(PID), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_MSR), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_LPCR), 0ULL);
+ write_hvwc_reg(window, VREG(XLATE_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(AMR), 0ULL);
+ write_hvwc_reg(window, VREG(SEIDR), 0ULL);
+ write_hvwc_reg(window, VREG(FAULT_TX_WIN), 0ULL);
+ write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
+ write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), 0ULL);
+ write_hvwc_reg(window, VREG(PSWID), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE1), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE2), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE3), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE4), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE5), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE6), 0ULL);
+ write_hvwc_reg(window, VREG(LFIFO_BAR), 0ULL);
+ write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
+ write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LRX_WCRED), 0ULL);
+ write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(LFIFO_SIZE), 0ULL);
+ write_hvwc_reg(window, VREG(WINCTL), 0ULL);
+ write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(TX_RSVD_BUF_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_CTL), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_PID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_LPID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_TID), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), 0ULL);
+ write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
+
+ /* Skip read-only registers: NX_UTIL and NX_UTIL_SE */
+
+ /*
+ * The send and receive window credit adder registers are also
+ * accessible from HVWC and have been initialized above. We don't
+ * need to initialize from the OS/User Window Context, so skip
+ * following calls:
+ *
+ * write_uwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+ * write_uwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ */
+}
+
+/*
+ * Initialize window context registers related to Address Translation.
+ * These registers are common to send/receive windows although they
+ * differ for user/kernel windows. As we resolve the TODOs we may
+ * want to add fields to vas_winctx and move the initialization to
+ * init_vas_winctx_regs().
+ */
+static void init_xlate_regs(struct vas_window *window, bool user_win)
+{
+ uint64_t lpcr, val;
+
+ /*
+ * MSR_TA, MSR_US are false for both kernel and user.
+ * MSR_DR and MSR_PR are false for kernel.
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_XLATE_MSR_HV, val, true);
+ val = SET_FIELD(VAS_XLATE_MSR_SF, val, true);
+ if (user_win) {
+ val = SET_FIELD(VAS_XLATE_MSR_DR, val, true);
+ val = SET_FIELD(VAS_XLATE_MSR_PR, val, true);
+ }
+ write_hvwc_reg(window, VREG(XLATE_MSR), val);
+
+ lpcr = mfspr(SPRN_LPCR);
+ val = 0ULL;
+ /*
+ * NOTE: From Section 5.7.6.1 Segment Lookaside Buffer of the
+ * Power ISA, v2.07, Page size encoding is 0 = 4KB, 5 = 64KB.
+ *
+ * NOTE: From Section 1.3.1, Address Translation Context of the
+ * Nest MMU Workbook, LPCR_SC should be 0 for Power9.
+ */
+ val = SET_FIELD(VAS_XLATE_LPCR_PAGE_SIZE, val, 5);
+ val = SET_FIELD(VAS_XLATE_LPCR_ISL, val, lpcr & LPCR_ISL);
+ val = SET_FIELD(VAS_XLATE_LPCR_TC, val, lpcr & LPCR_TC);
+ val = SET_FIELD(VAS_XLATE_LPCR_SC, val, 0);
+ write_hvwc_reg(window, VREG(XLATE_LPCR), val);
+
+ /*
+ * Section 1.3.1 (Address translation Context) of NMMU workbook.
+ * 0b00 Hashed Page Table mode
+ * 0b01 Reserved
+ * 0b10 Radix on HPT
+ * 0b11 Radix on Radix
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_XLATE_MODE, val, radix_enabled() ? 3 : 2);
+ write_hvwc_reg(window, VREG(XLATE_CTL), val);
+
+ /*
+ * TODO: Can we mfspr(AMR) even for user windows?
+ */
+ val = 0ULL;
+ val = SET_FIELD(VAS_AMR, val, mfspr(SPRN_AMR));
+ write_hvwc_reg(window, VREG(AMR), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_SEIDR, val, 0);
+ write_hvwc_reg(window, VREG(SEIDR), val);
+}
+
+/*
+ * Initialize Reserved Send Buffer Count for the send window. It involves
+ * writing to the register, reading it back to confirm that the hardware
+ * has enough buffers to reserve. See section 1.3.1.2.1 of VAS workbook.
+ *
+ * Since we can only make a best-effort attempt to fulfill the request,
+ * we don't return any errors if we cannot.
+ *
+ * TODO: Reserved (aka dedicated) send buffers are not supported yet.
+ */
+static void init_rsvd_tx_buf_count(struct vas_window *txwin,
+ struct vas_winctx *winctx)
+{
+ write_hvwc_reg(txwin, VREG(TX_RSVD_BUF_COUNT), 0ULL);
+}
+
+/*
+ * init_winctx_regs()
+ * Initialize window context registers for a receive window.
+ * Except for caching control and marking window open, the registers
+ * are initialized in the order listed in Section 3.1.4 (Window Context
+ * Cache Register Details) of the VAS workbook although they don't need
+ * to be.
+ *
+ * Design note: For NX receive windows, NX allocates the FIFO buffer in OPAL
+ * (so that it can get a large contiguous area) and passes that buffer
+ * to kernel via device tree. We now write that buffer address to the
+ * FIFO BAR. Would it make sense to do this all in OPAL? i.e have OPAL
+ * write the per-chip RX FIFO addresses to the windows during boot-up
+ * as a one-time task? That could work for NX but what about other
+ * receivers? Let the receivers tell us the rx-fifo buffers for now.
+ */
+int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
+{
+ uint64_t val;
+ int fifo_size;
+
+ reset_window_regs(window);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LPID, val, winctx->lpid);
+ write_hvwc_reg(window, VREG(LPID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_PID_ID, val, winctx->pidr);
+ write_hvwc_reg(window, VREG(PID), val);
+
+ init_xlate_regs(window, winctx->user_win);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_FAULT_TX_WIN, val, 0);
+ write_hvwc_reg(window, VREG(FAULT_TX_WIN), val);
+
+ /* In PowerNV, interrupts go to HV. */
+ write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_HV_INTR_SRC_RA, val, winctx->irq_port);
+ write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_PSWID_EA_HANDLE, val, winctx->pswid);
+ write_hvwc_reg(window, VREG(PSWID), val);
+
+ write_hvwc_reg(window, VREG(SPARE1), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE2), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE3), 0ULL);
+
+ /*
+ * NOTE: VAS expects the FIFO address to be copied into the LFIFO_BAR
+ * register as is - do NOT shift the address into VAS_LFIFO_BAR
+ * bit fields! Ok to set the page migration select fields -
+ * VAS ignores the lower 10+ bits in the address anyway, because
+ * the minimum FIFO size is 1K?
+ *
+ * See also: Design note in function header.
+ */
+ val = __pa(winctx->rx_fifo);
+ val = SET_FIELD(VAS_PAGE_MIGRATION_SELECT, val, 0);
+ write_hvwc_reg(window, VREG(LFIFO_BAR), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LDATA_STAMP, val, winctx->data_stamp);
+ write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LDMA_TYPE, val, winctx->dma_type);
+ val = SET_FIELD(VAS_LDMA_FIFO_DISABLE, val, winctx->fifo_disable);
+ write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), val);
+
+ write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
+ write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
+ write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LRX_WCRED, val, winctx->wcreds_max);
+ write_hvwc_reg(window, VREG(LRX_WCRED), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_TX_WCRED, val, winctx->wcreds_max);
+ write_hvwc_reg(window, VREG(TX_WCRED), val);
+
+ write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
+
+ fifo_size = winctx->rx_fifo_size / 1024;
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LFIFO_SIZE, val, ilog2(fifo_size));
+ write_hvwc_reg(window, VREG(LFIFO_SIZE), val);
+
+ /* Update window control and caching control registers last so
+ * we mark the window open only after fully initializing it and
+ * pushing context to cache.
+ */
+
+ write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
+
+ init_rsvd_tx_buf_count(window, winctx);
+
+ /* for a send window, point to the matching receive window */
+ val = 0ULL;
+ val = SET_FIELD(VAS_LRX_WIN_ID, val, winctx->rx_win_id);
+ write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), val);
+
+ write_hvwc_reg(window, VREG(SPARE4), 0ULL);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_NOTIFY_DISABLE, val, winctx->notify_disable);
+ val = SET_FIELD(VAS_INTR_DISABLE, val, winctx->intr_disable);
+ val = SET_FIELD(VAS_NOTIFY_EARLY, val, winctx->notify_early);
+ val = SET_FIELD(VAS_NOTIFY_OSU_INTR, val, winctx->notify_os_intr_reg);
+ write_hvwc_reg(window, VREG(LNOTIFY_CTL), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_PID, val, winctx->lnotify_pid);
+ write_hvwc_reg(window, VREG(LNOTIFY_PID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_LPID, val, winctx->lnotify_lpid);
+ write_hvwc_reg(window, VREG(LNOTIFY_LPID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_TID, val, winctx->lnotify_tid);
+ write_hvwc_reg(window, VREG(LNOTIFY_TID), val);
+
+ val = 0ULL;
+ val = SET_FIELD(VAS_LNOTIFY_MIN_SCOPE, val, winctx->min_scope);
+ val = SET_FIELD(VAS_LNOTIFY_MAX_SCOPE, val, winctx->max_scope);
+ write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), val);
+
+ /* Skip read-only registers NX_UTIL and NX_UTIL_SE */
+
+ write_hvwc_reg(window, VREG(SPARE5), 0ULL);
+ write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
+ write_hvwc_reg(window, VREG(SPARE6), 0ULL);
+
+ /* Finally, push window context to memory and... */
+ val = 0ULL;
+ val = SET_FIELD(VAS_PUSH_TO_MEM, val, 1);
+ write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
+
+ /* ... mark the window open for business */
+ val = 0ULL;
+ val = SET_FIELD(VAS_WINCTL_REJ_NO_CREDIT, val, winctx->rej_no_credit);
+ val = SET_FIELD(VAS_WINCTL_PIN, val, winctx->pin_win);
+ val = SET_FIELD(VAS_WINCTL_TX_WCRED_MODE, val, winctx->tx_wcred_mode);
+ val = SET_FIELD(VAS_WINCTL_RX_WCRED_MODE, val, winctx->rx_wcred_mode);
+ val = SET_FIELD(VAS_WINCTL_TX_WORD_MODE, val, winctx->tx_word_mode);
+ val = SET_FIELD(VAS_WINCTL_RX_WORD_MODE, val, winctx->rx_word_mode);
+ val = SET_FIELD(VAS_WINCTL_FAULT_WIN, val, winctx->fault_win);
+ val = SET_FIELD(VAS_WINCTL_NX_WIN, val, winctx->nx_win);
+ val = SET_FIELD(VAS_WINCTL_OPEN, val, 1);
+ write_hvwc_reg(window, VREG(WINCTL), val);
+
+ return 0;
+}
+
/* stub for now */
int vas_win_close(struct vas_window *window)
{
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 150d7b1..7b2bcd0 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -12,6 +12,7 @@
#include <linux/atomic.h>
#include <linux/idr.h>
#include <asm/vas.h>
+#include <linux/io.h>
/*
* Overview of Virtual Accelerator Switchboard (VAS).
@@ -385,4 +386,58 @@ struct vas_winctx {
extern bool vas_initialized(void);
extern struct vas_instance *find_vas_instance(int vasid);
+/*
+ * VREG(x):
+ * Expand a register's short name (eg: LPID) into two parameters:
+ * - the register's short name in string form ("LPID"), and
+ * - the name of the macro (eg: VAS_LPID_OFFSET), defining the
+ * register's offset in the window context
+ */
+#define VREG_SFX(n, s) __stringify(n), VAS_##n##s
+#define VREG(r) VREG_SFX(r, _OFFSET)
+
+#ifdef vas_debug
+static inline void vas_log_write(struct vas_window *win, char *name,
+ void *regptr, uint64_t val)
+{
+ if (val)
+ pr_err("%swin #%d: %s reg %p, val 0x%016llx\n",
+ win->tx_win ? "Tx" : "Rx", win->winid, name,
+ regptr, val);
+}
+
+#else /* vas_debug */
+
+#define vas_log_write(win, name, reg, val)
+
+#endif /* vas_debug */
+
+static inline void write_uwc_reg(struct vas_window *win, char *name,
+ int32_t reg, uint64_t val)
+{
+ void *regptr;
+
+ regptr = win->uwc_map + reg;
+ vas_log_write(win, name, regptr, val);
+
+ out_be64(regptr, val);
+}
+
+static inline void write_hvwc_reg(struct vas_window *win, char *name,
+ int32_t reg, uint64_t val)
+{
+ void *regptr;
+
+ regptr = win->hvwc_map + reg;
+ vas_log_write(win, name, regptr, val);
+
+ out_be64(regptr, val);
+}
+
+static inline uint64_t read_hvwc_reg(struct vas_window *win,
+ char *name __maybe_unused, int32_t reg)
+{
+ return in_be64(win->hvwc_map+reg);
+}
+
#endif /* _VAS_H */
--
2.7.4
Move the GET_FIELD and SET_FIELD macros to vas.h as VAS and other
users of VAS, including NX-842 can use those macros.
There is a lot of related code between the VAS/NX kernel drivers
and skiboot. For consistency, switch the order of parameters in
SET_FIELD to match the order in skiboot.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Reviewed-by: Dan Streetman <[email protected]>
---
Changelog[v7]
[Michael Ellerman] Move the macros to <asm/vas.h> rather than
to <uapi/asm/vas.h>
Changelog[v3]
- Fix order of parameters in nx-842 driver.
---
arch/powerpc/include/asm/vas.h | 8 ++++++++
drivers/crypto/nx/nx-842-powernv.c | 7 ++++---
drivers/crypto/nx/nx-842.h | 5 -----
3 files changed, 12 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 66a863c..e34d46a 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -30,6 +30,14 @@
#define VAS_THRESH_FIFO_GT_EIGHTH_FULL 3
/*
+ * Get/Set bit fields
+ */
+#define GET_FIELD(m, v) (((v) & (m)) >> MASK_LSH(m))
+#define MASK_LSH(m) (__builtin_ffsl(m) - 1)
+#define SET_FIELD(m, v, val) \
+ (((v) & ~(m)) | ((((typeof(v))(val)) << MASK_LSH(m)) & (m)))
+
+/*
* Co-processor Engine type.
*/
enum vas_cop_type {
diff --git a/drivers/crypto/nx/nx-842-powernv.c b/drivers/crypto/nx/nx-842-powernv.c
index 1710f80..3abb045 100644
--- a/drivers/crypto/nx/nx-842-powernv.c
+++ b/drivers/crypto/nx/nx-842-powernv.c
@@ -22,6 +22,7 @@
#include <asm/prom.h>
#include <asm/icswx.h>
+#include <asm/vas.h>
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Dan Streetman <[email protected]>");
@@ -424,9 +425,9 @@ static int nx842_powernv_function(const unsigned char *in, unsigned int inlen,
/* set up CCW */
ccw = 0;
- ccw = SET_FIELD(ccw, CCW_CT, nx842_ct);
- ccw = SET_FIELD(ccw, CCW_CI_842, 0); /* use 0 for hw auto-selection */
- ccw = SET_FIELD(ccw, CCW_FC_842, fc);
+ ccw = SET_FIELD(CCW_CT, ccw, nx842_ct);
+ ccw = SET_FIELD(CCW_CI_842, ccw, 0); /* use 0 for hw auto-selection */
+ ccw = SET_FIELD(CCW_FC_842, ccw, fc);
/* set up CRB's CSB addr */
csb_addr = nx842_get_pa(csb) & CRB_CSB_ADDRESS;
diff --git a/drivers/crypto/nx/nx-842.h b/drivers/crypto/nx/nx-842.h
index a4eee3b..30929bd 100644
--- a/drivers/crypto/nx/nx-842.h
+++ b/drivers/crypto/nx/nx-842.h
@@ -100,11 +100,6 @@ static inline unsigned long nx842_get_pa(void *addr)
return page_to_phys(vmalloc_to_page(addr)) + offset_in_page(addr);
}
-/* Get/Set bit fields */
-#define MASK_LSH(m) (__builtin_ffsl(m) - 1)
-#define GET_FIELD(v, m) (((v) & (m)) >> MASK_LSH(m))
-#define SET_FIELD(v, m, val) (((v) & ~(m)) | (((val) << MASK_LSH(m)) & (m)))
-
/**
* This provides the driver's constraints. Different nx842 implementations
* may have varying requirements. The constraints are:
--
2.7.4
Implement vas_init() and vas_exit() functions for a new VAS module.
This VAS module is essentially a library for other device drivers
and kernel users of the NX coprocessors like NX-842 and NX-GZIP.
In the future this will be extended to add support for user space
to access the NX coprocessors.
VAS is currently only supported with 64K page size.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
---
Changelog[v5]:
- [Ben Herrenschmidt]: Create and use platform device tree nodes,
fix up the "reg" properties for the VAS DT node and use the
platform device helpers to parse the reg properties; Use linked
list of VAS instances (don't assume vasids are sequential);
Use CONFIG_PPC_VAS instead of CONFIG_VAS.
Changelog[v4]:
- [Michael Neuling] Fix some accidental deletions; fix help text
in Kconfig; change vas_initialized to a function; move from
drivers/misc to arch/powerpc/kernel
- Drop the vas_window_reset() interface. It is not needed as
window will be initialized before each use.
- Add a "depends on PPC_64K_PAGES"
Changelog[v3]:
- Zero vas_instances memory on allocation
- [Haren Myneni] Fix description in Kconfig
Changelog[v2]:
- Get HVWC, UWC and window address parameters from device tree.
---
.../devicetree/bindings/powerpc/ibm,vas.txt | 24 +++
MAINTAINERS | 8 +
arch/powerpc/platforms/powernv/Kconfig | 14 ++
arch/powerpc/platforms/powernv/Makefile | 1 +
arch/powerpc/platforms/powernv/vas-window.c | 19 +++
arch/powerpc/platforms/powernv/vas.c | 183 +++++++++++++++++++++
arch/powerpc/platforms/powernv/vas.h | 3 +
7 files changed, 252 insertions(+)
create mode 100644 Documentation/devicetree/bindings/powerpc/ibm,vas.txt
create mode 100644 arch/powerpc/platforms/powernv/vas-window.c
create mode 100644 arch/powerpc/platforms/powernv/vas.c
diff --git a/Documentation/devicetree/bindings/powerpc/ibm,vas.txt b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
new file mode 100644
index 0000000..0e3111d
--- /dev/null
+++ b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
@@ -0,0 +1,24 @@
+* IBM Powerpc Virtual Accelerator Switchboard (VAS)
+
+VAS is a hardware mechanism that allows kernel subsystems and user processes
+to directly submit compression and other requests to Nest accelerators (NX)
+or other coprocessors functions.
+
+Required properties:
+- compatible : should be "ibm,vas" or "ibm,power9-vas"
+- ibm,vas-id : A unique identifier for each instance of VAS in the system
+- reg : Should contain 4 pairs of 64-bit fields specifying the Hypervisor
+ window context start and length, OS/User window context start and length,
+ "Paste address" start and length, "Paste window id" start bit and number
+ of bits)
+- name : "vas"
+
+Example:
+
+ vas@6019100000000 {
+ compatible = "ibm,vas", "ibm,power9-vas";
+ reg = <0x6019100000000 0x2000000 0x6019000000000 0x100000000 0x8000000000000 0x100000000 0x20 0x10>;
+ name = "vas";
+ ibm,vas-id = <0x1>;
+ };
+
diff --git a/MAINTAINERS b/MAINTAINERS
index 3c41902..abc235f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6425,6 +6425,14 @@ F: drivers/crypto/nx/nx.*
F: drivers/crypto/nx/nx_csbcpb.h
F: drivers/crypto/nx/nx_debugfs.h
+IBM Power Virtual Accelerator Switchboard
+M: Sukadev Bhattiprolu
+L: [email protected]
+S: Supported
+F: arch/powerpc/platforms/powernv/vas*
+F: arch/powerpc/include/asm/vas.h
+F: arch/powerpc/include/uapi/asm/vas.h
+
IBM Power Linux RAID adapter
M: Brian King <[email protected]>
S: Supported
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 6a6f4ef..f565454 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -30,3 +30,17 @@ config OPAL_PRD
help
This enables the opal-prd driver, a facility to run processor
recovery diagnostics on OpenPower machines
+
+config PPC_VAS
+ bool "IBM Virtual Accelerator Switchboard (VAS)"
+ depends on PPC_POWERNV && PPC_64K_PAGES
+ default n
+ help
+ This enables support for IBM Virtual Accelerator Switchboard (VAS).
+
+ VAS allows accelerators in co-processors like NX-GZIP and NX-842
+ to be accessible to kernel subsystems and user processes.
+
+ VAS adapters are found in POWER9 based systems.
+
+ If unsure, say N.
diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index b5d98cb..e4db292 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -12,3 +12,4 @@ obj-$(CONFIG_PPC_SCOM) += opal-xscom.o
obj-$(CONFIG_MEMORY_FAILURE) += opal-memory-errors.o
obj-$(CONFIG_TRACEPOINTS) += opal-tracepoints.o
obj-$(CONFIG_OPAL_PRD) += opal-prd.o
+obj-$(CONFIG_PPC_VAS) += vas.o vas-window.o
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
new file mode 100644
index 0000000..6156fbe
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -0,0 +1,19 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/types.h>
+#include <linux/mutex.h>
+
+#include "vas.h"
+
+/* stub for now */
+int vas_win_close(struct vas_window *window)
+{
+ return -1;
+}
diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
new file mode 100644
index 0000000..556156b
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -0,0 +1,183 @@
+/*
+ * Copyright 2016 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/platform_device.h>
+#include <linux/of_platform.h>
+#include <linux/of_address.h>
+#include <linux/of.h>
+
+#include "vas.h"
+
+static bool init_done;
+LIST_HEAD(vas_instances);
+
+static int init_vas_instance(struct platform_device *pdev)
+{
+ int rc, vasid;
+ struct vas_instance *vinst;
+ struct device_node *dn = pdev->dev.of_node;
+ struct resource *res;
+
+ rc = of_property_read_u32(dn, "ibm,vas-id", &vasid);
+ if (rc) {
+ pr_err("VAS: No ibm,vas-id property for %s?\n", pdev->name);
+ return -ENODEV;
+ }
+
+ if (pdev->num_resources != 4) {
+ pr_err("VAS: Unexpected DT configuration for [%s, %d]\n",
+ pdev->name, vasid);
+ return -ENODEV;
+ }
+
+ vinst = kcalloc(1, sizeof(*vinst), GFP_KERNEL);
+ if (!vinst)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&vinst->node);
+ ida_init(&vinst->ida);
+ mutex_init(&vinst->mutex);
+ vinst->vas_id = vasid;
+ vinst->pdev = pdev;
+
+ res = &pdev->resource[0];
+ vinst->hvwc_bar_start = res->start;
+ vinst->hvwc_bar_len = res->end - res->start + 1;
+
+ res = &pdev->resource[1];
+ vinst->uwc_bar_start = res->start;
+ vinst->uwc_bar_len = res->end - res->start + 1;
+
+ res = &pdev->resource[2];
+ vinst->paste_base_addr = res->start;
+
+ res = &pdev->resource[3];
+ vinst->paste_win_id_shift = 63 - res->end;
+
+ pr_devel("VAS: Initialized instance [%s, %d], paste_base 0x%llx, "
+ "paste_win_id_shift 0x%llx\n", pdev->name, vasid,
+ vinst->paste_base_addr, vinst->paste_win_id_shift);
+
+ vinst->ready = true;
+ list_add(&vinst->node, &vas_instances);
+
+ dev_set_drvdata(&pdev->dev, vinst);
+
+ return 0;
+}
+
+/*
+ * Although this is read/used multiple times, it is written to only
+ * during initialization.
+ */
+struct vas_instance *find_vas_instance(int vasid)
+{
+ struct list_head *ent;
+ struct vas_instance *vinst;
+
+ list_for_each(ent, &vas_instances) {
+ vinst = list_entry(ent, struct vas_instance, node);
+ if (vinst->vas_id == vasid)
+ return vinst;
+ }
+
+ pr_devel("VAS: Instance %d not found\n", vasid);
+ return NULL;
+}
+
+bool vas_initialized(void)
+{
+ return init_done;
+}
+
+static int vas_probe(struct platform_device *pdev)
+{
+ if (!pdev || !pdev->dev.of_node)
+ return -ENODEV;
+
+ return init_vas_instance(pdev);
+}
+
+static void free_inst(struct vas_instance *vinst)
+{
+ list_del(&vinst->node);
+
+ kfree(vinst);
+}
+
+static int vas_remove(struct platform_device *pdev)
+{
+ struct vas_instance *vinst;
+
+ vinst = dev_get_drvdata(&pdev->dev);
+
+ pr_devel("VAS: Removed instance [%s, %d]\n", pdev->name,
+ vinst->vas_id);
+ free_inst(vinst);
+
+ return 0;
+}
+static const struct of_device_id powernv_vas_match[] = {
+ { .compatible = "ibm,vas",},
+ {},
+};
+
+static struct platform_driver vas_driver = {
+ .driver = {
+ .name = "vas",
+ .of_match_table = powernv_vas_match,
+ },
+ .probe = vas_probe,
+ .remove = vas_remove,
+};
+
+module_platform_driver(vas_driver);
+
+int vas_init(void)
+{
+ int found = 0;
+ struct device_node *dn;
+
+ for_each_compatible_node(dn, NULL, "ibm,vas") {
+ of_platform_device_create(dn, NULL, NULL);
+ found++;
+ }
+
+ if (!found)
+ return -ENODEV;
+
+ pr_devel("VAS: Found %d instances\n", found);
+ init_done = true;
+
+ return 0;
+}
+
+void vas_exit(void)
+{
+ struct list_head *ent;
+ struct vas_instance *vinst;
+
+ list_for_each(ent, &vas_instances) {
+ vinst = list_entry(ent, struct vas_instance, node);
+ of_platform_depopulate(&vinst->pdev->dev);
+ }
+
+ init_done = false;
+}
+
+module_init(vas_init);
+module_exit(vas_exit);
+MODULE_DESCRIPTION("Bare metal IBM Virtual Accelerator Switchboard");
+MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
+MODULE_LICENSE("GPL");
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index c66aaf0..150d7b1 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -382,4 +382,7 @@ struct vas_winctx {
enum vas_notify_after_count notify_after_count;
};
+extern bool vas_initialized(void);
+extern struct vas_instance *find_vas_instance(int vasid);
+
#endif /* _VAS_H */
--
2.7.4
Hi Suka,
Comments inline ...
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/Documentation/devicetree/bindings/powerpc/ibm,vas.txt b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
> new file mode 100644
> index 0000000..0e3111d
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
> @@ -0,0 +1,24 @@
> +* IBM Powerpc Virtual Accelerator Switchboard (VAS)
> +
> +VAS is a hardware mechanism that allows kernel subsystems and user processes
> +to directly submit compression and other requests to Nest accelerators (NX)
> +or other coprocessors functions.
> +
> +Required properties:
> +- compatible : should be "ibm,vas" or "ibm,power9-vas"
The driver doesn't look for the latter.
> +- ibm,vas-id : A unique identifier for each instance of VAS in the system
What is this?
> +- reg : Should contain 4 pairs of 64-bit fields specifying the Hypervisor
> + window context start and length, OS/User window context start and length,
> + "Paste address" start and length, "Paste window id" start bit and number
> + of bits)
> +- name : "vas"
I don't think the name is normally included in the binding, and in fact
there's no reason why the name is important, so I'd be inclined to drop that.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 3c41902..abc235f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6425,6 +6425,14 @@ F: drivers/crypto/nx/nx.*
> F: drivers/crypto/nx/nx_csbcpb.h
> F: drivers/crypto/nx/nx_debugfs.h
>
> +IBM Power Virtual Accelerator Switchboard
> +M: Sukadev Bhattiprolu
> +L: [email protected]
> +S: Supported
> +F: arch/powerpc/platforms/powernv/vas*
> +F: arch/powerpc/include/asm/vas.h
> +F: arch/powerpc/include/uapi/asm/vas.h
That's not in the right place, the file is sorted alphabetically.
V comes after L.
> diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> index 6a6f4ef..f565454 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -30,3 +30,17 @@ config OPAL_PRD
> help
> This enables the opal-prd driver, a facility to run processor
> recovery diagnostics on OpenPower machines
> +
> +config PPC_VAS
> + bool "IBM Virtual Accelerator Switchboard (VAS)"
^ bool, so never a module.
> + depends on PPC_POWERNV && PPC_64K_PAGES
> + default n
It should be default y.
I know the usual advice is to make things 'default n', but this has
fairly tight depends already, so y is OK IMO.
> diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
> new file mode 100644
> index 0000000..556156b
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/vas.c
> @@ -0,0 +1,183 @@
> +/*
> + * Copyright 2016 IBM Corp.
2016-2017.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
#define pr_fmt(fmt) "vas: " fmt
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/export.h>
> +#include <linux/types.h>
> +#include <linux/slab.h>
> +#include <linux/platform_device.h>
> +#include <linux/of_platform.h>
> +#include <linux/of_address.h>
> +#include <linux/of.h>
> +
> +#include "vas.h"
> +
> +static bool init_done;
> +LIST_HEAD(vas_instances);
Can be static.
> +
> +static int init_vas_instance(struct platform_device *pdev)
> +{
> + int rc, vasid;
> + struct vas_instance *vinst;
> + struct device_node *dn = pdev->dev.of_node;
> + struct resource *res;
struct device_node *dn = pdev->dev.of_node;
struct vas_instance *vinst;
struct resource *res;
int rc, vasid;
Petty I know, but much prettier :)
> +
> + rc = of_property_read_u32(dn, "ibm,vas-id", &vasid);
> + if (rc) {
> + pr_err("VAS: No ibm,vas-id property for %s?\n", pdev->name);
With the pr_fmt() above you don't need VAS: on the front of all these.
> + return -ENODEV;
> + }
> +
> + if (pdev->num_resources != 4) {
> + pr_err("VAS: Unexpected DT configuration for [%s, %d]\n",
> + pdev->name, vasid);
> + return -ENODEV;
> + }
> +
> + vinst = kcalloc(1, sizeof(*vinst), GFP_KERNEL);
kzalloc() would be more normal given there's only 1.
> + if (!vinst)
> + return -ENOMEM;
> +
> + INIT_LIST_HEAD(&vinst->node);
> + ida_init(&vinst->ida);
> + mutex_init(&vinst->mutex);
> + vinst->vas_id = vasid;
> + vinst->pdev = pdev;
> +
> + res = &pdev->resource[0];
> + vinst->hvwc_bar_start = res->start;
> + vinst->hvwc_bar_len = res->end - res->start + 1;
> +
> + res = &pdev->resource[1];
> + vinst->uwc_bar_start = res->start;
> + vinst->uwc_bar_len = res->end - res->start + 1;
You have vinst->pdev, why do you need to copy all those?
I don't see the lens even used.
> + res = &pdev->resource[2];
> + vinst->paste_base_addr = res->start;
> +
> + res = &pdev->resource[3];
> + vinst->paste_win_id_shift = 63 - res->end;
????
What if res->end is INT_MAX ?
> + pr_devel("VAS: Initialized instance [%s, %d], paste_base 0x%llx, "
> + "paste_win_id_shift 0x%llx\n", pdev->name, vasid,
> + vinst->paste_base_addr, vinst->paste_win_id_shift);
> +
> + vinst->ready = true;
I don't see ready used.
It also shouldn't be necessary, it's not ready unless it's in the list,
and if it's in the list then it's ready.
If you're actually concerned about concurrent usage then you need a
memory barrier here to order the stores to the vinst struct vs the
addition to the list. And you need a matching barrier on the read side.
> + list_add(&vinst->node, &vas_instances);
> +
> + dev_set_drvdata(&pdev->dev, vinst);
> +
> + return 0;
> +}
> +
> +/*
> + * Although this is read/used multiple times, it is written to only
> + * during initialization.
> + */
> +struct vas_instance *find_vas_instance(int vasid)
> +{
> + struct list_head *ent;
> + struct vas_instance *vinst;
> +
> + list_for_each(ent, &vas_instances) {
> + vinst = list_entry(ent, struct vas_instance, node);
> + if (vinst->vas_id == vasid)
> + return vinst;
> + }
There's no locking here, or any reference counting of the instances. see
> + pr_devel("VAS: Instance %d not found\n", vasid);
> + return NULL;
> +}
> +
> +bool vas_initialized(void)
> +{
> + return init_done;
> +}
> +
> +static int vas_probe(struct platform_device *pdev)
> +{
> + if (!pdev || !pdev->dev.of_node)
> + return -ENODEV;
Neither of those can happen, or if they did it would be a BUG.
> + return init_vas_instance(pdev);
> +}
> +
> +static void free_inst(struct vas_instance *vinst)
> +{
> + list_del(&vinst->node);
> + kfree(vinst);
And here you just delete and the free the instance, even though you have
handed out pointers to the instance above in find_vas_instance().
So that needs work.
Is there any hardware cleanup required before we delete the instance? Or
do we just leave any windows there?
Seems like to begin with you should probably just not support remove.
> +static int vas_remove(struct platform_device *pdev)
> +{
> + struct vas_instance *vinst;
> +
> + vinst = dev_get_drvdata(&pdev->dev);
> +
> + pr_devel("VAS: Removed instance [%s, %d]\n", pdev->name,
> + vinst->vas_id);
> + free_inst(vinst);
> +
> + return 0;
> +}
> +static const struct of_device_id powernv_vas_match[] = {
> + { .compatible = "ibm,vas",},
> + {},
> +};
> +
> +static struct platform_driver vas_driver = {
> + .driver = {
> + .name = "vas",
> + .of_match_table = powernv_vas_match,
> + },
> + .probe = vas_probe,
> + .remove = vas_remove,
> +};
> +
> +module_platform_driver(vas_driver);
Can't be a module.
> +
> +int vas_init(void)
> +{
> + int found = 0;
> + struct device_node *dn;
> +
> + for_each_compatible_node(dn, NULL, "ibm,vas") {
> + of_platform_device_create(dn, NULL, NULL);
> + found++;
> + }
> +
> + if (!found)
> + return -ENODEV;
> +
> + pr_devel("VAS: Found %d instances\n", found);
> + init_done = true;
What does init_done mean?
The way it's currently written it just means there were some ibm,vas
nodes in the device tree. But it doesn't say that we actually probed
them correctly and created vas instances for them.
So it doesn't really tell you much.
Two of the callers of vas_initialized() immediately do a
find_instance(), so they don't really need to call it at all, the lack
of any instances will suffice.
The other two callers are vas_copy_crb() and vas_paste_crb(). The only
use of the former is in nx which doesn't check the return code.
But both should never be called unless we allocated a window anyway.
In short it seems unecessary.
> +
> + return 0;
> +}
> +
> +void vas_exit(void)
> +{
> + struct list_head *ent;
> + struct vas_instance *vinst;
> +
> + list_for_each(ent, &vas_instances) {
> + vinst = list_entry(ent, struct vas_instance, node);
> + of_platform_depopulate(&vinst->pdev->dev);
> + }
> +
> + init_done = false;
> +}
> +
> +module_init(vas_init);
> +module_exit(vas_exit);
> +MODULE_DESCRIPTION("Bare metal IBM Virtual Accelerator Switchboard");
> +MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
> +MODULE_LICENSE("GPL");
Can't be a module.
Just a device_initcall() should work for init, you shouldn't need
vas_exit() or any of the other bits.
cheers
Michael Ellerman [[email protected]] wrote:
> Hi Suka,
>
> Comments inline ...
>
> Sukadev Bhattiprolu <[email protected]> writes:
> > diff --git a/Documentation/devicetree/bindings/powerpc/ibm,vas.txt b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
> > new file mode 100644
> > index 0000000..0e3111d
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/powerpc/ibm,vas.txt
> > @@ -0,0 +1,24 @@
> > +* IBM Powerpc Virtual Accelerator Switchboard (VAS)
> > +
> > +VAS is a hardware mechanism that allows kernel subsystems and user processes
> > +to directly submit compression and other requests to Nest accelerators (NX)
> > +or other coprocessors functions.
> > +
> > +Required properties:
> > +- compatible : should be "ibm,vas" or "ibm,power9-vas"
>
> The driver doesn't look for the latter.
Ok. I have removed it from this list of required properties
>
> > +- ibm,vas-id : A unique identifier for each instance of VAS in the system
>
> What is this?
Like the ibm,chip-id, but in the future, there could be more than one instance
of VAS per chip, so firmware assigns a unique id to each instance of VAS.
>
> > +- reg : Should contain 4 pairs of 64-bit fields specifying the Hypervisor
> > + window context start and length, OS/User window context start and length,
> > + "Paste address" start and length, "Paste window id" start bit and number
> > + of bits)
> > +- name : "vas"
>
> I don't think the name is normally included in the binding, and in fact
> there's no reason why the name is important, so I'd be inclined to drop that.
Ok. I dropped it.
>
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 3c41902..abc235f 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6425,6 +6425,14 @@ F: drivers/crypto/nx/nx.*
> > F: drivers/crypto/nx/nx_csbcpb.h
> > F: drivers/crypto/nx/nx_debugfs.h
> >
> > +IBM Power Virtual Accelerator Switchboard
> > +M: Sukadev Bhattiprolu
> > +L: [email protected]
> > +S: Supported
> > +F: arch/powerpc/platforms/powernv/vas*
> > +F: arch/powerpc/include/asm/vas.h
> > +F: arch/powerpc/include/uapi/asm/vas.h
>
> That's not in the right place, the file is sorted alphabetically.
ah, fixed.
>
> V comes after L.
>
> > diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
> > index 6a6f4ef..f565454 100644
> > --- a/arch/powerpc/platforms/powernv/Kconfig
> > +++ b/arch/powerpc/platforms/powernv/Kconfig
> > @@ -30,3 +30,17 @@ config OPAL_PRD
> > help
> > This enables the opal-prd driver, a facility to run processor
> > recovery diagnostics on OpenPower machines
> > +
> > +config PPC_VAS
> > + bool "IBM Virtual Accelerator Switchboard (VAS)"
>
> ^ bool, so never a module.
yes, it should be built in.
>
> > + depends on PPC_POWERNV && PPC_64K_PAGES
> > + default n
>
> It should be default y.
>
> I know the usual advice is to make things 'default n', but this has
> fairly tight depends already, so y is OK IMO.
Ok.
>
> > diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
> > new file mode 100644
> > index 0000000..556156b
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/vas.c
> > @@ -0,0 +1,183 @@
> > +/*
> > + * Copyright 2016 IBM Corp.
>
> 2016-2017.
Ok.
>
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
>
> #define pr_fmt(fmt) "vas: " fmt
Ok
>
> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include <linux/export.h>
> > +#include <linux/types.h>
> > +#include <linux/slab.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/of_platform.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of.h>
> > +
> > +#include "vas.h"
> > +
> > +static bool init_done;
> > +LIST_HEAD(vas_instances);
>
> Can be static.
Yes
>
> > +
> > +static int init_vas_instance(struct platform_device *pdev)
> > +{
> > + int rc, vasid;
> > + struct vas_instance *vinst;
> > + struct device_node *dn = pdev->dev.of_node;
> > + struct resource *res;
>
> struct device_node *dn = pdev->dev.of_node;
> struct vas_instance *vinst;
> struct resource *res;
> int rc, vasid;
>
> Petty I know, but much prettier :)
I usually go the opposite way (shortest first) so I have done that here also.
For newer files I will invert the tree.
>
> > +
> > + rc = of_property_read_u32(dn, "ibm,vas-id", &vasid);
> > + if (rc) {
> > + pr_err("VAS: No ibm,vas-id property for %s?\n", pdev->name);
>
> With the pr_fmt() above you don't need VAS: on the front of all these.
Ok
>
> > + return -ENODEV;
> > + }
> > +
> > + if (pdev->num_resources != 4) {
> > + pr_err("VAS: Unexpected DT configuration for [%s, %d]\n",
> > + pdev->name, vasid);
> > + return -ENODEV;
> > + }
> > +
> > + vinst = kcalloc(1, sizeof(*vinst), GFP_KERNEL);
>
> kzalloc() would be more normal given there's only 1.
Yes.
>
> > + if (!vinst)
> > + return -ENOMEM;
> > +
> > + INIT_LIST_HEAD(&vinst->node);
> > + ida_init(&vinst->ida);
> > + mutex_init(&vinst->mutex);
> > + vinst->vas_id = vasid;
> > + vinst->pdev = pdev;
> > +
> > + res = &pdev->resource[0];
> > + vinst->hvwc_bar_start = res->start;
> > + vinst->hvwc_bar_len = res->end - res->start + 1;
> > +
> > + res = &pdev->resource[1];
> > + vinst->uwc_bar_start = res->start;
> > + vinst->uwc_bar_len = res->end - res->start + 1;
>
> You have vinst->pdev, why do you need to copy all those?
>
> I don't see the lens even used.
I have dropped the len fields. Kept the starts for now as it might
be easier to understand what the field is.
>
> > + res = &pdev->resource[2];
> > + vinst->paste_base_addr = res->start;
> > +
> > + res = &pdev->resource[3];
> > + vinst->paste_win_id_shift = 63 - res->end;
>
> ????
>
> What if res->end is INT_MAX ?
Good question. I have added a check for res->end exceeding 62 but
am not sure how else to error check this or, for that matter, the
res->start fields that we get from skiboot.
>
> > + pr_devel("VAS: Initialized instance [%s, %d], paste_base 0x%llx, "
> > + "paste_win_id_shift 0x%llx\n", pdev->name, vasid,
> > + vinst->paste_base_addr, vinst->paste_win_id_shift);
> > +
> > + vinst->ready = true;
>
> I don't see ready used.
Yes, we don't need it now. I have dropped it.
>
> It also shouldn't be necessary, it's not ready unless it's in the list,
> and if it's in the list then it's ready.
>
> If you're actually concerned about concurrent usage then you need a
> memory barrier here to order the stores to the vinst struct vs the
> addition to the list. And you need a matching barrier on the read side.
>
> > + list_add(&vinst->node, &vas_instances);
> > +
> > + dev_set_drvdata(&pdev->dev, vinst);
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * Although this is read/used multiple times, it is written to only
> > + * during initialization.
> > + */
> > +struct vas_instance *find_vas_instance(int vasid)
> > +{
> > + struct list_head *ent;
> > + struct vas_instance *vinst;
> > +
> > + list_for_each(ent, &vas_instances) {
> > + vinst = list_entry(ent, struct vas_instance, node);
> > + if (vinst->vas_id == vasid)
> > + return vinst;
> > + }
>
> There's no locking here, or any reference counting of the instances. see
The vas_instances list is populated at startup and never really modified
(besides, the vas_exit() code which never gets called and has now been
dropped). I was trying to use vas_initialized() and avoid locking in
find_vas_instance().
I have now dropped vas_initialized() and added a lock here - this is
used in the window setup but not in copy/paste path, so the lock should
not matter much.
>
> > + pr_devel("VAS: Instance %d not found\n", vasid);
> > + return NULL;
> > +}
> > +
> > +bool vas_initialized(void)
> > +{
> > + return init_done;
> > +}
> > +
> > +static int vas_probe(struct platform_device *pdev)
> > +{
> > + if (!pdev || !pdev->dev.of_node)
> > + return -ENODEV;
>
> Neither of those can happen, or if they did it would be a BUG.
Ok. Changed to BUG_ON.
>
> > + return init_vas_instance(pdev);
> > +}
> > +
> > +static void free_inst(struct vas_instance *vinst)
> > +{
> > + list_del(&vinst->node);
> > + kfree(vinst);
>
> And here you just delete and the free the instance, even though you have
> handed out pointers to the instance above in find_vas_instance().
>
> So that needs work.
>
> Is there any hardware cleanup required before we delete the instance? Or
> do we just leave any windows there?
>
> Seems like to begin with you should probably just not support remove.
Yes, I have dropped support for the remove()
>
> > +static int vas_remove(struct platform_device *pdev)
> > +{
> > + struct vas_instance *vinst;
> > +
> > + vinst = dev_get_drvdata(&pdev->dev);
> > +
> > + pr_devel("VAS: Removed instance [%s, %d]\n", pdev->name,
> > + vinst->vas_id);
> > + free_inst(vinst);
> > +
> > + return 0;
> > +}
> > +static const struct of_device_id powernv_vas_match[] = {
> > + { .compatible = "ibm,vas",},
> > + {},
> > +};
> > +
> > +static struct platform_driver vas_driver = {
> > + .driver = {
> > + .name = "vas",
> > + .of_match_table = powernv_vas_match,
> > + },
> > + .probe = vas_probe,
> > + .remove = vas_remove,
> > +};
> > +
> > +module_platform_driver(vas_driver);
>
> Can't be a module.
Yes, its now built-in and not a module anymore.
>
> > +
> > +int vas_init(void)
> > +{
> > + int found = 0;
> > + struct device_node *dn;
> > +
> > + for_each_compatible_node(dn, NULL, "ibm,vas") {
> > + of_platform_device_create(dn, NULL, NULL);
> > + found++;
> > + }
> > +
> > + if (!found)
> > + return -ENODEV;
> > +
> > + pr_devel("VAS: Found %d instances\n", found);
> > + init_done = true;
>
> What does init_done mean?
>
> The way it's currently written it just means there were some ibm,vas
> nodes in the device tree. But it doesn't say that we actually probed
> them correctly and created vas instances for them.
>
> So it doesn't really tell you much.
>
> Two of the callers of vas_initialized() immediately do a
> find_instance(), so they don't really need to call it at all, the lack
> of any instances will suffice.
>
> The other two callers are vas_copy_crb() and vas_paste_crb(). The only
> use of the former is in nx which doesn't check the return code.
>
> But both should never be called unless we allocated a window anyway.
>
> In short it seems unecessary.
Yes, I have dropped vas_initialized().
>
> > +
> > + return 0;
> > +}
> > +
> > +void vas_exit(void)
> > +{
> > + struct list_head *ent;
> > + struct vas_instance *vinst;
> > +
> > + list_for_each(ent, &vas_instances) {
> > + vinst = list_entry(ent, struct vas_instance, node);
> > + of_platform_depopulate(&vinst->pdev->dev);
> > + }
> > +
> > + init_done = false;
> > +}
> > +
> > +module_init(vas_init);
> > +module_exit(vas_exit);
> > +MODULE_DESCRIPTION("Bare metal IBM Virtual Accelerator Switchboard");
> > +MODULE_AUTHOR("Sukadev Bhattiprolu <[email protected]>");
> > +MODULE_LICENSE("GPL");
>
> Can't be a module.
>
> Just a device_initcall() should work for init, you shouldn't need
> vas_exit() or any of the other bits.
Yes, fixed.
>
> cheers
Hi Suka,
Comments inline.
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> index 6156fbe..a3a705a 100644
> --- a/arch/powerpc/platforms/powernv/vas-window.c
> +++ b/arch/powerpc/platforms/powernv/vas-window.c
> @@ -9,9 +9,182 @@
>
> #include <linux/types.h>
> #include <linux/mutex.h>
> +#include <linux/slab.h>
> +#include <linux/io.h>
>
> #include "vas.h"
>
> +/*
> + * Compute the paste address region for the window @window using the
> + * ->paste_base_addr and ->paste_win_id_shift we got from device tree.
> + */
> +void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
> +{
> + uint64_t base, shift;
Please use the kernel types, so u64 here.
> + int winid;
> +
> + base = window->vinst->paste_base_addr;
> + shift = window->vinst->paste_win_id_shift;
> + winid = window->winid;
> +
> + *addr = base + (winid << shift);
> + if (len)
> + *len = PAGE_SIZE;
Having multiple output parameters makes for a pretty awkward API. Is it
really necesssary given len is a constant PAGE_SIZE anyway.
If you didn't return len, then you could just make the function return
the addr, and you wouldn't need any output parameters.
One of the callers that passes len is unmap_paste_region(), but that
is a bit odd. It would be more natural I think if once a window is
mapped it knows its size. Or if the mapping will always just be one page
then we can just know that.
> +
> + pr_debug("Txwin #%d: Paste addr 0x%llx\n", winid, *addr);
> +}
> +
> +static inline void get_hvwc_mmio_bar(struct vas_window *window,
> + uint64_t *start, int *len)
> +{
> + uint64_t pbaddr;
> +
> + pbaddr = window->vinst->hvwc_bar_start;
> + *start = pbaddr + window->winid * VAS_HVWC_SIZE;
> + *len = VAS_HVWC_SIZE;
This is:
#define VAS_HVWC_SIZE 512
But then we map it, which will round up to a page anyway. So again I
don't see the point of having the len returned form this helper.
> +}
> +
> +static inline void get_uwc_mmio_bar(struct vas_window *window,
> + uint64_t *start, int *len)
> +{
> + uint64_t pbaddr;
> +
> + pbaddr = window->vinst->uwc_bar_start;
> + *start = pbaddr + window->winid * VAS_UWC_SIZE;
> + *len = VAS_UWC_SIZE;
> +}
> +
> +/*
> + * Map the paste bus address of the given send window into kernel address
> + * space. Unlike MMIO regions (map_mmio_region() below), paste region must
> + * be mapped cache-able and is only applicable to send windows.
> + */
> +void *map_paste_region(struct vas_window *txwin)
> +{
> + int rc, len;
> + void *map;
> + char *name;
> + uint64_t start;
> +
> + rc = -ENOMEM;
You don't need that.
> + name = kasprintf(GFP_KERNEL, "window-v%d-w%d", txwin->vinst->vas_id,
> + txwin->winid);
> + if (!name)
> + return ERR_PTR(rc);
That can goto free_name;
> +
> + txwin->paste_addr_name = name;
> + compute_paste_address(txwin, &start, &len);
> +
> + if (!request_mem_region(start, len, name)) {
> + pr_devel("%s(): request_mem_region(0x%llx, %d) failed\n",
> + __func__, start, len);
> + goto free_name;
> + }
> +
> + map = ioremap_cache(start, len);
> + if (!map) {
> + pr_devel("%s(): ioremap_cache(0x%llx, %d) failed\n", __func__,
> + start, len);
> + goto free_name;
> + }
> +
> + pr_devel("VAS: mapped paste addr 0x%llx to kaddr 0x%p\n", start, map);
> + return map;
> +
> +free_name:
> + kfree(name);
Because kfree(NULL) is fine.
> + return ERR_PTR(rc);
And that can just return ERR_PTR(-ENOMEM);
> +}
cheers
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> index a3a705a..3a50d6a 100644
> --- a/arch/powerpc/platforms/powernv/vas-window.c
> +++ b/arch/powerpc/platforms/powernv/vas-window.c
> @@ -11,6 +11,7 @@
> #include <linux/mutex.h>
> #include <linux/slab.h>
> #include <linux/io.h>
> +#include <linux/log2.h>
>
> #include "vas.h"
>
> @@ -185,6 +186,310 @@ int map_winctx_mmio_bars(struct vas_window *window)
> return 0;
> }
>
> +/*
> + * Reset all valid registers in the HV and OS/User Window Contexts for
> + * the window identified by @window.
> + *
> + * NOTE: We cannot really use a for loop to reset window context. Not all
> + * offsets in a window context are valid registers and the valid
> + * registers are not sequential. And, we can only write to offsets
> + * with valid registers (or is that only in Simics?).
I assume there's no "reset everything" register we can write to do this
for us?
Also if you can clean up the comment to not mention Simics, I would
assume that applies on real hardware too.
> + */
> +void reset_window_regs(struct vas_window *window)
> +{
> + write_hvwc_reg(window, VREG(LPID), 0ULL);
> + write_hvwc_reg(window, VREG(PID), 0ULL);
> + write_hvwc_reg(window, VREG(XLATE_MSR), 0ULL);
> + write_hvwc_reg(window, VREG(XLATE_LPCR), 0ULL);
> + write_hvwc_reg(window, VREG(XLATE_CTL), 0ULL);
> + write_hvwc_reg(window, VREG(AMR), 0ULL);
> + write_hvwc_reg(window, VREG(SEIDR), 0ULL);
> + write_hvwc_reg(window, VREG(FAULT_TX_WIN), 0ULL);
> + write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
> + write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), 0ULL);
> + write_hvwc_reg(window, VREG(PSWID), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE1), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE2), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE3), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE4), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE5), 0ULL);
> + write_hvwc_reg(window, VREG(SPARE6), 0ULL);
Should we be writing to spare registers? Presumably in a future hardware
revision they might have some unknown purpose.
> + write_hvwc_reg(window, VREG(LFIFO_BAR), 0ULL);
> + write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), 0ULL);
> + write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), 0ULL);
> + write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
> + write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
> + write_hvwc_reg(window, VREG(LRX_WCRED), 0ULL);
> + write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
> + write_hvwc_reg(window, VREG(TX_WCRED), 0ULL);
> + write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
> + write_hvwc_reg(window, VREG(LFIFO_SIZE), 0ULL);
> + write_hvwc_reg(window, VREG(WINCTL), 0ULL);
> + write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
> + write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 0ULL);
> + write_hvwc_reg(window, VREG(TX_RSVD_BUF_COUNT), 0ULL);
> + write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_CTL), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_PID), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_LPID), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_TID), 0ULL);
> + write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), 0ULL);
> + write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
> +
> + /* Skip read-only registers: NX_UTIL and NX_UTIL_SE */
> +
> + /*
> + * The send and receive window credit adder registers are also
> + * accessible from HVWC and have been initialized above. We don't
> + * need to initialize from the OS/User Window Context, so skip
> + * following calls:
> + *
> + * write_uwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
> + * write_uwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
> + */
> +}
> +
> +/*
> + * Initialize window context registers related to Address Translation.
> + * These registers are common to send/receive windows although they
> + * differ for user/kernel windows. As we resolve the TODOs we may
> + * want to add fields to vas_winctx and move the initialization to
> + * init_vas_winctx_regs().
> + */
> +static void init_xlate_regs(struct vas_window *window, bool user_win)
> +{
> + uint64_t lpcr, val;
> +
> + /*
> + * MSR_TA, MSR_US are false for both kernel and user.
> + * MSR_DR and MSR_PR are false for kernel.
> + */
> + val = 0ULL;
> + val = SET_FIELD(VAS_XLATE_MSR_HV, val, true);
Using a bool here presumably works, but if you actually wrote:
((u64)true << VAS_XLATE_MSR_HV)
It would look pretty weird. Using an int would be more normal.
> + val = SET_FIELD(VAS_XLATE_MSR_SF, val, true);
> + if (user_win) {
> + val = SET_FIELD(VAS_XLATE_MSR_DR, val, true);
> + val = SET_FIELD(VAS_XLATE_MSR_PR, val, true);
> + }
> + write_hvwc_reg(window, VREG(XLATE_MSR), val);
> +
> + lpcr = mfspr(SPRN_LPCR);
> + val = 0ULL;
> + /*
> + * NOTE: From Section 5.7.6.1 Segment Lookaside Buffer of the
> + * Power ISA, v2.07, Page size encoding is 0 = 4KB, 5 = 64KB.
Which is 5.7.8.1 in ISA v3.0B.
> + *
> + * NOTE: From Section 1.3.1, Address Translation Context of the
> + * Nest MMU Workbook, LPCR_SC should be 0 for Power9.
> + */
> + val = SET_FIELD(VAS_XLATE_LPCR_PAGE_SIZE, val, 5);
> + val = SET_FIELD(VAS_XLATE_LPCR_ISL, val, lpcr & LPCR_ISL);
> + val = SET_FIELD(VAS_XLATE_LPCR_TC, val, lpcr & LPCR_TC);
> + val = SET_FIELD(VAS_XLATE_LPCR_SC, val, 0);
> + write_hvwc_reg(window, VREG(XLATE_LPCR), val);
> +
> + /*
> + * Section 1.3.1 (Address translation Context) of NMMU workbook.
> + * 0b00 Hashed Page Table mode
> + * 0b01 Reserved
> + * 0b10 Radix on HPT
> + * 0b11 Radix on Radix
> + */
> + val = 0ULL;
> + val = SET_FIELD(VAS_XLATE_MODE, val, radix_enabled() ? 3 : 2);
> + write_hvwc_reg(window, VREG(XLATE_CTL), val);
> +
> + /*
> + * TODO: Can we mfspr(AMR) even for user windows?
> + */
> + val = 0ULL;
> + val = SET_FIELD(VAS_AMR, val, mfspr(SPRN_AMR));
> + write_hvwc_reg(window, VREG(AMR), val);
> +
> + val = 0ULL;
> + val = SET_FIELD(VAS_SEIDR, val, 0);
> + write_hvwc_reg(window, VREG(SEIDR), val);
> +}
> +
> +/*
> + * Initialize Reserved Send Buffer Count for the send window. It involves
> + * writing to the register, reading it back to confirm that the hardware
> + * has enough buffers to reserve. See section 1.3.1.2.1 of VAS workbook.
> + *
> + * Since we can only make a best-effort attempt to fulfill the request,
> + * we don't return any errors if we cannot.
> + *
> + * TODO: Reserved (aka dedicated) send buffers are not supported yet.
> + */
> +static void init_rsvd_tx_buf_count(struct vas_window *txwin,
> + struct vas_winctx *winctx)
> +{
> + write_hvwc_reg(txwin, VREG(TX_RSVD_BUF_COUNT), 0ULL);
> +}
> +
> +/*
> + * init_winctx_regs()
> + * Initialize window context registers for a receive window.
> + * Except for caching control and marking window open, the registers
> + * are initialized in the order listed in Section 3.1.4 (Window Context
> + * Cache Register Details) of the VAS workbook although they don't need
> + * to be.
> + *
> + * Design note: For NX receive windows, NX allocates the FIFO buffer in OPAL
> + * (so that it can get a large contiguous area) and passes that buffer
> + * to kernel via device tree. We now write that buffer address to the
> + * FIFO BAR. Would it make sense to do this all in OPAL? i.e have OPAL
> + * write the per-chip RX FIFO addresses to the windows during boot-up
> + * as a one-time task? That could work for NX but what about other
> + * receivers? Let the receivers tell us the rx-fifo buffers for now.
Why did we do it that way?
If I'm reading the skiboot code right, the "large contiguous area" is 32K.
That's less than a single page?!
cheers
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> index 3a50d6a..9c12919 100644
> --- a/arch/powerpc/platforms/powernv/vas-window.c
> +++ b/arch/powerpc/platforms/powernv/vas-window.c
> @@ -490,6 +490,76 @@ int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
> return 0;
> }
>
> +DEFINE_SPINLOCK(vas_ida_lock);
static.
> +
> +void vas_release_window_id(struct ida *ida, int winid)
static.
> +{
> + spin_lock(&vas_ida_lock);
> + ida_remove(ida, winid);
> + spin_unlock(&vas_ida_lock);
> +}
> +
> +int vas_assign_window_id(struct ida *ida)
static.
> +{
> + int rc, winid;
> +
> + rc = ida_pre_get(ida, GFP_KERNEL);
> + if (!rc)
> + return -EAGAIN;
> +
> + spin_lock(&vas_ida_lock);
> + rc = ida_get_new_above(ida, 0, &winid);
If you're passing 0 you can just use ida_get_new().
Or did you actually want to exclude 0? In which case you should pass 1.
> + spin_unlock(&vas_ida_lock);
> +
> + if (rc)
> + return rc;
You're supposed to handle EAGAIN I thought.
> +
> + if (winid > VAS_WINDOWS_PER_CHIP) {
> + pr_err("VAS: Too many (%d) open windows\n", winid);
> + vas_release_window_id(ida, winid);
> + return -EAGAIN;
> + }
> +
> + return winid;
> +}
> +
> +void vas_window_free(struct vas_window *window)
static.
> +{
> + int winid = window->winid;
> + struct vas_instance *vinst = window->vinst;
> +
> + unmap_winctx_mmio_bars(window);
> + kfree(window);
> +
> + vas_release_window_id(&vinst->ida, winid);
> +}
> +
> +struct vas_window *vas_window_alloc(struct vas_instance *vinst)
> +{
> + int winid;
> + struct vas_window *window;
> +
> + winid = vas_assign_window_id(&vinst->ida);
> + if (winid < 0)
> + return ERR_PTR(winid);
> +
> + window = kzalloc(sizeof(*window), GFP_KERNEL);
> + if (!window)
> + return ERR_PTR(-ENOMEM);
You leak an id here.
The error handling would be easier in here if the caller did the alloc,
or if you split alloc and init, and alloc just did the kzalloc().
One of the callers even prints "unable to allocate memory" if this
function fails, but that's not accurate, there's several failure modes.
> +
> + window->vinst = vinst;
> + window->winid = winid;
> +
> + if (map_winctx_mmio_bars(window))
> + goto out_free;
> +
> + return window;
> +
> +out_free:
> + kfree(window);
Leak an id here.
> + return ERR_PTR(-ENOMEM);
> +}
> +
> /* stub for now */
> int vas_win_close(struct vas_window *window)
> {
> --
> 2.7.4
Sukadev Bhattiprolu <[email protected]> writes:
> Define an interface that the NX drivers can use to find the physical
> paste address of a send window. This interface is expected to be used
> with the mmap() operation of the NX driver's device. i.e the user space
> process can use driver's mmap() operation to map the send window's paste
> address into their address space and then use copy and paste instructions
> to submit the CRBs to the NX engine.
>
> Note that kernel drivers will use vas_paste_crb() directly and don't need
> this interface.
AFAICS this is not used. And if it was that would be a problem because
there are implications in letting userspace do paste, so can you drop
this from the series for now.
cheers
Sukadev Bhattiprolu <[email protected]> writes:
> Define an interface to return a system-wide unique id for a given VAS
> window.
>
> The vas_win_id() will be used in a follow-on patch to generate an unique
> handle for a user space receive window. Applications can use this handle
> to pair send and receive windows for fast thread-wakeup.
>
> The hardware refers to this system-wide unique id as a Partition Send
> Window ID which is expected to be used during fault handling. Hence the
> "pswid" in the function names.
Same comment as previous patch.
cheers
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
> new file mode 100644
> index 0000000..c66aaf0
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/vas.h
> @@ -0,0 +1,385 @@
...
> +
> +/*
> + * In-kernel state a VAS window. One per window.
> + */
> +struct vas_window {
> + /* Fields common to send and receive windows */
> + struct vas_instance *vinst;
> + int winid;
> + bool tx_win; /* True if send window */
> + bool nx_win; /* True if NX window */
> + bool user_win; /* True if user space window */
> + void *hvwc_map; /* HV window context */
> + void *uwc_map; /* OS/User window context */
> + pid_t pid; /* Linux process id of owner */
> +
> + /* Fields applicable only to send windows */
> + void *paste_kaddr;
> + char *paste_addr_name;
> + struct vas_window *rxwin;
> +
> + /* Feilds applicable only to receive windows */
> + enum vas_cop_type cop;
> + atomic_t num_txwins;
Just noticed this. You should probably use the new refcount_t type,
because AFAICS you're using this as a refcount.
cheers
Hi Suka,
More comments :)
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> index 2dd4b63..24288dd 100644
> --- a/arch/powerpc/platforms/powernv/vas-window.c
> +++ b/arch/powerpc/platforms/powernv/vas-window.c
> @@ -879,11 +887,92 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
> }
> EXPORT_SYMBOL_GPL(vas_rx_win_open);
>
> -/* stub for now */
> +static void poll_window_busy_state(struct vas_window *window)
> +{
> + int busy;
> + uint64_t val;
> +
> +retry:
> + /*
> + * Poll Window Busy flag
> + */
> + val = read_hvwc_reg(window, VREG(WIN_STATUS));
> + busy = GET_FIELD(VAS_WIN_BUSY, val);
> + if (busy) {
> + val = 0;
> + schedule_timeout(2000);
What's 2000?
That's in jiffies, so it's not a fixed amount of time.
But on a typical config that will be 20 _seconds_ ?!
But you haven't set the task state, so AFAIK it will just return
instantly.
And if there's a software/hardware bug and it never stops being busy,
then we have a softlockup. The other option would be print a big fat
warning and just not free the window. But maybe that doesn't work for
other reasons.
> + goto retry;
> + }
> +}
> +
> +static void poll_window_castout(struct vas_window *window)
> +{
> + int cached;
> + uint64_t val;
> +
> + /* Cast window context out of the cache */
> +retry:
> + val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
> + cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
> + if (cached) {
> + val = 0ULL;
> + val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
> + val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
> + write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
Sigh, I still don't like that macro :)
or:
write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 1ull << 63);
> +
> + schedule_timeout(2000);
> + goto retry;
> + }
> +}
> +
> +/*
> + * Close a window.
> + *
> + * See Section 1.12.1 of VAS workbook v1.05 for details on closing window:
> + * - Disable new paste operations (unmap paste address)
> + * - Poll for the "Window Busy" bit to be cleared
> + * - Clear the Open/Enable bit for the Window.
> + * - Poll for return of window Credits (implies FIFO empty for Rx win?)
> + * - Unpin and cast window context out of cache
> + *
> + * Besides the hardware, kernel has some bookkeeping of course.
> + */
> int vas_win_close(struct vas_window *window)
> {
> - return -1;
> + uint64_t val;
> +
> + if (!window)
> + return 0;
> +
> + if (!window->tx_win && atomic_read(&window->num_txwins) != 0) {
> + pr_devel("VAS: Attempting to close an active Rx window!\n");
> + WARN_ON_ONCE(1);
> + return -EAGAIN;
EAGAIN means "if you do the same thing again it might work".
I don't think that's right here. The window is not in a state where it
can be freed, the caller needs to do something to fix that.
EBUSY would probably be more appropriate.
cheers
Hi Suka,
A few more things ...
Sukadev Bhattiprolu <[email protected]> writes:
> diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
> new file mode 100644
> index 0000000..7783bb8
> --- /dev/null
> +++ b/arch/powerpc/platforms/powernv/copy-paste.h
> @@ -0,0 +1,74 @@
> +/*
> + * Copyright 2016 IBM Corp.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version
> + * 2 of the License, or (at your option) any later version.
> + */
> +
> +/*
> + * Macros taken from tools/testing/selftests/powerpc/context_switch/cp_abort.c
> + */
These are both out of date, they're changed in v3.0B.
> +#define PASTE(RA, RB, L, RC) \
> + .long (0x7c00070c | (RA) << (31-15) | (RB) << (31-20) \
> + | (L) << (31-10) | (RC) << (31-31))
You should define PPC_PASTE() in ppc-opcode.h
We already have PPC_INST_PASTE, so use that.
L and RC are gone.
> +
> +#define COPY(RA, RB, L) \
> + .long (0x7c00060c | (RA) << (31-15) | (RB) << (31-20) \
> + | (L) << (31-10))
Use PPC_COPY().
> +
> +#define CR0_FXM "0x80"
I don't think a #define for this helps readability.
> +#define CR0_SHIFT 28
> +#define CR0_MASK 0xF
Not used.
> +/*
> + * Copy/paste instructions:
> + *
> + * copy RA,RB,L
> + * Copy contents of address (RA) + effective_address(RB)
> + * to internal copy-buffer.
> + *
> + * L == 1 indicates this is the first copy.
> + *
> + * L == 0 indicates its a continuation of a prior first copy.
> + *
> + * paste RA,RB,L
> + * Paste contents of internal copy-buffer to the address
> + * (RA) + effective_address(RB)
> + *
> + * L == 0 indicates its a continuation of a prior paste. i.e.
> + * don't wait for the completion or update status.
> + *
> + * L == 1 indicates this is the last paste in the group (i.e.
> + * wait for the group to complete and update status in CR0).
> + *
> + * For Power9, the L bit must be 'true' in both copy and paste.
> + */
> +
> +static inline int vas_copy(void *crb, int offset, int first)
> +{
> + WARN_ON_ONCE(!first);
Please change the API to not require unused parameters.
Same for offset.
> +
> + __asm__ __volatile(stringify_in_c(COPY(%0, %1, %2))";"
I've never seen __volatile before.
Just use: asm volatile
> + :
> + : "b" (offset), "b" (crb), "i" (1)
> + : "memory");
> +
> + return 0;
> +}
> +
> +static inline int vas_paste(void *paste_address, int offset, int last)
> +{
> + unsigned long long cr;
cr is 32-bits actually.
> + WARN_ON_ONCE(!last);
> +
> + cr = 0;
> + __asm__ __volatile(stringify_in_c(PASTE(%1, %2, 1, 1))";"
> + "mfocrf %0," CR0_FXM ";"
> + : "=r" (cr)
> + : "b" (paste_address), "b" (offset)
> + : "memory");
You need cr0 in the clobbers.
> +
> + return cr;
I think it would be more natural if you just returned CR0, so if you did
shift and mask with the CR0 constants you have above.
> diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> index 70762c3..73081b4 100644
> --- a/arch/powerpc/platforms/powernv/vas-window.c
> +++ b/arch/powerpc/platforms/powernv/vas-window.c
> @@ -1040,6 +1041,57 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
> }
> EXPORT_SYMBOL_GPL(vas_tx_win_open);
>
> +int vas_copy_crb(void *crb, int offset, bool first)
> +{
> + if (!vas_initialized())
> + return -1;
> +
> + return vas_copy(crb, offset, first);
> +}
> +EXPORT_SYMBOL_GPL(vas_copy_crb);
> +
> +#define RMA_LSMP_REPORT_ENABLE PPC_BIT(53)
> +int vas_paste_crb(struct vas_window *txwin, int offset, bool last, bool re)
> +{
> + int rc;
> + uint64_t val;
> + void *addr;
> +
> + if (!vas_initialized())
> + return -1;
This is in the fast path, or at least the runtime path. So I don't think
these checks are wanted, how would we have got this far if vas wasn't
initialised?
cheers
Michael Ellerman [[email protected]] wrote:
> Hi Suka,
>
> Comments inline.
>
> Sukadev Bhattiprolu <[email protected]> writes:
> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> > index 6156fbe..a3a705a 100644
> > --- a/arch/powerpc/platforms/powernv/vas-window.c
> > +++ b/arch/powerpc/platforms/powernv/vas-window.c
> > @@ -9,9 +9,182 @@
> >
> > #include <linux/types.h>
> > #include <linux/mutex.h>
> > +#include <linux/slab.h>
> > +#include <linux/io.h>
> >
> > #include "vas.h"
> >
> > +/*
> > + * Compute the paste address region for the window @window using the
> > + * ->paste_base_addr and ->paste_win_id_shift we got from device tree.
> > + */
> > +void compute_paste_address(struct vas_window *window, uint64_t *addr, int *len)
> > +{
> > + uint64_t base, shift;
>
> Please use the kernel types, so u64 here.
Ok.
>
> > + int winid;
> > +
> > + base = window->vinst->paste_base_addr;
> > + shift = window->vinst->paste_win_id_shift;
> > + winid = window->winid;
> > +
> > + *addr = base + (winid << shift);
> > + if (len)
> > + *len = PAGE_SIZE;
>
> Having multiple output parameters makes for a pretty awkward API. Is it
> really necesssary given len is a constant PAGE_SIZE anyway.
>
> If you didn't return len, then you could just make the function return
> the addr, and you wouldn't need any output parameters.
I agree, I went back and forth on it. I was trying to avoid callers
making assumptions on the size. But since there are just a couple
of places, I guess we could have them assume PAGE_SIZE.
>
> One of the callers that passes len is unmap_paste_region(), but that
> is a bit odd. It would be more natural I think if once a window is
> mapped it knows its size. Or if the mapping will always just be one page
> then we can just know that.
Agree, since the len values are constant I was trying to avoid saving
them in each of the 64K windows - so the compute during unmap. Will change
to assume PAGE_SIZE.
Also agree with other comments here.
Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> > index a3a705a..3a50d6a 100644
> > --- a/arch/powerpc/platforms/powernv/vas-window.c
> > +++ b/arch/powerpc/platforms/powernv/vas-window.c
> > @@ -11,6 +11,7 @@
> > #include <linux/mutex.h>
> > #include <linux/slab.h>
> > #include <linux/io.h>
> > +#include <linux/log2.h>
> >
> > #include "vas.h"
> >
> > @@ -185,6 +186,310 @@ int map_winctx_mmio_bars(struct vas_window *window)
> > return 0;
> > }
> >
> > +/*
> > + * Reset all valid registers in the HV and OS/User Window Contexts for
> > + * the window identified by @window.
> > + *
> > + * NOTE: We cannot really use a for loop to reset window context. Not all
> > + * offsets in a window context are valid registers and the valid
> > + * registers are not sequential. And, we can only write to offsets
> > + * with valid registers (or is that only in Simics?).
>
> I assume there's no "reset everything" register we can write to do this
> for us?
Checked with the hardware team and they said there is no "reset everything"
register. While there are some tricky ways to clear the context, writing
zeroes is the easiest.
>
> Also if you can clean up the comment to not mention Simics, I would
> assume that applies on real hardware too.
>
> > + */
> > +void reset_window_regs(struct vas_window *window)
> > +{
> > + write_hvwc_reg(window, VREG(LPID), 0ULL);
> > + write_hvwc_reg(window, VREG(PID), 0ULL);
> > + write_hvwc_reg(window, VREG(XLATE_MSR), 0ULL);
> > + write_hvwc_reg(window, VREG(XLATE_LPCR), 0ULL);
> > + write_hvwc_reg(window, VREG(XLATE_CTL), 0ULL);
> > + write_hvwc_reg(window, VREG(AMR), 0ULL);
> > + write_hvwc_reg(window, VREG(SEIDR), 0ULL);
> > + write_hvwc_reg(window, VREG(FAULT_TX_WIN), 0ULL);
> > + write_hvwc_reg(window, VREG(OSU_INTR_SRC_RA), 0ULL);
> > + write_hvwc_reg(window, VREG(HV_INTR_SRC_RA), 0ULL);
> > + write_hvwc_reg(window, VREG(PSWID), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE1), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE2), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE3), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE4), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE5), 0ULL);
> > + write_hvwc_reg(window, VREG(SPARE6), 0ULL);
>
> Should we be writing to spare registers? Presumably in a future hardware
> revision they might have some unknown purpose.
Sure, will skip those.
>
> > + write_hvwc_reg(window, VREG(LFIFO_BAR), 0ULL);
> > + write_hvwc_reg(window, VREG(LDATA_STAMP_CTL), 0ULL);
> > + write_hvwc_reg(window, VREG(LDMA_CACHE_CTL), 0ULL);
> > + write_hvwc_reg(window, VREG(LRFIFO_PUSH), 0ULL);
> > + write_hvwc_reg(window, VREG(CURR_MSG_COUNT), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_AFTER_COUNT), 0ULL);
> > + write_hvwc_reg(window, VREG(LRX_WCRED), 0ULL);
> > + write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
> > + write_hvwc_reg(window, VREG(TX_WCRED), 0ULL);
> > + write_hvwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
> > + write_hvwc_reg(window, VREG(LFIFO_SIZE), 0ULL);
> > + write_hvwc_reg(window, VREG(WINCTL), 0ULL);
> > + write_hvwc_reg(window, VREG(WIN_STATUS), 0ULL);
> > + write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 0ULL);
> > + write_hvwc_reg(window, VREG(TX_RSVD_BUF_COUNT), 0ULL);
> > + write_hvwc_reg(window, VREG(LRFIFO_WIN_PTR), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_CTL), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_PID), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_LPID), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_TID), 0ULL);
> > + write_hvwc_reg(window, VREG(LNOTIFY_SCOPE), 0ULL);
> > + write_hvwc_reg(window, VREG(NX_UTIL_ADDER), 0ULL);
> > +
> > + /* Skip read-only registers: NX_UTIL and NX_UTIL_SE */
> > +
> > + /*
> > + * The send and receive window credit adder registers are also
> > + * accessible from HVWC and have been initialized above. We don't
> > + * need to initialize from the OS/User Window Context, so skip
> > + * following calls:
> > + *
> > + * write_uwc_reg(window, VREG(TX_WCRED_ADDER), 0ULL);
> > + * write_uwc_reg(window, VREG(LRX_WCRED_ADDER), 0ULL);
> > + */
> > +}
> > +
> > +/*
> > + * Initialize window context registers related to Address Translation.
> > + * These registers are common to send/receive windows although they
> > + * differ for user/kernel windows. As we resolve the TODOs we may
> > + * want to add fields to vas_winctx and move the initialization to
> > + * init_vas_winctx_regs().
> > + */
> > +static void init_xlate_regs(struct vas_window *window, bool user_win)
> > +{
> > + uint64_t lpcr, val;
> > +
> > + /*
> > + * MSR_TA, MSR_US are false for both kernel and user.
> > + * MSR_DR and MSR_PR are false for kernel.
> > + */
> > + val = 0ULL;
> > + val = SET_FIELD(VAS_XLATE_MSR_HV, val, true);
>
> Using a bool here presumably works, but if you actually wrote:
>
> ((u64)true << VAS_XLATE_MSR_HV)
>
> It would look pretty weird. Using an int would be more normal.
Ok.
>
> > + val = SET_FIELD(VAS_XLATE_MSR_SF, val, true);
> > + if (user_win) {
> > + val = SET_FIELD(VAS_XLATE_MSR_DR, val, true);
> > + val = SET_FIELD(VAS_XLATE_MSR_PR, val, true);
> > + }
> > + write_hvwc_reg(window, VREG(XLATE_MSR), val);
> > +
> > + lpcr = mfspr(SPRN_LPCR);
> > + val = 0ULL;
> > + /*
> > + * NOTE: From Section 5.7.6.1 Segment Lookaside Buffer of the
> > + * Power ISA, v2.07, Page size encoding is 0 = 4KB, 5 = 64KB.
>
> Which is 5.7.8.1 in ISA v3.0B.
Ok.
>
> > + *
> > + * NOTE: From Section 1.3.1, Address Translation Context of the
> > + * Nest MMU Workbook, LPCR_SC should be 0 for Power9.
> > + */
> > + val = SET_FIELD(VAS_XLATE_LPCR_PAGE_SIZE, val, 5);
> > + val = SET_FIELD(VAS_XLATE_LPCR_ISL, val, lpcr & LPCR_ISL);
> > + val = SET_FIELD(VAS_XLATE_LPCR_TC, val, lpcr & LPCR_TC);
> > + val = SET_FIELD(VAS_XLATE_LPCR_SC, val, 0);
> > + write_hvwc_reg(window, VREG(XLATE_LPCR), val);
> > +
> > + /*
> > + * Section 1.3.1 (Address translation Context) of NMMU workbook.
> > + * 0b00 Hashed Page Table mode
> > + * 0b01 Reserved
> > + * 0b10 Radix on HPT
> > + * 0b11 Radix on Radix
> > + */
> > + val = 0ULL;
> > + val = SET_FIELD(VAS_XLATE_MODE, val, radix_enabled() ? 3 : 2);
> > + write_hvwc_reg(window, VREG(XLATE_CTL), val);
> > +
> > + /*
> > + * TODO: Can we mfspr(AMR) even for user windows?
> > + */
> > + val = 0ULL;
> > + val = SET_FIELD(VAS_AMR, val, mfspr(SPRN_AMR));
> > + write_hvwc_reg(window, VREG(AMR), val);
> > +
> > + val = 0ULL;
> > + val = SET_FIELD(VAS_SEIDR, val, 0);
> > + write_hvwc_reg(window, VREG(SEIDR), val);
> > +}
> > +
> > +/*
> > + * Initialize Reserved Send Buffer Count for the send window. It involves
> > + * writing to the register, reading it back to confirm that the hardware
> > + * has enough buffers to reserve. See section 1.3.1.2.1 of VAS workbook.
> > + *
> > + * Since we can only make a best-effort attempt to fulfill the request,
> > + * we don't return any errors if we cannot.
> > + *
> > + * TODO: Reserved (aka dedicated) send buffers are not supported yet.
> > + */
> > +static void init_rsvd_tx_buf_count(struct vas_window *txwin,
> > + struct vas_winctx *winctx)
> > +{
> > + write_hvwc_reg(txwin, VREG(TX_RSVD_BUF_COUNT), 0ULL);
> > +}
> > +
> > +/*
> > + * init_winctx_regs()
> > + * Initialize window context registers for a receive window.
> > + * Except for caching control and marking window open, the registers
> > + * are initialized in the order listed in Section 3.1.4 (Window Context
> > + * Cache Register Details) of the VAS workbook although they don't need
> > + * to be.
> > + *
> > + * Design note: For NX receive windows, NX allocates the FIFO buffer in OPAL
> > + * (so that it can get a large contiguous area) and passes that buffer
> > + * to kernel via device tree. We now write that buffer address to the
> > + * FIFO BAR. Would it make sense to do this all in OPAL? i.e have OPAL
> > + * write the per-chip RX FIFO addresses to the windows during boot-up
> > + * as a one-time task? That could work for NX but what about other
> > + * receivers? Let the receivers tell us the rx-fifo buffers for now.
>
> Why did we do it that way?
>
> If I'm reading the skiboot code right, the "large contiguous area" is 32K.
>
> That's less than a single page?!
I guess it is because the FIFO can get larger in the future?
Ccing Haren.
Thanks
Sukadev
>
>
> cheers
Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> > + rc = ida_pre_get(ida, GFP_KERNEL);
> > + if (!rc)
> > + return -EAGAIN;
> > +
> > + spin_lock(&vas_ida_lock);
> > + rc = ida_get_new_above(ida, 0, &winid);
>
> If you're passing 0 you can just use ida_get_new().
Ok.
>
> Or did you actually want to exclude 0? In which case you should pass 1.
>
> > + spin_unlock(&vas_ida_lock);
> > +
> > + if (rc)
> > + return rc;
>
> You're supposed to handle EAGAIN I thought.
Yes, I will retry the pre_get()
>
> > +
> > + if (winid > VAS_WINDOWS_PER_CHIP) {
> > + pr_err("VAS: Too many (%d) open windows\n", winid);
> > + vas_release_window_id(ida, winid);
> > + return -EAGAIN;
> > + }
> > +
> > + return winid;
> > +}
> > +
> > +void vas_window_free(struct vas_window *window)
>
> static.
Ok
>
> > +{
> > + int winid = window->winid;
> > + struct vas_instance *vinst = window->vinst;
> > +
> > + unmap_winctx_mmio_bars(window);
> > + kfree(window);
> > +
> > + vas_release_window_id(&vinst->ida, winid);
> > +}
> > +
> > +struct vas_window *vas_window_alloc(struct vas_instance *vinst)
> > +{
> > + int winid;
> > + struct vas_window *window;
> > +
> > + winid = vas_assign_window_id(&vinst->ida);
> > + if (winid < 0)
> > + return ERR_PTR(winid);
> > +
> > + window = kzalloc(sizeof(*window), GFP_KERNEL);
> > + if (!window)
> > + return ERR_PTR(-ENOMEM);
>
> You leak an id here.
Argh. Yes.
>
> The error handling would be easier in here if the caller did the alloc,
> or if you split alloc and init, and alloc just did the kzalloc().
I was trying to simplify error handling in the callers where they have
to only deal with one failure now.
>
> One of the callers even prints "unable to allocate memory" if this
> function fails, but that's not accurate, there's several failure modes.
Yes, will fix that message and the leaks.
Thanks,
Suka
Michael Ellerman [[email protected]] wrote:
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > Define an interface to return a system-wide unique id for a given VAS
> > window.
> >
> > The vas_win_id() will be used in a follow-on patch to generate an unique
> > handle for a user space receive window. Applications can use this handle
> > to pair send and receive windows for fast thread-wakeup.
> >
> > The hardware refers to this system-wide unique id as a Partition Send
> > Window ID which is expected to be used during fault handling. Hence the
> > "pswid" in the function names.
>
> Same comment as previous patch.
Ok will drop them for now.
>
> cheers
Michael Ellerman [[email protected]] wrote:
> Hi Suka,
>
> More comments :)
Thanks!
>
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> > index 2dd4b63..24288dd 100644
> > --- a/arch/powerpc/platforms/powernv/vas-window.c
> > +++ b/arch/powerpc/platforms/powernv/vas-window.c
> > @@ -879,11 +887,92 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
> > }
> > EXPORT_SYMBOL_GPL(vas_rx_win_open);
> >
> > -/* stub for now */
> > +static void poll_window_busy_state(struct vas_window *window)
> > +{
> > + int busy;
> > + uint64_t val;
> > +
> > +retry:
> > + /*
> > + * Poll Window Busy flag
> > + */
> > + val = read_hvwc_reg(window, VREG(WIN_STATUS));
> > + busy = GET_FIELD(VAS_WIN_BUSY, val);
> > + if (busy) {
> > + val = 0;
> > + schedule_timeout(2000);
>
> What's 2000?
>
> That's in jiffies, so it's not a fixed amount of time.
>
> But on a typical config that will be 20 _seconds_ ?!
Ok. Should I change to that just HZ and
>
> But you haven't set the task state, so AFAIK it will just return
> instantly.
call set_current_state(TASK_UNINTERRUPTIBLE) before the schedule_timeout()?
>
> And if there's a software/hardware bug and it never stops being busy,
> then we have a softlockup. The other option would be print a big fat
> warning and just not free the window. But maybe that doesn't work for
> other reasons.
>
> > + goto retry;
> > + }
> > +}
> > +
> > +static void poll_window_castout(struct vas_window *window)
> > +{
> > + int cached;
> > + uint64_t val;
> > +
> > + /* Cast window context out of the cache */
> > +retry:
> > + val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
> > + cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
> > + if (cached) {
> > + val = 0ULL;
> > + val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
> > + val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
> > + write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
>
> Sigh, I still don't like that macro :)
:-) For one thing, I have used it a lot now and secondly isn't it easier
to know that VAS_CASTOUT_REQ bit is set to 1 without worrying about its
bit position? When debugging, yes we have to ensure VAS_CASTOUT_REQ is
properly defined and we have to work out value in "val".
>
> or:
> write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), 1ull << 63);
>
> > +
> > + schedule_timeout(2000);
> > + goto retry;
> > + }
> > +}
> > +
> > +/*
> > + * Close a window.
> > + *
> > + * See Section 1.12.1 of VAS workbook v1.05 for details on closing window:
> > + * - Disable new paste operations (unmap paste address)
> > + * - Poll for the "Window Busy" bit to be cleared
> > + * - Clear the Open/Enable bit for the Window.
> > + * - Poll for return of window Credits (implies FIFO empty for Rx win?)
> > + * - Unpin and cast window context out of cache
> > + *
> > + * Besides the hardware, kernel has some bookkeeping of course.
> > + */
> > int vas_win_close(struct vas_window *window)
> > {
> > - return -1;
> > + uint64_t val;
> > +
> > + if (!window)
> > + return 0;
> > +
> > + if (!window->tx_win && atomic_read(&window->num_txwins) != 0) {
> > + pr_devel("VAS: Attempting to close an active Rx window!\n");
> > + WARN_ON_ONCE(1);
> > + return -EAGAIN;
>
> EAGAIN means "if you do the same thing again it might work".
>
> I don't think that's right here. The window is not in a state where it
> can be freed, the caller needs to do something to fix that.
>
> EBUSY would probably be more appropriate.
Ok. Should not happen now (or even with the fast thread-wake up code)
since only the kernel should be closing the windows - so its really a
bug. Will change to EBUSY though.
>
>
> cheers
Michael Ellerman [[email protected]] wrote:
> Hi Suka,
>
> A few more things ...
>
> Sukadev Bhattiprolu <[email protected]> writes:
>
> > diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
> > new file mode 100644
> > index 0000000..7783bb8
> > --- /dev/null
> > +++ b/arch/powerpc/platforms/powernv/copy-paste.h
> > @@ -0,0 +1,74 @@
> > +/*
> > + * Copyright 2016 IBM Corp.
> > + *
> > + * This program is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU General Public License
> > + * as published by the Free Software Foundation; either version
> > + * 2 of the License, or (at your option) any later version.
> > + */
> > +
> > +/*
> > + * Macros taken from tools/testing/selftests/powerpc/context_switch/cp_abort.c
> > + */
>
> These are both out of date, they're changed in v3.0B.
>
> > +#define PASTE(RA, RB, L, RC) \
> > + .long (0x7c00070c | (RA) << (31-15) | (RB) << (31-20) \
> > + | (L) << (31-10) | (RC) << (31-31))
>
> You should define PPC_PASTE() in ppc-opcode.h
>
> We already have PPC_INST_PASTE, so use that.
>
> L and RC are gone.
Ok. I thought they would come back later, but of course we can update
these kernel-only calls then.
>
> > +
> > +#define COPY(RA, RB, L) \
> > + .long (0x7c00060c | (RA) << (31-15) | (RB) << (31-20) \
> > + | (L) << (31-10))
>
> Use PPC_COPY().
>
Ok
> > +
> > +#define CR0_FXM "0x80"
>
> I don't think a #define for this helps readability.
>
> > +#define CR0_SHIFT 28
> > +#define CR0_MASK 0xF
>
> Not used.
Will need them now to return value in cr0?
>
> > +/*
> > + * Copy/paste instructions:
> > + *
> > + * copy RA,RB,L
> > + * Copy contents of address (RA) + effective_address(RB)
> > + * to internal copy-buffer.
> > + *
> > + * L == 1 indicates this is the first copy.
> > + *
> > + * L == 0 indicates its a continuation of a prior first copy.
> > + *
> > + * paste RA,RB,L
> > + * Paste contents of internal copy-buffer to the address
> > + * (RA) + effective_address(RB)
> > + *
> > + * L == 0 indicates its a continuation of a prior paste. i.e.
> > + * don't wait for the completion or update status.
> > + *
> > + * L == 1 indicates this is the last paste in the group (i.e.
> > + * wait for the group to complete and update status in CR0).
> > + *
> > + * For Power9, the L bit must be 'true' in both copy and paste.
> > + */
> > +
> > +static inline int vas_copy(void *crb, int offset, int first)
> > +{
> > + WARN_ON_ONCE(!first);
>
> Please change the API to not require unused parameters.
>
> Same for offset.
Ok, Haren's NX patches will need to drop those parameters as well.
>
> > +
> > + __asm__ __volatile(stringify_in_c(COPY(%0, %1, %2))";"
>
> I've never seen __volatile before.
>
> Just use: asm volatile
ok
>
>
> > + :
> > + : "b" (offset), "b" (crb), "i" (1)
> > + : "memory");
> > +
> > + return 0;
> > +}
> > +
> > +static inline int vas_paste(void *paste_address, int offset, int last)
> > +{
> > + unsigned long long cr;
>
> cr is 32-bits actually.
ok
>
> > + WARN_ON_ONCE(!last);
> > +
> > + cr = 0;
> > + __asm__ __volatile(stringify_in_c(PASTE(%1, %2, 1, 1))";"
> > + "mfocrf %0," CR0_FXM ";"
> > + : "=r" (cr)
> > + : "b" (paste_address), "b" (offset)
> > + : "memory");
>
> You need cr0 in the clobbers.
ok
>
> > +
> > + return cr;
>
> I think it would be more natural if you just returned CR0, so if you did
> shift and mask with the CR0 constants you have above.
>
ok
>
> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
> > index 70762c3..73081b4 100644
> > --- a/arch/powerpc/platforms/powernv/vas-window.c
> > +++ b/arch/powerpc/platforms/powernv/vas-window.c
> > @@ -1040,6 +1041,57 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
> > }
> > EXPORT_SYMBOL_GPL(vas_tx_win_open);
> >
> > +int vas_copy_crb(void *crb, int offset, bool first)
> > +{
> > + if (!vas_initialized())
> > + return -1;
> > +
> > + return vas_copy(crb, offset, first);
> > +}
> > +EXPORT_SYMBOL_GPL(vas_copy_crb);
> > +
> > +#define RMA_LSMP_REPORT_ENABLE PPC_BIT(53)
> > +int vas_paste_crb(struct vas_window *txwin, int offset, bool last, bool re)
> > +{
> > + int rc;
> > + uint64_t val;
> > + void *addr;
> > +
> > + if (!vas_initialized())
> > + return -1;
>
> This is in the fast path, or at least the runtime path. So I don't think
> these checks are wanted, how would we have got this far if vas wasn't
> initialised?
Yes, I have dropped vas_initialized() now.
>
>
>
> cheers
Sukadev Bhattiprolu <[email protected]> writes:
> Michael Ellerman [[email protected]] wrote:
>> Sukadev Bhattiprolu <[email protected]> writes:
>> > diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
>> > index 2dd4b63..24288dd 100644
>> > --- a/arch/powerpc/platforms/powernv/vas-window.c
>> > +++ b/arch/powerpc/platforms/powernv/vas-window.c
>> > @@ -879,11 +887,92 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
>> > }
>> > EXPORT_SYMBOL_GPL(vas_rx_win_open);
>> >
>> > -/* stub for now */
>> > +static void poll_window_busy_state(struct vas_window *window)
>> > +{
>> > + int busy;
>> > + uint64_t val;
>> > +
>> > +retry:
>> > + /*
>> > + * Poll Window Busy flag
>> > + */
>> > + val = read_hvwc_reg(window, VREG(WIN_STATUS));
>> > + busy = GET_FIELD(VAS_WIN_BUSY, val);
>> > + if (busy) {
>> > + val = 0;
>> > + schedule_timeout(2000);
>>
>> What's 2000?
>>
>> That's in jiffies, so it's not a fixed amount of time.
>>
>> But on a typical config that will be 20 _seconds_ ?!
>
> Ok. Should I change to that just HZ and
Does the hardware give us any idea how long we need to wait?
HZ is better I guess but it's still a long time.
>> But you haven't set the task state, so AFAIK it will just return
>> instantly.
>
> call set_current_state(TASK_UNINTERRUPTIBLE) before the schedule_timeout()?
Yes.
>> > + val = read_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL));
>> > + cached = GET_FIELD(VAS_WIN_CACHE_STATUS, val);
>> > + if (cached) {
>> > + val = 0ULL;
>> > + val = SET_FIELD(VAS_CASTOUT_REQ, val, 1);
>> > + val = SET_FIELD(VAS_PUSH_TO_MEM, val, 0);
>> > + write_hvwc_reg(window, VREG(WIN_CTX_CACHING_CTL), val);
>>
>> Sigh, I still don't like that macro :)
>
> :-) For one thing, I have used it a lot now and secondly isn't it easier
> to know that VAS_CASTOUT_REQ bit is set to 1 without worrying about its
> bit position?
Yeah I guess it depends what you're doing. Are you comparing the code to
the spec, where seeing the name is probably what you want?
Or are you debugging the code running on hardware, where you have a dump
of register values or a trace etc. and you don't have any names just hex
values.
At least in my experience I spend more time doing the latter, so being
able to match the hex values in the code up with values I see in memory
or in registers is preferable.
But it's your code so I'll stop whining about that macro :)
cheers
Sukadev Bhattiprolu <[email protected]> writes:
> Michael Ellerman [[email protected]] wrote:
>> Hi Suka,
>>
>> A few more things ...
>>
>> Sukadev Bhattiprolu <[email protected]> writes:
>>
>> > diff --git a/arch/powerpc/platforms/powernv/copy-paste.h b/arch/powerpc/platforms/powernv/copy-paste.h
>> > new file mode 100644
>> > index 0000000..7783bb8
>> > --- /dev/null
>> > +++ b/arch/powerpc/platforms/powernv/copy-paste.h
>> > @@ -0,0 +1,74 @@
>> > +/*
>> > + * Copyright 2016 IBM Corp.
>> > + *
>> > + * This program is free software; you can redistribute it and/or
>> > + * modify it under the terms of the GNU General Public License
>> > + * as published by the Free Software Foundation; either version
>> > + * 2 of the License, or (at your option) any later version.
>> > + */
>> > +
>> > +/*
>> > + * Macros taken from tools/testing/selftests/powerpc/context_switch/cp_abort.c
>> > + */
>>
>> These are both out of date, they're changed in v3.0B.
>>
>> > +#define PASTE(RA, RB, L, RC) \
>> > + .long (0x7c00070c | (RA) << (31-15) | (RB) << (31-20) \
>> > + | (L) << (31-10) | (RC) << (31-31))
>>
>> You should define PPC_PASTE() in ppc-opcode.h
>>
>> We already have PPC_INST_PASTE, so use that.
>>
>> L and RC are gone.
>
> Ok. I thought they would come back later, but of course we can update
> these kernel-only calls then.
Possible, but if they do we can update them then.
>> > +#define CR0_SHIFT 28
>> > +#define CR0_MASK 0xF
>>
>> Not used.
>
> Will need them now to return value in cr0?
Yes.
>> > +/*
>> > + * Copy/paste instructions:
>> > + *
>> > + * copy RA,RB,L
>> > + * Copy contents of address (RA) + effective_address(RB)
>> > + * to internal copy-buffer.
>> > + *
>> > + * L == 1 indicates this is the first copy.
>> > + *
>> > + * L == 0 indicates its a continuation of a prior first copy.
>> > + *
>> > + * paste RA,RB,L
>> > + * Paste contents of internal copy-buffer to the address
>> > + * (RA) + effective_address(RB)
>> > + *
>> > + * L == 0 indicates its a continuation of a prior paste. i.e.
>> > + * don't wait for the completion or update status.
>> > + *
>> > + * L == 1 indicates this is the last paste in the group (i.e.
>> > + * wait for the group to complete and update status in CR0).
>> > + *
>> > + * For Power9, the L bit must be 'true' in both copy and paste.
>> > + */
>> > +
>> > +static inline int vas_copy(void *crb, int offset, int first)
>> > +{
>> > + WARN_ON_ONCE(!first);
>>
>> Please change the API to not require unused parameters.
>>
>> Same for offset.
>
> Ok, Haren's NX patches will need to drop those parameters as well.
That's fine, I'm merging them all via my tree. I can fix that up.
cheers