From: Dongwon Kim <dongwon.kim@intel.com>
To: linux-kernel@vger.kernel.org, linaro-mm-sig@lists.linaro.org, xen-devel@lists.xenproject.org
Cc: dri-devel@lists.freedesktop.org, dongwon.kim@intel.com, mateuszx.potrola@intel.com, sumit.semwal@linaro.org
Subject: [RFC PATCH v2 2/9] hyper_dmabuf: architecture specification and reference guide
Date: Tue, 13 Feb 2018 17:50:01 -0800
Message-Id: <20180214015008.9513-3-dongwon.kim@intel.com>
X-Mailer: git-send-email 2.16.1
In-Reply-To: <20180214015008.9513-1-dongwon.kim@intel.com>
References: <20180214015008.9513-1-dongwon.kim@intel.com>

Reference document for the hyper_DMABUF driver

Documentation/hyper-dmabuf-sharing.txt

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
---
 Documentation/hyper-dmabuf-sharing.txt | 734 +++++++++++++++++++++++++++++++++
 1 file changed, 734 insertions(+)
 create mode 100644 Documentation/hyper-dmabuf-sharing.txt

diff --git a/Documentation/hyper-dmabuf-sharing.txt b/Documentation/hyper-dmabuf-sharing.txt
new file mode 100644
index 000000000000..928e411931e3
--- /dev/null
+++ b/Documentation/hyper-dmabuf-sharing.txt
@@ -0,0 +1,734 @@

Linux Hyper DMABUF Driver

------------------------------------------------------------------------------
Section 1. Overview
------------------------------------------------------------------------------

The Hyper_DMABUF driver is a Linux device driver that runs on multiple Virtual Machines (VMs) and expands DMA-BUF sharing capability to the VM environment, where multiple different OS instances need to share the same physical data without copying it across VMs.

To share a DMA_BUF across VMs, an instance of the Hyper_DMABUF driver on the exporting VM (the so-called "exporter") imports a local DMA_BUF from the original producer of the buffer, then re-exports it to the importing VM (the so-called "importer") with a unique ID, hyper_dmabuf_id, for the buffer.

Another instance of the Hyper_DMABUF driver on the importer registers the hyper_dmabuf_id, together with reference information for the shared physical pages associated with the DMA_BUF, in its database when the export happens.

The actual mapping of the DMA_BUF on the importer's side is done by the Hyper_DMABUF driver when user space issues the IOCTL command to access the shared DMA_BUF. The Hyper_DMABUF driver works as both an importing and an exporting driver as is; that is, no special configuration is required. Consequently, only a single module per VM is needed to enable cross-VM DMA_BUF exchange.

------------------------------------------------------------------------------
Section 2. Architecture
------------------------------------------------------------------------------

1. Hyper_DMABUF ID

hyper_dmabuf_id is a global handle for shared DMA BUFs, which is valid across VMs. It is the key used by the importer to retrieve, from the IMPORT list, information about the shared kernel pages behind the DMA_BUF structure. When a DMA_BUF is exported to another domain, its hyper_dmabuf_id and metadata are also kept in the EXPORT list by the exporter for further synchronization of control over the DMA_BUF.

hyper_dmabuf_id is "targeted", meaning it is valid only in the exporting VM (owner of the buffer) and the importing VM, where the corresponding hyper_dmabuf_id is stored in their databases (EXPORT and IMPORT lists).

A user-space application specifies the targeted VM id in the user parameter when it calls the IOCTL command to export a shared DMA_BUF to another VM.

hyper_dmabuf_id_t is the data type for hyper_dmabuf_id. It is defined as a 16-byte data structure containing id and rng_key[3] as its elements:

typedef struct {
    int id;
    int rng_key[3]; /* 12-byte random number */
} hyper_dmabuf_id_t;

The first element of the hyper_dmabuf_id structure, int id, combines a count number generated by the driver running on the exporter with the exporter's VM ID. The VM's ID is a one-byte value located in the most significant byte (MSB) of int id. The remaining three bytes of int id are reserved for the count number.

However, there is a limit related to this count number, which is 1000. Therefore, only a little more than one byte starting from the LSB is actually used for storing this count number.

#define HYPER_DMABUF_ID_CREATE(domid, id) \
        ((((domid) & 0xFF) << 24) | ((id) & 0xFFFFFF))

This limit on the count number directly determines the maximum number of DMA BUFs that can be shared simultaneously by one VM. The second element of hyper_dmabuf_id, int rng_key[3], is an array of three integers. These numbers are generated by Linux's native random number generation mechanism. This field is added to enhance the security of the Hyper DMABUF driver by maximizing the entropy of hyper_dmabuf_id (that is, preventing it from being guessed by an attacker).

Once a DMA_BUF is no longer shared, the hyper_dmabuf_id associated with it is released, but the count number in hyper_dmabuf_id is saved in the ID list for reuse. However, the random keys stored in int rng_key[3] are not reused. Instead, they are always refilled with freshly generated random keys for security.
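To make the bit layout concrete, here is a small, self-contained sketch that packs and unpacks an id using the HYPER_DMABUF_ID_CREATE macro quoted above. The two inverse macros and the sample values are illustrative assumptions added for this example; they are not part of the driver's API.

/* Illustrative sketch only: HYPER_DMABUF_ID_CREATE is quoted from the text
 * above; the two inverse macros are hypothetical helpers for this example.
 */
#include <stdio.h>

#define HYPER_DMABUF_ID_CREATE(domid, id) \
        ((((domid) & 0xFF) << 24) | ((id) & 0xFFFFFF))

/* hypothetical inverse helpers */
#define HYPER_DMABUF_ID_DOMID(hid) (((hid) >> 24) & 0xFF)  /* exporter VM id */
#define HYPER_DMABUF_ID_COUNT(hid) ((hid) & 0xFFFFFF)      /* per-VM count   */

int main(void)
{
    int domid = 2;   /* exporting VM, e.g. Xen domain 2    */
    int count = 17;  /* driver-generated count, below 1000 */
    int id = HYPER_DMABUF_ID_CREATE(domid, count);

    printf("id=0x%08x domid=%d count=%d\n",
           id, HYPER_DMABUF_ID_DOMID(id), HYPER_DMABUF_ID_COUNT(id));
    return 0;
}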

2. IOCTLs

a. IOCTL_HYPER_DMABUF_TX_CH_SETUP

This type of IOCTL is used to initialize a one-directional transmit communication channel with a remote domain.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_tx_ch_setup {
    /* IN parameters */
    /* Remote domain id */
    int remote_domain;
};

b. IOCTL_HYPER_DMABUF_RX_CH_SETUP

This type of IOCTL is used to initialize a one-directional receive communication channel with a remote domain.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_rx_ch_setup {
    /* IN parameters */
    /* Source domain id */
    int source_domain;
};

c. IOCTL_HYPER_DMABUF_EXPORT_REMOTE

This type of IOCTL is used to export a DMA_BUF to another VM. When a user space application makes this call, the driver extracts the kernel pages associated with the DMA_BUF and shares them with the importing VM.

All reference information for these shared pages and the hyper_dmabuf_id is created, then passed to the importing domain through a communication channel for synchronous registration. In the meantime, the hyper_dmabuf_id for the shared DMA_BUF is also returned to the user-space application.

This IOCTL can accept a reference to "user-defined" data as well as an FD for the DMA_BUF. This private data is then attached to the DMA_BUF and exported together with it. More details regarding this private data can be found in the "Hyper_DMABUF Private Data" subsection below.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_export_remote {
    /* IN parameters */
    /* DMA buf fd to be exported */
    int dmabuf_fd;
    /* Domain id to which buffer should be exported */
    int remote_domain;
    /* exported dma buf id */
    hyper_dmabuf_id_t hid;
    /* size of private data */
    int sz_priv;
    /* ptr to the private data for Hyper_DMABUF */
    char *priv;
};

d. IOCTL_HYPER_DMABUF_EXPORT_FD

The importing VM uses this IOCTL to import and re-export a shared DMA_BUF locally to the end-consumer using the standard Linux DMA_BUF framework. Upon the IOCTL call, the Hyper_DMABUF driver finds the reference information of the shared DMA_BUF with the given hyper_dmabuf_id, then maps all shared pages in its own kernel space. The driver then constructs a scatter-gather list with those mapped pages and creates a brand-new DMA_BUF with the list, which is eventually exported with a file descriptor to the local consumer.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_export_fd {
    /* IN parameters */
    /* hyper dmabuf id to be imported */
    int hyper_dmabuf_id;
    /* flags */
    int flags;
    /* OUT parameters */
    /* exported dma buf fd */
    int fd;
};

e. IOCTL_HYPER_DMABUF_UNEXPORT

This type of IOCTL is used when it is necessary to terminate the current sharing of a DMA_BUF. When called, the driver first checks if there are any consumers actively using the DMA_BUF. It unexports the buffer immediately if it is not mapped or used by any consumer. Otherwise, it postpones unexporting but marks the buffer invalid to prevent any further import of the same DMA_BUF. The DMA_BUF is completely unexported after the last consumer releases it.

"Unexport" means removing all reference information about the DMA_BUF from the lists and making all pages private again.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_unexport {
    /* IN parameters */
    /* hyper dmabuf id to be unexported */
    int hyper_dmabuf_id;
    /* delay in ms by which unexport processing will be postponed */
    int delay_ms;
    /* OUT parameters */
    /* Status of request */
    int status;
};

f. IOCTL_HYPER_DMABUF_QUERY

This IOCTL is used to retrieve specific information about a DMA_BUF that is being shared.

The user space argument for this type of IOCTL is defined as:

struct ioctl_hyper_dmabuf_query {
    /* IN parameters */
    /* hyper dmabuf id to be queried */
    int hyper_dmabuf_id;
    /* item to be queried */
    int item;
    /* OUT parameters */
    /* output of query */
    /* info can be either value or reference */
    unsigned long info;
};

The following query items are supported:

HYPER_DMABUF_QUERY_TYPE
  - Return the type of DMA_BUF from the current domain, Exported or Imported.

HYPER_DMABUF_QUERY_EXPORTER
  - Return the exporting domain's ID of a shared DMA_BUF.

HYPER_DMABUF_QUERY_IMPORTER
  - Return the importing domain's ID of a shared DMA_BUF.

HYPER_DMABUF_QUERY_SIZE
  - Return the size of a shared DMA_BUF in bytes.

HYPER_DMABUF_QUERY_BUSY
  - Return 'true' if a shared DMA_BUF is currently used (mapped by the end-consumer).

HYPER_DMABUF_QUERY_UNEXPORTED
  - Return 'true' if a shared DMA_BUF is no longer valid (so it does not allow a new consumer to map it).

HYPER_DMABUF_QUERY_DELAYED_UNEXPORTED
  - Return 'true' if a shared DMA_BUF is scheduled to be unexported (but is still valid) within a fixed time.

HYPER_DMABUF_QUERY_PRIV_INFO
  - Return the 'private' data attached to the shared DMA_BUF to user space. 'unsigned long info' is the user space pointer to the buffer where the private data will be copied.

HYPER_DMABUF_QUERY_PRIV_INFO_SIZE
  - Return the size of the private data attached to the shared DMA_BUF.
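To make the export call sequence concrete, below is a minimal user-space sketch that sets up the transmit channel and exports a buffer to a remote domain. The UAPI header name, the device node path, and the error handling are illustrative assumptions, not taken from this document; the IOCTL names and structures are the ones defined above.

/* Illustrative sketch only: header and device node names are assumptions. */
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/hyper_dmabuf.h>  /* assumed UAPI header with the IOCTLs above */

int export_to_vm(int dmabuf_fd, int remote_domid,
                 char *priv, int priv_len, hyper_dmabuf_id_t *out_hid)
{
    int hy = open("/dev/hyper_dmabuf", O_RDWR);  /* assumed device node */
    struct ioctl_hyper_dmabuf_tx_ch_setup tx = {
        .remote_domain = remote_domid,
    };
    struct ioctl_hyper_dmabuf_export_remote exp = {
        .dmabuf_fd     = dmabuf_fd,
        .remote_domain = remote_domid,
        .sz_priv       = priv_len,
        .priv          = priv,
    };

    if (hy < 0)
        return -1;
    /* one-time setup of the transmit channel to the remote domain */
    if (ioctl(hy, IOCTL_HYPER_DMABUF_TX_CH_SETUP, &tx) < 0)
        goto err;
    /* share the pages and get the hyper_dmabuf_id back in exp.hid */
    if (ioctl(hy, IOCTL_HYPER_DMABUF_EXPORT_REMOTE, &exp) < 0)
        goto err;
    *out_hid = exp.hid;
    close(hy);
    return 0;
err:
    close(hy);
    return -1;
}

On the importing VM, the mirror-image sequence would be IOCTL_HYPER_DMABUF_RX_CH_SETUP once, followed by IOCTL_HYPER_DMABUF_EXPORT_FD with the received hyper_dmabuf_id.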

3. Event Polling

Event polling can optionally be enabled by selecting the kernel config option "Enable event-generation and polling operation" under xen/hypervisor in the kernel's menuconfig. The event-polling mechanism covers the generation of an import event, adding it to the event queue, and notifying the application so that it can retrieve the event data from the queue.

For this mechanism, "Poll" and "Read" operations are added to the Hyper_DMABUF driver. A user application that polls the driver goes into a sleep state until a new event is added to the queue. An application uses "Read" to retrieve event data from the event queue. Event data contains the hyper_dmabuf_id and the private data of the buffer that has been received by the importer.

For more information on private data, refer to "4. Hyper_DMABUF Private Data" below. Using this method, it is possible to lower the risk of the hyper_dmabuf_id and other sensitive information about the shared buffer (for example, metadata for shared images) being leaked while being transferred to the importer, because all of this data is shared as "private info" at the driver level. However, please note that there must still be a way for the importer to find the correct DMA_BUF when multiple Hyper_DMABUFs are being shared simultaneously. For example, in a surface-sharing use-case, the surface name or the surface ID of a specific rendering surface needs to be sent to the importer in advance, before it is exported.

Each event passed to user space consists of a header and the private information of the buffer. The data types are defined as follows:

struct hyper_dmabuf_event_hdr {
    int event_type; /* one type only for now - new import */
    hyper_dmabuf_id_t hid; /* hyper_dmabuf_id of specific hyper_dmabuf */
    int size; /* size of data */
};

struct hyper_dmabuf_event_data {
    struct hyper_dmabuf_event_hdr hdr;
    void *data; /* private data */
};
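For illustration, a consumer of these events might look like the sketch below. The device node handling, the buffer sizing, and the assumption that "Read" returns a hyper_dmabuf_event_hdr followed by 'size' bytes of private data are illustrative and not guaranteed by this document.

/* Illustrative sketch only: the event layout returned by read() is an
 * assumption based on the structures above.
 */
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/hyper_dmabuf.h>  /* assumed UAPI header with the structs above */

#define MAX_EVENT_PRIV 256  /* illustrative upper bound for private data */

void wait_for_import_event(int hy_fd)
{
    struct pollfd pfd = { .fd = hy_fd, .events = POLLIN };
    char buf[sizeof(struct hyper_dmabuf_event_hdr) + MAX_EVENT_PRIV];

    /* sleep until the driver queues a new import event */
    if (poll(&pfd, 1, -1) <= 0 || !(pfd.revents & POLLIN))
        return;

    if (read(hy_fd, buf, sizeof(buf)) > 0) {
        struct hyper_dmabuf_event_hdr *hdr = (void *)buf;
        char *priv = buf + sizeof(*hdr);  /* hdr->size bytes of private data */

        printf("new import event: %d bytes of private data\n", hdr->size);
        (void)priv;  /* interpretation of the private data is user-defined */
    }
}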

4. Hyper_DMABUF Private Data

Each Hyper_DMABUF can come with private data, the size of which can be up to MAX_SIZE_PRIV_DATA (currently 192 bytes). This private data is just a chunk of plain data attached to every Hyper_DMABUF, and it is guaranteed to be synchronized across VMs, exporter and importer. The private data does not have any specific structure defined at the driver level, so any "user-defined" format or structure can be used. In addition, there is no dedicated use-case for this data; it can be used for virtually any purpose. For example, it can carry metadata such as dimensions and color formats for shared images in a surface-sharing model. Another example is sharing protected media content, where this private data can be used to transfer flags related to content protection information on streamed media to the importer.

Private data is initially generated when a buffer is exported for the first time. It is then updated whenever the same buffer is re-exported. During the re-exporting process, the Hyper_DMABUF driver only updates the private data on both sides with new data from user space, since the same buffer already exists on both the IMPORT LIST and the EXPORT LIST.

There are two different ways to retrieve this private data from user space. The first is to use "Read" on the Hyper_DMABUF driver; "Read" returns the data of events containing the private data of the buffer. The second is to make a query to Hyper_DMABUF. Two query items, HYPER_DMABUF_QUERY_PRIV_INFO and HYPER_DMABUF_QUERY_PRIV_INFO_SIZE, are available for retrieving the private data and its size.
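As a sketch of the second method, a user-space application could first query the size and then the data itself. The request code and structures are assumed to come from the driver's UAPI header; the helper name and error handling are illustrative.

/* Illustrative sketch only. Returns a malloc'ed copy of the private data. */
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/hyper_dmabuf.h>  /* assumed UAPI header */

char *get_priv_data(int hy_fd, int hid, int *out_size)
{
    struct ioctl_hyper_dmabuf_query q = {
        .hyper_dmabuf_id = hid,
        .item = HYPER_DMABUF_QUERY_PRIV_INFO_SIZE,
    };
    char *buf;

    if (ioctl(hy_fd, IOCTL_HYPER_DMABUF_QUERY, &q) < 0)
        return NULL;

    *out_size = (int)q.info;      /* size is returned by value */
    buf = malloc(*out_size);
    if (!buf)
        return NULL;

    q.item = HYPER_DMABUF_QUERY_PRIV_INFO;
    q.info = (unsigned long)buf;  /* user pointer the data is copied into */
    if (ioctl(hy_fd, IOCTL_HYPER_DMABUF_QUERY, &q) < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}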
5. Scatter-Gather List Table (SGT) Management

SGT management is the core part of the Hyper_DMABUF driver. It manages an SGT, a representation of the group of kernel pages associated with a DMA_BUF. This block includes four different sub-blocks:

a. Hyper_DMABUF ID Manager

The ID manager is responsible for generating a hyper_dmabuf_id for an exported DMA_BUF. When an ID is requested, the ID manager first checks if there are any reusable IDs left in the list and returns one of those, if available. Otherwise, it creates the next count number and returns it to the caller.

b. SGT Creator

The SGT (struct sg_table) contains information about the DMA_BUF, such as references to all kernel pages for the buffer and their connections. The SGT Creator creates a new SGT on the importer side from the pages shared by the hypervisor.

c. Kernel Page Extractor

The Page Extractor extracts pages from a given SGT before those pages are shared.

d. List Manager Interface

The SGT manager also interacts with the export and import list managers. It sends information (for example, hyper_dmabuf_id, reference, and DMA_BUF information) about the exported or imported DMA_BUFs to the list manager. Also, on an IOCTL request, it asks the list manager to find and return the information for the corresponding DMA_BUF in the list.

6. DMA-BUF Interface

The DMA-BUF interface provides standard methods to manage DMA_BUFs reconstructed by the Hyper_DMABUF driver from shared pages. All of the relevant operations are listed in struct dma_buf_ops. These operations are standard DMA_BUF operations, and therefore they follow the standard DMA-BUF protocols.

Each DMA_BUF operation communicates with the exporter at the end of the routine for "indirect DMA_BUF synchronization".

7. Export/Import List Management

Whenever a DMA_BUF is shared and exported, its information is added to the database (EXPORT list) on the exporting VM. Similarly, information about an imported DMA_BUF is added to the importing database (IMPORT list) on the importing VM when the export happens.

All of the entries in the lists are needed to manage the exported/imported DMA_BUFs efficiently. Both lists are implemented as Linux hash tables. The key into a list is the hyper_dmabuf_id and the output is the information of the DMA_BUF. The List Manager handles all requests from other blocks and all transactions within the lists to ensure that all entries are up-to-date and that the list structure is consistent.

The List Manager provides basic functionality, such as:

- Adding to the list
- Removing from the list
- Finding information about a DMA_BUF, given the hyper_dmabuf_id
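Since both lists are described as Linux hash tables keyed by hyper_dmabuf_id, a minimal sketch of what an EXPORT-list entry and lookup could look like is given below. The entry layout, the names, and the choice of the integer id as hash key are illustrative assumptions, not the driver's actual data structures.

/* Illustrative sketch only: not the driver's real list code. */
#include <linux/dma-buf.h>
#include <linux/hashtable.h>
#include <linux/string.h>
/* hyper_dmabuf_id_t is assumed to come from the driver's headers */

#define EXPORT_LIST_BITS 7

static DEFINE_HASHTABLE(export_list, EXPORT_LIST_BITS);

struct export_entry {
    hyper_dmabuf_id_t hid;    /* full id, including rng_key[] */
    struct dma_buf *dma_buf;  /* exported buffer */
    int remote_domid;         /* importing VM */
    struct hlist_node node;
};

static void export_list_add(struct export_entry *e)
{
    /* hash on the integer part of the id */
    hash_add(export_list, &e->node, e->hid.id);
}

static struct export_entry *export_list_find(hyper_dmabuf_id_t hid)
{
    struct export_entry *e;

    hash_for_each_possible(export_list, e, node, hid.id)
        if (e->hid.id == hid.id &&
            !memcmp(e->hid.rng_key, hid.rng_key, sizeof(e->hid.rng_key)))
            return e;
    return NULL;
}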
8. Page Sharing by Hypercalls

The Hyper_DMABUF driver assumes that a native page-by-page memory sharing mechanism is available on the hypervisor. Referencing a group of pages that are being shared is what the driver expects from the "backend" APIs or the hypervisor itself.

For example, the Xen backend integrated in the current code base utilizes Xen's grant-table interface for sharing the underlying kernel pages (struct page *).

More details about the grant-table interface can be found at the following locations:

https://wiki.xen.org/wiki/Grant_Table
https://xenbits.xen.org/docs/4.6-testing/misc/grant-tables.txt

9. Message Handling

The exporter and importer can each create a message that consists of an opcode (command) and operands (parameters) and send it to the other side.

The message format is defined as:

struct hyper_dmabuf_req {
    unsigned int req_id; /* Sequence number. Used for RING BUF synchronization */
    unsigned int stat;   /* Status. Response from receiver. */
    unsigned int cmd;    /* Opcode */
    unsigned int op[MAX_NUMBER_OF_OPERANDS]; /* Operands */
};

The following table gives the list of opcodes:

HYPER_DMABUF_EXPORT (exporter --> importer)
  - Export a DMA_BUF to the importer. The importer registers the corresponding DMA_BUF in its IMPORT LIST when the message is received.

HYPER_DMABUF_EXPORT_FD (importer --> exporter)
  - Locally exported as FD. The importer sends this command to the exporter to notify it that the buffer is now locally exported (mapped and used).

HYPER_DMABUF_EXPORT_FD_FAILED (importer --> exporter)
  - Failed while exporting locally. The importer sends this command to notify the exporter that the EXPORT_FD failed.

HYPER_DMABUF_NOTIFY_UNEXPORT (exporter --> importer)
  - Termination of sharing. The exporter notifies the importer that the DMA_BUF has been unexported.

HYPER_DMABUF_OPS_TO_REMOTE (exporter --> importer)
  - Not implemented yet.

HYPER_DMABUF_OPS_TO_SOURCE (importer --> exporter)
  - DMA_BUF ops to the exporter, for DMA_BUF upstream synchronization.
    Note: implemented, but done asynchronously due to performance issues.

The following table shows the list of operands for each opcode:

- HYPER_DMABUF_EXPORT

op0 to op3 – hyper_dmabuf_id
op4 – number of pages to be shared
op5 – offset of data in the first page
op6 – length of data in the last page
op7 – reference number for the group of shared pages
op8 – size of private data
op9 to (op9+op8) – private data

- HYPER_DMABUF_EXPORT_FD

op0 to op3 – hyper_dmabuf_id

- HYPER_DMABUF_EXPORT_FD_FAILED

op0 to op3 – hyper_dmabuf_id

- HYPER_DMABUF_NOTIFY_UNEXPORT

op0 to op3 – hyper_dmabuf_id

- HYPER_DMABUF_OPS_TO_REMOTE (not implemented)

- HYPER_DMABUF_OPS_TO_SOURCE

op0 to op3 – hyper_dmabuf_id
op4 – type of DMA_BUF operation
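To illustrate how the operand layout above maps onto struct hyper_dmabuf_req, the sketch below fills in a HYPER_DMABUF_EXPORT message. The helper name is an assumption; the structure, opcode, and operand order are the ones listed above, and the op[] array is assumed to be sized (MAX_NUMBER_OF_OPERANDS) for nine operands plus the private data.

/* Illustrative sketch only: packing the HYPER_DMABUF_EXPORT operands.
 * struct hyper_dmabuf_req, hyper_dmabuf_id_t and HYPER_DMABUF_EXPORT are
 * assumed to come from the driver's headers.
 */
#include <linux/string.h>

static void fill_export_req(struct hyper_dmabuf_req *req,
                            hyper_dmabuf_id_t hid, int nents,
                            int first_ofst, int last_len, int grefid,
                            const void *priv, int sz_priv)
{
    req->cmd = HYPER_DMABUF_EXPORT;

    /* op0..op3: hyper_dmabuf_id (id plus the 3-word random key) */
    req->op[0] = hid.id;
    memcpy(&req->op[1], hid.rng_key, sizeof(hid.rng_key));

    req->op[4] = nents;       /* number of pages to be shared        */
    req->op[5] = first_ofst;  /* offset of data in the first page    */
    req->op[6] = last_len;    /* length of data in the last page     */
    req->op[7] = grefid;      /* reference number for the page group */
    req->op[8] = sz_priv;     /* size of private data                */

    /* op9 onward: the private data itself */
    memcpy(&req->op[9], priv, sz_priv);
}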
10. Inter VM (Domain) Communication

Two different types of inter-domain communication channels are required, one in kernel space and the other in user space. The communication channel in user space is for transmitting or receiving the hyper_dmabuf_id. Since there is no specific security (for example, encryption) involved in the generation of the global id at the driver level, it is highly recommended that the user application set up a very secure channel for exchanging hyper_dmabuf_id between VMs.

The communication channel in kernel space is required for exchanging messages from the "message management" block between the two VMs. In the current reference backend for the Xen hypervisor, Xen ring-buffer and event-channel mechanisms are used for message exchange between the importer and the exporter.

11. What is required from the hypervisor

Memory sharing and message communication between VMs.

------------------------------------------------------------------------------
Section 3. Hyper DMABUF Sharing Flow
------------------------------------------------------------------------------

1. Exporting

To export a DMA_BUF to another VM, user space has to call an IOCTL (IOCTL_HYPER_DMABUF_EXPORT_REMOTE) with a file descriptor for the buffer given by the original exporter. The Hyper_DMABUF driver maps the DMA_BUF locally, then issues a hyper_dmabuf_id and an SGT for the DMA_BUF, which is registered to the EXPORT list. Then, all pages for the SGT are extracted and each individual page is shared via a hypervisor-specific memory sharing mechanism (for example, in Xen this is the grant table).

One important requirement on this memory sharing method is that it needs to create a single integer value that represents the list of pages, which can then be used by the importer to retrieve the group of shared pages. For this, the "backend" in the reference driver utilizes a multiple-level addressing mechanism.

Once the integer reference to the list of pages is created, the exporter builds the "export" command and sends it to the importer, then notifies the importer.

2. Importing

The import process is divided into two parts. One is the registration of the DMA_BUF from the exporter. The other is the actual mapping of the buffer before accessing the data in the buffer. The former (termed "Registration") happens on an export event (that is, the export command with an interrupt) from the exporter.

The latter (termed "Mapping") is done asynchronously when the driver gets the IOCTL call from user space. When the importer gets an interrupt from the exporter, it checks the command in the receiving queue and, if it is an "export" command, the registration process is started. It first finds the hyper_dmabuf_id and the integer reference for the shared pages, then stores all of that information, together with the "domain id" of the exporting domain, in the IMPORT LIST.

In the case where "event-polling" is enabled (kernel config: Enable event-generation and polling operation), a "new sharing available" event is generated right after the reference info for the newly shared DMA_BUF is registered to the IMPORT LIST. This event is added to the event queue.

The user process that polls the Hyper_DMABUF driver wakes up when this event queue is not empty and is able to read back event data from the queue using the driver's "Read" function. Once the user application calls the EXPORT_FD IOCTL with the proper parameters, including the hyper_dmabuf_id, the Hyper_DMABUF driver retrieves the information about the matching DMA_BUF from the IMPORT LIST. Then, it maps all the shared pages (referenced by the integer reference) in its kernel space and creates its own DMA_BUF referencing the same shared pages. After this, it exports this new DMA_BUF to other drivers with a file descriptor. The DMA_BUF can then be used in just the same way a local DMA_BUF is.

3. Indirect Synchronization of DMA_BUF

Synchronization of a DMA_BUF within a single OS is automatically achieved because all of the importer's DMA_BUF operations are done using functions defined on the exporter's side, which means there is one central place that has full control over the DMA_BUF. In other words, any primary activities, such as attaching/detaching and mapping/un-mapping, are all captured by the exporter, meaning that the exporter knows basic information such as who is using the DMA_BUF and how it is being used. This, however, is not applicable if the sharing is done beyond a single OS, because the kernel space where the exporter's DMA_BUF operations reside is simply not visible to the importing VM.

Therefore, "indirect synchronization" was introduced as an alternative solution, which is now implemented in the Hyper_DMABUF driver. This technique makes the exporter create a shadow DMA_BUF when the end-consumer of the buffer maps the DMA_BUF, and then duplicates any DMA_BUF operations performed on the importer's side. Through this "indirect synchronization", the exporter is able to virtually track all activities done by the consumer (mostly reference counting) as if they were done in the exporter's local system.
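As a conceptual sketch of how this duplication could look on the importer's side, the function below forwards a local attach to the exporter as a HYPER_DMABUF_OPS_TO_SOURCE message. The sync_request() helper, the op-type constant, and the imported_sgt_info bookkeeping structure are assumptions for illustration, not the driver's real symbols.

/* Illustrative sketch only: conceptual importer-side attach handler. */
#include <linux/dma-buf.h>
/* hyper_dmabuf_id_t is assumed to come from the driver's headers */

#define HYPER_DMABUF_OPS_ATTACH 1  /* hypothetical op-type operand (op4) */

struct imported_sgt_info {         /* hypothetical per-import bookkeeping */
    hyper_dmabuf_id_t hid;
    /* ... mapped pages, sg_table, etc. ... */
};

/* hypothetical helper that builds a hyper_dmabuf_req and sends it to the
 * exporting domain (asynchronously, as noted in the opcode table above)
 */
int sync_request(hyper_dmabuf_id_t hid, int dmabuf_op);

static int hyper_dmabuf_ops_attach(struct dma_buf *dmabuf,
                                   struct dma_buf_attachment *attach)
{
    struct imported_sgt_info *info = dmabuf->priv;  /* assumed priv layout */

    /* mirror the local attach on the exporter's shadow DMA_BUF */
    return sync_request(info->hid, HYPER_DMABUF_OPS_ATTACH);
}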
Through this “indirect synchronization”, the exporter is +able to virtually track all activities done by the consumer (mostly reference +counter) as if those are done in exporter’s local system. + +------------------------------------------------------------------------------ +Section 4. Hypervisor Backend Interface +------------------------------------------------------------------------------ + +The Hyper_DMABUF driver has a standard “Backend” structure that contains +mappings to various functions designed for a specific Hypervisor. Most of +these API functions should provide a low-level implementation of communication +and memory sharing capability that utilize a Hypervisor’s native mechanisms. + +struct hyper_dmabuf_backend_ops { + /* retreiving id of current virtual machine */ + int (*get_vm_id)(void); + /* get pages shared via hypervisor-specific method */ + int (*share_pages)(struct page **, int, int, void **); + /* make shared pages unshared via hypervisor specific method */ + int (*unshare_pages)(void **, int); + /* map remotely shared pages on importer's side via + * hypervisor-specific method + */ + struct page ** (*map_shared_pages)(int, int, int, void **); + /* unmap and free shared pages on importer's side via + * hypervisor-specific method + */ + int (*unmap_shared_pages)(void **, int); + /* initialize communication environment */ + int (*init_comm_env)(void); + /* destroy communication channel */ + void (*destroy_comm)(void); + /* upstream ch setup (receiving and responding) */ + int (*init_rx_ch)(int); + /* downstream ch setup (transmitting and parsing responses) */ + int (*init_tx_ch)(int); + /* send msg via communication ch */ + int (*send_req)(int, struct hyper_dmabuf_req *, int); +}; + + + +1. get_vm_id + + Returns the VM (domain) ID + + Input: + + -ID of the current domain + + Output: + + None + +2. share_pages + + Get pages shared via hypervisor-specific method and return one reference + ID that represents the complete list of shared pages + + Input: + + -Array of pages + -ID of importing VM + -Number of pages + -Hypervisor specific Representation of reference info of shared + pages + + Output: + + -Hypervisor specific integer value that represents all of + the shared pages + +3. unshare_pages + + Stop sharing pages + + Input: + + -Hypervisor specific Representation of reference info of shared + pages + -Number of shared pages + + Output: + + 0 + +4. map_shared_pages + + Map shared pages locally using a hypervisor-specific method + + Input: + + -Reference number that represents all of shared pages + -ID of exporting VM, Number of pages + -Reference information for any purpose + + Output: + + -An array of shared pages (struct page**) + +5. unmap_shared_pages + + Unmap shared pages + + Input: + + -Hypervisor specific Representation of reference info of shared pages + + Output: + + -0 (successful) or one of Standard Kernel errors + +6. init_comm_env + + Setup infrastructure needed for communication channel + + Input: + + None + + Output: + + None + +7. destroy_comm + + Cleanup everything done via init_comm_env + + Input: + + None + + Output: + + None + +8. init_rx_ch + + Configure receive channel + + Input: + + -ID of VM on the other side of the channel + + Output: + + -0 (successful) or one of Standard Kernel errors + +9. init_tx_ch + + Configure transmit channel + + Input: + + -ID of VM on the other side of the channel + + Output: + + -0 (success) or one of Standard Kernel errors + +10. 
1. get_vm_id

    Returns the VM (domain) ID.

    Input:

    None

    Output:

    -ID of the current domain

2. share_pages

    Get pages shared via a hypervisor-specific method and return one reference ID that represents the complete list of shared pages.

    Input:

    -Array of pages
    -ID of importing VM
    -Number of pages
    -Hypervisor-specific representation of reference info of shared pages

    Output:

    -Hypervisor-specific integer value that represents all of the shared pages

3. unshare_pages

    Stop sharing pages.

    Input:

    -Hypervisor-specific representation of reference info of shared pages
    -Number of shared pages

    Output:

    0

4. map_shared_pages

    Map shared pages locally using a hypervisor-specific method.

    Input:

    -Reference number that represents all of the shared pages
    -ID of exporting VM
    -Number of pages
    -Reference information for any purpose

    Output:

    -An array of shared pages (struct page **)

5. unmap_shared_pages

    Unmap shared pages.

    Input:

    -Hypervisor-specific representation of reference info of shared pages

    Output:

    -0 (success) or one of the standard kernel errors

6. init_comm_env

    Set up the infrastructure needed for the communication channel.

    Input:

    None

    Output:

    None

7. destroy_comm

    Clean up everything done via init_comm_env.

    Input:

    None

    Output:

    None

8. init_rx_ch

    Configure the receive channel.

    Input:

    -ID of VM on the other side of the channel

    Output:

    -0 (success) or one of the standard kernel errors

9. init_tx_ch

    Configure the transmit channel.

    Input:

    -ID of VM on the other side of the channel

    Output:

    -0 (success) or one of the standard kernel errors

10. send_req

    Send a message to the other VM.

    Input:

    -ID of VM that receives the message
    -Message

    Output:

    -0 (success) or one of the standard kernel errors

------------------------------------------------------------------------------
------------------------------------------------------------------------------
--
2.16.1