From: Kuppuswamy Sathyanarayanan
To: Peter Zijlstra, Andy Lutomirski, Dave Hansen, Tony Luck, Dan Williams
Cc: Andi Kleen, Kirill Shutemov, Kuppuswamy Sathyanarayanan, Raj Ashok, Sean Christopherson, linux-kernel@vger.kernel.org, Kuppuswamy Sathyanarayanan
Subject: [RFC v2-fix-v3 4/4] x86/tdx: Handle in-kernel MMIO
Date: Tue, 8 Jun 2021
 08:59:24 -0700
Message-Id: <9900bbbbd55ad107bf3d94c82aa879cfe84f8a73.1623167569.git.sathyanarayanan.kuppuswamy@linux.intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: "Kirill A. Shutemov"

In traditional VMs, MMIO is usually implemented by giving the guest
access to a mapping that causes a VMEXIT on access, after which the VMM
emulates the access. That is not possible in a TDX guest because a
VMEXIT would expose the register state to the host. TDX guests don't
trust the host and can't have their state exposed to it. In TDX, the
MMIO regions are instead configured to trigger a #VE exception in the
guest. The guest #VE handler then emulates the MMIO instruction inside
the guest and converts it into a controlled TDCALL to the host, rather
than completely exposing the state to the host.

Currently, we only support MMIO for instructions that are known to come
from the io.h macros (build_mmio_read/write()). Drivers that don't use
the io.h macros, or that use structure overlays to do MMIO, are
currently not supported in a TDX guest (for example, the MMIO-based
XAPIC is disabled at runtime for TDX).

This way of handling MMIO is similar to AMD SEV.

Reasons for supporting #VE-based MMIO in a TDX guest:

* MMIO is widely used and we'll have more drivers in the future.
* We don't want to annotate every TDX-specific MMIO readl/writel etc.
* If we didn't annotate, we would need to add an alternative to every
  MMIO access in the kernel (even though 99.9% will never be used on
  TDX), which would be a complete waste and incredible binary bloat
  for nothing.

Signed-off-by: Kirill A. Shutemov
Signed-off-by: Kuppuswamy Sathyanarayanan
---
 arch/x86/kernel/tdx.c | 109 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 48a0cc2663ea..053e69782e3d 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -5,6 +5,9 @@
 #include
 #include
+#include
+#include
+#include /* force_sig_fault() */
 #include
 #include
@@ -226,6 +229,104 @@ static void tdg_handle_io(struct pt_regs *regs, u32 exit_qual)
 	}
 }
 
+static unsigned long tdg_mmio(int size, bool write, unsigned long addr,
+		unsigned long *val)
+{
+	struct tdx_hypercall_output out = {0};
+	u64 err;
+
+	err = __tdx_hypercall(EXIT_REASON_EPT_VIOLATION, size, write,
+			      addr, *val, &out);
+	*val = out.r11;
+	return err;
+}
+
+static int tdg_handle_mmio(struct pt_regs *regs, struct ve_info *ve)
+{
+	struct insn insn = {};
+	char buffer[MAX_INSN_SIZE];
+	enum mmio_type mmio;
+	unsigned long *reg;
+	int size, ret;
+	u8 sign_byte;
+	unsigned long val;
+
+	if (user_mode(regs)) {
+		ret = insn_fetch_from_user(regs, buffer);
+		if (!ret)
+			return -EFAULT;
+		if (!insn_decode_from_regs(&insn, regs, buffer, ret))
+			return -EFAULT;
+	} else {
+		ret = copy_from_kernel_nofault(buffer, (void *)regs->ip,
+					       MAX_INSN_SIZE);
+		if (ret)
+			return -EFAULT;
+		insn_init(&insn, buffer, MAX_INSN_SIZE, 1);
+		insn_get_length(&insn);
+	}
+
+	mmio = insn_decode_mmio(&insn, &size);
+	if (mmio == MMIO_DECODE_FAILED)
+		return -EFAULT;
+
+	if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
+		reg = insn_get_modrm_reg_ptr(&insn, regs);
+		if (!reg)
+			return -EFAULT;
+	}
+
+	switch (mmio) {
+	case MMIO_WRITE:
+		memcpy(&val, reg, size);
+		ret = tdg_mmio(size, true, ve->gpa, &val);
+		break;
+	case MMIO_WRITE_IMM:
+		val = insn.immediate.value;
+		ret = tdg_mmio(size, true, ve->gpa, &val);
+		break;
+	case MMIO_READ:
+		ret = tdg_mmio(size, false, ve->gpa, &val);
+		if (ret)
+			break;
+		/* Zero-extend for 32-bit operation */
+		if (size == 4)
+			*reg = 0;
+		memcpy(reg, &val, size);
+		break;
+	case MMIO_READ_ZERO_EXTEND:
+		ret = tdg_mmio(size, false, ve->gpa, &val);
+		if (ret)
+			break;
+
+		/* Zero extend based on operand size */
+		memset(reg, 0, insn.opnd_bytes);
+		memcpy(reg, &val, size);
+		break;
+	case MMIO_READ_SIGN_EXTEND:
+		ret = tdg_mmio(size, false, ve->gpa, &val);
+		if (ret)
+			break;
+
+		if (size == 1)
+			sign_byte = (val & 0x80) ? 0xff : 0x00;
+		else
+			sign_byte = (val & 0x8000) ? 0xff : 0x00;
+
+		/* Sign extend based on operand size */
+		memset(reg, sign_byte, insn.opnd_bytes);
+		memcpy(reg, &val, size);
+		break;
+	case MMIO_MOVS:
+	case MMIO_DECODE_FAILED:
+		return -EFAULT;
+	}
+
+	if (ret)
+		return -EFAULT;
+	return insn.length;
+}
+
 unsigned long tdg_get_ve_info(struct ve_info *ve)
 {
 	u64 ret;
@@ -275,6 +376,14 @@ int tdg_handle_virtualization_exception(struct pt_regs *regs,
 	case EXIT_REASON_IO_INSTRUCTION:
 		tdg_handle_io(regs, ve->exit_qual);
 		break;
+	case EXIT_REASON_EPT_VIOLATION:
+		/* Currently only MMIO triggers EPT violation */
+		ve->instr_len = tdg_handle_mmio(regs, ve);
+		if (ve->instr_len < 0) {
+			pr_warn_once("MMIO failed\n");
+			return -EFAULT;
+		}
+		break;
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
 		return -EFAULT;
-- 
2.25.1
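
Illustration only, not part of the patch: the register write-back rules that tdg_handle_mmio() applies after an MMIO read (plain read, zero-extend, sign-extend) can be tried out in user space. The sketch below is a rough model under stated assumptions. The names mmio_writeback() and enum extend_kind are hypothetical stand-ins, not kernel identifiers. It operates on a 64-bit register image rather than struct pt_regs, and it assumes a little-endian host, as on x86. It copies the patch's sign-bit test as written (bit 7 for 1-byte reads, bit 15 otherwise).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the MMIO_READ* cases handled in the patch. */
enum extend_kind { EXT_NONE, EXT_ZERO, EXT_SIGN };

/*
 * Merge the low `size` bytes of `val` into the register image `reg`,
 * extending over `opnd_bytes` the way the patch's read cases do.
 * Assumes a little-endian host (true on x86).
 */
static uint64_t mmio_writeback(uint64_t reg, uint64_t val, int size,
			       int opnd_bytes, enum extend_kind kind)
{
	uint8_t sign_byte;

	assert(size > 0 && size <= 8 && opnd_bytes > 0 && opnd_bytes <= 8);

	switch (kind) {
	case EXT_NONE:
		/* A 32-bit read clears the whole register first. */
		if (size == 4)
			reg = 0;
		break;
	case EXT_ZERO:
		/* Clear the operand-sized part of the register. */
		memset(&reg, 0, opnd_bytes);
		break;
	case EXT_SIGN:
		/* Same sign-bit test as the patch: bit 7 or bit 15. */
		if (size == 1)
			sign_byte = (val & 0x80) ? 0xff : 0x00;
		else
			sign_byte = (val & 0x8000) ? 0xff : 0x00;
		memset(&reg, sign_byte, opnd_bytes);
		break;
	}

	/* The MMIO value always lands in the low `size` bytes. */
	memcpy(&reg, &val, size);
	return reg;
}
```

For example, mmio_writeback(0x1122334455667788ULL, 0x80, 1, 8, EXT_SIGN) yields 0xffffffffffffff80, while the same call with EXT_NONE replaces only the low byte and yields 0x1122334455667780.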