From: "Jorgen S. Hansen"
To: Stefan Hajnoczi
CC: Dexuan Cui, "davem@davemloft.net", "netdev@vger.kernel.org",
    "gregkh@linuxfoundation.org", "devel@linuxdriverproject.org",
    KY Srinivasan, Haiyang Zhang, "Stephen Hemminger", George Zhang,
    Michal Kubecek, Asias He, "Vitaly Kuznetsov", Cathy Avery,
    "jasowang@redhat.com", Rolf Neugebauer, Dave Scott, "Marcelo Cerri",
    "apw@canonical.com", "olaf@aepfle.de", "joe@perches.com",
    "linux-kernel@vger.kernel.org", Dan Carpenter
Subject: Re: [PATCH] vsock: only load vmci transport on VMware hypervisor by default
Date: Thu, 17 Aug 2017 15:16:58 +0000
Message-ID: <04460E3B-B213-4090-96CD-00CEEBE6AC32@vmware.com>
References: <20170817135559.GG5539@stefanha-x1.localdomain>
In-Reply-To: <20170817135559.GG5539@stefanha-x1.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Aug 17, 2017, at 3:55 PM, Stefan Hajnoczi wrote:
>
> On Thu, Aug 17, 2017 at 08:00:29AM +0000, Dexuan Cui wrote:
>>
>> Without the patch, vmw_vsock_vmci_transport.ko can automatically load
>> when an application creates an AF_VSOCK socket.
>>
>> This is the expected good behavior on VMware hypervisor, but as we
>> are going to add hv_sock.ko (i.e. Hyper-V transport for AF_VSOCK), we
>> should make sure vmw_vsock_vmci_transport.ko can't load on Hyper-V,
>> otherwise there is a -EBUSY conflict when both vmw_vsock_vmci_transport.ko
>> and hv_sock.ko try to call vsock_core_init() on Hyper-V.
>>
>> On the other hand, hv_sock.ko can only load on Hyper-V, because it
>> depends on hv_vmbus.ko, which detects Hyper-V in hv_acpi_init().
>>
>> KVM's vsock_virtio_transport doesn't have the issue because it doesn't
>> define MODULE_ALIAS_NETPROTO(PF_VSOCK).
>
> Thanks for sending this patch, vmci's MODULE_ALIAS_NETPROTO(PF_VSOCK) is
> a problem for vhost_vsock.ko (the virtio host driver) too. A host
> userspace program can create a AF_VSOCK socket before vhost_vsock is
> loaded. The vmci transport will be unconditionally loaded and that's
> not the right behavior.
>
> Putting aside nested virtualization, I want to load the transport (vmci,
> Hyper-V, vsock) for which there is paravirtualized hardware present
> inside the guest.

Good points. I completely agree that this is the desired behavior for a
guest.

> It's a little tricker on the host side (doesn't matter for Hyper-V and
> probably also doesn't for VMware) because the host-side driver is a
> software device with no hardware backing it. In KVM we assume the
> vhost_vsock.ko kernel module will be loaded sufficiently early.

Since the vmci driver is currently tied to PF_VSOCK, this hasn’t been a
problem for us. On the host side, however, the VMCI driver has no
hardware backing it either, so once we move to a more appropriate
solution this will be an issue for VMCI as well. I’ll check our shipped
products, but they most likely assume that an upstreamed vmci module,
if present, is loaded automatically.

>
> Things get trickier with nested virtualization because the VM might want
> to talk to its host but also to its nested VMs. The simple way of
> fixing this would be to allow two transports loaded simultaneously and
> route traffic destined to CID 2 to the host transport and all other
> traffic to the guest transport.

This is close to the routing the VMCI driver does in a nested
environment, but that routing assumes there is only one type of
transport. Having two different transports would require that we delay
resolving the transport type until the socket endpoint has been bound to
an address. Things get trickier if listening sockets use VMADDR_CID_ANY:
with a single transport, such a socket can accept connections from both
the guests and the outer host, but with multiple transports that won’t
work, since we can’t associate a socket with a transport until the
socket is bound.

>
> Perhaps we should discuss these cases a bit more to figure out how to
> avoid conflicts over MODULE_ALIAS_NETPROTO(PF_VSOCK).

Agreed.
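
To make the routing question above a bit more concrete, here is a rough
user-space sketch. It is not existing kernel code: the enum and both
helper names are hypothetical, and only the two CID constants mirror
<linux/vm_sockets.h>. It just encodes the "CID 2 goes to the host,
everything else goes to the guests" rule, plus the observation that a
VMADDR_CID_ANY bind does not tell us which transport to use.

```c
#include <stdbool.h>
#include <stdint.h>

/* These two values mirror <linux/vm_sockets.h>. */
#define VMADDR_CID_ANY  0xFFFFFFFFu	/* wildcard address (-1U) */
#define VMADDR_CID_HOST 2u		/* well-known CID of the host */

/* Hypothetical type; nothing like this exists in the vsock core. */
enum transport_choice {
	TRANSPORT_TO_HOST,	/* transport that reaches the outer host */
	TRANSPORT_TO_GUESTS	/* transport that reaches nested guests */
};

/* Hypothetical helper: choose a transport from the destination CID. */
static enum transport_choice route_by_dest_cid(uint32_t dst_cid)
{
	if (dst_cid == VMADDR_CID_HOST)
		return TRANSPORT_TO_HOST;
	return TRANSPORT_TO_GUESTS;
}

/*
 * Hypothetical helper: does binding to local_cid pin down one transport?
 * A wildcard bind does not, because such a listener is meant to accept
 * connections from both the guests and the outer host.
 */
static bool bind_resolves_transport(uint32_t local_cid)
{
	return local_cid != VMADDR_CID_ANY;
}
```

With this rule a connect() to CID 2 resolves to the host-facing
transport and a connect() to any other CID to the guest-facing one,
while a listener bound to VMADDR_CID_ANY would have to be registered
with every loaded transport or be left unresolved.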
>
>>
>> The patch also adds a module parameter "skip_hypervisor_check" for
>> vmw_vsock_vmci_transport.ko.
>>
>> Signed-off-by: Dexuan Cui
>> Cc: Alok Kataria
>> Cc: Andy King
>> Cc: Adit Ranadive
>> Cc: George Zhang
>> Cc: Jorgen Hansen
>> Cc: K. Y. Srinivasan
>> Cc: Haiyang Zhang
>> Cc: Stephen Hemminger
>> ---
>>  net/vmw_vsock/Kconfig          |  2 +-
>>  net/vmw_vsock/vmci_transport.c | 11 +++++++++++
>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/vmw_vsock/Kconfig b/net/vmw_vsock/Kconfig
>> index a24369d..3f52929 100644
>> --- a/net/vmw_vsock/Kconfig
>> +++ b/net/vmw_vsock/Kconfig
>> @@ -17,7 +17,7 @@ config VSOCKETS
>>
>>  config VMWARE_VMCI_VSOCKETS
>>  	tristate "VMware VMCI transport for Virtual Sockets"
>> -	depends on VSOCKETS && VMWARE_VMCI
>> +	depends on VSOCKETS && VMWARE_VMCI && HYPERVISOR_GUEST
>>  	help
>>  	  This module implements a VMCI transport for Virtual Sockets.
>>
>> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>> index 10ae782..c068873 100644
>> --- a/net/vmw_vsock/vmci_transport.c
>> +++ b/net/vmw_vsock/vmci_transport.c
>> @@ -16,6 +16,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -73,6 +74,10 @@ struct vmci_transport_recv_pkt_info {
>>  	struct vmci_transport_packet pkt;
>>  };
>>
>> +static bool skip_hypervisor_check;
>> +module_param(skip_hypervisor_check, bool, 0444);
>> +MODULE_PARM_DESC(hot_add, "If set, attempt to load on non-VMware platforms");
>> +
>>  static LIST_HEAD(vmci_transport_cleanup_list);
>>  static DEFINE_SPINLOCK(vmci_transport_cleanup_lock);
>>  static DECLARE_WORK(vmci_transport_cleanup_work, vmci_transport_cleanup);
>> @@ -2085,6 +2090,12 @@ static int __init vmci_transport_init(void)
>>  {
>>  	int err;
>>
>> +	/* Check if we are running on VMware's hypervisor and bail out
>> +	 * if we are not.
>> +	 */
>> +	if (!skip_hypervisor_check && x86_hyper != &x86_hyper_vmware)
>> +		return -ENODEV;
>> +

I’m a bit concerned with this.
On the host side we still want to be able to use the VMCI transport, so
the check above should also allow loading the transport when
x86_hyper == NULL. However, that may still conflict with virtio
host-side support, so it looks like we need to find a better overall way
to make the protocols coexist.

>>  	/* Create the datagram handle that we will use to send and receive all
>>  	 * VSocket control messages for this context.
>>  	 */
>> --
>> 2.7.4
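
For reference, the relaxed condition suggested above (also load when no
hypervisor is detected) could look roughly like the user-space model
below. This is only a sketch under stated assumptions: struct hypervisor
and the two descriptor objects are stand-ins for the kernel's x86_hyper
pointer and its per-hypervisor descriptors, and
vmci_transport_may_load() is a hypothetical name. It does nothing about
the conflict with virtio host-side support.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's hypervisor descriptor (assumption). */
struct hypervisor {
	const char *name;
};

/* Stand-ins for the descriptors the kernel would point x86_hyper at. */
static const struct hypervisor hyper_vmware = { "VMware" };
static const struct hypervisor hyper_ms_hyperv = { "Microsoft Hyper-V" };

/*
 * Hypothetical model of the check in vmci_transport_init().
 * detected == NULL stands for "no hypervisor detected", i.e. we are
 * running on the host side. Returns 0 to continue loading, -ENODEV to
 * bail out.
 */
static int vmci_transport_may_load(const struct hypervisor *detected,
				   bool skip_hypervisor_check)
{
	if (skip_hypervisor_check)
		return 0;	/* the administrator forced loading */
	if (detected == NULL)
		return 0;	/* host side: no hypervisor underneath us */
	if (detected == &hyper_vmware)
		return 0;	/* guest running on VMware */
	return -ENODEV;		/* guest on some other hypervisor */
}
```

In this model the transport loads on a bare-metal host and in a VMware
guest, and refuses to load in a Hyper-V guest unless
skip_hypervisor_check is set.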